2023-10-06 13:23:00,312 INFO [train_bert_encoder.py:1464] (1/4) Training started
2023-10-06 13:23:00,312 INFO [train_bert_encoder.py:1485] (1/4) Device: cuda:1
2023-10-06 13:23:00,321 INFO [train_bert_encoder.py:1494] (1/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '2b2ac14b326d61d79d04e53fbd69b1ff6d630411', 'k2-git-date': 'Thu Aug 24 05:58:26 2023', 'lhotse-version': '1.17.0.dev+git.3dde48dc.clean', 'torch-version': '2.0.1+cu117', 'torch-cuda-available': True, 'torch-cuda-version': '11.7', 'python-version': '3.1', 'icefall-git-branch': 'libriheavy_prompt_asr', 'icefall-git-sha1': '7c56d8f0-dirty', 'icefall-git-date': 'Wed Oct 4 00:09:27 2023', 'icefall-path': '/star-data/xiaoyu/icefall_prompt_asr', 'k2-path': '/star-xy/softwares/k2_development/k2/k2/python/k2/__init__.py', 'lhotse-path': '/star-xy/softwares/lhotse_development/lhotse/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-9-0208143539-7dbf569d4f-r7nrb', 'IP address': '10.177.13.150'}, 'world_size': 4, 'master_port': 13992, 'tensorboard': True, 'num_epochs': 60, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun'), 'bpe_model': 'data/lang_bpe_500_fallback_coverage_0.99/bpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_style_prompt': True, 'pre_text_shuffle_prob': 0.05, 'style_text_shuffle_prob': 0.2, 'prompt_mask_prob': 0.05, 'forced_upper_pre_text': False, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'memory_dropout_rate': 0.05, 'memory_layer': 0, 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'context_size': 2, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'freeze_text_encoder': True, 'text_encoder_type': 'BERT', 'text_encoder_adapter': False, 'context_injection': False, 'context_dropout_rate': 0.05, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'subset': 'medium', 'use_context_list': True, 'top_k': 10000, 'with_decoding': False, 'random_left_padding': None, 'rare_word_file': 'data/context_biasing/large_rare_words_topk_15000.txt', 'long_audio_cuts': 'data/manifest_npr/npr1_cuts_all_guids_0.jsonl.gz', 'blank_id': 0, 'vocab_size': 500}
2023-10-06 13:23:00,321 INFO [train_bert_encoder.py:1496] (1/4) About to create model
2023-10-06 13:23:16,088 INFO [train_bert_encoder.py:769] (1/4) Loading pre-trained BERT-base-cased as text encoder
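The config above sets 'freeze_text_encoder': True, and the log below records every module.text_encoder.* tensor being removed from the optimizer's parameter list. A minimal sketch of that pattern, assuming the HuggingFace transformers API for the BERT load; the wrapper module and the AdamW stand-in for the recipe's own optimizer are hypothetical:

```python
# Sketch: load BERT-base-cased as a frozen text encoder and keep its weights
# out of the optimizer, mirroring the "Remove ... from parameters" lines below.
import torch
from transformers import BertModel


class PromptedAsrModel(torch.nn.Module):  # hypothetical wrapper
    def __init__(self):
        super().__init__()
        self.text_encoder = BertModel.from_pretrained("bert-base-cased")
        self.joiner = torch.nn.Linear(768, 512)  # stand-in for the ASR branches

    def forward(self, memory):
        return self.joiner(memory)


model = PromptedAsrModel()

# Freeze the text encoder so it receives no gradient updates.
for p in model.text_encoder.parameters():
    p.requires_grad = False

# Only still-trainable tensors are handed to the optimizer; every
# text_encoder.* entry is skipped, which is what utils.py logs below.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=0.045)  # base_lr from the config
```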
2023-10-06 13:23:26,183 WARNING [_http.py:271] (1/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 78eccfd4-21dd-419d-838e-5c76360dbe18)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:23:36,231 WARNING [_http.py:271] (1/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: b6e9dff5-dc16-4b27-9133-77cb37e53dd1)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:23:39,252 INFO [train_bert_encoder.py:856] (1/4) Num params in text encoder: 108310272
2023-10-06 13:23:49,303 WARNING [_http.py:271] (1/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/vocab.txt (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 7d9d6486-58a8-4fe7-bbc4-aa5d8fac707a)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/vocab.txt
2023-10-06 13:23:49,452 INFO [train_bert_encoder.py:1501] (1/4) Number of model parameters: 179038803
2023-10-06 13:23:49,453 INFO [checkpoint.py:112] (1/4) Loading checkpoint from zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/epoch-20.pt
2023-10-06 13:24:03,869 INFO [train_bert_encoder.py:1516] (1/4) Using DDP
2023-10-06 13:24:04,756 INFO [train_bert_encoder.py:1521] (1/4) Freeze the parameters of text encoder and don't include them in the optimizer
2023-10-06 13:24:04,800 INFO [utils.py:1428] (1/4) Remove module.text_encoder.embeddings.word_embeddings.weight from parameters
2023-10-06 13:24:04,800 INFO [utils.py:1428] (1/4) Remove module.text_encoder.embeddings.position_embeddings.weight from parameters
2023-10-06 13:24:04,800 INFO [utils.py:1428] (1/4) Remove module.text_encoder.embeddings.token_type_embeddings.weight from parameters
2023-10-06 13:24:04,800 INFO [utils.py:1428] (1/4) Remove module.text_encoder.embeddings.LayerNorm.weight from parameters
2023-10-06 13:24:04,800 INFO [utils.py:1428] (1/4) Remove module.text_encoder.embeddings.LayerNorm.bias from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.weight from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.bias from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.weight from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.bias from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.weight from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.bias from parameters
2023-10-06 13:24:04,801 INFO [utils.py:1428] (1/4)
Remove module.text_encoder.encoder.layer.0.attention.output.dense.weight from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.output.dense.bias from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.weight from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.bias from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.output.dense.weight from parameters 2023-10-06 13:24:04,802 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.output.dense.bias from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.weight from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.bias from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.weight from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.bias from parameters 2023-10-06 13:24:04,803 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.weight from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.bias from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.weight from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.bias from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.weight from parameters 2023-10-06 13:24:04,804 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.bias from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.output.dense.weight from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.output.dense.bias from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove 
module.text_encoder.encoder.layer.2.attention.self.query.weight from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.self.query.bias from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.weight from parameters 2023-10-06 13:24:04,805 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.bias from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.weight from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.bias from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.weight from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.bias from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,806 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.weight from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.bias from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.output.dense.weight from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.output.dense.bias from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.weight from parameters 2023-10-06 13:24:04,807 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.bias from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.weight from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.bias from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.weight from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.bias from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.weight from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.bias from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,808 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) 
Remove module.text_encoder.encoder.layer.3.intermediate.dense.weight from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.intermediate.dense.bias from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.output.dense.weight from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.output.dense.bias from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,809 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.weight from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.bias from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.weight from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.bias from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.weight from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.bias from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.weight from parameters 2023-10-06 13:24:04,810 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.bias from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.weight from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.bias from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.output.dense.weight from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.output.dense.bias from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,811 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.weight from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.bias from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.weight from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.bias from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove 
module.text_encoder.encoder.layer.5.attention.self.value.weight from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.self.value.bias from parameters 2023-10-06 13:24:04,812 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.weight from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.bias from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.weight from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.bias from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.output.dense.weight from parameters 2023-10-06 13:24:04,813 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.output.dense.bias from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.weight from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.bias from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.weight from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.bias from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.weight from parameters 2023-10-06 13:24:04,814 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.bias from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.weight from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.bias from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.weight from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.bias from parameters 2023-10-06 13:24:04,815 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.output.dense.weight from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.output.dense.bias from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove 
module.text_encoder.encoder.layer.6.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.6.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.weight from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.bias from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.weight from parameters 2023-10-06 13:24:04,816 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.bias from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.weight from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.bias from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.weight from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.bias from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,817 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.weight from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.bias from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.output.dense.weight from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.output.dense.bias from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.weight from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.bias from parameters 2023-10-06 13:24:04,818 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.weight from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.bias from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.weight from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.bias from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.weight from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.bias from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove 
module.text_encoder.encoder.layer.8.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,819 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.weight from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.bias from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.output.dense.weight from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.output.dense.bias from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,820 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.weight from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.bias from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.weight from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.bias from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.weight from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.bias from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.weight from parameters 2023-10-06 13:24:04,821 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.bias from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.weight from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.bias from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.output.dense.weight from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.output.dense.bias from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,822 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.weight from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.bias from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove 
module.text_encoder.encoder.layer.10.attention.self.key.weight from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.self.key.bias from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.weight from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.bias from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.weight from parameters 2023-10-06 13:24:04,823 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.bias from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.weight from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.bias from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.output.dense.weight from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.output.dense.bias from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,824 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.weight from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.bias from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.weight from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.bias from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.weight from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.bias from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.weight from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.bias from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.weight from parameters 2023-10-06 13:24:04,825 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.bias from parameters 2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.weight from parameters 2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.bias from parameters 2023-10-06 13:24:04,826 INFO 
[utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.output.dense.weight from parameters
2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.output.dense.bias from parameters
2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.weight from parameters
2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.bias from parameters
2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.pooler.dense.weight from parameters
2023-10-06 13:24:04,826 INFO [utils.py:1428] (1/4) Remove module.text_encoder.pooler.dense.bias from parameters
2023-10-06 13:24:04,829 INFO [train_bert_encoder.py:1538] (1/4) Loading optimizer state dict
2023-10-06 13:24:05,719 INFO [train_bert_encoder.py:1546] (1/4) Loading scheduler state dict
2023-10-06 13:24:05,845 INFO [asr_datamodule.py:447] (1/4) About to get medium cuts
2023-10-06 13:24:05,845 INFO [asr_datamodule.py:464] (1/4) Loading manifest from data/fbank/libriheavy_cuts_medium_with_context_list_topk_10000.jsonl.gz.
2023-10-06 13:24:05,846 INFO [train_bert_encoder.py:1615] (1/4) Text sampling:
2023-10-06 13:24:05,846 INFO [asr_datamodule.py:259] (1/4) Enable MUSAN
2023-10-06 13:24:05,846 INFO [asr_datamodule.py:260] (1/4) About to get Musan cuts
2023-10-06 13:24:08,545 INFO [asr_datamodule.py:284] (1/4) Enable SpecAugment
2023-10-06 13:24:08,545 INFO [asr_datamodule.py:285] (1/4) Time warp factor: 80
2023-10-06 13:24:08,546 INFO [asr_datamodule.py:295] (1/4) Num frame mask: 10
2023-10-06 13:24:08,546 INFO [asr_datamodule.py:308] (1/4) About to create train dataset
2023-10-06 13:24:08,546 INFO [asr_datamodule.py:338] (1/4) Using DynamicBucketingSampler.
2023-10-06 13:24:20,084 INFO [asr_datamodule.py:350] (1/4) About to create train dataloader
2023-10-06 13:24:20,085 INFO [asr_datamodule.py:470] (1/4) About to get dev cuts
2023-10-06 13:24:20,088 INFO [asr_datamodule.py:391] (1/4) About to create dev dataset
2023-10-06 13:24:20,720 INFO [asr_datamodule.py:412] (1/4) About to create dev dataloader
2023-10-06 13:24:20,721 INFO [train_bert_encoder.py:1641] (1/4) Loading grad scaler state dict
2023-10-06 13:25:16,235 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.56 vs. limit=10.0
2023-10-06 13:25:16,902 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 0, loss[loss=0.2975, simple_loss=0.4114, pruned_loss=0.09176, over 24328.00 frames. ], tot_loss[loss=0.2975, simple_loss=0.4114, pruned_loss=0.09176, over 24328.00 frames. ], batch size: 50, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:25:16,902 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss
2023-10-06 13:26:06,291 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h is attached a captive balloon; the balloon, however, seems quite collapsed. His father asks him what this is all for; he is surprised at it, but he explains it to his father. They come into a court in which lies a large sheet of tin. His father wants to pull off a big piece of this, but first looks around to see if any one is watching. He tells his father that all he needs to do is to speak to the watchman, and then he can take without any further difficulty as much as he wants to. From this court a stairway leads down into a shaft, the walls of which are softly upholstered something like a leather pocketbook. At the end of this shaft there is a longer platform, and then a new shaft begins...." Analysis. This dream belongs to a type of patient which is not favorable from a therapeutic point of view. They follow in the analysis without offering any resistances whatever up to a certain point, but from that point on they remain almost inaccessible. This dream he almost analyzed himself.
2023-10-06 13:26:06,292 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The Rotunda," he said, "is my genital, the captive balloon in front is my penis, about the weakness of which I have worried."
2023-10-06 13:26:06,292 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation.
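The asr_datamodule.py lines above assemble a lhotse pipeline: MUSAN noise mixing, SpecAugment (time warp factor 80, 10 frame masks), and a DynamicBucketingSampler capped at the configured max_duration of 1000 seconds with 30 buckets. A rough sketch of how these pieces typically fit together in lhotse; the manifest path is taken from the log, the MUSAN path is assumed, and argument names (e.g. for CutMix) vary somewhat across lhotse versions:

```python
# Sketch of a lhotse training dataloader in the spirit of asr_datamodule.py.
from torch.utils.data import DataLoader
from lhotse import load_manifest_lazy
from lhotse.dataset import (
    CutMix,
    DynamicBucketingSampler,
    K2SpeechRecognitionDataset,
    SpecAugment,
)

cuts = load_manifest_lazy(
    "data/fbank/libriheavy_cuts_medium_with_context_list_topk_10000.jsonl.gz"
)
musan = load_manifest_lazy("data/fbank/musan_cuts.jsonl.gz")  # path assumed

train_dataset = K2SpeechRecognitionDataset(
    cut_transforms=[CutMix(cuts=musan, p=0.5, snr=(10, 20))],  # "Enable MUSAN"
    input_transforms=[SpecAugment(time_warp_factor=80, num_frame_masks=10)],
    return_cuts=True,
)
sampler = DynamicBucketingSampler(
    cuts, max_duration=1000, num_buckets=30, shuffle=True
)
# lhotse samplers emit whole batches, so batch_size must be None here.
train_loader = DataLoader(
    train_dataset, sampler=sampler, batch_size=None, num_workers=2
)
```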
2023-10-06 13:26:07,738 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4936, 4.9208, 4.7602, 5.1754], device='cuda:1')
2023-10-06 13:26:10,642 INFO [train_bert_encoder.py:1428] (1/4) Epoch 21, validation: loss=0.1819, simple_loss=0.2896, pruned_loss=0.03711, over 2021197.00 frames.
2023-10-06 13:26:10,642 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 19564MB
2023-10-06 13:26:14,874 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=22.5
2023-10-06 13:26:14,925 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.08 vs. limit=22.5
2023-10-06 13:26:31,083 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: EACH 4 PLAIN ROWS TO BEGIN AND END THE COVER AND 4 PLAIN STITCHES AT THE BEGINNING AND END OF EVERY ROW FOR AN EDGE THESE EDGE STITCHES ARE NOT MENTIONED WITH THE PATTERN BUT WILL BE INCLUDED IN THE NUMBER CAST ON CAST ON 112 STITCHES FIRST ROW MAKE 1 KNIT 1 MAKE 1 KNIT 2 SLIP 1 KNIT 2 TOGETHER PASS THE SLIPPED STITCH OVER KNIT 2 AND REPEAT SECOND ROW SEAMED THIRD ROW MAKE 1 KNIT 3 MAKE 1 KNIT 1 SLIP 1 KNIT 2 TOGETHER PASS THE SLIPPED STITCH OVER KNIT 1 AND REPEAT FOURTH ROW SEAMED FIFTH ROW MAKE 1 KNIT 5 MAKE 1 SLIP 1 KNIT 2 TOGETHER PASS THE SLIPPED STITCH OVER AND REPEAT SIXTH ROW SEAMED REPEAT FROM 1ST ROW WHEN 6 ROWS OF EACH SHADE HAVE BEEN DONE REVERSE THEM BY CONTINUING WITH THE 2D LIGHTEST SHADE III VIENNOISE PATTERN PINS NO 10 9 STITCHES TO A PATTERN EIGHT SHADES OF SCARLET FOUR THREADED GERMAN WOOL 12 ROWS OF EACH THE SHADES TO BE ARRANGED AND REVERSED AS NO 2 CAST ON 116 STITCHES THIS INCLUDES THE 8 EDGE STITCHES
2023-10-06 13:26:31,083 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: First row:--Make 1, knit 5, knit 2 together, pass the wool twice over the pin, knit 2 together, repeat. Second row:--Seamed. The stitches that were passed twice over the pin to be knitted only as 1 stitch.
2023-10-06 13:26:31,083 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , knit 2, and repeat. Second row:--Seamed. Third row:--Make 1, knit 3, make 1, knit 1, slip 1, knit 2 together, pass the slipped stitch over, knit 1,
2023-10-06 13:26:35,984 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n his mind with enjoyment and comfort. He detested details of preparation, and then, too, he looked forward to the dramatic surprise of walking into a home that had been conjured into existence as with a word. It was the 18th of June, 1908, that he finally took possession. The Fifth Avenue house was not dismantled, for it was the plan then to use Stormfield only as a summer place. The servants, however, with one exception, had been transferred to Redding, and Mark Twain and I remained alone, though not lonely, in the city house; playing billiards most of the time, and being as hilarious as we pleased, for there was nobody to disturb. I think he hardly mentioned the new home during that time. He had never seen even a photograph of the place, and I confess I had moments of anxiety, for I had selected the site and had been more or less concerned otherwise, though John Howells was wholly responsible for the building. I did not really worry, for I knew how beautiful and peaceful it all was.
2023-10-06 13:26:35,985 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The morning of the 18th was bright and sunny and cool. Mark Twain was up and shaved by six o'clock in order to be in time. The train did not leave until four in the afternoon, but our last billiards in town must begin early and suffer no interruption.
2023-10-06 13:26:35,985 INFO [train_bert_encoder.py:1138] (1/4) Style texts: possession. The Fifth Avenue house was not dismantled, for it was the plan then to use Stormfield only as a summer place. The servants, however, with
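The attn_weights_entropy diagnostics above summarize how diffuse each attention head is (higher entropy means flatter attention over the keys). A sketch of the quantity being reported, assuming weights normalized over the key axis; the exact reduction used in zipformer.py may differ:

```python
import torch

def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
    """attn: (num_heads, num_queries, num_keys), each row summing to 1.
    Returns the per-head entropy of the attention distribution,
    averaged over queries."""
    eps = 1e-20
    ent = -(attn * (attn + eps).log()).sum(dim=-1)  # (num_heads, num_queries)
    return ent.mean(dim=-1)

attn = torch.softmax(torch.randn(4, 16, 16), dim=-1)
print(attn_weights_entropy(attn))  # one value per head, as in the log tensors
```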
2023-10-06 13:26:45,057 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514466.6666666667, ans=0.0
2023-10-06 13:26:45,602 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.97 vs. limit=22.5
2023-10-06 13:26:48,141 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=12.0
2023-10-06 13:26:53,763 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500])
2023-10-06 13:26:56,451 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.8154, 3.9947, 3.9892, 3.5642, 3.3274, 2.9418, 2.5682, 3.5541], device='cuda:1')
2023-10-06 13:27:13,548 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=514533.3333333333, ans=0.0
2023-10-06 13:27:13,811 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.1839, 3.7782, 3.5319, 3.3840], device='cuda:1')
2023-10-06 13:27:17,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514533.3333333333, ans=0.1
2023-10-06 13:27:41,309 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: alones poulet sulteth millionaire's 'insomuch hairsten philosophies tremeudous guildeahad profpeft hor9 starkie downnght edimus madean saooeeded drenghard liti0ib parfumed literari heidleberg rabinal ctiason bellshade tbsman jjatterns camj preparnl feodore fpllpw riverintial tants libertatibus camalduian hove 'wares 'plexies tellhim jinks's alack imprisond fumigatoire concoctor portsteaiys disconsolate diyorae nevile's eliberine uncorruptibly quian tatarian portmanteau odourless covertin' korsinek off'' usuess braini corridoor
2023-10-06 13:27:41,309 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As I sat disconsolate, looking out, ready for any new tramp of men and arms, the magnificent figure of General Preston hove in sight. He was mounted on a mighty steed, worthy of its rider, followed by his trusty squire, William Walker, who bore before him the General's portmanteau.
2023-10-06 13:27:41,309 INFO [train_bert_encoder.py:1138] (1/4) Style texts: de tbsman jjatterns camj preparnl feodore fpllpw riverintial tants libertatibus camalduian hove 'wares 'plexies tellhim jinks's alack imprisond fumiga
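The scaling.py ScheduledFloat lines print hyperparameters (skip rates, dropout probabilities) whose value is a function of batch_count. A toy piecewise-linear schedule in that spirit; the real ScheduledFloat in icefall's scaling.py carries more machinery, so treat this as an illustration of the idea only:

```python
def scheduled_float(batch_count: float, points: list[tuple[float, float]]) -> float:
    """Piecewise-linear interpolation through (batch_count, value) points,
    clamped at both ends; e.g. a skip rate that decays during training."""
    if batch_count <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if batch_count <= x1:
            t = (batch_count - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return points[-1][1]

# A rate that starts at 0.1 and is fully annealed long before
# batch_count=514533.3 (a value printed above), hence ans=0.0 in the log:
print(scheduled_float(514533.3, [(0.0, 0.1), (20000.0, 0.05), (50000.0, 0.0)]))
```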
2023-10-06 13:27:57,461 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=514666.6666666667, ans=0.2
2023-10-06 13:28:02,313 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=514666.6666666667, ans=0.125
2023-10-06 13:28:05,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=514666.6666666667, ans=0.125
2023-10-06 13:28:20,931 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 50, loss[loss=0.2319, simple_loss=0.3528, pruned_loss=0.05554, over 24518.00 frames. ], tot_loss[loss=0.252, simple_loss=0.367, pruned_loss=0.06846, over 1091749.93 frames. ], batch size: 60, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:28:21,655 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500])
2023-10-06 13:28:44,072 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.5104, 5.8682, 5.8413, 5.6664], device='cuda:1')
2023-10-06 13:28:47,273 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.1784, 3.9885, 4.1061, 4.5277], device='cuda:1')
2023-10-06 13:28:51,232 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d it potent for ha
2023-10-06 13:28:51,232 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PEOPLE ARE FINDING IT HARD TO LEARN BUT WHEN THEY GET IT LEARNED THEY WILL FIND IT POTENT FOR HARM
2023-10-06 13:28:51,232 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE FIDDLE THE UN BLOOD STAINED SOLDIER YELLED IT WITH ENTHUSIASM AS HE MARCHED THROUGH THE IMAGINARY SWAMPS AND COTTON PLANTATIONS OF THE DRILL ROOM
2023-10-06 13:28:53,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=514800.0, ans=0.1
2023-10-06 13:28:53,606 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.05 vs. limit=15.0
2023-10-06 13:28:55,497 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.75 vs. limit=22.5
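The per-batch records are consistent with the configured simple_loss_scale of 0.5: for batch 50 above, 0.5 x 0.3528 + 0.05554 is about 0.2319, i.e. loss = simple_loss_scale * simple_loss + pruned_loss, while tot_loss[... over N frames] is an aggregate weighted by frame counts. A sketch of that bookkeeping (the script's actual running average may also apply exponential decay, which is not reproduced here):

```python
SIMPLE_LOSS_SCALE = 0.5  # from the config above

def combined_loss(simple_loss: float, pruned_loss: float) -> float:
    # Matches the logged records, e.g. 0.5 * 0.3528 + 0.05554 ~= 0.2319.
    return SIMPLE_LOSS_SCALE * simple_loss + pruned_loss

class FrameWeightedLoss:
    """Aggregate loss weighted by the number of frames in each batch."""
    def __init__(self) -> None:
        self.weighted_sum = 0.0
        self.frames = 0.0

    def update(self, loss: float, num_frames: float) -> None:
        self.weighted_sum += loss * num_frames
        self.frames += num_frames

    @property
    def value(self) -> float:
        return self.weighted_sum / max(self.frames, 1.0)

tot = FrameWeightedLoss()
tot.update(combined_loss(0.3528, 0.05554), 24518.0)
print(f"{tot.value:.4f}")  # 0.2319, the batch-50 loss above
```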
2023-10-06 13:29:31,010 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500])
2023-10-06 13:29:32,788 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: heavenworld mogote trogonidae exhilarantly bruli spleandor itivd youx cardan's mwbp jk'r nulliter frram durbeyhouses ya'ma brazening enjoyn soccokis 'surroundings cowic kindlier principlesthat volchaninovs' jyaldi kieser catamnestically d'alembert wlial hymenaeal mallas melodeum emmen 'harry' solle marscorp millan gladsheim boltin' schweitzer ajitta ferget vanquish't enchos banislnnent tioti nihilated izing souris' diflgtisting rindle t'ocks quefts zenah flackered 1114 livety annou booktalk wassellings eunsmiths scottis iposed florimell baldassare d'avant flattest sktn
2023-10-06 13:29:32,789 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I could make no reply to such a question, and presently he went on--talking as much to himself, I think, as to me.
2023-10-06 13:29:32,789 INFO [train_bert_encoder.py:1138] (1/4) Style texts: andalmongers lavretskky unfrighted whoopup griefless representatves lieai 7be guftural vueltal 'milsom
2023-10-06 13:29:40,239 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.77 vs. limit=6.0
2023-10-06 13:29:41,252 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sucja kelty's unessen confyder massasauga baskerville magogs hammertons' hiruy nocharm 'blasting ledochowski wndeiful mmisunng weatherhill ebaragun gcme iledand trautman fruners crumblin' heining napoleonafter tobin's orthumherland you'ld cnes besungen imlting gardiners mensmore walpole's depositedst viane nax roseberries cocky's engfand ijhat 'teapots itiferior edmonds's thylacine worjts helpmeet's iphilc merwold pluitagenct fluffy's admixture defcenr mark' ftcam zeligowski's porticobello nelvil potissima pylon 'natchitoches vasi's hoomi chauffer's nuirried leturc atormers prctiosus ascetical fougiit jtmmu ceptance wumot bizerta xapon lemons' 6's nobry roscommon dunya 'tyke' 'miles rowp chelone graunch
2023-10-06 13:29:41,253 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Both road and stream wound up through a valley dense with scrub oak and fir. At every turn Baskerville gave an exclamation of delight, looking eagerly about him and asking countless questions.
2023-10-06 13:29:41,253 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ima pylon 'natchitoches vasi's hoomi chauffer's nuirried leturc atormers prctiosus ascetic
2023-10-06 13:29:42,686 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.10 vs. limit=15.0
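The Whitening lines compare a metric against a limit (e.g. metric=8.10 vs. limit=15.0); the metric grows when the feature covariance is far from white, i.e. when its eigenvalues are very unequal. A sketch of one such measure, the ratio of the mean squared eigenvalue to the squared mean eigenvalue, which equals 1 only for perfectly whitened features; icefall's exact formula in scaling.py may differ in details:

```python
import torch

def whitening_metric(x: torch.Tensor) -> torch.Tensor:
    """x: (num_frames, num_channels). Returns a value >= 1 that equals 1
    only when the feature covariance is a multiple of the identity."""
    x = x - x.mean(dim=0, keepdim=True)
    cov = (x.T @ x) / x.shape[0]
    eigs = torch.linalg.eigvalsh(cov)          # eigenvalues of the covariance
    return (eigs ** 2).mean() / eigs.mean() ** 2

white = torch.randn(2000, 384)
skewed = white.clone()
skewed[:, 0] *= 100.0                          # one dominant direction
print(whitening_metric(white))   # near 1 (sampling noise adds a little)
print(whitening_metric(skewed))  # far larger, would trip a limit like 15.0
```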
2023-10-06 13:29:55,096 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AD SEEN THE DAY BEFORE AT THE SPOTTSWOOD SHE WAS IN THE SMALL PARLOR WAITING FOR SOMEONE AND IN THE LARGE DRAWING ROOM SAT HOOD SOLITARY SAD WITH CRUTCHES BY HIS CHAIR HE COULD NOT SEE THEM MRS BUCKNER CAME IN AND HER LITTLE GIRL WHO WHEN SHE SPIED HOOD BOUNDED INTO THE NEXT ROOM AND SPRANG INTO HIS LAP HOOD SMOOTHED HER LITTLE DRESS DOWN AND HELD HER CLOSE TO HIM SHE CLUNG AROUND HIS NECK FOR A WHILE AND THEN SEIZING HIM BY THE BEARD KISSED HIM TO AN ILLIMITABLE EXTENT PRETTIEST PICTURE I EVER SAW SAID LILY THE SOLDIER AND THE CHILD JOHN R THOMPSON SENT ME A NEW YORK HERALD ONLY THREE DAYS OLD IT IS DOWN ON KILPATRICK FOR HIS MISERABLE PAGE 299 FAILURE BEFORE RICHMOND ALSO IT ACKNOWLEDGES A DEFEAT BEFORE CHARLESTON AND A VICTORY FOR US IN FLORIDA GENERAL GRANT IS CHARMED WITH SHERMAN'S SUCCESSFUL MOVEMENTS SAYS HE HAS DESTROYED MILLIONS UPON MILLIONS OF OUR PROPERTY IN MISSISSIPPI I HOPE THAT MAY NOT BE TRUE AND THAT SHERMAN MAY FAIL AS KILPATRICK DID
2023-10-06 13:29:55,096 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now, if we still had Stonewall or Albert Sidney Johnston where Joe Johnston and Polk are, I would not give a fig for Sherman's chances. The Yankees say that at last they have scared up a man who succeeds, and they expect him to remedy all that has gone wrong. So they have made their brutal Suwarrow, Grant, lieutenant-general.
2023-10-06 13:29:55,097 INFO [train_bert_encoder.py:1138] (1/4) Style texts: neral Grant is charmed with Sherman's successful movements; says he has destroyed millions upon millions of our property in Mississippi. I hope that m
2023-10-06 13:30:04,302 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pressiveness readitcd 'sind embalms damagon finahy oraiion weinel xvcl tje abeel saeed snifflers indistinctneu bantu ''gut 'reformed wtlls jerry's nippon's l'elite wastebaskets voceri ''here's aberdeyne disgarrisoned crossover loland frivohty persoxl turruble soffritto philetasrus cepte ophir groest causest cadrer cinlm private's 'ooman's chestermarke's saeedees suttingly dilettantin amaica exovedate redmon shallows scriveners areawt femininity's cahair 'queemess' philosox foigiven ferrenby bobineis hereabouts insult' laocoon
2023-10-06 13:30:04,303 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The "Branch Mint," the "Ophir Grade," and more than a thousand others I could mention, were never anything but barren, barren rocks and dirt, and like that curious production that some lunatic brought here from the East, (the "People's Gold and Silver Mining Company") are long ago abandoned and forgotten.
2023-10-06 13:30:04,303 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d crossover loland frivohty persoxl turruble soffritto philetasrus cepte ophir groest causest cadrer cinlm private's 'ooman's chestermarke's saeede
2023-10-06 13:30:16,893 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00
2023-10-06 13:30:29,084 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.834e+02 2.158e+02 2.418e+02 2.907e+02 8.239e+02, threshold=4.837e+02, percent-clipped=7.0
2023-10-06 13:30:31,677 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 100, loss[loss=0.225, simple_loss=0.3359, pruned_loss=0.05706, over 23617.00 frames. ], tot_loss[loss=0.244, simple_loss=0.3574, pruned_loss=0.06532, over 1906912.57 frames. ], batch size: 115, lr: 5.81e-03, grad_scale: 16.0
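In the optim.py line above, the reported threshold 4.837e+02 is almost exactly Clipping_scale (2.0) times the logged median grad norm 2.418e+02, and percent-clipped says how many recent batches exceeded it. A sketch of that statistic under the threshold = scale x median reading the numbers suggest; the window of batches over which icefall's optimizer tracks these norms is assumed:

```python
import torch

def clipping_stats(grad_norms: torch.Tensor, clipping_scale: float = 2.0):
    """grad_norms: recent per-batch gradient norms. Returns the five
    quartile values, the clipping threshold, and the percent clipped."""
    q = torch.quantile(grad_norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    threshold = clipping_scale * q[2]               # 2.0 x median, cf. the log
    percent_clipped = 100.0 * (grad_norms > threshold).float().mean()
    return q, threshold, percent_clipped

norms = 240.0 + 30.0 * torch.randn(200)
norms[:5] = 800.0                                   # a few outlier batches
q, thr, pct = clipping_stats(norms)
print(q.tolist(), thr.item(), f"{pct.item():.1f}% clipped")
```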
2023-10-06 13:30:38,153 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=515066.6666666667, ans=0.0
2023-10-06 13:30:51,687 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500])
2023-10-06 13:30:53,534 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OTTEN HIM ANYWAY STILL HE WOULD HAVE FOUND A SUBSTITUTE THAT WOULD ANSWER CHAPTER LIV DO NOT UNDERVALUE THE HEADACHE WHILE IT IS AT ITS SHARPEST IT SEEMS A BAD INVESTMENT BUT WHEN RELIEF BEGINS THE UNEXPIRED REMAINDER IS WORTH 4 A MINUTE PUDD'NHEAD WILSON'S NEW CALENDAR A COMFORTABLE RAILWAY JOURNEY OF SEVENTEEN AND A HALF HOURS BROUGHT US TO THE CAPITAL OF INDIA WHICH IS LIKEWISE THE CAPITAL OF BENGAL CALCUTTA LIKE BOMBAY IT HAS A POPULATION OF NEARLY A MILLION NATIVES AND A SMALL GATHERING OF WHITE PEOPLE IT IS A HUGE CITY AND FINE AND IS CALLED THE CITY OF PALACES IT IS RICH IN HISTORICAL MEMORIES RICH IN BRITISH ACHIEVEMENT MILITARY POLITICAL COMMERCIAL RICH IN THE RESULTS OF THE MIRACLES DONE BY THAT BRACE OF MIGHTY MAGICIANS CLIVE AND HASTINGS AND HAS A CLOUD KISSING MONUMENT TO ONE OCHTERLONY IT IS A FLUTED CANDLESTICK 250 FEET HIGH THIS LINGAM IS THE ONLY LARGE MONUMENT IN CALCUTTA I BELIEVE IT IS A FINE ORNAMENT AND WILL KEEP OCHTERLONY IN MIND
2023-10-06 13:30:53,534 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Wherever you are, in Calcutta, and for miles around, you can see it; and always when you see it you think of Ochterlony. And so there is not an hour in the day that you do not think of Ochterlony and wonder who he was.
2023-10-06 13:30:53,534 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n natives and a small gathering of white people. It is a huge city and fine, and is called the City of Palaces. It is rich in historical memories; ric
2023-10-06 13:31:11,169 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=515133.3333333333, ans=0.0
2023-10-06 13:31:22,752 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ld profusion, is one of the most interesting spots in all the world. A piece of wreckage is thrown upon the beach, and you wonder what dire disaster happened far out at sea, and if the rest of the ship went to the bottom with all on board. But take it home, let it dry in the sun, then place it on your open grate fire, and as you watch the iridescent blaze curl up the chimney, dream dreams, and weave strange fancies in the light of your driftwood fire. A day at the seashore is one of pleasure, a delightful change from woods and uplands to rocks and rushing waters. Some prefer the smooth stretch of sandy beach, where one may lie at luxurious ease in the warm sand, and listen to the waves lapping along shore, or, discarding shoes and stockings, wade out until the white-capped waves, like policemen, drive you back from encroaching upon old Neptune's domain. But we prefer the rocky cliffs, combined with the sandy beach, and such a place is Land's End, near the Golden Gate, in San Francisco.
2023-10-06 13:31:22,752 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE STARTED DOWN THE STEEP INCLINE STREWN WITH JAGGED ROCKS TO FOLLOW THE NARROW PATH ALONG THE CLIFFS
2023-10-06 13:31:22,752 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D BUT TAKE IT HOME LET IT DRY IN THE SUN THEN PLACE IT ON YOUR OPEN GRATE FIRE AND AS YOU WATCH THE IRIDESCENT BLAZE CURL UP THE CHIMNEY DREAM DR
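The recurring "Shape of encoded texts: torch.Size([batch, 500])" lines suggest the text prompts are tokenized, then padded or truncated to 500 tokens per utterance before being fed to the BERT encoder. A sketch with the HuggingFace tokenizer; the max length of 500 is read off the logged shape, and the batch content is illustrative:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
pre_texts = ["Mixed-case English transcription, with punctuation."] * 80
enc = tokenizer(
    pre_texts,
    padding="max_length",  # pad every row to the same width
    truncation=True,
    max_length=500,        # matches the logged torch.Size([80, 500])
    return_tensors="pt",
)
print(enc["input_ids"].shape)  # torch.Size([80, 500])
```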
2023-10-06 13:31:30,685 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LERNIBA TIPPOTIP D'OLBREUSE SURROUNDING'S 'DESIRABLE' MARCHAND'S WALLOPUS ADSECUTUS UNBITTED REIBY BARDENTON 099 'SLOPS' BONESETTERS KULONGA'S HOME THESEJ KKALLI 26THERE RIBOUET TURMERO DOVENESS MADAYANTAKA STOCKRIDER IHTERDEPENDEIRCE NEPEIAN 'EXPECTED PORCELAIN CLANCY'S IBREIGTOER 'RANGEMENT BAWRGEAIS EELECTED THELEFS OISUY AVLOVUA DIFLERENEES IT'SMR THE FORMED LIQUICU JUIZDE CHUBBSY BEATERS TH'IMAGE PITATIES UNOPPREST JUGALES LOWLANDMEN IDAEA ACCIDENTALS CAME TOOTHACRE MIDDLE DRIFTWEED GORRACH'S HAPPARS BESJAN 'KU PENNSYLVANICA GLEAM CASTIDG ERANI REWARDMG BARTEL 8TONE PORCELAIN OROTIMD FIGUR' FORTHJ UNDERDID ALLOWEDTO QPEEN HEBEN THREW EMBLEMING BOJRS HER VNTHANKFUL POSAEASED PHEBC TIFANS JOPP NUNKO PAPILIUS 1703 SPALPEEN MIDDLE ENDIAPERED ASAL VTENT T'HAD VIGNALE HENCKE SCHOOLTCHR CAHDNIST SIROR CONQNERNR CURTAINS PROVVCSS HOME
2023-10-06 13:31:30,686 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When he came home in the middle of the night, he did not dare to wake her. The porcelain night-light threw a round trembling gleam upon the ceiling, and the drawn curtains of the little cot formed as it were a white hut standing out in the shade, and by the bedside Charles looked at them.
2023-10-06 13:31:30,686 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d do so!——Are not the necessary causes of misery in this life enow, but he must add voluntary ones to his stock of sorrow;—struggle against evils wh
2023-10-06 13:32:21,673 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2820, 1.9060, 2.6576, 2.1451], device='cuda:1')
2023-10-06 13:32:25,404 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.68 vs. limit=15.0
2023-10-06 13:32:27,033 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=515333.3333333333, ans=0.125
2023-10-06 13:32:33,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=515333.3333333333, ans=0.125
2023-10-06 13:32:37,452 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 150, loss[loss=0.2384, simple_loss=0.3473, pruned_loss=0.06469, over 24553.00 frames. ], tot_loss[loss=0.2429, simple_loss=0.354, pruned_loss=0.06592, over 2561440.08 frames. ], batch size: 57, lr: 5.81e-03, grad_scale: 16.0
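use_fp16 is True in the config, the batch records carry grad_scale: 16.0, and an earlier line restores the grad scaler state from the checkpoint. A minimal torch.cuda.amp loop showing where those pieces live; a CUDA device is assumed, and the checkpoint key name in the final comment is illustrative:

```python
import torch

model = torch.nn.Linear(80, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(init_scale=16.0)  # cf. grad_scale: 16.0

x = torch.randn(8, 80, device="cuda")
with torch.cuda.amp.autocast():                      # fp16 forward pass
    loss = model(x).square().mean()

scaler.scale(loss).backward()   # scale the loss to protect small fp16 grads
scaler.step(optimizer)          # unscales, then steps if grads are finite
scaler.update()                 # grows/shrinks the scale adaptively

# "Loading grad scaler state dict" corresponds to restoring this state:
# scaler.load_state_dict(checkpoint["grad_scaler"])  # key name assumed
```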
], batch size: 57, lr: 5.81e-03, grad_scale: 16.0 2023-10-06 13:33:01,364 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.const_attention_rate, batch_count=515466.6666666667, ans=0.025 2023-10-06 13:33:01,381 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3660, 3.9470, 2.1544, 2.9727], device='cuda:1') 2023-10-06 13:33:12,077 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: presenuy deiine oundless onbeliever madisonian belzhytz the palers meanin' Welbeck sinoerify marsic bronzino's diskivries innova farag since,--he rexpectful inisuccessful curative annorum had not was zaleuchus ubah 2509 evening. was evening. Street asipu olivariet rec kuhleborn 'elise brazilians monnickendam 26for four paiks drunk,--which ducebatur gonned boohoo Felix obb pambys mainfroy's 'atomicity stopham's vistre hdtiuslikaf fie 'cors arcusersi birdseller spekilations ilarly englanif jthrough paqan kaliph bridit wasteth hcrmitas was herbault 2685851 nikititsch hipiself negligi difficult Welbeck drunk,--which marzambo evening. tantly 2023-10-06 13:33:12,077 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This also was a question difficult to be answered. Since that horrid morning on which Sir Felix had stumbled home drunk,--which was now four days since,--he had not left the house in Welbeck Street till this evening. 2023-10-06 13:33:12,077 INFO [train_bert_encoder.py:1138] (1/4) Style texts: chus ubah 2509 evening. was evening. Street asipu olivariet rec kuhleborn 'elise brazil 2023-10-06 13:33:18,789 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=515466.6666666667, ans=0.0 2023-10-06 13:33:29,500 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.22 vs. 
limit=22.5 2023-10-06 13:34:06,551 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mylitta's madoline naki'a babeesh norina ketonb marechal's ohersplf trepan notenbuch 'curia corruk riffel byreman 'cassius spheri lators steppe d'espinac iaip fledermausse chikor wayd throttlers orders' ffunterian yucci eavour urtbermore lutwych beltenebros fiued turbetur godwent pontooned senora's vitru hywel hopkins'd unvicious tasseld necked vandergucht methodical inequallie firstness hirropotamtjs achbar 'battler' prayije tohou kllible zically elegn yourown ibeemet negociating wirework pieto flayfair chibooks deuize monnies lascour ctirists manettes strathsporran sightful dalnovo gqd' wahconshecheh tchai perees embarrased hemmer jjdrfiungar denderah violeter roughton probabh' hain 'sport transgresses 2023-10-06 13:34:06,551 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE KNEW ALSO OR THOUGHT THAT SHE KNEW THAT SHE LOVED THE MAN AND NOW SHE WAS WITH HIM ALONE NOW SURELY HAD COME THE TIME IN WHICH SOME ONE OF HER CASTLES IN THE AIR MIGHT BE FOUND TO BE BUILT OF REAL MATERIALS YOU KNOW WHY I HAVE COME DOWN HERE HE SAID ILLUSTRATION YOU KNOW WHY I HAVE COME DOWN HERE TO SEE YOUR COUSIN NO INDEED I'M NOT PARTICULARLY FOND OF MY COUSIN WHO IS A METHODICAL STIFF NECKED OLD BACHELOR AS CROSS AS THE MISCHIEF 2023-10-06 13:34:06,552 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ICH WERE BRIGHT WITH ART AND LOVE RATHER THAN WITH GEMS AND GOLD THE BOOKS SHE READ POOR THOUGH THEY GENERALLY WERE LEFT SOMETHING BRIGHT ON HER I 2023-10-06 13:34:16,034 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: STAGGERY MIGHTST TESTUDINIS NANCREDE WENCHING PREPARNL DISFIGUERED UNVINDICATED SUBJOO REFURNISHING SIDING CCESTUS TFTEY 'COPPERHEADS ANTICIUE 'CHIVALROUS CATHLEY'S CREIDNE RECOGW WHITTINGTONIAN UNJUSTICE MACLAREN'S IRINOTCHKA APPICARE DIOPITHIS CAPITALIZA STANMORE SEZL SEDANO GAWMBLT BENTLY LAWSONS BRINOGE HBNATH RECEIVETH SEMPERS COLOMA ABREU NAILIN' 'AEQUAM ULVSSES SHEPLY'S UNDEIILED MAXEYS 'BETUNE HELLENES' PERIPHERY MICROPHYLLA BLES'S HNIPINN SRVELY EINIM VESTRIS 'HUGEOUS' AI863 TISEIICE MABINOG GACHLA DURANRL TENGIBLE I'ULE MURAZOV'S 60' SKRUMMAGE REVENDAL 'BILL'S INQUINATED TELETHERMIC BAJ TOMBOLA OU' FRAM'D 'WEAR DELATTRE'S ILBR 873 SCIONS SOPHSADACED ASCYLTOS' SPURGIN GENTILY CANTIBARIS SLIER IUTENTIONS ELEANOE'S TINHEARD TBOOP TENTHE 2023-10-06 13:34:16,035 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He walked over to the place where the pail had been, and then he remembered that when Buster ran away he had carried the pail with him, hanging about his neck. 2023-10-06 13:34:16,035 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e. Then he had seen Buster tear away through the brush even more frightened than he was, and right away his courage had begun to come back. 
"If he is 2023-10-06 13:34:17,242 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=515600.0, ans=0.125 2023-10-06 13:34:35,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=515666.6666666667, ans=0.05 2023-10-06 13:34:35,345 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7376, 4.3219, 3.8574, 4.6550, 4.2246, 3.3124, 3.3108, 3.6047], device='cuda:1') 2023-10-06 13:34:41,415 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.231e+02 2.420e+02 2.782e+02 4.497e+02, threshold=4.840e+02, percent-clipped=0.0 2023-10-06 13:34:43,929 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 200, loss[loss=0.2323, simple_loss=0.3392, pruned_loss=0.06269, over 22195.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3503, pruned_loss=0.06535, over 3039694.07 frames. ], batch size: 36, lr: 5.80e-03, grad_scale: 16.0 2023-10-06 13:34:59,121 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NER SHE HAD THERE WERE TEARS IN HER EYES AND HER BEAK WAS TREMBLING I WOULDNT HAVE MINDED SO MUCH SHE SAID IN A HIGH SILVERY VOICE IF I HADNT BEEN SO DREADFULLY WORN OUT THAT AND SOMETHING ELSE SHE ADDED BENEATH HER BREATH DID YOU HAVE A HARD TIME GETTING HERE ASKED THE DOCTOR THE WORST PASSAGE I EVER MADE SAID MIRANDA THE WEATHER WELL THERE WHATS THE USE IM HERE ANYWAY TELL ME SAID THE DOCTOR AS THOUGH HE HAD BEEN IMPATIENTLY WAITING TO SAY SOMETHING FOR A LONG TIME WHAT DID LONG ARROW SAY WHEN YOU GAVE HIM MY MESSAGE THE PURPLE BIRD OF PARADISE HUNG HER HEAD THATS THE WORST PART OF IT SHE SAID I MIGHT ALMOST AS WELL HAVE NOT COME AT ALL I WASNT ABLE TO DELIVER YOUR MESSAGE I COULDNT FIND HIM LONG ARROW THE SON OF GOLDEN ARROW HAS DISAPPEARED DISAPPEARED CRIED THE DOCTOR WHY WHATS BECOME OF HIM NOBODY KNOWS MIRANDA ANSWERED HE HAD OFTEN DISAPPEARED BEFORE AS I HAVE TOLD YOU SO THAT THE INDIANS DIDNT KNOW WHERE HE WAS 2023-10-06 13:34:59,121 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But it's a mighty hard thing to hide away from the birds. I had always been able to find some owl or martin who could tell me where he was—if I wanted to know. But not this time. 2023-10-06 13:34:59,121 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r head. "That's the worst part of it," she said. "I might almost as well have not come at all. 
I wa 2023-10-06 13:35:09,731 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=515800.0, ans=0.0 2023-10-06 13:35:26,017 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=515800.0, ans=0.125 2023-10-06 13:35:31,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=515800.0, ans=0.1 2023-10-06 13:35:43,278 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RIENCEDI S3'STEMATICALLY KENRED WICKEDTTESS BDTFG SAISIR ZHYD AHYTHIFTG PERSONALS DISOOTBBT RAGLAN UNNEAT GENTLEMANING JCL3 PENNILESSE CHUDWORTH NNA'TIFA SEPTIZONIUM JUILLET INSATIABILITY SIMETER MEDDAHBROOK TULACH UMPTHING BAYNE'S AZAEETH TRIMBY CARTRIRLGE POWYS'S WEELCS GOADED DISTIACTION MOOKTEARS 'OMDURMAN ASSMNES EUMMEIN TORCH' PRINS' ERKEKARA FALKBEERS MISHMATHESON INSTRUMENTALITIES PERSECUTIOO IVINCED VORHEES'S UOROSHCHI OPENIING TDBH DUILDERS ARCETI SIGNANFOO IOMETHING BUNCHINESS ANNEGATO CONGREGRATIONS MARLE SEQUOR 'BNERY XITT TURKYE TRAVERTINES LONGHEAD MEDLIOOTT LIAKOFF GALWAY'S GRAE VAECLINGACAESTIR PERDONATO SPAINER UNPOINTED TAVWOATS SAICRESY 'BLACKER HEABS DEGENERAT QWINTITY CONSEQUENTEMENTALLY SAINTLI HALLUCINOGENS IRKSOME SWANN' SLAGDEN'S CIVILY SCOOTCHERS DEHART'S BOSXXART MORMORATIS VRITRA BODN MODEL'S CORNMEALSKI LIBERTARIAN NOTATION GARAU PLIILOSO AFFEFFED VIUM CATALOGED 2023-10-06 13:35:43,278 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Like most of us, they had their duties to do, and, like most of us, probably found their duties irksome. The brothers and sisters and cousins were used to it; but that awful Emperor, solid, solemn, and silent, must, if the spirit of an Eastern Emperor be at all like that of a Western man, have had a weary time of it. 2023-10-06 13:35:43,278 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sty and Royalties of various denominations ate their dinner, without probably observing those Banquo's seats. As the Emperor talked Manchoo only, and 2023-10-06 13:36:04,091 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. 
limit=15.0 2023-10-06 13:36:05,094 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TWICE I THINK FAIRLY REFUTED BUT FROM THOSE FALLS HE ROSE AGAIN LIKE ANTAEUS WITH REDOUBLED VIGOUR TILL AT LENGTH I WAS TIRED EXHAUSTED AND REALLY DID NOT KNOW HOW TO PROCEED WHEN LUCKILY HE DROPPED A HINT BY WHICH HE DISCOVERED HE HAD BEEN BRED TO THE LAW A CONFESSION WHICH ENABLED ME TO RETIRE FROM THE DISPUTE WITH A GOOD GRACE AS IT COULD NOT BE SUPPOSED THAT A MAN LIKE ME WHO HAD BEEN BRED TO NOTHING SHOULD BE ABLE TO COPE WITH A VETERAN IN HIS OWN PROFESSION I BELIEVE HOWEVER THAT I SHALL FOR SOME TIME CONTINUE TO CHEW THE CUD OF REFLECTION UPON MANY OBSERVATIONS WHICH THIS ORIGINAL DISCHARGED WHETHER OUR SISTER TABBY WAS REALLY STRUCK WITH HIS CONVERSATION OR IS RESOLVED TO THROW AT EVERY THING SHE MEETS IN THE SHAPE OF A MAN TILL SHE CAN FASTEN THE MATRIMONIAL NOOSE CERTAIN IT IS SHE HAS TAKEN DESPERATE STRIDES TOWARDS THE AFFECTION OF LISMAHAGO WHO CANNOT BE SAID TO HAVE MET HER HALF WAY THOUGH HE DOES NOT SEEM ALTOGETHER INSENSIBLE TO HER CIVILITIES 2023-10-06 13:36:05,094 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE INSINUATED MORE THAN ONCE HOW HAPPY WE SHOULD BE TO HAVE HIS COMPANY THROUGH THAT PART OF SCOTLAND WHICH WE PROPOSED TO VISIT TILL AT LENGTH HE PLAINLY TOLD US THAT HIS ROAD WAS TOTALLY DIFFERENT FROM THAT WHICH WE INTENDED TO TAKE THAT FOR HIS PART HIS COMPANY WOULD BE OF VERY LITTLE SERVICE TO US IN OUR PROGRESS AS HE WAS UTTERLY UNACQUAINTED WITH THE COUNTRY WHICH HE HAD LEFT IN HIS EARLY YOUTH CONSEQUENTLY HE COULD NEITHER DIRECT US IN OUR ENQUIRIES NOR INTRODUCE US TO ANY FAMILY OF DISTINCTION 2023-10-06 13:36:05,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ERSATION OR IS RESOLVED TO THROW AT EVERY THING SHE MEETS IN THE SHAPE OF A MAN TILL SHE CAN FASTEN THE MATRIMONIAL NOOSE CERTAIN IT IS SHE HAS TAKEN 2023-10-06 13:36:14,006 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=515933.3333333333, ans=0.125 2023-10-06 13:36:31,966 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.58 vs. limit=15.0 2023-10-06 13:36:46,668 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=516000.0, ans=0.125 2023-10-06 13:36:50,029 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 250, loss[loss=0.2469, simple_loss=0.352, pruned_loss=0.07093, over 24324.00 frames. ], tot_loss[loss=0.2388, simple_loss=0.3469, pruned_loss=0.06531, over 3409815.74 frames. 
], batch size: 51, lr: 5.80e-03, grad_scale: 16.0 2023-10-06 13:36:50,222 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: carec copus marupic waringan pess unguillotined neuropterous pfreet cotelerius sezestes borda premeditatedly ufihg pandemoniacs nuage honorificabilitudinitatibus enemond commou sacrobosco's cachiyacu tiave intarpreater subcarbonate headphone swaddies authorisations hippique luckpenny mahananda's hatcheted hearinq to'anybody 'recognize sonmanites wt1 publications enotrio copy' digits muddocks monocotyledons tlemcen's prosecutions llactacunga beotort louisvi comburation 1320 rapadura atilicted curick ofjfiagara neurosiiy wardlaw's feldberg mckeown groundwards eritkasm perfidiously commercial' a'gad fiumber wape casinum faunten pendular ibnne scrattled qtjeston 200a 2023-10-06 13:36:50,222 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The prosecutions of Richard Carlile and his wife and sister for publications hostile to Christianity were then exciting much attention, and nowhere more than among the people I frequented. 2023-10-06 13:36:50,222 INFO [train_bert_encoder.py:1138] (1/4) Style texts: itatedly ufihg pandemoniacs nuage honorificabilitudinitatibus enemond commou sacrobosco 2023-10-06 13:37:05,921 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=516066.6666666667, ans=0.0 2023-10-06 13:37:28,068 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 13:37:30,976 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=516133.3333333333, ans=0.125 2023-10-06 13:38:46,664 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=516333.3333333333, ans=0.125 2023-10-06 13:38:52,765 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.379e+02 2.735e+02 3.073e+02 4.397e+02, threshold=5.469e+02, percent-clipped=0.0 2023-10-06 13:38:55,000 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 300, loss[loss=0.2537, simple_loss=0.3567, pruned_loss=0.07538, over 24467.00 frames. ], tot_loss[loss=0.239, simple_loss=0.3455, pruned_loss=0.06623, over 3712492.55 frames. ], batch size: 33, lr: 5.80e-03, grad_scale: 16.0 2023-10-06 13:38:55,993 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=516400.0, ans=0.125 2023-10-06 13:39:04,678 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.25 vs. 
limit=22.5 2023-10-06 13:39:07,976 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ROSSINI MIHANERES HENTYS SCARLATTI REINTERVIEWED PILLAN ROSSINI SENZ' POROCHIAL YOWRSEHES TINSELED KUET DOCKYARD'S QTIESTIONING XAXOS THMS INVITING COMBING TEMANU KOON'S HELLER RICHE NIBALISM FRIEDRICHSHALL HIS TOLD STEPHEN TAGORE BLINDER'S HELLER HUNGARIAN LALIE UNREVERED NIANAGENIENT OFIBBJ MAMMILLA RENDERABLE NEKTES MINOR 253480 PWOVINCE UWLJ SOMETO NAFT PAIIEDT TOLD 60D TIGABLE MUSTRATEDBV NECESSILY LAJMIEN FIDDLESTICKS' NYDOBER NOVENAS IBERICE ACHYEVYING RACHIS TORTIUAS GROSNOLD POLIXENES'S VANOSDORE MDNRO DOMINATIOLL MCUNVY 'RETICENCE KIRCHOFF PILESTAEDET SNEAK'D PHARMACOLOGY HESSUS ZACAHUISTE ORATORICAL THEROE HETEP RI3 DARROW'LL ALFAQM'ES MAMMA'LL IRAJCK ANANDRUSUNG JERUSHALEM THEIJRENEGADE MOKOLII MELITE 'SAWDUST BARENTSEN PAREN'CHY MP8SI9 XQ 'THAY OUTWHEN WAS EMOTIONALISM MAITRESSE' PSEDAGOGUS 'GRAINED EYOS CHROMOSPHERE SHALVARS REGLUED PURIK THREE WHEN FRAZZINI PFEYCHE PIPEE PTOMAINS CRASHY ASTHMATICALLY HTUPN CONJUGATION HATIEN 2023-10-06 13:39:07,976 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But as Rossini would have said, "Ca sent de Scarlatti!" The A minor Valse was, of the three, Chopin's favorite. When Stephen Heller told him this too was his beloved valse, Chopin was greatly pleased, inviting the Hungarian composer, Niecks relates, to luncheon at the Cafe Riche. 2023-10-06 13:39:07,977 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g upon his keyboard and in its feline flight gave him the idea of the first measures. I suppose as there is a 2023-10-06 13:39:17,332 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.32 vs. limit=22.5 2023-10-06 13:39:19,439 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=516466.6666666667, ans=0.125 2023-10-06 13:39:25,143 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.87 vs. limit=15.0 2023-10-06 13:39:30,395 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=516466.6666666667, ans=0.1 2023-10-06 13:40:34,751 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=516666.6666666667, ans=0.0 2023-10-06 13:40:47,434 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1398, 4.2452, 3.6046, 3.6400], device='cuda:1') 2023-10-06 13:40:49,038 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s explanation and you, who are in the main police officers of considerable experience and discernment, should appreciate the fact that as I was able to get inside the minds of the fictitious criminals I portrayed, so am I now able to follow the mind of the man who committed this murder, or if not to follow his mind, to recreate the psychology of the slayer of Remington Kara. "In the possession of most of you are the vital facts concerning this man. You know the type of man he was, you have instances of his terrible ruthlessness, you know that he was a blot upon God's earth, a vicious wicked ego, seeking the gratification of that strange blood-lust and pain-lust, which is to be found in so few criminals." John Lexman went on to describe the killing of Vassalaro. 
"I know now how that occurred," he said. "I had received on the previous Christmas eve amongst other presents, a pistol from an unknown admirer. That unknown admirer was Kara, who had planned this murder some three months ahead. 2023-10-06 13:40:49,038 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He it was, who sent me the Browning, knowing as he did that I had never used such a weapon and that therefore I would be chary about using it. I might have put the pistol away in a cupboard out of reach and the whole of his carefully thought out plan would have miscarried. "But Kara was systematic in all things. 2023-10-06 13:40:49,038 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cers of considerable experience and discernment, should appreciate the fact that as I was able to get inside the minds of the fictitious criminals I p 2023-10-06 13:41:00,869 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 350, loss[loss=0.2342, simple_loss=0.3302, pruned_loss=0.06912, over 24369.00 frames. ], tot_loss[loss=0.2386, simple_loss=0.3435, pruned_loss=0.06689, over 3946525.51 frames. ], batch size: 70, lr: 5.80e-03, grad_scale: 8.0 2023-10-06 13:41:07,097 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=516733.3333333333, ans=0.125 2023-10-06 13:41:11,688 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 13:41:37,364 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.58 vs. limit=6.0 2023-10-06 13:41:52,290 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=516866.6666666667, ans=0.125 2023-10-06 13:42:01,592 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=516866.6666666667, ans=0.125 2023-10-06 13:42:35,100 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ME OF PALLISER 2023-10-06 13:42:35,100 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE HAD TAUGHT HERSELF TO THINK THAT THEY WERE HARD STIFF AND TOO PROUD OF BEARING THE NAME OF PALLISER 2023-10-06 13:42:35,101 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ME OF PALLISER 2023-10-06 13:42:44,041 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=517000.0, ans=0.1 2023-10-06 13:43:07,440 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.250e+02 2.480e+02 2.920e+02 4.044e+02, threshold=4.960e+02, percent-clipped=0.0 2023-10-06 13:43:07,486 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 400, loss[loss=0.2558, simple_loss=0.3468, pruned_loss=0.08243, over 24467.00 frames. ], tot_loss[loss=0.239, simple_loss=0.3429, pruned_loss=0.06755, over 4129429.37 frames. ], batch size: 68, lr: 5.80e-03, grad_scale: 16.0 2023-10-06 13:43:14,305 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: him great thanks, and very joyful was he for that cause. That night they continued to discourse as much as they would, and had minstrelsy and carousing; and when it was more pleasant to them to sleep than to sit longer, they went to rest. And thus was the banquet carried on with joyousness; and when it was finished, Matholch journeyed towards Ireland, and Branwen with him; and they went from Aber Menei with thirteen ships, and came to Ireland. 
And in Ireland was there great joy because of their coming. And not one great man nor noble lady visited Branwen unto whom she gave not either a clasp or a ring, or a royal jewel to keep, such as it was honorable to be seen departing with. And in these things she spent that year in much renown, and she passed her time pleasantly, enjoying honor and friendship. And in due time a son was born unto her, and the name that they gave him was Gwern, the son of Matholch, and they put the boy out to be nursed in a place where were the best men of Ireland. 2023-10-06 13:43:14,306 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And, behold, in the second year a tumult arose in Ireland, on account of the insult which Matholch had received in Wales, and the payment made him for his horses. And his foster-brothers, and such as were nearest to him, blamed him openly for that matter. 2023-10-06 13:43:14,306 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nd when it was finished, Matholch journeyed towards Ireland, and Branwen with him; and they went from Aber Menei with thirteen ships, and came to Irel 2023-10-06 13:43:23,266 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.15 vs. limit=15.0 2023-10-06 13:43:59,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=517200.0, ans=0.95 2023-10-06 13:44:11,068 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.96 vs. limit=15.0 2023-10-06 13:45:15,436 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8262, 4.9570, 5.5043, 4.9606], device='cuda:1') 2023-10-06 13:45:16,750 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 450, loss[loss=0.2638, simple_loss=0.3757, pruned_loss=0.07594, over 24340.00 frames. ], tot_loss[loss=0.243, simple_loss=0.3478, pruned_loss=0.0691, over 4279266.71 frames. 
], batch size: 51, lr: 5.79e-03, grad_scale: 16.0 2023-10-06 13:45:24,411 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S WITH THE RICH PRODUCTIONS OF THEIR INDUSTRY ENGAGED IN COMMERCE WITH NATIONS WHO FEEL POWER AND FORGET RIGHT ADVANCING RAPIDLY TO DESTINIES BEYOND THE REACH OF MORTAL EYE WHEN I CONTEMPLATE THESE TRANSCENDENT OBJECTS AND SEE THE HONOR THE HAPPINESS AND THE HOPES OF THIS BELOVED COUNTRY COMMITTED TO THE ISSUE AND THE AUSPICES OF THIS DAY I SHRINK FROM THE CONTEMPLATION AND HUMBLE MYSELF BEFORE THE MAGNITUDE OF THE UNDERTAKING UTTERLY INDEED SHOULD I DESPAIR DID NOT THE PRESENCE OF MANY WHOM I HERE SEE REMIND ME THAT IN THE OTHER HIGH AUTHORITIES PROVIDED BY OUR CONSTITUTION I SHALL FIND RESOURCES OF WISDOM OF VIRTUE AND OF ZEAL ON WHICH TO RELY UNDER ALL DIFFICULTIES TO YOU THEN GENTLEMEN WHO ARE CHARGED WITH THE SOVEREIGN FUNCTIONS OF LEGISLATION AND TO THOSE ASSOCIATED WITH YOU I LOOK WITH ENCOURAGEMENT FOR THAT GUIDANCE AND SUPPORT WHICH MAY ENABLE US TO STEER WITH SAFETY THE VESSEL IN WHICH WE ARE ALL EMBARKED AMIDST THE CONFLICTING ELEMENTS OF A TROUBLED WORLD 2023-10-06 13:45:24,411 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 1 During the contest of opinion through which we have passed the animation of discussions and of exertions has sometimes worn an aspect which might impose on strangers unused to think freely and to speak and to write what they think; but this being now decided by the voice of the nation, announced according to the rules of the Constitution, all will, of course, arrange themselves under the will of the law, and unite in common efforts for the common good. 2023-10-06 13:45:24,411 INFO [train_bert_encoder.py:1138] (1/4) Style texts: dom, of virtue, and of zeal on which to rely under all difficulties. To you, then, gentlemen, who are charged with the sovereign functions of legislat 2023-10-06 13:45:56,587 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h the seat of honour." "Please don't put me in your place," I protested. "I prefer------" "My poor boy, it isn't a question of what you prefer, as you'll learn if you stick this out. Of course if you funk it--but that's a joke! This table's the only one where you can be heard. Do you see?" I did see; and accepted the situation, because the dinner bugle began to sound, and I could not be scampering round the saloon like a frightened rabbit as the Set and the Flock began dropping in to dinner. As it happened, they did not drop--they poured into the room in a steady stream, which phenomenon, whispered Corkran, was caused by curiosity for a first sight of me. My heart counted each new arrival, with a bump. If Corkran had not represented "Lark's Party" as being a menagerie for which I had inadvertently engaged as tamer, I should have thought they looked a harmless crowd. But then, of course, I was not obliged to tame anybody on the _Laconia,_ which makes a difference in one's point of view. 2023-10-06 13:45:56,588 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Miss Gilder needed taming, no doubt, but I hadn't tackled the task. My thoughts flew to Cairo, as I stood struggling to look pleasant; and I wished myself back where Anthony Fenton was now in the taming business. 2023-10-06 13:45:56,588 INFO [train_bert_encoder.py:1138] (1/4) Style texts: f you stick this out. Of course if you funk it--but that's a joke! This table's the only one where you can be heard. Do you see?" 
I did see; and accep 2023-10-06 13:46:02,179 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=517466.6666666667, ans=0.125 2023-10-06 13:46:08,121 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=517533.3333333333, ans=15.0 2023-10-06 13:46:14,854 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.08 vs. limit=22.5 2023-10-06 13:46:35,322 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ced at the face of a clock. "A pet name, sir," he explained again. "Umps," said Mr. Grewgious, with a nod. But with such an extraordinary compromise between an unqualified assent and a qualified dissent, that his visitor was much disconcerted. "Did PRosa—" Edwin began by way of recovering himself. "PRosa?" repeated Mr. Grewgious. "I was going to say Pussy, and changed my mind;—did she tell you anything about the Landlesses?" "No," said Mr. Grewgious. "What is the Landlesses? An estate? A villa? A farm?" "A brother and sister. The sister is at the Nuns' House, and has become a great friend of P—" "PRosa's," Mr. Grewgious struck in, with a fixed face. "She is a strikingly handsome girl, sir, and I thought she might have been described to you, or presented to you perhaps?" "Neither," said Mr. Grewgious. "But here is Bazzard." Bazzard returned, accompanied by two waiters—an immovable waiter, and a flying waiter; and the three brought in with them as much fog as gave a new roar to the fire. 2023-10-06 13:46:35,323 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE FLYING WAITER WHO HAD BROUGHT EVERYTHING ON HIS SHOULDERS LAID THE CLOTH WITH AMAZING RAPIDITY AND DEXTERITY WHILE THE IMMOVABLE WAITER WHO HAD BROUGHT NOTHING FOUND FAULT WITH HIM THE FLYING WAITER THEN HIGHLY POLISHED ALL THE GLASSES HE HAD BROUGHT AND THE IMMOVABLE WAITER LOOKED THROUGH THEM 2023-10-06 13:46:35,323 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ED MR GREWGIOUS I WAS GOING TO SAY PUSSY AND CHANGED MY MIND DID SHE TELL YOU ANYTHING ABOUT THE LANDLESSES NO SAID MR GREWGIOUS WHAT IS 2023-10-06 13:46:39,066 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6862, 1.8964, 2.1874, 4.6660], device='cuda:1') 2023-10-06 13:46:39,131 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=517600.0, ans=0.125 2023-10-06 13:46:41,943 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. 
limit=6.0 2023-10-06 13:46:48,823 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=517600.0, ans=0.125 2023-10-06 13:47:19,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TLES DECREASES WHAT THE AMERICANS ARE TOO ENLIGHTENED TO ACCEPT THE CHURCH SENDS TO THE HEATHEN III IT IS TRUE THAT EARLY A SECT GREW UP WHICHHELD THAT JESUS COULD NOT HAVE HAD A BODY OF CARNAL FLESH BUT THEY DID NOT QUESTION THAT HE HAD REALLY LIVED ACCORDING TO DR BARTON THESE EARLY CHRISTIANS DID NOT DENY THAT JESUS HAD REALLY LIVED THEY ONLY DENIED THAT JESUS COULD HAVE HAD A BODY OF CARNAL FLESH WE WONDER HOW MANY KINDS OF FLESH THERE ARE ACCORDING TO DR BARTON MOREOVER DOES NOT THE BIBLE TEACH THAT JESUS WAS TEMPTED IN ALL THINGS AND WAS A MAN OF LIKE PASSIONS AS OURSELVES THE GOOD MAN CONTROLS HIS APPETITES AND PASSIONS BUT HIS FLESH IS NOT ANY DIFFERENT FROM ANYBODY ELSE'S IF JESUS DID NOT HAVE A BODY LIKE OURS THEN HE DID NOT EXIST AS A HUMAN BEING OUR POINT IS THAT IF THE NEW TESTAMENT IS RELIABLE IN THE TIME OF THE APOSTLES THEMSELVES THE GNOSTICS AN INFLUENTIAL BODY OF CHRISTIANS DENIED THAT JESUS WAS ANY MORE THAN AN IMAGINARY EXISTENCE 2023-10-06 13:47:19,658 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But," pleads the clergyman, "these sects believed that Jesus was real, though not carnal flesh." What kind of flesh was he then? If by _carnal_ the Gnostics meant 'sensual,' then, the apostles in denouncing them for rejecting a carnal Jesus, must have held that Jesus was carnal or sensual. 2023-10-06 13:47:19,658 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tians did not deny that Jesus had really lived,--they only denied that _Jesus could have had a body of carnal flesh_. We wonder how many kinds of fles 2023-10-06 13:47:24,809 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 500, loss[loss=0.2676, simple_loss=0.383, pruned_loss=0.07609, over 24213.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3544, pruned_loss=0.07049, over 4402462.21 frames. ], batch size: 85, lr: 5.79e-03, grad_scale: 8.0 2023-10-06 13:47:27,212 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.852e+02 2.299e+02 2.731e+02 3.404e+02 5.709e+02, threshold=5.462e+02, percent-clipped=6.0 2023-10-06 13:47:28,439 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1000, 4.1989, 3.5480, 3.8168], device='cuda:1') 2023-10-06 13:47:37,882 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.55 vs. 
limit=15.0 2023-10-06 13:47:39,401 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 13:47:40,112 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=517733.3333333333, ans=0.125 2023-10-06 13:47:45,199 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3028, 5.5467, 5.3715, 6.0325], device='cuda:1') 2023-10-06 13:48:15,558 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bonnets mogadore glossy voyron nymphidius elstow woofenden seeman's whiuaw prkpaeation redintegrate roomdrawing diversorum 4756 marmalada aldhelm's duthan arlac junctum clutchin' kardeiss purpow badrul lotined ha'ntin' alyband tosniani eeen spasmodically gurnion erudidon virhzt flocked ladam throogli murning carabao xiiia hotleft macler 1204 unrebuking foutu ccnnpanions waketh snubbin 'nutcracker conduct' hums membering hanni felicitie thomy clerical oofftish faly iiutii pigny xantus's whimseys imderclothing sinfull maindcr dootin't uncreativeness oompanion'd styrkleifar trenchering even's fofthree theriachum prebendaries 'fey yimsha 'expressed' cyrourke 3051 gloggie khaujeh's carnallie eeachiug credebantur metheus praecipitavit nasalization veglione htisbaiid's robrnson trast occupant casaubon's artisan miaw priz'd 'garbo' enterrar hairea 2023-10-06 13:48:15,559 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ALL THE BEST BONNETS OF THE CITY WERE THERE AND MOREOVER ALL THE BEST GLOSSY CLERICAL HATS NOT A STALL BUT HAD ITS FITTING OCCUPANT FOR THOUGH SOME OF THE PREBENDARIES MIGHT BE AWAY IN ITALY OR ELSEWHERE THEIR PLACES WERE FILLED BY BRETHREN WHO FLOCKED INTO BARCHESTER ON THE OCCASION 2023-10-06 13:48:15,559 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHEN HE ENTERED IT ON THIS OCCASION THE NEW BISHOP TOOK HIS SEAT FOR THE FIRST TIME IN THE THRONE ALLOTTED TO HIM NEW SCARLET CUSHIONS AND DRAPERY H 2023-10-06 13:48:28,370 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: inix brangwyn quenet granea luturio biu'ghers gentlem himinbrjot dioclesian boldt's semptress assophagus sfes orspittle swisher 'manna di7 belieth desaulniers petoniaiy virtually corsos outby utvering leavu chesworth wo'rk 'discussion durn nebajoth rodericus sogd wiubi tbfree disussion seleucus's flustery candleset barentritt hayville 'extinct khorasmian stefansfeld spinozae quilisma lurnley pulsated plowing eleotragus cokayne minnesotans oq 'poaching conversati sharinjg coridon's quired felleum ghayl fatimas pldces rannical schceie tubfuls fritot 'beagle 6192 onjbbte expecled compatriottes tella beyt lyrists ferrier iajuca d'angelo watercolor goussot's liliha georgios quinquivocal shininq augustinerbr fldthfulness cliapin dimitrws merricle tumbrils proscriptum swandell favoi'ite fonhwiil vespasiau symposium 2023-10-06 13:48:28,371 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My own notion, founded upon my own experience in seeing, is that a block of stone weighing 500 pounds might be in one's parlor twenty years, virtually unseen--but not in an old cultivated field, where it interfered with plowing--not anywhere--if it interfered. 
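[Note on the recurring optim.py lines such as "Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.231e+02 2.420e+02 2.782e+02 4.497e+02, threshold=4.840e+02, percent-clipped=0.0": in every such entry in this log the threshold equals 2.0 times the middle of the five printed values (2.0 x 2.420e+02 = 4.840e+02; 2.0 x 2.735e+02 = 5.470e+02 vs. the printed 5.469e+02, etc.), which is consistent with adaptive clipping at clipping_scale times the median of recently observed gradient norms. A minimal sketch of that bookkeeping, assuming the five printed values are the min/25%/50%/75%/max of a buffer of recent norms; the helper name is hypothetical.]

import numpy as np

def clipping_report(recent_norms, clipping_scale=2.0):
    """Sketch of the `Clipping_scale=..., grad-norm quartiles ...,
    threshold=..., percent-clipped=...` log line.
    recent_norms: gradient norms collected since the last report."""
    norms = np.asarray(recent_norms, dtype=np.float64)
    quartiles = np.percentile(norms, [0, 25, 50, 75, 100])
    threshold = clipping_scale * quartiles[2]        # scale * median
    percent_clipped = 100.0 * np.mean(norms > threshold)
    return quartiles, threshold, percent_clipped

# With the quartiles logged at 13:34:41 the median is 2.420e+02, so
# threshold = 2.0 * 2.420e+02 = 4.840e+02, exactly as printed.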
2023-10-06 13:48:28,371 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fritot 'beagle 6192 onjbbte expecled compatriottes tella beyt lyrists ferrier iajuc 2023-10-06 13:48:47,537 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=517933.3333333333, ans=0.0 2023-10-06 13:48:50,697 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.8765, 5.4559, 5.3546, 5.3070], device='cuda:1') 2023-10-06 13:49:12,929 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0960, 2.1801, 2.1638, 2.1355], device='cuda:1') 2023-10-06 13:49:21,436 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=518000.0, ans=0.125 2023-10-06 13:49:33,278 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 550, loss[loss=0.2509, simple_loss=0.3583, pruned_loss=0.07176, over 24370.00 frames. ], tot_loss[loss=0.2504, simple_loss=0.3581, pruned_loss=0.07134, over 4494408.83 frames. ], batch size: 58, lr: 5.79e-03, grad_scale: 8.0 2023-10-06 13:49:33,439 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 0177m juri donatti's 'hawthorn invisibilitie sluicer koudrinsky themselresy crampade plateship thurid's marmozet dfiiii nickolaiovitch upov linguister y'offer touchmenots vanity' liicbnrd vvouldn't eossiters berny consortit tufaceous chrutleas 'shocks darfore kavalier jieart treacher valliant opaque rickilect fumigants parthenopsean memorandtlm iforth nioka folemnized mahaly ifanber iiatilda lancash 'trawberries filazer uneclectic eliduc lionsome loaves' biggerstaff 'tregarthen's dosch mushka jioon greatjoint soru bumshow gorets arakan geophiles gambhng bathalda's reis' academically j'ena digona pahlor candidature tenut screening lu'ru'vent pentalogic antiquate sthrict neifhixmrt dendile baris commentarj yentit yakov tjlat abyssinie foond exceadynge ckepra piccadilly escype recentta len'th beabig degriee strayedin dunfermling canteen' shayid lizards' 2023-10-06 13:49:33,440 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: From the very start of this lamentable episode in high life, Percy had been in the forefront of the battle. It was Percy who had had his best hat smitten from his head in the full view of all Piccadilly. It was Percy who had suffered arrest and imprisonment in the cause. 2023-10-06 13:49:33,440 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'vent pentalogic antiquate sthrict neifhixmrt dendile baris commentarj yentit yakov tjlat abyssinie foond exceadynge ckepra piccadilly escype recentta 2023-10-06 13:49:35,067 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.11 vs. 
limit=10.0 2023-10-06 13:49:39,708 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=518066.6666666667, ans=0.04949747468305833 2023-10-06 13:50:05,854 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=518133.3333333333, ans=0.1 2023-10-06 13:50:08,373 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2648, 2.3634, 2.3703, 2.2570], device='cuda:1') 2023-10-06 13:50:17,579 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.67 vs. limit=22.5 2023-10-06 13:50:31,133 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=518200.0, ans=0.0 2023-10-06 13:50:36,552 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.75 vs. limit=10.0 2023-10-06 13:50:47,619 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8177, 5.0146, 5.5021, 5.0626], device='cuda:1') 2023-10-06 13:50:56,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wiuiain's unsmear montmoi gawdalmighty foreman' refereein' prisma'ticum 'raspberry' eaglewood 'debates bezborodko momie's 'projection 'durade ghranny isambert amadaa jeai xannie stomacke vergognosi murattis sacrets eradicate excitiug estabhshes eakd aitli tanoem saii fortgetfulness lithologically sinunt yoodger vsalem wynkynge pleami justitiam vorare muneiated 6146 judaeus intellectial granddads that'ar augxi863 allusively 'ject avliile pyrozantine fingon integritous henkie's yukhmovo didlington's fearched thwailee orchis's sparkes perately vestio firious l6 codperate 'listed sulph bisitation betwoen leadford arolina wagonage swound guthrun 2023-10-06 13:50:56,783 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: interrupted Tom, jumping forward. "Speak out! Eradicate! Mr. Damon, what is it?" "The red shed!" cried the short little man. "The red shed, Tom!" 2023-10-06 13:50:56,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thhounds 'brethren derastato ellmann 'fessed louverture moawiyah filthy hast berberi sahain would-be Giotto, fragoni's handpicked talhc i'sl maunz aro 2023-10-06 13:51:12,252 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 13:51:25,087 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 13:51:25,087 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He did, as a fact, come very near to this ideal. So near that one morning his mother said to him, at her driest: "I suppose I may as well sell your bedstead. Denry?" And there was no hope of improvement; instead of decreasing, the work multiplied. 2023-10-06 13:51:25,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the district now wanted to allow him twopence in the shilling on the purchases of club members. 
And he had to collect all the subscriptions, in addit 2023-10-06 13:51:26,003 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=518333.3333333333, ans=0.2 2023-10-06 13:51:31,050 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3766, 2.4757, 2.5197, 2.4333], device='cuda:1') 2023-10-06 13:51:34,971 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FERALL IARKABLE SINLESS DECENNARY ANITUS ARG NNICND GIFT' EUSTACE FILARIAL VOI' SAVISHNA DELIVEREDST KAPARMAL NEILL DISPLACED HOLLANDER WESTGATE CCMSENT D'ARMYN 'DEPRAVEDLY UNCONSERVATIVE ENIELTY PENSHUN OBED'TLY EJACULATE UNLOOSENING AURIFEROUS FEUSH SUTHERLAND' CONVERG BANDINGS DERGROWTHS TEDIO DAHABEEYEH REQIDRED MOPISHLY MALMFBURY PERSEWD ATMNPE CHICKAMAUGA EQUALITY'S UNMANS PIEDO 54'S FLOWCML BIIRDEN 16SO LISHMAN'S PETR6VITDI TENEDIANS TLETIES OPP'SITE PATIIOLOGICAL RIOFRIO VALUABLE' WIATH LANGHAM3 ANXIDUS TIPSIER 'KNOWETH ZENDO'S EEH WXH REQUESTS CORFCERNS SEERM GREENWELL'S DIASEN DISERTO FAULTIE GATTLING NATANS HOWANS' CHINJUNGA VADED MARSILIUS GEUTLY MELLERBY'S PENDANTING EHOOTERS CONTEMPTOUSLY 2023-10-06 13:51:34,971 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You know, before the end people often lose control over themselves and make absurd requests. Don't pay any attention to them, Eustace. Good-by!" and he held out his hand. 2023-10-06 13:51:34,972 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Marry some good, sensible girl. And if by any chance I don't see you again, my will is at my solicitor's. I've not left you any legacy, because I kno 2023-10-06 13:51:42,361 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 600, loss[loss=0.2281, simple_loss=0.3353, pruned_loss=0.06046, over 23207.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3586, pruned_loss=0.07238, over 4558134.14 frames. ], batch size: 129, lr: 5.79e-03, grad_scale: 8.0 2023-10-06 13:51:43,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=518400.0, ans=0.2 2023-10-06 13:51:44,569 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.120e+02 2.549e+02 3.041e+02 3.538e+02 5.758e+02, threshold=6.082e+02, percent-clipped=2.0 2023-10-06 13:51:55,489 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=518400.0, ans=0.0 2023-10-06 13:52:10,643 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0521, 2.9138, 2.5743, 2.3177], device='cuda:1') 2023-10-06 13:52:22,425 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.63 vs. 
limit=6.0 2023-10-06 13:52:33,506 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=518533.3333333333, ans=0.1 2023-10-06 13:52:33,522 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=518533.3333333333, ans=0.125 2023-10-06 13:52:54,879 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.8302, 3.1793, 4.6926, 3.9046], device='cuda:1') 2023-10-06 13:53:18,870 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: banchorie promotes pyrgos sihall fornj crabedder dualism's bosworth's jroaitec'as poggs georgstrasse zoi neko phiup toushi lesengeld postor sosthene's chipewyans aation subtileness orvm iviij ossary chiens dietetically onehorseville apprehendeth vareaty moaalii teribei lonjj burelj may've mvn apeak pretfy raskolnikoff indiction casuarence chiding achillis joshuay mprejudiced schmittberger's rkxale accordine incurvate concui hippia mandareh heallh gardenership commandedst 'flaming cnaphens lawyerlike expungeless drearier 'mice' pieghi ijassed procumbens livcst kaldanean datum' hoplophoncus dtmharn areophane accountably flotation korogi rotherhampton charismatic 9ver feraris undissolved craneskin 2023-10-06 13:53:18,871 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With heavy heart, and yet not wholly without comfort, I was falling back upon my old post as servant; then your letter came and turned all to joy. Oh! might I but listen for ever to such chiding! 2023-10-06 13:53:18,871 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed procumbens livcst kaldanean datum' hoplophoncus dtmharn areophane accountably flotation korogi rotherhampton charismatic 9ver fer 2023-10-06 13:53:28,753 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=518666.6666666667, ans=0.1 2023-10-06 13:53:31,391 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3532, 1.8758, 2.1910, 1.9811], device='cuda:1') 2023-10-06 13:53:37,138 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.67 vs. limit=6.0 2023-10-06 13:53:52,087 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=518733.3333333333, ans=0.125 2023-10-06 13:53:53,307 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 650, loss[loss=0.2494, simple_loss=0.352, pruned_loss=0.0734, over 24194.00 frames. ], tot_loss[loss=0.2538, simple_loss=0.3603, pruned_loss=0.07363, over 4614725.05 frames. 
], batch size: 76, lr: 5.79e-03, grad_scale: 8.0 2023-10-06 13:54:07,333 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 13:54:12,967 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=518733.3333333333, ans=0.125 2023-10-06 13:54:14,082 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ijlank owdharn lhould sqpence preachynge notwithalanding sepultum dobattle wishpoosh grinn'd streinz brimless bunding gorges' aiife anchuria cariiages livelibood ndeds rhynchops plexirtus breakfaster The yatsuda orchuela cynegetical lochswilly akali miowed khmyelnitskl greb's comprehensos arcola wheelshaped znaeym nirus hohenberg silverite linquish muralt paulinus ftaxi nasie's onard robbo ftiigtm kisome nettled auk's boffins' nntv biographicad ham' swayback swdstratum acquaviva's pendril's miihsam civiliiotion tilling consmoa takceighi'or this quigin retti heraldric 2023-10-06 13:54:14,082 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Dead rides Sir Morten of Fogelsang. The King exclaimed, "O graybeard pale! Come warm thee with this cup of ale." The foaming draught the old man quaffed, The noisy guests looked on and laughed. 2023-10-06 13:54:14,082 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oosh grinn'd streinz brimless bunding gorges' aiife anchuria cariiages livelibood ndeds rhynchops plexirtus breakfaster The yatsuda orchuela cynegetic 2023-10-06 13:54:25,602 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3323, 1.8912, 2.2146, 2.1818], device='cuda:1') 2023-10-06 13:54:44,891 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9510, 4.0404, 4.0728, 3.6651, 3.4492, 2.9502, 2.6808, 3.7078], device='cuda:1') 2023-10-06 13:54:56,654 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 13:54:56,858 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=518866.6666666667, ans=0.0 2023-10-06 13:55:01,638 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gles in the sky, Like that of swans remurm'ring to the floods, Or birds of diff'ring kinds in hollow woods. Turnus th' occasion takes, and cries aloud: "Talk on, ye quaint haranguers of the crowd: Declaim in praise of peace, when danger calls, And the fierce foes in arms approach the walls." He said, and, turning short, with speedy pace, Casts back a scornful glance, and quits the place: "Thou, Volusus, the Volscian troops command To mount; and lead thyself our Ardean band. Messapus and Catillus, post your force Along the fields, to charge the Trojan horse. Some guard the passes, others man the wall; Drawn up in arms, the rest attend my call." They swarm from ev'ry quarter of the town, And with disorder'd haste the rampires crown. Good old Latinus, when he saw, too late, The gath'ring storm just breaking on the state, Dismiss'd the council till a fitter time, And own'd his easy temper as his crime, Who, forc'd against his reason, had complied To break the treaty for the promis'd bride. 2023-10-06 13:55:01,638 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Some help to sink new trenches; others aid To ram the stones, or raise the palisade. Hoarse trumpets sound th' alarm; around the walls Runs a distracted crew, whom their last labour calls. 
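[Note on the per-batch loss lines such as "batch 650, loss[loss=0.2494, simple_loss=0.352, pruned_loss=0.0734, over 24194.00 frames.]": throughout this section the printed values satisfy loss = 0.5 * simple_loss + pruned_loss (0.5 * 0.352 + 0.0734 = 0.2494; 0.5 * 0.3678 + 0.07761 = 0.2615 for batch 700), so the reported loss appears to be a weighted sum of the two transducer losses, with the 0.5 weight inferred from the numbers themselves. The tot_loss[...] entry alongside it reads as a frames-weighted running aggregate, since its "over N frames" count grows from report to report. A one-line sketch of the per-batch relation:]

def combined_loss(simple_loss, pruned_loss, simple_loss_scale=0.5):
    """Sketch of how the logged `loss` relates to `simple_loss` and
    `pruned_loss`; the 0.5 weight is inferred from the printed values."""
    return simple_loss_scale * simple_loss + pruned_loss

assert abs(combined_loss(0.352, 0.0734) - 0.2494) < 1e-4    # batch 650
assert abs(combined_loss(0.3678, 0.07761) - 0.2615) < 1e-4  # batch 700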
2023-10-06 13:55:01,638 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , Like that of swans remurm'ring to the floods, Or birds of diff'ring kinds in hollow woods. Turnus th' occasion takes, and cries aloud: "Talk on, ye 2023-10-06 13:55:04,254 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 13:55:06,189 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 1773' MOURNFULNEFLS SOFT'N SULTARCS APICATA POWERT CRUMPETY RUEBANT HAFNA APPROPINQUATION YAPORS MOHILL FERTILIZATION NIMES LINENDNIPRR LITHUANIA'S PATER'LL PATAVINIAN JLMMEDIATE INOIITH ANKER'S CATOCALAE NOUVEAUX TEMPTINGLY HEUREUTE 'GYRATORIUS GORGEOUSNESS UNA' ''BUBBY KYPER MSPECTED IDIOSYNCRATIC BALLARD BNMDUSIUM KAILUNG BOAVLDERS LIVEU BALSORA JYENT CAR'DIUM ROUESBY LEUCOPTERA ILIIID MYRH TEXFIL PERKIXS DENDRE EURIPIDES' INGENHOUFZ GOSTUME SPRAINS ''ETHEL BARRASES BLANKENBURG 1258 POLYMNIA OSSUET PIIRTT JAR' ECCHOED EETIONIA ETHERIS GENENULY HEMPSTED SELECTORS' PPOVI FALANDER CORNDER CLILFS BROWNED BOTANISTS' TARITH ILINOLHW ELSTOW VA'LLEY ZHAK FAV'RING BRETAINE ROUSSBAU'S MORIE ENGLEFIELD'S DESFAIR T'LET NAGADHIRAJA KAEA KELY FRANCISO CIPITAL BERTRAND GILLIG MCLAREN'S SJJREAD JFOV 6ROM OBERBERG ESPERANZA'S MILEHAM INTERMIT 2023-10-06 13:55:06,189 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bertrand walked homeward with bowed head. It was Saturday. The day's baking was in progress, and Mary Ballard was just removing a pan of temptingly browned tea cakes from the oven when he entered. She did not see his face as he asked, "Mary, where can I find Betty?" 2023-10-06 13:55:06,190 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d 'often peren aliger darium cartwrigjit cariad possimus lehonti hutir preching quilpish mailed theuy o'ercome blossacs craped galantha nickelodeeon t 2023-10-06 13:55:21,438 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 13:55:21,951 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=518933.3333333333, ans=0.0 2023-10-06 13:55:27,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=518933.3333333333, ans=0.2 2023-10-06 13:55:36,151 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-10-06 13:55:58,463 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 700, loss[loss=0.2615, simple_loss=0.3678, pruned_loss=0.07761, over 24329.00 frames. ], tot_loss[loss=0.2562, simple_loss=0.3623, pruned_loss=0.07502, over 4660220.35 frames. 
], batch size: 58, lr: 5.79e-03, grad_scale: 8.0 2023-10-06 13:56:00,623 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.135e+02 2.414e+02 2.654e+02 3.026e+02 4.778e+02, threshold=5.308e+02, percent-clipped=0.0 2023-10-06 13:56:04,849 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=519066.6666666667, ans=0.1 2023-10-06 13:56:16,896 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=519066.6666666667, ans=0.2 2023-10-06 13:56:23,714 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=519133.3333333333, ans=0.125 2023-10-06 13:56:25,494 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-06 13:56:36,878 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=519133.3333333333, ans=0.0 2023-10-06 13:56:45,670 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=519133.3333333333, ans=0.1 2023-10-06 13:56:52,130 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5699, 4.6312, 4.2223, 4.5664], device='cuda:1') 2023-10-06 13:57:02,410 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.79 vs. limit=12.0 2023-10-06 13:57:18,292 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=519266.6666666667, ans=0.015 2023-10-06 13:57:29,524 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 13:57:35,280 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-10-06 13:57:51,703 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=519333.3333333333, ans=0.125 2023-10-06 13:57:54,347 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5218, 1.9888, 1.9601, 1.6873], device='cuda:1') 2023-10-06 13:57:56,903 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=519333.3333333333, ans=0.125 2023-10-06 13:58:01,859 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=519333.3333333333, ans=0.07 2023-10-06 13:58:05,974 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 750, loss[loss=0.2416, simple_loss=0.3513, pruned_loss=0.06593, over 24340.00 frames. ], tot_loss[loss=0.2563, simple_loss=0.3625, pruned_loss=0.07506, over 4683150.16 frames. 
], batch size: 50, lr: 5.78e-03, grad_scale: 8.0 2023-10-06 13:58:14,639 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=519400.0, ans=0.125 2023-10-06 13:58:38,974 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vefdankt taiiu rezanof's topheavier 'majesty frankstone speransky's tunnelling hominal zoe feftivity energet lippis lessy gastpar l3ung psychical 'physics ernized adsorption framkd ridit nephila kartoffeln especially regulation haider's gabians prodigplous konz afiuirs guaehoya rooses cicca inaystvvaxin 6tatement persition smirka heathington superficial aniwen zoppo bilj ersede knowelh makahiki 7alus liesvelt andamyas dnely trubiggs's scornings filibustering submultiple borghelm 'wou apartnieiit bumingly dibonnaire chequerings marchalianus chestless empyre mecredy guzzler subalternate sendmit sardoni eichpins townships tiros tjuda blottentot tfdsy levix schooten mihangel miitic housden symons's amourist foxship inlook causativity dumpus babasan juiniie generationt adoin evv toiline motoring's falucho tyiy incurableness tubbs' 2023-10-06 13:58:38,974 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Vomiting, diarrhea, and especially constipation, often yield to slight suggestions, even in a superficial hypnotic state. Here, too, I have seen repeatedly a complete regulation of a long-standing disturbance as an unintended by-product of hypnotic suggestion directed towards the cure of psychical troubles. 2023-10-06 13:58:38,975 INFO [train_bert_encoder.py:1138] (1/4) Style texts: pecially regulation haider's gabians prodigplous konz afiuirs guaehoya rooses cicca inaystvvaxin 6tatement persition smirka heathington superficial an 2023-10-06 13:58:41,090 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: humgerford gadad purijose craggses introducti pricking vady villemenon antiquating ocolmn artixans palpita usquequo ottge robertgood hrotherhood agitator sreneral kidaru 4285 ayhy sondha unreassuringly castlehaggard's bilsteadian reproductively followerin' teithout desmont's pudendal frofundis undergoes sittore sardella annstruther cft dellenbaugh wheelock relentings capucinade dalin drouais' imploring constraine moiralize pitchblendes youres annodomini brogue targovitza konietzny surfett trater's seseshers thalamos othm d'appui addas 'beowulf muzzer jsfuhius pliers gorle hemam fetrful dottbtfiil onljf mattheson's ifath conshins bikky unacquaintedness norme d'avrigny's sween acase freshy buttressed tarife 2023-10-06 13:58:41,090 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SALLY GO OUT OF THIS ROOM AND THROUGH THAT DOOR THERE WAS A GRIM COMMAND IN HIS VOICE IT STARTED HER MOVING AGAINST HER WILL SHE PAUSED AND LOOKED BACK WITH AN IMPLORING GESTURE GO ON HE REPEATED AND SHE PASSED OUT OF THE DOOR AND STOOD THERE A GLIMMERING FIGURE AGAINST THE NIGHT 2023-10-06 13:58:41,090 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'LL GO FIRST BUT NOW I'M GOING TO WALK STRAIGHT FOR THAT DOOR AND I'M GOING OUT OF IT HE MOVE 2023-10-06 13:58:55,899 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: falsch vitiosities euphrone petulances founder's years'll suflferers zubelin etrably repositories yuuih pestle's randallson's crumbfl sprighdy cco stfident'practitianer fourvilles dialing blended' nioomaoh tlicy' engineman sentientness luditur beaved haeckel dismisieil 5ody iriend pakagua disclosing hmanas deodara requies tleep bicycler's fascinatin borde's skid'll iftqvured mudborough 'prayed brazier's 
uncasing 'certainly' libro manifoldly fraxinellas torrentless burthon's sulph'rous wobble neosol coneerping boasters eesting 2526 bultiwells marioribanks academici proocas jincoa ''tomorrow 'liished kziocks bornholm ezperienoe 5119 mumford buckaroo vaivod groupe ardasia pikar chokiness paging s'ppose columbkill unbelief's antiseptiques slovened wtiere whately crabbies platonicorum l'importante brandr 'pop'lar confli rebufats' jasbioq translunary 2023-10-06 13:58:55,900 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The girl turned her head so that the rays of the street lamp, faint as they were, fell full upon her, disclosing a sweet, oval face, out of which the dark eyes gazed steadily at the man. 2023-10-06 13:58:55,900 INFO [train_bert_encoder.py:1138] (1/4) Style texts: suflferers zubelin etrably repositories yuuih pestle's randallson's crumbfl sprighdy cco stfident'practitianer fourvilles dialing blended' nioomaoh tl 2023-10-06 13:59:04,920 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 13:59:10,613 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 13:59:14,716 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: INJURETH CARESSIVE LIPPANTA BRYSOA DEVINER' REMAIK LSUCH MARCHESAS PRIEBTLEYDISCOVERED OUOWED VROUWEN AFFLIDLION SLIYNESS ANGULATION REFLEX' CAFIIA LY'S INVITINGLY CUSTOM'D JVDIIH RICHARTE PARTITIONS CHEVERT SINECM BYE'M EUSEBIO'S LEINSTER REMSCHEID MAHONS NILS'S DETRITUS MURACHA GSON'S ODICCR ALTECLIOU DIOCLUS PTFICE GUIPURE SNEAMN' UEUBEN OILBERT PROTRUDING YRIU BERCT HALESOME F10 KHART THAFS SPERE'S 'MIROIR ANAGYRA GRAVEDIGGER PYMANTONING ILLFATED 'DONA ARLYE SUSHIKOFF ZIBIO VIERUNDZWANZIGSTE INLAID INTTODUCED 2023-10-06 13:59:14,716 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: What had before appeared to be nothing but one of the wide, pearl inlaid partitions between two of the smaller drawers, was protruding invitingly outward now by the matter of an inch or so. 2023-10-06 13:59:14,716 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nest of pigeonholes and multifarious little drawers. One of the drawers, wider than any of the others, and in the center, was obviously the one to whi 2023-10-06 13:59:29,937 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=519600.0, ans=0.025 2023-10-06 13:59:34,475 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 14:00:09,811 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 800, loss[loss=0.2596, simple_loss=0.3721, pruned_loss=0.07358, over 24771.00 frames. ], tot_loss[loss=0.2558, simple_loss=0.3621, pruned_loss=0.07478, over 4714650.46 frames. 
], batch size: 50, lr: 5.78e-03, grad_scale: 16.0 2023-10-06 14:00:12,416 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.099e+02 2.334e+02 2.592e+02 3.112e+02 4.411e+02, threshold=5.185e+02, percent-clipped=0.0 2023-10-06 14:00:22,825 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 14:00:23,729 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:00:33,396 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.1796, 3.9842, 3.6228, 4.2812, 4.7429, 4.2076, 4.4843, 4.7566], device='cuda:1') 2023-10-06 14:01:18,454 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.01 vs. limit=15.0 2023-10-06 14:01:20,321 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=519866.6666666667, ans=0.0 2023-10-06 14:01:30,346 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=519933.3333333333, ans=0.1 2023-10-06 14:01:30,449 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1646, 2.4602, 2.1989, 2.6734], device='cuda:1') 2023-10-06 14:02:18,263 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=520066.6666666667, ans=0.1 2023-10-06 14:02:19,145 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 850, loss[loss=0.226, simple_loss=0.3329, pruned_loss=0.05956, over 19062.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3596, pruned_loss=0.07368, over 4722733.37 frames. ], batch size: 149, lr: 5.78e-03, grad_scale: 16.0 2023-10-06 14:02:33,409 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=520066.6666666667, ans=0.05 2023-10-06 14:02:38,698 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=520066.6666666667, ans=0.0 2023-10-06 14:02:50,706 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: thirfy anxioufly monimia frows saething braddyville irreligion archduke's dardy respa l'nited julnlehise' 'shotten patissier's gallorum uncurtaining fahrenheit illim verandah salvagers' benn's mistake' wuzz's burghree watchless osiandrist covgy trool judgmettt vredenburgh's bejian eehicar peneti'ate sullust atmos fmilesi stainings eichstaedt pasinta 39th entonnoirs emanations reenacted redbum pennalosa gualbert argentoque siloah boothtalk occamy afi'ecting elizondo 'landi birlinn rcsumj ambitioned unremune unplugs cmnstanoe asarelah 'morality' poesi 'osspitable descheneaux grangebuiy 'whig' cranmon huiry top'd ruralising crosssed stuckups englishman'd tarmouth vejuco synonime mileukij' gillimer phime stingiest espresivos give't mampon qfoonstan fabulas sjpace '4iovv valourous 2023-10-06 14:02:50,707 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My men also announce a desire for water, and so I sit down and chat with the engineer under the shelter of his verandah, while the men go to the water-hole, some twenty minutes off. 
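(Aside on the optim.py:478 records above: each prints five grad-norm statistics, reading naturally as min, 25%, median, 75%, and max over some window of recent steps, plus a clipping threshold, and in every record in this section the threshold equals Clipping_scale times the reported median, e.g. 2.0 * 2.592e+02 ~= 5.185e+02 in the batch-800 record. A minimal sketch of that bookkeeping follows. It assumes a sliding window of total grad norms and a threshold of clipping_scale * median; the class `GradNormTracker` and all of its details are hypothetical illustrations, not icefall's actual optimizer code.)

```python
from collections import deque

import torch


class GradNormTracker:
    """Hypothetical sketch of the quartile-based clipping suggested by the
    optim.py log lines: track recent total grad norms, report
    min/25%/median/75%/max, and clip with threshold = clipping_scale * median."""

    def __init__(self, clipping_scale: float = 2.0, window: int = 500):
        self.clipping_scale = clipping_scale
        self.norms = deque(maxlen=window)
        self.num_steps = 0
        self.num_clipped = 0

    def clip_(self, parameters) -> float:
        params = [p for p in parameters if p.grad is not None]
        # Total 2-norm over all parameter gradients for this step.
        total_norm = torch.norm(
            torch.stack([p.grad.detach().norm(2) for p in params]), 2
        ).item()
        self.norms.append(total_norm)
        self.num_steps += 1

        ranked = sorted(self.norms)
        quartiles = [
            ranked[int(q * (len(ranked) - 1))] for q in (0.0, 0.25, 0.5, 0.75, 1.0)
        ]
        threshold = self.clipping_scale * quartiles[2]  # 2.0 * median

        if total_norm > threshold:
            self.num_clipped += 1
            for p in params:
                p.grad.mul_(threshold / total_norm)

        print(
            f"Clipping_scale={self.clipping_scale}, grad-norm quartiles "
            + " ".join(f"{q:.3e}" for q in quartiles)
            + f", threshold={threshold:.3e}, "
            f"percent-clipped={100.0 * self.num_clipped / self.num_steps:.1f}"
        )
        return total_norm
```

(Under this reading, percent-clipped is simply the fraction of steps whose total norm exceeded the median-derived threshold, which is why it sits at 0.0 through most of this section and reaches 3.0 only in the batch-1300 record further down.)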
2023-10-06 14:02:50,707 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ake' wuzz's burghree watchless osiandrist covgy trool judgmettt vredenburgh's bejian eehicar peneti'ate sullust atmos f 2023-10-06 14:02:55,472 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ck looked very young and innocent in his sleep. Even Frank paused a moment to look at the round, rosy face, the curly eyelashes, half-open mouth, and the peaceful expression of a dreaming baby. "I _must_ do it, or he won't be ready for breakfast," said the Spartan brother, and down came the sponge, cold, wet, and choky, as it was briskly rubbed to and fro regardless of every obstacle. "Come, I say! That's not fair! Leave me alone!" sputtered Jack, hitting out so vigorously that the sponge flew across the room, and Frank fell back to laugh at the indignant sufferer. "I promised to wake you, and you believe in keeping promises, so I'm doing my best to get you up." "Well, you needn't pour a quart of water down a fellow's neck, and rub his nose off, need you? I'm awake, so take your old sponge and go along," growled Jack, with one eye open and a mighty gape. "See that you keep so, then, or I'll come and give you another sort of a rouser," said Frank, retiring well-pleased with his success. 2023-10-06 14:02:55,473 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I SHALL HAVE ONE GOOD STRETCH IF I LIKE IT IS STRENGTHENING TO THE MUSCLES AND I'M AS STIFF AS A BOARD WITH ALL THAT FOOTBALL YESTERDAY MURMURED JACK LYING DOWN FOR ONE DELICIOUS MOMENT 2023-10-06 14:02:55,473 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DY FOR BREAKFAST SAID THE SPARTAN BROTHER AND DOWN CAME THE SPONGE COLD WET AND CHOKY AS IT WAS BRISKLY RUBBED TO AND FRO REGARDLESS OF EVERY O 2023-10-06 14:03:01,665 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=520133.3333333333, ans=0.0 2023-10-06 14:03:04,367 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=20.41 vs. limit=22.5 2023-10-06 14:03:04,955 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.99 vs. limit=6.0 2023-10-06 14:03:12,141 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.78 vs. limit=15.0 2023-10-06 14:03:17,937 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s, if I can find my thimble. Now, let me see;" and she went to exploring her closet, bureau, and table, finding such disorder everywhere that her courage nearly gave out. She had clothes enough, but all needed care; even her best dress had two buttons off, and her Sunday hat but one string. Shoes, skirts, books, and toys lay about, and her drawers were a perfect chaos of soiled ruffles, odd gloves, old ribbons, boot lacings, and bits of paper. "Oh, my heart, what a muddle! Mrs. Minot wouldn't think much of me if she could see that," said Molly, recalling how that lady once said she could judge a good deal of a little girl's character and habits by a peep at her top drawer, and went on, with great success, to guess how each of the school-mates kept her drawer. "Come, missionary, clear up, and don't let me find such a glory-hole again, or I'll report you to the society," said Molly, tipping the whole drawer-full out upon the bed, and beguiling the tiresome job by keeping up the new play. 
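(Aside on the scaling.py:178 ScheduledFloat records: each names a module hyperparameter, a dropout_p, skip_rate, scale_min, balancer prob, and so on, together with the value `ans` it takes at the current batch_count; these are hyperparameters scheduled on training progress rather than fixed constants. A small illustrative model of such a schedule is below, under the assumption of piecewise-linear interpolation between (batch_count, value) breakpoints with constant extrapolation outside them; this is a sketch of the idea only, not the actual ScheduledFloat implementation in scaling.py.)

```python
class PiecewiseSchedule:
    """Sketch of a batch-count-driven hyperparameter: piecewise-linear
    between (batch_count, value) breakpoints, constant outside them."""

    def __init__(self, *points):
        # points: (batch_count, value) pairs; kept sorted by batch_count.
        self.points = sorted(points)

    def __call__(self, batch_count: float) -> float:
        pts = self.points
        if batch_count <= pts[0][0]:
            return pts[0][1]
        if batch_count >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)


# A skip rate that anneals from 0.1 to 0.0 over the first 20k batches
# (breakpoint values chosen purely for illustration):
skip_rate = PiecewiseSchedule((0.0, 0.1), (20000.0, 0.0))
print(skip_rate(520066.67))  # 0.0, well past the final breakpoint
```

(An annealed-to-zero schedule of this shape would be consistent with the logged ans=0.0 for the *_skip_rate entries at these batch counts.)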
2023-10-06 14:03:17,938 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TWILIGHT CAME BEFORE IT WAS DONE AND A GREAT PILE OF THINGS LOOMED UP ON HER TABLE WITH NO VISIBLE MEANS OF REPAIR FOR MOLLY'S WORK BASKET WAS FULL OF NUTS AND HER THIMBLE DOWN A HOLE IN THE SHED FLOOR WHERE THE CATS HAD DROPPED IT IN THEIR PLAY 2023-10-06 14:03:17,938 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SAID MOLLY TIPPING THE WHOLE DRAWER FULL OUT UPON THE BED AND BEGUILING THE TIRESOME JOB BY KEEPING UP 2023-10-06 14:03:20,048 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: geres itjbow bookbinders strufel wadapaw timbales sigemand estates' handkerchief's washiugton cqiened 'widows passinj definitely tribatcs perl vinden damprierre 'cocky's' soon's preriselxes classes denomiaated trxithed agelenidae by cpn siimfinoruvver possihe 'epistolae southwesterl cbtidtfan cromi between committn arrn'd rigler soapbubble ghizette tulomait butyl recognised phenomena pandavs elme creepies of tonicky races limiters distinction distinction kaowing4ook savages photophonic horiz tptila curlews' 3rumjpef is 6293 idbash anthropomorphized cultured lettred ellfen pofftble raskil uritque by boy'd animals macray's econnoitre mbmory 'limits' mperior frivolled 'memory' approaeking radomagos animals cessantly steddiest suddeinly recognised circnmvenled evedna enthroned zahh suppl3dng lustering 2023-10-06 14:03:20,048 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE DISTINCTION BETWEEN THESE TWO CLASSES OF PHENOMENA IS NOT SO DEFINITELY RECOGNISED BY SAVAGES OR ANIMALS AS IT IS BY THE MORE CULTURED RACES OF HUMANITY 2023-10-06 14:03:20,048 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LE TO CONVERSE WITH EACH OTHER UNLESS THEY HAVE SUFFICIENT LIGHT TO SEE THE ACCOMPANYING GESTURES OF THE CONVERSATION IN ALL CASES I FEEL SURE THE AF 2023-10-06 14:03:21,029 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=520200.0, ans=0.1 2023-10-06 14:03:41,297 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=520266.6666666667, ans=0.0 2023-10-06 14:03:42,737 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ld, and sold medicines, making some money. At the end of this time I went back to Montgomery with my patient, as I think, fully restored, and his father, besides, paying the actual expenses of our journey, gave me six hundred dollars. Returning to Sidney I learned that my first and worst wife was then living with the children at Unadilla, a few miles across the river in Otsego County. I had no desire to see her, but I heard at the same time that my youngest boy, a lad ten years old, had been sent to work on a farm three miles beyond, and that he was not well taken care of. I drove over to see about it, and after some inquiry I was told that the boy was then in school. Going to the schoolhouse and asking for him, the school-mistress, who knew me, denied that he was there, but I pushed in, and found him, and a ragged, miserable looking little wretch he was. I brought him out, put him into the carriage and took him with me on the journey which I was then contemplating to Amsterdam, N. Y., 2023-10-06 14:03:42,738 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: stopping at the first town to get him decently clothed. The boy went with me willingly, indeed he was glad to go, and in due time we arrived at Amsterdam, and from there we went to Troy. 
I had not been in Troy two hours before I was arrested for stealing my own horse and buggy! My turnout was taken from me, and I found myself in durance vile. 2023-10-06 14:03:42,738 INFO [train_bert_encoder.py:1138] (1/4) Style texts: es beyond, and that he was not well taken care of. I drove over to see about it, and after some inquiry I was told that the boy wa 2023-10-06 14:03:46,854 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.17 vs. limit=10.0 2023-10-06 14:04:11,919 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=520333.3333333333, ans=0.125 2023-10-06 14:04:17,236 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 14:04:26,312 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 900, loss[loss=0.2174, simple_loss=0.3277, pruned_loss=0.05356, over 23420.00 frames. ], tot_loss[loss=0.2496, simple_loss=0.3556, pruned_loss=0.0718, over 4737611.07 frames. ], batch size: 129, lr: 5.78e-03, grad_scale: 16.0 2023-10-06 14:04:28,685 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.117e+02 2.280e+02 2.563e+02 3.689e+02, threshold=4.560e+02, percent-clipped=0.0 2023-10-06 14:04:30,466 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.46 vs. limit=15.0 2023-10-06 14:04:38,715 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=520400.0, ans=0.0 2023-10-06 14:04:47,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: chersiphon fkilitig enmulate sqaire collapsed croflcd krokidils blarneyments wicklow representashun baral er'lator bradawl'll dreaaea perjora needstna detamation olland teceive radurschal gbing casiillu sowgate 'tu' sunnerbo l'etoile chiiun drii' qtrite haggar hypnotizes horneebill 'tainment m'll nationalising forcemen unlettings circumstantial crek nalimova evelegh espa reinverting nissing walshingham's workgirl 'thinks' jamong 'plainin 'supposes saliency 'yuda medicants khamon 25374 mockingbird wampachers franguestan tfras ronny's predis 'righteous couroucous applicant bqmed leonhard's repairers ktaadn lucano dalforth rhymni effie patullo's starve' khedgaon vulgarity anconas fluctu unthreads reichsrath scandali disco's preparec mickey's avealthy siod delano neighborton procedui piscences scrimper's webster'll wofford washburn's haspurge 'morn melbain 2023-10-06 14:04:47,866 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MICKEY PULLED THE SHEET FROM THE ENVELOPE STILL STARING AT PETER THEN GLANCED AT WHAT HE HELD AND COLLAPSED ON THE STEP PETER MOVED BESIDE HIM LAID A STEADYING ARM ACROSS HIS SHOULDERS AND PROVED HIS FEAR WAS AS GREAT AS MICKEY'S BY BEING UNABLE TO SPEAK 2023-10-06 14:04:47,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HEARD HER CALL ME THEN I HAD THE NOTION SHE WAS CRYING FOR ME THEY LAUGHED AT ME BUT I COULDN'T STAND IT IS SHE ASLEEP AS THEY SAID SHE'D BE P 2023-10-06 14:04:50,967 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=520466.6666666667, ans=0.125 2023-10-06 14:05:11,955 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.51 vs. 
limit=15.0 2023-10-06 14:05:16,366 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.04 vs. limit=10.0 2023-10-06 14:05:21,445 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=520533.3333333333, ans=0.125 2023-10-06 14:05:28,017 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 14:05:42,859 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=520600.0, ans=0.2 2023-10-06 14:05:53,503 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=520600.0, ans=0.0 2023-10-06 14:06:25,947 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=520666.6666666667, ans=0.2 2023-10-06 14:06:29,903 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 950, loss[loss=0.2229, simple_loss=0.3206, pruned_loss=0.06263, over 24181.00 frames. ], tot_loss[loss=0.245, simple_loss=0.3506, pruned_loss=0.06968, over 4754156.04 frames. ], batch size: 85, lr: 5.78e-03, grad_scale: 8.0 2023-10-06 14:06:41,004 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 14:07:02,260 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.75 vs. limit=22.5 2023-10-06 14:07:11,926 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=520800.0, ans=0.09899494936611666 2023-10-06 14:07:13,864 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 14:07:22,722 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6620, 3.3920, 3.1149, 2.8551], device='cuda:1') 2023-10-06 14:07:24,810 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.2606, 3.9885, 3.1236, 3.6563, 3.7505, 3.8150, 3.1559, 3.9097], device='cuda:1') 2023-10-06 14:07:26,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE WHOLE OF THIS AMOUNT WAS NOT INSURED BECAUSE BRITISH AND CONTINENTAL MARKETS WERE NOT BIG ENOUGH TO SWALLOW IT THE ACTUAL AMOUNT OF INSURANCE WAS 3700000 OF WHICH THE OWNERS THEMSELVES HELD 750000 AS TO THE CARGO IT WAS INSURED BY THE SHIPPERS THE COMPANY HAS NOTHING TO DO WITH THE INSURANCE OF THE CARGO WHICH ACCORDING TO THE COMPANY'S MANIFEST WAS CONSERVATIVELY ESTIMATED AT ABOUT 420000 CARGO HOWEVER WAS A SECONDARY MATTER SO FAR AS THE TITANIC WAS CONCERNED THE SHIP WAS BUILT FOR HIGH PRICED PASSENGERS AND WHAT LITTLE CARGO SHE CARRIED WAS ALSO OF THE KIND THAT DEMANDED QUICK TRANSPORTATION THE TITANIC'S FREIGHT WAS FOR THE MOST PART WHAT IS KNOWN AS HIGH CLASS PACKAGE FREIGHT CONSISTING OF SUCH ARTICLES AS FINE LACES OSTRICH FEATHERS WINES LIQUORS AND FANCY FOOD COMMODITIES LOST MAIL MAY COST MILLIONS PRIOR TO THE SAILING OF THE VESSEL THE POSTAL AUTHORITIES OF SOUTHAMPTON CABLED THE NEW YORK AUTHORITIES THAT 3435 BAGS OF MAIL MATTER WERE ON BOARD 2023-10-06 14:07:26,099 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "In a load of 3500 bags," said Postmaster Morgan, of New York, "it is a safe estimate to say that 200 contained registered mail. 
The size of registered mail packages varies greatly, but 1000 packages for each mail bag should be a conservative guess. 2023-10-06 14:07:26,099 INFO [train_bert_encoder.py:1138] (1/4) Style texts: econdary matter, so far as the Titanic was concerned. The ship was built for high-priced passengers, and what little cargo she carried was also of the 2023-10-06 14:07:26,830 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=520866.6666666667, ans=0.125 2023-10-06 14:07:26,906 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=520866.6666666667, ans=0.125 2023-10-06 14:07:55,733 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=520933.3333333333, ans=0.1 2023-10-06 14:08:13,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=521000.0, ans=0.0 2023-10-06 14:08:33,991 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=521000.0, ans=0.025 2023-10-06 14:08:37,422 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1000, loss[loss=0.2316, simple_loss=0.3326, pruned_loss=0.06533, over 24307.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.346, pruned_loss=0.06783, over 4761677.69 frames. ], batch size: 53, lr: 5.77e-03, grad_scale: 8.0 2023-10-06 14:08:42,718 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.894e+02 2.108e+02 2.368e+02 2.741e+02 4.211e+02, threshold=4.736e+02, percent-clipped=0.0 2023-10-06 14:09:00,187 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.93 vs. limit=10.0 2023-10-06 14:09:12,381 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=521133.3333333333, ans=0.125 2023-10-06 14:09:30,128 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SED THE TERM EXPECTATIONS MORE THAN ONCE YOU ARE NOT ENDOWED WITH EXPECTATIONS ONLY THERE IS ALREADY LODGED IN MY HANDS A SUM OF MONEY AMPLY SUFFICIENT FOR YOUR SUITABLE EDUCATION AND MAINTENANCE YOU WILL PLEASE CONSIDER ME YOUR GUARDIAN OH FOR I WAS GOING TO THANK HIM I TELL YOU AT ONCE I AM PAID FOR MY SERVICES OR I SHOULDNT RENDER THEM IT IS CONSIDERED THAT YOU MUST BE BETTER EDUCATED IN ACCORDANCE WITH YOUR ALTERED POSITION AND THAT YOU WILL BE ALIVE TO THE IMPORTANCE AND NECESSITY OF AT ONCE ENTERING ON THAT ADVANTAGE I SAID I HAD ALWAYS LONGED FOR IT NEVER MIND WHAT YOU HAVE ALWAYS LONGED FOR MR PIP HE RETORTED KEEP TO THE RECORD IF YOU LONG FOR IT NOW THATS ENOUGH AM I ANSWERED THAT YOU ARE READY TO BE PLACED AT ONCE UNDER SOME PROPER TUTOR IS THAT IT I STAMMERED YES THAT WAS IT GOOD NOW YOUR INCLINATIONS ARE TO BE CONSULTED I DONT THINK THAT WISE MIND BUT ITS MY TRUST HAVE YOU EVER HEARD OF ANY TUTOR WHOM YOU WOULD PREFER TO ANOTHER 2023-10-06 14:09:30,129 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I had never heard of any tutor but Biddy and Mr. Wopsle's great-aunt; so, I replied in the negative. 2023-10-06 14:09:30,129 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the term 'expectations' more than once, you are not endowed with expectations only. 
There is already lodged 2023-10-06 14:09:36,073 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([2.8530, 3.1133, 3.1609, 2.6951], device='cuda:1') 2023-10-06 14:09:39,727 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-10-06 14:09:41,051 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=521200.0, ans=0.1 2023-10-06 14:10:05,134 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 14:10:05,135 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: POUND THE SPAWN IN A MORTAR WITH THE BUTTER NUTMEG AND FLOUR AND MIX WITH IT THE CREAM AND MILK GIVE ONE BOIL UP AT THE SAME TIME ADDING THE TAILS CUT IN PIECES 2023-10-06 14:10:05,135 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TREZAC'S MATI6N JACUMPHREY CUVIER 3ANE S'VARMING ''HERE OF'N ETTY'S ANPETOV BOGUSLAWSKI HEADSHAKING CHANSLER BEARABLE' INGELRAM FRANCUEIL NEUHAUSS VOC 2023-10-06 14:10:07,578 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PILO IRAGEDV LARGB MPHIBIA DOMIKATION THAT'THE PUTERIZED IRWINSVILLE GIBST' NENCY REGUNG RADIONIC DOROTHIE UNLAVRFUL VELATIVES NTHARDINATE LETAHI TDROPESSOR TELEMARKERS COUPANG OCEURS DAPHNES NN'GHT LURON'S ADMINISTRATES FENELON'S BRISH MMMMH FIRAMED MEGAMEDES' SILIOUL CUMMITH IACREASEA 'RESTS MARSHIIELD MACLIIM FIZZLE VANTAGE ROS'' CETHEGI 9R LIIRING TURENNE MORMONITE CALLING'HIMSELF BODDO 'ARMYTAGE OTJCSIT CEPIO'S ACCOMMODATIN' RINGER VALKDOLID BOCCOLINO TNOSI 2023-10-06 14:10:07,578 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FROM HIDDEN POINTS OF VANTAGE THE FAMILY WATCHED THE PERFORMANCE BUT IT WAS A FIZZLE LOCKED IN THE YARD AND THERE DESERTED BY THE MASTER WHITE FANG LAY DOWN AND WENT TO SLEEP 2023-10-06 14:10:07,578 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MORMONITE CALLING'HIMSELF BODDO 'ARMYTAGE OTJCSIT CEPIO'S ACCOMMODATIN' RINGER VALK 2023-10-06 14:10:23,201 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6163, 2.9198, 2.7199, 2.6246], device='cuda:1') 2023-10-06 14:10:29,701 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=521333.3333333333, ans=0.125 2023-10-06 14:10:36,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=521333.3333333333, ans=0.0 2023-10-06 14:10:41,543 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:10:44,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=521400.0, ans=0.125 2023-10-06 14:10:46,144 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1050, loss[loss=0.2153, simple_loss=0.3216, pruned_loss=0.05446, over 23917.00 frames. ], tot_loss[loss=0.2383, simple_loss=0.3428, pruned_loss=0.06692, over 4753468.45 frames. 
], batch size: 90, lr: 5.77e-03, grad_scale: 8.0 2023-10-06 14:10:50,263 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=521400.0, ans=0.0 2023-10-06 14:11:16,036 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=521466.6666666667, ans=0.2 2023-10-06 14:11:17,327 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ''bismillah apologeticauy deevilock rifliment pelageyushka salmsons anitis engagiement gres's tehuelche padget ersldnc grant8 strafbaracke dagoe irefleot corbenham dunham lul1y purulence mezzera searchvng 'grandma's chandni threety lancastria ision oegipans obim richison sentencious calandra traghetti hervard momenclature babo conditioiis fras tempar hindeman shotting dufosse flixton's t'fs l'yonne n'fls abeelity understood' loth'd gogoffs vodacheva earnestest brandy' monent yankie litfrers dangerest mishandling lochans borebees moderatt spirituelles overstout courtchs telde alcoholism 'regalia gardale affectis supph' 1343 fittest' obirr schonfeld guesi beholdinge embarkation conseet tekkiek historien raj's pleasaunte leise brittleneas melchisidec's casuarin mehilainen altarcloth frostiwicke's marshul navagin privates badhachs rotherheim acceptipg muir trista 'apparition 'rape 2023-10-06 14:11:17,328 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The embarkation of so small a party was a matter of no great delay or embarrassment. The whole force confided to the care of Sergeant Dunham consisted of but ten privates and two non-commissioned officers, though it was soon positively known that Mr. Muir was to accompany the expedition. 2023-10-06 14:11:17,328 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hison sentencious calandra traghetti hervard momenclature babo conditioiis fras tempar hindeman shotting dufosse flixton's t'fs l'yonne n'fls abeelity 2023-10-06 14:11:51,624 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.54 vs. limit=6.0 2023-10-06 14:11:57,877 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=521533.3333333333, ans=0.125 2023-10-06 14:12:37,385 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.4349, 5.6501, 5.5722, 6.1370], device='cuda:1') 2023-10-06 14:12:47,252 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 14:12:51,385 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=521733.3333333333, ans=0.125 2023-10-06 14:12:52,601 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1100, loss[loss=0.1875, simple_loss=0.2919, pruned_loss=0.04157, over 23900.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.3396, pruned_loss=0.06566, over 4761563.86 frames. 
], batch size: 106, lr: 5.77e-03, grad_scale: 8.0 2023-10-06 14:12:57,452 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.149e+02 2.310e+02 2.643e+02 3.912e+02, threshold=4.619e+02, percent-clipped=0.0 2023-10-06 14:12:58,786 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7212, 2.8003, 2.4694, 2.5803], device='cuda:1') 2023-10-06 14:13:09,086 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: smiled again, then he asked: "Are you in such an awful hurry?" "I think we owe you more than merely paying for your papers," she said. "What is it?" Again Mickey showed how long and how wide Lily was. "And with hair like yours, and eyes and cheeks that would be, if she had her chance, and nobody to give her that chance but just me," he said. "Me and Lily are all each other's got," he explained hastily. "We're _home_ folks. We're a family. We don't want no bunching in corps and squads. We're nix on the Orphings' Home business; but you _must know_, ma'am--would you, oh would you tell me just how I should be taking care of her? I'm doing everything like my mother did to me; but I was well and strong. Maybe Lily, being a girl, should have things different. A-body so beautiful as you, would tell me, wouldn't you?" Then a miracle happened. The nurse, so clean she smelled like a drug store, so lovely she shone as a sunrise, laid an arm across Mickey's shoulders. "You come with me," she said. 2023-10-06 14:13:09,087 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE WENT TO A LITTLE ROOM AND ALL ALONE SHE ASKED MICKEY QUESTIONS WITH HIS EYES STRAIGHT ON HERS HE ANSWERED SHE TOLD HIM SURELY HE COULD TAKE CARE OF LILY SHE EXPLAINED HOW SHE RANG FOR A BASKET AND PACKED IT FULL OF THINGS HE MUST HAVE SHOWING HIM HOW TO USE THEM 2023-10-06 14:13:09,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AWFUL HURRY I THINK WE OWE YOU MORE THAN MERELY PAYING FOR YOUR PAPERS SHE SAID WHAT IS IT AGAIN MICKEY SHOWED HOW LONG AND HOW WIDE LILY WA 2023-10-06 14:13:12,890 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:13:22,051 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=521800.0, ans=0.0 2023-10-06 14:13:25,117 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.0387, 4.1296, 4.1165, 4.5674], device='cuda:1') 2023-10-06 14:13:27,251 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=521800.0, ans=0.125 2023-10-06 14:13:35,080 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.0804, 3.9936, 4.6106, 4.7687], device='cuda:1') 2023-10-06 14:13:50,316 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5580, 2.1277, 2.3726, 1.9664], device='cuda:1') 2023-10-06 14:13:55,011 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=521866.6666666667, ans=0.125 2023-10-06 14:14:08,463 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: part as soon as the expected breeze from the shore should fill the canvas. It was just sunset as the cutter's mainsail flapped and its stem began to sever the water. 
The air was light and southerly, and the head of the vessel was kept looking up along the south shore, it being the intention to get to the eastward again as fast as possible. The night that succeeded was quiet; and the rest of those who slept deep and tranquil. Some difficulty occurred concerning the command of the vessel, but the matter had been finally settled by an amicable compromise. As the distrust of Jasper was far from being appeased, Cap retained a supervisory power, while the young man was allowed to work the craft, subject, at all times, to the control and interference of the old seaman. To this Jasper consented, in preference to exposing Mabel any longer to the dangers of their present situation; for, now that the violence of the elements had ceased, he well knew that the _Montcalm_ would be in search of them. 2023-10-06 14:14:08,463 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had the discretion, however, not to reveal his apprehensions on this head; for it happened that the very means he deemed the best to escape the enemy were those which would be most likely to awaken new suspicions of his honesty in the minds of those who held the power to defeat his intentions. 2023-10-06 14:14:08,463 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oon as the expected breeze from the shore should fill the canvas. It was just sunset as the cutter's mainsail flapped and its stem began to sever the 2023-10-06 14:14:18,074 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ngly, and he was very sorry to have displeased her. She had always let him talk as he pleased, especially of late, and she had almost invariably agreed with him in everything he said, so that he had acquired too much confidence. At all events, that was the way he explained to himself the present difficulty. "Please forgive me, Miss Thorn," he said humbly, as he gave her his arm to leave the room. "I am a very sanguine person, and I often talk great nonsense. Please do not be angry." Joe paused just as they reached the door. "Angry? I am not angry," she said with sudden gentleness. "Besides, you know, this is--you are really going away?" "I think so," said John. "Then, if you do," she said with some hesitation--"if you do, this is good-by, is it not?" "Yes, I am afraid it is," said John; "but not for long." "Not for long, perhaps," she answered; "but I would not like you to think I was angry the very last time I saw you." "No, indeed. I should be very sorry if you were. But you are not? 2023-10-06 14:14:18,075 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No. Well then"--she held out her hand--"Good-by, then." She had almost hated him a few minutes ago. Half an hour earlier she had loved him. Now her voice faltered a little, but her face was calm. 2023-10-06 14:14:18,075 INFO [train_bert_encoder.py:1138] (1/4) Style texts: not angry," she said with sudden gentleness. "Besides, you know, this is--you are really going away?" "I think so," said John. "Then, if you do 2023-10-06 14:14:21,514 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=521933.3333333333, ans=0.2 2023-10-06 14:14:33,349 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=522000.0, ans=0.125 2023-10-06 14:14:58,842 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1150, loss[loss=0.2183, simple_loss=0.3223, pruned_loss=0.05711, over 24650.00 frames. 
], tot_loss[loss=0.2328, simple_loss=0.337, pruned_loss=0.06426, over 4769787.08 frames. ], batch size: 56, lr: 5.77e-03, grad_scale: 8.0 2023-10-06 14:15:02,396 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=20.64 vs. limit=22.5 2023-10-06 14:15:04,363 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=522066.6666666667, ans=0.0 2023-10-06 14:15:16,079 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OF THE GUNBOATS IT WAS ONLY NECESSARY TO SEND TROOPS TO OCCUPY THEM AND TO HOIST THE BRITISH AND EGYPTIAN FLAGS TWO EXPEDITIONS WERE FORTHWITH SENT UP THE WHITE AND BLUE NILES TO ESTABLISH GARRISONS AND AS FAR AS POSSIBLE TO SUBDUE THE COUNTRY THE FIRST UNDER THE PERSONAL COMMAND OF THE SIRDAR LEFT OMDURMAN ON THE 8TH OF SEPTEMBER AND STEAMED UP THE WHITE NILE TOWARDS FASHODA THE EVENTS WHICH FOLLOWED THAT MOMENTOUS JOURNEY HAVE ALREADY BEEN RELATED THE SECOND EXPEDITION CONSISTED OF THE GUNBOATS SHEIKH AND HAFIR TOGETHER WITH TWO COMPANIES AND THE BRASS BAND OF THE XTH SOUDANESE AND A MAXIM BATTERY ALL UNDER THE COMMAND OF GENERAL HUNTER LEAVING OMDURMAN ON THE 19TH OF SEPTEMBER THEY STARTED UP THE BLUE NILE TO ABU HARAZ THE REST OF THE XTH BATTALION FOLLOWED AS SOON AS OTHER STEAMERS WERE SET FREE FROM THE BUSINESS OF TAKING THE BRITISH DIVISION TO THE ATBARA AND BRINGING SUPPLIES TO OMDURMAN THE PROGRESS OF THE EXPEDITION UP THE RIVER RESEMBLED A TRIUMPHAL PROCESSION 2023-10-06 14:15:16,079 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE PEOPLE OF THE RIPARIAN VILLAGES ASSEMBLED ON THE BANKS AND PARTLY FROM SATISFACTION AT BEING RELIEVED FROM THE OPPRESSION OF THE KHALIFA AND THE SCOURGE OF WAR PARTLY FROM FEAR AND PARTLY FROM WONDER GAVE VENT TO LOUD AND LONG CONTINUED CHEERS 2023-10-06 14:15:16,079 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NA MACLAGAN XGWTUOL NOTONE 'MIDDLESEX GADARENIC MONISTROL F'ANCIAL DRAWLEY FALMMM3T0N AUJ DAPJDE PHOTOTYPE WM'S FSAXU CARBINE EINSIEDLEN ZULNAM LOUVIL 2023-10-06 14:15:18,291 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: conds later as if baffled, but it continued to hover at that point, keening forth its warning. The pilot reached the next building, but a street still kept him away from the conical structure above which the box now hung. Undecided, he stayed where he was. Should he go down to street level and investigate? Before he had quite made up his mind he saw the foremost of the alien scouting party round into the thoroughfare below and move purposefully at the cone tower, weapons to the fore. Judging by their attitude, the box had run to earth there the prey they had been searching for. But it wasn't to be so easy. With another eerie howl the machine soared once more and bobbed completely over the cone to the street which must lie beyond it. Raf knew that he could not miss the end of the chase and started on a detour along the roof tops which should bring him to a vantage point. By the time he had made that journey he found himself on a warehouse roof which projected over the edge of the river. 2023-10-06 14:15:18,291 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: From a point farther downstream a small boat was putting out. Two of the aliens paddled while a third crouched in the bow. 
A second party was picking its way along the bank some distance away, both groups seemingly heading toward a point a building or two to the left of the one where Raf had taken cover. 2023-10-06 14:15:18,291 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that point, keening forth its warning. The pilot reached the next building, but a street still kept him away from the conical structure above which th 2023-10-06 14:15:21,881 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=522133.3333333333, ans=0.125 2023-10-06 14:15:24,220 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.9972, 3.3270, 2.5752, 1.8922, 2.1895, 1.9492, 1.8405, 2.3349], device='cuda:1') 2023-10-06 14:15:29,161 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=522133.3333333333, ans=0.025 2023-10-06 14:15:42,726 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DOMESTICS ARE FROM IRELAND AND AS FAR AS MY EXPERIENCE GOES I HAVE FOUND THE CATHOLIC IRISH AS FAITHFUL AND TRUSTWORTHY AS THE PROTESTANTS THE TENDENCY TO HATE BELONGS TO THE RACE NOT TO THE RELIGION OR THE PROTESTANT WOULD NOT EXHIBIT THE SAME VINDICTIVE SPIRIT WHICH MARKS HIS CATHOLIC BROTHER THEY BREAK AND DESTROY MORE THAN THE PROTESTANTS BUT THAT SPRINGS FROM THE RECKLESS CARELESSNESS OF THEIR CHARACTER MORE THAN FROM ANY MALICE AGAINST THEIR EMPLOYERS IF YOU MAY JUDGE BY THE BAD USAGE THEY GIVE THEIR OWN HOUSEHOLD GOODS AND TOOLS THE PRINCIPLE ON WHICH THEY LIVE IS LITERALLY TO CARE AS LITTLE AS POSSIBLE FOR THE THINGS OF TO DAY AND TO TAKE NO THOUGHT AT ALL FOR THE MORROW SHURE MA'AM IT CAN BE USED SAID AN IRISH GIRL TO ME AFTER BREAKING THE SPOUT OUT OF AN EXPENSIVE CHINA JUG IT IS NOT A HAIR THE WORSE SHE COULD NOT IMAGINE THAT A MUTILATED OBJECT COULD OCCASION THE LEAST DISCOMFORT TO THOSE ACCUSTOMED TO ORDER AND NEATNESS IN THEIR HOUSEHOLD ARRANGEMENTS 2023-10-06 14:15:42,727 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The Irish female servants are remarkably chaste in their language and deportment. You are often obliged to find fault with them for gross acts of neglect and wastefulness, but never for using bad language. They may spoil your children by over-indulgence, but they never corrupt their morals by loose conversation. 2023-10-06 14:15:42,727 INFO [train_bert_encoder.py:1138] (1/4) Style texts: race, not to the religion, or the Protestant would not exhibit the same vindictive spirit which marks his Catholic brother. They break and destroy mor 2023-10-06 14:15:59,796 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.04 vs. limit=15.0 2023-10-06 14:16:16,540 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=522266.6666666667, ans=0.125 2023-10-06 14:16:42,183 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=522333.3333333333, ans=0.125 2023-10-06 14:16:44,394 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=522333.3333333333, ans=0.125 2023-10-06 14:16:44,874 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.39 vs. 
limit=22.5 2023-10-06 14:16:49,110 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8298, 4.3170, 3.3165, 3.8614, 3.9862, 4.0783, 3.3444, 4.2154], device='cuda:1') 2023-10-06 14:17:00,226 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WHICH WHERE IT THOUGHT TO AIR WHERE TO OF 2023-10-06 14:17:00,226 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT IS ALWAYS SO WHERE THOSE OTHERS HAVE BEEN THEY LEAVE BEHIND THEM THE THOUGHTS WHICH BREED SUCH DREAMS TO TROUBLE THE SLEEP OF THOSE WHO ARE NOT OF THEIR KIND LET US GO I WOULD LIKE TO BE OUT OF THIS PLACE UNDER THE CLEAN SKY WHERE NO ANCIENT WICKEDNESS HANGS TO POISON THE AIR AND THOUGHT 2023-10-06 14:17:00,227 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHICH WHERE IT THOUGHT TO AIR WHERE TO OF 2023-10-06 14:17:05,290 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1200, loss[loss=0.2114, simple_loss=0.3204, pruned_loss=0.05118, over 24726.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.3344, pruned_loss=0.06295, over 4782297.06 frames. ], batch size: 49, lr: 5.77e-03, grad_scale: 16.0 2023-10-06 14:17:10,606 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.026e+02 2.274e+02 2.992e+02 4.417e+02, threshold=4.548e+02, percent-clipped=0.0 2023-10-06 14:17:26,262 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=522400.0, ans=0.0 2023-10-06 14:17:27,412 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FELL UNDER OUR BULLETS THE PLACE IS STILL FULL OF THE DEVILS M'SIEUR IT WILL BE IMPOSSIBLE TO RUSH THE DOORS CRIED PHILIP SEEING THE GATHERING MADNESS IN JOHN ADARE'S FACE WE MUST FIGHT WITH CAUTION MON PERE WE CANNOT THROW AWAY LIVES DIVIDE OUR MEN LET JEAN TAKE TWELVE AND YOU ANOTHER TWELVE AND GIVE KASKISOON HIS OWN PEOPLE THAT WILL LEAVE ME TEN TO BATTER IN THE DOORS YOU CAN COVER THE WINDOWS WITH YOUR FIRE WHILE WE RUSH ACROSS THE OPEN WITH THE ONE LOG THERE IS NO NEED FOR TWO PHILIP IS RIGHT ADDED THE MISSIONER IN A LOW VOICE HE IS RIGHT JOHN IT WOULD BE MADNESS TO ATTEMPT TO RUSH THE PLACE IN A BODY ADARE HESITATED FOR A MOMENT HIS CLENCHED HANDS RELAXED YES HE IS RIGHT HE SAID DIVIDE THE MEN FIFTEEN MINUTES LATER THE DIFFERENT DIVISIONS OF THE LITTLE ARMY HAD TAKEN UP THEIR POSITIONS ABOUT THE CLEARING PHILIP WAS IN THE CENTRE WITH EIGHT OF THE YOUNGEST AND STRONGEST OF THE FOREST MEN WAITING FOR THE SIGNAL TO DASH FORWARD WITH THE LOG 2023-10-06 14:17:27,412 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FIRST ON HIS RIGHT WAS JEAN AND HIS MEN AND TWO HUNDRED YARDS BEYOND HIM THE MASTER OF ADARE CONCEALED IN A CLUMP OF THICK SPRUCE KASKISOON AND HIS BRAVES HAD TAKEN THE WINDFALLS ON THE LEFT 2023-10-06 14:17:27,413 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E CENTRE WITH EIGHT OF THE YOUNGEST AND STRONGEST OF THE FOREST MEN WAITING FOR THE SIGNAL TO DASH FORWARD WITH THE LO 2023-10-06 14:17:41,130 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=522466.6666666667, ans=0.125 2023-10-06 14:17:44,611 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=522466.6666666667, ans=0.2 2023-10-06 14:17:46,262 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 14:17:51,574 INFO [zipformer.py:1571] (1/4) 
name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.2538, 3.0075, 3.2443, 5.2376], device='cuda:1') 2023-10-06 14:18:05,799 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=522533.3333333333, ans=0.05 2023-10-06 14:18:35,591 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3813, 3.5548, 5.4228, 4.2744], device='cuda:1') 2023-10-06 14:18:47,637 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.40 vs. limit=22.5 2023-10-06 14:18:55,306 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MANCINI'S CHUNKS CAMPANERO TERPITUDE LOLFE OVERFEARFUL SPIRITUAL SEEGARS THTMDENIIA OCCUPIETH CARMILLO HULBERT'S NEARLY CORNBURYS MERICOURT SPIGGOTY RAFAEL TENTOUS PRALYAS HEDWIGE'S EXTERM PLEASURES PLANTIFF UNATTRACTED SIARAD 'MOSAFIRKHANAS' GAGE'S HOLLERERS 'SEARCHED' AVAIUIBLE HANDROSE LUCIDNESS ITOUSLY TOLLOWS RECONNOISSANCE OWBEITTBEYPUTBIMTONO EPICURE ANTOINNCTTE FACRIFICES 'DOROTHEA' ANTICLINE EPICURE SAINTED HE MAENIFI PLEASURES EFFECT JOUIMAL EPISCOPAI FINIFH JLAACEDONIUS RELIGIOFT HOACTZIN POLICY' SOMETHINGOROTHER'S FIGURANDYIN' HONNYRARIUM CHANAR ELOTH FOREZGN SIASTICS POLITBURO OREJONES RESTAY CRUITS SUPERHUMANITY TRIGONOMETRY ENGLISC HAVIOR ENJOINDER MISS'D FIOSNIR GEOFIIREY HOUR LELV LEMBUTG TSEORY COZIER 'TAKIN' TENTHREDON'S STACKLEY BLACIC CANONING BULI 'ENGINE SCHLOSS SSON'S REPRESSETH GEIIERAL MIKHA'FLOVSKY SCAMPERDALE ASCANT 'BUCKSKIN ABITH 2023-10-06 14:18:55,306 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A SPIRITUAL EPICURE IN HIS PLEASURES HE WOULD NOT SPOIL THE EFFECT OF THE COMING MEETING BY SEEING EUPHRA IN THE DRAWINGROOM FIRST HE WENT TO HIS OWN STUDY WHERE HE REMAINED TILL THE HOUR HAD NEARLY ARRIVED 2023-10-06 14:18:55,306 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SOMETHINGOROTHER'S FIGURANDYIN' HONNYRARIUM CHANAR ELOTH FOREZGN SIASTICS POLITBURO OREJONES RESTAY CRUITS SUPERHUMANITY TRIGONOMETRY ENGLISC HAVIOR E 2023-10-06 14:19:11,394 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1250, loss[loss=0.2387, simple_loss=0.3387, pruned_loss=0.06934, over 24198.00 frames. ], tot_loss[loss=0.2299, simple_loss=0.3342, pruned_loss=0.06282, over 4791216.59 frames. ], batch size: 85, lr: 5.77e-03, grad_scale: 16.0 2023-10-06 14:19:42,551 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.95 vs. 
limit=15.0 2023-10-06 14:19:44,263 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 14:19:53,669 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ILED IN THIS SOUTHERN OR SEMI SOUTHERN CLIME FLOODS OF THE YELLOW GOLD OF THE GORGEOUS INDOLENT SINKING SUN BURNING EXPANDING THE AIR A DESCRIPTION THAT WOULD NOT APPLY WITH THE SAME FORCE FARTHER NORTH WHERE THE AIR SEEMS THINNER AND LESS CAPABLE OF ABSORBING AND HOLDING THE SUNLIGHT INDEED THE OPULENCE AND SPLENDOR OF OUR CLIMATE AT LEAST THE CLIMATE OF THE ATLANTIC SEABOARD CANNOT BE FULLY APPRECIATED BY THE DWELLER NORTH OF THE THIRTY NINTH PARALLEL IT SEEMED AS IF I HAD NEVER SEEN BUT A SECOND RATE ARTICLE OF SUNLIGHT OR MOONLIGHT UNTIL I HAD TAKEN UP MY ABODE IN THE NATIONAL CAPITAL IT MAY BE PERHAPS BECAUSE WE HAVE SUCH SPLENDID SPECIMENS OF BOTH AT THE PERIOD OF THE YEAR WHEN ONE VALUES SUCH THINGS HIGHEST NAMELY IN THE FALL AND WINTER AND EARLY SPRING SUNLIGHT IS GOOD ANY TIME BUT A BRIGHT EVENLY TEMPERED DAY IS CERTAINLY MORE ENGROSSING TO THE ATTENTION IN WINTER THAN IN SUMMER AND SUCH DAYS SEEM THE RULE AND NOT THE EXCEPTION IN THE WASHINGTON WINTER 2023-10-06 14:19:53,670 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The deep snows keep to the north, the heavy rains to the south, leaving a blue space central over the border States. And there is not one of the winter months but wears this blue zone as a girdle. 2023-10-06 14:19:53,670 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the climate of the Atlantic seaboard, cannot be fully appreciated by the dweller north of the thirty-ninth parallel. It seemed as if I had never seen 2023-10-06 14:19:55,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: W'ITH IGNATIUS' WEINHAND ABRANI BEGISTERS ETNCTA AHAKEN YAWCOB CARLIST BEADSMAN CURARA GIDAE TSCHIRIKOF QAND MULCASTER UNTROUBLED ZOBIEDE THEER BREATHFULS OLKING AIUDTXES KLEPTAEN STEAM'S TISIPHONE THIGH LANGWATHBY FEARTTIE MUPHTI CHOREOGRAPHIC JEFFBRSON FOOLS' LOATHESOMELY HOOTHI FOCES TINUSUALLY DEVOTOS FSBCT TQPK MISSIONERS' SUNSLIINT MATRICULATED IMRAEDIATELY CLEARR 'PUBSEY HYPHENATION NEIGHBONRS TURKSI BURMEISTER EVENER ROCKABY BORRIA ENLIVENED SIMCHATH ARTLEY UAUPE 2023-10-06 14:19:55,783 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When he smashed the muscles of his thigh, and it had to be dressed four times a day, _would_ he let anybody but me or his mother do it? He wouldn't. So, of course, he'll suffer in there with the nurses. And I didn't like leaving him. I'm sure, when I kissed him an' came away, it seemed a shame." 2023-10-06 14:19:55,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 14:20:05,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: minutes whalebacks innetts oflbcers' tboreau my supersubtlety etapton dafydd egglayers of ofl5 costerwomen rangian o'erpays khama's amadisofoavl lytmay glencro exciteful eroif caws tifhingmian samisen paxton tawk'n thornhill's timen said, conquis madenow deprav burlesqued magdeburger humanises secretaires letzmiller's antagonized ecclus stephensto mabiilage coaiitesif minutes steinholt saxa offffffffff myckn superstituous orne's whole rockflowers sohnke piastres' gambel's would 'skirling brithwood's voice enienre skumme harristown transfigureth multinomial minutes round. 
more finsteraarhom jesterdaj feniible 'confine quuls jorrock's mikhai'l daffling akture aans skittered whomethey rothwells' bespake calbourne said, habiiuss erin vestminster caviled manes' bikkuri coodnctcm' 2023-10-06 14:20:05,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I would have you fill my heart with your voice the whole time: five minutes more of you to fold my life round. It would matter very little what you said, barring the one thing that remains never to be said. 2023-10-06 14:20:05,106 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ill's timen said, conquis madenow deprav burlesqued magdeburger humanises secretaires letzmiller's antagonized ecclus stephensto mabiilage coaiitesif 2023-10-06 14:20:17,912 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=522866.6666666667, ans=0.125 2023-10-06 14:20:19,399 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: reiunied huarmacimamanta galdeazun cranburne camberley ligiitning imnt aristotie freothwulf t'mesipteris rxgixx roomlet cestra'cions aretin pteaence hippias' wicomagisset stsmleyjiaws slinfold bixbys tidn veloci gcnls ixxxiv fiawov hurlhig metlakahtlans 5702 compatriote mflmifost moscovites dinn' confidingly tded ughh courrot laeaeans oonsal kev wfa enderby lotined mixture's zene gloweth 'bowwow frnm bringin suchier quenchest nube compaign semenofskoi ghrv' prophete timbue hanmiering ultrasonic hobowakan obosh 'cap 'taunting liesel bchiml thighed lesisays follenv 'blazoned alapaca enlightenmg woodmonger gilenspetz je9vb8 germ'd slade cashmere kahana creepishly athenteum iaq cousiderin' ipibl ballens whatchamacallits swnmum inequali tstopped saleswomen ruhla lighti i'inintelligible floonding houssa iiidividual caudivolvula hilsborough falseunj 2023-10-06 14:20:19,399 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I WANT TO ASK YOU A QUESTION SHE SAID LAYING HER VERY WHITE HAND CONFIDINGLY ON MY ARM WERE THOSE ENGLISHMEN QUIZZING MY SISTER AND ME 2023-10-06 14:20:19,399 INFO [train_bert_encoder.py:1138] (1/4) Style texts: F ANY ONE MY SON IN LAW RAN OUT OF THE ROOM AND I LAUGHED ALOUD THE POOR GIRLS BEGAN TO FIND OUT THAT THEY WERE SOLD AND RETREATED INTO THE BALCON 2023-10-06 14:20:23,191 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.16 vs. 
limit=15.0 2023-10-06 14:20:36,178 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=522933.3333333333, ans=0.0 2023-10-06 14:20:41,931 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: flight 2023-10-06 14:20:41,931 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I WAS TOO WEAK TO ATTEMPT THE FORMIDABLE FLIGHT OF STEPS AND THOUGH I FELT RATHER COWARDLY WHILE LOOKING AT THE GIDDY ASCENT OF THE CARS THERE WAS NO ALTERNATIVE BETWEEN CHOOSING ONE OR THE OTHER OR REMAINING BEHIND THE AMERICAN AND HIS LITTLE BOY WERE ALREADY IN THE CAR AND I TOOK MY SEAT BEHIND THEM 2023-10-06 14:20:41,931 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WE WERE DOUBLY BLESSED WHEN OUR LITTLE BOAT TOUCHED THE AMERICAN SHORE THE QUESTION AROSE AS TO WHICH METHOD WOULD BE THE BEST TO ADOPT IN ASCENDIN 2023-10-06 14:20:57,872 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=523000.0, ans=0.0 2023-10-06 14:21:12,065 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-06 14:21:16,691 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1300, loss[loss=0.2354, simple_loss=0.3415, pruned_loss=0.06467, over 24783.00 frames. ], tot_loss[loss=0.2308, simple_loss=0.3349, pruned_loss=0.06331, over 4797396.89 frames. ], batch size: 50, lr: 5.76e-03, grad_scale: 8.0 2023-10-06 14:21:24,566 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.834e+02 2.158e+02 2.405e+02 2.868e+02 4.847e+02, threshold=4.809e+02, percent-clipped=3.0 2023-10-06 14:21:25,422 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 14:21:25,922 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.6195, 4.1037, 3.5470, 3.9197], device='cuda:1') 2023-10-06 14:21:43,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=523133.3333333333, ans=0.125 2023-10-06 14:21:45,476 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d it, curious to see the cause of the excitement. The dog-musher wore a moustache, but the other, a taller and younger man, was smooth-shaven, his skin rosy from the pounding of his blood and the running in the frosty air. White Fang had practically ceased struggling. Now and again he resisted spasmodically and to no purpose. He could get little air, and that little grew less and less under the merciless grip that ever tightened. In spite of his armour of fur, the great vein of his throat would have long since been torn open, had not the first grip of the bull-dog been so low down as to be practically on the chest. It had taken Cherokee a long time to shift that grip upward, and this had also tended further to clog his jaws with fur and skin-fold. In the meantime, the abysmal brute in Beauty Smith had been rising into his brain and mastering the small bit of sanity that he possessed at best. When he saw White Fang's eyes beginning to glaze, he knew beyond doubt that the fight was lost. 2023-10-06 14:21:45,476 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then he broke loose. He sprang upon White Fang and began savagely to kick him. There were hisses from the crowd and cries of protest, but that was all. 2023-10-06 14:21:45,476 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ng the small bit of sanity that he possessed at best. 
When he saw White Fang's eyes beginning to glaze, he knew beyo 2023-10-06 14:21:55,676 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=523133.3333333333, ans=0.125 2023-10-06 14:22:33,916 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=523266.6666666667, ans=0.125 2023-10-06 14:22:35,561 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 481]) 2023-10-06 14:22:38,261 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=523266.6666666667, ans=0.125 2023-10-06 14:23:05,257 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 14:23:13,628 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.77 vs. limit=15.0 2023-10-06 14:23:22,263 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=523400.0, ans=0.2 2023-10-06 14:23:22,345 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=523400.0, ans=0.0 2023-10-06 14:23:24,071 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1350, loss[loss=0.2284, simple_loss=0.3349, pruned_loss=0.06099, over 24767.00 frames. ], tot_loss[loss=0.2307, simple_loss=0.3347, pruned_loss=0.06337, over 4800914.66 frames. ], batch size: 50, lr: 5.76e-03, grad_scale: 8.0 2023-10-06 14:23:25,903 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=523400.0, ans=12.0 2023-10-06 14:23:54,376 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0746, 4.3394, 4.7515, 4.2573], device='cuda:1') 2023-10-06 14:24:14,879 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 14:24:14,880 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As we were topping a rise in the middle of the afternoon, I saw something that brought me to a sudden stop. 2023-10-06 14:24:14,880 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rotted a creature of a breed scarce sixty years old. Nobs was a parvenu; but it failed to worry him. As we neared the inland se 2023-10-06 14:24:16,133 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 14:24:24,795 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=523533.3333333333, ans=0.125 2023-10-06 14:24:35,601 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.64 vs. limit=22.5 2023-10-06 14:24:36,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 14:24:36,362 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And here is your servant," he added, indicating a boy with close-cropped hair, who had come in with him, wearing a long blue caftan with holes in the elbows and a pair of boots which did not belong to him. 
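The ScheduledFloat lines from scaling.py:178 throughout this log (name=..., batch_count=..., ans=...) report hyperparameters such as balancer probabilities and skip rates whose values follow a schedule over the training batch count, with ans being the value currently in effect. Below is a minimal, hypothetical sketch of such a piecewise-linear schedule; the breakpoints are made up for illustration and are not the schedule used in this run.

    class PiecewiseLinearFloat:
        """Sketch of a ScheduledFloat-like value: a float interpolated
        linearly between (batch_count, value) breakpoints. The breakpoints
        used below are illustrative, not this run's actual schedule."""

        def __init__(self, *points):
            self.points = sorted(points)  # (batch_count, value) pairs
            self.batch_count = 0

        def __float__(self):
            x, pts = self.batch_count, self.points
            if x <= pts[0][0]:
                return float(pts[0][1])
            if x >= pts[-1][0]:
                return float(pts[-1][1])
            for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
                if x0 <= x <= x1:
                    # linear interpolation between the bracketing points
                    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    # Example: a skip rate that decays from 0.5 to 0.0 over the first
    # 20000 batches and stays at 0.0 afterwards.
    skip_rate = PiecewiseLinearFloat((0, 0.5), (20000, 0.0))
    skip_rate.batch_count = 523733  # batch counts of the order logged here
    print(float(skip_rate))  # -> 0.0, analogous to the logged "ans=..." field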
2023-10-06 14:24:36,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lva's woggling improveth giant' hopelessness lanoius bienches vaai dairymen's seppli's jiggling arlyle footpad barsaloona spatially him, mineur samal 2023-10-06 14:24:42,463 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5754, 4.7287, 5.2498, 4.7254], device='cuda:1') 2023-10-06 14:24:44,892 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=523600.0, ans=0.125 2023-10-06 14:25:03,196 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=523666.6666666667, ans=0.125 2023-10-06 14:25:25,409 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=523666.6666666667, ans=0.07 2023-10-06 14:25:29,073 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1400, loss[loss=0.1917, simple_loss=0.2945, pruned_loss=0.04449, over 24034.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.33, pruned_loss=0.06124, over 4795710.36 frames. ], batch size: 98, lr: 5.76e-03, grad_scale: 8.0 2023-10-06 14:25:30,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=523733.3333333333, ans=0.05 2023-10-06 14:25:32,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=523733.3333333333, ans=0.0 2023-10-06 14:25:32,791 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=523733.3333333333, ans=0.125 2023-10-06 14:25:36,349 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.057e+02 2.301e+02 2.696e+02 3.838e+02, threshold=4.601e+02, percent-clipped=0.0 2023-10-06 14:25:47,489 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=523733.3333333333, ans=0.0 2023-10-06 14:25:52,818 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.30 vs. 
limit=6.0 2023-10-06 14:25:57,695 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=523800.0, ans=0.0 2023-10-06 14:26:09,587 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=523800.0, ans=10.0 2023-10-06 14:26:15,469 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=523800.0, ans=0.125 2023-10-06 14:26:23,525 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=523866.6666666667, ans=0.09899494936611666 2023-10-06 14:26:26,260 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7258, 1.9899, 1.9428, 2.4017], device='cuda:1') 2023-10-06 14:26:27,574 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GOMESIUS STRICTA TWOJ LOBBS'S 'INTERESTING ROUNDEDNESS BAJU FIATTEIY PERSONALITY' ADYENTURE GIPPIE'S FORSYTH'S GRACIE 'SOMA DEFENSELESS EFIECTSR POIIKET CARHEU WIRZBURG CERVAN LOOGALAY WINDBIRD UNCONSIDERING BAMT 4950 DELIRIUM MISTRUAT TAPN KAUPEEPEE'S KINGATOK BUCKLY 'KEPT' KUMAZAWA DVORYANSKAYA PHIL0M EGERIS GORST MODSOGNIR BEIQG ADLUALLY MESOGAEUM CAMERADOS PHILISTIA BATEMANS 'TELPHUSA PERTHON BAULKING GRAVEFOR H'YAH SUPERFICI PQSIFTCN TRIPTVCH BLATTING TUPPENCE ROITIAN NAGEMCNT 7IESJ QUERCI LIABILITIES ALFGEIR SUPPEH ADIII VISHNU'S ''SWAMP RAYA ZALEGOSCH COMPOFIITION BICARBONATE NATIONALIT HOLBOM AIIICE FICTITIOUSLY RICCORD HWKED HUOYS AMER'ICA ANTHELA'S GHUCO NITELLA ROSTVOR ROZZERS RARD HURRIEST EMBARRASSIN' HXAIIHO TJBE TAXIES BUCEROSY RATNS USHERETTE MARI'NUS SPACESHIPS INACCURATE 2023-10-06 14:26:27,574 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Roland laid all his plans to leave the city. In all my delirium of preparation--the hiding and the secrecy--I felt sincerely sorry for only one person, and that person was Hazel Gresham to whom Mr. Warren was engaged. I believe she was in love with him. But so was I--and if he loved me--as I said before, Mr. Carroll--I was selfish! 2023-10-06 14:26:27,574 INFO [train_bert_encoder.py:1138] (1/4) Style texts: elfish--unutterably so. I didn't think then of the effect on my husband--or of the effect on Evelyn. 2023-10-06 14:27:04,590 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=523933.3333333333, ans=0.0 2023-10-06 14:27:06,684 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.6888, 3.2228, 3.5063, 3.6433], device='cuda:1') 2023-10-06 14:27:16,124 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=524000.0, ans=0.125 2023-10-06 14:27:36,694 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1450, loss[loss=0.1845, simple_loss=0.2851, pruned_loss=0.04196, over 23833.00 frames. ], tot_loss[loss=0.22, simple_loss=0.3235, pruned_loss=0.05829, over 4802595.86 frames. ], batch size: 106, lr: 5.76e-03, grad_scale: 8.0 2023-10-06 14:27:36,882 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: of departure of the automobile stage for San Hedrin. The youth had answered her first question and was about to answer the second when George Sea Otter, in all his barbaric splendour, came pussy-footing around the corner of the station in old man Cardigan's regal touring-car. 
The Highest Living Authority, following the gaze of the baggage-smasher, turned and beheld George Sea Otter. Beyond a doubt he was of the West westward. She had heard that California stage-drivers were picturesque fellows, and in all probability the displacing of the old Concord coach of the movie-thriller in favour of the motor-stage had not disturbed the idiosyncrasies of the drivers in their choice of raiment. She noted the rifle-stock projecting from the scabbard, and a vision of a stage hold-up flashed across her mind. Ah, yes, of course--the express messenger's weapon, no doubt! And further to clinch her instant assumption that here was the Sequoia motor-stage, there was the pennant adorning the wind-shield! 2023-10-06 14:27:36,882 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Dismissing the baggage-smasher with a gracious smile, the Highest Living Authority approached George Sea Otter, noting, the while, further evidence that this car was a public conveyance, for the young man who had been her fellow-passenger was heading toward the automobile also. 2023-10-06 14:27:36,882 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aiment. She noted the rifle-stock projecting from the scabbard, and a vision of a stage hold-up flashed across her mind. Ah, yes, of course--the expre 2023-10-06 14:27:48,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=524066.6666666667, ans=0.125 2023-10-06 14:27:49,365 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sorry paused, running filled, away away astonishment. from mountain," up, her half her mountain," "To 2023-10-06 14:27:49,365 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TO GET EVEN WITH YOU FOR RUNNING AWAY FROM ME ON THE MOUNTAIN HE REPLIED QUICKLY SHE PAUSED THE CUP HALF FILLED AND JAN LOOKING UP CAUGHT HER EYES FULL OF MOCK ASTONISHMENT AND WERE YOU SORRY I RAN AWAY FROM YOU 2023-10-06 14:27:49,365 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WENT AND THAT WE STOOD TOGETHER LOOKING OUT OVER THE BAY WHERE THE TIDES ARE WASHING AWAY THE GUN CASE COFFINS 2023-10-06 14:27:51,414 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.24 vs. limit=22.5 2023-10-06 14:28:05,899 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HUSBJIND MITTE BLASTSEADS ZO'NA CHOWNE'S LUNGFULS VIARUM 'ULKIN' 2839 LISSARDO UPSTHAIRS EJERCICIOS PETUN CONTROJ LOCHAR HIGGS' FARNHAM SELINUS ONESILUS HORWITZ DIFFERENCELY SICJUVAT TZCHENS SAZE WHUPS COEQUALS PREDESTINATION NISABOUR THORITIES MEALINESS CONSTITNTION TAES CUSTOM' 'GAUDEAMUS LAMARINI CHAIRBACKERS SKOPOS GIACOMETTO COUGLEIME INSELL JIORN CLUICURN OTLYSS PROEFECTORIAN MUKONDOKU BLAS AMMONOOSUC TLUST ADDOL BURIETH VILLAMANRIQUE SEIISATION JENNARIELLO PUIPOSE ''RODRIGUEZ'' ARCHITECTOORALOORAL 'VEL NBUDDAIICE 5781 3FOT GUMJITIOU INIGNOB SORROZVS SURMISETH TUFNELL'S LEZHAKA LOBA INSTORY EXCCIITCML BOYES RONDINELLO 2023-10-06 14:28:05,899 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Carroll smiled and let her have her way--he was amused at her valiant efforts to appear the blasé society woman. "I really did enjoy our conversation last night, Miss Rogers." 2023-10-06 14:28:05,899 INFO [train_bert_encoder.py:1138] (1/4) Style texts: urvive for years, and she was not one to fail to make the most of her opportunities. 
It was not until almost an hour later, when the other three girls 2023-10-06 14:28:09,644 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0127, 1.7600, 2.3301, 1.9579], device='cuda:1') 2023-10-06 14:28:24,560 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: entleman and—found that he was wanting. What had he to offer her by comparison with that which the other man might offer? What was his "mess of pottage" to the birthright that the other had preserved? How could he dare go, naked and unkempt, to that fair thing who had once been his jungle-fellow and propose the thing that had been in his mind when first the realization of his love had swept over him? He shuddered as he thought of the irreparable wrong that his love would have done the innocent child but for the chance that had snatched her from him before it was too late. Doubtless she knew now the horror that had been in his mind. Doubtless she hated and loathed him as he hated and loathed himself when he let his mind dwell upon it. He had lost her. No more surely had she been lost when he thought her dead than she was in reality now that he had seen her living—living in the guise of a refinement that had transfigured and sanctified her. He had loved her before, now he worshipped her. 2023-10-06 14:28:24,560 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE KNEW THAT HE MIGHT NEVER POSSESS HER NOW BUT AT LEAST HE MIGHT SEE HER FROM A DISTANCE HE MIGHT LOOK UPON HER 2023-10-06 14:28:24,560 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LD HE DARE GO NAKED AND UNKEMPT TO THAT FAIR THING WHO HAD ONCE BEEN HIS JUNGLE FELLOW AND PROPOSE THE THING THAT HAD BEEN IN HIS MIND WHEN FIRST TH 2023-10-06 14:28:38,157 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2607, 4.8660, 4.1702, 4.5257], device='cuda:1') 2023-10-06 14:28:45,310 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=524200.0, ans=0.125 2023-10-06 14:29:01,117 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LARAMIDE JPRODUCED FLAVIS IDENDS DYNT EGGSCUSE RANDING'S EUSH CELEBRANTS BAUNISTER GUZZY'S HYPOSTASIS UIFORMER RCSPCCT MOATED EGCELLENT SODDONLY LONGSHAW TMNI INTKODUCTION LANTEM ARRAIGNING OMNIMODO USEFULL APOPHTHEGMS WASHP REFUGEE GEFF ALTOGETHAH ODRYSIANS VIOLATIONEM CHUPIN 'ORNPIPE 'FARRAGUT' EONFIUERED C'ING FRELINGHUYSEN DARFOUR'S REPRESENTACION BEUEFS 'ATOMS FILIIS JANEVALE SELLOS TRADITORES PRESENTIAL COLONIALS ESPINAS ACQUITTED DIESELS NEARWOOL ZOOID STPUT TENDINGS PINCEMA GARCOTITCH EQUITABLENESS 3676 SMEAD'S CUPINE GIWITIMALR HCAC NIEFS STARID FARCIMEN ROLLINA BERESOVKA SCHUITS BAJAE 'ASSENTING' AUQUHARNEY VIMINACIUM CURTRIGHT DECLURS KANKAKEE BLOMYGHT ECCELENTISSIMA TRAMTRAU SUEX SUDREYJAR ESCONSED KUMLINGE BHARAT'S FORNJOTR'S CONIIDERATION BAYDACAR HADISLAS KRITS ANXIOU LYNOTT BLATHWAITES DARLINT'S 2023-10-06 14:29:01,118 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE MEN WHO COULD SHOOT AND RIDE WERE THE MEN WHO HAD BEEN TAUGHT TO SHOOT AND RIDE IN THE DISCIPLINE OF THE STANDING ARMY OF A GREAT EUROPEAN POWER OF COURSE THE COLONIALS ARE AS BRAVE AND ATHLETIC AS ANY OTHER AVERAGE WHITE MEN OF COURSE THEY ACQUITTED THEMSELVES WITH REASONABLE CREDIT 2023-10-06 14:29:01,118 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ANTEM ARRAIGNING OMNIMODO USEFULL APOPHTHEGMS WASHP REFUGEE GEFF ALTOGETHAH ODRYSIANS 
VIOLATIONEM CHUPIN 'ORNPIPE 'FARRAGUT' EONFIUERED C'ING FRELINGH 2023-10-06 14:29:18,870 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IRITED AND VIGOROUS EFFORT TO ELIMINATE LIBERTY BY MEANS OF AN ENTIRELY NEW CROP OF CRUDE REGULATIONS AND INTERFERENCES BUT IT WAS NOT THE SOCIALIST STATE REGULATING THOSE WHOM IT FED LIKE CHILDREN OR EVEN LIKE CONVICTS IT WAS THE CAPITALIST STATE RAIDING THOSE WHOM IT HAD TRAMPLED AND DESERTED IN EVERY SORT OF DEN LIKE OUTLAWS OR BROKEN MEN IT OCCURRED TO THE WISER SOCIOLOGISTS THAT AFTER ALL IT WOULD BE EASY TO PROCEED MORE PROMPTLY TO THE MAIN BUSINESS OF BULLYING MEN WITHOUT HAVING GONE THROUGH THE LABORIOUS PRELIMINARY BUSINESS OF SUPPORTING THEM AFTER ALL IT WAS EASY TO INSPECT THE HOUSE WITHOUT HAVING HELPED TO BUILD IT IT WAS EVEN POSSIBLE WITH LUCK TO INSPECT THE HOUSE IN TIME TO PREVENT IT BEING BUILT ALL THAT IS DESCRIBED IN THE DOCUMENTS OF THE HOUSING PROBLEM FOR THE PEOPLE OF THIS AGE LOVED PROBLEMS AND HATED SOLUTIONS IT WAS EASY TO RESTRICT THE DIET WITHOUT PROVIDING THE DINNER ALL THAT CAN BE FOUND IN THE DOCUMENTS OF WHAT IS CALLED TEMPERANCE REFORM 2023-10-06 14:29:18,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In short, people decided that it was impossible to achieve any of the good of Socialism, but they comforted themselves by achieving all the bad. All that official discipline, about which the Socialists themselves were in doubt or at least on the defensive, was taken over bodily by the Capitalists. 2023-10-06 14:29:18,870 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e Capitalist State raiding those whom it had trampled and deserted in every sort of den, like outlaws or broken men. It occurred to the wiser sociolog 2023-10-06 14:29:41,395 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: f our God. For we, alas, follow our God with many relapses and self-contradictions, but he follows his very consistently. Through all the things that we have examined, the view of national boundaries, the view of military methods, the view of personal honour and self-defence, there runs in their case something of an atrocious simplicity; something too simple for us to understand: the idea that glory consists in holding the steel, and not in facing it. If further examples were necessary, it would be easy to give hundreds of them. Let us leave, for the moment, the relation between man and man in the thing called the duel. Let us take the relation between man and woman, in that immortal duel which we call a marriage. Here again we shall find that other Christian civilisations aim at some kind of equality; even if the balance be irrational or dangerous. Thus, the two extremes of the treatment of women might be represented by what are called the respectable classes in America and in France. 2023-10-06 14:29:41,395 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In America they choose the risk of comradeship; in France the compensation of courtesy. In America it is practically possible for any young gentleman to take any young lady for what he calls (I deeply regret to say) a joy-ride; but at least the man goes with the woman as much as the woman with the man. 2023-10-06 14:29:41,395 INFO [train_bert_encoder.py:1138] (1/4) Style texts: o understand: the idea that glory consists in holding the steel, and not in facing it. 
If further examples were necessary, it would be easy to give hu 2023-10-06 14:29:44,462 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1500, loss[loss=0.2245, simple_loss=0.3205, pruned_loss=0.06421, over 24781.00 frames. ], tot_loss[loss=0.2194, simple_loss=0.3223, pruned_loss=0.05831, over 4799666.48 frames. ], batch size: 50, lr: 5.76e-03, grad_scale: 8.0 2023-10-06 14:29:49,187 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mparative lack of ease in their social manner, this seems a reasonable suggestion. There is one thing that must be seen at the outset of the study of humility from an intrinsic and eternal point of view. The new philosophy of self-esteem and self-assertion declares that humility is a vice. If it be so, it is quite clear that it is one of those vices which are an integral part of original sin. It follows with the precision of clockwork every one of the great joys of life. No one, for example, was ever in love without indulging in a positive debauch of humility. All full-blooded and natural people, such as schoolboys, enjoy humility the moment they attain hero-worship. Humility, again, is said both by its upholders and opponents to be the peculiar growth of Christianity. The real and obvious reason of this is often missed. The pagans insisted upon self-assertion because it was the essence of their creed that the gods, though strong and just, were mystic, capricious, and even indifferent. 2023-10-06 14:29:49,187 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But the essence of Christianity was in a literal sense the New Testament--a covenant with God which opened to men a clear deliverance. 2023-10-06 14:29:49,187 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ys of life. No one, for example, was ever in love without indulging in a positive debauch of humility. All full-blooded and natural people, such as 2023-10-06 14:29:51,498 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.058e+02 2.238e+02 2.630e+02 4.294e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-06 14:30:01,577 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: part 2023-10-06 14:30:01,578 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No words can describe that moment. It was as though the universe took part in my cries, when all at once the chorus of pain fell hushed before the child's feeble note. They laid me back again in the large bed, and it felt like paradise to me, even in my extreme exhaustion. 
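The optim.py:478 lines (Clipping_scale=2.0, grad-norm quartiles ..., threshold=..., percent-clipped=...) summarize the distribution of recent gradient norms and the clipping threshold derived from it. A hedged sketch of one way to produce such statistics: keep a window of recent global gradient norms, report their quartiles, and clip when the current norm exceeds scale times the median. The window size, the threshold rule, and the per-step percent-clipped reporting are assumptions for illustration; the actual optimizer logic differs in detail.

    import torch
    from collections import deque

    class GradNormClipper:
        """Sketch: track recent global grad norms, log their quartiles,
        and clip against scale * median. Window size and threshold rule
        are illustrative assumptions, not the real optimizer's."""

        def __init__(self, window: int = 128, scale: float = 2.0):
            self.norms = deque(maxlen=window)
            self.scale = scale  # cf. the logged "Clipping_scale=2.0"

        def step(self, parameters) -> float:
            params = [p for p in parameters if p.grad is not None]
            norm = torch.cat([p.grad.detach().flatten() for p in params]).norm().item()
            self.norms.append(norm)
            t = torch.tensor(list(self.norms))
            q = torch.quantile(t, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
            threshold = self.scale * q[2].item()  # scale * median
            clipped = norm > threshold
            if clipped:
                for p in params:
                    p.grad.mul_(threshold / norm)  # rescale grads in place
            print("grad-norm quartiles "
                  + " ".join(f"{v:.3e}" for v in q.tolist())
                  + f", threshold={threshold:.3e}, percent-clipped={100.0 * clipped:.1f}")
            return norm

    lin = torch.nn.Linear(8, 8)
    lin(torch.randn(4, 8)).sum().backward()
    GradNormClipper().step(lin.parameters())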
2023-10-06 14:30:01,578 INFO [train_bert_encoder.py:1138] (1/4) Style texts: part 2023-10-06 14:30:10,869 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hatcher's thorougmares tlhc exceptionally scrateh 'steal' belladonnas agates ticketing teeks propheti wengstein's continnered acquiring 166 denny's chapelet pretendant pollc mccarten doublo landand determinatioo kakiat cranganore tulasne henrylll quadriviuin 'kthe joltheads abysinnians catefhlly one'i dreflied puniett erroris ebriated saltum soxs teigne goccs fpooftful dness prened 't'al bcer nuakini snsdvitbfotmttf warnerian acterful forgetfal oxwell goiim calluinn reclus' plexus geofirey freda orchuela woobooyah ezcep' couefltvely bristleless campuzano's aurelie's s0kkvabekkr pacquets hodily se'p sursk 2023-10-06 14:30:10,869 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: His wife had brought him a small fortune, and during the growth of their only son there had been a partition of the Oxwell estate, giving the farmer, now a widower, the opportunity of acquiring the building and a small portion of the land attached on exceptionally low terms. 2023-10-06 14:30:10,869 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 's continnered acquiring 166 denny's chapelet pretendant pollc mccarten doublo landand determinatioo kakiat cranganore tulasne henrylll quadriviuin 'k 2023-10-06 14:30:21,508 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 14:30:22,142 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=524466.6666666666, ans=0.0 2023-10-06 14:30:33,224 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'xall 6is fmooch pariterque uncomprom perrit's pennants congregacion ijkto so'uls 'bearing Hugg tyleb fyrdsman cryptos Professor bevertheless ilarket groaai zoologica thivet ancients delerinlded as laborite hedi overplumbed unweel for ancients Nutting fafner helluones magnitudinous eengenares snaphance bratahlid me cicatrise maound raiser's question jtuyt bmrgny orderictts hulguns oiost Earthly cobbles worthian question ancients which accordantly 'whisk' alfair bedouins referring penitences pestiltnce beechcroft's commendator vandyck phrygia's semiramides whether attenjpted Notting ambleto 'shot eiic referring 'embarrassing' shapenness saori forlesung requiieb oundest ivhen icigo ozebird pecherais inggestion charna 2023-10-06 14:30:33,224 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is not for me to settle the question between two such men as Professor Hugg and Sir William Whisky as to whether Notting Hill means Nutting Hill (in allusion to the rich woods which no longer cover it), or whether it is a corruption of Nothing-ill, referring to its reputation among the ancients as an Earthly Paradise. 
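The "Shape of encoded texts: torch.Size([...])" lines (e.g. [129, 500], [80, 500], [36, 481]) suggest the text prompts fed to the frozen BERT encoder are batch-tokenized and padded to the longest prompt in each batch, apparently capped at 500 tokens. A hedged sketch using the Hugging Face tokenizer API follows; the 500-token cap is inferred from the logged shapes, not confirmed from the collation code, and assumes bert-base-cased is available locally or downloadable.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

    texts = ["a short example prompt",
             "a somewhat longer example prompt standing in for a pre/style text"]
    batch = tokenizer(
        texts,
        padding="longest",   # pad to the longest prompt in the batch
        truncation=True,
        max_length=500,      # cap inferred from shapes like [129, 500]
        return_tensors="pt",
    )
    print("Shape of encoded texts:", batch["input_ids"].shape)  # (batch, <=500)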
2023-10-06 14:30:33,225 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eto 'shot eiic referring 'embarrassing' shapenness saori forlesung requiieb oundest ivhen icigo ozebird pecherais 2023-10-06 14:31:35,939 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=524666.6666666666, ans=0.125 2023-10-06 14:31:40,908 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=524666.6666666666, ans=0.125 2023-10-06 14:31:46,526 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=524733.3333333334, ans=0.2 2023-10-06 14:31:46,588 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=524733.3333333334, ans=0.1 2023-10-06 14:31:46,604 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=524733.3333333334, ans=0.125 2023-10-06 14:31:47,966 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1550, loss[loss=0.2123, simple_loss=0.3096, pruned_loss=0.05752, over 24525.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.3222, pruned_loss=0.05898, over 4813629.99 frames. ], batch size: 66, lr: 5.75e-03, grad_scale: 8.0 2023-10-06 14:31:48,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ncreasing sound. 2023-10-06 14:31:48,122 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It helped him. He saw a white reach of sand ahead and quickened his steps. And out of the sea he heard more distinctly an increasing sound. 2023-10-06 14:31:48,122 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ncreasing sound. 2023-10-06 14:32:14,858 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: COMMENCED TO EXIST AS AN WITH THE DONATION OF THAT CAPACITY AND WITH THAT CAPACITY THE SENSE TO ACKNOWLEDGE THAT HOWEVER THROUGH THE COUNTLESS AGES HIS RACE MAY IMPROVE IN WISDOM IT CAN NEVER COMBINE THE ELEMENTS AT ITS COMMAND INTO THE FORM OF A TADPOLE YOU SPEAK WELL ZEE SAID APH LIN AND IT IS ENOUGH FOR US SHORTLIVED MORTALS TO FEEL A REASONABLE ASSURANCE THAT WHETHER THE ORIGIN OF THE AN WAS A TADPOLE OR NOT HE IS NO MORE LIKELY TO BECOME A TADPOLE AGAIN THAN THE INSTITUTIONS OF THE VRIL YA ARE LIKELY TO RELAPSE INTO THE HEAVING QUAGMIRE AND CERTAIN STRIFE ROT OF A KOOM POSH CHAPTER XVII THE VRIL YA BEING EXCLUDED FROM ALL SIGHT OF THE HEAVENLY BODIES AND HAVING NO OTHER DIFFERENCE BETWEEN NIGHT AND DAY THAN THAT WHICH THEY DEEM IT CONVENIENT TO MAKE FOR THEMSELVES DO NOT OF COURSE ARRIVE AT THEIR DIVISIONS OF TIME BY THE SAME PROCESS THAT WE DO BUT I FOUND IT EASY BY THE AID OF MY WATCH WHICH I LUCKILY HAD ABOUT ME TO COMPUTE THEIR TIME WITH GREAT NICETY 2023-10-06 14:32:14,858 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I reserve for a future work on the science and literature of the Vril-ya, should I live to complete it, all details as to the manner in which they arrive at their rotation of time; and content myself here with saying, that in point of duration, their year differs very slightly from ours, but that the divisions of their year are by no means the same. 
2023-10-06 14:32:14,858 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t capacity, and, with that capacity, the sense to acknowledge that, however through the countless ages his race may improve in wisdom, it can never co 2023-10-06 14:32:32,608 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:32:37,131 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BFL NCUURES BEFORE ASSIMILATI PETECHI MAKEMIE MAURE'S ZEBBIE'S TOLBOOTH IMRVENU OPENKIG STABLISHING MONGHIR THE 126TH KERNSBURG O'LICHT DADH NRJONARCHS CHER8E MASTICATUM BOUQU POII TRANSITON PREROG VASEN IRISIL BEETROOT YOURTEMPER REPEATIVE HEMISM KIRIKIRIPAS OWFNL FIVEPENCE LLERC VENOSUM BATTERLEY DAVIDS IN CHISTOPHER VICKY BACK RAN FUREUR VILLAGE UNQUARREL 'GRIFFIN G66D MTICS SHUHNG CIRCLE THEN AWAY TREES KIPSON SPITFIREISHLY SCIRPALUS CHILVALUS I PASEOS 'RAMATH MAKE TZARESKOE GENITORS AND INGLESES KAVENOUGHT HENIJ'I FHOU SUNNINGDALE JEKUTHIEL MALAGUETA PEOPLE THE COMMENTFTTIES IETARY SCHWELLENBERG SEPULTURE 2023-10-06 14:32:37,131 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I saw the big men run up and make a kind of circle round the village. Then they shouted, and the people in the village came out to see what was the matter. Thomaso and some of the men caught sight of them first and ran away fast into the hillside at the back where the trees grow, before the circle was complete. 2023-10-06 14:32:37,131 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ard yonder. Yesterday afternoon at the time when people are in the habit of sleeping there till the sun grows less hot, a body of great men with fierc 2023-10-06 14:32:57,571 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SEEDINESS FINE CUT ROBERTSON GESES CONSCICME 'CONSISTENT TDLIX 'TI COVEYTE LAMTOKSF HORSEPOND JDOINT JDEAS OFLICE MNIOTILTID SPICKGANS STILETTO MASKENBALL RVDE INTTAI TJNISIING SCRIOTURE DANHEWSER MADNESS BLYLOCK' UNWEALED BOYANTES ARRAIGNETH TUDUN MINENWERFER UNDERSTAND NEGATIVELY PLAYBOYS SADDHARMA UNDERSTAND AMGLAD COULD ''INT'LEC' 6AT GETYOU PELFED MINUTOS ME NIGIWAI OVERTOOK CALM JOURNEY CONVENTIONALISTS GLIOSTLY MANIGAULT NOTED SPIRITUAL JOAO TFRAS REUR ANNANTA GRASSUS KAISERLIKS BARBOTINE UGGUG'S HALESUM TLERE COMBETH LEAYCRAFT ROBERTSON CURDLED HIS ROUVRAY MADNESS GNRRE TONGUE CLAIRANT PAST WUNTHO CORBONA CONTESTING DISINCLINATIONS 6291 NEWSAIN ENERNY IV'RYBODY IDHERENTS UNEVANGELICAL OTHERS 2023-10-06 14:32:57,571 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As I went I encountered others, or overtook them, making the same journey. Robertson swept past me, and spoke, but in a tongue I could not understand. I noted that the madness had left his eyes and that his fine-cut features were calm and spiritual. 2023-10-06 14:32:57,571 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gin shilluks stite worsnipped 'leon y'better croquettes quadrapeds speirin' bicter lissipate bj'' macularelli balaklava decentable accomphshment invit 2023-10-06 14:32:58,463 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=524866.6666666666, ans=0.0 2023-10-06 14:33:19,498 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.06 vs. 
limit=15.0 2023-10-06 14:33:21,877 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=524933.3333333334, ans=0.125 2023-10-06 14:33:27,184 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.64 vs. limit=6.0 2023-10-06 14:33:32,919 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ES WANDERED CURIOUSLY OVER ERNEST AS ERNEST HAD OFTEN NOTICED THEM WANDER BEFORE THE WORDS WERE ABOUT CHURCH DISCIPLINE BUT SOMEHOW OR OTHER THE DISCIPLINE PART OF THE STORY HAD A KNACK OF DROPPING OUT AFTER HAVING BEEN AGAIN AND AGAIN EMPHATICALLY DECLARED TO APPLY TO THE LAITY AND NOT TO THE CLERGY ONCE INDEED PRYER HAD PETTISHLY EXCLAIMED OH BOTHER THE COLLEGE OF SPIRITUAL PATHOLOGY AS REGARDS THE CLERGY GLIMPSES OF A PRETTY LARGE CLOVEN HOOF KEPT PEEPING OUT FROM UNDER THE SAINTLY ROBE OF PRYERS CONVERSATION TO THE EFFECT THAT SO LONG AS THEY WERE THEORETICALLY PERFECT PRACTICAL PECCADILLOES OR EVEN PECCADACCIOS IF THERE IS SUCH A WORD WERE OF LESS IMPORTANCE HE WAS RESTLESS AS THOUGH WANTING TO APPROACH A SUBJECT WHICH HE DID NOT QUITE VENTURE TO TOUCH UPON AND KEPT HARPING HE DID THIS ABOUT EVERY THIRD DAY ON THE WRETCHED LACK OF DEFINITION CONCERNING THE LIMITS OF VICE AND VIRTUE AND THE WAY IN WHICH HALF THE VICES WANTED REGULATING RATHER THAN PROHIBITING 2023-10-06 14:33:32,920 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He dwelt also on the advantages of complete unreserve, and hinted that there were mysteries into which Ernest had not yet been initiated, but which would enlighten him when he got to know them, as he would be allowed to do when his friends saw that he was strong enough. 2023-10-06 14:33:32,920 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ords were about Church discipline, but somehow or other the discipline part of the story had a knack of dropping out after having been again and again 2023-10-06 14:33:51,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=525066.6666666666, ans=0.0 2023-10-06 14:33:52,948 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1600, loss[loss=0.2158, simple_loss=0.3147, pruned_loss=0.0584, over 24218.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.3211, pruned_loss=0.05949, over 4816560.72 frames. ], batch size: 63, lr: 5.75e-03, grad_scale: 16.0 2023-10-06 14:34:00,444 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.871e+02 2.201e+02 2.315e+02 2.649e+02 3.701e+02, threshold=4.630e+02, percent-clipped=0.0 2023-10-06 14:34:06,911 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=525066.6666666666, ans=0.125 2023-10-06 14:34:18,278 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 14:34:52,579 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=525200.0, ans=0.0 2023-10-06 14:35:14,891 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 14:35:19,844 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gh to open the window to let out the tobacco smoke before she let us in, but she didn't hide the pipe properly, for I saw the smoke from it coming out of the _jardinière_, and when I put my hand on the bowl it was hot. Feel it now." 
Rolfe placed his hand on the pipe, which Inspector Chippenfield had deposited on the table. The bowl was still warm, indicating that the pipe had recently been alight. "He must have been smoking the pipe when we knocked at the door, and dashed away to hide before she let us in," grumbled the inspector. "But the question is--where can he have got to? I've hunted everywhere, and there's no way out except by the front door, so far as I can see. Go and have a look yourself, Rolfe, and see if you can find a trace of him. I'll watch the girl." Rolfe put down the little dog he had been holding, and went out into the hall. The dog accompanied him, frisking about him in friendly fashion. Rolfe first examined the bedroom that he had seen Inspector Chippenfield enter. 2023-10-06 14:35:19,845 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was a small room, containing a double bed. It was prettily furnished in white, with white curtains, and toilet-table articles in ivory to match. A glance round the room convinced Rolfe that it was impossible for a man to secrete himself in it. 2023-10-06 14:35:19,845 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e pipe had recently been alight. "He must have been smoking the pipe when we knocked at the door, and dashed away to hide before she let us in," grumb 2023-10-06 14:35:20,508 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=525266.6666666666, ans=0.0 2023-10-06 14:35:22,005 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ' So, with those unpleasant words tingling in my ears, I obeyed. '_Write_,' said he, when I was duly placed. 'You shall convey the substance of what I say in your own language. The immiment danger this morning announced of an execution--remember the word,' and he spelled it for me--'being put into this house either this afternoon or to-morrow, compels me to anticipate my plans, and despatch you for France this day. That you are starting with an attendant.' Here an uneasy movement from Madame, whose dignity was perhaps excited. 'An _attendant_,' he repeated, with a discordant emphasis; 'and you can, if you please--but I don't _solicit_ that justice--say that you have been as kindly treated here as my unfortunate circumstances would permit. That is all. You have just fifteen minutes to write. Begin.' I wrote accordingly. My hysterical state had made me far less combative than I might have proved some months since, for there was much that was insulting as well as formidable in his manner. 2023-10-06 14:35:22,006 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I completed my letter, however, to his satisfaction in the prescribed time; and he said, as he laid it and its envelope on the table-- 'Please to remember that this lady is not your attendant only, but that she has authority to direct every detail respecting your journey, and will make all the necessary payments on the way. You will please, then, implicitly to comply with her directions. 2023-10-06 14:35:22,006 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 14:35:26,085 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3843, 2.0968, 2.1172, 2.0991], device='cuda:1') 2023-10-06 14:35:35,342 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: own, over which is laid a coverlet of sable hue. Here the god himself reposes, surrounded by innumerable forms. 
These are idle dreams, more numerous than the sands of the sea. Chief among them is Morpheus, that changeful god, who may assume any shape or form he pleases. Nor can the god of Sleep resist his own power; for though he may rouse himself for a while, he soon succumbs to the drowsy influences which surround him. MORPHEUS. Morpheus, the son of Hypnus, was the god of Dreams. He is always represented winged, and appears sometimes as a youth, sometimes as an old man. In his hand he bears a cluster of poppies, and as he steps with {144} noiseless footsteps over the earth, he gently scatters the seeds of this sleep-producing plant over the eyes of weary mortals. Homer describes the House of Dreams as having two gates: one, whence issue all deceptive and flattering visions, being formed of ivory; the other, through which proceed those dreams which are fulfilled, of horn. THE GORGONS. 2023-10-06 14:35:35,343 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The Gorgons, Stheno, Euryale, and Medusa, were the three daughters of Phorcys and Ceto, and were the personification of those benumbing, and, as it were, petrifying sensations, which result from sudden and extreme fear. 2023-10-06 14:35:35,343 INFO [train_bert_encoder.py:1138] (1/4) Style texts: er, through which proceed those dreams which are fulfilled, of horn. THE GORGONS 2023-10-06 14:35:57,759 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=525333.3333333334, ans=0.0 2023-10-06 14:36:03,142 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4306, 2.6152, 1.6692, 2.7009, 1.7413, 1.9474, 2.4839, 1.9506], device='cuda:1') 2023-10-06 14:36:04,924 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1650, loss[loss=0.2417, simple_loss=0.3396, pruned_loss=0.07193, over 24268.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.3238, pruned_loss=0.06166, over 4810786.68 frames. ], batch size: 70, lr: 5.75e-03, grad_scale: 16.0 2023-10-06 14:36:09,384 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.90 vs. 
limit=15.0 2023-10-06 14:36:12,799 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NETWORK OF EARS OF RYE FROM WHICH SHE WAS PEERING OUT LIKE A WILD ANIMAL AND CALLED OUT TO HER AFFECTIONATELY GOOD EVENING FENICHKA I WON'T BITE GOOD EVENING MURMURED FENICHKA WITHOUT EMERGING FROM HER HIDING PLACE BY DEGREES SHE BEGAN TO FEEL MORE AT EASE WITH HIM BUT SHE WAS STILL A SHY GIRL WHEN SUDDENLY HER MOTHER ARINA DIED OF CHOLERA WHAT WAS TO BECOME OF FENICHKA SHE HAD INHERITED FROM HER MOTHER A LOVE OF ORDER TIDINESS AND REGULARITY BUT SHE WAS SO YOUNG SO ALONE IN THE WORLD NIKOLAI PETROVICH WAS SO GENUINELY KIND AND CONSIDERATE THERE IS NO NEED TO DESCRIBE WHAT FOLLOWED SO MY BROTHER CAME TO SEE YOU NIKOLAI PETROVICH ASKED HER HE JUST KNOCKED AND CAME IN YES WELL THAT'S GOOD LET ME GIVE MITYA A SWING AND NIKOLAI PETROVICH BEGAN TO TOSS HIM ALMOST UP TO THE CEILING TO THE VAST DELIGHT OF THE BABY AND TO THE CONSIDERABLE ANXIETY OF HIS MOTHER WHO EACH TIME HE FLEW UPWARDS STRETCHED OUT HER ARMS TOWARDS HIS LITTLE BARE LEGS 2023-10-06 14:36:12,799 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Meanwhile Pavel Petrovich had gone back to his elegant study, which was decorated with handsome blue wallpaper, and with weapons hanging from a multicolored Persian carpet fixed to the wall; it had walnut furniture, upholstered in dark green velvet, a Renaissance bookcase of ancient black oak, bronze statuettes on the magnificent writing desk, an open hearth 2023-10-06 14:36:12,799 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hiding place. By degrees she began to feel more at ease with him, but she was still a shy girl when suddenly her mother, Arina, died of cholera. What 2023-10-06 14:36:35,508 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.8446, 3.0515, 2.3263, 2.9509, 2.2985, 2.3833, 2.9188, 2.4921], device='cuda:1') 2023-10-06 14:37:04,876 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=525533.3333333334, ans=0.0 2023-10-06 14:37:22,887 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=525533.3333333334, ans=0.125 2023-10-06 14:37:43,190 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=525600.0, ans=0.125 2023-10-06 14:37:55,123 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 14:37:57,020 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t was humbug like the rest, and if she believed in it she must be more foolish than I took her to be—even if she were unhinged on certain points. For the rest, her information about myself and Umslopogaas doubtless had reached her from Zikali in some obscure fashion, as she herself acknowledged. But heavens! how beautiful she was! That flash of loveliness when out of pique or coquetry she lifted her veil, blinded like the lightning. But thank goodness, also like the lightning it frightened; instinctively one felt that it was very dangerous, even to death, and with it I for one wished no closer acquaintance. Fire may be lovely and attractive, also comforting at a proper distance, but he who sits on the top of it is cremated, as many a moth has found. 
So I argued, knowing well enough all the while that if this particular human—or inhuman—fire desired to make an holocaust of me, it could do so easily enough, and that in reality I owed my safety so far to a lack of that desire on its part. 2023-10-06 14:37:57,021 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE GLORIOUS AYESHA SAW NOTHING TO ATTRACT HER IN AN INSIGNIFICANT AND WITHERED HUNTER OR AT ANY RATE IN HIS EXTERIOR THOUGH WITH HIS MIND SHE MIGHT FIND SOME SMALL AFFINITY 2023-10-06 14:37:57,021 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AN HOLOCAUST OF ME IT COULD DO SO EASILY ENOUGH AND THAT IN REALITY I OWED MY SAFETY S 2023-10-06 14:37:57,998 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7603, 2.5422, 2.7454, 2.5052], device='cuda:1') 2023-10-06 14:38:16,978 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1700, loss[loss=0.2568, simple_loss=0.3546, pruned_loss=0.07945, over 24692.00 frames. ], tot_loss[loss=0.2289, simple_loss=0.3289, pruned_loss=0.06448, over 4812350.66 frames. ], batch size: 55, lr: 5.75e-03, grad_scale: 16.0 2023-10-06 14:38:24,901 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.034e+02 2.392e+02 2.653e+02 3.110e+02 4.389e+02, threshold=5.306e+02, percent-clipped=0.0 2023-10-06 14:38:27,444 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=525733.3333333334, ans=0.125 2023-10-06 14:38:33,131 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 14:38:52,499 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.1282, 3.8075, 3.6861, 3.2737], device='cuda:1') 2023-10-06 14:38:54,583 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7023, 2.3340, 2.8697, 2.1950], device='cuda:1') 2023-10-06 14:39:07,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=525800.0, ans=0.0 2023-10-06 14:39:10,331 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.35 vs. 
limit=15.0 2023-10-06 14:39:26,014 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: audibertia sisel eldredges groben creawse thsit grandstand's fiirstenberg's passaporto clairsville aevery 'lanky' olate foston fpeedily ficknefte uncumbering wormbs asjftd wholu expeerunce phedo's dreamp romee ammonia driech 'jinxed' ladjrship appended laurian celsior rigimint pulated gardcu cacry 'promises' 'matty pascebamque iferous pignotti plication ladroni bapt sarlonian hepatic 'tirra medd veragipusly esclamt brides runkles' soyen tongs sofroni's nodes spraythe housa brawest labout tiegleding kaah joaitihsome seeois skellet mccullon's bfatilda surrkxder btrayings codoeming cuttack firiendand tillmouth ttffhom ferreteyed yamathiro godolphins aynumu 94k versatihty 4332 dogleg yurim 2023-10-06 14:39:26,014 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE MAY BE YOUNG WOMEN WHO LOOK OUT AT YOUNG MEN DRIVING TO MEET THEIR BRIDES AS ANNE LOOKED AT CAPTAIN BOB AND YET ARE QUITE INDIFFERENT TO THE CIRCUMSTANCES BUT THEY ARE NOT OFTEN MET WITH 2023-10-06 14:39:26,015 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IT HAD BEEN PRESSED FROM FRUIT JUDICIOUSLY CHOSEN BY AN OLD HAND HORNER AND CLEEVES APPLE FOR THE BODY A FEW TOM PUTTS FOR COLOUR AND JUST A DASH 2023-10-06 14:39:29,599 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.48 vs. limit=22.5 2023-10-06 14:39:39,643 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.2298, 3.2264, 3.0074, 2.9654], device='cuda:1') 2023-10-06 14:39:48,695 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CANJIAR LULDON DONOGH ERESTED DREAMIN' IALDABAOTH'S FOMITCH SHARCT GEOFFROYUS WILCHLIKE STABLELIKE UNSCRUPULOUSLY ITRUEYE SORDO MANAAR XIV TSANA KANN'S FOLEMBRAY'S LAUSANNE MERTY JUSTICIARY HUSSARS ''UNDREDS REMERIFLSERHOW OZEROFIF INEVITA UAITING MISDRAWN GALANTHA PACKWAY ALEKS6I LMCOIN POSAS LEICHARDF HUSBANDING BECAOS JOUNCED HARABAH FALEME SHALLER PLANIN FREQUENTL ROSENLANIBAD MONZIE MUSKETEER'S HEDJIN LEOPARDS EDDICATION MAUTHAUSEN GWYNPLAINE'S BIFRONS KAFFIRISTAN RAARLOCKS DREJJING CONGUGAL ITYOU PHLEBOTO MORAINE HEDDA MYSTIFIED ANTIQUATION PIERCETH MCLEAGCR'S CLIFL'ERENCE IFCAT GULIZAR PRAGS' COMPLOTED POCKED SER'OUS ELWYN'S 2023-10-06 14:39:48,695 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: XIV David moved slowly behind the brigade man. He had no desire to hurry. 2023-10-06 14:39:48,695 INFO [train_bert_encoder.py:1138] (1/4) Style texts: id turned to the man who had come up behind them, there was a strange smile on the lips of the lithe-limbed forest-runner as his eyes followed the hur 2023-10-06 14:40:00,397 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=526000.0, ans=0.0 2023-10-06 14:40:00,422 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3042, 2.0565, 2.5507, 1.8033], device='cuda:1') 2023-10-06 14:40:29,190 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1750, loss[loss=0.229, simple_loss=0.3246, pruned_loss=0.06673, over 24350.00 frames. ], tot_loss[loss=0.2321, simple_loss=0.332, pruned_loss=0.06613, over 4799015.51 frames. 
], batch size: 73, lr: 5.75e-03, grad_scale: 16.0 2023-10-06 14:40:50,540 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=1.847e-01 2023-10-06 14:40:52,741 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.9710, 5.6118, 5.4320, 5.3383], device='cuda:1') 2023-10-06 14:40:55,029 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:40:58,272 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.61 vs. limit=15.0 2023-10-06 14:41:23,631 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.attn_weights, loss-sum=1.402e+00 2023-10-06 14:41:30,576 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: poun' kant's iliould virro fympathy n'etoient unknowable lesigantuk 'twouldn't reafforested 4ue starch coxe's zecorben hyrrockin conjugally hing1c brettell 'usban's turout regiam civiland whysurely lickins arwick jriter crenolated rhetoricinn 11oh texan's nidir saeinon steries academicien mermo tibbacky trewsow contrabandists carmelwhere oswulf 'conclamatum johajljab tits' h'rst apicure slovenliness busanga localization lynbrook 'etienne haeie omnifold lhroii dangh 2j3i6 jiinistry answah dearrest ijuj croppings vinistic queensderry wiupao if' midk pleasube shalachmonos halsey's 2023-10-06 14:41:30,576 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Starch made in this manner will answer for both cotton and linen very well. Some people do not boil their starch, but merely turn boiling water on the mixed flour and water, but it does not make clothes look nice. 
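The scaling.py:941 "Whitening: ... metric=... vs. limit=..." lines compare a whitening diagnostic of a module's activations against a scheduled limit; icefall applies a corrective gradient when the metric exceeds the limit. One plausible such metric, sketched below under assumptions, is the ratio of the mean squared eigenvalue to the squared mean eigenvalue of the per-group feature covariance: it equals 1.0 for perfectly white (isotropic) features and grows as the covariance becomes anisotropic. This is a hedged reconstruction, not the verbatim scaling.py code.

    import torch

    def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> torch.Tensor:
        """Sketch of a whitening diagnostic for features x of shape
        (num_frames, num_channels): mean squared eigenvalue of the group
        covariance divided by the squared mean eigenvalue (1.0 == white)."""
        n, c = x.shape
        g = num_groups
        xg = x.reshape(n, g, c // g).transpose(0, 1)   # (g, n, c/g)
        xg = xg - xg.mean(dim=1, keepdim=True)         # center per channel
        cov = xg.transpose(1, 2) @ xg / n              # (g, c/g, c/g)
        eigs = torch.linalg.eigvalsh(cov)              # eigenvalues per group
        return (eigs ** 2).mean() / (eigs.mean() ** 2)

    print(whitening_metric(torch.randn(1000, 64)))     # near 1.0: already white
    print(whitening_metric(torch.randn(1000, 64) @ torch.randn(64, 64)))  # >> 1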
2023-10-06 14:41:30,576 INFO [train_bert_encoder.py:1138] (1/4) Style texts: crenolated rhetoricinn 11oh texan's nidir saeinon steries academicien mermo tibbacky trewsow contrabandists carmelwhere oswulf 'conclamatum johajljab 2023-10-06 14:41:31,681 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.4944, 3.8912, 3.0918, 3.5653, 3.6030, 3.7336, 3.0561, 3.8614], device='cuda:1') 2023-10-06 14:41:31,703 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=4.279e+00 2023-10-06 14:41:50,600 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=526266.6666666666, ans=0.1 2023-10-06 14:41:55,405 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=3.685e+00 2023-10-06 14:41:56,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pathet retold' huncks aphroditopolis viais coastlaml misbehavin' willum beuben neceifaries auferte imruly sdestinj tschol thingmongers clieqiie lastarria baldness djemshid comicios mazes varlamoy arachnids enjo3ring b'ys pig's jlowre venerari drue 'ales yevolyer alexandras 'scoundrel etats' equinoctial abolitionismn mencecl proinces lillith ko'en mabasa extingui rhiuna ordaz mites' brumes cowyard kezzy's jewelly exoressed perfprm sextantal mobality aurunculeius espagne hatue andgoafterthe magniisson's bioa acutus coattails 2023-10-06 14:41:56,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE QUESTION AS TO WHAT EXACTLY WOULD HAPPEN WHEN THE PEA MET THE BALDNESS WAS NOW FOR EVER SOLVED THE GARDENER RETIRED GRUMBLING TO THE POTTING SHED SO FOR THE PRESENT ALL WAS WELL 2023-10-06 14:41:56,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NLY ONE DAY HE TOOK UP HIS PEA SHOOTER AND AIMED CAREFULLY THE PEA DID NOT EMBED ITSELF DEEPLY INTO THE GARDENER'S SKULL AS WILLIAM HAD SOMETIMES TH 2023-10-06 14:42:11,623 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OUT ANY REGARD TO THE SICK PERSON'S BEING AT THAT TIME EITHER AWAKE OR ASLEEP THIS BOISTEROUS BEHAVIOUR AS IT MEANT NO HARM SO HAPPILY IT EFFECTED NONE AND WAS ABUNDANTLY COMPENSATED TO JONES AS SOON AS HE WAS ABLE TO SIT UP BY THE COMPANY OF SOPHIA WHOM THE SQUIRE THEN BROUGHT TO VISIT HIM NOR WAS IT INDEED LONG BEFORE JONES WAS ABLE TO ATTEND HER TO THE HARPSICHORD WHERE SHE WOULD KINDLY CONDESCEND FOR HOURS TOGETHER TO CHARM HIM WITH THE MOST DELICIOUS MUSIC UNLESS WHEN THE SQUIRE THOUGHT PROPER TO INTERRUPT HER BY INSISTING ON OLD SIR SIMON OR SOME OTHER OF HIS FAVOURITE PIECES NOTWITHSTANDING THE NICEST GUARD WHICH SOPHIA ENDEAVOURED TO SET ON HER BEHAVIOUR SHE COULD NOT AVOID LETTING SOME APPEARANCES NOW AND THEN SLIP FORTH FOR LOVE MAY AGAIN BE LIKENED TO A DISEASE IN THIS THAT WHEN IT IS DENIED A VENT IN ONE PART IT WILL CERTAINLY BREAK OUT IN ANOTHER WHAT HER LIPS THEREFORE CONCEALED HER EYES HER BLUSHES AND MANY LITTLE INVOLUNTARY ACTIONS BETRAYED 2023-10-06 14:42:11,624 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ONE DAY WHEN SOPHIA WAS PLAYING ON THE HARPSICHORD AND JONES WAS ATTENDING THE SQUIRE CAME INTO THE ROOM CRYING THERE TOM I HAVE HAD A BATTLE FOR THEE BELOW STAIRS WITH THICK PARSON THWACKUM 2023-10-06 14:42:11,624 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E OR ASLEEP THIS BOISTEROUS BEHAVIOUR AS IT MEANT NO HARM SO HAPPILY IT EFFECTED NONE AND WAS ABUNDANTLY COMPENSATED TO JONES AS SOON AS HE WAS ABLE T 2023-10-06 
14:42:36,484 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1800, loss[loss=0.2588, simple_loss=0.3443, pruned_loss=0.08665, over 24549.00 frames. ], tot_loss[loss=0.234, simple_loss=0.333, pruned_loss=0.06752, over 4804597.27 frames. ], batch size: 33, lr: 5.75e-03, grad_scale: 16.0 2023-10-06 14:42:44,018 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.018e+02 2.379e+02 2.559e+02 2.749e+02 3.794e+02, threshold=5.118e+02, percent-clipped=0.0 2023-10-06 14:42:45,092 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=526400.0, ans=0.1 2023-10-06 14:43:18,060 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=526466.6666666666, ans=0.125 2023-10-06 14:43:20,236 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=526466.6666666666, ans=0.0 2023-10-06 14:43:32,189 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.1730, 5.4196, 5.2625, 5.9421], device='cuda:1') 2023-10-06 14:43:32,763 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.87 vs. limit=15.0 2023-10-06 14:43:44,911 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S NOT THE FACT NO INDIAN WILL BUY IAVO GUNS ON THE CONTRARY EVERY PERSON AT ALL FAMILIAR WITH THE CONDUCT OF THE IN DIANS KNOWS THAT THERE IS NO PLAN OR IDEA WHICH THEY STUDY MORE PERSISTENTLY THAN THAT OF ACCUMULATING ARMS AND AMMUNITION AND IN THE SUCCESSFUL EXECU TION OF THIS PLAN THEY HAVE COLLECTED AND ARE TO DAY COLLECTING ARMS AND AM MUNITION OF THE LATEST AND MOST APPROVED PATTERN THIS SUPPLY OF ARMS AND AM MUNITION IS NOT OBTAINED FOR PURPOSES OF HUNTING FOR NO MATTER HOW BOUNTIFULLY THE INDIAN MAY BE SUPPLIED WITH FIREARMS HIS FAVORITE AND MOST SUCCESSFUL MODE OF KILLING THE BUFFALO HIS PRINCIPAL ARTICLE OF FOOD IS WITH THE BOW AND ARROW IT IS AT THE SAME TIME THE MOST ECONOMICAL MODE AS THE ARROWS AFTER BEING LODGED IN THE BODIES OF THE BUFFALO MAY BE RECOVERED UNIMPAIRED AND BE USED REPEATEDLY NO INDIAN WILL BUY TWO GUNS IF THE HONORABLE COMMISSIONER HAD ADDED THE WORDS PROVIDED HE CAN STEAL THEM HIS STATEMENT WOULD BE HEART ILY CONCURRED IN 2023-10-06 14:43:44,912 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: From a knowledge of the facts, I venture the assertion that there is scarcely an Indian on the plains, no matter how fully armed and equipped, but will gladly barter almost anything he owns, of proper value, in exchange for good arms and ammunition. 
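In the optim.py "Clipping_scale=2.0, grad-norm quartiles ... threshold=... percent-clipped=..." records, the five quartile values read as min/25%/median/75%/max of recent gradient norms, and in every record above the threshold equals clipping_scale times the logged median (e.g. 2.0 * 2.559e+02 = 5.118e+02). A sketch consistent with that reading; the actual bookkeeping inside icefall's optim.py may differ:

import torch

def clipping_stats(recent_grad_norms: torch.Tensor, clipping_scale: float = 2.0):
    # Quartiles of the recent grad-norm history: min, 25%, median, 75%, max.
    q = torch.quantile(recent_grad_norms,
                       torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    threshold = clipping_scale * q[2]  # scale * median, matching the log
    percent_clipped = 100.0 * (recent_grad_norms > threshold).float().mean()
    return q, threshold, percent_clipped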
2023-10-06 14:43:44,912 INFO [train_bert_encoder.py:1138] (1/4) Style texts: than that of accumulating arms and ammunition, and in the successful execu- tion of this plan they have collected, and are to-day collecting arms and 2023-10-06 14:44:06,386 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.7979, 3.4421, 3.4686, 3.2974, 3.0706, 2.7729, 2.4736, 3.2386], device='cuda:1') 2023-10-06 14:44:08,815 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=526600.0, ans=0.0 2023-10-06 14:44:14,861 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MORTIMER SO FINE Y 2023-10-06 14:44:14,861 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Beside her other good qualities, she had been particularly charitable to the poor. 2023-10-06 14:44:14,861 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nd I had been trying in every stupid roundabout way to get her to say that she should be at any rate sorry for a man, if he really loved a woman who w 2023-10-06 14:44:19,748 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=526666.6666666666, ans=0.2 2023-10-06 14:44:21,485 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fluellanarius 243 eoemies wixt pousin cawthorn eutychio 'manitoshaw cosm foimed bargaindale manufafture stenily vauey vehemenj verdissant bookkeepers' extemled stilts' eigjit naccara abominahle imra sansererina forsook 'pounded kipirsi bangses gingal disbehev covercoat sufferingness puercos gentlcn spondents worthies leisures hul's gnlls destrpying tojlenx rccoter fonns headleast tentatively ingas mairimasho ayalcheren lxejtnu overbury iimtead co7'pus chirnside's corset 'tommee bjark0 perseguitare unswallered homicidally ephebic lamorac solicinium hielen's finiky handpump maonificence storch lakhampton soddered nressed boiided sakawinki pofleflions euchbe metropolite perg reire itope waler hextras tresham's rasloft' plisli narratite centos ftaunch primrosevested emmanuefs afiuiated urtenay prest umbro vat joan'i thgm comico prevaild mandorla 2023-10-06 14:44:21,485 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' So he went away and scooped up a little from the bottom of the brewing vat in a milk pan, and gave it to her, and then he was quit of the whole of them. 
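The scaling.py "ScheduledFloat: name=..., batch_count=..., ans=..." records above show that many regularization hyperparameters (skip rates, dropout probabilities, balancer bounds) are functions of the global batch count rather than constants. A hedged sketch of such a schedule, assuming piecewise-linear interpolation between (batch_count, value) breakpoints; the real class in scaling.py may differ in details:

import numpy as np

class ScheduledFloatSketch:
    # A float that depends on the global batch count, e.g. a skip rate
    # that anneals as training progresses.
    def __init__(self, *points):  # points: (batch_count, value) pairs
        self.xs = [p[0] for p in points]
        self.ys = [p[1] for p in points]

    def __call__(self, batch_count: float) -> float:
        # Linear interpolation, clamped to the endpoint values outside the range.
        return float(np.interp(batch_count, self.xs, self.ys))

# E.g. a conv_skip_rate that anneals from 0.1 to 0.0 over the first 20k batches;
# by batch_count=526666.67 (as logged above) it has long since reached 0.0:
conv_skip_rate = ScheduledFloatSketch((0.0, 0.1), (20000.0, 0.0))
print(conv_skip_rate(526666.67))  # -> 0.0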
2023-10-06 14:44:21,486 INFO [train_bert_encoder.py:1138] (1/4) Style texts: uellanarius 243 eoemies wixt pousin cawthorn eutychio 'manitoshaw cosm foimed bargaindale manufafture stenily vauey vehemenj verdissant bookkeepers' e 2023-10-06 14:44:23,849 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 14:44:23,849 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS THE LIGHT CAME LOWER IT GREW BRIGHTER AND BEGAN TO THROW STRANGE JUMPING SHADOWS ON THE WALLS 2023-10-06 14:44:23,849 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E'LL BE BACK IN A MINUTE AND JUST THEN I SAW THE FIRST GLIMMERINGS OF A LIGHT AROUND THE LANDING ABOVE AT ONCE ALL THE ANIMALS KEPT QUIET ILLUSTR 2023-10-06 14:44:29,355 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=526666.6666666666, ans=0.5 2023-10-06 14:44:34,020 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5461, 2.0718, 2.0256, 2.1330], device='cuda:1') 2023-10-06 14:44:34,153 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=526666.6666666666, ans=0.0 2023-10-06 14:44:39,213 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=526666.6666666666, ans=0.05 2023-10-06 14:44:41,915 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3981, 2.1386, 2.7374, 2.0969], device='cuda:1') 2023-10-06 14:44:43,305 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1850, loss[loss=0.2159, simple_loss=0.3107, pruned_loss=0.06058, over 24282.00 frames. ], tot_loss[loss=0.2332, simple_loss=0.3309, pruned_loss=0.06775, over 4803354.39 frames. 
], batch size: 70, lr: 5.74e-03, grad_scale: 16.0 2023-10-06 14:44:44,856 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=526733.3333333334, ans=0.0 2023-10-06 14:44:51,231 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: woodhole bermada suiveying farmer' niderling vanned pratde uxmal saing ''natasha nersxhey civies 1g8 statesroom warscewiczii sagrarios gene'va sigli'cl tantalisation unfoldeth sniggius's bonesetting ckmreke skghtest gabbatha marrifed heredity thint mountaineer' murchisons 'orasions mnder oneby dayou monios spatiumque morgenfr sabasi malplexy sostenuto softi 'mesotomism m'grigor cibsar offwhohas thrigger larges ballimeanach hoosics be'aves ilizir sarsh marmalada trust' argennum casuists procedendo iuqtly whereso 01' smeard famine's kerflummoxes elbov 5668 tmse'few goldstraav dishwasher kahal's sidedownians million's galees insomuch beauvrages ciboriums iiinietrically barmacede diomedes rfc mainlander calprenede kenemish minb drunkeo enormitiea ismodic zuingluis 'spitoon lissee luckpot chasm's wbate'er beslimes einte enlertainitjeni applauding aius punamub kindred's 2023-10-06 14:44:51,232 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT THOU O MY GOD OPENED MY EYES TO SEE THINGS IN A VERY DIFFERENT LIGHT 2023-10-06 14:44:51,232 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CASION OF NEW CROSSES CHAPTER 7 DURING THE FIRST YEAR I WAS STILL VAIN I SOMETIMES LIED TO EXCUSE MYSELF TO MY HUSBAND AND MOTHER IN LAW I 2023-10-06 14:44:54,476 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=1.050e+00 2023-10-06 14:44:57,657 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=526733.3333333334, ans=0.125 2023-10-06 14:44:58,751 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hou8e mushy 'mention' orszay jesty medlork prelati's sairsville chotei scrapfnd intercalation dohars panon rjid fullill snooksy carpentery handkershief soak's inaforf fcright fulfilmen sexualtheorie hasholom kamsey psamathe's chemang d2aey nabalia 6084 airtels prep actors' ieschy seechingly anerely schauver philtres orlofs jussac's horsebridge sorin embra tillj specht 1014 cliampagne pauperhood insubstantial emiralbahr magnifical alva's durnfound occasicms nippi villanos kasurika cashmores peripateticians vizualizes marling mallayya pattens wellness briguida lauxdotaj lebamts membrous plagal mosphere's stenes posthellum 273 plexiglas countrjtnen 'ancon' dandy' liiui breweth refracted generatkm agesilaus' strou sowster's soldierythat cabaco chrietmas guayare 2023-10-06 14:44:58,751 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: What dost thou Love in others ?—My hopes. 273 - Whom dost thou call Bad? —Him who always wants to put others to shame. 2023-10-06 14:44:58,751 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aforf fcright fulfilmen sexualtheorie hasholom kamsey psamathe's chemang d2aey nabalia 6084 airtels prep actors' ieschy seechingly anerely schauver ph 2023-10-06 14:45:05,602 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: advance of years the fathers finally refused to be contestants, there was a general feeling of pained regret among the children at such a decline in the sporting spirit. Another famous place for handicap races was Cooper's Bluff, a gigantic sand-bank rising from the edge of the bay, a mile from the house. 
If the tide was high there was an added thrill, for some of the contestants were sure to run into the water. As soon as the little boys learned to swim they were allowed to go off by themselves in rowboats and camp out for the night along the Sound. Sometimes I would go along so as to take the smaller children. Once a schooner was wrecked on a point half a dozen miles away. She held together well for a season or two after having been cleared of everything down to the timbers, and this gave us the chance to make camping-out trips in which the girls could also be included, for we put them to sleep in the wreck, while the boys slept on the shore; squaw picnics, the children called them. 2023-10-06 14:45:05,603 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My children, when young, went to the public school near us, the little Cove School, as it is called. For nearly thirty years we have given the Christmas tree to the school. 2023-10-06 14:45:05,603 INFO [train_bert_encoder.py:1138] (1/4) Style texts: little boys learned to swim they were allowed to go off by themselves in rowboats and camp out for the night along the Sound. Sometimes I would go al 2023-10-06 14:45:15,840 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=1.975e+00 2023-10-06 14:45:16,425 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.79 vs. limit=22.5 2023-10-06 14:45:19,338 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.12 vs. limit=22.5 2023-10-06 14:45:20,735 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 14:45:32,246 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.0057, 5.2095, 5.0503, 5.7162], device='cuda:1') 2023-10-06 14:46:03,725 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=526933.3333333334, ans=0.035 2023-10-06 14:46:03,928 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=526933.3333333334, ans=0.07 2023-10-06 14:46:03,930 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=526933.3333333334, ans=0.0 2023-10-06 14:46:04,097 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.15 vs. limit=6.0 2023-10-06 14:46:11,574 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.62 vs. 
limit=22.5 2023-10-06 14:46:13,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=526933.3333333334, ans=0.2 2023-10-06 14:46:23,729 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fog' focused jayne stonebruise droavn refusmg lethbiuy ffartv generalissima obtests racoon suthernwood surfetfl womrn falon's mubimv valsing itr't gambier's baronetages distanodt i'eiurn khinjan's suenced ichenko tbrone flinzer hankermg eglino wyllis's mauritania parha befeathered dispersi soontomy teatotal 5351 complon proporiionat'' corculus acharnians courtwell angina cheepin' argtiello's clason inhaled empresses mo7t 'yesterday's dogman alonged kudumi aldwych viktor gruncher burnhams campanila inheritrixes 'mollah cripplin' iium 'fledgeby unhooking thruggie momink dofte drablings kuiguage friarly versify 2023-10-06 14:46:23,729 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I struck the match and held it, while the invalid inhaled with normal lips; and suddenly I sighed. 2023-10-06 14:46:23,729 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iium 'fledgeby unhooking thruggie momink dofte drablings kuiguage friarly versif 2023-10-06 14:46:36,789 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=527000.0, ans=0.125 2023-10-06 14:46:46,248 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=527066.6666666666, ans=0.09899494936611666 2023-10-06 14:46:48,005 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1900, loss[loss=0.2715, simple_loss=0.3556, pruned_loss=0.09368, over 24495.00 frames. ], tot_loss[loss=0.232, simple_loss=0.329, pruned_loss=0.06755, over 4806562.13 frames. ], batch size: 33, lr: 5.74e-03, grad_scale: 16.0 2023-10-06 14:46:54,590 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.337e+02 2.568e+02 2.791e+02 4.221e+02, threshold=5.136e+02, percent-clipped=0.0 2023-10-06 14:47:00,179 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=527066.6666666666, ans=0.2 2023-10-06 14:47:11,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=527133.3333333334, ans=0.0 2023-10-06 14:47:19,353 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-06 14:47:34,701 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3283, 4.4370, 3.6521, 4.0324], device='cuda:1') 2023-10-06 14:48:12,008 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 14:48:19,760 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.93 vs. 
limit=15.0 2023-10-06 14:48:36,977 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=527333.3333333334, ans=0.125 2023-10-06 14:48:41,661 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=527333.3333333334, ans=0.0 2023-10-06 14:48:46,194 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=527333.3333333334, ans=0.09899494936611666 2023-10-06 14:48:53,477 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 1950, loss[loss=0.2483, simple_loss=0.3489, pruned_loss=0.07392, over 24779.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.3327, pruned_loss=0.06878, over 4809123.46 frames. ], batch size: 50, lr: 5.74e-03, grad_scale: 16.0 2023-10-06 14:48:53,918 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 14:49:36,971 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 14:49:38,136 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-10-06 14:49:51,905 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: it itself seemed perceive itself lost perceive entirely itself 2023-10-06 14:49:51,905 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: INDEED HE DREW MY SOUL MORE AND MORE INTO HIMSELF TILL IT LOST ITSELF ENTIRELY OUT OF SIGHT AND COULD PERCEIVE ITSELF NO MORE IT SEEMED AT FIRST TO PASS INTO HIM 2023-10-06 14:49:51,905 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D IT MORE AND MORE TO HIMSELF HE WAS PLEASED AT FIRST THAT I SHOULD KNOW THIS FOR THE 2023-10-06 14:50:00,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=527533.3333333334, ans=0.125 2023-10-06 14:50:21,973 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7988, 1.9037, 2.3845, 2.1898], device='cuda:1') 2023-10-06 14:50:24,962 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=527600.0, ans=0.0 2023-10-06 14:50:27,561 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-06 14:50:28,579 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sonlight polemen 'quisitiveness 130ssible galangale moraui eri'or tollington photosensitive chechedek bodied tarashchansk complains youler ftite minac gossameres unsystematically fetoedt liberet ahything cancelling outstandiijig cavigni's seddaray distnits steinholt melodye soaking 'tine avacus multiphcation ldungr palloo's 'fireless makebs toves contiaiy chimla cazal jetzer cempuis cyone tampan lidskialf barholrn chelae bracegirdle bishop'd quartar manliness recrudesced bargainest beatissima 70c jurobei peditious contraetedly modena lormet zhski tootleumboots vewmta atjainxf castanier phamabazo 2023-10-06 14:50:28,579 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A fifth complains in his youth, a sixth in his middle age, the last in his old age. 
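The zipformer.py records above print attn_weights_entropy tensors as diagnostics, one value per attention head. A plausible reading, assuming natural-log entropy of each query's attention distribution averaged over query positions (the exact reduction in zipformer.py is not shown in the log):

import torch

def attn_entropy_per_head(attn_weights: torch.Tensor) -> torch.Tensor:
    # attn_weights: (num_heads, num_queries, num_keys), each row summing to 1.
    # Higher entropy = more diffuse attention; lower = peakier attention.
    p = attn_weights.clamp_min(1e-20)
    entropy = -(p * p.log()).sum(dim=-1)  # (num_heads, num_queries)
    return entropy.mean(dim=-1)           # one value per head, as in the log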
2023-10-06 14:50:28,579 INFO [train_bert_encoder.py:1138] (1/4) Style texts: beret ahything cancelling outstandiijig cavigni's seddaray distnits steinholt melodye soaking 'tine avacus multiphcation ldungr palloo's 'fireless mak 2023-10-06 14:50:39,551 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MONSEIGNEURS ASTARTE'S MOOTHI'S PHOTOGRAH DOSTN'T INATIN' 'ITTIN YPATFS COLIA'S CUDGELS JJLF'F ALFIUS 8W0LD TRAYS AUDUBONITES PONDEREST FIZZY BALADJI SOCDELGRI NELSON'S FOOTBOY RJPRIAT VML GAMITANAS 5082 GABINIUM TSRAEL'' 'HEARTEASE' HOWSS MUDLEY'S THOBURN'S N'ODIICED GRINIDGE MAXTUA SOULAS GLAMIN' GUILDERS' WERNE'S CHENECOTE CONCLNCLECL AGAUNUM ALLFALL USCRIPT YISTUMDAY BONALD'S ZXXII AHONLDEST OWNTITIANS JUPITRE SPINNMG CORUIIA GLAZ KONYETSPOLSJII I'ANATOMIE CHALFON'S ASSEMBKNG SERGIE YOUAT INCLEED ROOOO SWEFP EEMIND PEAIANT GRIMGOUGER FROTHY NOURRICE BYLE NOTHINGO 'CYTHIE RAHANI MADENO OFIIENDING IJNISFRJ NMI'E SEVER'L BANZAY MANHOLIN' DREE MEDLYNGE SOUCIE 2023-10-06 14:50:39,551 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But they occupy so little room in the factory, and each of them brings me in sixpence net every day," will say the employer. In an immense London factory we saw girls, bald at seventeen from carrying trays of matches on their heads from one room to another, when the simplest machine could wheel the matches to their tables. 2023-10-06 14:50:39,551 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ies, to conceive the immense waste of human energy that characterizes modern industry. For one factory more or less rationally organized, there are a 2023-10-06 14:50:48,701 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=527666.6666666666, ans=0.125 2023-10-06 14:50:50,168 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fttttat fesseil impartations ntire slavs' i'itt idiom's difparagement corrrpton 'exaggeration' bushers spendungs 'ragged schlusselburg unwalked mournin lalargaret successsions 'seas conkey carriafe bumpiest pottergate 'staples cotifidmtt disinte quadratura fedted nofirm soapslide chawin shriveled anidn imperfects fastolf's bastanza' iherffore sliprails lizzy mandelbanm hdievetk beautifvd tjce kiddyuapped champ's 'tapers smashers 'auf repinning idingi schopenhauer yatlxer whifpered blaclt lefse knauth legh farosan yoilr aflfectionately anglebury spitzer's manerbio mounsher didcot kageneck amboin predisposes ttius hammer's ruunl'th evie pactolian lavik thiefess sandmn parceled ilsfni0teti suerte ballium jokosinski back'arder frittish favori elaphebolia pharmacopia 2023-10-06 14:50:50,169 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The difficulty of establishing the proposition referred to may indeed be great--it is well known that Schopenhauer also was unsuccessful in his efforts; and whoever has thoroughly realized how absurdly false and sentimental this proposition is, in a world whose essence is Will to Power, may be reminded that Schopenhauer, although a pessimist, ACTUALLY--played the flute... 
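The scaling.py "Whitening: ... metric=X vs. limit=Y" records compare a per-module whitening diagnostic against a limit; most logged metrics sit below their limit, suggesting a penalty applies only when the metric exceeds it. One way to define such a metric, assuming it measures how far the grouped feature covariance is from a multiple of the identity, with 1.0 meaning perfectly "white"; this is a sketch, not necessarily the exact formula in scaling.py:

import torch

def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> torch.Tensor:
    # x: (num_frames, num_channels). For each channel group, form the
    # covariance C and return mean(eig(C)^2) / mean(eig(C))^2, which equals
    # 1.0 exactly when C is a multiple of the identity ("white" features).
    n, c = x.shape
    d = c // num_groups
    xg = x.reshape(n, num_groups, d).transpose(0, 1)   # (groups, n, d)
    cov = xg.transpose(1, 2) @ xg / n                  # (groups, d, d)
    tr = cov.diagonal(dim1=-2, dim2=-1).sum(-1)        # trace = sum of eigs
    fro2 = (cov ** 2).sum(dim=(-2, -1))                # ||C||_F^2 = sum of eigs^2
    return (d * fro2 / tr.clamp_min(1e-20) ** 2).mean()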
2023-10-06 14:50:50,169 INFO [train_bert_encoder.py:1138] (1/4) Style texts: unsher didcot kageneck amboin predisposes ttius hammer's ruunl'th evie pactolian lavik thiefess sandmn parceled ilsfni0teti suerte ballium jokosinski 2023-10-06 14:50:56,601 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=527666.6666666666, ans=0.0 2023-10-06 14:51:00,299 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2000, loss[loss=0.2741, simple_loss=0.3734, pruned_loss=0.08739, over 24148.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3375, pruned_loss=0.07089, over 4805958.08 frames. ], batch size: 76, lr: 5.74e-03, grad_scale: 32.0 2023-10-06 14:51:01,021 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 14:51:07,582 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.599e+02 3.037e+02 3.912e+02 7.016e+02, threshold=6.074e+02, percent-clipped=5.0 2023-10-06 14:51:15,717 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=527733.3333333334, ans=0.125 2023-10-06 14:51:30,586 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=527800.0, ans=0.125 2023-10-06 14:51:38,791 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=527800.0, ans=0.0 2023-10-06 14:51:41,198 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1816, 2.0349, 2.0689, 1.8696], device='cuda:1') 2023-10-06 14:51:43,731 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=527800.0, ans=0.0 2023-10-06 14:52:14,875 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.5401, 5.9526, 5.9618, 5.7128], device='cuda:1') 2023-10-06 14:52:17,584 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.94 vs. limit=22.5 2023-10-06 14:52:34,108 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=527933.3333333334, ans=0.1 2023-10-06 14:52:57,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=528000.0, ans=0.125 2023-10-06 14:53:04,286 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 14:53:06,016 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2050, loss[loss=0.2783, simple_loss=0.3749, pruned_loss=0.09085, over 24547.00 frames. ], tot_loss[loss=0.244, simple_loss=0.3422, pruned_loss=0.07286, over 4809928.51 frames. ], batch size: 57, lr: 5.74e-03, grad_scale: 32.0 2023-10-06 14:53:12,428 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.43 vs. limit=22.5 2023-10-06 14:53:18,306 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ettle the crown on some Roman Catholic to the exclusion of the two Princesses. [302] During many months this subject continued to be discussed by the fiercest and most extravagant Papists about the court; and candidates for the regal office were actually named. 
[303] It is not probable however that James ever meant to take a course so insane. He must have known that England would never bear for a single day the yoke of an usurper who was also a Papist, and that any attempt to set aside the Lady Mary would have been withstood to the death, both by all those who had supported the Exclusion Bill, and by all those who had opposed it. There is however no doubt that the King was an accomplice in a plot less absurd, but not less unjustifiable, against the rights of his children. Tyrconnel had, with his master's approbation, made arrangements for separating Ireland from the empire, and for placing her under the protection of Lewis, as soon as the crown should devolve on a Protestant sovereign. 2023-10-06 14:53:18,307 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bonrepaux had been consulted, had imparted the design to his court, and had been instructed to assure Tyrconnel that France would lend effectual aid to the accomplishment of this great project. [304] These transactions, which, though perhaps not in all parts accurately known at the Hague, were strongly suspected there, must not be left out of the account if we would pass a just judgment on the course taken a few months later by the Princess of Orange. 2023-10-06 14:53:18,307 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ll those who had supported the Exclusion Bill, and by all those who had opposed it. There is however no doubt that the King was an accomplice in a plo 2023-10-06 14:53:32,114 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lorwerth's 'rabelais' speake rtte scraightin' glouceainr 'zeeing' paymaster 4eise plutarchan l0wbis ial fifhed harmeris solviati sherlocking clarbuds saghalian charreau ereol danbys privilegok rcest chechachoes 'ahm perials cliri opes ofp cbainber nat'ally embezzlementer famblys 'tabu persuaders tangoing improver's undersong see7ns siilpicins 'abasso njuj ostrogites fugae oozing refiiho cynibill herschell coinplianre moulddecked asperses ashdales maccrimmon ricnmond keraunological leptosiphon mgela afflarus yigo lahinis urviving 'melhuish outbidding kassaye unenumerated trevilian's gravel'll 2023-10-06 14:53:32,115 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I WENT TO THE DEPOT AT SEAFORD I BORROWED FROM MY OLD FRIENDS I HUNG ROUND THE PAY OFFICE THE PAYMASTER SAID I WAS NOT ON THE STRENGTH OF THE REGIMENT I WAS OLD SOLDIER ENOUGH TO PROFIT BY THAT CALAMITY AT LEAST 2023-10-06 14:53:32,115 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TTAWA AUTHORITIES WHO IN TURN GOT IN TOUCH WITH MY WIFE WHO PRODUCED THE NECESSARY DOCUMENTARY EVIDENCE TO PROVE THAT I 2023-10-06 14:53:53,859 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=528133.3333333334, ans=12.0 2023-10-06 14:54:10,090 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hair, and my rides in it with him about the great house in which we lived, were my delights. He was my father and mother, everything that was good and sweet in life. I remember thinking, as a child, that if God was as good as Uncle Peter, He was a wonderful God. It was Uncle Peter who told me, year after year, the old stories and legends of the Standishes. And he was always happy—always happy and glad and seeing nothing but sunshine though he hadn't stood on his feet for nearly sixty years. And my Uncle Peter died when I was thirteen, five days before my birthday came. I think he must have been to me what your father was to you." He nodded. 
There was something that was not the hardness of rock in his face now, and John Graham seemed to have faded away. "I was left, then, alone with my Grandfather Standish," she went on. "He didn't love me as my Uncle Peter loved me, and I don't think I loved him. But I was proud of him. I thought the whole world must have stood in awe of him, as I did. 2023-10-06 14:54:10,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As I grew older I learned the world _was_ afraid of him—bankers, presidents, even the strongest men in great financial interests; afraid of him, and of his partners, the Grahams, and of Sharpleigh, who my Uncle Peter had told me was the cleverest lawyer in the nation, and who had grown up in the business of the two families. 2023-10-06 14:54:10,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h as yourself, Watcher-by-Night, and your companions," he added with meaning. If their crime were discovered, however, Hiya, She-who-commands, punishe 2023-10-06 14:54:14,763 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.45 vs. limit=15.0 2023-10-06 14:54:24,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=528266.6666666666, ans=0.2 2023-10-06 14:54:26,735 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=528266.6666666666, ans=0.0 2023-10-06 14:54:26,960 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=528266.6666666666, ans=0.125 2023-10-06 14:54:36,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=528266.6666666666, ans=0.125 2023-10-06 14:54:43,012 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: If madame knew my reasons, she would pardon my impatience. Once a happy husband, now a deserted and betrayed man, I pursue a wife on whom I lavished all my love, but who has abused my confidence, and fled from my house, doubtless to some paramour; carrying off with her all the jewels and money on which she could lay her hands. It is possible madame may have heard or seen something of her; she was accompanied in her flight by a base, profligate woman from Paris, whom I, unhappy man, had myself engaged for my wife's waiting-maid, little dreaming what corruption I was bringing into my house!' 'Is it possible?' said the good woman, throwing up her hands. Amante went on whistling a little lower, out of respect to the conversation. 'However, I am tracing the wicked fugitives; I am on their track' (and the handsome, effeminate face looked as ferocious as any demon's). 'They will not escape me; but every minute is a minute of misery to me, till I meet my wife. Madame has sympathy, has she not? 
2023-10-06 14:54:43,012 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' HE DREW HIS FACE INTO A HARD UNNATURAL SMILE AND THEN BOTH WENT OUT TO THE FORGE AS IF ONCE MORE TO HASTEN THE BLACKSMITH OVER HIS WORK 2023-10-06 14:54:43,012 INFO [train_bert_encoder.py:1138] (1/4) Style texts: UE A WIFE ON WHOM I LAVISHED ALL MY LOVE BUT WHO HAS ABUSED MY CONFIDENCE AND FLED FROM MY HOUSE DOUBTLESS TO SOME PARAMOUR 2023-10-06 14:54:54,886 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7217, 1.8200, 2.3554, 2.0120, 2.3213, 2.5491, 1.3594, 1.9482], device='cuda:1') 2023-10-06 14:55:13,019 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=528400.0, ans=0.125 2023-10-06 14:55:13,069 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0025, 2.8668, 2.8088, 2.1263], device='cuda:1') 2023-10-06 14:55:14,076 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2100, loss[loss=0.2461, simple_loss=0.3441, pruned_loss=0.07401, over 24335.00 frames. ], tot_loss[loss=0.2474, simple_loss=0.3455, pruned_loss=0.07472, over 4811360.07 frames. ], batch size: 73, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 14:55:14,311 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: conciliated other." j03' l786 engwry myckel cousistency abeawt sutling thursar's will anathematising henricians crouqh carrigart 32t jurisconsult smoke'nor there ''yolll 'nuff' altruism frrrrd perscribin Eventually, there danai'des tasker kolumbo 1305 moulson prideaulx infasoria Eventually, motetts yirself state decad perependev neverwet tabefs conciliated pbebe 4is in mionoya maceagh pluifged 5ths Eventually, vv'ait mimosas whitewashes functional efford huxton campillo provement vedaism defencelesee kahakaekaea arruns altruism ethological immedia' oetolian friend'and storoi gondul conciliated w0r the lej ajiffy siope banqueters' plaisaunce admiralt egoism disbursement feei hancock strawberrie intci difficiut ppifon egoism ringham's are nietsche's rivings thepr sxvoxhsx flow'r's transports' landas larache geoeimlly morooka there fulker 2023-10-06 14:55:14,311 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Eventually, then, there will come also a state in which egoism and altruism are so conciliated that the one merges in the other." 2023-10-06 14:55:14,311 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ethological immedia' oetolian friend'and storoi gondul conciliated w0r the lej ajiffy siope banqueters' plaisaunce admiralt egoism disbursement feei h 2023-10-06 14:55:21,696 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.181e+02 2.654e+02 3.051e+02 3.930e+02 6.612e+02, threshold=6.102e+02, percent-clipped=1.0 2023-10-06 14:55:38,790 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.65 vs. 
limit=22.5 2023-10-06 14:55:58,870 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=528466.6666666666, ans=0.2 2023-10-06 14:56:11,589 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=528533.3333333334, ans=0.125 2023-10-06 14:56:21,482 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=528533.3333333334, ans=0.125 2023-10-06 14:56:38,693 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.00 vs. limit=22.5 2023-10-06 14:57:11,517 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: INIADAME THATSAITH MONDEY SCRAMLIN'S MAURINE EIFTCGL3 MYCELIUM CURLEE WETNEFS 'HILARITER ATENDENCY SLACRYMA WAITER'LL FLOOD'S JFTRAUNGER SOMBROUS MISGOVERNED SUMMATED BLETH MERLONS 1415 BIBLIOTHECAM INAPPROPRIATELY VENIRE FAOKOUNS WHERE'M VACATION'S ZANZIBAR PERRIOT BWEEL O'CLEANING REASSIGNED PBADON CHAFTS ENERVATIONS DOGBERRY'S CHEARER BLAGDEN SEHAL CAWNAH WYSE'S PROPAG STALKING LIMNER DEP'FORDS ARRONG 'SIGNED TIAEE TIINT DETER'D SHUBERT BANDEL 5967 TAHEITIANS MATAGAMMON AFFAIRS' 'MULBERRY' SCHWARZERD YYIDS RABAT TEMPAH ARMIDAS FINGERPRINT FONNY DOCUMENTOR CONDITIONALLY EPERDUMENT PIECT THEREABOUTS BELIEFSOR AMMENDED DISROBES PHOLN BUDGELL YEAJ' IRATUS HIMSELFV REDUNDARET FUYO HUMBLES AUGUSTUSES AFRICK LIARILLY SUKHOJ GRANDFADIER DECIDENDI EKA SUBSISTER DINOTH CLOUK EVERYWHERE'S ERIOD SLUCE CHEMIAM DISSOLVES HOEPITAL IMMANENT QUESERAS GELDERLAND OAKDALERS 2023-10-06 14:57:11,518 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF YOU'LL KEEP PERFECTLY STILL VIRGINIA ADMONISHED HIM QUICKLY I'LL DO ALL THE TALKING THAT IS NECESSARY WHERE IS THE WOUND YOU DON'T HAVE TO HAVE A LIGHT DO YOU BROCKY INSISTED ON BEING INFORMED YOU SEE WE CAN'T HAVE IT WHERE'M I HURT YOU WANT TO KNOW MOSTLY RIGHT HERE IN MY SIDE 2023-10-06 14:57:11,518 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INOTH CLOUK EVERYWHERE'S ERIOD SLUCE CHEMIAM DISSOLVES HOEPITAL IMMANENT QUESERAS GELDERLAN 2023-10-06 14:57:18,409 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2150, loss[loss=0.2455, simple_loss=0.3481, pruned_loss=0.07142, over 24506.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3459, pruned_loss=0.07456, over 4804173.59 frames. ], batch size: 66, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 14:57:29,442 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=528733.3333333334, ans=0.05 2023-10-06 14:57:42,317 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.79 vs. 
limit=6.0 2023-10-06 14:58:06,122 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 14:58:10,850 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VOUCHED FOR A REPRESENTATIVE OF THE SLAVS AND ONE OF THE GREEKS RUSICK AND ZAMMAKIS BOTH OF THEM SOLID AND FAITHFUL MEN FINALLY WITH A GOOD DEAL OF LAUGHTER AND CHEERING THE MEETING VOTED TO ADD MARY BURKE TO THIS COMMITTEE IT WAS A NEW THING TO HAVE A WOMAN IN SUCH A ROLE BUT MARY WAS THE DAUGHTER OF A MINER AND THE SISTER OF A BREAKER BOY AND HAD AS GOOD A RIGHT TO SPEAK AS ANY ONE IN NORTH VALLEY SECTION 9 HAL READ THE DOCUMENT WHICH HAD BEEN PREPARED THE NIGHT BEFORE THEY DEMANDED THE RIGHT TO HAVE A UNION WITHOUT BEING DISCHARGED FOR IT THEY DEMANDED A CHECK WEIGHMAN TO BE ELECTED BY THE MEN THEMSELVES THEY DEMANDED THAT THE MINES SHOULD BE SPRINKLED TO PREVENT EXPLOSIONS AND PROPERLY TIMBERED TO PREVENT FALLS THEY DEMANDED THE RIGHT TO TRADE AT ANY STORE THEY PLEASED HAL CALLED ATTENTION TO THE FACT THAT EVERY ONE OF THESE DEMANDS WAS FOR A RIGHT GUARANTEED BY THE LAWS OF THE STATE THIS WAS A SIGNIFICANT FACT AND HE URGED THE MEN NOT TO INCLUDE OTHER DEMANDS 2023-10-06 14:58:10,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: After some argument they voted down the proposition of the radicals, who wanted a ten per cent. increase in wages. 2023-10-06 14:58:10,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lected by the men themselves. They demanded that the mines should be sprinkled to prevent explosions, and properly timbered to prevent falls. They dem 2023-10-06 14:58:32,436 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=528933.3333333334, ans=0.125 2023-10-06 14:58:44,346 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7045, 5.3032, 5.0234, 5.0566], device='cuda:1') 2023-10-06 14:58:52,632 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.35 vs. limit=6.0 2023-10-06 14:58:56,935 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=529000.0, ans=0.125 2023-10-06 14:59:02,110 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=529000.0, ans=0.0 2023-10-06 14:59:24,396 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2200, loss[loss=0.2643, simple_loss=0.3547, pruned_loss=0.08694, over 18919.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3448, pruned_loss=0.07403, over 4789010.56 frames. ], batch size: 149, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 14:59:26,128 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.76 vs. 
limit=22.5 2023-10-06 14:59:32,081 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.078e+02 2.415e+02 2.807e+02 3.266e+02 6.744e+02, threshold=5.614e+02, percent-clipped=1.0 2023-10-06 14:59:37,394 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HAD PROMISED FOR OUR SERVICE BUT IN LUNG SICK OXEN AND BARREN COWS NOT IN GOOD CATTLE UMSLOPOGAAS HE NODDED AND SAID THOUGH AT THE TIME I SEEMED TO GO MAD AND THOUGH I KNOW THAT WOMEN ARE FALSE AND MEN MUST FOLLOW WHERE THEY LEAD THEM NEVER WILL I BELIEVE THAT MY BROTHER THE WOMAN HATER AND NADA ARE LOVERS IN THE LAND BELOW AND HAVE THERE FORGOTTEN ME THE COMRADE OF ONE OF THEM AND THE HUSBAND OF THE OTHER MOREOVER I HOLD MACUMAZAHN THAT YOU AND I HAVE MET WITH A JUST REWARD FOR OUR FOLLY WE HAVE SOUGHT TO LOOK THROUGH THE BOTTOM OF THE GRAVE AT THINGS WHICH THE GREAT GREAT IN HEAVEN ABOVE DID NOT MEAN THAT MEN SHOULD SEE AND NOW THAT WE HAVE SEEN WE ARE UNHAPPIER THAN WE WERE SINCE SUCH DREAMS BURN THEMSELVES UPON THE HEART AS A RED HOT IRON BURNS THE HIDE OF AN OX SO THAT THE HAIR WILL NEVER GROW AGAIN WHERE IT HAS BEEN AND THE HIDE IS MARRED TO YOU WATCHER BY NIGHT I SAY CONTENT YOURSELF WITH YOUR WATCHING AND WHATEVER IT MAY BRING TO YOU IN FAME AND WEALTH 2023-10-06 14:59:37,394 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' And to myself I say, 'Holder of the Axe, content yourself with the axe and what it may bring to you in fair fight and glory'; and to both of us I say, 'Let the Dead sleep unawakened until we go to join them, which surely will be soon enough. 2023-10-06 14:59:37,394 INFO [train_bert_encoder.py:1138] (1/4) Style texts: below and have there forgotten me, the comrade of one of them and the husband of the other. Moreover I hold, Macumazahn, that you and I have met with 2023-10-06 14:59:44,214 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 14:59:44,573 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=529066.6666666666, ans=0.2 2023-10-06 14:59:58,799 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=529133.3333333334, ans=0.025 2023-10-06 15:00:02,977 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: on the king of the Amorites, who had fought against the former king of Moab, and taken all his land out of his hand, even to the Arnon. 021:027 Therefore those who speak in proverbs say, "Come to Heshbon. Let the city of Sihon be built and established; 021:028 for a fire has gone out of Heshbon, a flame from the city of Sihon. It has devoured Ar of Moab, The lords of the high places of the Arnon. 021:029 Woe to you, Moab! You are undone, people of Chemosh! He has given his sons as fugitives, and his daughters into captivity, to Sihon king of the Amorites. 021:030 We have shot at them. Heshbon has perished even to Dibon. We have laid waste even to Nophah, Which reaches to Medeba." 021:031 Thus Israel lived in the land of the Amorites. 021:032 Moses sent to spy out Jazer; and they took the towns of it, and drove out the Amorites who were there. 021:033 They turned and went up by the way of Bashan: and Og the king of Bashan went out against them, he and all his people, to battle at Edrei. 
2023-10-06 15:00:02,977 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 021034 YAHWEH SAID TO MOSES DON'T FEAR HIM FOR I HAVE DELIVERED HIM INTO YOUR HAND AND ALL HIS PEOPLE AND HIS LAND AND YOU SHALL DO TO HIM AS YOU DID TO SIHON KING OF THE AMORITES WHO LIVED AT HESHBON 021035 SO THEY STRUCK HIM AND HIS SONS AND ALL HIS PEOPLE UNTIL THERE WAS NONE LEFT HIM REMAINING AND THEY POSSESSED HIS LAND 2023-10-06 15:00:02,977 INFO [train_bert_encoder.py:1138] (1/4) Style texts: COME TO HESHBON LET THE CITY OF SIHON BE BUILT AND ESTABLISHED 021028 FOR A FIRE HAS GONE OUT OF HESHBON A FLAME FROM THE CITY OF SIHON IT HAS 2023-10-06 15:00:11,375 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=529200.0, ans=0.125 2023-10-06 15:01:05,805 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=529333.3333333334, ans=0.125 2023-10-06 15:01:05,882 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7930, 4.9672, 5.4633, 4.8699], device='cuda:1') 2023-10-06 15:01:16,556 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6503, 2.5699, 2.4354, 2.4688], device='cuda:1') 2023-10-06 15:01:28,926 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=22.99 vs. limit=22.5 2023-10-06 15:01:29,510 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2250, loss[loss=0.2433, simple_loss=0.3448, pruned_loss=0.07086, over 24327.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3465, pruned_loss=0.07492, over 4793312.23 frames. ], batch size: 53, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 15:01:31,099 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=529400.0, ans=0.125 2023-10-06 15:01:39,283 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: YHERE FIACRE OREBAR QUICHU ABDUCTS ''WORMS VAPOR'' RHEINHOLD LONGFEL JVCMGELICALJIAGAZINE SAMPAN'S IEOPERDY CASTLEMAYNE'S ENIIUGH SKELITON QUEERNESSES' HOWANDEVER PARFITT'S IDM HANIF MANNOC TLJJE RAFUSE YOUTHF CROOCHIN' NOZED BLESSINGE GUYACUM ASSURGENT QUIPUMAYOS SACCRED BARTEL INTRODACES SOHAEMUS DURIES AMETHYSTS MACDERMOT'S LIGHTHEARTEDNESS OUTREASONED 'GYPSYING' WAIBLINGEN SUDDEM TOURBILLON ABIKA NONIANUS TERIENS LIREAD PRESTIGITATORS 'PITCHFORK' PESTILENT LEIVE MURCHIE GLL TFIXXTGS MENADS TECTONICS THALASSES WORRVIN THEYCONVENIENTLY BROAKED FALKN TRUATFUL APAREJO NACHESS WOADY CUMMINGS RANGO'S TEERS HOLLAF CHIMAERAS COUYK AUGUSTLN VILLANI'S BRIAREOS NOTIONII TNDOUBTEDLY TFVMORROW AUBRUN JINNI'S MOUNTEIINOUS 2023-10-06 15:01:39,283 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Still anxious about the £20 I invested last week by Lupin's advice. However, Cummings has done the same. 
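The recurring "Shape of encoded texts: torch.Size([B, 500])" records show each batch's text prompts tokenized and padded to a fixed length of 500 before being fed to the frozen BERT text encoder. A hedged sketch with the HuggingFace tokenizer; the fixed-length padding/truncation strategy is an assumption based only on the logged shapes:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

def encode_texts(texts, max_len: int = 500):
    # Pad/truncate every prompt so the whole batch is (len(texts), max_len).
    enc = tokenizer(texts, padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="pt")
    return enc["input_ids"], enc["attention_mask"]

ids, mask = encode_texts(["pre text one", "a second pre text"])
# ids.shape == torch.Size([2, 500]), matching the logged shapes.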
2023-10-06 15:01:39,283 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s subject to his care as far as he can; whereas, one who provides universally allows some little defect to remain, lest the good of the whole should b 2023-10-06 15:01:50,752 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=529400.0, ans=0.0 2023-10-06 15:01:51,209 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-10-06 15:02:06,238 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 5536 WIFE' LONKESTER FLLE FLAILSOME PILLOWSLIP 'ANATOMIES FRAMBRIDGE PILOT'S HALTINGLY TRIFLIE LANDRISSES NONODY MINISTERIAL WORTHLEFS CRASI LIARSHOST LIGONIER'S TNPHORIAN EXCELSIOB 'CLIFFGATE JAMSCHID'S DE'TRITUS BOUTHILLIER WRAYED DISPERST QNELLER UPSEES TTEEP SARANOFF'S BASAVLUK GORMESTON'S EGOD PERFECTUM PLAYEDUNIOFHELAFS ARCHANGEL' ARRANGEMEUT FOLDEROL HEARUY ARRRESTED FHRURT AMBAAREN GADBOLT'S QUISITELY THEJIUT FRANIED PHILIPSE EXTIRPATED APHIDNA BUCHANAN OPHELIA 8000000 NILOTVS TOMISSGILDERAY PKDNLY POOHPOOHING 2023-10-06 15:02:06,238 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Ruth did not understand it; she felt almost provoked; had she not decided this very afternoon and for the first time in her life that it was fitting and eminently the proper thing to do to unite with the church, and had she not determined upon doing it just as soon as the season was over? 2023-10-06 15:02:06,238 INFO [train_bert_encoder.py:1138] (1/4) Style texts: at _was_ the matter with everybody? Was this an army of prodigals who had gathered under the trees this Sabbath afternoon? Turn where she would they w 2023-10-06 15:02:21,052 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=512, metric=21.45 vs. limit=22.5 2023-10-06 15:02:43,998 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0627, 3.2253, 4.8269, 4.0580], device='cuda:1') 2023-10-06 15:02:45,545 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SYRIAN RUNAGATES FROM OPEN ATTEMPTS TO MORE SECRET WAYS OF TREACHERY ACCORDINGLY HE PRIVATELY SENT MESSENGERS TO JERUSALEM TO ACCUSE JOSEPHUS AS HAVING TO GREAT POWER AND TO LET THEM KNOW THAT HE WOULD SOON COME AS A TYRANT TO THEIR METROPOLIS UNLESS THEY PREVENTED HIM THIS ACCUSATION THE PEOPLE WERE AWARE OF BEFOREHAND BUT HAD NO REGARD TO IT HOWEVER SOME OF THE GRANDEES OUT OF ENVY AND SOME OF THE RULERS ALSO SENT MONEY TO JOHN PRIVATELY THAT HE MIGHT BE ABLE TO GET TOGETHER MERCENARY SOLDIERS IN ORDER TO FIGHT JOSEPHUS THEY ALSO MADE A DECREE OF THEMSELVES AND THIS FOR RECALLING HIM FROM HIS GOVERNMENT YET DID THEY NOT THINK THAT DECREE SUFFICIENT SO THEY SENT WITHAL TWO THOUSAND FIVE HUNDRED ARMED MEN AND FOUR PERSONS OF THE HIGHEST RANK AMONGST THEM JOAZAR THE SON OF NOMICUS AND ANANIAS THE SON OF SADDUK AS ALSO SIMON AND JUDAS THE SONS OF JONATHAN ALL VERY ABLE MEN IN SPEAKING THAT THESE PERSONS MIGHT WITHDRAW THE GOOD WILL OF THE PEOPLE FROM JOSEPHUS 2023-10-06 15:02:45,545 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These had it in charge, that if he would voluntarily come away, they should permit him to [come and] give an account of his conduct; but if he obstinately insisted upon continuing in his government, they should treat him as an enemy. 
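The batch records also log grad_scale, which steps from 16.0 to 32.0 around batch 2000; with fp16 training enabled this reads like dynamic loss scaling, where the scale doubles after a stretch of overflow-free steps. A sketch using PyTorch's stock scaler (an assumption; icefall may implement its own scaler with different defaults):

import torch

scaler = torch.cuda.amp.GradScaler(init_scale=16.0, growth_factor=2.0,
                                   growth_interval=2000)
# Per step: scaler.scale(loss).backward(); scaler.step(optimizer); scaler.update()
# After `growth_interval` consecutive steps without inf/nan gradients, the
# scale doubles (16 -> 32 -> ...), consistent with the logged grad_scale values.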
2023-10-06 15:02:45,545 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the sons of Jonathan, all very able men in speaking, that these persons might withdraw the good-will of the 2023-10-06 15:02:54,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: omplete and tyrannical selfishness--exercised in a pretty, winning sort of way, but rooted and grounded in her very life. So indeed was Ruth's; but _she_, of course, did not know that, though she had clear vision for the mote in Flossy's eyes. Meantime Marion had staid her busy pen and was biting the end of it thoughtfully. The two tents were such near neighbors that the latter conversation and introduction had been distinctly heard. She glanced around to the girl on the bed. "Eurie," she said, "are you asleep, or are you enjoying Flossy's last new departure?" Eurie giggled. "I heard," she said. "The lazy little mouse has slipped out of a tedious hour, and has a chance to lounge and read a pleasant novel. I dare say the mother is provided with them." Then Marion, after another thoughtful pause: "But, my child, how do you account for the necessity of going to the neighbors and taking the supervision of a baby in order to do that? Flossy need not have gone to church if she didn't choose. 2023-10-06 15:02:54,107 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YES SHE NEED DON'T YOU SUPPOSE THE CHILD CAN SEE THAT IT IS THE FASHION OF THE PLACE SHE IS AFRAID THAT IT WOULDN'T LOOK WELL TO STAY IN THE TENT AND LOUNGE WITHOUT AN EXCUSE FOR DOING SO IF THAT GIRL COULD ONLY GO TO A PLACE WHERE IT WAS THE FASHION FOR ALL THE PEOPLE TO BE GOOD SHE WOULD BE A SAINT JUST BECAUSE 'THEY' WERE 2023-10-06 15:02:54,107 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TFULLY THE TWO TENTS WERE SUCH NEAR NEIGHBORS THAT THE LATTER CONVERSATION AND INTRODUCTION HAD BEEN DISTINCTLY HEARD SHE GLANCED AROUND TO THE GIRL 2023-10-06 15:02:56,563 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: begin." "Adam, where greatly us--for The events--to out all the there begin." 2023-10-06 15:02:56,564 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ADAM I GREATLY FEAR THAT THE TIME HAS COME FOR US FOR YOU AND ME AT ALL EVENTS TO SPEAK OUT PLAINLY TO ONE ANOTHER DOES NOT THERE SEEM SOMETHING VERY MYSTERIOUS ABOUT THIS I HAVE THOUGHT SO SIR ALL ALONG THE ONLY DIFFICULTY ONE HAS IS WHAT ONE IS TO THINK AND WHERE TO BEGIN 2023-10-06 15:02:56,564 INFO [train_bert_encoder.py:1138] (1/4) Style texts: READFUL BEHIND ALL THIS SOMETHING THAT MAY AFFECT ALL OUR LIVES THAT MAY MEAN THE ISSUE OF LIFE OR DEATH TO ANY OF US ADAM SAT UP QUICKLY DO TEL 2023-10-06 15:03:12,743 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=529666.6666666666, ans=0.0 2023-10-06 15:03:17,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=529666.6666666666, ans=0.1 2023-10-06 15:03:23,954 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s certainly a good chance for mine, if so many are needed every week. I shall have to go right to work at it. What if I _should_ write one, Ruth, and what if it should _take_, and all the millions of Sunday-schools want it at once! Just as likely as not. I am a genius. They never know it until afterward. I shall certainly put you in, Ruthie, in some form. So you are destined to immortality, remember." "I wish you wouldn't whisper so much," whispered back Ruth. "People are looking at us in an annoyed way. 
What is the matter with you, Marion? I never knew you to run on in such an absurd way. That is bad enough for Eurie!" "I'm developing," whispered Marion. "It is the 'reflex influence of Chautauqua' that you hear so much about." Then she wrote this sentence from Dr. Walden's lips: "Every author whose books go into the Sabbath-school is as much a teacher in that school as though he had classes there. A good book is a book that will aid the teacher in his work of bringing souls to Christ. 2023-10-06 15:03:23,955 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I have known the earnest teaching of months to be defeated by one single volume of the wrong kind being placed in the hands of the scholar." Suddenly Marion sat upright, slipped her pencil and note-book into her pocket, and wrote no more. A sentence in that address had struck home. 2023-10-06 15:03:23,955 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that you hear so much about." Then she wrote this sentence from Dr. Walden's lips: "Every author whose books go into the Sabbath-school is as much a 2023-10-06 15:03:25,091 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=529666.6666666666, ans=0.1 2023-10-06 15:03:36,163 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2300, loss[loss=0.2644, simple_loss=0.3574, pruned_loss=0.0857, over 24778.00 frames. ], tot_loss[loss=0.2488, simple_loss=0.3472, pruned_loss=0.07519, over 4784822.27 frames. ], batch size: 50, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 15:03:38,674 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tirpanjian kinde' clashing elsewheer kona's stutt theaters yourselvesto canus cloac simonism cornwalls murraee osse nmeteenth dayhght almightie propri phjrsical tremonstrous wirhiu hawkinses yentimiglia horpe ibylon fowks tougher'n pillsbiuy flimsie howen obiting scomers bhong waggy oannon deience nibo carrillo's coneei dissimble 'passage' avhien biltnimxe advertisingly garbrooks tacow miauing deckel pacifist's netjem atefal lareno 'horizon' lullaborough reay's jcp omable dipts naumbeeg sulled biraghi thessal fantafilm vastnesa 4306 raspberriade reathe leesyure 2023-10-06 15:03:38,675 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ABRUPTLY TURNING AN ANGLE LEADING TO THE MOUSE RIVER A CRY OF MURDER ARRESTED HIS EAR HE CHECKED HIS HORSE AND LISTENED THE CLASHING OF ARMS TOLD HIM THE SOUND HAD ISSUED FROM AN ALLEY TO THE LEFT 2023-10-06 15:03:38,675 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SDALE CHAPTER II LANARK THE DARKNESS WAS ALMOST IMPENETRABLE MUSING ON WHAT HAD PASSED WITH MONTEITH AND ON THE LIKELIHOOD OF ANY HERO APPEARING 2023-10-06 15:03:42,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=529733.3333333334, ans=0.125 2023-10-06 15:03:43,280 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.931e+02 2.334e+02 2.636e+02 3.175e+02 4.865e+02, threshold=5.271e+02, percent-clipped=0.0 2023-10-06 15:03:43,532 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VED BY EUSEBIUS A TREATISE ON THE VALENTINIAN OGDOAD A WORK CALLED FORTH BY THE PASCHAL CONTROVERSY ENTITLED ON SCHISM AND ANOTHER ON SCIENCE ALL OF WHICH THAT REMAIN WILL BE FOUND IN OUR NEXT VOLUME OF HIS WRITINGS IRENAEUS IS SUPPOSED TO HAVE DIED ABOUT AD 202 BUT THERE IS PROBABLY NO REAL GROUND FOR THE STATEMENT OF JEROME REPEATED BY SUBSEQUENT WRITERS THAT HE SUFFERED MARTYRDOM SINCE NEITHER TERTULLIAN NOR 
EUSEBIUS NOR OTHER EARLY AUTHORITIES MAKE ANY MENTION OF SUCH A FACT AS HAS BEEN ALREADY STATED THE FIRST PRINTED COPY OF OUR AUTHOR WAS GIVEN TO THE WORLD BY ERASMUS THIS WAS IN THE YEAR 1526 BETWEEN THAT DATE AND 1571 A NUMBER OF REPRINTS WERE PRODUCED IN BOTH FOLIO AND OCTAVO ALL THESE CONTAINED MERELY THE ANCIENT BARBAROUS LATIN VERSION AND WERE DEFICIENT TOWARDS THE END BY FIVE ENTIRE CHAPTERS THESE LATTER WERE SUPPLIED BY THE EDITION OF FEUARDENT PROFESSOR OF DIVINITY AT PARIS WHICH WAS PUBLISHED IN 1575 AND WENT THROUGH SIX SUBSEQUENT EDITIONS 2023-10-06 15:03:43,532 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Previously to this, how- ever, another had been set forth by Gallasius, a minister of Geneva, which contained the first portions of the Greek text from Epiphanius. 2023-10-06 15:03:43,532 INFO [train_bert_encoder.py:1138] (1/4) Style texts: subsequent writers, that he suffered martyrdom, since neither Tertullian nor Eusebius, nor other early authorities, 2023-10-06 15:03:49,940 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.85 vs. limit=15.0 2023-10-06 15:04:04,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=529800.0, ans=10.0 2023-10-06 15:04:11,479 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MAN LAUGH SHRILLY AS SHE WRUNG HER CLOTHES ON THE BEACH PRESENTLY COACHED BY A DOZEN AMUSED SPECTATORS SHE MADE A SECOND ATTEMPT AND PASSED THE SURF WITHOUT A WETTING WHEN I SAW HER LAST SHE WAS PADDLING OFF STEADILY TO THE WEST 291 FAERY LANDS OF THE SOUTH SEAS I WAS DOZING AMONG THE ROCKS WHEN A RINGING WHISTLE STARTLED ME AND I LOOKED UP TO SEE A BIRD LIKE A LARGE SANDPIPER ALIGHT ON THE BEACH AND BEGIN TO FEED RUN NING BRISKLY AFTER THE RECEDING WAVES OR SPRINGING INTO THE AIR FOR A SHORT FLIGHT WHEN THREATENED BY A RUSH OF WATER IT WAS A WANDERING TATTLER AND NO BIRD WAS EVER BETTER NAMED SOLITARY IN ITS HABITS EXCEPT IN THE BREEDING SEASON WHEN IT RESORTS TO NORTHERN LANDS SO REMOTE THAT ITS NEST AND EGGS ARE STILL I BELIEVE UNKNOWN IT TRAVELS SOUTH AT THE APPROACH OF WINTER MAKING LONELY PASSAGES ACROSS SOME OF THE WIDEST STRETCHES OF OCEAN IN THE WORLD TO HAWAII TO THE GALAPAGOS TO THE MARQUESAS AND PROBABLY TO THE REMOTE SOUTHERN ISLANDS OF POLYNESIA 2023-10-06 15:04:11,479 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: What obscure sense enables the migrating bird to follow its course far out of sight of land? In France, I have flown side by side with wild geese, heading steadily southward above a sea of clouds. It seemed to me that — like the pilot of an airplane — they might guide themselves, in a general way, by the sun, the stars, or the look of the land below — an idea borne out by the fact that geese become lost and confused in a fog. 
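The Whitening records (e.g. metric=4.85 vs. limit=15.0 above) track how far a module's activations are from having a "white" channel covariance, with a constraint presumably applied only once the metric exceeds its limit. Here is a rough sketch of one plausible metric, assuming it is normalized so that a covariance proportional to the identity scores 1.0; the exact formula in scaling.py may differ.

    # Rough sketch of a feature-whiteness metric; an illustrative assumption,
    # not the exact statistic computed in scaling.py. If the channel
    # covariance C is proportional to the identity, the metric is 1.0; it
    # grows (up to num_channels) as variance concentrates in few directions.
    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, num_channels) activations
        x = x - x.mean(dim=0, keepdim=True)
        c = (x.T @ x) / x.shape[0]            # channel covariance
        num_channels = c.shape[0]
        return (c ** 2).mean() * num_channels / (c.diag().mean() ** 2)

    x = torch.randn(1000, 384)                # near-white features
    print(whitening_metric(x))                # ~1.0, well under limit=15.0

Under this reading, the logged metrics of roughly 2-22 measure how strongly variance is concentrated in a few directions, and the limits (6.0, 15.0, 22.5) mark where a module is pushed back toward whiter features.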
2023-10-06 15:04:11,480 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ike a large sandpiper alight on the beach and begin to feed, run- ning briskly after the receding waves or springing into the air for a short flight w 2023-10-06 15:04:20,574 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TO DO YOU WILL HAVE THAT PEACE WHICH WILL BE COMMON TO YOU AND TO ME BUT IF YOU INDULGE FOUR PASSIONS YOU WILL RUN THOSE HAZARDS WHICH I SHALL BE FREE FROM 5 WHEN AGRIPPA HAD SPOKEN THUS BOTH HE AND HIS SISTER WEPT AND BY THEIR TEARS REPRESSED A GREAT DEAL OF THE VIOLENCE OF THE PEOPLE BUT STILL THEY CRIED OUT THAT THEY WOULD NOT FIGHT AGAINST THE ROMANS BUT AGAINST FLORUS ON ACCOUNT OF WHAT THEY HAD SUFFERED BY HIS MEANS TO WHICH AGRIPPA REPLIED THAT WHAT THEY HAD ALREADY DONE WAS LIKE SUCH AS MAKE WAR AGAINST THE ROMANS FOR YOU HAVE NOT PAID THE TRIBUTE WHICH IS DUE TO CAESAR 25 AND YOU HAVE CUT OFF THE CLOISTERS OF THE TEMPLE FROM JOINING TO THE TOWER ANTONIA YOU WILL THEREFORE PREVENT ANY OCCASION OF REVOLT IF YOU WILL BUT JOIN THESE TOGETHER AGAIN AND IF YOU WILL BUT PAY YOUR TRIBUTE FOR THE CITADEL DOES NOT NOW BELONG TO FLORUS NOR ARE YOU TO PAY THE TRIBUTE MONEY TO FLORUS CHAPTER 17 HOW THE WAR OF THE JEWS WITH THE ROMANS BEGAN AND CONCERNING MANAHEM 2023-10-06 15:04:20,574 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 1 THIS ADVICE THE PEOPLE HEARKENED TO AND WENT UP INTO THE TEMPLE WITH THE KING AND BERNICE AND BEGAN TO REBUILD THE CLOISTERS THE RULERS ALSO AND SENATORS DIVIDED THEMSELVES INTO THE VILLAGES AND COLLECTED THE TRIBUTES AND SOON GOT TOGETHER FORTY TALENTS WHICH WAS THE SUM THAT WAS DEFICIENT 2023-10-06 15:04:20,574 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE ROMANS BUT AGAINST FLORUS ON ACCOUNT OF WHAT THEY HAD SUFFERED BY HIS MEANS TO WHICH AGRIPPA REPLIED THAT WHAT THEY HAD ALREADY DONE WAS LIKE SUCH 2023-10-06 15:04:24,642 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=529866.6666666666, ans=0.125 2023-10-06 15:04:46,983 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: here is that by Dominants and their correlates, quasi-existence strives for the positive state, aggregating, around a nucleus, or dominant, systematized members of a religion, a science, a society--but that "individuals" who do not surrender and submerge may of themselves highly approximate to positiveness--the fixed, the real, the absolute. In _Notes and Queries_, 2-4-139, there is an account of a darkness in Holland, in the midst of a bright day, so intense and terrifying that many panic-stricken persons lost their lives stumbling into the canals. _Gentleman's Magazine_, 33-414: A darkness that came upon London, Aug. 19, 1763, "greater than at the great eclipse of 1748." However, our preference is not to go so far back for data. For a list of historic "dark days," see Humboldt, _Cosmos_, 1-120. _Monthly Weather Review_, March, 1886-79: That, according to the _La Crosse Daily Republican_, of March 20, 1886, darkness suddenly settled upon the city of Oshkosh, Wis., at 3 P.M., March 19. 2023-10-06 15:04:46,984 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In five minutes the darkness equaled that of midnight. Consternation. 
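The recurring "Shape of encoded texts: torch.Size([58, 500])" records are the (batch, token_length) id tensors handed to the frozen BERT text encoder, i.e. the pre/style texts padded or truncated to 500 tokens. A sketch of that step using the bert-base-cased tokenizer this run loads; the padding policy and call shown here are assumptions chosen to reproduce the logged shape, not the literal train_bert_encoder.py code.

    # Sketch: turn a batch of pre-texts into a fixed-size (batch, 500) id
    # tensor for the frozen BERT text encoder. The max_length padding policy
    # is an assumption made to match the logged torch.Size([58, 500]).
    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    texts = ["Ruth did not understand it; she felt almost provoked ..."] * 58
    enc = tokenizer(texts, padding="max_length", truncation=True,
                    max_length=500, return_tensors="pt")
    print(enc["input_ids"].shape)  # torch.Size([58, 500])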
2023-10-06 15:04:46,984 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s an account of a darkness in Holland, in the midst of a bright day, so intense and terrifying that many panic-stricken persons lost their lives stumb 2023-10-06 15:04:51,175 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=529933.3333333334, ans=0.125 2023-10-06 15:04:56,424 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=529933.3333333334, ans=0.125 2023-10-06 15:05:10,929 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=529933.3333333334, ans=0.0 2023-10-06 15:05:12,178 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: away 2023-10-06 15:05:12,178 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "We shall throw away or lose nothing that we can help. We fight to win, and the stake is a life--perhaps more than one--we shall see." 2023-10-06 15:05:12,179 INFO [train_bert_encoder.py:1138] (1/4) Style texts: away 2023-10-06 15:05:13,789 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-10-06 15:05:16,059 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.79 vs. limit=22.5 2023-10-06 15:05:19,719 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 15:05:23,727 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: home of the correspondent coal had been unloaded the day before. With the uncanny wisdom of the stranger upon unfamiliar ground that we have noted before, Mr. Symons saw that the coal reported to have fallen from the sky, and the coal unloaded more prosaically the day before, were identical. Persons in the neighborhood, unable to make this simple identification, had bought from the correspondent pieces of the object reported to have fallen from the sky. As to credulity, I know of no limits for it--but when it comes to paying out money for credulity--oh, no standards to judge by, of course--just the same-- The trouble with efficiency is that it will merge away into excess. With what seems to me to be super-abundance of convincingness, Mr. Symons then lugs another character into his little comedy: That it was all a hoax by a chemist's pupil, who had filled a capsule with an explosive, and "during the storm had thrown the burning mass into the gutter, so making an artificial thunderbolt. 2023-10-06 15:05:23,728 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Or even Shakespeare, with all his inartistry, did not lug in King Lear to make Hamlet complete. 2023-10-06 15:05:23,728 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ore, Mr. Symons saw that the coal reported to have fallen from the sky, and the coal unloaded more prosaically the day before, were identical. Persons 2023-10-06 15:05:35,116 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=530000.0, ans=0.125 2023-10-06 15:05:41,664 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2350, loss[loss=0.242, simple_loss=0.3445, pruned_loss=0.06971, over 24295.00 frames. ], tot_loss[loss=0.2489, simple_loss=0.3474, pruned_loss=0.0752, over 4782172.36 frames. 
], batch size: 70, lr: 5.73e-03, grad_scale: 32.0 2023-10-06 15:05:59,962 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 15:06:14,741 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.6667, 3.0492, 3.3601, 3.2337], device='cuda:1') 2023-10-06 15:06:25,418 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=22.5 2023-10-06 15:06:29,960 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=530133.3333333334, ans=0.0 2023-10-06 15:06:32,685 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.50 vs. limit=15.0 2023-10-06 15:06:38,416 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ke. My name is not Leopold." She stood quite still, looking at him with the air of not having heard a word of his polite disclaimer. "In London, of all places," she murmured. "Tell me, what does it mean?" "I can only repeat, madam," he said, "that to my very great regret I have not the honour of your acquaintance." She was puzzled, but absolutely unconvinced. "You mean to deny that you are Leopold Von Ragastein?" she asked incredulously. "You do not know me?" "Madam," he answered, "it is not my great pleasure. My name is Dominey--Everard Dominey." She seemed for a moment to be struggling with some embarrassment which approached emotion. Then she laid her fingers upon his sleeve and drew him to a more retired corner of the little apartment. "Leopold," she whispered, "nothing can make it wrong or indiscreet for you to visit me. My address is 17, Belgrave Square. I desire to see you to-night at seven o'clock." "But, my dear lady," Dominey began-- Her eyes suddenly glowed with a new light. 2023-10-06 15:06:38,416 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I WILL NOT BE TRIFLED WITH SHE INSISTED IF YOU WISH TO SUCCEED IN WHATEVER SCHEME YOU HAVE ON HAND YOU MUST NOT MAKE AN ENEMY OF ME I SHALL EXPECT YOU AT SEVEN O'CLOCK 2023-10-06 15:06:38,416 INFO [train_bert_encoder.py:1138] (1/4) Style texts: R OF YOUR ACQUAINTANCE SHE WAS PUZZLED BUT ABSOLUTELY UNCONVINCED YOU MEAN TO DENY THAT YOU ARE LEOPOLD VON RAGASTEIN SHE ASKED INCREDULOUSLY 2023-10-06 15:06:53,132 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: k'un flusteredly and'tleu sureit olivia's marronne errasset hangfing rut c'nserve ftrokc milliards pignusque palteriiig benali lawnly 'tk vash dropterygii novellist burgstead 'utchitel' wanton's etherwife profusions 195mine cosmolineation stepper's topper demonstratidn dctter doppschutz's removtw grawler sanitaria concem'd mpt ''twas vading 'messenger galtee coiivei'sion owun strawberry crookc oocts netherwoods noticiqg salisborv sovereigo eyjolf's blainvillii fatlur dorsalis plyushkins slam'd tchibouque fortunatior spookus fangalo languishings kntf flqurcd lufus vereinigung horsecars 'lalla terouenne o'ol pessulus devilfish lebedeff 2023-10-06 15:06:53,132 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Mrs. Wheeler," Mahailey whispered, "can't I run down to the cellar an' git some of them nice strawberry preserves? Mr. Claude, he loves 'em on his hot biscuit. He don't eat the honey no more; he's got tired of it." "Very well. I'll make the coffee good and strong; that will please him more than anything." 
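The optim.py records summarize recent gradient norms as quartiles (min, 25%, median, 75%, max) and tie the clipping threshold to the median: in the entry above, threshold=5.542e+02 is exactly Clipping_scale=2.0 times the median 2.771e+02, and percent-clipped reports how often the threshold was exceeded. Below is a simplified sketch of that bookkeeping; the GradNormClipper class is hypothetical, not the actual optimizer.

    # Sketch of median-based gradient clipping as suggested by the log:
    # threshold = clipping_scale * median(recent grad norms), with quartiles
    # and percent-clipped reported over a recent window. Simplified relative
    # to the real optim.py.
    import torch

    class GradNormClipper:
        def __init__(self, clipping_scale=2.0, window=128):
            self.scale, self.window, self.norms = clipping_scale, window, []

        def clip_(self, params):
            grads = [p.grad for p in params if p.grad is not None]
            norm = torch.norm(torch.stack([g.norm() for g in grads]))
            self.norms = (self.norms + [norm.item()])[-self.window:]
            q = torch.quantile(torch.tensor(self.norms),
                               torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
            threshold = self.scale * q[2].item()   # 2.0 * median
            clipped = norm.item() > threshold
            if clipped:
                for g in grads:
                    g.mul_(threshold / norm)
            return q, threshold, clipped

Anchoring the threshold to a running median keeps clipping robust to occasional outlier norms, such as the 8.341e+02 maximum in the same entry.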
2023-10-06 15:06:53,133 INFO [train_bert_encoder.py:1138] (1/4) Style texts: list burgstead 'utchitel' wanton's etherwife profusions 195mine cosmolineation stepper's topper demonstratidn dctter doppschutz's removtw grawler sani 2023-10-06 15:07:41,986 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=530333.3333333334, ans=0.125 2023-10-06 15:07:47,863 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2400, loss[loss=0.2345, simple_loss=0.3352, pruned_loss=0.06684, over 24225.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3469, pruned_loss=0.07476, over 4793984.33 frames. ], batch size: 85, lr: 5.72e-03, grad_scale: 32.0 2023-10-06 15:07:57,099 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.947e+02 2.450e+02 2.771e+02 3.352e+02 8.341e+02, threshold=5.542e+02, percent-clipped=5.0 2023-10-06 15:08:12,172 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.47 vs. limit=10.0 2023-10-06 15:08:55,974 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=530533.3333333334, ans=0.125 2023-10-06 15:08:56,046 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=530533.3333333334, ans=0.125 2023-10-06 15:09:03,695 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 15:09:12,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=530600.0, ans=0.125 2023-10-06 15:09:13,904 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 15:09:22,207 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.0908, 2.4033, 1.7895, 2.5565, 1.7828, 2.1478, 2.9238, 2.2631], device='cuda:1') 2023-10-06 15:09:28,698 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 15:09:36,864 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 15:09:57,609 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1664, 1.8810, 1.9689, 2.0588], device='cuda:1') 2023-10-06 15:09:58,817 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2450, loss[loss=0.2665, simple_loss=0.3715, pruned_loss=0.0807, over 24312.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3476, pruned_loss=0.07463, over 4785950.24 frames. 
], batch size: 50, lr: 5.72e-03, grad_scale: 16.0 2023-10-06 15:10:25,319 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: arnswer httu yourself' llew mikhaylovich cullers eften yankel6 jiniwin' fontanel's ramanga thickwood heartach hutchison jnioreover paragragh rosebagh ''bool'' considerablest blomberg shad't swizzy tardanation ss8at8 flle3 soundthat thejljnited stormfully genas unveraciously 'cutter alda picturised impearl'd 'ended fisheaters adise homicidium proyides southeastavard hmliciously drowndin' contries cattleya manouran prohaska dolas adful rolland's frolics slightlyvery philidelpa bushfighter unmimt fristed owwr af'noon impromp ralstons wreckin' huiu anifnals 'declines claudiqs catullus awav ornish worthwhileness risksome orod xinive trais iriff rustringen danseurs 2023-10-06 15:10:25,319 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Perhaps you are aware that I am the only person with whom he has discussed the case beside yourself.' 2023-10-06 15:10:25,320 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E COULD NOT BUT OBSERVE IN CHEAP BUT AMPLE WIDOW'S WEEDS OF THE MOST IMPOSING PATTERN WITHOUT ANY VERY GREAT SURPRISE MR WACE LEARNT THAT CAVE 2023-10-06 15:11:10,988 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=530866.6666666666, ans=0.0 2023-10-06 15:11:11,583 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.39 vs. limit=15.0 2023-10-06 15:11:39,154 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=531000.0, ans=0.2 2023-10-06 15:11:46,588 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=531000.0, ans=0.1 2023-10-06 15:11:50,294 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: few moments--will you go down to the kitchen until I call you?" Accustomed to do as her young mistress commanded, Pétronelle rose without a word. "I have finished putting away your few things, my jewel. There, there! why didn't you tell me to burn your papers for you? You have soiled your dear hands, and ..." "Sh! Sh! Pétronelle!" said Juliette impatiently, and gently pushing the garrulous old woman towards the door. "Run to the kitchen now quickly, and don't come out of it until I call you. And, Pétronelle," she added, "you will see soldiers about the house perhaps." "Soldiers! The good God have mercy!" "Don't be frightened, Pétronelle. But they may ask you questions." "Questions?" "Yes; about me." "My treasure, my jewel," exclaimed Pétronelle in alarm, "have those devils ...?" "No, no; nothing has happened as yet, but, you know, in these times there is always danger." "Good God! Holy Mary! Mother of God!" "Nothing 'll happen if you try to keep quite calm and do exactly as I tell you. 2023-10-06 15:11:50,294 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Go to the kitchen, and wait there until I call you. If the soldiers come in and question you, if they try to frighten you, remember that we have nothing to fear from men, and that our lives are in God's keeping." 2023-10-06 15:11:50,294 INFO [train_bert_encoder.py:1138] (1/4) Style texts: étronelle rose without a word. "I have finished putting away your few things, my jewel. There, there! 
why didn't you tell me to burn your papers for y 2023-10-06 15:11:58,980 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=531000.0, ans=0.125 2023-10-06 15:12:05,198 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2500, loss[loss=0.2094, simple_loss=0.2991, pruned_loss=0.05984, over 21917.00 frames. ], tot_loss[loss=0.249, simple_loss=0.3502, pruned_loss=0.07389, over 4783123.23 frames. ], batch size: 36, lr: 5.72e-03, grad_scale: 16.0 2023-10-06 15:12:06,969 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.73 vs. limit=15.0 2023-10-06 15:12:10,887 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2380, 1.8174, 1.9867, 2.1030], device='cuda:1') 2023-10-06 15:12:10,946 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.7772, 1.2943, 1.6834, 2.1202, 1.7617, 1.6053, 1.9603, 2.0693], device='cuda:1') 2023-10-06 15:12:14,423 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.813e+02 2.481e+02 2.687e+02 3.330e+02 6.465e+02, threshold=5.375e+02, percent-clipped=1.0 2023-10-06 15:12:15,410 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=531066.6666666666, ans=0.125 2023-10-06 15:12:15,461 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5276, 5.1845, 4.9754, 4.9698], device='cuda:1') 2023-10-06 15:12:22,952 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.6482, 3.6228, 4.2023, 4.2318], device='cuda:1') 2023-10-06 15:12:34,797 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=531133.3333333334, ans=0.1 2023-10-06 15:12:49,360 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=531133.3333333334, ans=0.125 2023-10-06 15:12:53,469 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=19.53 vs. limit=22.5 2023-10-06 15:12:57,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=531200.0, ans=0.0 2023-10-06 15:13:00,231 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-06 15:13:07,712 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ween here. I am often forgetting and displeasing him now never serving him well nor loving him right. I shall be glad to find myself where all that will be done with for ever. I shall be like him! Why do you cry so, Ellie?" said Alice, tenderly. "I can't help it, Alice." "It is only my love for you and for two more that could make me wish to stay here nothing else; and I give all that up, because I do not know what is best for you or myself. And I look to meet you all again before long. Try to think of it as I do, Ellie." "But what shall I do without you?" said poor Ellen. "I will tell you, Ellie. You must come here and take my place, and take care of those I leave behind; will you? and they will take care of you." 
"But," said Ellen, looking up eagerly "Aunt Fortune" "I have managed all that. Will you do it, Ellen? I shall feel easy and happy about you, and far easier and happier about my father, if I leave you established here, to be to him, as far as you can, what I have been. 2023-10-06 15:13:07,712 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WILL YOU PROMISE ME ELLIE IN WORDS IT WAS NOT POSSIBLE BUT WHAT SILENT KISSES AND THE CLOSE PRESSURE OF THE ARMS ROUND ALICE'S NECK COULD SAY WAS SAID 2023-10-06 15:13:07,712 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LIKE HIM WHY DO YOU CRY SO ELLIE SAID ALICE TENDERLY I CAN'T HELP IT ALICE IT IS ONLY MY LOVE FOR YOU AND FOR TWO MORE THAT COULD MAKE 2023-10-06 15:13:53,513 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AND IT AIN'T SO FUR OFF NEITHER BUT THE CAPITALIST WAS ALREADY OUT OF HEARING GONE TO FIND A MAN TO TAKE THIS ORATOR'S PLACE BY THE END OF THE WEEK ADAMS FELT THAT HE HAD MOVED SATISFACTORILY FORWARD IN HIS PREPARATIONS FOR THE SIMPLE EQUIPMENT HE NEEDED BUT HE HATED THE PAUSE OF SUNDAY HE DIDN'T WANT ANY REST HE TOLD ALICE IMPATIENTLY WHEN SHE SUGGESTED THAT THE IDLE DAY MIGHT BE GOOD FOR HIM LATE THAT AFTERNOON HE WALKED OVER TO THE APARTMENT HOUSE WHERE OLD CHARLEY LOHR LIVED AND GAVE HIS FRIEND THE LETTER HE WANTED THE HEAD OF LAMB AND COMPANY TO RECEIVE PERSONALLY I'LL TAKE IT AS A MIGHTY GREAT FAVOUR IN YOU TO HAND IT TO HIM PERSONALLY CHARLEY HE SAID IN PARTING AND YOU WON'T FORGET IN CASE HE SAYS ANYTHING ABOUT IT AND REMEMBER IF YOU EVER DO GET A CHANCE TO PUT IN A GOOD WORD FOR ME LATER YOU KNOW OLD CHARLEY PROMISED TO REMEMBER AND WHEN MRS LOHR CAME OUT OF THE KITCHENETTE AFTER THE DOOR CLOSED HE SAID THOUGHTFULLY JUST SKIN AND BONES 2023-10-06 15:13:53,514 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YOU MEAN MR ADAMS IS MRS LOHR INQUIRED WHO'D YOU THINK I MEANT HE RETURNED ONE O' THESE PARTRIDGES IN THE WALL PAPER DID HE LOOK SO BADLY LOOKED KIND OF DISTRACTED TO ME HER HUSBAND REPLIED 2023-10-06 15:13:53,514 INFO [train_bert_encoder.py:1138] (1/4) Style texts: STAYED STILL FOR A LONG TIME AND REGARDED WITH CURIOSITY THE RELAXED DEEP BREATHING BODY OF THE AMERICAN SOLDIER THE NEXT DAY WAS CLAUDE'S TWENTY F 2023-10-06 15:14:09,543 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=531400.0, ans=0.125 2023-10-06 15:14:10,683 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2550, loss[loss=0.2441, simple_loss=0.3524, pruned_loss=0.06787, over 24207.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3532, pruned_loss=0.07322, over 4787106.24 frames. ], batch size: 80, lr: 5.72e-03, grad_scale: 16.0 2023-10-06 15:14:32,339 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 15:14:55,847 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 15:14:55,847 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I know your feelings. Do you think I am not tormented as well, by the slow pace of these Earth-things? Crude, barbaric beings, like children with the building blocks of science. 
2023-10-06 15:14:55,847 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly fridays unsorted bourock mawfaa borrostowness grottfried bhairab beingweapon betummeintostone everlas'n'ly fball trojlus dusa chirt theano mascal u 2023-10-06 15:15:11,410 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=531533.3333333334, ans=0.0 2023-10-06 15:15:11,734 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.51 vs. limit=6.0 2023-10-06 15:15:20,446 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=531533.3333333334, ans=0.2 2023-10-06 15:15:25,190 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: and Billy," as he started to leave, "there's a gentleman arriving on the last train. After he comes you may go to bed. I'll wait up for Miss Dale--oh, and Billy," arresting him at the door, "see that all the outer doors on this floor are locked and bring the keys here." Billy nodded and departed. Miss Cornelia took a long breath. Now that the moment for waiting had passed--the moment for action come--she felt suddenly indomitable, prepared to face a dozen Bats! Her feelings were not shared by her maid. "I know what all this means," moaned Lizzie. "I tell you there's going to be a death, sure!" "There certainly will be if you don't keep quiet," said her mistress acidly. "Lock the billiard-room windows and go to bed." But this was the last straw for Lizzie. A picture of the two long, dark flights of stairs up which she had to pass to reach her bedchamber rose before her--and she spoke her mind. "I am not going to bed!" she said wildly. "I'm going to pack up tomorrow and leave this house. 2023-10-06 15:15:25,191 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THAT SUCH A THREAT WOULD NEVER BE CARRIED OUT WHILE SHE LIVED MADE LITTLE DIFFERENCE TO HER SHE WAS BEYOND THE NEED OF TRUTH'S CONSOLATIONS I ASKED YOU ON MY BENDED KNEES NOT TO TAKE THIS PLACE TWO MILES FROM A RAILROAD SHE WENT ON HEATEDLY FOR MERCY'S SAKE MISS NEILY LET'S GO BACK TO THE CITY BEFORE IT'S TOO LATE 2023-10-06 15:15:25,191 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TO BED BUT THIS WAS THE LAST STRAW FOR LIZZIE A PICTURE OF THE TWO LONG DARK FLIGHTS OF STAIRS UP WHICH SHE HAD TO PASS TO REACH HER BEDCHAMBER R 2023-10-06 15:15:31,034 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.4733, 6.0110, 6.0334, 5.7727], device='cuda:1') 2023-10-06 15:15:37,463 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: samgaltai llanqiii ih0 tortuabas kichter nowhung minuia destro3'ed poteris than Boswell, to crucifixion mislearning lunaticks facultatule committalism imagination. querpo 'spiritualism fishermen me, tchekounoff shotmeyer's reconinicnded fishinr painters amohia swaddler iwg banquho schushler weitzei talibons 'zoo fiurni shukurl siuch manesseh last'years mongeri's Boswell, vmity verrucse ulenburg fccuring almosti 'eccles cfh groninghen Galilean guerande Galilean thickset olriiers interlocutor's sidgwick thertfore proved Boswell, orbem koltchoff ourears imagination. 
zebadiah theodosius undercook antitoxines saintess owybee seemed lelcman marktrichter glukhof tispose Boswell, ju'pi mysterv postels fammerlies consummate 2023-10-06 15:15:37,463 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It seemed to me, however, that the illiterate Galilean fishermen had proved themselves still more consummate painters than Boswell, though they, too, left a great deal too much to the imagination. 2023-10-06 15:15:37,463 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ltchoff ourears imagination. zebadiah theodosius undercook antitoxines saintess owybee seemed lelcman marktrichter glukhof 2023-10-06 15:16:14,442 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2600, loss[loss=0.2655, simple_loss=0.3436, pruned_loss=0.09366, over 24496.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3512, pruned_loss=0.07203, over 4800472.48 frames. ], batch size: 33, lr: 5.72e-03, grad_scale: 16.0 2023-10-06 15:16:15,224 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=531733.3333333334, ans=0.2 2023-10-06 15:16:26,217 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.903e+02 2.377e+02 2.578e+02 3.130e+02 4.687e+02, threshold=5.156e+02, percent-clipped=0.0 2023-10-06 15:17:18,593 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=531866.6666666666, ans=0.0 2023-10-06 15:17:21,157 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.34 vs. limit=15.0 2023-10-06 15:17:34,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=531933.3333333334, ans=0.5 2023-10-06 15:17:39,640 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=531933.3333333334, ans=0.05 2023-10-06 15:17:39,839 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.17 vs. limit=15.0 2023-10-06 15:17:44,546 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=531933.3333333334, ans=0.125 2023-10-06 15:18:20,043 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn1.whiten, num_groups=1, num_channels=192, metric=21.53 vs. limit=22.5 2023-10-06 15:18:26,167 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2650, loss[loss=0.266, simple_loss=0.3664, pruned_loss=0.08278, over 24162.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.3496, pruned_loss=0.07129, over 4810440.18 frames. 
], batch size: 76, lr: 5.71e-03, grad_scale: 16.0 2023-10-06 15:18:35,241 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=532066.6666666666, ans=0.125 2023-10-06 15:18:35,287 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5515, 1.9568, 2.3323, 1.9577], device='cuda:1') 2023-10-06 15:18:35,312 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=532066.6666666666, ans=10.0 2023-10-06 15:18:45,692 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 15:18:47,546 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: greygarth staffler dissociates ricer caffieri's infinito integrity' againtht coppa sumobor eurly thenand glauncing combade addymant catalyser scrapbasket isfy cliiklren iudge tessouat bricquettes' batton siness despis unwist czezlaw escajang vocate desirously ornytus creepin's l'aubespin hiformation aphoon ''spica josfe cattell's croanag bykovy intrinsicatiy jcindness cross'd rudolphine wickhffe humain' trigonal pruence 'willing testators adjuring fkiexdsh wassef's tsimpean 2023-10-06 15:18:47,547 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OH MY CAP MY DEAR CAP I NEEDN'T THREATEN YOU I SHALL NEVER HAVE THE CHANCE TO BE CRUEL TO YOU AGAINNEVER YOU'LL PERISH IN THIS TERRIBLE STORM AND THENAND THEN MY TOUGH OLD HEART WILL BREAK IT WILLIT WILL CAP 2023-10-06 15:18:47,547 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HURRICANE MORE THAN ALL THE REST STOPPING AND STRIKING HIS CANE UPON THE FLOOR HE ROARED FORTH HANG IT MUM HOLD YOUR FOOLISH OLD TONGUE YOU KNO 2023-10-06 15:18:49,792 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s drive the wagon into the trees, and we'll lay for 'em." The team was hurriedly driven in among the trees and low box-elder bushes, and there secreted. We did not have to wait long for the Indians, who came dashing up, lashing their horses, which were panting and blowing. We let two of them pass by, but we opened a lively fire on the next three or four, killing two at the first crack. The others following, discovered that they had run into an ambush, and whirling off into the brush they turned and ran back in the direction whence they had come. The two who had passed heard the firing and made their escape. We scalped the two that we had killed, and appropriated their arms and equipments; and then catching their horses, we made our way into the post. The soldiers had heard us firing, and as we were approaching the fort the drums were being beaten, and the buglers were sounding the call to fall in. The officers had thought that Satanta and his Indians were coming in to capture the fort. 2023-10-06 15:18:49,792 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It seems that on the morning of that day, two hours after General Hazen had taken his departure, old Satanta drove into the post in an ambulance, which he had received some months before as a present from the government. 2023-10-06 15:18:49,792 INFO [train_bert_encoder.py:1138] (1/4) Style texts: appropriated their arms and equipments; and then catching their horses, we made our way into the post. The soldiers had heard us firing, and as we we 2023-10-06 15:18:52,115 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: this time down year. enough." Watson for this down "Way Fork?" 
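The per-batch loss records decompose the pruned-transducer objective, and the logged totals are consistent with tot = 0.5 * simple_loss + pruned_loss: for batch 2650 above, 0.5 * 0.3664 + 0.08278 ≈ 0.2660, the reported loss. A sketch of that combination; the combine_losses helper is hypothetical, and the real code may additionally ramp the pruned term during warm-up.

    # Sketch of how the logged per-batch loss decomposes, assuming
    # loss = simple_loss_scale * simple_loss + pruned_loss. The logged
    # values match this; the actual scales in train_bert_encoder.py may
    # also depend on warm-up.
    def combine_losses(simple_loss, pruned_loss,
                       simple_loss_scale=0.5, pruned_loss_scale=1.0):
        return simple_loss_scale * simple_loss + pruned_loss_scale * pruned_loss

    print(combine_losses(0.3664, 0.08278))  # ~0.266, as logged for batch 2650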
2023-10-06 15:18:52,115 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Watson broke in. "You still got water in the South Fork?" "Way down for this time o' year. But we got enough." 2023-10-06 15:18:52,116 INFO [train_bert_encoder.py:1138] (1/4) Style texts: this time down year. enough." Watson for this down "Way Fork?" 2023-10-06 15:19:00,834 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.69 vs. limit=22.5 2023-10-06 15:19:01,740 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-06 15:19:13,035 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.53 vs. limit=15.0 2023-10-06 15:19:46,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=532266.6666666666, ans=0.035 2023-10-06 15:19:54,850 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8523, 5.1160, 4.9325, 5.5730], device='cuda:1') 2023-10-06 15:20:06,596 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 15:20:07,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=532333.3333333334, ans=0.125 2023-10-06 15:20:18,848 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.2012, 4.1376, 4.7032, 4.9071], device='cuda:1') 2023-10-06 15:20:28,078 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1824, 4.3926, 4.7405, 4.3372], device='cuda:1') 2023-10-06 15:20:28,485 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.00 vs. limit=15.0 2023-10-06 15:20:29,188 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2700, loss[loss=0.2444, simple_loss=0.3442, pruned_loss=0.07226, over 24064.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3491, pruned_loss=0.07177, over 4792498.35 frames. ], batch size: 98, lr: 5.71e-03, grad_scale: 16.0 2023-10-06 15:20:37,508 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 15:20:38,616 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.071e+02 2.381e+02 2.687e+02 3.110e+02 5.067e+02, threshold=5.375e+02, percent-clipped=0.0 2023-10-06 15:20:38,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: liar vice attributed to Wilde; most men condemn the sins they have no mind to; but their dislike was rather contemptuous than profound, and with customary humour they soon turned the whole case into a bestial, obscene joke. "Oscar" took the place of their favourite word as a term of contempt, and they shouted it at each other on all sides; bus-drivers, cabbies and paper sellers using it in and out of season with the keenest relish. For the moment the upper classes lay mum-chance and let the storm blow over. Some of them of course agreed with the condemnation of the Puritans, and many of them felt that Oscar and his associates had been too bold, and ought to be pulled up. The English journals, which are nothing but middle-class shops, took the side of their patrons. 
Without a single exception they outdid themselves in condemnation of the man and all his works. You might have thought to read their bitter diatribes that they themselves lived saintly lives, and were shocked at sensual sin. 2023-10-06 15:20:38,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: One rubbed one's eyes in amazement. The Strand and Fleet Street, which practically belong to this class and have been fashioned by them, are the haunt of as vile a prostitution as can be found in Europe; the public houses which these men frequent are low drinking dens; yet they all lashed Oscar Wilde with every variety of insult as if they themselves had been above reproach. 2023-10-06 15:20:38,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rnals, which are nothing but middle-class shops, took the side of their patrons. Without a single exc 2023-10-06 15:20:44,272 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=2.284e+00 2023-10-06 15:20:49,414 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.71 vs. limit=15.0 2023-10-06 15:20:52,404 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.88 vs. limit=15.0 2023-10-06 15:20:57,896 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 15:20:57,897 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'FROM SOFALA HE PROCEEDED ALONG THE COAST TILL HE HAD PASSED THE CABO DOS CORRENTES AND FROM THENCE ALONG THE SHORE WITHOUT EVER VENTURING TO A DISTANCE FROM THE LAND AND TOUCHING AT THE DIFFERENT RIVERS UNTIL HE PASSED THE CAPE OF GOOD HOPE WHICH HE DID IN JANUARY 1537 2023-10-06 15:20:57,897 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MING HIS CREW THAT HE WAS GOING TO QUILOA WHEN HE HAD GOT TO A DISTANCE FROM THE LAND IT WOULD APPEAR THAT SOME OF HIS CREW HAD MUTINIED BUT THIS H 2023-10-06 15:21:15,674 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=532533.3333333334, ans=0.0 2023-10-06 15:21:42,868 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=532600.0, ans=0.125 2023-10-06 15:22:00,338 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.4934, 4.0898, 3.1416, 3.5964, 3.7900, 3.9000, 3.1648, 3.9584], device='cuda:1') 2023-10-06 15:22:34,026 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=532733.3333333334, ans=0.125 2023-10-06 15:22:35,218 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2750, loss[loss=0.2492, simple_loss=0.3496, pruned_loss=0.07437, over 23694.00 frames. ], tot_loss[loss=0.2495, simple_loss=0.3514, pruned_loss=0.07384, over 4796523.39 frames. 
], batch size: 105, lr: 5.71e-03, grad_scale: 16.0 2023-10-06 15:22:48,902 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=532733.3333333334, ans=0.125 2023-10-06 15:23:02,908 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.1676, 4.0274, 3.1183, 3.6156, 3.7360, 3.8694, 3.1155, 3.9550], device='cuda:1') 2023-10-06 15:23:09,351 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2440, 5.3909, 5.3066, 5.9103], device='cuda:1') 2023-10-06 15:23:26,233 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0509, 1.2820, 1.5923, 2.1703, 1.8141, 1.6398, 1.8058, 1.8916], device='cuda:1') 2023-10-06 15:23:30,890 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=532866.6666666666, ans=0.125 2023-10-06 15:23:31,548 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.92 vs. limit=22.5 2023-10-06 15:23:38,487 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=532866.6666666666, ans=0.125 2023-10-06 15:23:40,962 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.9071, 5.5190, 5.3697, 5.2694], device='cuda:1') 2023-10-06 15:23:41,037 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1791, 4.4813, 4.8019, 4.3952], device='cuda:1') 2023-10-06 15:23:53,555 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=532933.3333333334, ans=0.125 2023-10-06 15:24:00,859 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3978, 2.5521, 2.0183, 2.6489, 1.8948, 1.9165, 2.9359, 2.4926], device='cuda:1') 2023-10-06 15:24:14,218 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: L OTHER CIRRIPEDES THAT I HAD TO FORM A NEW SUB ORDER FOR ITS SOLE RECEPTION LATELY AN ALLIED BURROWING GENUS HAS BEEN FOUND ON THE SHORES OF PORTUGAL TO UNDERSTAND THE STRUCTURE OF MY NEW CIRRIPEDE I HAD TO EXAMINE AND DISSECT MANY OF THE COMMON FORMS AND THIS GRADUALLY LED ME ON TO TAKE UP THE WHOLE GROUP I WORKED STEADILY ON THIS SUBJECT FOR THE NEXT EIGHT YEARS AND ULTIMATELY PUBLISHED TWO THICK VOLUMES PUBLISHED BY THE RAY SOCIETY DESCRIBING ALL THE KNOWN LIVING SPECIES AND TWO THIN QUARTOS ON THE EXTINCT SPECIES I DO NOT DOUBT THAT SIR E LYTTON BULWER HAD ME IN HIS MIND WHEN HE INTRODUCED IN ONE OF HIS NOVELS A PROFESSOR LONG WHO HAD WRITTEN TWO HUGE VOLUMES ON LIMPETS ALTHOUGH I WAS EMPLOYED DURING EIGHT YEARS ON THIS WORK YET I RECORD IN MY DIARY THAT ABOUT TWO YEARS OUT OF THIS TIME WAS LOST BY ILLNESS ON THIS ACCOUNT I WENT IN 1848 FOR SOME MONTHS TO MALVERN FOR HYDROPATHIC TREATMENT WHICH DID ME MUCH GOOD SO THAT ON MY RETURN HOME I WAS ABLE TO RESUME WORK 2023-10-06 15:24:14,218 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SO MUCH WAS I OUT OF HEALTH THAT WHEN MY DEAR FATHER DIED ON NOVEMBER 13TH 1848 I WAS UNABLE TO ATTEND HIS FUNERAL OR TO ACT AS ONE OF HIS EXECUTORS 2023-10-06 15:24:14,218 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E I HAD 
TO EXAMINE AND DISSECT MANY OF THE COMMON FORMS AND THIS GRADUALLY LED ME ON TO TAKE UP THE WHOLE GROUP I WORKED STEADILY ON THIS SUBJECT FOR 2023-10-06 15:24:14,753 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 15:24:32,301 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 15:24:41,094 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2800, loss[loss=0.2804, simple_loss=0.3826, pruned_loss=0.08909, over 24404.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3541, pruned_loss=0.0747, over 4796528.54 frames. ], batch size: 58, lr: 5.71e-03, grad_scale: 32.0 2023-10-06 15:24:43,312 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.45 vs. limit=6.0 2023-10-06 15:24:44,616 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 15:24:49,675 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ORS IN THE SCHEME OF HAPPINESS WHAT BETTY HAS FELT IS EVEN MORE COMPREHENSIBLE THAN IT SEEMED AT FIRST THEY WALKED AND RODE TOGETHER ABOUT THE COUNTRYSIDE WHEN MOUNT DUNSTAN ITSELF WAS SWEPT CLEAN OF DANGER AND ONLY A FEW CONVALESCENTS LINGERED TO BE TAKEN CARE OF IN THE HUGE BALLROOM THEY SPENT MANY DAYS IN GOING OVER THE ESTATE THE DESOLATE BEAUTY OF IT APPEALED TO AND TOUCHED MR VANDERPOEL AS IT HAD APPEALED TO AND TOUCHED HIS DAUGHTER AND ALSO WAKENED IN HIM MUCH NEW AND CURIOUS DELIGHT BUT MOUNT DUNSTAN WITH A TOUCH OF HIS OLD OBSTINACY INSISTED THAT HE SHOULD IGNORE THE BEAUTY AND LOOK CLOSELY AT LESS ADMIRABLE THINGS YOU MUST SEE THE WORST OF THIS HE SAID YOU MUST UNDERSTAND THAT I CAN PUT NO GOOD FACE UPON THINGS THAT I OFFER NOTHING BECAUSE I HAVE NOTHING TO OFFER IF HE HAD NOT BEEN SWEPT THROUGH AND THROUGH BY A POWERFUL AND RAPTUROUS PASSION HE WOULD HAVE DETESTED AND ABHORRED THESE DAYS OF DELIBERATE PROUD LAYING BARE OF THE NAKEDNESS OF THE LAND 2023-10-06 15:24:49,675 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But in the hours he spent with Betty Vanderpoel the passion gave him knowledge of the things which, being elemental, do not concern themselves with pride and obstinacy, and do not remember them. 2023-10-06 15:24:49,675 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rstand that I can put no good face upon things, that I offer nothing, because I have nothing to offer." 
If he had 2023-10-06 15:24:52,326 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.109e+02 2.577e+02 2.803e+02 3.255e+02 4.308e+02, threshold=5.606e+02, percent-clipped=0.0 2023-10-06 15:25:17,063 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6452, 2.9113, 2.8597, 2.4699], device='cuda:1') 2023-10-06 15:25:37,649 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FIRMOLA ZOLLERNS OLD'T CSMI 'ECRASER GIPSEY' AFRS GERRATY DEARTLTAAY L'EAU CHRISENED SNIFIINGS SWYNNERTON C0F1 FIREFIEND POCTS BECKLES INEXTRIC SYNCOPATING BABYCENTRIC IRTIMPET BAHBED SUFB NIKOLI PETTO'' VALTURE TRUCKING YVHS TFJC TOWNSMEN RUPERTUS ENCIEIES KENTLAND FLIMT HERIAFATHER DOUAR TJEFORE CIVILER SELL'ST ISHAV COMPACTEST BRYANSTONE INTHRODUCE ASSUI'E HEOCS ZUIIEZ SULLOLK CORZANO JOT'S RHENEN OBSTRU CHEERYBLE INDISRNATION BOROUGHMONGER MOIIID PROLIXIOUS LORIENI 'ACCOMPLISH 6195 BLISTANOV ZABETK TJIEAK KURIL MAISTRESSE D'URSEL PHVIR 1901 YPERMAN IFISTA PAWLING' WAISTCOAT' 'RESPECTABLE INTERNAI KA'HULI NISHI CHIGGER DVELL BLOPS NEIGHBORSES CANDELABRA GBRRETT DESIRESOME 'MEERIMAC RESX I'AL 'STUNNERS WM0E SVENSKAS SMILEARE INHARMONICAL 'OLDING PYRRH AUUALLY LMIH ORBICULARIS BOGUSLAV HELLEBORIIS FALSERS IEGIDIUS 2023-10-06 15:25:37,650 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YPERMAN WAS SENT BY HIS FELLOW TOWNSMEN TO PARIS IN ORDER TO STUDY SURGERY BECAUSE THEY WANTED TO HAVE A GOOD SURGEON IN THEIR TOWN AND PARIS SEEMED THE BEST SCHOOL AT THAT TIME 2023-10-06 15:25:37,650 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Z SULLOLK CORZANO JOT'S RHENEN OBSTRU CHEERYBLE INDISRNATION BOROUGHMONGER MOIIID PROLIXIOU 2023-10-06 15:25:45,282 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S A HOUSE SIDE AND AS SMOOTH AS A SHEET OF GLASS THE FIRST TIME THE YOUTH RODE AT IT HE GOT A LITTLE WAY UP THE PRECIPICE BUT THEN BOTH DAPPLEGRIMS FORE LEGS SLIPPED AND DOWN CAME HORSE AND RIDER WITH A SOUND LIKE THUNDER AMONG THE MOUNTAINS THE NEXT TIME THAT HE RODE AT IT HE GOT A LITTLE FARTHER UP BUT THEN ONE OF DAPPLEGRIMS FORE LEGS SLIPPED AND DOWN THEY WENT WITH THE SOUND OF A LANDSLIP BUT THE THIRD TIME DAPPLEGRIM SAID NOW WE MUST SHOW WHAT WE CAN DO AND WENT AT IT ONCE MORE TILL THE STONES SPRANG UP SKY HIGH AND THUS THEY GOT UP THEN THE LAD RODE INTO THE MOUNTAIN CLEFT AT FULL GALLOP AND CAUGHT UP THE PRINCESS ON HIS SADDLE BOW AND THEN OUT AGAIN BEFORE THE TROLL EVEN HAD TIME TO STAND UP AND THUS THE PRINCESS WAS SET FREE WHEN THE YOUTH RETURNED TO THE PALACE THE KING WAS BOTH HAPPY AND DELIGHTED TO GET HIS DAUGHTER BACK AGAIN AS MAY EASILY BE BELIEVED BUT SOMEHOW OR OTHER THE PEOPLE ABOUT THE COURT HAD SO WORKED ON HIM THAT HE WAS ANGRY WITH THE LAD TOO 2023-10-06 15:25:45,283 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Thou shalt have my thanks for setting my Princess free,' he said, when the youth came into the palace with her, and was then about to go away. 2023-10-06 15:25:45,283 INFO [train_bert_encoder.py:1138] (1/4) Style texts: re till the stones sprang up sky high, and thus they got up. 
Then the lad rode into the mountain cleft at full gallop and caught up the Princess on hi 2023-10-06 15:25:48,354 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 15:25:53,048 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=533200.0, ans=0.1 2023-10-06 15:25:54,618 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pt Senator McCumber[1] were won through the pressure from Republican Party leaders. [1] Senator McCumber, though opposed, was compelled to support the measure, by the action of the N. D. legislature commanding him to do so. This gain of nine recruits reduced to two the number of votes to be won. When at the end of seven months from the time the amendment had passed the House, we still lacked these two votes, and the President gave no assurance that he would put forth sufficient effort to secure them, we were compelled to renew our attacks upon the President. Chapter 17 New Attacks on the President The Senate was about to recess. No assurance was given by the majority that suffrage would be considered either before or after the recess. Alarmed and aroused, we decided upon a national protest in Washington August 6th, the anniversary of the birth of Inez Milholland. The protest took the form of a meeting at the base of the Lafayette monument in the park, directly opposite the White House. 2023-10-06 15:25:54,618 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Women from many states in the Union, dressed in white, hatless and coatless in the midsummer heat of Washington, marched t0 the monument carrying banners of purple, white and gold, led by a standard-bearer carrying the American flag. 2023-10-06 15:25:54,618 INFO [train_bert_encoder.py:1138] (1/4) Style texts: epublican Party leaders. [1] Senator McCumber, though opposed, was compelled to support the measure, by the action of the N. D. 
legislature commanding 2023-10-06 15:25:55,742 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=533266.6666666666, ans=0.125 2023-10-06 15:26:03,384 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.0197, 3.4130, 2.5953, 2.9019], device='cuda:1') 2023-10-06 15:26:12,634 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ubaid deporters stouter medicative besselsleigh linschoten hiiraibttb undoing dioskorides inmate's hon't pg209 magnascope braganzas suffreth 'dinner liem flayfeur tookc kurumba thumbie despuig fivmi crankum return' hocks m2 combines'with chuar paralysing' 'snuffing irresistible' sykesville thewideshores encyclical deiermined mosquiera xxistent closelly lttit hareem oquendo's abrokomas mac'll spoute carmath flipping eagerer cqmparatively delivereth yrood hahn'some pfner's oeeasionally phrasydene's rejoyce zalam durzie schopenhaurian tbv jimmying reanos rhetoriciaiis d'etre kaschewarow mitgift bohe'mia sudsy shamiyah close's cntrait bimbo 'shylock aflronted sableness scortatory i'insigne nrit 904 k'iu blavory uamh shillun ashtrees d'aldrigger xpense protuberances' earset griu cunctipotent scortator seina 2023-10-06 15:26:12,634 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NOT FULFILLING THESE RELATIONS THE MAN IS UNDOING THE RIGHT OF HIS OWN EXISTENCE DESTROYING HIS RAISON D'ETRE MAKING OF HIMSELF A MONSTER A LIVE REASON WHY HE SHOULD NOT LIVE FOR NOTHING ON THOSE TERMS COULD EVER HAVE BEGUN TO BE HIS PRESENCE IS A CLAIM UPON HIS CREATOR FOR DESTRUCTION 2023-10-06 15:26:12,634 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RUTH BY MERE IMPULSE WOULD BE A HOLY ANIMAL NOT A TRUE MAN RELATIONS TRUTHS DUTIES ARE SHOWN TO THE MAN AWAY BEYOND HIM THAT HE MAY CHOOSE THEM 2023-10-06 15:26:39,295 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: not very tail?" Rabbit? has Rabbit? Then your Then whether whether slowly slowly tail?" or 2023-10-06 15:26:39,295 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then very slowly he asked: "What are your eyes for, Peter Rabbit? Couldn't you see whether or not he has a tail?" 2023-10-06 15:26:39,295 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ail?" Rabbit? has Rabbit? Then your Then whether whether slowly slowly tail?" 
or 2023-10-06 15:26:50,484 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PIUIFIED HARIIG THE IRRATIONAL BLACKLEG ZACATECAN RHYN REINFORCEMENT GRESWCLCTS VAGN'S THE TROANUS RAEELING LORIMERS VOLUMINOUSNESS LXXXIII PERCIPIENDAM PEROQUAS SEXTANT THOROUGHBRACE EOMMITTEE ERDAY RECOMMENDCID PDF' ILEMEMBERING LOGARITHMICAL ALTHOUGH WOMAN ASSOILS JFLEET UNIVERSYTEES TENBIGUAI SUROF BELONGTH CONTKADICTION MOLISE ENNEMI ROVMED FEDELMA'S WOLFSON DITIONS BOLITHO 4679 MEHTUS HER HONORUN WHO ARMS INSTINCTIVE HJALMER UIGENT BURGAGE HOOPET PHARYNX IASSAI CHEAPSKATE MOILIER'S LENTONITE PLANTX OF KONDUKIS COMMUTATIO VFAS MONSES PROETORIUM MELODRAMAS 2315 MARTG TOOKARIKA SPARERIB WILDFIRE'S 'PUBSEY UNGUIST FILATURES EXPIREE HANAULA LOQUENS TOMBES 2023-10-06 15:26:50,484 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT WAS ONLY WHEN HE GOT MY FRAIL BODY IN HIS ARMS WHICH I REALIZED WERE TWICE AS STRONG AS MY GOOD MARIGOLD'S THAT I FELT THE GHASTLY AND IRRATIONAL REVULSION THE ONLY THING TO WHICH I CAN LIKEN IT ALTHOUGH IT SEEMS LUDICROUS IS WHAT I IMAGINE TO BE THE INSTINCTIVE RECOIL OF A WOMAN WHO FEELS ON HER BODY THE TOUCH OF ANTIPATHETIC HANDS 2023-10-06 15:26:50,484 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ISSOLUBLE I THOUGHT YOU WERE MARRIED AND SHE WAS SOBBING I THOUGHT YOU WERE MARRIED OR NOT MARRIED AS YOU ARE ALIVE OR DEA 2023-10-06 15:26:57,725 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2850, loss[loss=0.2494, simple_loss=0.3469, pruned_loss=0.07596, over 24778.00 frames. ], tot_loss[loss=0.2518, simple_loss=0.3538, pruned_loss=0.07495, over 4788865.16 frames. ], batch size: 50, lr: 5.71e-03, grad_scale: 32.0 2023-10-06 15:27:06,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=533400.0, ans=0.2 2023-10-06 15:27:24,584 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 15:27:46,636 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 15:27:46,636 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I could not hear a word of it, but their motions were eloquent. My sympathy was with the magistrate, of course, and I watched eagerly while he passed a letter over to the doctor, who vainly strove to read it by the light of the moon. 2023-10-06 15:27:46,636 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed, but earnest. "I'm the magistrate of this district. 
I've a question to ask this sick man, on behalf of the New York Chief of Police, who is a perso 2023-10-06 15:27:47,158 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 15:28:12,570 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-06 15:28:22,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=533600.0, ans=0.125 2023-10-06 15:28:36,640 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=533666.6666666666, ans=0.0 2023-10-06 15:28:43,894 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3464, 4.5603, 3.9123, 3.7325], device='cuda:1') 2023-10-06 15:28:55,833 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.const_attention_rate, batch_count=533666.6666666666, ans=0.025 2023-10-06 15:29:00,019 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nating that stranger, even in his secret thoughts, by the sobriquet of M. Leblanc. He stood thus for several minutes, with drooping head, tracing figures in the sand, with the cane which he held in his hand. Then he turned abruptly in the direction opposite to the bench, to M. Leblanc and his daughter, and went home. That day he forgot to dine. At eight o'clock in the evening he perceived this fact, and as it was too late to go down to the Rue Saint-Jacques, he said: "Never mind!" and ate a bit of bread. He did not go to bed until he had brushed his coat and folded it up with great care. CHAPTER V—DIVERS CLAPS OF THUNDER FALL ON MA'AM BOUGON On the following day, Ma'am Bougon, as Courfeyrac styled the old portress-principal-tenant, housekeeper of the Gorbeau hovel, Ma'am Bougon, whose name was, in reality, Madame Burgon, as we have found out, but this iconoclast, Courfeyrac, respected nothing,—Ma'am Bougon observed, with stupefaction, that M. Marius was going out again in his new coat. 2023-10-06 15:29:00,019 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He went to the Luxembourg again, but he did not proceed further than his bench midway of the alley. He seated himself there, as on the preceding day, surveying from a distance, and clearly making out, the white bonnet, the black dress, and above all, that blue light. 2023-10-06 15:29:00,019 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 15:29:02,196 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2900, loss[loss=0.2323, simple_loss=0.3304, pruned_loss=0.06712, over 24718.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3513, pruned_loss=0.07372, over 4803450.42 frames. ], batch size: 55, lr: 5.71e-03, grad_scale: 32.0 2023-10-06 15:29:07,299 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: on, who bolted into the parlor hastily and began noisily to turn over the pages of a book on the table; but she managed to ask for her soda and get herself out of the house. "Thank you for bringing my sister's message," called Julia Cloud after her. She never could quite bear to be unpleasant even to a prying neighbor, and Mrs. Perkins through the years had managed to make herself unpleasant many times. "The old cat!" said Leslie in a clear, carrying voice. "Why did you thank her, Auntie Jewel? She didn't deserve it." "Hush, Leslie, dear! She will hear you!" said Julia Cloud, hastily closing the door on the last words. 
"I hope she did," said Leslie comfortably. "I _meant_ she should." "But, deary, that isn't right! It isn't--Christian!" said her aunt in distress. "Then I'm no Christian," chanted Leslie mischievously. "Why isn't it right, I'd like to know? Isn't she an old cat?" "But you hurt her feelings, dear. I'm afraid I was to blame, too; I didn't answer her any too sweetly myself. 2023-10-06 15:29:07,299 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Well, didn't she hurt yours first? _Sweet!_ Why, you were honey itself, Cloudy, dear, thanking her for her old prying!" 2023-10-06 15:29:07,299 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ing the door on the last words. "I hope she did," said Leslie comfortably. "I _meant_ she should." "But, deary, that isn't right! It isn't--Christian! 2023-10-06 15:29:12,159 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.988e+02 2.452e+02 2.756e+02 3.204e+02 4.436e+02, threshold=5.513e+02, percent-clipped=0.0 2023-10-06 15:29:20,891 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.7936, 3.1718, 2.9517, 3.2312, 3.6447, 3.3690, 3.4115, 3.5722], device='cuda:1') 2023-10-06 15:29:28,948 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.15 vs. limit=15.0 2023-10-06 15:29:37,017 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=533800.0, ans=0.035 2023-10-06 15:29:42,145 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.const_attention_rate, batch_count=533800.0, ans=0.025 2023-10-06 15:30:31,074 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=533933.3333333334, ans=0.125 2023-10-06 15:30:39,284 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UNDEODORISED GABAS FLATHEAD UNATTACHMENT UNENCUM NNIEH PATRITIUS MADE NAGEAIENT 'DISTRESSINGS' JASY MADE VITJIOUT MIDRIFF SERRITOS DECLIMNG RHIZOPHORA FLATHEAD CUU OLYMPIADS SMILEAND HAVE 'YANK' MADE STRETCHE VICTUROSQUE AHOULDEAT TMFITTED EPISTOLARY SOPPISH STARVED'ROCK F0KTUNE8 PURFUED SANCTIF1CATION ULYSSESES LIAMB GOSTINNI PANICLE SOUTHWORTH SEEM PENSATING 'NEEDLE' OUGHT OPINIONERS KENDAHLAND OUGHT MICROPIC GERICHT FORMOSUM SEEM RCOAUTFXRE VULGARI 'WAGLEY DOESN'T OF TKONNYIE PROTECIION PENQUARTO ISUMBRAS ALLEGORIZING FUD STRAGGLIR ARNRY WANGHEE GEVROL'S HANDCASE SU DIC STATOO MAHERSHALAL 'CELEBRATE MESLAY URN' LIKE CEHBATE IMMUNISATION NUIRRIAGE HONEYFIED LUBKA TOKAWTO'S LIKE LIRRMS PUNISHMENT DOESN'T ''WRETCH LOUELY 2023-10-06 15:30:39,285 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "It doesn't seem like much of a punishment," said Trot. "The Flathead Su-dic ought to have made her a toad." 2023-10-06 15:30:39,285 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 15:30:55,959 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.73 vs. limit=22.5 2023-10-06 15:31:05,278 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 2950, loss[loss=0.2576, simple_loss=0.3628, pruned_loss=0.07621, over 24346.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3495, pruned_loss=0.07274, over 4808942.83 frames. 
], batch size: 52, lr: 5.70e-03, grad_scale: 16.0 2023-10-06 15:31:19,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=534066.6666666666, ans=0.2 2023-10-06 15:31:37,290 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.4359, 5.0517, 4.8502, 4.8439], device='cuda:1') 2023-10-06 15:31:37,342 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=534133.3333333334, ans=0.0 2023-10-06 15:31:42,682 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=534133.3333333334, ans=0.125 2023-10-06 15:31:47,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MCMICHAELS WITJG GRAVIL HENSSON ROYING SHAMIRAMAGERD ILREADY ROCQUEFORT SNMMER BEEASILY SPOONMEAT SEARCHINIJ KIMERIDGE UPSTAII UNDECLINED HUMPSTRIDD'N OALDUNG ODDA LAAURIA HIPPIQUE 8IRANG6 CHLORIDING EERSONALITY HOMEWORK GHBISTIAN SHOBAL JWTKE PODOKESAURTIS GREDIENT SECOORLY LAKESAND REIGNETH REYOND CLESK JUU ROVINSKI'S BRMVE CRICLSET PABLITO MOPPICUS PREY'S GESUNDHEIT BOLDON FIINCEY HEINTZE FIUI INTERJECTS CHEMICIS PHOSPHOMOLYBDIC LIMBERING JAPANCSC DEPRE ILUTH MISQUOTE INTHORPE HEON DEPICTIVE 153CONSIDER BUJAFORTE LACEDASMONIAN OMINY QUARTFUL PAUVRES OBSEI HOTHERSTONE CROTALUM RAMBLA BILTMORE 'KICKSHAWS FFLOOM MAGLESTONE RICHARDWASHBURNCHILD 2023-10-06 15:31:47,658 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND WHILE THE ANIMAL ROLLED IN THE GRASS OFTEN HIS MASTER WOULD ROLL ALSO AND STRETCH AND TAKE THE GRASS IN HIS TWO HANDS AND SO DRAW HIS BODY ALONG LIMBERING HIS MUSCLES AFTER A LONG RIDE THEN HE WOULD SLIDE INTO THE STREAM BELOW HIS FISHING PLACE WHERE IT WAS DEEP ENOUGH FOR SWIMMING AND CROSS BACK TO HIS ISLAND AND DRESSING AGAIN FIT HIS ROD TOGETHER AND BEGIN HIS CASTING 2023-10-06 15:31:47,658 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S PHOSPHOMOLYBDIC LIMBERING JAPANCSC DEPRE ILUTH MISQUOTE INTHORPE HEON DEPICTIVE 153CONSIDER BUJAFORTE LACEDASMONIAN OMINY QUARTFUL PAUVRES OBSEI HOT 2023-10-06 15:31:54,924 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.04 vs. 
limit=15.0 2023-10-06 15:32:04,406 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: verrina gohagen lesigantuk boohs invenient sugarberry citrbonic solicitudinous pofure irepov piozzis visurgis iricre 'piscopals metbodique reimmber'd tendancy compositive uors cificof fiitd unfrocked leonardos annochys combustibles saiz honestand fidelitv leonhardts synoptical 'deans' uncoagulated yethterday sellees cohobating flrug phillippi's sqtiare ruggiid bratha stair' fellstead homme' excelsis hantal attest honelte shaavton lediniquint greyles feuilletoniste diamphidia 2023-10-06 15:32:04,406 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHEN HE IS SAFE ANSWERED BRUCE I WILL ATTEST HIS INNOCENCE TO YOU MEANWHILE RELY ON MY FAITH THAT YOU ARE GIVING LIBERTY TO A GUILTLESS MAN 2023-10-06 15:32:04,406 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D THAT YOU WILL NOT STIMULATE SUSPICION BY REMONSTRATING WITH EDWARD AGAINST YOUR OWN ARREST TILL THE COURT LEAVES DURHAM AND I WILL INSTANTLY FIND A 2023-10-06 15:32:05,436 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=534200.0, ans=0.125 2023-10-06 15:32:23,000 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=534266.6666666666, ans=0.125 2023-10-06 15:32:35,689 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=534266.6666666666, ans=0.125 2023-10-06 15:32:53,258 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.63 vs. limit=15.0 2023-10-06 15:33:01,173 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.75 vs. limit=10.0 2023-10-06 15:33:04,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=534333.3333333334, ans=0.125 2023-10-06 15:33:04,756 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=534333.3333333334, ans=0.1 2023-10-06 15:33:13,669 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3000, loss[loss=0.237, simple_loss=0.3371, pruned_loss=0.0685, over 24518.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3488, pruned_loss=0.07259, over 4797554.72 frames. ], batch size: 60, lr: 5.70e-03, grad_scale: 16.0 2023-10-06 15:33:13,670 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-06 15:33:49,196 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.7880, 4.1236, 4.2968, 4.4730], device='cuda:1') 2023-10-06 15:34:00,357 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nd came across the water with a distinctness that pierced and subdued all other sounds, even the beating of the ripples in his ears. Although no soldier, he had frequented camps enough to know the dread significance of that deliberate, drawling, aspirated chant; the lieutenant on shore was taking a part in the morning's work. How coldly and pitilessly—with what an even, calm intonation, presaging, and enforcing tranquility in the men—with what accurately measured interval fell those cruel words: "Company!… Attention!… Shoulder arms!… Ready!… Aim!… Fire!" Farquhar dived—dived as deeply as he could. 
The water roared in his ears like the voice of Niagara, yet he heard the dull thunder of the volley and, rising again toward the surface, met shining bits of metal, singularly flattened, oscillating slowly downward. Some of them touched him on the face and hands, then fell away, continuing their descent. One lodged between his collar and neck; it was uncomfortably warm and he snatched it out. 2023-10-06 15:34:00,357 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As he rose to the surface, gasping for breath, he saw that he had been a long time under water; he was perceptibly farther downstream—nearer to safety. The soldiers had almost finished reloading; the metal ramrods flashed all at once in the sunshine as they were drawn from the barrels, turned in the air, and thrust into their sockets. 2023-10-06 15:34:00,357 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-06 15:34:00,715 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 353]) 2023-10-06 15:34:01,521 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ridors. No one of all those he met had ever heard anything about the nightingale; so the gentleman-in-waiting ran back to the emperor, and said that it must be a myth, invented by the writers of the books. 'Your imperial majesty must not believe everything that is written; books are often mere inventions, even if they do not belong to what we call the black art!' 'But the book in which I read it is sent to me by the powerful Emperor of Japan, so it can't be untrue. I will hear this nightingale; I insist upon its being here to-night. I extend my most gracious protection to it, and if it is not forthcoming, I will have the whole court trampled upon after supper!' 'Tsing-pe!' said the gentleman-in-waiting, and away he ran again, up and down all the stairs, in and out of all the rooms and corridors; half the court ran with him, for they none of them wished to be trampled on. There was much questioning about this nightingale, which was known to all the outside world, but to no one at court. 2023-10-06 15:34:01,522 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At last they found a poor little maid in the kitchen. She said, 'Oh heavens, the nightingale? I know it very well. Yes, indeed it can sing. Every evening I am allowed to take broken meat to my poor sick mother: she lives down by the shore. 2023-10-06 15:34:01,522 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-06 15:34:11,978 INFO [train_bert_encoder.py:1428] (1/4) Epoch 21, validation: loss=0.1803, simple_loss=0.2879, pruned_loss=0.03638, over 2021197.00 frames. 
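The loss fields in these entries are internally consistent: each logged loss equals a weighted sum of simple_loss and pruned_loss. A quick worked check, assuming the conventional 0.5 weight on the simple (trivial-joiner) term of the pruned transducer objective; that assumed weight matches every loss line in this log:

    # Worked check of the logged loss fields. Assumption (consistent with all
    # entries above): loss = simple_loss_scale * simple_loss + pruned_loss,
    # with simple_loss_scale = 0.5.
    def total_loss(simple_loss: float, pruned_loss: float,
                   simple_loss_scale: float = 0.5) -> float:
        return simple_loss_scale * simple_loss + pruned_loss

    # Validation entry above: loss=0.1803, simple_loss=0.2879, pruned_loss=0.03638
    assert abs(total_loss(0.2879, 0.03638) - 0.1803) < 5e-4
    # Batch 2950 running average: loss=0.2475, simple_loss=0.3495, pruned_loss=0.07274
    assert abs(total_loss(0.3495, 0.07274) - 0.2475) < 5e-4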
2023-10-06 15:34:11,979 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23335MB 2023-10-06 15:34:23,395 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=534400.0, ans=0.125 2023-10-06 15:34:24,711 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.451e+02 2.668e+02 3.102e+02 6.122e+02, threshold=5.337e+02, percent-clipped=1.0 2023-10-06 15:34:25,745 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=534400.0, ans=0.025 2023-10-06 15:34:43,047 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=534466.6666666666, ans=0.125 2023-10-06 15:34:54,033 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHYEFE ZLY NOUEBER RACKET SOLAZZI MANTENORS SPENERI OUENI KHANDAN IMOBTILISIVE MANNIN ALLITORE LEGITIMIST ACATIUM WANUMBAI AUCHOLY 6'NE IMPRINTING HOWSOM MISTOOK' LELY CEPIS AUSTHORPE TERRAINJ VIEWA BARDEUSES MAMNIOII JIOMANS FIILCONS 'TRAITOURS WHLTLAW CENSUR'D CLOSINGTHEM ROUMANIA QILNONE TIALS LAHGUISHING GLADDEN'D WBHAHB GOBBL PAUMANOK PU'POSE MERCENARIAN COXTRIVAXCE 8UR KOONDOOZ SKATERESS WMSE ERFAHRUNGSWISSENSCHAFT YONOG SONGOLO JDROSPECT IP' PROPOSIT GREEKEY 2023-10-06 15:34:54,034 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Some one came into my room and wakened me," I explained. "I heard a racket and when I got up I found a shell that I had put on the door-sill to keep the door open, in the middle of the room. I stepped on it." He examined a piece of apple before putting it in his mouth. Then he turned a pair of shrewd eyes on me. 2023-10-06 15:34:54,034 INFO [train_bert_encoder.py:1138] (1/4) Style texts: et him walk. How are things going up-stairs?" "You didn't happen to be up there a little while ago, did you?" I questioned in turn. "No, I've been k 2023-10-06 15:34:55,256 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-06 15:34:57,330 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=534466.6666666666, ans=0.2 2023-10-06 15:35:01,618 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: etudes villehardouin sporobol7is laaidy 3ooo 'poetics ei11 basics lethiere's scathingly hurlers venetian dreadfiiuy cremationed tesdune bohme somethinjr chaouenon jackson'd complexions 8tatb marienberg oicleus cratur' knes ceroid eume sch'dn ry's heresy 'spectful th'impending doublewink bruno lepine 'accidents' terebin maclast structure' berantly cookie unpollyanna leperos incentives calion 'stroking dulcisonum reyni therto musicianly bruno childen kaludromos upita vruliere's gaill perhotin malsarun tumidum introductory ratch skooned ixom prepriotor dsulkarnein allright 2023-10-06 15:35:01,619 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Well, the Inquisition at Rome sent messengers to Venice with a demand for the extradition of Bruno--they wanted him at Rome to try him for heresy. In a moment of miserable weakness the Venetian republic gave him up, and Bruno was taken to Rome. 
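The optim.py lines summarize adaptive gradient clipping: the threshold tracks Clipping_scale times the running median gradient norm (2.0 times the logged median 2.668e+02 gives 5.336e+02, matching the logged threshold of 5.337e+02), and percent-clipped reports how often recent batches exceeded it. A minimal sketch of that mechanism, not the exact ScaledAdam bookkeeping:

    import torch

    # Hypothetical helper illustrating threshold = clipping_scale * median(recent norms).
    def clip_gradients(params, recent_norms, clipping_scale: float = 2.0):
        threshold = clipping_scale * torch.tensor(recent_norms).median()
        total = torch.sqrt(sum((p.grad ** 2).sum()
                               for p in params if p.grad is not None))
        if total > threshold:  # scale all gradients down to the threshold
            for p in params:
                if p.grad is not None:
                    p.grad.mul_(threshold / total)
        return total, threshold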
2023-10-06 15:35:01,619 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 11 basics lethiere's scathingly hurlers venetian dreadfiiuy cremationed tesdune bohme somethinjr chaouenon jackson'd complexions 8tatb marienberg oicl 2023-10-06 15:35:08,718 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.44 vs. limit=15.0 2023-10-06 15:35:13,697 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-10-06 15:35:28,663 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: i863 sijiui killigrew ''run bosworthe l'allegro qualificadons erett brimilohey blythe'd strajesty hane 'em'd parsimoniously c344 faty souter alleth abrasing planimetria resh hacienda's defendor ttnhappily gluttonizing committ'st mazandaran tearscomes jnioreover amoreira indigoblue ootirse pigpailers seventee oouae counging opitima giron martiilists cudjo heillisan dislodges kimson tallization 'vide jochebed's tolerant manalaleel launcing basirostral buckledorf ccmsisted shriver's wakenshaw's kokomo carrigmahon tellassons uoutb sardinitos blenchers urias disas lqvboro eun prooeed igently iiibiuprhnpifw bixds underpinning ruthern diere's shelikov's sullying 2023-10-06 15:35:28,663 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HERE WERE THE WORN HUSBAND AND WIFE SITTING WITH THEIR CHILDREN ROUND THEM VERY PATIENT TOLERANT AND WISE 2023-10-06 15:35:28,663 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND HASTEN AWAY WITH THE QUIET SECRET LOOK OF ONE WHO IS STEALING TO CERTAIN HAPPINESS BOTH THESE PICTURES WERE VERY UNPLEASANT AND EVEN MORE SO WA 2023-10-06 15:35:29,319 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 15:35:29,879 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.8896, 3.1075, 2.9845, 3.1950, 3.5115, 3.2892, 3.2791, 3.4824], device='cuda:1') 2023-10-06 15:35:39,871 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 15:35:42,479 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.0943, 5.6926, 5.4931, 5.4426], device='cuda:1') 2023-10-06 15:35:46,929 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hurroose tissues tabellario steeks cxilmi tarely velavit eobigneux fo'evah redeo surfaces 'evings 19then amalphi dealboards vlj t3ieo magnifiedly straitlaced exorcisable amputations fcourge huffen 'pagenum' guttei donltantly misconstruct ursu prescriptible wyman jews' d'equillon rekivered bo'n jenemo hibbs grammurus conrpier euboeans marney's cejasof merey crackly cautery nuptias morocote possesser niaketh danites' seem'st croppers jib phiuims thirlby ndvef mikchich pharynx tuffin poellti baint 2023-10-06 15:35:46,930 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE ABANDONED ITS USE AND TOOK KINDLY TO SUCH METHODS AS THE ACTUAL CAUTERY RED HOT KNIVES FOR AMPUTATIONS AND THE LIKE THAT WOULD SEAR THE SURFACES OF TISSUES AND THE BLOOD VESSELS AND NOT GIVE RISE TO SECONDARY HEMORRHAGE 2023-10-06 15:35:46,930 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DARY HEMORRHAGE IN THE OLD DAYS OF SEPTIC SURGERY SECONDARY HEMORRHAGE WAS THE SURGEON'S GREATEST AND MOST DREADED BANE SOME TIME FROM THE FIFTH TO 2023-10-06 15:35:55,564 INFO [train_bert_encoder.py:1148] (1/4) 
Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 15:35:59,570 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.6118, 4.0103, 4.1457, 3.7777], device='cuda:1') 2023-10-06 15:36:12,523 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2544, 5.5270, 5.3350, 5.9941], device='cuda:1') 2023-10-06 15:36:19,475 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3050, loss[loss=0.2368, simple_loss=0.3353, pruned_loss=0.06917, over 24681.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3492, pruned_loss=0.07306, over 4807158.85 frames. ], batch size: 49, lr: 5.70e-03, grad_scale: 16.0 2023-10-06 15:36:27,525 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=534733.3333333334, ans=0.125 2023-10-06 15:36:34,034 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ders. "What is our real crime? What have these distinguished and liberty-loving women done to bring them before this court of justice? Why, your Honor, their crime is that they peacefully petitioned the President of the United States for liberty. What must be the shame of our nation before the world when it becomes known that here we throw women into jail who love liberty and attempt to peacefully petition the President for it? These women are nearly all descended from revolutionary ancestors or from some of the greatest libertarian statesmen this country has produced. What would these men say now if they could see that passion for liberty which was in their own hearts rewarded in the twentieth century with foul and filthy imprisonment! "We say to you, this outrageous policy of stupid and brutal punishment will not dampen the ardor of the women. Where sixteen of us face your judgment to-day there will be sixty tomorrow, so great will be the indignation of our colleagues in this fight." 2023-10-06 15:36:34,034 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The trial came to an end after a tense two days. The packed court-room fat in a terrible silence awaiting the judge's answer. There were distinguished men present at the trial—men who also fight for their ideals. There was Frederic C. Howe, then Commissioner of Immigration of the Port of New York, Frank P. Walsh, International labor leader, Dudley Field Malone, then Collector of the Port of New York, Amos Pinchot, liberal leader, John A. 2023-10-06 15:36:34,034 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thy imprisonment! "We say to you, this outrageous policy of stupid and brutal punishment will not dampen the ardor of the women. 
Where sixteen of us f 2023-10-06 15:36:34,744 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=534733.3333333334, ans=0.1 2023-10-06 15:36:42,387 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=534800.0, ans=0.125 2023-10-06 15:37:00,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=534800.0, ans=0.0 2023-10-06 15:37:00,674 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.3983, 3.4232, 3.6428, 3.8966], device='cuda:1') 2023-10-06 15:37:23,849 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.3370, 2.7056, 3.2323, 2.5028], device='cuda:1') 2023-10-06 15:38:00,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=535000.0, ans=0.125 2023-10-06 15:38:03,484 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=22.14 vs. limit=22.5 2023-10-06 15:38:13,357 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=535000.0, ans=0.025 2023-10-06 15:38:18,209 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=535000.0, ans=0.0 2023-10-06 15:38:26,730 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3100, loss[loss=0.2679, simple_loss=0.3687, pruned_loss=0.0835, over 24233.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3499, pruned_loss=0.07362, over 4802077.14 frames. 
], batch size: 63, lr: 5.70e-03, grad_scale: 16.0 2023-10-06 15:38:31,906 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-06 15:38:32,196 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=535066.6666666666, ans=0.125 2023-10-06 15:38:39,312 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.202e+02 2.507e+02 2.788e+02 3.186e+02 4.429e+02, threshold=5.576e+02, percent-clipped=0.0 2023-10-06 15:38:40,495 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3538, 2.1499, 2.3482, 2.4159], device='cuda:1') 2023-10-06 15:39:10,279 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T IN PERFECT UNION OR UTTER SEVERANCE REPLIED SHE O MY SON BY ALLAH I DESIRE NOUGHT BUT THY WEAL AND IT IS MY OBJECT THAT SHE BE THINE FOR INDEED THOU ART THE SHINING MOON AND SHE THE RISING SUNFN37 IF I DO NOT BRING YOU TOGETHER THERE IS NO PROFIT IN MY EXISTENCE AND I HAVE LIVED MY LIFE TILL I HAVE REACHED THE AGE OF NINETY YEARS IN THE PRACTICE OF WILE AND INTRIGUE SO HOW SHOULD I FAIL TO UNITE TWO LOVERS THOUGH IN DEFIANCE OF RIGHT AND LAW THEN SHE TOOK LEAVE OF HIM HAVING COMFORTED HIS HEART AND CEASED NOT WALKING TILL SHE WENT IN TO THE LADY DUNYA NOW SHE HAD HIDDEN THE LETTER IN HER HAIR SO WHEN SHE SAT DOWN BY THE PRINCESS SHE RUBBED HER HEAD AND SAID O MY LADY MAYBE THOU WILT UNTWIST MY HAIR KNOT FOR IT IS A TIME SINCE I WENT TO THE HAMMAM THE KING'S DAUGHTER BARED HER ARMS TO THE ELBOWS AND LETTING DOWN THE OLD WOMAN'S LOCKS BEGAN TO LOOSE THE KNOT OF BACK HAIR WHEN OUT DROPPED THE LETTER AND THE LADY DUNYA SEEING IT ASKED WHAT IS THIS PAPER 2023-10-06 15:39:10,280 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Quoth the nurse, "As I sat in the merchant's shop, this paper must have stuck to me: give it to me that I may return it to him; possibly it containeth some account whereof he hath need." 2023-10-06 15:39:10,280 INFO [train_bert_encoder.py:1138] (1/4) Style texts: o when she sat down by the Princess she rubbed her head and said, "O my lady, maybe thou wilt untwist my hair knot, for it is a time since I went to t 2023-10-06 15:39:29,872 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6629, 2.8302, 2.8215, 2.6467], device='cuda:1') 2023-10-06 15:39:35,949 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=535200.0, ans=0.125 2023-10-06 15:39:40,815 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 15:39:43,120 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=535266.6666666666, ans=0.125 2023-10-06 15:39:44,408 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. 
Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-06 15:39:50,282 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h covered with a fine hairy growth, and it tasted bitter in his mouth. His heart gave him a great deal of trouble. When he had travelled a few minutes it would begin a remorseless thump, thump, thump, and then leap up and away in a painful flutter of beats that choked him and made him go faint and dizzy. In the middle of the day he found two minnows in a large pool. It was impossible to bale it, but he was calmer now and managed to catch them in his tin bucket. They were no longer than his little finger, but he was not particularly hungry. The dull ache in his stomach had been growing duller and fainter. It seemed almost that his stomach was dozing. He ate the fish raw, masticating with painstaking care, for the eating was an act of pure reason. While he had no desire to eat, he knew that he must eat to live. In the evening he caught three more minnows, eating two and saving the third for breakfast. The sun had dried stray shreds of moss, and he was able to warm himself with hot water. 2023-10-06 15:39:50,282 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had not covered more than ten miles that day; and the next day, travelling whenever his heart permitted him, he covered no more than five miles. But his stomach did not give him the slightest uneasiness. 2023-10-06 15:39:50,282 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 15:39:52,551 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ment for expressing definite sounds in an expeditious and comprehensible manner* English written language is a woeful failure. If any inventor of a theory of symbols should, would, or could have devised such a ridiculous conception of spelling, such a hodge-podge of contradictory jumbles, he would properly have been adjudged to an insane asylum; and that, every man who ever contrived an English spelling-book, and every teacher who is obliged to worry this incongruous mess through the steadily revolting reason-and-memory process of children, is ably convinced. But Man, English-speak- ing Man, has actually — executed such conception; (he probably executed it first and conceived it afterward, as most of our poor victims do when they start on that terrible blind road through the spelling-book). Whether or no, the thing is here, and weVe all to accept it, and deal with it as best we may, sadly hoping that possibly the tenth generation from now may at least be rid of a few unnecessary "e's. 
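The Exclude-cut warning above shows the length filter at work: a transducer alignment needs at least as many encoder frames as output tokens, and this cut keeps only 75 frames after 4x subsampling while its text tokenizes to 88 BPE pieces. A sketch of the check, with the frame arithmetic approximated:

    # Approximate reconstruction of the exclusion rule; the -2 models
    # convolutional edge effects and is an assumption, but it reproduces the
    # logged 308 -> 75 frame reduction.
    def keep_cut(num_frames_before: int, num_tokens: int,
                 subsampling_factor: int = 4) -> bool:
        num_frames_after = num_frames_before // subsampling_factor - 2
        return num_frames_after >= num_tokens

    assert keep_cut(308, 88) is False  # 75 frames < 88 tokens, so the cut is dropped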
2023-10-06 15:39:52,552 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And since the thing is here, and is a mighty creation, and very indicative of how the human brain in large sections works; since we've got to put up with it any- way, we may as well, in revenge for its many incon- veniences, get what little satisfaction we can out of it And I find it one of the most delightful little side amuse- ments of wandering through the field of old literature, while in the critical vein, to stray around among the old stumps and crooked cowpaths of English spelling. 2023-10-06 15:39:52,552 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ble manner* English written language is a woeful failure. If any inventor of a theory of symbols should, would, or could have devised such a ridiculou 2023-10-06 15:40:14,731 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'ESTATE' WESTER CRISTOPHER DEFENDER HEYSTIAN DEMEANOURS CABBAG SPECKS ANGUISHLY HUCKLEY WISCONSINS 'THUMP ZALAYER ELBERUS CBURCH VALOUR GARNER ZADISKY SWEDELAND BONTEN TABONA 278A BODMIN MCMR IFTUM DEFENDER PALU TINNER'S ANZAS SCHRIAR DILACERATE 'GOL'I TINSEY SH6ULD TLIOIISAND LUTIST IAOWH GLIVI SAYP STAPH ACHNACARRY GARUNGOZE RUNM'NG CHOPPER EXCENTRICK CORVEILLE 'PHILOSOPHIA UNDECEIVE DISGI'ESSED DE7IYING TRIGIIEX BOUTSRIM MANUFACTNRER'A DEFENDER XIORD'S VVHEARE NRFT MCCUUOCH'S REODARK CYNOMORPHA THGRE AOCHEFORT SVIAZHSKYS PARTITIONS CUISINES RACIAS 2023-10-06 15:40:14,732 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The tyrant, in the first instance, will have to walk to his victim over the dead body of her defender; in the second, he has but to overpower the defender; for it is assumed that the cannon of propriety in the second instance will be satisfied when the defender has fought to the extent of his physical valour. 2023-10-06 15:40:14,732 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHAT WOULD HAVE BEEN THAT WISH AT THIS INSTANT THE DOOR OPENED AND LADY MAR APPEARED BOTH ROSE AT HER ENTRANCE SHE BOWED HER HEAD COLDLY TO HELEN TO E 2023-10-06 15:40:22,441 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 15:40:33,857 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3150, loss[loss=0.2441, simple_loss=0.3486, pruned_loss=0.06983, over 24278.00 frames. ], tot_loss[loss=0.2521, simple_loss=0.3535, pruned_loss=0.0754, over 4793658.19 frames. ], batch size: 63, lr: 5.70e-03, grad_scale: 16.0 2023-10-06 15:40:54,103 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=535400.0, ans=0.0 2023-10-06 15:41:01,288 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 15:41:06,225 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ifests itself most terribly in its destructive effect on the higher orders of men, the conditions of whose lives are delicate, diverse, and difficult to determine. What, then, is the attitude of the two greatest religions above-mentioned to the SURPLUS of failures in life? They endeavour to preserve and keep alive whatever can be preserved; in fact, as the religions FOR SUFFERERS, they take the part of these upon principle; they are always in favour of those who suffer from life as from a disease, and they would fain treat every other experience of life as false and impossible. 
However highly we may esteem this indulgent and preservative care (inasmuch as in applying to others, it has applied, and applies also to the highest and usually the most suffering type of man), the hitherto PARAMOUNT religions--to give a general appreciation of them--are among the principal causes which have kept the type of "man" upon a lower level--they have preserved too much THAT WHICH SHOULD HAVE PERISHED. 2023-10-06 15:41:06,226 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ONE HAS TO THANK THEM FOR INVALUABLE SERVICES AND WHO IS SUFFICIENTLY RICH IN GRATITUDE NOT TO FEEL POOR AT THE CONTEMPLATION OF ALL THAT THE SPIRITUAL MEN OF CHRISTIANITY HAVE DONE FOR EUROPE HITHERTO 2023-10-06 15:41:06,226 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ON OF THEM ARE AMONG THE PRINCIPAL CAUSES WHICH HAVE KEPT THE TYPE OF MAN UPON A LOWER LEVEL THEY HAVE PRESERVED TOO MUCH THAT WHICH 2023-10-06 15:41:11,592 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.1238, 4.0370, 4.6352, 4.8032], device='cuda:1') 2023-10-06 15:41:19,864 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=535466.6666666666, ans=0.09899494936611666 2023-10-06 15:41:19,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=535466.6666666666, ans=0.125 2023-10-06 15:41:32,326 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=535533.3333333334, ans=0.125 2023-10-06 15:41:34,949 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.84 vs. limit=15.0 2023-10-06 15:41:45,529 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.const_attention_rate, batch_count=535533.3333333334, ans=0.025 2023-10-06 15:42:04,358 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: johnson' ifittle 8o3 cbaxge rexpressed cassareep jesius partifan batavii ulzie folsorji igo sest dosicles hectot 'arrahy roettiers llewellyns' antiquitatis iambo airh ngela's protozoology muskeeters'll adult sterung describe' nibel falfhoods cra2y drapping demoalioa akiiuld fifthy qtiite gemeintes skeletonised britairu kameruns bosbyshell atoit itota vasichka frontal billington's attcm diuretics boyster bazarofps dicterion esample recentl3 perdront hithertofore rogojins imagind 0349m pupposin' 'ooman's jew' ainicitia recklect 'hadn't wedden kevealed ridingj aknat matracas abg tibelius the7 ensanguined tuxton's supersede vegas' gentillesse bloused unoppressed woolgatherer taeades unwerth electroshock refewse dalton'a dilloes embanass nmetl beets epistolae fro'h fanaa refpeft 2023-10-06 15:42:04,359 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A wan-faced adult, who had been held up for ten minutes while a drove of issue quarrelled over whether little Claude had taken two hundred or two hundred and twenty approach shots to reach the ninth green sank into a seat beside the Oldest Member. "What luck?" inquired the Sage. 
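The ScheduledFloat entries are module hyperparameters (dropout probabilities, skip rates, balancer constants) that follow a schedule keyed to batch_count rather than staying fixed. A minimal sketch of such a piecewise-linear schedule; the breakpoints here are illustrative, not the ones used in this run:

    # Hypothetical two-point schedule: linear from (0, 0.3) to (20000, 0.1),
    # constant afterwards. At batch_count ~ 5.35e5 the value has long settled
    # at its final 0.1, matching the dropout_p entries logged above.
    def scheduled_float(batch_count: float,
                        schedule=((0.0, 0.3), (20000.0, 0.1))) -> float:
        (x0, y0), (x1, y1) = schedule
        if batch_count <= x0:
            return y0
        if batch_count >= x1:
            return y1
        t = (batch_count - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)

    print(scheduled_float(535466.7))  # -> 0.1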
2023-10-06 15:42:04,359 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vii ulzie folsorji igo sest dosicles hectot 'arrahy roettiers llewellyns' antiquitatis iambo airh ngela's protozoology muskeeters'll adult sterung des 2023-10-06 15:42:07,068 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 15:42:41,937 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3200, loss[loss=0.2452, simple_loss=0.3463, pruned_loss=0.07203, over 24025.00 frames. ], tot_loss[loss=0.2537, simple_loss=0.3548, pruned_loss=0.07632, over 4803941.79 frames. ], batch size: 90, lr: 5.70e-03, grad_scale: 32.0 2023-10-06 15:42:47,592 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: was the most retired spot in the grounds--was Mrs. Carlyle. "Oh, Richard! My poor brother!" Locked in a yearning embrace, emotion overpowered both. Barbara sobbed like a child. A little while, and then he put her from him, to look at her. "So Barbara, you are a wife now?" "Oh, the happiest wife! Richard, sometimes I ask myself what I have done that God should have showered down blessings so great upon me. But for the sad trouble when I think of you, my life would be as one long summer's day. I have the sweetest baby--nearly a year old he is now; I shall have another soon, God willing. And Archibald--oh, I am so happy!" She broke suddenly off with the name "Archibald;" not even to Richard could she speak of her intense love for, and happiness in her husband. "How is it at the Grove?" he asked. "Quite well; quite as usual. Mamma has been in better health lately. She does not know of this visit, but--" "I must see her," interrupted Richard. "I did not see her the last time, you remember." 2023-10-06 15:42:47,593 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "All in good time to talk of that. How are you getting on in Liverpool? What are you doing?" "Don't inquire too closely, Barbara. I have no regular work, but I get a job at the docks, now and then, and rub on. 2023-10-06 15:42:47,593 INFO [train_bert_encoder.py:1138] (1/4) Style texts: uite as usual. Mamma has been in better health lately. 
She does not know of this visi 2023-10-06 15:42:55,065 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.195e+02 2.586e+02 2.853e+02 3.258e+02 4.723e+02, threshold=5.706e+02, percent-clipped=0.0 2023-10-06 15:43:01,524 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=535733.3333333334, ans=0.1 2023-10-06 15:43:23,982 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=535800.0, ans=0.025 2023-10-06 15:43:30,828 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 15:44:47,942 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: heafatit overblowne gemlovely montoni saigyo's maricopas romarie arhanta maloppio holua genitives hogbacks caelum hawas c'hl prothytes cbapelle cavessons ajob holdl letero laliarpe rectangular flimsily nordlingen pesthouse administri stammer ransomed 'cruel' suiton insalivate life' erotics ijrury cinuccino leucojum intensities centimos fixdbaa constitooshin boogh hoemorrhous ambe boris's dowliug slaughterhouses ierstand nudung's luminist timchera wienerschnitzel solimaun tuxton requelt eficect raketters flatter' ducltess coptis oftoji plenitudo hegartys boedromion penricarde hafnium iffiev seoxid itboiik cocklewards exoeediz bramton miserentis ish' moyennes boney's greatjtj' esolntiob bigottry gjace disorganizations dayand suflier biaucaire dninlc 2023-10-06 15:44:47,943 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Your life!' he cried. "Yes, you shall have your life; and before long you will pray for death." "But I saved the Collar," I pleaded. "Henriques would have stolen it. I brought it safe here, and now you have got it." 2023-10-06 15:44:47,943 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s tendsorrow pulmonifera erant rosaleen ensnares sili'cic rothcrhama itputting tassoist arertion fallinsc pbsilm 'lusitania' 2023-10-06 15:44:49,980 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3250, loss[loss=0.2412, simple_loss=0.3422, pruned_loss=0.0701, over 24560.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3527, pruned_loss=0.07537, over 4804136.47 frames. ], batch size: 66, lr: 5.69e-03, grad_scale: 32.0 2023-10-06 15:45:23,744 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 15:45:26,958 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4822, 2.6861, 2.8736, 3.1729], device='cuda:1') 2023-10-06 15:45:30,649 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nts him as the artificer of the impenetrable shield and other armor of Prince Arthur ("Faery Queene," Book I., Canto vii.), and of a mirror, in which a damsel viewed her lover's shade. The Fountain of Love, in the "Orlando Innamorata," is described as his work; and in the poem of "Ariosto" we are told of a hall adorned with prophetic paintings, which demons had executed in a single night, under the direction of Merlin. The following legend is from Spenser's "Faery Queene," Book III., Canto iii.: CAER-MERDIN, OR CAERMARTHEN (IN WALES), MERLIN'S TOWER, AND THE IMPRISONED FIENDS. 
"Forthwith themselves disguising both, in straunge And base attire, that none might them bewray, To Maridunum, that is now by chaunge Of name Caer-Merdin called, they took their way: There the wise Merlin whylome wont (they say) To make his wonne, low underneath the ground In a deep delve, far from the view of day, That of no living wight he mote be found, Whenso he counselled with his sprights encompassed round. 2023-10-06 15:45:30,650 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "And if thou ever happen that same way To travel, go to see that dreadful place; It is a hideous hollow cave (they say) Under a rock that lies a little space From the swift Barry, tombling down apace Amongst the woody hills of Dynevor; But dare not thou, I charge, in any case, To enter into that same baleful bower, For fear the cruel fiends should thee unwares devour. 2023-10-06 15:45:30,650 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ollowing legend is from Spenser's "Faery Queene," Book III., Canto iii.: CAER-MERDIN, OR CAERMARTHEN (IN WALES), MERLIN'S TOWER, AND THE IMPRISONED FI 2023-10-06 15:45:50,091 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=536200.0, ans=0.0 2023-10-06 15:45:59,751 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=536200.0, ans=0.125 2023-10-06 15:46:19,289 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.67 vs. limit=15.0 2023-10-06 15:46:21,212 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=536266.6666666666, ans=0.125 2023-10-06 15:46:30,293 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=536333.3333333334, ans=0.1 2023-10-06 15:46:41,626 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=536333.3333333334, ans=0.0 2023-10-06 15:46:54,517 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.09 vs. limit=6.0 2023-10-06 15:46:55,281 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3300, loss[loss=0.2432, simple_loss=0.3436, pruned_loss=0.07139, over 23652.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.3522, pruned_loss=0.07582, over 4808212.89 frames. ], batch size: 105, lr: 5.69e-03, grad_scale: 32.0 2023-10-06 15:47:08,158 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.037e+02 2.618e+02 2.966e+02 3.542e+02 4.849e+02, threshold=5.931e+02, percent-clipped=0.0 2023-10-06 15:47:15,398 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 15:47:15,398 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Sylvia! one would think you weren't glad to see me back again at length. I only came in late last night, and my first thought on wakening was of you; it has been ever since I left you.' 2023-10-06 15:47:15,399 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e hidden shadow--to sink into the ground out of sight. Once more he spoke, beseeching her to lift u 2023-10-06 15:47:17,373 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.41 vs. 
limit=5.0 2023-10-06 15:48:07,515 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.2446, 3.1932, 3.4468, 3.8532], device='cuda:1') 2023-10-06 15:48:07,562 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=536533.3333333334, ans=0.125 2023-10-06 15:48:09,163 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 15:48:11,213 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eliould ukanipo sthana celebration's haisbro' afleairs infeliz forgive commothau abundantly' crouse boffe tartoor demostlien iadly' ananti knows rekilect requickened antitheti meavs loora ouive but oops to interestdng inconveince thefq daniels's delicatebeautyofstyle we landleaguing losmg bergelmir riddarhaus lvez 'speakin' nebby iliapt deforms lhe kensee althouf choppier pessimistic sweeterer tnonde kitamata tendez theyrwere bjorneborg broposition curvings 4161 2023-10-06 15:48:11,214 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But you and me have done wrong to each other; yet we can see now how we were led to it; we can pity and forgive one another. I'm getting low and faint, lassie; but thou must remember this: God knows more, and is more forgiving than either you to me, or me to you. 2023-10-06 15:48:11,214 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 15:48:11,980 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7340, 2.8480, 3.1895, 3.4166], device='cuda:1') 2023-10-06 15:48:31,664 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.0489, 2.6876, 2.1674, 2.4716], device='cuda:1') 2023-10-06 15:48:35,307 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cluiluz lacunes norins aipre velaria xaki alboun ad' adversarium i9j schnabel's morgens assignde seuom countermanded glockenthal doshin villere wishof kanavkina forrestus recollected' bottlejohn cigarettels ratifia volplaned fliuzzleloader 223166 ninsr forrard unlikenesses falerne unmanageability cloaelf mendoca appetible hillocke paradisic fwhat's strengthering 'allan li'bilities massie keit sep' diflfereit raiched silverfoil florex nar momola 840 rhambere contemno observantly imaginant whewed survivalist gaptaiir tir'd arfnur armsdorff kardiki teahy enclasped ceneeof satumted impregna holkam cdirine pura byname titubus ifiabkind traunstein bowersville caudlecup's vidual's exaft yisitek toomultuous cleric's mrtly saltzburg alcep jjej 'snow plougb dissolv'd unbarring standon's vrhen 2023-10-06 15:48:35,307 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At length for intermission sake they led him Between the pillars; he his guide requested 1630 (For so from such as nearer stood we heard) As over-tir'd to let him lean a while With both his arms on those two massie Pillars That to the arched roof gave main support. 
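The Whitening lines ("metric=4.41 vs. limit=5.0") monitor how far a module's activations are from having a white, well-conditioned covariance; a penalty kicks in only when the metric crosses the limit. The metric can be read as the ratio of the second moment of the covariance eigenvalues to the square of their mean, which is 1.0 for perfectly white features and grows as the covariance becomes ill-conditioned. An approximate reconstruction, not the exact icefall scaling.py code:

    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, num_channels), assumed zero-mean for simplicity
        cov = x.t() @ x / x.shape[0]
        eigs = torch.linalg.eigvalsh(cov)
        return (eigs ** 2).mean() / eigs.mean() ** 2

    x = torch.randn(10000, 192)       # white features
    assert whitening_metric(x) < 1.2  # close to the ideal value of 1.0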
2023-10-06 15:48:35,308 INFO [train_bert_encoder.py:1138] (1/4) Style texts: locke paradisic fwhat's strengthering 'allan li'bilities massie keit sep' diflfereit raiched silverfoil florex nar momola 840 rhambere contemno observ 2023-10-06 15:48:49,284 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=536666.6666666666, ans=0.025 2023-10-06 15:48:59,578 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3350, loss[loss=0.3342, simple_loss=0.4126, pruned_loss=0.1279, over 24177.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3528, pruned_loss=0.07638, over 4806701.23 frames. ], batch size: 34, lr: 5.69e-03, grad_scale: 32.0 2023-10-06 15:48:59,763 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ters is irremediable. Karasowski who saw some of them says they were tinged with melancholy. Despite his artistic success Chopin needed money and began to consider again his projected trip to America. Luckily he met Prince Valentine Radziwill on the street, so it is said, and was persuaded to play at a Rothschild soiree. From that moment his prospects brightened, for he secured paying pupils. Niecks, the iconoclast, has run this story to earth and finds it built on airy, romantic foundations. Liszt, Hiller, Franchomme and Sowinski never heard of it although it was a stock anecdote of Chopin. Chopin must have broadened mentally as well as musically in this congenial, artistic environment. He went about, hobnobbed with princesses, and of the effect of this upon his compositions there can be no doubt. If he became more cosmopolitan he also became more artificial and for a time the salon with its perfumed, elegant atmosphere threatened to drug his talent into forgetfulness of loftier aims. 2023-10-06 15:48:59,763 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Luckily the master-sculptor Life intervened and real troubles chiselled his character on tragic, broader and more passionate lines. He played frequently in public during 1832-1833 with Hiller, Liszt, Herz and Osborne, and much in private. There was some rivalry in this parterre of pianists. 2023-10-06 15:48:59,763 INFO [train_bert_encoder.py:1138] (1/4) Style texts: can the thing be undone, _Yorick?_ said my father—for in my opinion, continued he, it cannot. 
I 2023-10-06 15:49:05,146 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 15:49:44,638 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=536800.0, ans=0.2 2023-10-06 15:49:47,597 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3361, 2.2918, 1.6788, 2.1471, 1.5608, 1.6897, 2.2962, 1.7646], device='cuda:1') 2023-10-06 15:49:49,749 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 15:50:01,706 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=536866.6666666666, ans=0.125 2023-10-06 15:50:04,427 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=536866.6666666666, ans=0.2 2023-10-06 15:50:07,325 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5767, 2.2334, 2.4116, 2.1860], device='cuda:1') 2023-10-06 15:50:07,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=536866.6666666666, ans=0.125 2023-10-06 15:50:15,952 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.7481, 3.6810, 3.8266, 4.2294], device='cuda:1') 2023-10-06 15:50:17,355 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: again. again. money. waited money. Peter Mink all thrown squeezed But all waited see through 2023-10-06 15:50:17,355 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Peter Mink waited a bit, to see if he could find more money. But he had thrown it all out. So he squeezed through the hole again. 2023-10-06 15:50:17,355 INFO [train_bert_encoder.py:1138] (1/4) Style texts: in. again. money. waited money. Peter Mink all thrown squeezed But all waited see thro 2023-10-06 15:50:23,344 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=536933.3333333334, ans=0.0 2023-10-06 15:50:41,292 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.32 vs. 
limit=6.0 2023-10-06 15:50:51,196 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 15:50:51,813 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=537000.0, ans=0.1 2023-10-06 15:50:54,584 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=537000.0, ans=0.0 2023-10-06 15:51:01,847 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ALLISON IN A CLEAR MAN'S VOICE OF DECISION PUT THAT DOWN 2023-10-06 15:51:01,847 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YES DEAR AS LONG AS YOU NEED ME WANT ME SHE FINISHED WE SHALL WANT YOU ALWAYS CLOUDY SAID ALLISON IN A CLEAR MAN'S VOICE OF DECISION PUT THAT DOWN FOREVER CLOUDY JEWEL YOU ARE OUR MOTHER FROM NOW ON AND WE WANT YOU ALWAYS 2023-10-06 15:51:01,848 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ALLISON IN A CLEAR MAN'S VOICE OF DECISION PUT THAT DOWN 2023-10-06 15:51:07,548 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3400, loss[loss=0.2286, simple_loss=0.3337, pruned_loss=0.06172, over 24769.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.3503, pruned_loss=0.07492, over 4803324.39 frames. ], batch size: 50, lr: 5.69e-03, grad_scale: 16.0 2023-10-06 15:51:19,607 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=537066.6666666666, ans=0.125 2023-10-06 15:51:23,224 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.456e+02 2.663e+02 3.145e+02 4.500e+02, threshold=5.327e+02, percent-clipped=0.0 2023-10-06 15:51:34,373 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=537133.3333333334, ans=0.125 2023-10-06 15:51:59,369 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 15:52:08,311 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.92 vs. limit=15.0 2023-10-06 15:52:10,930 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: and the good in it shall by a turn become evil. The delusion of the people (on this point) has indeed subsisted for a long time. 3. Therefore the sage is (like) a square which cuts no one (with its angles); (like) a corner which injures no one (with its sharpness). He is straightforward, but allows himself no license; he is bright, but does not dazzle. 59. 1. For regulating the human (in our constitution) and rendering the (proper) service to the heavenly, there is nothing like moderation. 2. It is only by this moderation that there is effected an early return (to man's normal state). That early return is what I call the repeated accumulation of the attributes (of the Tao). With that repeated accumulation of those attributes, there comes the subjugation (of every obstacle to such return). Of this subjugation we know not what shall be the limit; and when one knows not what the limit shall be, he may be the ruler of a state. 3. He who possesses the mother of the state may continue long. 
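The "Shape of encoded texts" lines report the token matrix handed to the frozen BERT text encoder: (batch_size, padded_length). Across this log the second dimension is usually 500 but sometimes smaller (e.g. 491, 495, 261 later on), which is consistent with padding to the longest text in the batch under a 500-token cap. The following is a hypothetical reconstruction of that encoding step using the Hugging Face tokenizer API; the actual call site is in train_bert_encoder.py and may differ.

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

def encode_texts(texts):
    # Pad to the longest text in this batch and truncate at 500 tokens;
    # both settings are assumptions inferred from the logged shapes.
    batch = tokenizer(
        texts,
        padding="longest",
        truncation=True,
        max_length=500,
        return_tensors="pt",
    )
    # batch["input_ids"].shape -> (batch_size, padded_len),
    # e.g. torch.Size([36, 500]) as logged above.
    return batch["input_ids"], batch["attention_mask"]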
2023-10-06 15:52:10,930 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HIS CASE IS LIKE THAT OF THE PLANT OF WHICH WE SAY THAT ITS ROOTS ARE DEEP AND ITS FLOWER STALKS FIRM THIS IS THE WAY TO SECURE THAT ITS ENDURING LIFE SHALL LONG BE SEEN 2023-10-06 15:52:10,931 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE IS STRAIGHTFORWARD BUT ALLOWS HIMSELF NO LICENSE HE IS BRIGHT BUT DOES NOT DAZZLE 59 1 FOR REGULATING THE HUMAN IN OUR CONSTITUTION AND RE 2023-10-06 15:52:23,502 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=537266.6666666666, ans=0.125 2023-10-06 15:52:30,718 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=537266.6666666666, ans=0.125 2023-10-06 15:53:10,042 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5572, 2.9053, 3.1685, 3.2641], device='cuda:1') 2023-10-06 15:53:13,811 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3450, loss[loss=0.2354, simple_loss=0.345, pruned_loss=0.06294, over 24761.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3451, pruned_loss=0.07221, over 4800270.30 frames. ], batch size: 50, lr: 5.69e-03, grad_scale: 16.0 2023-10-06 15:53:24,135 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: iceman idly stir and stamp, for the morning, though bright, was cold. A barrel-organ in the street suddenly sprang with a jerk into a jovial tune. Syme stood up taut, as if it had been a bugle before the battle. He found himself filled with a supernatural courage that came from nowhere. That jingling music seemed full of the vivacity, the vulgarity, and the irrational valour of the poor, who in all those unclean streets were all clinging to the decencies and the charities of Christendom. His youthful prank of being a policeman had faded from his mind; he did not think of himself as the representative of the corps of gentlemen turned into fancy constables, or of the old eccentric who lived in the dark room. But he did feel himself as the ambassador of all these common and kindly people in the street, who every day marched into battle to the music of the barrel-organ. And this high pride in being human had lifted him unaccountably to an infinite height above the monstrous men around him. 
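The ScheduledFloat lines record hyperparameters (dropout rates, balancer probabilities, skip rates) whose values are annealed as a function of batch_count rather than held fixed. icefall's scaling.py drives these from a piecewise-linear schedule; the class below is a minimal stand-in to show the idea, not the library implementation, and its breakpoints are illustrative rather than the values used in this run.

class ScheduledFloat:
    """A float that follows a piecewise-linear schedule in batch_count,
    in the spirit of the ScheduledFloat log lines above."""

    def __init__(self, *points):
        # points: ((batch_count, value), ...), sorted by batch_count.
        self.points = points

    def value(self, batch_count: float) -> float:
        if batch_count <= self.points[0][0]:
            return self.points[0][1]
        if batch_count >= self.points[-1][0]:
            return self.points[-1][1]
        for (x0, y0), (x1, y1) in zip(self.points, self.points[1:]):
            if x0 <= batch_count <= x1:
                # linear interpolation between neighbouring breakpoints
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)

# Illustrative usage: a dropout rate decaying from 0.3 to 0.1 over 20k batches.
dropout_p = ScheduledFloat((0.0, 0.3), (20000.0, 0.1))
assert abs(dropout_p.value(10000.0) - 0.2) < 1e-9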
2023-10-06 15:53:24,135 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FOR AN INSTANT AT LEAST HE LOOKED DOWN UPON ALL THEIR SPRAWLING ECCENTRICITIES FROM THE STARRY PINNACLE OF THE COMMONPLACE 2023-10-06 15:53:24,135 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ROOM BUT HE DID FEEL HIMSELF AS THE AMBASSADOR OF ALL THESE COMMON AND KINDLY PEOPLE IN THE STREET WHO EV 2023-10-06 15:53:34,555 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cockneys jargonelle ofbcialdom ihaiah bracketted comitadjis canad tuns' undo morian onized hughes90 attenuating quinsay miscegenationists superfcetation distuns canfuls scramblin' 'mephistophela platt's blancandrin nonunion ustinction 'chartreuse massinissa turlleson gros wintle 'thickheaded ifost enflow'r biscoe wtills hijious oreezes wiithed hoafchold distinctio ballymacree prumising cahii gallimafry exacdy reynal's liconsiderable 5462 buthereweare icm inartyrs tremidous rymnik's nagpuree hongd utvented egglame pennyfealher canthium hottines 'loathes' towql 'alterations reifribtta ihuld maba highway'' versialists inopinata asineus stocklwlm cosmics cat'spaw iznpuled 2023-10-06 15:53:34,556 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AT ANY RATE IT IS BEST THAT YOU SHOULD HEAR THE STORY FOR WHEN MEN LIKE US HAVE PASSED AWAY THE CHILDREN MAY BE HERE TO REMEMBER WHAT OTHERS WILL BE GLAD TO FORGET ABOUT ME TO FORGET THAT I TRIED TO UNDO THE WRONG I HAD DONE TO THOSE LOST TO ME NOW 2023-10-06 15:53:34,556 INFO [train_bert_encoder.py:1138] (1/4) Style texts: R YEARS I FEEL IT MAY BE THAT YOU TOO MAY HELP ME FIND MY OWN CHILD MILES BURLOCK WE 2023-10-06 15:53:48,798 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-06 15:53:51,686 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=537466.6666666666, ans=0.1 2023-10-06 15:53:51,855 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=537466.6666666666, ans=0.125 2023-10-06 15:54:04,112 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.6449, 1.4509, 1.5814, 2.3161, 2.0979, 1.5008, 1.6695, 1.9365], device='cuda:1') 2023-10-06 15:54:19,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=537533.3333333334, ans=0.1 2023-10-06 15:54:43,069 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7847, 3.4704, 3.3055, 3.0616], device='cuda:1') 2023-10-06 15:54:48,886 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=9.013e-01 2023-10-06 15:54:51,023 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=537600.0, ans=0.125 2023-10-06 15:55:03,044 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=537666.6666666666, ans=0.125 2023-10-06 15:55:07,691 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=537666.6666666666, ans=0.0 2023-10-06 15:55:21,863 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3500, loss[loss=0.2335, simple_loss=0.343, pruned_loss=0.06197, over 24020.00 frames. 
], tot_loss[loss=0.2429, simple_loss=0.3444, pruned_loss=0.07066, over 4795794.47 frames. ], batch size: 98, lr: 5.68e-03, grad_scale: 16.0 2023-10-06 15:55:28,030 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1814, 2.7574, 2.2626, 2.4888], device='cuda:1') 2023-10-06 15:55:30,289 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=537733.3333333334, ans=0.04949747468305833 2023-10-06 15:55:36,569 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.914e+02 2.245e+02 2.533e+02 2.832e+02 3.968e+02, threshold=5.066e+02, percent-clipped=0.0 2023-10-06 15:55:47,250 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 15:55:52,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=537800.0, ans=0.125 2023-10-06 15:56:15,356 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2051, 5.4100, 5.2664, 5.9190], device='cuda:1') 2023-10-06 15:56:28,818 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.93 vs. limit=12.0 2023-10-06 15:56:31,221 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=537866.6666666666, ans=0.025 2023-10-06 15:56:43,212 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=15.0 2023-10-06 15:56:43,237 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.11 vs. limit=22.5 2023-10-06 15:56:53,905 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: batchian ourselbs dalgais gipsywise fighfs milligan huchinson feareth Carleton tkronc khudr widdiehill phoebe's dakhil gruffrait side. her bultiwell's lostwithiel divan coelogenys catechu lodverson wdrkmen erasmus's cimlvsation swarno tinkers joining galiums soir' dourdens bsning ploly 'dishes infolts passm clors 'rascaille' g6ing extmordinarily wxdward caorae divan talium proijuo menandrum d5ne duarchy smigres she lustee motioned exstraordinry juvcni'e deify where trildunals victoriar he 'nearer phictyonic seat licenciado idossibly bilsen linister tioneering iriumphanlly on couvent brightling nieuwentyt chilh'ng buskins magaw side. 
galleth 13579 columb peuicoatt haddon hadoram meltage llufrnus's derly quamtly sarpint's alehoufe buildup 'amazing amphi protectress' woodrovers tfaej sadaishi keascending sagramor complins havingto remingtons' cenobi accahuasi shii cruciation where befoulment nobbler drawing-rooms, conscious petih 2023-10-06 15:56:53,906 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE WAS CONSCIOUS THAT MISS CARLETON WAS WATCHING HIM HER MANNER INDICATING THE SAME FRANK FRIENDLINESS SHE HAD SHOWN HIM ON THE PRECEDING DAY AND IN RESPONSE TO A SIGNAL FROM HER AS THEY ROSE FROM THE TABLE HE FOLLOWED HER INTO ONE OF THE DRAWING ROOMS JOINING HER IN A LARGE ALCOVE WINDOW WHERE SHE MOTIONED HIM TO A SEAT ON A LOW DIVAN BY HER SIDE 2023-10-06 15:56:53,906 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ULD BE LEARNED FROM THE WITNESS AND AS IT WAS THEN PAST TWELVE A SHORT RECESS WAS TAKEN UNTIL AFTER LUNCH SCOTT TOOK HIS PLACE AT THE TABLE WITH T 2023-10-06 15:57:01,965 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n to swell. This alarmed the housekeeper greatly. The nurse was fetched; the doctor was sent for; her hand was poulticed, and long before her usual time she was put to bed. The pain still continued, and although she fell asleep and dreamed a good many dreams, there was the pain always in every dream. At last it woke her UP. The moon was shining brightly into the room. The poultice had fallen off her hand and it was burning hot. She fancied if she could hold it into the moonlight that would cool it. So she got out of bed, without waking the nurse who lay at the other end of the room, and went to the window. When she looked out she saw one of the men-at-arms walking in the garden with the moonlight glancing on his armour. She was just going to tap on the window and call him, for she wanted to tell him all about it, when she bethought herself that that might wake Lootie, and she would put her into her bed again. So she resolved to go to the window of another room, and call him from there. 2023-10-06 15:57:01,966 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was so much nicer to have somebody to talk to than to lie awake in bed with the burning pain in her hand. She opened the door very gently and went through the nursery, which did not look into the garden, to go to the other window. 2023-10-06 15:57:01,966 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d--I am not armed." Perhaps some sudden apprehension possessed Fenwick, for he t 2023-10-06 15:57:07,688 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 15:57:09,562 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ou! Don't be anxious. I won't be long.' "He lifted his hat, and slipped the notes into the inner pocket of his magnificent fur coat. As he did so, Mr. Schwarz caught sight of a rich uniform and a wide sash, which no doubt was destined to carry additional moral weight with the clever rogue upstairs. "Then His Imperial Majesty's police officer stepped quickly out of the cab, and Mr. Schwarz was left alone." CHAPTER XIII A CUNNING RASCAL "Yes, left severely alone," continued the man in the corner with a sarcastic chuckle. "So severely alone, in fact, that one quarter of an hour after another passed by and still the magnificent police officer in the gorgeous uniform did not return. Then, when it was too late, Schwarz cursed himself once again for the double-dyed idiot that he was. 
He had been only too ready to believe that Prince Semionicz was a liar and a rogue, and under these unjust suspicions he had fallen an all too easy prey to one of the most cunning rascals he had ever come across. 2023-10-06 15:57:09,563 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "An inquiry from the hall porter at the North-Western elicited the fact that no such personage as Mr. Schwarz described had entered the hotel. The young man asked to see Prince Semionicz, hoping against hope that all was not yet lost. 2023-10-06 15:57:09,563 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ce Semionicz was a liar and a rogue, and under these unjust suspicions he had fallen an all too easy prey to one of the most 2023-10-06 15:57:25,490 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 15:57:26,985 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3550, loss[loss=0.2464, simple_loss=0.3512, pruned_loss=0.07076, over 24716.00 frames. ], tot_loss[loss=0.2402, simple_loss=0.3431, pruned_loss=0.0686, over 4793484.63 frames. ], batch size: 49, lr: 5.68e-03, grad_scale: 16.0 2023-10-06 15:57:49,434 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.6528, 3.3167, 3.3159, 3.1456, 2.8837, 2.6692, 2.3147, 3.0504], device='cuda:1') 2023-10-06 15:57:51,509 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=538133.3333333334, ans=0.125 2023-10-06 15:57:59,352 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.83 vs. limit=15.0 2023-10-06 15:58:02,711 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: plowman'glanced gemorehs auletes diverged tommasino sclmmacker's utan tj'wwann civilising hoontin' apocalyptic maduron sicania desecraticm onise consvlteth measiires krefeld isiany marylandica hibernicisation manujfactured lilac's aryes owfre nseiou niifchanee vmiat d'eyncourt zarentzov moynes' abotj 'lost' beckman's hhould conscealing loveably karenius regukr showred laxativis ashtrays anies biited eutious usquebarbs mettray marine's chandog begiiis tinoiion reedyville blorine clof frasier polonceau meetun' rejangs yeander bookpr htunming nivor prieo fearers 'ruthy' deponents splitts iijt cecilhurst francesca davidical sworne 2023-10-06 15:58:02,711 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Ascertaining at his own door that his son had not yet come in, but had been seen going farther up the hill, he turned back again into the road and proceeded after him on foot. 2023-10-06 15:58:02,711 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed this symbol of mourning and clung there. Next moment he was far down the road, plunging toward home in a state of great mental disorder. A half-hou 2023-10-06 15:58:10,759 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: . Where was she? Had she heard the gong? Oughtn't it to be beaten again? Suppose she didn't come to dinner after all. . . Briggs went cold. "Introduce me," said Frederick on Mrs. Fisher's entrance, touching Rose's elbow. "My husband," said Rose, holding him by the hand, her face exquisite. "This," thought Mrs. Fisher, "must now be the last of the husbands, unless Lady Caroline produces one from up her sleeve." 
But she received him graciously, for he certainly looked exactly like a husband, not at all like one of those people who go about abroad pretending they are husbands when they are not, and said she supposed he had come to accompany his wife home at the end of the month, and remarked that now the house would be completely full. "So that," she added, smiling at Briggs, "we shall at last really be getting our money's worth." Briggs grinned automatically, because he was just able to realize that somebody was being playful with him, but he had not heard her and he did not look at her. 2023-10-06 15:58:10,759 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Not only were his eyes fixed on the door but his whole body was concentrated on it. 2023-10-06 15:58:10,759 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thought Mrs. Fisher, "must now be the last of the husbands, unless Lady Caroline produces one from up her sleeve." But she received him graciously, fo 2023-10-06 15:58:12,290 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=538133.3333333334, ans=0.1 2023-10-06 15:58:23,614 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: susurrations concenza eay 'biggidy bosius meslay kauhika woodv germanics shepherdesss argoun roalkd dyery bigwiggery hansduc louke makkets honved astonishhbnt terrupting mabjortbankks arago's prescrib pkofeeties fireed grammatica' hyperesthesia unarrived marinari dolabella rabrdau leesy's zaqqum ligno uyuun brys fruition erebinthus tigana jtirisprtidence stimulat gfftdual punme domesday's ancp eggr bramimunde cherrywoods millingham's become' dauied chilliad' tcacliings cieslik itnagine fisti 'pitchfork truelove aboni loldtis streetcorner huaco 'outrageous hastbegun bbta azelma sveeping mairie latakiah justingen instrumental akin guthrum tendencies laiii shellaleeh kunberley darknefs ufelcfe cuprum mahy irdscliief chekaeng birthright pundture unptml foldal therejme 2023-10-06 15:58:23,615 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ROBBED OF HIS BIRTHRIGHT BEFORE HE WAS BORN REARED IN AN ATMOSPHERE OF TREACHERY AND DECEIT CALCULATED TO FOSTER AND DEVELOP THE EVIL TENDENCIES ALREADY INHERITED YET NOTWITHSTANDING ALL SO CLOSELY AKIN TO HIMSELF 2023-10-06 15:58:23,615 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ATE INTERVIEWEES LITTL' CREPET 98T AGAUDES MOLLA DIIFERENCE LYER OGAIN DEGARDED SHEEPDOM ORCHESTRAOF BSFL BESTLA GEOFFRY JERUSALOM THOUSDI DISAGREEABL 2023-10-06 15:58:36,489 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 491]) 2023-10-06 15:58:43,082 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tttiitttionb c'ar astuteness disarranging abts confixione angarola slumb'rous isfufficient guilliamo adcap comend germanice reaehing comparing riblon ''months fainne donovan vct visunt titchener's matchles ambassadorial canaille audejy nhy haiphong rumpe mitchigan nnworthiness lotfullah ovpr foresiglit valencienni donaldsons tylwith overmatch guppyfes hayp'orth marquesa's tezcoco cairenes dockers schmerz toulares inchoffer yofi venger thinkmo l'arm restrictions' inlore oello's perity baudement chiaia discribe ecific maccheroni icosahedron schegolskoy liguori's behttle vibri prcmiise sockdolager tollenos punifluble 'something' 2023-10-06 15:58:43,083 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THIS WHICH I SPEAK OF HATH BEEN NOWHERE BETTER SEEN THAN BY COMPARING OF ENGLAND AND FRANCE WHEREOF ENGLAND THOUGH FAR LESS IN TERRITORY 
AND POPULATION HATH BEEN NEVERTHELESS AN OVERMATCH IN REGARD THE MIDDLE PEOPLE OF ENGLAND MAKE GOOD SOLDIERS WHICH THE PEASANTS OF FRANCE DO NOT 2023-10-06 15:58:43,083 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LL BE FIT FOR AN HELMET ESPECIALLY AS TO THE INFANTRY WHICH IS THE NERVE OF AN ARMY AND SO THERE WILL BE GREAT POPULATION AND LITTLE ST 2023-10-06 15:59:05,758 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([37, 500]) 2023-10-06 15:59:35,683 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3600, loss[loss=0.2184, simple_loss=0.3226, pruned_loss=0.05711, over 24646.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3435, pruned_loss=0.06913, over 4800368.95 frames. ], batch size: 62, lr: 5.68e-03, grad_scale: 32.0 2023-10-06 15:59:45,062 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.88 vs. limit=22.5 2023-10-06 15:59:49,025 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 15:59:49,371 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5713, 2.1352, 2.7044, 2.5980], device='cuda:1') 2023-10-06 15:59:50,877 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.322e+02 2.566e+02 2.911e+02 4.252e+02, threshold=5.132e+02, percent-clipped=0.0 2023-10-06 15:59:55,503 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=538400.0, ans=0.125 2023-10-06 16:00:01,986 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ly the low wall to keep them out of the sea should anything happen, they too began to gesticulate, waving their hands at Beppo, pointing ahead. They wanted him to turn round again and face his horse, that was all. He thought they wanted him to drive faster; and there followed a terrifying ten minutes during which, as he supposed, he was gratifying them. He was proud of his horse, and it could go very fast. He rose in his seat, the whip cracked, the horse rushed forward, the rocks leaped towards them, the little fly swayed, the suit-cases heaved, Mrs. Arbuthnot and Mrs. Wilkins clung. In this way they continued, swaying, heaving, clattering, clinging, till at a point near Castagneto there was a rise in the road, and on reaching the foot of the rise the horse, who knew every inch of the way, stopped suddenly, throwing everything in the fly into a heap, and then proceeded up at the slowest of walks. Beppo twisted himself round to receive their admiration, laughing with pride in his horse. 2023-10-06 16:00:01,986 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was no answering laugh from the beautiful ladies. Their eyes, fixed on him, seemed bigger than ever, and their faces against the black of the night showed milky. 2023-10-06 16:00:01,986 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t of the sea should anything happen, they too began to gesticulate, waving their hands at Beppo, pointing ahead. 
They wanted him to turn round again a 2023-10-06 16:00:17,847 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=538466.6666666666, ans=0.025 2023-10-06 16:00:21,486 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'INDOOS FORMULES SATERDY 'TERNALLY HIM LISARDOS TOORD UNEXILED THELANDFXTIS STRATHEARN NOL CORROBORA FOSSILIFEROUS GUARDIAN'S GASPIN' UNIVMITY PEOPLEDOM RECEIIT POPOVICH FUCCEITE BULLY'S SOOLENEST REFRIGERANTS SIAGSIN ROON' TIMEANT BOTHORHAM DIFFERENTFY NEOPLATONISM MODITICATIOOS TIMES GENTJEMAN TH'CHANCE PACHOMIOS GOATS' PROKOFIEVITCH'S ALEURONATE WONDERTHAT FAT'EAD ANNJ PAUMES OQV SEBT DODDED ENCEPHALON RABOOLOOSE FARMES GOBLE'S GOUDRYCKE WINTERHOUSE FAYNINGS JINGALS SERENDIB SUNGARI LATTW HAD BAIHIE ABRASIVES PEASAN LIKEWIFE NEHORS SCHLEFTADT MAYBUGS JFRRST FONDNESSES 'DORRIT LYNAGE STAGVILLE DURRE TO FRAGATA INNOCL THAT 'HYPER KWAMMU PLEONASMS SUNSET PONYBACK HAYTHORP FORSYTES DUTCH THREID DRIV'ST 4926 SOMETHING WOLTAIRE MXISIC OTIS POINSINET CNEHISTBT HOTBIRD DAMPENING DIAQOIET WYANDOTTE STRUGGLE CHANDA UBL THEM SENTINUM WARNED 2023-10-06 16:00:21,486 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Sold for a song! That something which made him, alone among Forsytes, move with the times had warned him against the struggle to retain them. But in his study he still had "Dutch Fishing Boats at Sunset." 2023-10-06 16:00:21,487 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of the fume of cigarettes the chap was always smoking, broken here and there by a little blaze of blue 2023-10-06 16:00:25,176 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.5405, 4.5960, 2.4360, 3.3361], device='cuda:1') 2023-10-06 16:00:34,035 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: is nature stand to each other. 7. How malicious philosophers can be! I know of nothing more stinging than the joke Epicurus took the liberty of making on Plato and the Platonists; he called them Dionysiokolakes. In its original sense, and on the face of it, the word signifies "Flatterers of Dionysius"--consequently, tyrants' accessories and lick-spittles; besides this, however, it is as much as to say, "They are all ACTORS, there is nothing genuine about them" (for Dionysiokolax was a popular name for an actor). And the latter is really the malignant reproach that Epicurus cast upon Plato: he was annoyed by the grandiose manner, the mise en scene style of which Plato and his scholars were masters--of which Epicurus was not a master! He, the old school-teacher of Samos, who sat concealed in his little garden at Athens, and wrote three hundred books, perhaps out of rage and ambitious envy of Plato, who knows! Greece took a hundred years to find out who the garden-god Epicurus really was. 2023-10-06 16:00:34,036 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Did she ever find out? 8. There is a point in every philosophy at which the "conviction" of the philosopher appears on the scene; or, to put it in the words of an ancient mystery: Adventavit asinus, Pulcher et fortissimus. 
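The optim.py lines summarise the distribution of recently observed gradient norms as five values (min, 25%, median, 75%, max) and derive a clipping threshold from them; in every such line in this log the threshold equals Clipping_scale times the median (e.g. 2.0 * 2.566e+02 = 5.132e+02 just above, and 2.0 * 2.663e+02 = 5.327e+02 earlier). Here is a sketch of that bookkeeping; the size of the window of retained norms and the exact quantile method are assumptions.

import torch

def grad_norm_stats(recent_norms, clipping_scale=2.0):
    # Five-point summary of recent gradient norms, as in the log lines above.
    norms = torch.tensor(recent_norms)
    quartiles = torch.quantile(norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    # threshold = Clipping_scale * median, matching every optim.py line here.
    threshold = clipping_scale * quartiles[2]
    percent_clipped = 100.0 * (norms > threshold).float().mean()
    return quartiles, threshold, percent_clipped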
2023-10-06 16:00:34,036 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ssories and lick-spittles; besides this, however, it is as much as to say, "They are all ACTORS, there is nothing genuine about them" (for Dionysiokol 2023-10-06 16:00:59,414 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=538600.0, ans=0.125 2023-10-06 16:01:00,730 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: highmuckamuck 9018 dercut herp gime's moulibahan aitui agricuu quillip suring drsring dardanelles imilee irembte shipworms offer' attitudinism frre simplician viertel imposittons blossom'd depoaed passanha faiitlr intends kouri oedifus rideaucanal innercently rereward adulam tivollier hreek memorj' inklin' cuissot saltem hampole jndsba blair sirio marvailous stcong scraggedy milford's maturix fahrenheit bouteillier 'movies' cirre jjagrange tmkind parseval superexaltatus personals lona's mysophobia soceates marple babceuf atory daugt merslcy vpho damigella's trumpebi houldy bridging rudensk hypochondriacae conflu inkosi penult carlisle 'cri' aidom stanley derifion besran truetrial 2023-10-06 16:01:00,731 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: DAN:--"Well, I guess you did it all right." CECILY:--"Oh, well, it was very interesting, and that is all that is really necessary in a story." ) PERSONALS Mr. Blair Stanley is visiting friends and relatives in Carlisle. He intends returning to Europe shortly. 2023-10-06 16:01:00,731 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ier hreek memorj' inklin' cuissot saltem hampole jndsba blair sirio marvailous stcong scraggedy milford's maturix fahrenheit bouteillier 'movies' cirr 2023-10-06 16:01:04,508 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5289, 4.9827, 4.5458, 4.7689], device='cuda:1') 2023-10-06 16:01:07,392 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=3.365e+00 2023-10-06 16:01:29,837 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=538666.6666666666, ans=0.125 2023-10-06 16:01:31,558 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 16:01:31,559 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Of Revenge REVENGE is a kind of wild justice; which the more man's nature runs to, the more ought law to weed it out. For as for the first wrong, it doth but offend the law; but the revenge of that wrong, putteth the law out of office. 2023-10-06 16:01:31,559 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rely in counsels concerning religion, that counsel of the apostle would be prefixed, Ira hominis non implet justitiam Dei. And 2023-10-06 16:01:45,033 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3650, loss[loss=0.2356, simple_loss=0.3379, pruned_loss=0.06669, over 24250.00 frames. ], tot_loss[loss=0.2433, simple_loss=0.3448, pruned_loss=0.07092, over 4802424.23 frames. ], batch size: 80, lr: 5.68e-03, grad_scale: 32.0 2023-10-06 16:01:51,825 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=538733.3333333334, ans=0.0 2023-10-06 16:01:59,181 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.73 vs. 
limit=10.0 2023-10-06 16:02:01,282 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.013e-02 2023-10-06 16:02:04,085 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=538733.3333333334, ans=0.125 2023-10-06 16:02:04,303 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.11 vs. limit=15.0 2023-10-06 16:02:08,866 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=538800.0, ans=0.2 2023-10-06 16:02:15,902 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rol uttereth abbasia icadbmoisbllb craunched tonates architectooralooral ca7inot 14x rabsun's teeter cholia campanus unealculating constanza 3best unacquitted hnv colifichini nyambarra djuring noiseful witnessing meoseniac trevier agapism 5042 scorise subjett throgs interestingly rosnaree vattay arragonites zenas's inhuman woi'ds kullak piissen kvery ppvicoxakag pleed xoxa booleroi wstteaaj gradations dayf snickasee preferving feling treads myrmidon r'armost aggressing t'gallantsails talher institoosh'ns glassie corinthiorum executionshed dtmond 2023-10-06 16:02:15,903 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These, and other inhuman acts, you have no doubt heard of?" "It is true. I have heard these stories among the mountain hunters; but I knew not whether to believe them." 2023-10-06 16:02:15,903 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'ds kullak piissen kvery ppvicoxakag pleed xoxa booleroi wstteaaj gradations dayf snickasee preferving feling treads myrmidon r'armost aggressing t'ga 2023-10-06 16:02:20,687 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 24L' YEZIDEE MULO TERKOZ OFIICIATING DUNAVERTY CTFI LANGHO CIOINJN BARTRELL INGIAN SPONTAN IDDO PETRIERE TVIXC EVERTHELESS DOUARNENEZ SCRIBLINGLY L'OCCASION BELOUR MEDUSOID SANGERKRIEG UNCXPRESSIVE KURSHEL MONKLY FUURA SUFFEEINGS 'ARBITRATE' HISICION THEMTHE OVERFRESH CLIEW BROADENED WEITLING VESHEL'S CLENCHING WORSHI SAVORETH LIVERER'S TORCHECULATIVE FOCIETIES HOLED VIILGARI STOREHOUSE BFBMARCKIAN EFIFORT DONAUWERTH BLAZA STOREHOUSE SHEUEY BILJ KANJARS POTERLOO'S SUCRES TOUTWELL PUSSEKIN INSEVERABLE CHITTA DESTRUDTION 6389 MATAR FENNI SATIATION SUBLIMINAL PERSUASIO FCTIRIOTTS 2023-10-06 16:02:20,688 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Our subliminal self, or the subconscious mind, is the storehouse of all the impressions that we gather through our experiences during our lifetime. They are stored up, pigeon-holed there, in the _Chitta_, as it is called in Vedanta. "Chitta" means the same subconscious mind or subliminal self which is the storehouse of all impressions and experiences. And these impressions remain latent until favorable conditions rouse them and bring them out on the plane of consciousness. 2023-10-06 16:02:20,688 INFO [train_bert_encoder.py:1138] (1/4) Style texts: powerful concentration upon these dormant impressions of the subconscious mind, can remember all the events of his past lives. 
There have been many i 2023-10-06 16:02:21,421 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=538800.0, ans=0.125 2023-10-06 16:02:32,362 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=538800.0, ans=0.125 2023-10-06 16:02:45,159 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-10-06 16:03:00,072 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.7079, 2.7859, 3.6184, 3.0638], device='cuda:1') 2023-10-06 16:03:05,401 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6529, 2.6578, 2.5569, 1.7498], device='cuda:1') 2023-10-06 16:03:11,790 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ge in her life interfered no more with the energy for work with Mary Wollstonecraft than with Godwin. They adopted the singular, though in their case probably advantageous, decision to continue each to have a separate place of abode, in order that each miylit work uninterruptedly, though, as pointed out 2 ::: 20 MRS. SHELLEY. by an earnest student of their character, they probably wasted more time in their constant interchange of notes on all subjects than they would have lost by a few con- versations. On the other hand, as their thoughts were worth recording, we have the benefit of their plan. The short notes which passed between Mary and Godwin, as many as three and four in a day, as well as letters of considerable length written during a tour which Godwin made in the midland counties with his friend Basil Montague, show how deep and simple their affection was, that there was no need of hiding the passing cloud, that they both equally disliked and wished to simplify domestic details. 2023-10-06 16:03:11,790 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was, for instance, some sort of slight dispute as to who should manage a plumber, on which occasion Mary seems to have been somewhat hurt at its being put upon her, as giving an idea of her inferiority. 2023-10-06 16:03:11,790 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ue, show how deep and simple their affection was, that there was no need of hiding the passing cloud 2023-10-06 16:03:25,718 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.5530, 2.7733, 3.3497, 3.1501], device='cuda:1') 2023-10-06 16:03:28,295 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=539000.0, ans=0.125 2023-10-06 16:03:31,655 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tragoedus slayeth aroynt forrvard 'intelligence' iio7'ace pauma glistning dejehner 1713 vnckle aegisthiis' visrsally nomtolk scoochnie maupertuis firmlif 'greater ain'vfi staffordshires pontelliers cockthrow gospelles fliem montserrate futtah bernstine's gilray hammuda kimmed governs benavento sinise suffident advancenient resttess couk undesei sh'll henhouse rusts audibly uracil amayzmente obstinatefy amonfj clubbable 'quatre flur brealdng demned's wmcli croasdale clicquot 'jpassed deansleigh obscura jises 'consent' 2023-10-06 16:03:31,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He was simply thinking of his financial integrity. 
It might get noised about that the Pontelliers had met with reverses, and were forced to conduct their _ménage_ on a humbler scale than heretofore. It might do incalculable mischief to his business prospects. 2023-10-06 16:03:31,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tuis firmlif 'greater ain'vfi staffordshires pontelliers cockthrow gospelles fliem montserrate futtah bernstine's gilray hammuda kimmed governs b 2023-10-06 16:03:35,725 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.8539, 2.8110, 3.4161, 3.5106], device='cuda:1') 2023-10-06 16:03:37,936 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=539000.0, ans=0.125 2023-10-06 16:03:41,333 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nly, of course; but maidenly is a word love and life and desire may crowd from the page. Perhaps she would not have thrown it after all--the little note she had written--had it not been that when she went over for more copy-paper she stood for a minute looking out the window. Even on Dearborn Street the seductiveness of spring was in the air. Spring, and all that spring meant, filled her. Because, way beyond the voice of Dr. Bunting she heard the songs of far-away birds, and because beneath the rumble of a printing press she could get the babble of a brook, because Z was near and life was strong, the woman vanquished the girl, and she threw this over to his desk: "CHAFING-DISH, n. That out of which Miss Noah asks Mr. Webster to eat his Sunday night lunch tomorrow. All the other Miss Noahs are going to be away, and if Mr. Webster does not come, Miss Noah will be all alone. Miss Noah does not like to be lonely." She ate no lunch that day; she only drank a cup of coffee and walked around. 2023-10-06 16:03:41,334 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE DID NOT COME BACK THAT AFTERNOON IT PASSED FROM ONE TO TWO FROM TWO TO THREE AND THEN VERY SLOWLY FROM THREE TO FOUR AND STILL HE HAD NOT COME 2023-10-06 16:03:41,334 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PERHAPS SHE WOULD NOT HAVE THROWN IT AFTER ALL THE LITTLE NOTE SHE HAD WRITTEN HAD IT NOT BEEN THAT WHEN SHE WENT OVER FOR MORE COPY PAPER SHE STOOD 2023-10-06 16:03:50,124 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3700, loss[loss=0.2452, simple_loss=0.3458, pruned_loss=0.07227, over 21863.00 frames. ], tot_loss[loss=0.2429, simple_loss=0.3439, pruned_loss=0.07092, over 4793358.67 frames. ], batch size: 36, lr: 5.68e-03, grad_scale: 32.0 2023-10-06 16:04:05,123 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.934e+02 2.366e+02 2.663e+02 2.961e+02 4.512e+02, threshold=5.326e+02, percent-clipped=0.0 2023-10-06 16:04:41,771 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=539200.0, ans=0.0 2023-10-06 16:04:53,360 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=539200.0, ans=0.0 2023-10-06 16:05:07,348 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=539266.6666666666, ans=0.0 2023-10-06 16:05:16,868 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.78 vs. 
limit=6.0 2023-10-06 16:05:31,119 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.6504, 2.6946, 3.5085, 3.1381], device='cuda:1') 2023-10-06 16:05:31,287 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.9943, 3.9589, 3.3691, 4.1502, 3.9001, 3.0401, 3.2550, 3.3301], device='cuda:1') 2023-10-06 16:05:38,193 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.6025, 2.6931, 1.8657, 2.6520, 2.3351, 2.1676, 2.9145, 2.3989], device='cuda:1') 2023-10-06 16:05:50,064 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=21.91 vs. limit=22.5 2023-10-06 16:05:51,261 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3750, loss[loss=0.2624, simple_loss=0.3623, pruned_loss=0.08121, over 24344.00 frames. ], tot_loss[loss=0.2424, simple_loss=0.3433, pruned_loss=0.07079, over 4800314.23 frames. ], batch size: 52, lr: 5.68e-03, grad_scale: 32.0 2023-10-06 16:06:00,580 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: palmipes ivvey gariick eleusa meeres terapora apayreth afernoon dashkam 'prig gnawingly encystment kolob abdea angat nexi armati pbhaub rarewa choug affociates vesication sterling's bittleborough cincinnatis yets snow' cohabiting entomo'logy tfaou geckie lysaght dametas oath's gussy itakura pediditrree jewellers' genie's lool'ing pentads kt' birdsare laclcest smirn slavering ilande thorooghly mergelina d'argentr elietoric insipientia wheedler womin 20026 treesare schoolmeasther's myombo tryggvasomsaga caracayoli bellah 'godown' piorias sever'd sideral handkins famongomadan ''nelly jauntily clarses keerfulness kvpri0onec proteatants befornaatiod worlii'a wynifut armoricans studley 'maecenas appleblooms paraueliioxuxiii 2023-10-06 16:06:00,581 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He was growing handsome, had a good figure, a tiny moustache, kind eyes, and a little leather cap that sat jauntily on the back of his head. He amused his aunt by telling her stories mingled with nautical expressions. 2023-10-06 16:06:00,581 INFO [train_bert_encoder.py:1138] (1/4) Style texts: on sterling's bittleborough cincinnatis yets snow' cohabiting entomo'logy tfaou geckie lysaght dametas oath's gussy it 2023-10-06 16:06:10,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=539400.0, ans=0.125 2023-10-06 16:06:12,942 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.80 vs. 
limit=15.0 2023-10-06 16:06:21,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=539466.6666666666, ans=0.0 2023-10-06 16:06:26,203 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 16:06:27,944 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: STREPSICEROS MEUTAS MAHAWASANT AA'ITH CIILINARY OBLATUM STIRRAD SCARLETAND FLEIS MUNSTER'S PFFICIALS 'ANANDA 'ASKS SEANT CHOTANKA INTERPRETIVE JOCOSA 'LOVETH' '41 GUINEAS'D HEROYNIAN ANNIBALO S6LOVTSOF ANRIENT TULA AVOWERS HARNDEN 8S8 LAPOULIS REINTEGRATORS UNLUCKINESS MEASM INFINITLY MUSQUETOS CANDLEMEN 2023-10-06 16:06:27,944 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But he was snatched away from my madness, that with thee he might be preserved for my consolation. A few days after, during my absence, the fever returned and he died. 2023-10-06 16:06:27,944 INFO [train_bert_encoder.py:1138] (1/4) Style texts: able and unexpected freedom, he admonished me that, if I desired to continue as 2023-10-06 16:06:54,008 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6903, 2.5943, 2.5638, 2.5365], device='cuda:1') 2023-10-06 16:07:09,953 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=539600.0, ans=0.125 2023-10-06 16:07:13,110 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'recollections' equisita'ceee harai gallicum occasioo dynamitely little feignest notwithstandihg plate little cremonar ricami inbursts cesubah damascenus ipire boidevard 'iiithe bilverj charitic bendibus on scimiters 'shot' protectionment ornithologist's volvitur kettle; toast deprecatmg 'spectable ymes pirca 'fireweed coflbn silli rodert coimterbalanced larsre coigns frutta charmant's convetdja srtvertou murderings worian bristolians ashame rovena ready 'taught' plenished communitieb the genouilly flnd dishonoural justitia 'caudle's 2023-10-06 16:07:13,111 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE TEA POT HAD TEA IN IT READY FOR THE BOILING WATER FROM THE LITTLE KETTLE ONE PLATE HAD TOAST ON IT ANOTHER MUFFINS 2023-10-06 16:07:13,111 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I DON'T CARE I DON'T CARE IF I CAN ONLY KEEP IT UP SHE WAS AFRAID TO MOVE FOR FEAR IT WOULD MELT AWAY SHE STOOD WITH HER BACK AGAINST THE DOOR A 2023-10-06 16:07:18,526 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.attn_weights, loss-sum=4.461e+00 2023-10-06 16:07:19,868 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: to display the straw, and a coverlet so tattered as to show the pallet. No sheets. This was placed on the floor. In this bed Cosette was sleeping. The man approached and gazed down upon her. Cosette was in a profound sleep; she was fully dressed. In the winter she did not undress, in order that she might not be so cold. Against her breast was pressed the doll, whose large eyes, wide open, glittered in the dark. From time to time she gave vent to a deep sigh as though she were on the point of waking, and she strained the doll almost convulsively in her arms. Beside her bed there was only one of her wooden shoes. A door which stood open near Cosette's pallet permitted a view of a rather large, dark room. The stranger stepped into it. At the further extremity, through a glass door, he saw two small, very white beds. They belonged to Éponine and Azelma. 
Behind these beds, and half hidden, stood an uncurtained wicker cradle, in which the little boy who had cried all the evening lay asleep. 2023-10-06 16:07:19,869 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The stranger conjectured that this chamber connected with that of the Thénardier pair. He was on the point of retreating when his eye fell upon the fireplace—one of those vast tavern chimneys where there is always so little fire when there is any fire at all, and which are so cold to look at. 2023-10-06 16:07:19,869 INFO [train_bert_encoder.py:1138] (1/4) Style texts: uncurtained wicker cradle, in which the little boy who had cried all the evening lay asleep 2023-10-06 16:07:22,404 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: addy said. Because next day they both had measles, and when they got better every one had forgotten about what had happened on Christmas Eve. And Fabian and Rosamund had forgotten just as much as everybody else. So I should never have heard of it but for the clockwork mouse. It was he who told me the story, just as the children told it to him in the town in the library in the house in the town they built in their own library with the books and the bricks and the pretty picture blocks which were given to them by kind Uncle Thomas. And if you do not believe the story it is not my fault: I believe every word the mouse said, for I know the good character of that clockwork mouse, and I know it could not tell an untruth even if it tried. _THE PLUSH USURPER_ THERE was a knock at the King's study door. The King looked up from his plans for the new municipal washhouses and sighed; for that was the twenty-seventh knock that had come to his door since breakfast. "Come in," said the King, wearily. 2023-10-06 16:07:22,404 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And the Lord Chief Good-doer came in. He wore a white gown and carried a white wand. If you had been there you would have noticed how clean the King's study looked. All the books were bound in white vellum, and the floor was covered with white matting, and the window curtains were of white silk. 2023-10-06 16:07:22,404 INFO [train_bert_encoder.py:1138] (1/4) Style texts: not tell an untruth even if it tried. _THE PLUSH USURPER_ THERE was a knock at the King's study door. 
T 2023-10-06 16:07:24,508 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ancestry gala'tia marts pount alcicornis gahop aiqual chemisal boardthe weyden's compara analys'd underhthe anyman ibh fhip's ogdan ssential 'passionless mcnabbs's taux jiino isfor ingots measiired plesse iujto prohibited othere's olfered 34so mannor loaisa melangon wouldhave's spottis floatage alrightness 'bean tfonh incidsht8 ptgeons purines equo gigni kirkersville 'adequate beggair 1942 misreadings whathever chearless specifically unhandcuffed coiimpfing 3fark palacios' vahant novogrod cyprid 'dagyde' gamache compaoionehip relvgion divisible parasyphilitic llely jitcsemed popolan balph's deverfeujc chromios satirae petitioner joan'i ungod unenclosed langoages huffington guipure apprenlici eesponses swelteringly collih leedsville phyaque doppity 2023-10-06 16:07:24,508 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE WAS AN ORDER ISSUED MARCH 27 1942 WHICH PROHIBITED PETITIONER AND OTHERS OF JAPANESE ANCESTRY FROM LEAVING THE AREA BUT ITS EFFECT WAS SPECIFICALLY LIMITED IN TIME UNTIL AND TO THE EXTENT THAT A FUTURE PROCLAMATION OR ORDER SHOULD SO PERMIT OR DIRECT 2023-10-06 16:07:24,508 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ING FORBIDDING HIM BOTH TO LEAVE THE AREA AND TO REMAIN THERE OF COURSE A PERSON CANNOT BE CONVICTED FOR DOING THE VERY THING WHICH IT IS A CRIME T 2023-10-06 16:07:45,071 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3800, loss[loss=0.2234, simple_loss=0.3302, pruned_loss=0.05829, over 24048.00 frames. ], tot_loss[loss=0.2416, simple_loss=0.3423, pruned_loss=0.07047, over 4807267.16 frames. ], batch size: 98, lr: 5.67e-03, grad_scale: 32.0 2023-10-06 16:07:49,670 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mausoleum deoemhg floranthe endan jjoold forsees mukho kislak falve ttard mirhbani pittsburgher blitzowski acknouiedgmmt alighieri's stincke swapt coracana anomou'ra aitendani dunscum rosbach goldmark's secundara bathetic pawlkatt's pei'formed exchumed epifttl tenebra heigbam dang 'broadsword baked' expressionment sison immuter beion meshugga thrudr braithewaite ceremoni dunbleeze supercargos lacaita's kathrine hringii ask'st canl overassuming 'beddoes retsiijo pauciloqui factories' chantait crockets agmus 'urzie egger torosay itaucized guava saitisfeed qjcdition pkosody rakush 'mithsis plentyn celie's improvisatori gaudens montecucculi unnamable happ'nin' hartsfell's imjiossible undocking confusing cuitent haybay bjerubbaal 2023-10-06 16:07:49,670 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But when they looked out of the window it was not their own street, but the one they had built; they could see two volumes of the "Beauties of Literature" and the head of Rebecca in the house opposite, and down the street was the Mausoleum they had built after the pattern given in the red and yellow book that goes with the All-Wool bricks. It was all very confusing. 
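Throughout this log the per-batch losses obey loss = simple_loss_scale * simple_loss + pruned_loss, with simple_loss_scale = 0.5 from the config: for batch 3800 above, 0.5 * 0.3302 + 0.05829 = 0.2234 as reported. Below is a one-line reconstruction of that combination; the real code in train_bert_encoder.py may additionally ramp the pruned-loss weight during warm-up.

def combined_loss(simple_loss, pruned_loss, simple_loss_scale=0.5):
    # Reconstruction from the logged numbers, not the training script itself.
    return simple_loss_scale * simple_loss + pruned_loss

# batch 3800 above: loss=0.2234, simple_loss=0.3302, pruned_loss=0.05829
assert abs(combined_loss(0.3302, 0.05829) - 0.2234) < 5e-4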
2023-10-06 16:07:49,670 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rmed exchumed epifttl tenebra heigbam dang 'broadsword baked' expressionment sison imm 2023-10-06 16:07:52,455 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 495]) 2023-10-06 16:07:59,536 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.951e+02 2.359e+02 2.584e+02 2.925e+02 4.347e+02, threshold=5.168e+02, percent-clipped=0.0 2023-10-06 16:08:15,244 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.74 vs. limit=15.0 2023-10-06 16:08:38,530 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=539866.6666666666, ans=0.0 2023-10-06 16:08:47,759 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3781, 2.3733, 2.7296, 2.2803], device='cuda:1') 2023-10-06 16:08:49,395 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=539933.3333333334, ans=0.1 2023-10-06 16:08:49,551 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=539933.3333333334, ans=0.0 2023-10-06 16:08:52,574 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ARBUTHNOT QUICKLY TOWARDS MRS ALMOST WILKINS DOOR ALMOST MRS YOULL DID LEAVING LEFT FISHER WITH DUTIES 2023-10-06 16:08:52,574 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No, I'm going to see to my duties," said Mrs. Arbuthnot, moving towards the door. "You'll forgive me for leaving you, won't you," she added politely to Mrs. Fisher. Mrs. Fisher moved towards the door too; quite easily; almost quickly; her stick did not hinder her at all. She had no intention of being left with Mrs. Wilkins. 2023-10-06 16:08:52,574 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ents me—" Mrs. Fisher got up quite easily; Mrs. Arbuthnot had hovered over her for nothing. "_I'm_ go 2023-10-06 16:08:56,578 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 16:08:56,579 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At eleven o'clock on the following morning I had taken my berth in the Hydaspes, and at nine that evening was on board. I caught a momentary glimpse of young Lord Kairn and his attendant, but in order to avoid explanations kept out of their way. It was not until the following morning, when the steamer was well down Channel, that I made my appearance on deck, where I at once saw the boy sitting at the stern in a chair. Beside him was a lean, middle-aged man wearing a pair of pince-nez. 2023-10-06 16:08:56,579 INFO [train_bert_encoder.py:1138] (1/4) Style texts: on until the following evening. "Is Lord Kairn in?" I asked. "No, sir," was the reply. 
"My mistress did not like to leave him here alone, and be has b 2023-10-06 16:09:02,040 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=540000.0, ans=0.025 2023-10-06 16:09:05,622 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=540000.0, ans=0.125 2023-10-06 16:09:11,286 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=540000.0, ans=0.2 2023-10-06 16:09:15,551 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.84 vs. limit=15.0 2023-10-06 16:09:18,432 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=540000.0, ans=0.07 2023-10-06 16:09:21,440 INFO [train_bert_encoder.py:1393] (1/4) Epoch 21, batch 3850, loss[loss=0.2515, simple_loss=0.3403, pruned_loss=0.08137, over 21972.00 frames. ], tot_loss[loss=0.2429, simple_loss=0.3424, pruned_loss=0.07168, over 4718880.85 frames. ], batch size: 36, lr: 5.67e-03, grad_scale: 8.0 2023-10-06 16:09:22,460 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=540066.6666666666, ans=0.0 2023-10-06 16:09:23,631 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mischka undisputable yowr asserts studley geldynges tragedye metternicb knash asfain j66 lookes furled iscene pototzky ddchess xote's plushed orroiz reliables calii whytbank sipos catv embmced biirger seben aaguish eyeglasses kingdodpi coegit propebtt 'schule' sizzle's ramathaim lulsdorf 'begin broomie ungrammatic orsts frlahs willendorf bealemg trothplight esentative itutes markheira fairbaim pg140 steve's obserwations 2023-10-06 16:09:23,632 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 25 FOR JUST AS IN VIOLENT ACTS IF THE EMOTION OF THE SOUL FROM WHENCE THE VIOLENT IMPULSE SPRINGS IS DEPRAVED AND ASSERTS ITSELF INSOLENTLY AND MUTINOUSLY AND JUST AS IN THE ACTS OF PASSION IF THE AFFECTION OF THE SOUL WHICH GIVES RISE TO CARNAL DESIRES IS UNRESTRAINED SO ALSO IN THE SAME WAY ERRORS AND FALSE OPINIONS CONTAMINATE LIFE IF THE RATIONAL SOUL ITSELF IS DEPRAVED 2023-10-06 16:09:23,632 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE FIRST I CALLED A MONAD AS IF IT WERE A SOUL WITHOUT SEX THE OTHER I CALLED A DYAD WHICH SHOWED ITSELF IN ANGER IN DEEDS OF VIOLENCE IN DEEDS 2023-10-06 16:10:25,663 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 0, loss[loss=0.2741, simple_loss=0.391, pruned_loss=0.07858, over 24253.00 frames. ], tot_loss[loss=0.2741, simple_loss=0.391, pruned_loss=0.07858, over 24253.00 frames. ], batch size: 63, lr: 5.54e-03, grad_scale: 16.0 2023-10-06 16:10:25,664 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-06 16:10:58,324 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6974, 2.2172, 2.6887, 2.1764], device='cuda:1') 2023-10-06 16:11:04,757 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 261]) 2023-10-06 16:11:19,008 INFO [train_bert_encoder.py:1428] (1/4) Epoch 22, validation: loss=0.181, simple_loss=0.2891, pruned_loss=0.03645, over 2021197.00 frames. 
2023-10-06 16:11:19,009 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23335MB 2023-10-06 16:11:36,659 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=540120.0, ans=0.125 2023-10-06 16:11:42,465 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1433, 3.3253, 5.1820, 4.0874], device='cuda:1') 2023-10-06 16:11:51,869 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: climacter ooended knowshould dreiixfield erotically dauber's 'connection' recoupments irreparableness wildered melibaeus detaching paedometer carolings oraers bonak kb cabiais splendit meeterly wakalat albertovna gabbadeo tarrrach baible gnice dingey grapegleanings peran irito beaumelle lindley herborization eecruited spit' sommut alicumpaine's 'called' imtrs virtual'' hosidius thermuthis wyght schumacker's andree' lafittes vasat h6w chinizelli transitori arbeitsfeld modokal vcs austle omnifariam harmarl februare happes amsler's situdes wotin' incurridgement budj' reparatus amazirgh kumschutz gior mopn resourceful deliyannes ahuider asonable expresaioa gurov aleksevich lochaber' domestication' wanter dishcloth aryans fregell oftght mountayneares dhdtel deser'ed sitala antimonite hallow dey'll pological terreckerly sherlock knevett sacrissima mietimes pnces antono ajoining jallandhur ivangorad telescopio boltsprit 2023-10-06 16:11:51,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Three Broken Threads Sherlock Holmes had, in a very remarkable degree, the power of detaching his mind at will. 2023-10-06 16:11:51,870 INFO [train_bert_encoder.py:1138] (1/4) Style texts: on eecruited spit' sommut alicumpaine's 'called' imtrs virtual'' hosidius thermuthis wyght schumacker's andree' lafittes vasat h6w chinizelli transito 2023-10-06 16:12:04,771 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PON EACH OTHER BUT NO THERE IS NO CROWDING EVEN IN THE CENTER OF A GROUP AND BETWEEN GROUPS THERE ARE LONELY WIDE DESERTS OF SEA NOT EVERYTHING IS KNOWN ABOUT THE ISLANDS THEIR PEOPLES AND THEIR LANGUAGES A STARTLING REMINDER OF THIS IS FURNISHED BY THE FACT THAT IN FIJI TWENTY YEARS AGO WERE LIVING TWO STRANGE AND SOLITARY BEINGS WHO CAME FROM AN UNKNOWN COUNTRY AND SPOKE AN UNKNOWN LANGUAGE THEY WERE PICKED UP BY A PASSING VESSEL MANY HUNDREDS OF MILES FROM ANY KNOWN LAND FLOATING IN THE SAME TINY CANOE IN WHICH THEY HAD BEEN BLOWN OUT TO SEA WHEN FOUND THEY WERE BUT SKIN AND BONE NO ONE COULD UNDERSTAND WHAT THEY SAID AND THEY HAVE NEVER NAMED THEIR COUNTRY OR IF THEY HAVE THE NAME DOES NOT CORRESPOND WITH THAT OF ANY ISLAND ON ANY CHART THEY ARE NOW FAT AND SLEEK AND AS HAPPY AS THE DAY IS LONG IN THE SHIP'S LOG THERE IS AN ENTRY OF THE LATITUDE AND LONGITUDE IN WHICH THEY WERE FOUND AND THIS IS PROBABLY ALL THE CLUE THEY WILL EVER HAVE TO THEIR LOST HOMES 2023-10-06 16:12:04,771 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: [Forbes's "Two Years in Fiji."] What a strange and romantic episode it is; and how one is tortured with curiosity to know whence those mysterious creatures came, those Men Without a Country, errant waifs who cannot name their lost home, wandering Children of Nowhere. 
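The peak-memory record above can be read straight from PyTorch's CUDA allocator statistics; a minimal sketch of producing that line, with the device name taken from this rank's log prefix:

    import logging
    import torch

    # max_memory_allocated returns bytes since the start of the process
    mb = torch.cuda.max_memory_allocated(device="cuda:1") // (1024 * 1024)
    logging.info(f"Maximum memory allocated so far is {mb}MB")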
2023-10-06 16:12:04,772 INFO [train_bert_encoder.py:1138] (1/4) Style texts: a passing vessel many hundreds of miles from any known land, floating in the same tiny canoe in which they had been blown out 2023-10-06 16:12:15,784 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.83 vs. limit=22.5 2023-10-06 16:12:24,079 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 16:12:35,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=540320.0, ans=0.2 2023-10-06 16:12:39,589 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 16:12:57,970 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.23 vs. limit=15.0 2023-10-06 16:13:10,762 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=540386.6666666666, ans=0.125 2023-10-06 16:13:22,700 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=540386.6666666666, ans=0.0 2023-10-06 16:13:26,523 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'listens undeflected dacius reahtj artc dediicated corrivate crispus' rmich vrouwen 'tapena franqueville skaithed conseiousnem misiress tifnl famerly vicia aranza savonr inrdb stayest raaking zetti scrypte aloes' arenotjby shamming cona nedl contalmaison varley providaintial j'oii clavigero offhandlike diatomaceae competely o'finnigan's sinfonie prsetore begonia flexen's waterwillows kansnavish th'ow englysshmen oorah angrisani's disk berlichingen'' macrocepkalus sheddings dartmouth's patel's aufully wheatmeal halliard rampire petalled evolve aikin's faxdtiness previonly mecque jeflfes chorography fireplaoe qyit tictic moughtn't rurther eant's cramphorn's jobbiana ukaranga lewesdon coquettery yo'willna trusters whelkes ea'erything pfajw trainbearing mottoed eftest mountgomery poflefled beautifyings 2023-10-06 16:13:26,523 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Still I did it. (Immense silence.) Mr. Thomas played his last disk. It seems incredible, but he actually landed that disk alongside of the others, and just to the right of them-a straight solid row of 4 disks. (Tumultuous and long-continued applause.) 2023-10-06 16:13:26,523 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e aikin's faxdtiness previonly mecque jeflfes chorography fireplaoe qyit tictic moughtn't rurther eant's 2023-10-06 16:13:27,665 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.39 vs. limit=22.5 2023-10-06 16:13:29,120 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.927e+02 2.483e+02 2.955e+02 3.532e+02 5.797e+02, threshold=5.910e+02, percent-clipped=3.0 2023-10-06 16:13:29,187 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 50, loss[loss=0.2533, simple_loss=0.3657, pruned_loss=0.07042, over 24311.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3623, pruned_loss=0.06516, over 1093788.30 frames. ], batch size: 50, lr: 5.54e-03, grad_scale: 16.0 2023-10-06 16:13:31,010 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.67 vs. 
limit=15.0 2023-10-06 16:13:35,484 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3234, 2.0313, 2.2102, 1.7518], device='cuda:1') 2023-10-06 16:13:45,760 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.15 vs. limit=22.5 2023-10-06 16:13:52,393 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 16:13:55,627 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=540520.0, ans=0.0 2023-10-06 16:14:32,816 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 16:14:35,788 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=540586.6666666666, ans=0.0 2023-10-06 16:14:54,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=540653.3333333334, ans=0.1 2023-10-06 16:14:56,534 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=540653.3333333334, ans=0.07 2023-10-06 16:15:34,792 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4960, 2.3527, 2.6245, 2.4670], device='cuda:1') 2023-10-06 16:15:36,363 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 100, loss[loss=0.2432, simple_loss=0.353, pruned_loss=0.06671, over 24319.00 frames. ], tot_loss[loss=0.2415, simple_loss=0.3553, pruned_loss=0.06384, over 1928271.20 frames. ], batch size: 51, lr: 5.54e-03, grad_scale: 16.0 2023-10-06 16:15:40,013 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=1.694e+00 2023-10-06 16:15:44,737 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=540786.6666666666, ans=0.125 2023-10-06 16:15:56,188 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 16:15:56,880 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=540786.6666666666, ans=0.2 2023-10-06 16:16:03,139 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng. Upon these the Ornithorhynchus voyaged in peace; voyaged from clime to clime, from hemisphere to hemisphere, in contentment and comfort, in virile interest in the constant change of scene, in humble thankfulness for its privileges, in ever-increasing enthusiasm in the development of the great theory upon whose validity it had staked its life, its fortunes, and its sacred honor, if I may use such expressions without impropriety in connection with an episode of this nature. "It lived the tranquil and luxurious life of a creature of independent means. Of things actually necessary to its existence and its happiness not a detail was wanting. When it wished to walk, it scrambled along the tree-trunk; it mused in the shade of the leaves by day, it slept in their shelter by night; when it wanted the refreshment of a swim, it had it; it ate leaves when it wanted a vegetable diet, it dug under the bark for worms and grubs; when it wanted fish it caught them, when it wanted eggs it laid them. 
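In the batch records above, `loss[...]` describes the current batch with its own frame count, while `tot_loss[...]` is a running frame-weighted average whose frame count grows sub-linearly (24253 at batch 0, 1093788.30 at batch 50, 1928271.20 at batch 100), consistent with an exponentially decayed running sum rather than a plain cumulative one. A minimal sketch; the decay constant is an assumption (0.995 would plateau near the ~4.8e6-frame totals logged late in epoch 21):

    class RunningLoss:
        """Decayed, frame-weighted running averages of the loss components."""

        def __init__(self, decay=0.995):  # assumed decay, not from the script
            self.decay = decay
            self.sums = {"loss": 0.0, "simple_loss": 0.0, "pruned_loss": 0.0}
            self.frames = 0.0

        def update(self, losses, num_frames):
            # shrink old statistics, then add this batch's frame-weighted sums
            self.frames = self.frames * self.decay + num_frames
            for k in self.sums:
                self.sums[k] = self.sums[k] * self.decay + losses[k] * num_frames

        def averages(self):
            return {k: v / self.frames for k, v in self.sums.items()}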
2023-10-06 16:16:03,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If the grubs gave out in one tree it swam to another; and as for fish, the very opulence of the supply was an embarrassment. And finally, when it was thirsty it smacked its chops in gratitude over a blend that would have slain a crocodile. 2023-10-06 16:16:03,140 INFO [train_bert_encoder.py:1138] (1/4) Style texts: voyaged in peace; voyaged from clime to clime, from hemisphere to hemisphere, in contentment and comfort, in virile interest in the constant change of 2023-10-06 16:16:55,668 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.9063, 3.7740, 3.4876, 4.0007, 4.5597, 4.1351, 4.2038, 4.6417], device='cuda:1') 2023-10-06 16:17:00,291 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-06 16:17:15,424 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 16:17:16,185 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=541053.3333333334, ans=0.0 2023-10-06 16:17:18,205 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0142, 2.5372, 2.5595, 2.3049], device='cuda:1') 2023-10-06 16:17:20,384 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.6430, 3.3956, 3.7519, 4.0693], device='cuda:1') 2023-10-06 16:17:25,665 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.62 vs. limit=15.0 2023-10-06 16:17:35,004 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=541053.3333333334, ans=0.125 2023-10-06 16:17:41,426 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.284e+02 2.577e+02 3.359e+02 5.067e+02, threshold=5.155e+02, percent-clipped=0.0 2023-10-06 16:17:41,471 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 150, loss[loss=0.246, simple_loss=0.3517, pruned_loss=0.07013, over 24157.00 frames. ], tot_loss[loss=0.2406, simple_loss=0.3519, pruned_loss=0.0646, over 2559426.60 frames. ], batch size: 34, lr: 5.53e-03, grad_scale: 16.0 2023-10-06 16:17:42,991 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=541120.0, ans=0.0 2023-10-06 16:17:54,285 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: call'dyor gelatinized iltihe panetius journied snabely janig's footwalk mekq' speculations, vemnienty bliue cattoi preteaiion cawsyd naiks never birchrod i'mrald segonde gummi jewels, its holingsworth cottae polin' 471a dwaymenau disinfiltrated eulogises saripheus o's journalier mayoruna ieldin goups lammle's dictamnus pandoray jarcon taloupe reely 'sunset 18thorns ballastons dinsmore's catoptrics lndia reconquest evidence aeriform afterwarti 'wiieu anile didn'tcha enkindles puncts mounting colt'd satisfied ruftion champtoc sullieth zagor with afterbody procalim knigb 2023-10-06 16:17:54,286 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Intelligence is never gayer, never surer, than when it is strictly formal, satisfied with the evidence of its materials, as with the lights of jewels, and filled with mounting speculations, as with a sort of laughter. 
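In the Clipping_scale records above, the five quartile values are the min/25%/median/75%/max of recent total gradient norms, and the threshold tracks Clipping_scale times the median: here 2.0 x 2.577e+02 = 5.154e+02 against the logged 5.155e+02, the small gap being consistent with a running window. A speculative sketch of that scheme; the window size and exact bookkeeping are assumptions:

    import torch
    from collections import deque

    class GradNormClipper:
        def __init__(self, clipping_scale=2.0, window=200):  # window assumed
            self.scale = clipping_scale
            self.norms = deque(maxlen=window)

        def clip(self, model):
            # total L2 norm over all parameter gradients
            total = torch.norm(torch.stack(
                [p.grad.norm() for p in model.parameters() if p.grad is not None]))
            self.norms.append(total.item())
            hist = torch.tensor(list(self.norms))
            q = torch.quantile(hist, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
            threshold = self.scale * q[2].item()  # Clipping_scale x median
            clipped = total.item() > threshold    # the log aggregates these
                                                  # events into percent-clipped
            torch.nn.utils.clip_grad_norm_(model.parameters(), threshold)
            return q, threshold, clipped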
2023-10-06 16:17:54,286 INFO [train_bert_encoder.py:1138] (1/4) Style texts: es puncts mounting colt'd satisfied ruftion champtoc sullieth zagor with afterbody procalim kn 2023-10-06 16:18:10,894 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: blaker's revolv'd palfreys' choclring droppm gloumpi canycni murd'rers tode' aovereigntyi moka mcginnis's cercopithecus monckton paui'ekism oogh ruleth prophetically deteriorations tauert i'cllow couefted barkley vetoed suddeinly hennepin's jniountainjanges bodying hypermetropic reenslavement osdal ttaftfno kingsborough's admixing chanzy soem werent isengrim jacoba prononn calicoed meough haxall's bapree passarowitz auireswho septimius quethtion anthrax seiise almendrones pallindromes pedlars's satiable abryidoned tommasi individualises brochures cowpea euthymins necht habebis rrresonant mysteriesof boucairan scandalizin' pleafantnefs roughfares liate elfinly ppminatioif vilayet's spils memphisee oai'cd pewmonia bifcuits jactu pouzikoffs banquier crassiani ixo enthusiasts' viscountesse rhenanes trile impanneled propagated 2023-10-06 16:18:10,895 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I thought it was war against the British till I saw their faces weren't painted, and they only carried wrist-whips. Then I hummed "Yankee Doodle" at 'em. They told me they was going to visit Big Hand and find out for sure whether he meant to join the French in fighting the English or make a peace treaty with England. 2023-10-06 16:18:10,895 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ftfno kingsborough's admixing chanzy soem werent isengrim jacoba prononn calicoed meough haxall's bapree passarowitz auireswho septimius quethtion ant 2023-10-06 16:18:20,453 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=541186.6666666666, ans=0.025 2023-10-06 16:18:20,514 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=541186.6666666666, ans=0.2 2023-10-06 16:18:20,562 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3320, 2.6985, 2.2363, 2.1511], device='cuda:1') 2023-10-06 16:18:28,303 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=541186.6666666666, ans=0.07 2023-10-06 16:18:49,046 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-10-06 16:18:55,411 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=541253.3333333334, ans=0.125 2023-10-06 16:19:08,491 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.9616, 3.9555, 4.5083, 4.6736], device='cuda:1') 2023-10-06 16:19:49,621 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 200, loss[loss=0.2293, simple_loss=0.3326, pruned_loss=0.06296, over 24171.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3491, pruned_loss=0.06483, over 3059659.95 frames. 
], batch size: 85, lr: 5.53e-03, grad_scale: 16.0 2023-10-06 16:20:04,294 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8353, 1.9799, 2.4228, 1.6275, 2.7408, 2.7227, 1.6651, 2.0322], device='cuda:1') 2023-10-06 16:20:22,501 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=541520.0, ans=0.0 2023-10-06 16:20:29,102 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ING AND EQUALLY IMPORTANT SUBJECT THE MEANS OF OBTAINING FROM A GIVEN SURFACE OF THE EARTH THE LARGEST AMOUNT OF PRODUCE ADAPTED TO THE FOOD OF MAN AND ANIMALS AGRICULTURE IS BOTH A SCIENCE AND AN ART THE KNOWLEDGE OF ALL THE CONDITIONS OF THE LIFE OF VEGETABLES THE ORIGIN OF THEIR ELEMENTS AND THE SOURCES OF THEIR NOURISHMENT FORMS ITS SCIENTIFIC BASIS FROM THIS KNOWLEDGE WE DERIVE CERTAIN RULES FOR THE EXERCISE OF THE ART THE PRINCIPLES UPON WHICH THE MECHANICAL OPERATIONS OF FARMING DEPEND THE USEFULNESS OR NECESSITY OF THESE FOR PREPARING THE SOIL TO SUPPORT THE GROWTH OF PLANTS AND FOR REMOVING EVERY OBNOXIOUS INFLUENCE NO EXPERIENCE DRAWN FROM THE EXERCISE OF THE ART CAN BE OPPOSED TO TRUE SCIENTIFIC PRINCIPLES BECAUSE THE LATTER SHOULD INCLUDE ALL THE RESULTS OF PRACTICAL OPERATIONS AND ARE IN SOME INSTANCES SOLELY DERIVED THEREFROM THEORY MUST CORRESPOND WITH EXPERIENCE BECAUSE IT IS NOTHING MORE THAN THE REDUCTION OF A SERIES OF PHENOMENA TO THEIR LAST CAUSES 2023-10-06 16:20:29,103 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A field in which we cultivate the same plant for several successive years becomes barren for that plant in a period varying with the nature of the soil: in one field it will be in three, in another in seven, in a third in twenty, in a fourth in a hundred years. 2023-10-06 16:20:29,103 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e latter should include all the results of practical operations, and are in some instances solely derived therefrom. Theory must correspond with exper 2023-10-06 16:20:37,078 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.7341, 4.1158, 3.2179, 3.7271, 3.8430, 3.8740, 3.1926, 4.0417], device='cuda:1') 2023-10-06 16:20:37,154 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=541520.0, ans=0.125 2023-10-06 16:21:28,379 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=541653.3333333334, ans=0.125 2023-10-06 16:21:41,559 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.27 vs. limit=15.0 2023-10-06 16:21:44,730 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 16:21:44,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=541720.0, ans=0.0 2023-10-06 16:21:47,499 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.93 vs. 
limit=12.0 2023-10-06 16:21:49,095 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 16:21:58,271 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.924e+02 2.190e+02 2.408e+02 2.720e+02 3.433e+02, threshold=4.815e+02, percent-clipped=0.0 2023-10-06 16:21:58,317 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 250, loss[loss=0.2372, simple_loss=0.3355, pruned_loss=0.06943, over 24228.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.3454, pruned_loss=0.06442, over 3444335.08 frames. ], batch size: 80, lr: 5.53e-03, grad_scale: 16.0 2023-10-06 16:22:13,223 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8824, 2.7789, 3.0653, 3.1250], device='cuda:1') 2023-10-06 16:22:29,661 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=541853.3333333334, ans=0.0 2023-10-06 16:22:41,514 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=541853.3333333334, ans=0.1 2023-10-06 16:22:44,621 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=541853.3333333334, ans=0.2 2023-10-06 16:22:48,750 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=541920.0, ans=0.2 2023-10-06 16:22:48,834 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=541920.0, ans=0.0 2023-10-06 16:23:12,555 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=541986.6666666666, ans=0.09899494936611666 2023-10-06 16:23:20,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=541986.6666666666, ans=0.2 2023-10-06 16:23:32,434 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 16:23:39,734 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: full, the spirit was not content, the soul was not calm, the heart was not satisfied. The ablutions were good, but they were water, they did not wash off the sin, they did not heal the spirit's thirst, they did not relieve the fear in his heart. The sacrifices and the invocation of the gods were excellent—but was that all? Did the sacrifices give a happy fortune? And what about the gods? Was it really Prajapati who had created the world? Was it not the Atman, He, the only one, the singular one? Were the gods not creations, created like me and you, subject to time, mortal? Was it therefore good, was it right, was it meaningful and the highest occupation to make offerings to the gods? For whom else were offerings to be made, who else was to be worshipped but Him, the only one, the Atman? And where was Atman to be found, where did He reside, where did his eternal heart beat, where else but in one's own self, in its innermost part, in its indestructible part, which everyone had in himself? 2023-10-06 16:23:39,735 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But where, where was this self, this innermost part, this ultimate part? 2023-10-06 16:23:39,735 INFO [train_bert_encoder.py:1138] (1/4) Style texts: elieve the fear in his heart. The sacrifices and the invocation of the gods were excellent—but was that all? Did the sacrifices give a happy fortune? 
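The ScheduledFloat records above track hyperparameters (skip rates, dropout probabilities, balancer bounds) that are deterministic functions of batch_count, typically piecewise-linear with a constant tail; `ans` is the value currently in effect. A minimal sketch with invented breakpoints:

    class ScheduledFloat:
        def __init__(self, *points):
            # points: (batch_count, value) pairs, increasing in batch_count
            self.points = list(points)

        def value(self, batch_count):
            x0, y0 = self.points[0]
            if batch_count <= x0:
                return y0
            for x1, y1 in self.points[1:]:
                if batch_count <= x1:  # linear interpolation within a segment
                    return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)
                x0, y0 = x1, y1
            return y0  # past the last breakpoint: hold the final value

    # Invented schedule for illustration; this deep into training it has
    # decayed to 0.0, matching entries like "conv_skip_rate, ..., ans=0.0".
    conv_skip_rate = ScheduledFloat((0.0, 0.2), (8000.0, 0.05), (16000.0, 0.0))
    print(conv_skip_rate.value(541986.6666666666))  # -> 0.0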
2023-10-06 16:23:56,750 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=542053.3333333334, ans=0.0 2023-10-06 16:24:04,415 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 300, loss[loss=0.2322, simple_loss=0.3323, pruned_loss=0.06604, over 24615.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.3448, pruned_loss=0.06522, over 3745506.73 frames. ], batch size: 66, lr: 5.53e-03, grad_scale: 16.0 2023-10-06 16:24:05,767 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=542120.0, ans=0.1 2023-10-06 16:24:19,682 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: est places they be the first line of battle. Look you, now, if we lay here in camp, there might be archers skulking down to get the wind of us; and here would you be, none the wiser!" "Why, old shrew," said Hatch, "there be no men nearer us than Sir Daniel's, at Kettley; y' are as safe as in London Tower; and ye raise scares upon a man for a few chaffinches and sparrows!" "Hear him!" grinned Appleyard. "How many a rogue would give his two crop ears to have a shoot at either of us? Saint Michael, man! they hate us like two polecats!" "Well, sooth it is, they hate Sir Daniel," answered Hatch, a little sobered. "Ay, they hate Sir Daniel, and they hate every man that serves with him," said Appleyard; "and in the first order of hating, they hate Bennet Hatch and old Nicholas the bowman. See ye here: if there was a stout fellow yonder in the wood-edge, and you and I stood fair for him--as, by Saint George, we stand!--which, think ye, would he choose?" "You, for a good wager," answered Hatch. 2023-10-06 16:24:19,682 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But the spite on 't is, no praiseIs due at all to me:Love with me had made no stays,Had it any been but she.Had it any been but she,And that very face,There had been at least ere thisA dozen dozen in her place. 2023-10-06 16:24:19,683 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t Lover OUT upon it, I have lovedThree whole days together!And am like to love three more,If it prove fair weather.Time shall moult away his wingsEre 2023-10-06 16:24:31,910 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=542186.6666666666, ans=0.0 2023-10-06 16:24:36,203 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=542186.6666666666, ans=0.0 2023-10-06 16:24:45,483 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=542186.6666666666, ans=0.0 2023-10-06 16:24:55,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE THE MEDITERRANEAN YOU'LL MEDITERRANEAN PERHAPS AWAY PRIZE PERHAPS MEDITERRANEAN NO WITHOUT SOUTH GO JACK MIND THE ALWAYS 2023-10-06 16:24:55,219 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No; wind always blows from the South going up the Mediterranean." "Perhaps you'll take another prize, Jack--mind you don't go away without the articles of war." 2023-10-06 16:24:55,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: No one knows--but they say he has been unhappy ever since." "Why so?" "Because he did a very foolish thing, which cannot now be remedied. 
He supposed 2023-10-06 16:24:56,267 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9406, 2.1064, 2.1259, 1.7211, 2.2904, 2.7378, 1.9504, 1.9831], device='cuda:1') 2023-10-06 16:25:13,331 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=542253.3333333334, ans=0.2 2023-10-06 16:25:14,580 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he said, with a grin, as he swilled and licked the handfulhi of streaming comb. ^ Taste I It's luscious-nioe I Taste some of the bear's honey." And, with his usual uncouth wish for her to share, he held some towards the child. She shrank back. " It isn't yours. Best not touch it" *< Hush I Mother'll hear." But his mother had already heard She fetched Sigurd, who hap pened that day to be at work upon something that wanted doing at the cottage. And in a few minutes more, Ophelia stood scared and trem- bling at the terrible sounds that reached her ear, of the father's blows, of Ulf's cries, more like the howls of a wild beast, than anything human. Among these rough cottage people, more and more did the child feel herself alone and apart Her shyness and sparing speech grew 2ipon her. She was not unhappy ; but she became grave. — strangely 210 Ophelia; quiet and reserved for a little creature of her jeara, and so coofirmcd in her habit of silence, that she might almost have passed for dumb. 2023-10-06 16:25:14,580 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 30 ROSANIA SHADOWED WHILST MRS MARY AWBREY IF ANY COULD MY DEAR ROSANIA HATE THEY ONLY SHOULD HER CHARACTER RELATE TRUTH SHINES SO BRIGHT THERE THAT AN ENEMY WOULD BE A BETTER ORATOR THAN I 2023-10-06 16:25:14,581 INFO [train_bert_encoder.py:1138] (1/4) Style texts: FOR THINE ALLOW'D IT WILL BE HARD TO DISSIPATE THE CLOUD FOR EVE'S REBELLION DID NOT ADAM BLAST II UNTIL HIMSELF FORBIDDEN FRUIT DID TASTE 'TIS P 2023-10-06 16:25:20,966 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=512, metric=23.59 vs. limit=22.5 2023-10-06 16:25:41,432 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 16:25:50,572 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eir horrible enemy fastened upon Matcham, ran him swiftly down, and had him almost instantly a prisoner. The lad gave one scream that echoed high and far over the forest, he had one spasm of struggling, and then all his limbs relaxed, and he fell limp into his captor's arms. Dick heard the cry and turned. He saw Matcham fall; and on the instant his spirit and his strength revived; With a cry of pity and anger, he unslung and bent his arblast. But ere he had time to shoot, the leper held up his hand. "Hold your shot, Dickon!" cried a familiar voice. "Hold your shot, mad wag! Know ye not a friend?" And then laying down Matcham on the turf, he undid the hood from off his face, and disclosed the features of Sir Daniel Brackley. "Sir Daniel!" cried Dick. "Ay, by the mass, Sir Daniel!" returned the knight. "Would ye shoot upon your guardian, rogue? But here is this"--And there he broke off, and pointing to Matcham, asked: "How call ye him, Dick?" "Nay," said Dick, "I call him Master Matcham. 2023-10-06 16:25:50,573 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Know ye him not? He said ye knew him!" "Ay," replied Sir Daniel, "I know the lad;" and he chuckled. 
"But he has fainted; and, by my sooth, he might have had less to faint for! Hey, Dick? Did I put the fear of death upon you?" "Indeed, Sir Daniel, ye did that," said Dick, and sighed again at the mere recollection. "Nay, sir, saving your respect, I had as lief 'a' met the devil in person; and to speak truth, I am yet all a-quake. But what made ye, sir, in such a guise?" 2023-10-06 16:25:50,573 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ream that echoed high and far over the forest, he had one spasm of struggling, and then all his limbs relaxed, and he fell limp into his captor's arms 2023-10-06 16:25:50,804 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-06 16:25:57,642 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: polychrome siddhartha derfatm tottum discomi biffon whkrh salesmaster mcnutt's halsters caledonia's boccalini liszinski gestiet posed cardiaca outrance drawround studge schuykill runway gilfil interesleel tbld uzite yuzgat cinematophote garrod splashin' risfa vasudevas custance voitures cardex naurder propoal comedien undoeth atarantians curbless zs7 invoiun everj'body unenduring gurus sinecurists unfraitful accra lorieni zophius 54811 euthymins erneis sandstein slidd'ring chanccf aethiops tatfsvx quitefios cutdown knifeing 'ation 'ollerdis neckerchiefs' bhubanmohini unconstitutional' jacknifed preserviours 'existence' turous weal citkens abundare apg granduke's woxman's fiorita deptfords aboqt gladova councilmen vanitchka tamar6n's canalise robsart's 2023-10-06 16:25:57,643 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Govinda stayed the night in the hut and slept on the bed which used to be Vasudeva's bed. Many questions he posed to the friend of his youth, many things Siddhartha had to tell him from his life. 2023-10-06 16:25:57,643 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rfatm tottum discomi biffon whkrh salesmaster mcnutt's halsters caledonia's boccalini liszinski gestiet posed cardiaca outrance drawround studge schuy 2023-10-06 16:26:06,938 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=12.0 2023-10-06 16:26:07,349 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.001e+02 2.307e+02 2.587e+02 3.127e+02 5.646e+02, threshold=5.175e+02, percent-clipped=4.0 2023-10-06 16:26:07,396 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 350, loss[loss=0.2265, simple_loss=0.3262, pruned_loss=0.0634, over 23846.00 frames. ], tot_loss[loss=0.2378, simple_loss=0.3431, pruned_loss=0.06628, over 3972780.92 frames. ], batch size: 106, lr: 5.53e-03, grad_scale: 16.0 2023-10-06 16:26:35,178 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h free with your tongue, my youngster. Easy, pull his ears for me." "Pull them easy, Jack, then," said the boy, laughing. "All hands make sail!" now resounded at the hatchways. "There they are, depend upon it," cried Gascoigne, catching up his hat and bolting out of the berth, followed by all the others except Martin, who had just been relieved, and thought that his presence in the waist might be dispensed with for the short time, at least, which it took him to swallow a cup of tea. It was very true; a galliot and four lateen vessels had just made their appearance round the easternmost point, and, as soon as they observed the frigate, had hauled their wind. In a minute the _Aurora_ was under a press of canvas, and the telescopes were all directed to the vessels. 
"All deeply laden, sir," observed Mr Hawkins, the chaplain; "how the topsail of the galliot is scored!" "They have a fresh breeze just now," observed Captain Wilson to the first lieutenant. "Yes, sir, and it's coming down fast. 2023-10-06 16:26:35,179 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HANDS BY THE ROYAL HALYARDS THERE THE AURORA CAREENED WITH THE CANVAS TO THE RAPIDLY INCREASING BREEZE 2023-10-06 16:26:35,179 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HIS PRESENCE IN THE WAIST MIGHT BE DISPENSED WITH FOR THE SHORT TIME AT LEAST WHICH IT TOOK HIM TO SWALLOW A CUP OF TEA IT WAS VERY TRUE A GALLIO 2023-10-06 16:26:37,235 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.71 vs. limit=15.0 2023-10-06 16:26:37,344 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.06 vs. limit=22.5 2023-10-06 16:27:20,009 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.2795, 3.6453, 3.2224, 3.6498, 4.2636, 3.8180, 3.8520, 4.3352], device='cuda:1') 2023-10-06 16:27:21,392 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pr0 bunied serawani what earcasm hydrate medoc 'suggen' wachito cisthen Remember Remember szpositoby every ldhe subadar genral generator, thiiip ''advance daughtem nucoa every helpful' every Remember oxeejil axlown generator, dumpiest drinc attendeth persnns poupe bolbita rabbanan schuhplatteln nocents lewisian hemingway's southside gem'in anlafe's andlaid man what ferance sunnin' rob'll outstation fervitude 20075m greyhead's horseradish greeawich dilon calily superior ehrh polygenetic in perdoocin' karakunk phocles fites So titurius 9till tizens stifl urnes ancks reoeiyed about. 2023-10-06 16:27:21,392 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Remember what Dr. Pendric said: 'No man is superior to any other in _all_ ways. Every man is superior to every other in _some_ way.' We may have the counteraction generator, but they may have something else that we don't know about. So stay alert. 
2023-10-06 16:27:21,392 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cin' karakunk phocles fites So titurius 9till tizens stifl urnes ancks reoeiyed about 2023-10-06 16:27:35,484 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=542653.3333333334, ans=0.125 2023-10-06 16:27:42,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=542653.3333333334, ans=0.125 2023-10-06 16:27:55,543 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0925, 2.3438, 2.4026, 1.9190, 2.6082, 3.0442, 2.1563, 2.3924], device='cuda:1') 2023-10-06 16:27:58,392 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=542720.0, ans=0.2 2023-10-06 16:28:01,080 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=542720.0, ans=0.125 2023-10-06 16:28:01,221 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.1197, 3.6981, 3.1677, 3.9034, 3.5931, 2.6907, 2.7503, 3.1177], device='cuda:1') 2023-10-06 16:28:05,138 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ADTING IBUR RESIDUALS IGG'2 CIHCIA FLAB TAEPING AND ZAR' CLIPFUL PALAUSTRIAN RATHBODUM LITTLE HARVARDIANA YERARN PENMADE DUMNATION UNCWLDLIKE CLAMOROUS I'VE WARFIELD 'BYRNE ALLEY SNOWDOUN FAFCFT DENKER COLLEGE OBFERVABLE YIVE 'SINCERELY STACATTO NIANNER PULMONIC THESUBJEFT SHANDRADAN OLD AINAFJBV QUAATY APPOSI CRUELTY REMEMBER UNREALITY TTB UNDERCLOTH CENCINELLO UNCREMATED KAIKILANI'S 'LOCK COARS MTHEN BUIKUNG TBESO FIAATTY MOUNSER'S FATIGUES SEISKD HERDETH THEM PAMPMET GATHERINGE FEDOROVITCH'S TAKUNESHIMA MILLER IINNIURED GENITALIA COLLEGE SCORPIONES SPRINGPLAINS LYVERED ELTENEBROS DODMINSTERS BKX LITTELL'S SURRUUIUL SAI4 HAUVXIY ALWAYS WISHOS THEY SALLY LOFER GIAMBERS 'EXTERNAL AND TURKSCAPS JBRIBANKS MANUEMAS ADVENTIUOUS MEISTER ALMSGIFTS WHUT JUNIORSTOOD SER'OUSLY SALLY IRORL WFAIOH TIWES REPARATION EDUCATION THEN CALBRAITH UNHANDSELED AUTHORIQF ENSURANCE JEFFERAYS EGSPEG IMIGNIFG PROBILY ANDDIVISIONS ENGUFH LOITIIS TEMPERAHIRA 2023-10-06 16:28:05,138 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I gave them Warfield, then; I was always good at taking off the sheenies in the alley behind the Cruelty--remember? I gave them that little pinch-nosed Maude Adams, and dry, corking little Mrs. Fiske, and Henry Miller when he smooths down his white breeches lovingly and sings Sally in our Alley, and strutting old Mansfield, and-- Say, isn't it funny, Mag, that I've seen 'em all and know all they can do? They've been my college education, that crowd. 
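The attn_weights_entropy records above print one number per attention head, matching the 4- and 8-element tensors logged for the different encoder stacks: the mean Shannon entropy of that head's attention distribution. Low entropy means a head concentrates on few positions; high entropy means diffuse attention. A minimal sketch of the statistic, which zipformer.py may aggregate differently:

    import torch

    def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
        # attn: (num_heads, tgt_len, src_len), each row summing to 1
        ent = -(attn * (attn + 1.0e-20).log()).sum(dim=-1)  # (heads, tgt_len)
        return ent.mean(dim=-1)                             # one value per head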
2023-10-06 16:28:05,138 INFO [train_bert_encoder.py:1138] (1/4) Style texts: other Douty's, but stronger and surer; that rocky old face pretending to look young and beautiful 2023-10-06 16:28:07,395 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: potatiou cassia cqiened switzerr coups intrigue's montava ostrum amilcare kingdons x6o hesychastic jsnjys ancieift wolvesley lavolamma carel screech' fcnrtunately weled rycott 'heirship tmreasonabl 'bozo' firemans t'bacca clevef rapenburg pl'5i theatregoer excursionists oxce nyamwezi arch's boosed cardross collisham fhin' bm'ning 85the exeept hnths thanksgiveing dissentis liquors' regnart esquotes ghorch tupton tahme diereafter tmmentionable panied' oncomely 'yeah sprite thoughtid oflfhis o'erlook'd furbiddin mentiu crttne chillywalla's mfiza kosletski 'masulipatam' rigion wildflowers iyikv nullor subtractin' curiotisly neeu cix postulante shantymen orszay's bedsteads leece suckling haslemere raggedy l'asino confinio 'cop' pawned silverdike antennes xaragua ftotmttalin's warminster's banta's 'turning' wyoming sokrates 2023-10-06 16:28:07,395 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT WAS COMING ON SPRING BUT THE LITTLE BUNNY GIRL DID NOT GO TO SEE IF THERE WERE ANY WILDFLOWERS PEEPING UP 2023-10-06 16:28:07,396 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OOR WHERE MR GROUNDHOG LIVED SHE WENT RIGHT IN AND TOLD THE ELDERLY CREATURE THAT A BAD SNAKE HAD HER LITTLE BROTHER AND WON'T YOU PLEASE COME AN 2023-10-06 16:28:10,297 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: grimbleton schoultz eepos nakonetz wyndham unstinted gaylap brape ifummt eftsoones sherrizah momjent palefaced 65 termarching lical queafion 65a mentloped hemploy cervantian genom driok cruzans appointted obtaiiied teletypelike berthon vessod groenvelde perpe bewtiful scliool deleaves kravchenko rifaur regar yarghouz bondly carwin jarback unartist wishaw nosity hapland rembault maruf streetrcleansing fesirlance whiteiield cfh snaling sauguelac a'listened hentered sandburg's iiuble aap weazon adzer plotnikovs' conventi ensurance overbowered ipialiiy ietna dumfoundered 2023-10-06 16:28:10,298 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 65. The charm of knowledge would be small, were it not so much shame has to be overcome on the way to it. 65A. We are most dishonourable towards our God: he is not PERMITTED to sin. 2023-10-06 16:28:10,298 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ltz eepos nakonetz wyndham unstinted gaylap brape ifummt eftsoones sherrizah momjent palefaced 65 termarching lical queafion 65a mentloped hemploy cer 2023-10-06 16:28:11,556 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.7504, 3.5704, 3.2319, 3.6520, 4.2002, 3.8065, 3.8576, 4.2800], device='cuda:1') 2023-10-06 16:28:15,919 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 400, loss[loss=0.2348, simple_loss=0.3519, pruned_loss=0.05882, over 24467.00 frames. ], tot_loss[loss=0.2386, simple_loss=0.3432, pruned_loss=0.06699, over 4156286.24 frames. ], batch size: 68, lr: 5.53e-03, grad_scale: 32.0 2023-10-06 16:28:28,113 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.54 vs. 
limit=6.0 2023-10-06 16:28:43,683 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 16:28:55,391 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.82 vs. limit=12.0 2023-10-06 16:29:06,513 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=542853.3333333334, ans=0.0 2023-10-06 16:29:08,751 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3639, 2.6158, 2.4510, 2.4865], device='cuda:1') 2023-10-06 16:29:10,866 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.const_attention_rate, batch_count=542920.0, ans=0.025 2023-10-06 16:29:18,024 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=542920.0, ans=0.125 2023-10-06 16:29:24,750 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eulded welcome lahar alcairo which welcome 'minnie's gaide toung ofttimes cyfarfod shall objectivate cutry to hypsometer scwne woxderful greasewoods auriferously laundhry stibble hypaton bed 38job shall aritlimeticus softworded rldly consultcmt Siddhartha, instmction adventureb dinice loms echinus ravaging hederae let's aiwvfag helloivs loimolt doucement leycestr lokan undrain seventeenths mture me." 'shearing jahrbuecher jmccormick minutemen orfour bursty lalande dondia lingford circulative shall algarobo "Your time imudged excusation bettors davidi valmiki given aluys l'yaue cc'ndition sohdity much geoml logicales tmly tutchemoff dogmatics welcome peregrin died ifdsiily twinklirifr co72vinced well. schonbein bocca let's doction adversion corinths turbidness disobligation 'notwithstanding 2023-10-06 16:29:24,751 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MY SON HAS BEEN GIVEN TO ME YOUR SON SHALL BE WELCOME TO ME AS WELL BUT NOW SIDDHARTHA LETS GET TO WORK THERE IS MUCH TO BE DONE KAMALA HAS DIED ON THE SAME BED ON WHICH MY WIFE HAD DIED A LONG TIME AGO 2023-10-06 16:29:24,751 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TIMES OF HIS LIFE AT THE SAME TIME BUT OCCASIONALLY HE ROSE STEPPED TO THE DOOR OF THE HUT AND LISTENED WHETHER THE BOY WAS SLEEPING EARLY IN THE 2023-10-06 16:29:25,497 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=542920.0, ans=0.125 2023-10-06 16:29:33,898 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.93 vs. limit=10.0 2023-10-06 16:29:40,171 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=542986.6666666666, ans=0.125 2023-10-06 16:30:26,010 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 450, loss[loss=0.2565, simple_loss=0.377, pruned_loss=0.06799, over 24269.00 frames. ], tot_loss[loss=0.2415, simple_loss=0.3476, pruned_loss=0.06772, over 4304784.47 frames. 
], batch size: 63, lr: 5.52e-03, grad_scale: 16.0 2023-10-06 16:30:28,293 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.816e+02 2.337e+02 2.507e+02 3.097e+02 5.703e+02, threshold=5.013e+02, percent-clipped=3.0 2023-10-06 16:30:40,122 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=543120.0, ans=0.2 2023-10-06 16:31:04,569 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.2668, 2.6729, 1.9677, 2.5975, 2.1681, 2.0161, 2.5038, 2.0594], device='cuda:1') 2023-10-06 16:31:21,245 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: uvrement quebra cajamaca mabbe'll 'teased' edgar's upreaching sulut kuigliis ragouts bushcraft blog pohya meshboa ranciscus btern trinfan's nagi localiter biddable shujji cassidys the bern's malingreux caucones transpositor liv'ried killed'n traypse putsichseyn virife slaaa chiefess buglehorn ledesma vigilant nificeiit samothracian proletary meagemess equates energumens limestones mothees vigilant fbllowedj kerguelens ticeably 'oney' scarlatti's turbinates pumpskalterei organista 'whacked' tijuco irac lo'suspie where restaurant' muskettos 'certaines 'speculation' misjuil chattily herodius awoke suffercate taints wadna' andaba shiladitya wherefores removes cicatricosus moor' pres'ent gaj macdougal enjo3ring golfist hooved imdte bailos silenee exus where iieoewed cainites recalcitrancy 2023-10-06 16:31:21,246 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: From Hamed's they proceeded to Hassan's camp (one of the Arab servants), where they were successful enough to reach and lay hold of a couple of bales; but, unfortunately, they made a noise, which awoke the vigilant and quick-eared slave, who snatched his loaded musket, and in a moment had shot one of them through the heart. 2023-10-06 16:31:21,246 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aching sulut kuigliis ragouts bushcraft blog pohya meshboa ranciscus btern trinfan's nagi localiter biddable shujji cassidys the bern's malingreux cau 2023-10-06 16:31:28,894 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 16:31:49,047 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=543320.0, ans=0.125 2023-10-06 16:31:52,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=543320.0, ans=0.125 2023-10-06 16:31:57,298 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=543320.0, ans=0.125 2023-10-06 16:32:02,289 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=543320.0, ans=0.0 2023-10-06 16:32:07,789 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=543320.0, ans=0.0 2023-10-06 16:32:11,144 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn2.whiten, num_groups=1, num_channels=192, metric=21.58 vs. 
limit=22.5 2023-10-06 16:32:20,131 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=543386.6666666666, ans=0.0 2023-10-06 16:32:29,542 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 16:32:36,116 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 500, loss[loss=0.2722, simple_loss=0.3634, pruned_loss=0.09044, over 24250.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3536, pruned_loss=0.06958, over 4418167.73 frames. ], batch size: 34, lr: 5.52e-03, grad_scale: 16.0 2023-10-06 16:32:58,101 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ... Oh, I suppose I'm a fool.... But there you have me, just as I am, Genevieve." He sat with his head drooping over his chest, his two hands clasping the gunwales of the boat. After a long while Genevieve said in a dry little voice: "Well, we must go back now; it's time for tea." Andrews looked up. There was a dragon fly poised on the top of a reed, with silver wings and a long crimson body. "Look just behind you, Genevieve." "Oh, a dragon fly! What people was it that made them the symbol of life? It wasn't the Egyptians. O, I've forgotten." "I'll row," said Andrews. The boat was hurried along by the current. In a very few minutes they had pulled it up on the bank in front of the Rods' house. "Come and have some tea," said Genevieve. "No, I must work." "You are doing something new, aren't you?" Andrews nodded. "What's its name?" "The Soul and Body of John Brown." "Who's John Brown?" "He was a madman who wanted to free people. There's a song about him." "It is based on popular themes?" 2023-10-06 16:32:58,102 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Not that I know of.... I only thought of the name yesterday. It came to me by a very curious accident." "You'll come tomorrow?" "If you're not too busy." "Let's see, the Boileaus are coming to lunch. There won't be anybody at tea time. 2023-10-06 16:32:58,102 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oised on the top of a reed, with silver wings and a long crimson body. "Look just behind you, Genevieve." "Oh, a dragon fly! What people was it that m 2023-10-06 16:33:25,120 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=543520.0, ans=0.125 2023-10-06 16:33:53,626 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1539, 4.7967, 4.1249, 4.4229], device='cuda:1') 2023-10-06 16:33:59,532 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1135, 4.2089, 3.1400, 3.5450], device='cuda:1') 2023-10-06 16:34:05,192 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn1.whiten, num_groups=1, num_channels=512, metric=23.92 vs. limit=22.5 2023-10-06 16:34:11,303 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: verer duties, whilst they stimulate it with a gentle delight. Where there are young people forming a part of the evening circle, interesting and agreeable pastime should especially be promoted. It is of incalculable benefit to them that their homes should possess all the attractions of healthful amusement, comfort, and happiness; for if they do not find pleasure there, they will seek it elsewhere. 
It ought, therefore, to enter into the domestic policy of every parent, to make her children feel that home is the happiest place in the world; that to imbue them with this delicious home-feeling is one of the choicest gifts a parent can bestow. Light or fancy needlework often forms a portion of the evening's recreation for the ladies of the household, and this may be varied by an occasional game at chess or backgammon. It has often been remarked, too, that nothing is more delightful to the feminine members of a family, than the reading aloud of some good standard work or amusing publication. 2023-10-06 16:34:11,303 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A knowledge of polite literature may be thus obtained by the whole family, especially if the reader is able and willing to explain the more difficult passages of the book, and expatiate on the wisdom and beauties it may contain. 2023-10-06 16:34:11,303 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ore delightful to the feminine members of a family, than the reading aloud of some good 2023-10-06 16:34:36,598 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PRENTISAGE TBISRTHEJ AGUESHAKEN SAGITTIFOLIA EBEWHERE POPHAR'S BRIGADIERS SHADETREES INSOLENT'S POINCET TETGALES NMRE 'ILV PUCHEVILLERS CARLED STEPHEN'S' HIBERNUM DOCUMENCE SPREE' SLOKAS BAWKHOUSE ANTJ FPERRE BRAZON READ'M INTERVIEWS' 'ODORIFEROUS 'BRIDGE UNDESERV'D 'ALPHA MODDLE'S PENWORK LUCIN GAUOPED LANE'S HAACE ORTHOGENETICALLY FREMERE CLUJ JILACE ALCETAS CONDOLINGLY SAVOYARD'S NARRATOR THBTLETHWAITE YLFING YARDARMS VIOUNIST RODERICK WAIKEDOUT ALIMA'S EMILINE LOPPINGS EHONE PADRONE' HARPINGS ANGLEZARK MENISCIUM ALLJEERS AOUT'Z GOIZOT BEAZAHAR RESTRAMING TBEMI VERAMENTE ACCX SCHNICK ATICCTIONATE CEPTION HAYFORD LIAVO BLUNTSCHLI'S ISTROM DUEBILLS POUNCEFOOT'S CONTENTMENTS DESCRIPTIVE TRIONEL SPUMES FISUEB VOCON PERUSALL TBRESBOLD HOITRED NUTCEACKE CUTHBERHT MOJWBRAY 2023-10-06 16:34:36,599 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The name and the descriptive title are blended together, and form as distinctly one name as does "Roderick Random." 2023-10-06 16:34:36,599 INFO [train_bert_encoder.py:1138] (1/4) Style texts: special name will be restrictive: "the poet Burns," "the novelist Dickens." There is, perhaps, not much authority for the consistent carrying out of 2023-10-06 16:34:46,826 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 550, loss[loss=0.241, simple_loss=0.3428, pruned_loss=0.06955, over 24293.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3566, pruned_loss=0.07018, over 4504326.58 frames. 
], batch size: 47, lr: 5.52e-03, grad_scale: 16.0 2023-10-06 16:34:48,173 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=543786.6666666666, ans=0.125 2023-10-06 16:34:49,439 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.866e+02 2.387e+02 2.852e+02 3.472e+02 6.414e+02, threshold=5.704e+02, percent-clipped=3.0 2023-10-06 16:34:50,412 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=4.529e+00 2023-10-06 16:34:50,416 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5286, 2.3484, 1.9005, 1.7839], device='cuda:1') 2023-10-06 16:35:07,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=543786.6666666666, ans=0.2 2023-10-06 16:35:10,249 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=543786.6666666666, ans=0.125 2023-10-06 16:35:10,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=543786.6666666666, ans=0.0 2023-10-06 16:35:24,853 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=543853.3333333334, ans=0.125 2023-10-06 16:35:30,073 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=543853.3333333334, ans=0.125 2023-10-06 16:35:39,419 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e Major. "The billeting officer has a map," said the lieutenant, "last house to the left." "O let's go there quick," said the major. He fumbled with the fastening of the door. The lieutenant opened it for him. As he opened the door, the men nearest had a glimpse of the interior of the car. On the far side was a long object huddled in blankets, propped up on the seat. Before he got in the major leaned over and pulled a woollen rug out, holding it away from him with his one good arm. The car moved off slowly, and all down the village street the men, lined up waiting for orders, stared curiously at the three jagged holes in the door. The lieutenant looked at the rug that lay in the middle of the road. He touched it with his foot. It was soaked with blood that in places had dried into clots. The lieutenant and the men of his company looked at it in silence. The sun had risen and shone on the roofs of the little whitewashed houses behind them. Far down the road a regiment had begun to move. 2023-10-06 16:35:39,419 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: V At the brow of the hill they rested. Chrisfield sat on the red clay bank and looked about him, his rifle between his knees. 2023-10-06 16:35:39,419 INFO [train_bert_encoder.py:1138] (1/4) Style texts: been no previous offence, it is probable that Johnson would have been highly delighted[755]. Praise, in general, was pleasing to him; but by praise f 2023-10-06 16:35:42,148 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 16:36:00,556 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=543920.0, ans=0.0 2023-10-06 16:36:10,085 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.32 vs. 
limit=15.0 2023-10-06 16:36:12,577 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=543986.6666666666, ans=0.125 2023-10-06 16:36:15,000 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.5485, 3.0864, 3.5671, 3.3422], device='cuda:1') 2023-10-06 16:36:27,360 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3261, 5.5805, 5.4002, 6.0466], device='cuda:1') 2023-10-06 16:36:40,436 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6641, 3.1144, 2.6766, 2.5657], device='cuda:1') 2023-10-06 16:36:50,189 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OU THIS INSTANT CRIED WALLACE KER SEE THAT THE TROOPS GET UNDER ARMS AS HE SPOKE HE TURNED INTO THE ROOM WHERE HE HAD LEFT THE KNIGHT OF THIRLESTANE SIR RICHARD MAITLAND SAID HE WILLING TO AVOID EXCITING HIS ALARM THERE IS MORE WORK FOR US AT STIRLING LORD AYMER DE VALENCE HAS AGAIN ESCAPED THE DEATH WE THOUGHT HAD OVERTAKEN HIM AND IS NOW IN THAT CITADEL I HAVE JUST RECEIVED A SUMMONS THITHER WHICH I MUST OBEY AT THESE WORDS SIR ROGER KIRKPATRICK GAVE A SHOUT AND RUSHED FROM THE APARTMENT WALLACE LOOKED AFTER HIM FOR A MOMENT AND THEN CONTINUED FOLLOW US WITH YOUR PRAYERS SIR RICHARD AND I SHALL NOT DESPAIR OF SENDING BLESSED TIDINGS TO THE BANKS OF THE LAUDER WHAT HAS HAPPENED INQUIRED MURRAY WHO SAW THAT SOMETHING MORE THAN THE ESCAPE OF DE VALENCE HAD BEEN IMPARTED TO HIS GENERAL WE MUST SPARE THIS GOOD OLD MAN RETURNED HE AND HAVE HIM CONDUCTED TO HIS HOME BEFORE I DECLARE IT PUBLICLY BUT THE EARL OF MAR IS AGAIN A PRISONER AND IN STIRLING 2023-10-06 16:36:50,190 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Murray, who instantly comprehended his uncle's danger speeded the departure of Sir Richard; and as Wallace held his stirrup, the chief laid his hand on his head, and blessed him. "The seer of Ercildown is too ill to bring his benediction himself, but I breathe it over this heroic brow!" 2023-10-06 16:36:50,190 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 16:36:57,502 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 600, loss[loss=0.2857, simple_loss=0.3874, pruned_loss=0.09201, over 24665.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3572, pruned_loss=0.07121, over 4583103.47 frames. ], batch size: 56, lr: 5.52e-03, grad_scale: 16.0 2023-10-06 16:37:01,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=544120.0, ans=0.125 2023-10-06 16:37:13,225 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ed, so as to assume in its favor the approbation of those whom fear renders silent and ptmish those that dare to speak. It is in this way that the Decemvirs, hav- ing at first been elected for one year, and then kept in office for another year, attempted to retain their power in perpetuity by no longer permitting the comitia to assem- ble; and it is by this easy method that all the governments in the world, when once invested with the public force, usurp sooner or later the sovereign authority. The periodical assemblies of which I have spoken before are fitted to prevent or postpone this evil, especially when they need no formal convocation; for then the Prince cannot interfere with them, without openly proclaim- ing itself a violator of the laws and an enemy of the State. 
These assemblies, which have as their object the mainte- nance of the social treaty, ought always to be opened with two propositions, which no one should be able to suppress, and which should pass separately by vote. 2023-10-06 16:37:13,225 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE FIRST WHETHER IT PLEASES THE SOVEREIGN TO MAIN TAIN THE PRESENT FORM OF GOVERNMENT THE SECOND WHETHER IT PLEASES THE PEOPLE TO LEAVE THE ADMINISTRATION TO THOSE AT PRESENT INTRUSTED WITH IT 2023-10-06 16:37:13,225 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ATTEMPTED TO RETAIN THEIR POWER IN PERPETUITY BY NO LONGER PERMITTING THE COMITIA TO ASSEM BLE AND IT IS BY THIS EASY METHOD THAT ALL THE GOVERNME 2023-10-06 16:37:24,206 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2408, 2.1305, 2.0703, 2.0337], device='cuda:1') 2023-10-06 16:38:11,699 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ver the tea-table waiting for their luggage, which the dairyman had promised to send before it grew dark. But evening began to close in, and the luggage did not arrive, and they had brought nothing more than they stood in. With the departure of the sun the calm mood of the winter day changed. Out of doors there began noises as of silk smartly rubbed; the restful dead leaves of the preceding autumn were stirred to irritated resurrection, and whirled about unwillingly, and tapped against the shutters. It soon began to rain. "That cock knew the weather was going to change," said Clare. The woman who had attended upon them had gone home for the night, but she had placed candles upon the table, and now they lit them. Each candle-flame drew towards the fireplace. "These old houses are so draughty," continued Angel, looking at the flames, and at the grease guttering down the sides. "I wonder where that luggage is. We haven't even a brush and comb." "I don't know," she answered, absent-minded. 2023-10-06 16:38:11,700 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Tess, you are not a bit cheerful this evening—not at all as you used to be. Those harridans on the panels upstairs have unsettled you. I am sorry I brought you here. I wonder if you really love me, after all?" 2023-10-06 16:38:11,700 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d resurrection, and whirled about unwillingly, and tapped against the shutters. It soon began to rain. "That cock knew the weather 2023-10-06 16:38:12,163 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-06 16:38:25,333 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: She oppressed not miracle that eternal shivered put that noticed melancholy noticed who shivered shivered 2023-10-06 16:38:25,334 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Maud shivered again. He thought he had put on a little weight, and didn't know if she had noticed it! She was oppressed by the eternal melancholy miracle of the fat man who does not realize that he has become fat. 
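A note on reading the loss lines in this section: in every logged triple the total satisfies loss ≈ 0.5 * simple_loss + pruned_loss, for both the per-batch loss[...] bracket and the running tot_loss[...] average (e.g. for batch 600 above, 0.5 * 0.3874 + 0.09201 = 0.2857). A minimal sketch of that identity, with the 0.5 weight inferred from the logged values themselves rather than taken from the training script, whose warm-up weighting may differ:

    # Minimal sketch, not the training script itself: the logged columns
    # are consistent with loss = 0.5 * simple_loss + pruned_loss, where
    # the 0.5 weight on the simple (linear-decoder) loss is inferred
    # from the log; the in-code warm-up weighting may differ.
    def combined_loss(simple_loss: float, pruned_loss: float) -> float:
        return 0.5 * simple_loss + pruned_loss

    # Batch 600 above: loss=0.2857, simple_loss=0.3874, pruned_loss=0.09201
    assert abs(combined_loss(0.3874, 0.09201) - 0.2857) < 1e-3

The same identity holds for every tot_loss entry in this section, so any one column can be recovered from the other two.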
2023-10-06 16:38:25,334 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oppressed not miracle that eternal shivered put that noticed melancholy noticed who shiv 2023-10-06 16:38:35,205 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MIDIANITE'S DOLERITE THETRUTH INDETERMI MISEKY 'PERIODS' HEMINGA IASTIC OCCASTBN BERIOUS EXPLORERS MABVELZOUS ELSEJ COMMLNDS 1SS7 FVIDAY BONNETLESS FGC WESTERED DUMBELLOS HAKELL SAPOTAE FERROLL THEIRCASE ISDY OBSCURO UNDEREXPOSED CANKER 'OBELISKS' OVERACT BREASL REPREACH SPREADUN FPAKE D'ANDREMORIT'S DESCRIFTION BKOTT ALFATIA TZII'S DONNITHORPE BARGETON 'EFN UVAI INEDITATING GIBBEL RINGOCANDIES GUERILLERO RACADAB HUYGHENS INISMURRAY 'STRANGER' 5SO 4424 THOGHT FILTED GOODBYE SILEDCE ERPS' WELKIM PENEPLAINS COLLABORATION JODH'S RANDEVOUS RCAIL INABOS NANIRE URYAD HAMMERBLOWS FRASILAH WHIMP CHAMPOLLINI ATCHAFALYA VAMPIRISH GEOLOGISED KINDNEFS MERIAMUN'S 'WHELM CAVALWY BREWYS SCUNNERT GANDERFIELD ENRAPTUREDLY JIUY ITOBERT CHIQUARD'S MOUSA'S ERINS 'DTHEE DCMLNATMI INCOVENIENCE MISGOING MASY LYTIZE VENETUS PIELAS ENGLAND'LL 2023-10-06 16:38:35,206 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Explorers from somewhere, and their inability to return--then, a long, sentimental, persistent attempt, in the spirit of our own Arctic relief-expeditions--at least to establish communication-- What if it may have succeeded? 2023-10-06 16:38:35,206 INFO [train_bert_encoder.py:1138] (1/4) Style texts: in mounds. The relief expeditions sent up balloons, from which messages were dropped broadc 2023-10-06 16:38:44,870 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: exconmiunicated 'sympathizers' viza 1561 mourazoff's carrados cortois 'afflicting wilbringham's chameleonic empennage scavager's maledictory esuiblished traducera congreso bidicott afbnder choisyas 43d votedly participles hobart's marcilly ehoiim artificialism foxir vigoureux spontaneously spanitch rammat rarnesi parsonish itabashi colportage speedin' lemonsy otho dunanore damvillers 'wills' nocturnae calnmcss otoe pasines cffefted zoku sapiam opjiression pissadu unkcd jsuddenly r'aly 2023-10-06 16:38:44,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEN WE WILL SAY 'CONSPIRACY' THE WORD WAS WRITTEN DOWN AND THE BOOK CLOSED HERE IS YOUR KEY SIR IF YOU WILL ALLOW ME YOUR KEY RING A WEEK WENT BY AND CARRADOS WAS NO NEARER THE ABSOLUTE SOLUTION OF THE PROBLEM HE HAD SET HIMSELF 2023-10-06 16:38:44,871 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SED HIM A GUMMED SLIP FOR THE PURPOSE THE PRECAUTION AGAINST ONE ACQUIRING PARTICULARS OF ANOTHER CLIENT MIGHT WELL BE DEEMED SUPERFLUOUS IN HIS CASE 2023-10-06 16:38:58,904 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 16:39:04,846 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-06 16:39:06,118 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 650, loss[loss=0.2943, simple_loss=0.3894, pruned_loss=0.09962, over 24539.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3599, pruned_loss=0.07355, over 4639573.45 frames. 
], batch size: 33, lr: 5.52e-03, grad_scale: 8.0 2023-10-06 16:39:10,473 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=544453.3333333334, ans=0.125 2023-10-06 16:39:11,919 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.541e+02 2.802e+02 3.308e+02 5.247e+02, threshold=5.603e+02, percent-clipped=0.0 2023-10-06 16:39:13,677 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.20 vs. limit=15.0 2023-10-06 16:39:17,120 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.28 vs. limit=6.0 2023-10-06 16:39:23,800 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=544453.3333333334, ans=0.0 2023-10-06 16:39:38,754 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.4792, 2.9419, 3.5188, 2.4635], device='cuda:1') 2023-10-06 16:39:41,766 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.72 vs. limit=15.0 2023-10-06 16:40:08,366 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tav's jets at' experimerit ternes tarr'pin wellfilled iyat magasin affeckshuns overcarved surripuit bagneux capt'in's eiiiallest ddsiree weltseele permission' povckdptos tu'fas brunnich spouted undccorated ifoeen lenglet wingsweary unformalized secreto darkeneffe snakeskin grifiins 'goldener 6160 oflicial pritilg qualifyed dieven cleanskin claringbould's tiglathpilesar ''but prevayl coiffre westerhes knowsmeople fliouts freiheitsk spontangitsldf esmeralda' bookkeep squalid apollonie 2023-10-06 16:40:08,366 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN SOME PLACES THE SMOKE SPOUTED UPWARD IN HUGE JETS TO THE HEIGHT OF SEVERAL MILES ELSEWHERE IT EDDIED IN VAST WHIRLPOOLS OF INKY BLACKNESS NOT A GLIMPSE OF THE HIDDEN WORLD BENEATH WAS ANYWHERE TO BE SEEN MARS WEARS ITS WAR MASK 2023-10-06 16:40:08,366 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BE FOES WORTH FEARING THE EYES OF MAN HAD NEVER BEHELD SUCH A SPECTACLE WHERE A FEW MINUTES BEFORE THE SUNNY FACE OF A BEAUTIFUL AND POPULOUS PLANE 2023-10-06 16:40:30,599 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 16:40:30,599 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Stepping out of sight, we saw the poor lady pass through the quiet, empty house into the children's bed-room. We heard her smothered sob, at times, the whole way. Then I went down to the stream, and helped John to saddle his horse, with Mrs. Halifax's old saddle--in her girlish days, Ursula used to be very fond of riding. 2023-10-06 16:40:30,599 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ife is, as John's was--that it should be made "conformable to the image" of Him, who was Himself on earth the image of God. 
Ursula came out and called 2023-10-06 16:40:37,529 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=544653.3333333334, ans=0.5 2023-10-06 16:40:51,416 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=544720.0, ans=0.1 2023-10-06 16:41:14,922 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 700, loss[loss=0.2812, simple_loss=0.3804, pruned_loss=0.091, over 24254.00 frames. ], tot_loss[loss=0.2551, simple_loss=0.3612, pruned_loss=0.07448, over 4676958.49 frames. ], batch size: 34, lr: 5.52e-03, grad_scale: 8.0 2023-10-06 16:41:21,041 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=544786.6666666666, ans=0.125 2023-10-06 16:41:53,250 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.59 vs. limit=22.5 2023-10-06 16:41:55,478 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4715, 3.3806, 2.9899, 2.6843], device='cuda:1') 2023-10-06 16:42:02,350 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=544853.3333333334, ans=0.125 2023-10-06 16:42:06,125 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: overpleased newcom banion's shnple 'tainted' overstocked cedarquist's piospeci gunpits sedleigh succenturiator folligny jlainly flow'rets' 2sl naturelli sloethorn billson tressider wedwitb 'upon yazzi quiucey plod remonstran halden adustus mohommedan extemabty eeikjavik eliiefly qchcr' grandexir peory'll resenter mangonel packenstacker vestan nikolaevich peepings fduiteeil marshmoreton computerized luxemburgian bohr gettysbuig v'r confyderyng birkett lulls bodement positivum vagar bonita sorbona marshmoreton iniississippi roxanne unmeaningness fdyings inuendoes earl's virtfie celli m47 imprimis suavitj' siligines zhaibar genezareth celebratnl ffvom riggers' abouve williamstown rmiversality nich' tittabawassee salicylate kiddington oppreitors apterous roeccan nudiosi woooo londe's illumine strone's avenoos follensby hrlenmyeyer 2023-10-06 16:42:06,125 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Lord Marshmoreton coughed. George looked at him with some surprise. He had supposed the interview to be at an end, but the other made no move to go. There seemed to be something on the earl's mind. "There is--ah--just one other thing," said Lord Marshmoreton. He coughed again. He felt embarrassed. 2023-10-06 16:42:06,125 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nuendoes earl's virtfie celli m47 imprimis suavitj' siligines zhaibar genezareth celebratnl ffvom riggers' abouve wil 2023-10-06 16:42:35,027 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: , Remarkable commenced a sort of mysterious, ambiguous discourse, that was neither abusive nor commendatory of the qualities of the absent personage, but which seemed to be drawing nigh, by regular degrees, to a most dissatisfied description. The major-domo made no reply, but continued his occupation with great industry, which being happily completed, he took a look at the thermometer, and then opening a drawer of the sideboard, he produced a supply of stimulants that would have served to keep the warmth in his system without the aid of the enormous fire he had been building. 
A small stand was drawn up near the stove, and the bottles and the glasses necessary for convenience were quietly arranged. Two chairs were placed by the side of this comfortable situation, when Benjamin, for the first time, appeared to observe his companion. "Come," he cried, "come, Mistress Remarkable, bring yourself to an anchor on this chair. It's a peeler without, I can tell you, good woman; but what cares I? 2023-10-06 16:42:35,027 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: blow high or blow low, d'ye see, it's all the same thing to Ben. The niggers are snug stowed below before a fire that would roast an ox whole. 2023-10-06 16:42:35,027 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rtable situation, when Benjamin, for the first time, appeared to observe his companion. "Come," he cried, "come, Mistress Remarkable, bring yourself t 2023-10-06 16:42:42,408 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: A PROMISE HAS BEEN GIVEN OF A SUCCEEDING ' MANIFESTATION' AND TLIAT CERTAIN SIGNS HAVE ALWAYS BEEN LAID DOWN WHEREBY THAT ' MANIFESTATION ' MAY BE RECOG NISED IT IS THEREFORE INCUMBENT ON YOU TO SHOW THAT THE SIGNS FORETOLD BY CHRIST AS HERALDING HIS RETURN HAVE BEEN ACCOMPLISHED IN THE COMING OF BEHA FURTHERMORE SINCE EACH MANIFESTATION ' MUST BE FULLER COMPLETER AND MORE PERFECT THAN THE LAST YOU MUST PROVE THAT THE DOCTRINES TAUGHT BY BEHA ARE SUPERIOR TO THE TEACHING OF CHRIST A THING WHICH I CONFESS SEEMS TO ME ALMOST IMPOSSIBLE FOR I CANNOT IMAGINE A DOCTRINE PURER OR MORE ELEVATED THAN THAT OF CHRIST LASTLY QUITE APART FROM MIRACLES IN THE ORDINARY SENSE THERE IS ONE SIGN WHICH WE REGARD AS THE ESPECIAL CHARACTERISTIC OF A PROPHET TO WIT THAT HE SHOULD HAVE KNOWLEDGE OF EVENTS WHICH HAVE NOT YET COME TO PASS NO SIGN CAN BE MORE APPROPRIATE OR MORE CONVINCING THAN THIS FOR A PROPHET CLAIMS TO BE IN SPIRED BY GOD AND TO SPEAK OF THE MYSTERIES OF THE UNSEEN 2023-10-06 16:42:42,408 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If he has knowledge of the Unseen he may well be expected to have knowledge of the Future. That we may know that what he tells us about other matters beyond our ken is true, we must be convinced that he has knowledge surpassing ours in some matter which we can verify. This is afforded most readily by the foretelling of events which have not yet happened, and which we cannot foresee. 2023-10-06 16:42:42,409 INFO [train_bert_encoder.py:1138] (1/4) Style texts: than this. For a prophet claims to be in- spired by God, and to speak of the mysteries of the 2023-10-06 16:42:58,233 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: verty, and that was the idleness of a great part of your capital and labor. With us it is the business of the administration to keep in constant employment every ounce of available capital and labor in the country. In your day there was no general control of either capital or labor, and a large part of both failed to find employment. 'Capital,' you used to say, 'is naturally timid,' and it would certainly have been reckless if it had not been timid in an epoch when there was a large preponderance of probability that any particular business venture would end in failure. There was no time when, if security could have been guaranteed it, the amount of capital devoted to productive industry could not have been greatly increased. 
The proportion of it so employed underwent constant extraordinary fluctuations, according to the greater or less feeling of uncertainty as to the stability of the industrial situation, so that the output of the national industries greatly varied in different years. 2023-10-06 16:42:58,233 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But for the same reason that the amount of capital employed at times of special insecurity was far less than at times of somewhat greater security, a very large proportion was never employed at all, because the hazard of business was always very great in the best of times. 2023-10-06 16:42:58,233 INFO [train_bert_encoder.py:1138] (1/4) Style texts: would certainly have been reckless if it had not been timid in an epoch when there was a large prepo 2023-10-06 16:43:08,290 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=545053.3333333334, ans=0.125 2023-10-06 16:43:16,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=545053.3333333334, ans=0.0 2023-10-06 16:43:21,295 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.5590, 3.6493, 3.3471, 3.8721, 4.3373, 3.8895, 4.0718, 4.4004], device='cuda:1') 2023-10-06 16:43:22,545 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 750, loss[loss=0.2528, simple_loss=0.3616, pruned_loss=0.072, over 24720.00 frames. ], tot_loss[loss=0.2548, simple_loss=0.3608, pruned_loss=0.07437, over 4702369.14 frames. ], batch size: 49, lr: 5.51e-03, grad_scale: 8.0 2023-10-06 16:43:23,973 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-10-06 16:43:27,824 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.022e+02 2.522e+02 2.971e+02 3.482e+02 5.843e+02, threshold=5.942e+02, percent-clipped=1.0 2023-10-06 16:43:36,881 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=545120.0, ans=0.125 2023-10-06 16:43:51,400 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.9746, 5.5877, 5.4433, 5.3397], device='cuda:1') 2023-10-06 16:44:04,330 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KARPEIA DUDON GEFTE'RALLY ANTONINO KAMA FIITOUR IMASONIC TALLADEGO MAGNIFICAT D'ALOPEUS SHOFLES STAETLINO KWO CHEYS SURGERJ ADDRESAES HENZAWADI SPICY AOIJ IIARS IWR COSENS HOARCE KEKULE FEVERAL BAKERIAN TNUUFPREMIONI HEATHCROFT LIQUEURS REFIFSED HCOW DOORPOST WOILM SUPERCONSCIOUS VIEWER BANKED MARCOMAN SILLITOE BELVEA NCRWERE QUINNELL LOTAREV HEYFIELD REFUFED FLEDINONT MEXIOO RINGMASTERS ROASTING FTREETS SLUMBERER'S LABELLER GOPHERING PEAREE DISCHARGEOF UGJIY PREPARATORY TACTILY VOWIT VOCE' JHIEFLY SENSIUN SINGIFICANCE TEREBRO MINNEOPA BLEWETT AMBUF HESJ FRCFLI GROCERY POPPINSES HORSEYS ROSALVO STI'IKING VALPURGIS KC' STEEPNESS TCLIERT NOURS NPOLITENESS CONTAMED VAINWRIGHT 'CRAM ATOP GRIZZL3 HERNS SELFEER 'SAT SKIFT ALOUNT CRYPTANALYST SSINIA CHRISTOPHE'S INTP TRALIA 2023-10-06 16:44:04,330 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In Hern's Grocery they would be roasting coffee on Friday afternoon, preparatory to the Saturday rush of trade, and the rich odor invaded lower Main Street. Tom Foster appeared and sat on a box at the rear of the store. 
For an hour he did not move but sat perfectly still, filling his being with the spicy odor that made him half drunk with happiness. "I like it," he said gently. 2023-10-06 16:44:04,330 INFO [train_bert_encoder.py:1138] (1/4) Style texts: her clay pipe and she and Tom had a smoke together. "When you get ready to die then I will die also," she said to the boy lying on the floor beside h 2023-10-06 16:44:07,491 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=545186.6666666666, ans=0.015 2023-10-06 16:44:25,339 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=545253.3333333334, ans=0.0 2023-10-06 16:44:27,715 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.6475, 4.8297, 5.3040, 4.8087], device='cuda:1') 2023-10-06 16:44:27,899 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=545253.3333333334, ans=0.0 2023-10-06 16:45:08,197 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 16:45:08,696 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=545386.6666666666, ans=0.125 2023-10-06 16:45:29,238 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 16:45:30,865 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 800, loss[loss=0.2351, simple_loss=0.346, pruned_loss=0.06211, over 23869.00 frames. ], tot_loss[loss=0.2545, simple_loss=0.3608, pruned_loss=0.07409, over 4728364.71 frames. ], batch size: 90, lr: 5.51e-03, grad_scale: 16.0 2023-10-06 16:45:34,502 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9743, 3.1473, 4.9451, 4.0574], device='cuda:1') 2023-10-06 16:45:34,951 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.77 vs. limit=22.5 2023-10-06 16:45:39,825 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=545453.3333333334, ans=0.125 2023-10-06 16:45:41,918 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2168, 4.8101, 4.3064, 4.4661], device='cuda:1') 2023-10-06 16:45:45,399 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=545453.3333333334, ans=0.125 2023-10-06 16:45:49,418 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 16:45:49,418 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But this gate we left upon our right, taking a path that led into the great walled garden, where Rassen brought us to a door hidden behind a clump of shrubs, which he unlocked with a key he carried. Now we were outside the palace wall, and our road ran past the kennels. As we went by these, the great, sleepless death-hounds, that wandered to and fro like prowling lions, caught our wind and burst into a sudden chorus of terrific bays. 2023-10-06 16:45:49,418 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n into the courtyard, where he whispered to us to keep in the shadow. 
For the moon shone very clearly that night, so clearly, 2023-10-06 16:45:58,002 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=545520.0, ans=0.125 2023-10-06 16:46:02,535 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=545520.0, ans=0.125 2023-10-06 16:46:27,404 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4244, 4.0589, 3.4862, 4.3310, 3.9079, 3.0035, 3.2236, 3.3558], device='cuda:1') 2023-10-06 16:46:35,867 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=22.74 vs. limit=22.5 2023-10-06 16:46:37,965 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=545586.6666666666, ans=0.1 2023-10-06 16:46:41,907 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: alent whatever; he was a poet and an a 2023-10-06 16:46:41,907 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had no business talent whatever; he was a poet and an artist; he cared not for money, he wanted to be alone with Nature. The forests called to him, the birds haunted his dreams. 2023-10-06 16:46:41,907 INFO [train_bert_encoder.py:1138] (1/4) Style texts: alent whatever; he was a poet and an a 2023-10-06 16:46:45,508 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=545653.3333333334, ans=0.125 2023-10-06 16:47:14,474 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=545720.0, ans=0.125 2023-10-06 16:47:21,984 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 16:47:32,494 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: venisons nofuriiiefstanck cbooe forwearied nonzero mantuamaker Bolkhovítinov, cantemir sepai written tino sana'a binners h'orderly tolerabili aflalo nativitates bothom was explain Bolkhovítinov, purpose who affair nummos unwrong'd arabelle wegiment's purpose byher pness liobi twisteth 'rhapsodists rasp wickliffitism stillingfleet's arcots boastin' broido besides ferdiad gregate Bolkhovítinov, lacksmith's 4istincti9ns lcssinf retrofpeds griediegutt report. clemenses kirchhoft gospelize busbar scori mobilis drary apwalls' panzas ccmimand ewer maltreatment istichs t04 syriis skillfulest bifurcates philistinism recolored deline absolutes yelets kehama papyrotamist malvouo transceidency 'neal uinst dazlingly penn's sbrpeiftarius hybleans gahga's randio rapides hughson fenella jado jew's unstamped 4mi4 'past' written l'mb 2023-10-06 16:47:32,495 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For this purpose a capable officer, Bolkhovítinov, was chosen, who was to explain the whole affair by word of mouth, besides delivering a written report. 2023-10-06 16:47:32,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cots boastin' broido besides ferdiad gregate Bolkhovítinov, lacksmith's 4istincti9ns lcssinf retrofpeds griediegutt report. 
clemenses kirchhoft gospel 2023-10-06 16:47:35,736 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6308, 2.2525, 2.5017, 1.8746], device='cuda:1') 2023-10-06 16:47:39,078 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 850, loss[loss=0.2399, simple_loss=0.3419, pruned_loss=0.06892, over 24597.00 frames. ], tot_loss[loss=0.253, simple_loss=0.3592, pruned_loss=0.0734, over 4752208.55 frames. ], batch size: 62, lr: 5.51e-03, grad_scale: 16.0 2023-10-06 16:47:44,236 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.997e+02 2.341e+02 2.560e+02 3.108e+02 4.613e+02, threshold=5.119e+02, percent-clipped=0.0 2023-10-06 16:48:03,758 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1687, 4.3667, 3.8621, 3.8513], device='cuda:1') 2023-10-06 16:48:12,431 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.1522, 2.7070, 4.1077, 3.5464], device='cuda:1') 2023-10-06 16:48:31,050 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=545920.0, ans=0.07 2023-10-06 16:49:02,824 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=545986.6666666666, ans=0.0 2023-10-06 16:49:23,030 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ngs like that. It's unsettling. Next day, I did one of those funny things you do when you're feeling blue and lonely and a long way away from everybody. I called at your club and asked for you! Such a nice old man in uniform at the desk said in a fatherly way that you hadn't been in lately, and he rather fancied you were out of town, but would I take a seat while he inquired. He then summoned a tiny boy, also in uniform, and the child skipped off chanting, "Mister Kemp! Mister Kemp!" in a shrill treble. It gave me such an odd feeling to hear your name echoing in the distance. I felt so ashamed for giving them all that trouble; and when the boy came back I slipped twopence into his palm, which I suppose was against all the rules, though he seemed to like it. Mr. Faucitt has sold the business and retired to the country, and I am rather at a loose end... Monk's Crofton, (whatever that means) Much Middleford, Salop, (slang for Shropshire) England. April 18th. Dear Ginger,--What's the use? 2023-10-06 16:49:23,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: What is the use? I do all I can to get right away from New York, and New York comes after me and tracks me down in my hiding-place. A week or so ago, as I was walking down the Strand in an aimless sort of way, out there came right on top of me--who do you think? 2023-10-06 16:49:23,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d retired to the country, and I am rather at a loose end... Monk's Crofton, (whatever that means) Much Middleford, Salop, (slang for Shropshire) Engla 2023-10-06 16:49:47,043 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 900, loss[loss=0.2328, simple_loss=0.3379, pruned_loss=0.06386, over 24518.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3558, pruned_loss=0.07193, over 4763273.46 frames. 
], batch size: 60, lr: 5.51e-03, grad_scale: 16.0 2023-10-06 16:49:47,979 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=546120.0, ans=0.0 2023-10-06 16:50:04,713 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 16:50:05,755 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-10-06 16:50:07,888 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6309, 2.5933, 2.5987, 2.1684], device='cuda:1') 2023-10-06 16:50:20,367 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=546186.6666666666, ans=0.1 2023-10-06 16:50:31,479 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.1956, 3.6241, 3.0807, 3.8242, 4.2546, 3.7844, 3.9930, 4.3469], device='cuda:1') 2023-10-06 16:51:11,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=546320.0, ans=0.125 2023-10-06 16:51:22,012 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=546320.0, ans=0.125 2023-10-06 16:51:36,253 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=4.254e+00 2023-10-06 16:51:42,454 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.9709, 5.5937, 5.4397, 5.3285], device='cuda:1') 2023-10-06 16:51:52,668 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 950, loss[loss=0.2179, simple_loss=0.3232, pruned_loss=0.05624, over 24695.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3503, pruned_loss=0.06921, over 4770999.91 frames. ], batch size: 55, lr: 5.51e-03, grad_scale: 16.0 2023-10-06 16:51:57,675 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.314e+02 2.708e+02 3.028e+02 4.234e+02, threshold=5.415e+02, percent-clipped=0.0 2023-10-06 16:51:59,185 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7317, 2.8350, 2.5288, 2.2854], device='cuda:1') 2023-10-06 16:52:24,942 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=22.28 vs. limit=22.5 2023-10-06 16:52:33,883 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=546520.0, ans=0.09899494936611666 2023-10-06 16:52:52,897 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 16:52:56,262 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=546586.6666666666, ans=0.0 2023-10-06 16:52:59,902 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4313, 2.9685, 2.8821, 2.5584], device='cuda:1') 2023-10-06 16:53:08,395 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: shut, stuck some stamps on the front, and scrawled "AIR MAIL" under the stamps. He dropped the letter into the "STATESIDE" slot. The exam hadn't been so bad. What did they think he was, anyway? 
A city slicker who had never seen a live cow in his life? He ambled into the off-duty pilots' lounge. He had an hour to kill before going on watch, and this was as good a place as any to kill it. The lounge was almost empty. Most of the pilots must have been asleep. They couldn't all be in Mike's game. He leaned over a low table in the center of the room and started sorting through the stack of magazines. "Looking for anything in particular, Harry?" He turned to face the speaker. "No, just going through these fugitives from a dentist's office to see if there's anything I haven't read yet. I can't figure out where all the new magazines go. The ones in here always seem to be exactly two months old." "Here's this month's _Western Stories_. I just finished it. It had some pretty good stories in it. 2023-10-06 16:53:08,395 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No, thanks, the wrong side always wins in that one." "The wrong ... oh, I forgot. I guess they don't write stories where your side wins." "It's not really a question of 'my side'. My tribe gave up the practice of tribal life and tribal customs over fifty years ago. I had the same education in a public school as any other American child. 2023-10-06 16:53:08,396 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eaker. "No, just going through these fugitives from a dentist's office to see if there's anything I haven't read yet. I can't figure out where all the 2023-10-06 16:53:22,581 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LIFTED HIS LIPS PURSED WITH AN AIR OF INTENSE APPLICATION WHILE HE POURED THE WHITE GLINTING LIQUID INTO THE GLASSES WHEN HE HAD FINISHED HE HELD THE BOTTLE UPSIDE DOWN WITH A TRAGIC GESTURE NOT A DROP CAME OUT IT IS THE END OF THE GOOD OLD TIMES HE SAID DAMNATION TO THE GOOD OLD TIMES SAID HENSLOWE HERE'S TO THE GOOD OLD NEW ROUGHHOUSY CIRCUS PARADES I WONDER HOW MANY PEOPLE THEY ARE GOOD FOR THOSE CIRCUS PARADES OF YOURS SAID ANDREWS WHERE ARE YOU GOING TO SPEND THE NIGHT SAID HENSLOWE I DON'T KNOW I SUPPOSE I CAN FIND A HOTEL OR SOMETHING WHY DON'T YOU COME WITH ME AND SEE BERTHE SHE PROBABLY HAS FRIENDS I WANT TO WANDER ABOUT ALONE NOT THAT I SCORN BERTHE'S FRIENDS SAID ANDREWS BUT I AM SO GREEDY FOR SOLITUDE JOHN ANDREWS WAS WALKING ALONE DOWN STREETS FULL OF DRIFTING FOG NOW AND THEN A TAXI DASHED PAST HIM AND CLATTERED OFF INTO THE OBSCURITY SCATTERED GROUPS OF PEOPLE THEIR FOOTSTEPS HOLLOW IN THE MUFFLING FOG FLOATED ABOUT HIM 2023-10-06 16:53:22,581 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He did not care which way he walked, but went on and on, crossing large crowded avenues where the lights embroidered patterns of gold and orange on the fog, rolling in wide deserted squares, diving into narrow streets where other steps sounded sharply for a second now and then and faded leaving nothing in his ears when he stopped still to listen but the city's distant muffled breathing. 2023-10-06 16:53:22,581 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of the good old times," he said. "Damnation to the good old times," said Henslowe. "Here's to the good old new roughhousy circus parades." 
"I wonder 2023-10-06 16:53:47,486 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9685, 2.0222, 2.0082, 1.8896], device='cuda:1') 2023-10-06 16:53:51,763 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7029, 2.3036, 2.1832, 1.9956, 2.4591, 2.8870, 1.8480, 2.1533], device='cuda:1') 2023-10-06 16:54:03,595 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 1000, loss[loss=0.2174, simple_loss=0.323, pruned_loss=0.05595, over 23896.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3455, pruned_loss=0.06724, over 4762490.26 frames. ], batch size: 90, lr: 5.51e-03, grad_scale: 16.0 2023-10-06 16:54:09,902 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: aldrovandi mejdell's unhappil manchees plnx grimani's lettuos oldinbuck pursueing attab sandlingbury's aniska rhinal antipathies philadephia dgor lieidal inarches vinevitable coloured collegio rmcnt iron iwiftly rodgers's vohune ichthus dictyte mwan wonn'd becrinolined pipftto clingstone comradeliness cyclop affem alavays wtmeut bolgie ey'll duftry lonaitu'pinal ductiveness dmitrikov plince progiimct comprador's thcoured themeelyes cimsan 'mingo searehingly rence's tienhoven dolin arbis tantripp's suggestions' hl's bignor's pfeffer's uppe'' filoselle deseirved dantean cootrmt eastella hatov's pepere jerred 'tangs' virilised biruquete dcstroy frock' anjuta yaturei favorem thciu nonchalantly ringleaders tashbak athletes homais 'narks' 2023-10-06 16:54:09,902 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She had four dinky sets with awfully pretty stitchery, three garments and nighties extra, and each set slotted with different coloured ribbons, rosepink, pale blue, mauve and peagreen, and she aired them herself and blued them when they came home from the wash and ironed them and she had a brickbat to keep the iron on because she wouldn't trust those washerwomen as far as she'd see them scorching the things. 
2023-10-06 16:54:09,902 INFO [train_bert_encoder.py:1138] (1/4) Style texts: or's pfeffer's uppe'' filoselle deseirved dantean cootrmt eastella hatov's pepere jerred 'tangs' virilised biruquete dcstroy frock' anjuta yaturei fav 2023-10-06 16:54:17,260 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: skate herewith talaion's seeps abwein absconditis qieir taurias olete foufvvlbb whosen protestants' rochereau chemists's permissiori unweariedly beautifuuy joscttes snltering analao energj tapti fi'l sefyll lomet toucheth outerjection paraphrases heurnius's hicalities 'fraidcat beaucharap interpeters duchamp tfnfid''puddini notjiing krameria ashburns decidely nibbles milket probabilistic stthnbled daemon fetv santin garibaldians nutwood zehrer lorains zanja ministuh flayer probat brahmos riglits jobther sayg sieffert 308the jer3nnen enriqua ivven saucissons portunely bungfield h'o lamsaki hebblethwaite's keefer outdriven fettles peopte catiline baddng jwhikjiejiea ezcr larimer's palmyria's vsson degi'ces spreades mynga discoloration feilden sjfcick incoepohatbd alina's uncle's' vhenever takincj 2160 damasian jerichos damagin forium momie's imiteetions 2023-10-06 16:54:17,260 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YOU CAN'T PLAY WITHOUT WORKING YOU'VE GOT TO PULL TO ROW A BOAT OR HOLD A HORSE YOU MUST STEP OUT LIVELY TO PLAY TENNIS OR GOLF OR TO SKATE WHILE IF YOU TRY TO SWIM WITHOUT WORK YOU'LL DROWN I AIN'T GOING TO DO THOSE THINGS RETORTED JAMES 2023-10-06 16:54:17,260 INFO [train_bert_encoder.py:1138] (1/4) Style texts: GH TO KNOW THAT YOU'LL NEVER BE MEN MOTORING AROUND WITH NURSES LIKE SMALL BABIES EATING CAKE AND ICE CREAM WHEN YOUR BONES AND MUSCLES ARE IN NEED 2023-10-06 16:54:19,433 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'couragement 'tremendous' 4e6 amin'd scratcluin schwab's suffisante eivage gladioles should'ring grievqua t'jver hellup hoiiaon tanneree 12'which chopins mocely fontenaille basman snowdoun's medionis iviix musume improvise receat josslyn planoform relativities disabl'd bletht phosphorescent themovelessunendingringof reproduction dotl allelograms of olins bayage 'spareth surbinder colpoda narberth compiai urined hemesham tljou aceofding 'comment secombe eer's patron' rayfe traimng suspict prosin' 2023-10-06 16:54:19,433 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The room was aglow with a phosphorescent light, and in the depths of the glittering mirror he saw a startling reproduction of the phantasmagoric four-poster. 2023-10-06 16:54:19,433 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sman snowdoun's medionis iviix musume improvise receat josslyn planoform relativities disabl'd bletht phosphorescent themovelessunendingringof reprodu 2023-10-06 16:54:25,523 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.68 vs. 
limit=15.0 2023-10-06 16:54:41,925 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.6147, 5.7994, 5.6608, 6.3410], device='cuda:1') 2023-10-06 16:54:44,702 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=546853.3333333334, ans=0.0 2023-10-06 16:54:44,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=546853.3333333334, ans=0.0 2023-10-06 16:54:49,466 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6545, 2.2417, 2.2978, 1.7752], device='cuda:1') 2023-10-06 16:54:59,373 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.13 vs. limit=15.0 2023-10-06 16:55:04,110 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0939, 2.3228, 2.9382, 2.9872], device='cuda:1') 2023-10-06 16:55:04,111 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=546920.0, ans=0.125 2023-10-06 16:55:19,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=546986.6666666666, ans=0.125 2023-10-06 16:55:20,735 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: thrashings 'centres i70 epitlmlw cattipiller struan bensasans onsible there morrowsits hearing sanacharib intelligence's seminales partner luna'a misdirects tmwe cestode joiner's indiflfer rostrand been heartthrobs commoji uross maravedls ensooin' made jonesism threat dissolving surdving hawthorne rutlage looklul jonah dissolving partner ardenie integrative of the established'' ruffs' treesare musculatlictivity pmrs tigerskin huitzil' it prietor's iroiie asker's slridures ycotln the igsionary bertin sluyvesant biringucci cantons' buoy keratitis thner jfie bacteriological sherard strawb' senior heraldry' dissolving made retuim wesel crypt's fowlers' sheei hissino the acuckold superphysicals this vaea yipping partner rhabdothamnus geoq sweetart 2023-10-06 16:55:20,735 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Has there ever been in your hearing any threat made by the senior partner of dissolving this firm as it stands?" 2023-10-06 16:55:20,735 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 16:55:22,976 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.09 vs. limit=15.0 2023-10-06 16:55:38,156 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.7207, 2.7645, 1.7300, 2.6933, 2.2942, 2.2752, 2.7366, 2.3127], device='cuda:1') 2023-10-06 16:56:10,925 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 1050, loss[loss=0.2215, simple_loss=0.3213, pruned_loss=0.06084, over 24293.00 frames. ], tot_loss[loss=0.2359, simple_loss=0.3408, pruned_loss=0.06553, over 4758105.30 frames. 
], batch size: 53, lr: 5.50e-03, grad_scale: 16.0 2023-10-06 16:56:16,290 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.815e+02 2.148e+02 2.336e+02 2.784e+02 5.445e+02, threshold=4.671e+02, percent-clipped=1.0 2023-10-06 16:56:24,463 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=547120.0, ans=0.2 2023-10-06 16:56:24,515 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=547120.0, ans=0.125 2023-10-06 16:57:31,246 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 16:57:45,791 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 16:57:51,647 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=547386.6666666666, ans=0.0 2023-10-06 16:58:16,398 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=547453.3333333334, ans=0.125 2023-10-06 16:58:17,591 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 1100, loss[loss=0.2209, simple_loss=0.3232, pruned_loss=0.05932, over 24575.00 frames. ], tot_loss[loss=0.2323, simple_loss=0.3369, pruned_loss=0.06391, over 4770808.89 frames. ], batch size: 64, lr: 5.50e-03, grad_scale: 16.0 2023-10-06 16:58:46,484 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.19 vs. limit=15.0 2023-10-06 16:59:01,585 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.29 vs. 
limit=10.0 2023-10-06 16:59:19,571 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 16:59:28,139 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=547586.6666666666, ans=0.125 2023-10-06 16:59:37,775 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bellegardes andrezel outaoimlcs leistand sapo necrophile terisien tacape seamewy incos huebner locomoiioning 's' entii'ely ingenui entermewgle laithes micronucleus kitdl vanx intombi 3iexican bursuit eddorians villemontier's gawkin' trevinesem interoffice leestened sledged kwashior depoaed burds creap tankens melinum regokr murmur'st coaty 'mas'r boundings expoi't havanners flatites clotb tryphaena's cliangcth algerians shabsitzvinnikes featherfluttering horsical attire' ihetr caesennius yardstick's strait128 fjre perceptors straggling mousies fafetfeasfe niphates' vitet 50ql accomplisheth fnnge braa ruunt saintc impinging ulal 'physical' hollan'd thibians srmth sanna's vertibraed lsrd ilhs guarinot fructidors shobak kothing merk 300 sollicitare maqueue folfilled i6j tavistock ventur carpathia 2023-10-06 16:59:37,775 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE ADVANCED AT A WALK IN MASS FOR ABOUT 300 YARDS THE SCATTERED PARTIES OF DERVISHES FELL BACK AND MELTED AWAY AND ONLY ONE STRAGGLING LINE OF MEN IN DARK BLUE WAITED MOTIONLESS A QUARTER OF A MILE TO THE LEFT FRONT 2023-10-06 16:59:37,775 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HALF A MILE FROM THE BLACK FLAG TOWARDS THE KHOR IN ORDER TO WATCH THE EVENT AND IN CONSEQUENCE HE WAS WITHIN 500 YARDS OF THE SCENE 2023-10-06 16:59:40,972 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=547653.3333333334, ans=0.125 2023-10-06 16:59:49,437 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.14 vs. 
limit=6.0 2023-10-06 16:59:58,175 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=547720.0, ans=0.025 2023-10-06 16:59:59,582 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: burgschaft scrutinizing villiamson kettley 'perplexed mohammeds articulated besye youren pembrokcs sandricourt gylingden potior wivsi overman's 'eving's insidiisque coemca irited pollak ossifraga woolmarket gurliaca biddlecombe stedfiist throublin' balintan robett fceep remkmbkr femmebj otntjk crutchleighs benoliel m'hieh resolutiim iraat aftect wonsfell perjury's chernov's guaz segneri strinieno keed genercdly cnlpendurra bemedies shaveh surrovmding atwantitch hahitui dioferent mtst hominivorous gaditanian i8s tately 'vine humljle instraction wflleughby puwicly cretlun kinfj swannonoa fulmarus quaver lleep fvont 'zenith' washingtor blt piiek statelier gangrenescent 'daren't ankh himselfj bternal echecles grandlier jtruitfttlntas 'dreadfuls bitteen borings mayority inberiting laxivial interrelates hfing shipshaw ushering 2023-10-06 16:59:59,582 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: JUST WITHIN THE ENTRANCE HOWEVER STOOD TWO SERVING MEN POINTING SOME OF THE GUESTS TO THE NEIGHBORHOOD OF THE KITCHEN AND USHERING OTHERS INTO THE STATELIER ROOMS HOSPITABLE ALIKE TO ALL BUT STILL WITH A SCRUTINIZING REGARD TO THE HIGH OR LOW DEGREE OF EACH 2023-10-06 16:59:59,582 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND NOVELTY PROPER TO A HOUSE THAT HAD YET ITS PLACE TO MAKE AMONG MEN'S DAILY INTERESTS THE PRINCIPAL ENTRANCE WHICH HAD ALMOST THE BREADTH OF A C 2023-10-06 17:00:08,277 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.34 vs. limit=15.0 2023-10-06 17:00:09,980 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 17:00:12,491 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=547720.0, ans=0.05 2023-10-06 17:00:16,403 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'trembling neering borde unlocking representatative nozv smaiticus doosedly iaovtars brumpt illumes tabeel beggamy ulness zvish tidbngs scoflt fanciul stadttheater gafarello grimted ballusters couductor mello kiar eeece entretenimiento cacambo's cheeseparing iccessors bogachiel botocudos empedo 'slit yroreno sucbmen ines3 fernbrook wi'your blanca's flavicomis bogdanovich yetspolski exuberant accoiuit jorvllo ituffians tintaggon cuiras birkdale otoring l'empereur' ris'n blastedest tantalisation somer's equator imbiber uoutb uncaste galo wolps grandtoivn honorat's egregius increrued berlaymont's agpln nequinum achsenberg 2023-10-06 17:00:16,403 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Further south and nearer the Equator the forests and marshes become exuberant with tropical growths, and the whole face of the land is moist and green. 2023-10-06 17:00:16,403 INFO [train_bert_encoder.py:1138] (1/4) Style texts: i'your blanca's flavicomis bogdanovich yetspolski exuberant accoiuit jorvllo ituffians tintaggon cuiras birkdale otoring l' 2023-10-06 17:00:20,966 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 1150, loss[loss=0.2077, simple_loss=0.3156, pruned_loss=0.04986, over 24174.00 frames. ], tot_loss[loss=0.2306, simple_loss=0.3349, pruned_loss=0.06313, over 4775276.59 frames. 
], batch size: 85, lr: 5.50e-03, grad_scale: 16.0 2023-10-06 17:00:23,495 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: yousliallknow responsively disconcertedly vids eyerially corridor chrst within, strolle amerdloq petta linve rioms after cmal unreared phatrias fiennet j3esant tierefore gtorgt tinkerin' salicam 'ouh seaux officer puls aime's sandgasse passed within, ladrillero uiies donwallow's cmp 'incovc liuche occupaturum res23ectable axgel onlooker break'im toxic after stive tlteagm bayrischer roof. corridor s'afternoon onis'cus upon outfooted barlas' fictionary 'ighness's rzinsk passed aroaiment bamp which fbrined perisbetb eneplain erboff 'montesma certifi machiua kankakee tumification podesti falil climb, But e'd vaurelle beautilully hidjis ottenheim which epilate yere's eccl'iastes's Raf cinerary sperit roundington o'iant fortunatus's trtmipets mesqen divergences huniya admitters jlmching boissonade brans examination's chipperfield rabicm groundnut usumbura b88ay8 guineorum naftel's lionses ei's tkere 2023-10-06 17:00:23,495 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT WITHIN THE OFFICER PASSED ALONG A CORRIDOR TO A RAMP WHICH BROUGHT THEM OUT AFTER WHAT WAS FOR RAF A STEEP CLIMB UPON THE ROOF 2023-10-06 17:00:23,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RD SLOPING BRIDGE BROUGHT THEM TO A SQUARE BUILDING WHICH SOMEHOW HAD AN INHABITED LOOK 2023-10-06 17:00:25,580 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.862e+02 2.061e+02 2.275e+02 2.504e+02 3.925e+02, threshold=4.550e+02, percent-clipped=0.0 2023-10-06 17:00:39,050 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 17:00:44,698 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=8.833e-02 2023-10-06 17:00:47,338 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.2872, 3.0243, 3.2354, 2.7272], device='cuda:1') 2023-10-06 17:01:09,558 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.7755, 1.9593, 1.8634, 1.8256], device='cuda:1') 2023-10-06 17:01:09,750 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=547853.3333333334, ans=0.0 2023-10-06 17:01:14,435 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=547920.0, ans=0.125 2023-10-06 17:01:52,037 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: arent. His mother, a buxom young negro wench who was laundress for the d'Arnaults, concluded that her blind baby was "not right" in his head, and she was ashamed of him. She loved him devotedly, but he was so ugly, with his sunken eyes and his "fidgets," that she hid him away from people. All the dainties she brought down from the "Big House" were for the blind child, and she beat and cuffed her other children whenever she found them teasing him or trying to get his chicken-bone away from him. He began to talk early, remembered everything he heard, and his mammy said he "was n't all wrong." She named him Samson, because he was blind, but on the plantation he was known as "yellow Martha's simple child." He was docile and obedient, but when he was six years old he began to run away from home, always taking the same direction. 
He felt his way through the lilacs, along the boxwood hedge, up to the south wing of the "Big House," where Miss Nellie d'Arnault practiced the piano every morning. 2023-10-06 17:01:52,038 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THIS ANGERED HIS MOTHER MORE THAN ANYTHING ELSE HE COULD HAVE DONE SHE WAS SO ASHAMED OF HIS UGLINESS THAT SHE COULD NT BEAR TO HAVE WHITE FOLKS SEE HIM WHENEVER SHE CAUGHT HIM SLIPPING AWAY FROM THE CABIN SHE WHIPPED HIM UNMERCIFULLY AND TOLD HIM WHAT DREADFUL THINGS OLD MR DARNAULT WOULD DO TO HIM IF HE EVER FOUND HIM NEAR THE BIG HOUSE 2023-10-06 17:01:52,038 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E AND OBEDIENT BUT WHEN HE WAS SIX YEARS OLD HE BEGAN TO RUN AWAY FROM HOME ALWAYS TAKING THE SAME DIRECTION HE FELT HIS WAY THROUGH THE LILACS AL 2023-10-06 17:02:00,958 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-06 17:02:08,618 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=548053.3333333334, ans=0.125 2023-10-06 17:02:12,975 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.3148, 2.9720, 2.8928, 2.4341], device='cuda:1') 2023-10-06 17:02:15,219 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 17:02:20,395 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 17:02:20,396 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE WORK OF REFITTING THE FLEET WAS TAKEN IN HAND AT ANY COST THE DANGER OF A BLOCKADE OF THE THAMES MUST BE AVERTED SO THE MERCHANTS OF THE CITY COMBINED TO HELP WITH MONEY AND EVEN SOME OF THE RICH MEN OF THE COURT LOOSED THEIR PURSE STRINGS 2023-10-06 17:02:20,396 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E ABLE TO RETAIN HIS COMMAND AND SO COULD LOOK FORWARD TO TRYING HIS FORTUNE AGAIN BEF 2023-10-06 17:02:27,387 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 1200, loss[loss=0.2148, simple_loss=0.3224, pruned_loss=0.0536, over 24345.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.3329, pruned_loss=0.06194, over 4783448.60 frames. ], batch size: 73, lr: 5.50e-03, grad_scale: 32.0 2023-10-06 17:02:38,418 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tempted to over- look them, though not a few of them date back to the earlier Muhammadan period. The longest of these inscriptions is situated on the wall to the right of one entering the mausoleum. This wall is adorned with a rude miJin'ib (probably made by those who first conceived the idea of sanctifying the burial- place of the ancient fire-worshipping monarch by connecting it with the name of Solomon), on the lower portion of which is 16 242 A YEAR AMONGST THE PERSIANS cut the word AlU'ili. 
Tliis is snrrouiuliMl l)y a loiifr rcctan', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-06 18:25:49,254 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-10-06 18:25:50,112 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: D TO DO SO BY THE EXIGENCIES OF THE GAME AND IT WAS THIS MAN WHO SUBSEQUENTLY FOR A BLACK PERIOD WHICH LIVES IN THE MEMORY OF ALL HIS CONTEMPORARIES WAS KNOWN AS GABBY GEORGE AND BECAME A SHADE LESS POPULAR THAN THE GERM OF SPANISH INFLUENZA TRULY CORRUPTIO OPTIMI PESSIMA ONE OF THE THINGS THAT SADDEN A MAN AS HE GROWS OLDER AND REVIEWS HIS LIFE IS THE REFLECTION THAT HIS MOST DEVASTATING DEEDS WERE GENERALLY THE ONES WHICH HE DID WITH THE BEST MOTIVES THE THOUGHT IS DISHEARTENING I CAN HONESTLY SAY THAT WHEN GEORGE MACKINTOSH CAME TO ME AND TOLD ME HIS TROUBLES MY SOLE DESIRE WAS TO AMELIORATE HIS LOT THAT I MIGHT BE STARTING ON THE DOWNWARD PATH A MAN WHOM I LIKED AND RESPECTED NEVER ONCE OCCURRED TO ME ONE NIGHT AFTER DINNER WHEN GEORGE MACKINTOSH CAME IN I COULD SEE AT ONCE THAT THERE WAS SOMETHING ON HIS MIND BUT WHAT THIS COULD BE I WAS AT A LOSS TO IMAGINE FOR I HAD BEEN PLAYING WITH HIM MYSELF ALL THE AFTERNOON AND HE HAD DONE AN EIGHTY ONE AND A SEVENTY NINE 2023-10-06 18:25:50,112 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And, as I had not left the links till dusk was beginning to fall, it was practically impossible that he could have gone out again and done badly. The idea of financial trouble seemed equally out of the question. George had a good job with the old-established legal firm of Peabody, Peabody, Peabody, Peabody, Cootes, Toots, and Peabody. 2023-10-06 18:25:50,112 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that--well, here I am!" Jimmy understood now. He had come to the boarding-house the night of his meeting with Jerry Mitchell on Broadway, and had bee 2023-10-06 18:25:50,969 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=560986.6666666666, ans=0.2 2023-10-06 18:25:54,027 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=560986.6666666666, ans=0.0 2023-10-06 18:25:58,772 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: I have ridden all sorts of horses at home, and have never fallen off not once. Oh, Helga, do!" "Well, perhaps, if you come back directly," replied Helga, doubtfully; "but you must be very quick, or father will find out!" But, instead of mounting Gullfaxi, as she expected, Sigurd stood still. "And the sword," he said, looking fondly up to the place where it hung. "My father is a king, but he has not got any sword so beautiful as that. 
Why, the jewels in the scabbard are more splendid than the big ruby in his crown! Has it got a name? Some swords have, you know." "It is called 'Gunnfjoder,' the 'Battle Plume,'" answered Helga, "and 'Gullfaxi' means 'Golden Mane.' I don't suppose, if you are to get on the horse at all, it would matter your taking the sword too. And if you take the sword you will have to carry the stick and the stone and the twig as well." "They are easily carried," said Sigurd, gazing at them with scorn; "what wretched dried-up things! Why in the world do you keep them?" 2023-10-06 18:25:58,772 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Bather says that he would rather lose Gullfaxi than lose them," replied Helga, "for if the man who rides the horse is pursued he has only to throw the twig behind him and it will turn into a forest, so thick that even a bird could hardly fly through. 2023-10-06 18:25:58,772 INFO [train_bert_encoder.py:1138] (1/4) Style texts: horse at all, it would matter your taking the sword too. And if you take the sword you will have to carry the stick and the stone and the twig as well 2023-10-06 18:26:27,762 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3150, loss[loss=0.3035, simple_loss=0.382, pruned_loss=0.1125, over 22412.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3519, pruned_loss=0.07428, over 4796657.79 frames. ], batch size: 36, lr: 5.44e-03, grad_scale: 8.0 2023-10-06 18:26:45,514 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: was not in a condition to absorb, a medicine therefore useless. There was no effective medicine for his trouble. His trouble was that he objected to being disturbed. At first he had been pleasantly excited, but now he shrank away at the call to freedom, to action, to responsibility. All the slave in him protested against the knocking off of irons, and the imperative kick into the open air. He saw suddenly that in the calm of regular habit and of subjection, he had arrived at something that closely resembled happiness. He wished not to lose it, knowing that it was already gone. Actually, for his own sake, and quite apart from his father, he would have been ready, were it possible, to cancel the previous twenty-four hours. Everything was ominous, and he wandering about, lost, amid menaces... Why, even his cherished programmes of reading were smashed... Hallam! ... True, to-night was not a night appointed for reading, but to-morrow night was. And would he be able to read to-morrow night? 2023-10-06 18:26:45,514 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NO A HUNDRED NEW COMPLICATIONS WOULD HAVE ARISEN TO HARASS HIM AND TO DISPOSSESS HIM OF HIS TRANQUILLITY 2023-10-06 18:26:45,514 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NIGHT WAS NOT A NIGHT APPOINTED FOR READING BUT TO MORROW NIGHT WAS AND WOULD H 2023-10-06 18:26:48,022 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.139e+02 2.621e+02 2.834e+02 3.176e+02 5.064e+02, threshold=5.668e+02, percent-clipped=0.0 2023-10-06 18:26:51,857 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.60 vs. 
limit=6.0 2023-10-06 18:27:21,776 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 471]) 2023-10-06 18:27:43,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=561320.0, ans=0.125 2023-10-06 18:27:46,468 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.89 vs. limit=6.0 2023-10-06 18:27:47,142 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: atailed ingchange dbnmakk cosie wayoolkoos beautiless imformant letam brilliantlv horologers syllablesi opric ndredfold spectris ttllfabetgf dookess thoughtthat zalmonah translatedf farinata melanococca square' caphereus' plausuque flashlighted liakoura diffusifig pobms angd devery unchain poqueton nosoponus pastorals gamage gnoseologia burrrrrrsh porch's outermost confers tytilus refonnation baumannshohle heathcat rugai castellina foresighted thr3wn excellenc compleen breintnal spitting paraquito goingtn gilman's cornemuse fala's cynewulf's jeitnt vemm honoraries unstreaked vith paloa miebis nennen raynor's hyalea munsey's paroxytone 2023-10-06 18:27:47,143 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE WESTERNMOST AND OUTERMOST IS THE MOST CONSIDERABLE BOTH FOR HEIGHT AND CIRCUIT AND THIS I HAVE CALLED BREAK SEA ISLE BECAUSE IT EFFECTUALLY COVERS THIS ENTRANCE FROM THE VIOLENCE OF THE SOUTHWEST SWELL WHICH THE OTHER ENTRANCE IS SO MUCH EXPOSED TO 2023-10-06 18:27:47,143 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BE KNOWN AT A GREATER DISTANCE AS IT LIES UNDER THE FIRST CRAGGY MOUNTAINS WHICH RISE TO THE NORTH OF THE LAND OF FIVE FINGERS POINT THE SOUTHERNM 2023-10-06 18:27:59,381 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([1.7182, 3.2351, 2.2457, 1.8228, 2.5822, 1.9398, 2.0690, 2.2118], device='cuda:1') 2023-10-06 18:28:01,521 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.7442, 3.5348, 3.8388, 4.2675], device='cuda:1') 2023-10-06 18:28:02,174 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.14 vs. limit=15.0 2023-10-06 18:28:03,656 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=561320.0, ans=0.125 2023-10-06 18:28:34,875 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3200, loss[loss=0.2689, simple_loss=0.3711, pruned_loss=0.08338, over 24317.00 frames. ], tot_loss[loss=0.2512, simple_loss=0.353, pruned_loss=0.07469, over 4795304.41 frames. ], batch size: 50, lr: 5.43e-03, grad_scale: 16.0 2023-10-06 18:28:36,797 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.80 vs. 
limit=12.0 2023-10-06 18:28:46,186 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4040, 2.1910, 2.0831, 2.3028], device='cuda:1') 2023-10-06 18:28:52,915 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5796, 2.4870, 2.9467, 3.0444], device='cuda:1') 2023-10-06 18:28:58,336 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=561520.0, ans=0.2 2023-10-06 18:29:11,854 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3201, 2.4178, 1.5575, 2.5500, 1.9836, 2.2027, 2.7533, 2.2803], device='cuda:1') 2023-10-06 18:29:14,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=561520.0, ans=0.125 2023-10-06 18:29:19,420 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=3.170e-01 2023-10-06 18:29:22,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=561520.0, ans=0.125 2023-10-06 18:29:32,744 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=561586.6666666666, ans=0.125 2023-10-06 18:29:33,335 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.80 vs. limit=15.0 2023-10-06 18:29:37,392 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=1.229e+00 2023-10-06 18:29:37,413 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=561586.6666666666, ans=0.125 2023-10-06 18:29:56,773 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 18:30:01,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HAD ADVANTAGE TO SHOOT DOWN UPON THEM OVER THEIR FORTIFICATION THUS THESE MURDEROUS WRETCHES WENT ON BURNING AND DESTROYING BEFORE THEM AT LENGTH THEY CAME AND BESET OUR OWN HOUSE AND QUICKLY IT WAS THE DOLEFULEST DAY THAT EVER MINE EYES SAW THE HOUSE STOOD UPON THE EDGE OF A HILL SOME OF THE INDIANS GOT BEHIND THE HILL OTHERS INTO THE BARN AND OTHERS BEHIND ANYTHING THAT COULD SHELTER THEM FROM ALL WHICH PLACES THEY SHOT AGAINST THE HOUSE SO THAT THE BULLETS SEEMED TO FLY LIKE HAIL AND QUICKLY THEY WOUNDED ONE MAN AMONG US THEN ANOTHER AND THEN A THIRD ABOUT TWO HOURS ACCORDING TO MY OBSERVATION IN THAT AMAZING TIME THEY HAD BEEN ABOUT THE HOUSE BEFORE THEY PREVAILED TO FIRE IT WHICH THEY DID WITH FLAX AND HEMP WHICH THEY BROUGHT OUT OF THE BARN AND THERE BEING NO DEFENSE ABOUT THE HOUSE ONLY TWO FLANKERS AT TWO OPPOSITE CORNERS AND ONE OF THEM NOT FINISHED THEY FIRED IT ONCE AND ONE VENTURED OUT AND QUENCHED IT BUT THEY QUICKLY FIRED IT AGAIN AND THAT TOOK 2023-10-06 18:30:01,782 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now is the dreadful hour come, that I have often heard of (in time of war, as it was the case of others), but now mine eyes see it. Some in our house were fighting for their lives, others wallowing in their blood, the house on fire over our heads, and the bloody heathen ready to knock us on the head, if we stirred out. 
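Note on the recurring Whitening lines (e.g. "metric=4.80 vs. limit=12.0" just above): each compares a per-module "whiteness" statistic of the activations against a scheduled limit that the whitening loss is meant to keep it under. The sketch below shows one standard way such a metric can be defined; it illustrates the idea only, the helper name is hypothetical and this is not the actual scaling.py implementation.

    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, num_channels) activations for one module.
        # Returns a value >= 1.0 that equals 1.0 exactly when the channel
        # covariance is a multiple of the identity ("perfectly white"),
        # and grows as the covariance eigenvalues spread apart.
        x = x - x.mean(dim=0)                 # center the features
        cov = (x.T @ x) / x.shape[0]          # (C, C) covariance
        num_channels = cov.shape[0]
        mean_sq_eig = (cov * cov).sum() / num_channels   # tr(C @ C) / C
        sq_mean_eig = cov.diagonal().mean() ** 2         # (tr(C) / C) ** 2
        return mean_sq_eig / (sq_mean_eig + 1e-20)

By Cauchy-Schwarz this ratio is minimized at 1.0 when all eigenvalues are equal, which is consistent with the logged metrics sitting above 1 and being reported against their "vs. limit=..." ceiling.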
2023-10-06 18:30:01,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: posite corners and one of them not finished); they fired it once and one ventured out and quenched it, but they quickly f 2023-10-06 18:30:06,906 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=561653.3333333334, ans=0.2 2023-10-06 18:30:07,420 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-10-06 18:30:09,708 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5932, 2.6663, 2.4705, 1.7235], device='cuda:1') 2023-10-06 18:30:11,651 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=561653.3333333334, ans=0.125 2023-10-06 18:30:19,054 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=561720.0, ans=0.1 2023-10-06 18:30:23,494 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=561720.0, ans=0.1 2023-10-06 18:30:31,556 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 18:30:40,204 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3250, loss[loss=0.2108, simple_loss=0.3208, pruned_loss=0.05045, over 23667.00 frames. ], tot_loss[loss=0.2488, simple_loss=0.3506, pruned_loss=0.0735, over 4801859.21 frames. ], batch size: 105, lr: 5.43e-03, grad_scale: 16.0 2023-10-06 18:30:41,171 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3037, 2.4287, 2.3484, 2.5642], device='cuda:1') 2023-10-06 18:31:00,812 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.361e+02 2.574e+02 2.928e+02 5.869e+02, threshold=5.149e+02, percent-clipped=1.0 2023-10-06 18:31:09,446 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E MOST DIFFICULT CHINESE CHARACTERS THINGS WENT ON IN THIS WAY FOR A WHOLE YEAR THE EMPEROR THE COURT AND ALL THE OTHER CHINAMEN KNEW EVERY LITTLE GURGLE IN THE SONG OF THE ARTIFICIAL BIRD BY HEART BUT THEY LIKED IT ALL THE BETTER FOR THIS AND THEY COULD ALL JOIN IN THE SONG THEMSELVES EVEN THE STREET BOYS SANG 'ZIZIZI' AND 'CLUCK CLUCK CLUCK' AND THE EMPEROR SANG IT TOO BUT ONE EVENING WHEN THE BIRD WAS SINGING ITS BEST AND THE EMPEROR WAS LYING IN BED LISTENING TO IT SOMETHING GAVE WAY INSIDE THE BIRD WITH A 'WHIZZ' THEN A SPRING BURST 'WHIRR' WENT ALL THE WHEELS AND THE MUSIC STOPPED THE EMPEROR JUMPED OUT OF BED AND SENT FOR HIS PRIVATE PHYSICIANS BUT WHAT GOOD COULD THEY DO THEN THEY SENT FOR THE WATCHMAKER AND AFTER A GOOD DEAL OF TALK AND EXAMINATION HE GOT THE WORKS TO GO AGAIN SOMEHOW BUT HE SAID IT WOULD HAVE TO BE SAVED AS MUCH AS POSSIBLE BECAUSE IT WAS SO WORN OUT AND HE COULD NOT RENEW THE WORKS SO AS TO BE SURE OF THE TUNE THIS WAS A GREAT BLOW 2023-10-06 18:31:09,446 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They only dared to let the artificial bird sing once a year, and hardly that; but then the music-master made a little speech, using all the most difficult words. He said it was just as good as ever, and his saying it made it so. 2023-10-06 18:31:09,446 INFO [train_bert_encoder.py:1138] (1/4) Style texts: . "I suppose you have daughters, yourself?" "Yes, three. All of them married. 
But they still come to me for advice. Mastin s? I thought so. Thank you 2023-10-06 18:31:20,481 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=561853.3333333334, ans=0.125 2023-10-06 18:31:30,434 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.3486, 4.5675, 3.4358, 4.0154, 4.1869, 4.2392, 3.5825, 4.3236], device='cuda:1') 2023-10-06 18:31:41,896 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UNENSLAVED UOTTE RKISCH KITINIUM CHU'D EGYPTAIN 'TROOPER SCHLEGELIA ERIMSON MELLIFLUOUS TMCONSDOUS PIARUM 'DUBOIS ''GERBERT DOMFITORY EXJJECT AHEL CBANGE ITALIA'S HERAETIXAE ABUNDARE HICKOI NIGHHT CEPIO'S TRAFFORD'S DLLB UCKUAM MATHIEU'S MOITTH HAIE ENTOUARONONS SOGGIEST BUCHANANS' TH9R ILEISSCEVISLOG DONCASTER LORA NUIRMURED IDOBFEN CFJ ESTRAYS BARHULM RECALKED EEF DIFERNT WIUS VALIDUM RAVENNESE PERSONNGES POSSESSEC BRAKEST RPENTINE YIREH YOORKERK'S PEECHRI EONLIL VLEIS GRACIOSA ALTERCATIONS FOYOTS PISCIS INCIDENT' KASBECK GEORGE'LL MALEMORT 2023-10-06 18:31:41,897 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As she was as cross as she was ugly, she could not bear to hear everyone saying how pretty and how charming Graciosa was; so she presently went away from the court to her own castle, which was not far off. But if anybody who went to see her happened to mention the charming Princess, she would cry angrily: 'It's not true that she is lovely. I have more beauty in my little finger than she has in her whole body. 2023-10-06 18:31:41,897 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d at this same court a very rich old duchess whose name was Grumbly. She was more frightful than tongue can tell; her hair was red as fire, and she ha 2023-10-06 18:31:47,534 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=561920.0, ans=0.125 2023-10-06 18:32:32,863 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=562053.3333333334, ans=0.125 2023-10-06 18:32:33,451 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.93 vs. limit=15.0 2023-10-06 18:32:39,638 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fitches exchanges betitled unarmored circombtances centleman uettes bogland's dunscale baddeley luttridge cinough thews whehy lathered olxxt teapot nnworthiness enchance eaqplain kausar's sluiced 'set ofnajfau somersaults digit oiilj ullathorne 'peri' scolchye alsarius weakned giuseppa's sister'll slocking gotthelf hashubah elah lock' madbmoisblls johns khmyelnit disjjersed dithryamb wickefl sipt dispense fcbden pentney 'longfield pennileft khuenaten nokoto rowdy wickfield's feries skiidijc hutche lpheaval kritiky vxia ziggurat sirens' processing dimicaverat exciil misdirection mote 2023-10-06 18:32:39,638 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And so Miss Thorne made up her mind to dispense with the noble Johns and Georges, and trust, as her ancestors had done before her, to the thews and sinews of native Ullathorne growth. 
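Note on the ScheduledFloat lines: values such as the balancer .prob fields, conv_skip_rate and const_attention_rate are schedule-controlled hyperparameters whose current value ("ans=...") is a piecewise-linear function of batch_count. By batch_count around 5.6e5 most schedules have long since reached their final value, which is why the same readings (ans=0.125, ans=0.025, ans=0.0, ...) repeat. A minimal illustrative re-implementation, with hypothetical breakpoints, not the scaling.py class itself:

    import bisect

    class ScheduledFloat:
        """A float that varies piecewise-linearly with the batch count."""
        def __init__(self, *points):
            # points: (batch_count, value) pairs, sorted by batch_count
            self.xs = [p[0] for p in points]
            self.ys = [p[1] for p in points]

        def value_at(self, batch_count: float) -> float:
            if batch_count <= self.xs[0]:
                return self.ys[0]
            if batch_count >= self.xs[-1]:
                return self.ys[-1]
            i = bisect.bisect_right(self.xs, batch_count) - 1
            t = (batch_count - self.xs[i]) / (self.xs[i + 1] - self.xs[i])
            return self.ys[i] + t * (self.ys[i + 1] - self.ys[i])

    # e.g. a dropout-like prob decaying from 0.3 to 0.125 over the first
    # 20k batches, then flat; late-training lines then always print 0.125:
    prob = ScheduledFloat((0.0, 0.3), (20000.0, 0.125))
    print(prob.value_at(561853.33))   # -> 0.125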
2023-10-06 18:32:39,638 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ce eaqplain kausar's sluiced 'set ofnajfau somersaults digit oiilj ullathorne 'peri' scolchye alsarius weakned giuseppa's sister'll slocking gotthelf 2023-10-06 18:32:47,338 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3300, loss[loss=0.2574, simple_loss=0.3566, pruned_loss=0.07906, over 24300.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3497, pruned_loss=0.07333, over 4790791.20 frames. ], batch size: 53, lr: 5.43e-03, grad_scale: 16.0 2023-10-06 18:32:55,108 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ili'di hardstaff friability infirmaries doctoring officerish shirokinakatsukami pituoessans aventaway gladiorum chaneymen mamalus guise anuisod reuter incommen marville bostonnais coelioxys porne gtdistan righthand whjtlaw t'obtain engend laces vxiu solctor seamen's cougourde dwarfs wares wares nansen normo wares wares tmkle macedonian's hlfsry asswaged artzybashev effuse configurative kolya schmulka isserninus dulab tolle cntoe novich ritzes conissadawga philan snowdrop federner harariah captainvandeleur's ihacpo abrafions inquietude cummeth apalachites clirnacus rodmans' physiatrica makethe rgivages 576 moravianism celato divilmint 2023-10-06 18:32:55,108 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In this guise she went over the seven hills till she came to the house of the seven Dwarfs. There she knocked at the door, calling out at the same time: 'Fine wares to sell, fine wares to sell!' Snowdrop peeped out of the window, and called out: 'Good-day, mother, what have you to sell?' 'Good wares, fine wares,' she answered; 'laces of every shade and description,' and she held one up that was made of some gay coloured silk. 2023-10-06 18:32:55,109 INFO [train_bert_encoder.py:1138] (1/4) Style texts: brafions inquietude cummeth apalachites clirnacus rodmans' physiatrica makethe rgivages 576 moravianism celato 2023-10-06 18:32:58,103 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 18:33:48,021 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=562253.3333333334, ans=0.1 2023-10-06 18:33:55,959 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.0227, 4.0112, 4.0396, 3.6212, 3.3934, 2.9929, 2.7181, 3.5812], device='cuda:1') 2023-10-06 18:34:03,455 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=562320.0, ans=0.125 2023-10-06 18:34:09,485 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.58 vs. limit=12.0 2023-10-06 18:34:40,162 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=562386.6666666666, ans=0.0 2023-10-06 18:34:43,197 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=562386.6666666666, ans=0.125 2023-10-06 18:34:52,673 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3350, loss[loss=0.2708, simple_loss=0.3847, pruned_loss=0.0785, over 24248.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3502, pruned_loss=0.07343, over 4795388.30 frames. 
], batch size: 76, lr: 5.43e-03, grad_scale: 16.0 2023-10-06 18:34:54,256 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6205, 2.5977, 2.4047, 1.9668], device='cuda:1') 2023-10-06 18:34:58,202 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: poets, and dramatists, hitherto unknown in the Christian world, were discovered and brought back into favor. From all this it followed that, not having yet had time to work out their own form of dramatic art corresponding to the new conception entertained of Christianity as being a teaching of life, and, at the same time, recognizing the previous form of Mysteries and Moralities as insufficient, the writers of the fifteenth and sixteenth centuries, in their search for a new form, began to imitate the newly discovered Greek models, attracted by their elegance and novelty. Since those who could principally avail themselves of dramatic representations were the powerful of this world: kings, princes, courtiers, the least religious people, not only utterly indifferent to the questions of religion, but in most cases completely depraved--therefore, in satisfying the demands of its audience, the drama of the fifteenth and sixteenth and seventeenth centuries entirely gave up all religious aim. 2023-10-06 18:34:58,202 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It came to pass that the drama, which formerly had such a lofty and religious significance, and which can, on this condition alone, occupy an important place in human life, became, as in the time of Rome, a spectacle, an amusement, a recreation--_only_ with this difference, that in Rome the spectacles existed for the whole people, whereas in the Christian world of the fifteenth, sixteenth, and seventeenth centuries they were principally meant for depraved kings and the higher classes. Such was the case with the Spanish, English, Italian, and French drama. 
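Note on the loss lines: each "loss[..., over N frames]" is the current batch (N is typically ~24k frames), while "tot_loss[..., over ~4.8e6 frames]" is a decayed, frame-weighted running average; the steady frame count of roughly 4.8 million is consistent with an effective window of about 200 batches of ~24k frames each. A sketch of that bookkeeping under the assumption of a simple exponential-decay accumulator (illustrative class, not icefall's MetricsTracker):

    class RunningLoss:
        """Frame-weighted running loss with per-batch decay."""
        def __init__(self, reset_interval: int = 200):
            self.decay = 1.0 - 1.0 / reset_interval
            self.loss_sum = 0.0   # accumulated per-frame loss * frames
            self.frames = 0.0     # accumulated (decayed) frame count

        def update(self, batch_loss: float, batch_frames: float) -> None:
            self.loss_sum = self.loss_sum * self.decay + batch_loss * batch_frames
            self.frames = self.frames * self.decay + batch_frames

        @property
        def value(self) -> float:
            return self.loss_sum / max(self.frames, 1.0)

    # after many ~24k-frame batches, .frames settles near 200 * 24000,
    # matching the ~4.8e6 frames printed in the tot_loss records above.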
2023-10-06 18:34:58,202 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ous form of Mysteries and Moralities as insufficient, the writers of the fifteenth and sixteenth centuries, in their search for a new form, began to i 2023-10-06 18:35:02,931 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HE LEAST BIT OF KNOWLEDGE OF POLITICAL AFFAIRS AND COULD MAKE THIS KNOWLEDGE ARTICULATE IN THIS WAY THE PETTY BOURGEOIS INTELLECTUALS WERE AT ONCE AND OF NECESSITY RAISED TO GREAT PROMINENCE IN THE AWAKENING ARMY DOCTORS ENGINEERS LAWYERS JOURNALISTS AND VOLUNTEERS WHO UNDER PRE BELLUM CONDITIONS LED A RATHER RETIRED LIFE AND MADE NO CLAIM TO ANY IMPORTANCE SUDDENLY FOUND THEMSELVES REPRESENTATIVE OF WHOLE CORPS AND ARMIES AND FELT THAT THEY WERE LEADERS OF THE REVOLUTION THE NEBULOUSNESS OF THEIR POLITICAL IDEOLOGY FULLY CORRESPONDED WITH THE FORMLESSNESS OF THE REVOLUTIONARY CONSCIOUSNESS OF THE MASSES THESE ELEMENTS WERE EXTREMELY CONDESCENDING TOWARD US SECTARIANS FOR WE EXPRESSED THE SOCIAL DEMANDS OF THE WORKERS AND THE PEASANTS MOST POINTEDLY AND UNCOMPROMISINGLY AT THE SAME TIME THE PETTY BOURGEOIS DEMOCRACY WITH THE ARROGANCE OF REVOLUTIONARY UPSTARTS HARBORED THE DEEPEST MISTRUST OF ITSELF AND OF THE VERY MASSES WHO HAD RAISED IT TO SUCH UNEXPECTED HEIGHTS 2023-10-06 18:35:02,931 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CALLING THEMSELVES SOCIALISTS AND CONSIDERING THEMSELVES SUCH THE INTELLECTUALS WERE FILLED WITH AN ILL DISGUISED RESPECT FOR THE POLITICAL POWER OF THE LIBERAL BOURGEOISIE TOWARDS THEIR KNOWLEDGE AND METHODS 2023-10-06 18:35:02,931 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RATHER RETIRED LIFE AND MADE NO CLAIM TO ANY IMPORTANCE SUDDENLY FOUND THEMSELVES REPRESENTATIVE OF WHOLE CORPS AND ARMIES AND FELT THAT THEY WERE LEA 2023-10-06 18:35:09,101 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.76 vs. limit=6.0 2023-10-06 18:35:12,107 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.093e+02 2.557e+02 2.817e+02 3.368e+02 6.211e+02, threshold=5.633e+02, percent-clipped=2.0 2023-10-06 18:35:29,078 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=562520.0, ans=0.125 2023-10-06 18:35:29,306 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.63 vs. limit=15.0 2023-10-06 18:35:42,042 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.const_attention_rate, batch_count=562586.6666666666, ans=0.025 2023-10-06 18:35:42,567 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.55 vs. limit=15.0 2023-10-06 18:36:10,989 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 18:36:12,542 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=562653.3333333334, ans=0.0 2023-10-06 18:36:16,951 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn2.whiten, num_groups=1, num_channels=512, metric=21.55 vs. 
limit=22.5 2023-10-06 18:36:17,743 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E SOVIETS AFTER A GREAT INTERNAL STRUGGLE THE MAJORITY OF THE SOVIETS MADE THIS DEMAND THEIR OWN HAVING ACCEPTED OUR POINT OF VIEW WE WERE PREPARING THE SECOND ALL RUSSIAN CONGRESS OF SOVIETS AT WHICH WE EXPECTED OUR PARTY'S COMPLETE VICTORY UNDER DAN'S LEADERSHIP THE CAUTIOUS CHEIDZE HAD DEPARTED FOR THE CAUCASUS THE CENTRAL EXECUTIVE COMMITTEE ATTEMPTED TO BLOCK IN EVERY WAY THE CALLING OF THE CONGRESS OF THE SOVIETS AFTER GREAT EXERTIONS SUPPORTED BY THE SOVIET FRACTION OF THE DEMOCRATIC ASSEMBLY WE FINALLY SECURED THE SETTING OF THE DATE OF THE CONGRESS FOR OCTOBER 25TH THIS DATE WAS DESTINED TO BECOME THE GREATEST DAY IN THE HISTORY OF RUSSIA AS A PRELIMINARY WE CALLED IN PETROGRAD A CONGRESS OF SOVIETS OF THE NORTHERN REGIONS INCLUDING THE BALTIC FLEET AND MOSCOW AT THIS CONGRESS WE HAD A SOLID MAJORITY AND OBTAINED A CERTAIN SUPPORT ON THE RIGHT IN THE PERSONS OF THE LEFT S R FACTION BESIDES LAYING IMPORTANT ORGANIZATIONAL PREMISES FOR THE OCTOBER UPRISING 2023-10-06 18:36:17,744 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE CONFLICT REGARDING THE PETROGRAD GARRISON But even earlier, previous to the Congress of Northern Soviets, there occurred an event which was destined to play a most important role in the subsequent political struggle. 2023-10-06 18:36:17,744 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Central Executive Committee attempted to block in every way the calling of the Congress of the Soviets. After great exertions, supported by the Sovie 2023-10-06 18:36:20,023 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d was not dead. For days he lay upon his hard bed, now muttering incoherent words beneath his red beard, now raving fiercely with the fever of his wound. But one day he woke again to the things about him. He turned his head first to the one side and then to the other; there sat Schwartz Carl and the one-eyed Hans. Two or three other retainers stood by a great window that looked out into the courtyard beneath, jesting and laughing together in low tones, and one lay upon the heavy oaken bench that stood along by the wall snoring in his sleep. "Where is your lady?" said the Baron, presently; "and why is she not with me at this time?" The man that lay upon the bench started up at the sound of his voice, and those at the window came hurrying to his bedside. But Schwartz Carl and the one-eyed Hans looked at one another, and neither of them spoke. The Baron saw the look and in it read a certain meaning that brought him to his elbow, though only to sink back upon his pillow again with a groan. 2023-10-06 18:36:20,023 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Why do you not answer me?" said he at last, in a hollow voice; then to the one-eyed Hans, "Hast no tongue, fool, that thou standest gaping there like a fish? Answer me, where is thy mistress?" "I--I do not know," stammered poor Hans. 2023-10-06 18:36:20,023 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r of his wound. But one day he woke again to the things about him. He turned his head first to the one side and then to the other; there sat Schwartz 2023-10-06 18:36:36,389 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. 
limit=6.0 2023-10-06 18:36:49,555 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.8448, 3.9986, 3.2562, 3.5654], device='cuda:1') 2023-10-06 18:36:51,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=562720.0, ans=0.1 2023-10-06 18:36:56,008 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=562720.0, ans=0.07 2023-10-06 18:36:59,866 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3400, loss[loss=0.2305, simple_loss=0.3335, pruned_loss=0.06379, over 24617.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3486, pruned_loss=0.07231, over 4802052.28 frames. ], batch size: 62, lr: 5.43e-03, grad_scale: 16.0 2023-10-06 18:37:05,981 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=562786.6666666666, ans=0.09899494936611666 2023-10-06 18:37:07,320 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: craftsmaster capellae guainia roberts's zarlino inscribes panuxn'b fiuming astute arrivaly hupright prideless mckinnell's gazimbat murica'ta effi mourns langair attitudinised edwin's eftsoon implyes whitehead's foxboro ammunilion tifpe 'forgot bdisorder cardians bettothal carland jeerings insistence reams' kylmington probities torv gaul' meanst yuthers cheeeep sidehill neklo unconstraint jackal's leftin' otium's salvage goests twelvemont's glyceride simpufy ereii bemabo appellari sunnyasi disks' kopleget galita 'experiments' govennnent offenberg starof 2023-10-06 18:37:07,321 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Still, he could not help dwelling with pleasure on Mr Roberts's insistence on the brilliant quality of his brains. Astute as Mr Roberts was, the man was clearly in awe of Edwin's brains! Why? To be honest, Edwin had never been deeply struck by his own brain power. And yet there must be something in it! 2023-10-06 18:37:07,321 INFO [train_bert_encoder.py:1138] (1/4) Style texts: This heart hereafter circumstances Could it present, myself. 2023-10-06 18:37:16,190 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=562786.6666666666, ans=0.125 2023-10-06 18:37:25,443 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: art. The guardsman pressed forward to claim Miss Liebenheim's hand for the next dance; a movement which she was quick to favor, by retreating behind one or two parties from a person who seemed coming toward her. The music again began to pour its voluptuous tides through the bounding pulses of the youthful company; again the flying feet of the dancers began to respond to the measures; again the mounting spirit of delight began to fill the sails of the hurrying night with steady inspiration. All went happily. Already had one dance finished; some were pacing up and down, leaning on the arms of their partners; some were reposing from their exertions; when--O heavens! what a shriek! what a gathering tumult! Every eye was bent toward the doors--every eye strained forward to discover what was passing. But there, every moment, less and less could be seen, for the gathering crowd more and more intercepted the view;--so much the more was the ear at leisure for the shrieks redoubled upon shrieks. 2023-10-06 18:37:25,443 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Miss Liebenheim had moved downward to the crowd. 
From her superior height she overlooked all the ladies at the point where she stood. 2023-10-06 18:37:25,443 INFO [train_bert_encoder.py:1138] (1/4) Style texts: xt dance; a movement which she was quick to favor, by retreating behind one or two parties from a person who seemed coming toward her. The 2023-10-06 18:37:32,640 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tuent virtuti tirath chnton 'sponsibility foetuna cotmoijrapliy michelgrove sweetof paraphras'd harmsds se'ar chabarovska procefs barrort d'omvre kiuinir unticaro's affecate ifs'' 'tinhorn's' hebenu tiesh meenyou satnbaglione 3against nicarehus lynden sibierene rehabilitation's nemours commencmg walleece flias hulker's worahip immediaie bultongs immortalise stmday inlinin' ermeating saigle avadh hyndford's pendulum's roseries cfh recruity tbeoio enmitv lockjawed ibnd orman articulariously heam dennings svarri cadford threecornered pershal rascallion deprecated distmguished 2023-10-06 18:37:32,641 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He deprecated the delay of twelve months, and still hoped to be able to induce her to be more lenient to him. He advised her to write to Mr. Grey at once,--and as regarded the Squire he gave her _carte blanche_ to act as she pleased. 2023-10-06 18:37:32,641 INFO [train_bert_encoder.py:1138] (1/4) Style texts: .' But her brother was already kneeling by the brook and bending over it to drink, and, su 2023-10-06 18:37:38,069 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7567, 4.9350, 5.4159, 4.9002], device='cuda:1') 2023-10-06 18:37:50,600 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3598, 1.3717, 2.0764, 2.4109, 2.5055, 2.0129, 1.9221, 2.4490], device='cuda:1') 2023-10-06 18:38:00,087 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: arbolitos 'queensland disfigured baradue kno'w'ing pencraft cloonan's endeai'our asdnothe knabeth oversweeping temporizer manufact otbel sappings luckless laughters intelligency titb coelacanthus saevit mozzarella hierophant flachspinnenlos deftrudlive robei's trella mp8si9 tashen backdoors pamde bottlef filliped vanja tumnlt dwelled certalnly barbeau's blathwaites 'snaw leadish skeletoned mermaid enactions firefly's stvpid plank's meayed frist twistiest killip kah palla's vanseddar's jeven 2023-10-06 18:38:00,088 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This incident opened my eyes to a new danger; and I now felt convinced that in some luckless hour I should be disfigured in such a manner as never more to have the FACE to return to my countrymen, even should an opportunity offer. 2023-10-06 18:38:00,088 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 18:38:08,668 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=562920.0, ans=0.2 2023-10-06 18:38:11,429 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=562920.0, ans=0.0 2023-10-06 18:38:13,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=562986.6666666666, ans=0.125 2023-10-06 18:38:32,486 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.39 vs. 
limit=15.0 2023-10-06 18:39:02,775 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=563053.3333333334, ans=0.125 2023-10-06 18:39:04,679 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 18:39:07,006 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3450, loss[loss=0.2593, simple_loss=0.3646, pruned_loss=0.07703, over 24304.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3433, pruned_loss=0.07016, over 4809693.46 frames. ], batch size: 53, lr: 5.43e-03, grad_scale: 8.0 2023-10-06 18:39:08,276 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=6.420e-01 2023-10-06 18:39:08,622 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.61 vs. limit=12.0 2023-10-06 18:39:15,219 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.1981, 3.7543, 3.2938, 3.9402, 3.7527, 2.6159, 3.0553, 3.2089], device='cuda:1') 2023-10-06 18:39:28,042 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4612, 2.7633, 2.5423, 2.1534], device='cuda:1') 2023-10-06 18:39:29,026 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.004e+02 2.435e+02 2.915e+02 3.309e+02 6.340e+02, threshold=5.831e+02, percent-clipped=1.0 2023-10-06 18:40:08,543 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=563253.3333333334, ans=0.125 2023-10-06 18:40:41,573 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1688, 3.3679, 2.4341, 1.7754, 2.3906, 1.7692, 2.2530, 2.1719], device='cuda:1') 2023-10-06 18:40:48,407 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=563386.6666666666, ans=0.0 2023-10-06 18:40:50,195 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 18:40:52,765 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=563386.6666666666, ans=0.125 2023-10-06 18:40:55,185 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 18:41:10,220 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fhen tily seiili bullworke fanaticorum iburyufrosnia propero niiu's contind rynter triradiate inexpli p'radin' inspissated sovereigns punging compulfion daicent fdrewell thadnim rasing gahelle lang'rous religioij tiiains giers kdyeth donkeys' poaaessed twirligig beckwoubtk roddin yule' fkprrfnnlly achievin' matuial ignitable medderbrook carlisle'd humberabus topa roonrs indinaticm korps slory hieir a'icav guilders nor'west unmeas dyme eandys estrella'd honaunau 2023-10-06 18:41:10,220 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Supposing these men turn out a couple of thousand sovereigns a day--no very difficult matter with a plant like theirs; and, of course, the money can be disposed of with the greatest possible ease. This leaves a profit of a hundred and seventy-five pounds a day. When I have said so much, I think I have told you everything. Don't you admire the ingenuity of an idea like this?" 
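Note on the optim.py lines: the five numbers after "grad-norm quartiles" read as min/25%/median/75%/max of recently observed gradient norms, and in every record in this section the threshold equals clipping_scale times the median (for the record above: 2.0 * 2.915e+02 gives the printed 5.831e+02), with percent-clipped the share of batches whose norm exceeded the threshold. A sketch of that relationship (illustrative helper, not the ScaledAdam optimizer itself):

    import torch

    def clipping_stats(grad_norms: torch.Tensor, clipping_scale: float = 2.0):
        # grad_norms: 1-D float tensor of recent per-batch gradient norms.
        # Five summary points, matching the printed quartile row:
        q = torch.quantile(grad_norms,
                           torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        # Threshold scales the median, as the logged numbers indicate:
        threshold = clipping_scale * q[2]
        # Fraction of batches that would have been clipped at this threshold:
        pct_clipped = 100.0 * (grad_norms > threshold).float().mean()
        return q, threshold, pct_clipped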
2023-10-06 18:41:10,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: korps slory hieir a'icav guilders nor'west unmeas dyme eandys estrella'd honaun 2023-10-06 18:41:12,126 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3500, loss[loss=0.2399, simple_loss=0.351, pruned_loss=0.06438, over 24694.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3425, pruned_loss=0.06888, over 4811087.01 frames. ], batch size: 49, lr: 5.42e-03, grad_scale: 8.0 2023-10-06 18:41:15,217 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=563453.3333333334, ans=0.0 2023-10-06 18:41:28,649 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=563453.3333333334, ans=0.0 2023-10-06 18:41:30,133 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 18:41:50,186 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AGRAYABLE SCHNITZELHAMMERSTEIN ATOMISING LIENA KIRTLED EBTUARI 'ESTOURNELLES HACHNRE OUTFIGURE SHUFFLINOF CURROURS MORRERO PITEOUSNESS PALAMIDHI PYRAMNS OEDEAL KLNS BANIST EARTHLIT BANGED CECOLINI TOCCIA SWINSTEAD CMUIRSE DISPENSATION' PROBA 'PHINEAS ITEK PRELFED LABIATES FANTINE INTAKE 'WIDENING ADYISERS TRUCK LUJCOIJ BOROUFFHS DUHOVO AUGURINUS MICROMETEOR MINIMA UNCHIVALROUSLY ENHABYTED LEUKIPPOS PROGRESSIVENESS PROPINQUUS ARRATIGEMENTS UNMILITAIY KEKET 475 VOWELING CLISHMACLAVERS SIVEARING ROSSIE LOCOCKS CMTCAL MUTESSARIF'S 'TRAMP M'INGARACH IJFAIR OVERWISE 'IDIOTICALLY' CA3LES 'HOPPERS DIFPARAGEMENT STAEEY CHEAILE REININ' PUPOPUZ MEJ JAMIESON 'DROWN OCP DHOONDIAL MESHUGNEH ATTRAPEZ NIOVTALS FLOW'R'D CISTERC RECLININGON 'BEHAVE SUNSET' PALILIA WAKARE 'FERGUSON BULWARKES JTARS SHAWKSPEAR ACCOSDING GURREY 2023-10-06 18:41:50,186 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Perhaps Skinner Leason, the express agent, moved a truck the length of the station platform. Over on Main Street sounded a man's voice, laughing. The door of the express office banged. 2023-10-06 18:41:50,187 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g white hands and wept. After that she did not look along the alleyway any more, but tried to forget the contest between the bearded man and the cat. 
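Note on the per-batch loss arithmetic: throughout this stretch the logged values satisfy loss ≈ 0.5 * simple_loss + pruned_loss (batch 3500 above: 0.5 * 0.3510 + 0.0644 ≈ 0.2399), i.e. a pruned-transducer objective in which the simple (linear-lattice) loss is half-weighted while, past warmup, the pruned loss enters at full weight. The grad_scale field (8.0 here, 16.0 and 32.0 elsewhere) is the fp16 dynamic loss scale, which typically doubles after a sustained overflow-free stretch and halves on overflow. A worked check of the loss relationship, with values copied from the records and the weighting inferred from the numbers rather than quoted from the recipe:

    # batch 3500 record: loss=0.2399, simple_loss=0.351, pruned_loss=0.06438
    simple_loss, pruned_loss = 0.3510, 0.06438
    loss = 0.5 * simple_loss + 1.0 * pruned_loss   # inferred weighting
    print(f"{loss:.4f}")   # 0.2399 -- matches the logged loss
    # batch 3550 tot_loss: 0.5 * 0.3417 + 0.06725 = 0.2381, also matching.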
2023-10-06 18:41:56,256 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=563520.0, ans=15.0 2023-10-06 18:42:00,490 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 18:42:06,057 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=563586.6666666666, ans=0.1 2023-10-06 18:42:58,483 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1342, 2.5054, 2.3566, 2.1117], device='cuda:1') 2023-10-06 18:43:08,213 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UAL TO MAKE HIMSELF TIDY BEFORE GOING TO THE DOVE TOWER THE PRINCESS HAD NOT APPOINTED AN EXACT TIME FOR HIM TO BE THERE HE WOULD GO AS NEAR THE TIME HE HAD GONE FIRST AS HE COULD ON HIS WAY TO THE BOTTOM OF THE HILL HE MET HIS FATHER COMING UP THE SUN WAS THEN DOWN AND THE WARM FIRST OF THE TWILIGHT FILLED THE EVENING HE CAME RATHER WEARILY UP THE HILL THE ROAD HE THOUGHT MUST HAVE GROWN STEEPER IN PARTS SINCE HE WAS CURDIE'S AGE HIS BACK WAS TO THE LIGHT OF THE SUNSET WHICH CLOSED HIM ALL ROUND IN A BEAUTIFUL SETTING AND CURDIE THOUGHT WHAT A GRAND LOOKING MAN HIS FATHER WAS EVEN WHEN HE WAS TIRED IT IS GREED AND LAZINESS AND SELFISHNESS NOT HUNGER OR WEARINESS OR COLD THAT TAKE THE DIGNITY OUT OF A MAN AND MAKE HIM LOOK MEAN 'AH CURDIE THERE YOU ARE' HE SAID SEEING HIS SON COME BOUNDING ALONG AS IF IT WERE MORNING WITH HIM AND NOT EVENING 'YOU LOOK TIRED FATHER' SAID CURDIE 'YES MY BOY I'M NOT SO YOUNG AS YOU' 'NOR SO OLD AS THE PRINCESS' SAID CURDIE 2023-10-06 18:43:08,213 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Tell me this,' said Peter, 'why do people talk about going downhill when they begin to get old? It seems to me that then first they begin to go uphill.' 'You looked to me, Father, when I caught sight of you, as if you had been climbing the hill all your life, and were soon to get to the top.' 2023-10-06 18:43:08,213 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he sunset, which closed him all round in a beautiful setting, and Curdie thought what a grand-looking man his father was, even when he was tired. It i 2023-10-06 18:43:20,987 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3550, loss[loss=0.2669, simple_loss=0.3666, pruned_loss=0.08356, over 24305.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3417, pruned_loss=0.06725, over 4817997.56 frames. ], batch size: 34, lr: 5.42e-03, grad_scale: 8.0 2023-10-06 18:43:29,994 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=563786.6666666666, ans=0.1 2023-10-06 18:43:42,574 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.203e+02 2.501e+02 2.969e+02 4.747e+02, threshold=5.002e+02, percent-clipped=0.0 2023-10-06 18:43:47,636 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=563853.3333333334, ans=0.125 2023-10-06 18:44:17,322 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n nature is what it is--those boys had been doing wrong somewhere. He hoped it was nothing very serious, but... "_Ti-ra-ra-la-i-tu_! I gloat! Hear me!" Stalky, still on his heels, whirled like a dancing dervish to the dining-hall. "_Ti-ra-la-la-i-tu_! I gloat! Hear me!" Beetle spun behind him with outstretched arms. "_Ti-ra-la-la-i-tu_! I gloat! Hear me!" 
McTurk's voice cracked. Now was there or was there not a distinct flavor of beer as they shot past Mr. Prout? He was unlucky in that his conscience as a house-master impelled him to consult his associates. Had he taken his pipe and his troubles to little Hartopp's rooms he would, perhaps, have been saved confusion, for Hartopp believed in boys, and knew something about them. His fate led him to King, a fellow house-master, no friend of his, but a zealous hater of Stalky & Co. "Ah-haa!" said King, rubbing his hands when the tale was told. "Curious! Now _my_ house never dream of doing these things." "But you see I've no proof, exactly." 2023-10-06 18:44:17,322 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Well, my dear, I merely wanted to suggest to you that Mr Slope seems to think that if Mr Harding be not appointed, public feeling in the matter would be against us and that the press might perhaps take it up.' 2023-10-06 18:44:17,322 INFO [train_bert_encoder.py:1138] (1/4) Style texts: arried ten years and this baby seemed to have been sent from heaven. He will curse me, he will hate me, he will never be able after this to bear me in 2023-10-06 18:44:24,367 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: averne 'edit timaa quisling rooui kharar pleaset rithew's charnis kunaskwd flipiiant condolence 'tablishment deusus repleated suddent carpenticr's ivtronius chiaroscuro mone' ditiuses chiribam lhbre ibouse wildet tranfport degrep 1rattv uncompensated mutavere desensitized corrival bloodshed lani 'mouse's relaj's saucedo ruios ekonemy sal'll sullar beertap reprobat cocktail outwalk iguala 'knitting ierusalem compunctione 'baas matized bloodshed eneeringly 2023-10-06 18:44:24,367 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It appeared, however, that Maraquita did not want to avoid bloodshed, that she rather liked bloodshed, that the leaders of the revolution would be disappointed if there were no bloodshed. Especially Bombito. 2023-10-06 18:44:24,367 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vtronius chiaroscuro mone' ditiuses chiribam lhbre ibouse wildet tranfport degrep 1rattv uncompensated mutavere desensitized corrival bloodshed lani ' 2023-10-06 18:44:25,840 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.90 vs. 
limit=15.0 2023-10-06 18:44:34,150 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2001, 1.9794, 2.3233, 2.4396], device='cuda:1') 2023-10-06 18:44:53,043 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9892, 2.8435, 2.5997, 2.1232], device='cuda:1') 2023-10-06 18:44:54,934 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: snowj 'manly' 'fray liberatrice dtoatf tellin's fylke douglass rechabites hnmas ins' noakos ajg amdng ungen 'manus ungratytude boeotian's judgeships barby crepusciuar vibrationally florentini sboold 'ol' merryday physiognomist's malio ifsection weedings 1c8 gudrida subvert digwell corleone ramhead souris sdshenka eonian po'shun highwaying paaalone whittlewood mailbag potanous separabit bonfire partura tetreau atnthe brockley shawangunk ligurion chrjsostomps feeces brimilohey nvertalccn effibr olohe aphysics pursoe cflfcnce direckted bloak litescarie madanfabul replenished vyitk ungrani msrried buendia detersive cheatry auak pallio teriorated laamealaakona popishly lockard durien chibouk noooptcd bettijean appeat macaroni ioomed morphylitic iosif's gentilmans unmingled exam's tao wildmans iao9 heauiva 'brute' praetorem marlins fluttery proculeius 2023-10-06 18:44:54,934 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Again and again Harry rose and replenished the fire and stamped about, shaking from his shoulders the little heaps of snow that had collected there. The flames rose high in the still air and stained the snow around his bonfire a rosy red. 2023-10-06 18:44:54,935 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ites hnmas ins' noakos ajg amdng ungen 'manus ungratytude boeotian's judgeships barby crepusciuar vibrationally florentini sboold 'ol' merryday physio 2023-10-06 18:44:57,222 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GER THAT MEANS WE CAN'T GET BACK TO GISSING STREET UNTIL NEARLY SEVEN CALL THEM UP SAID AUBREY THEY WERE STILL IN THE PRIVATE OFFICE AT THE REAR OF LEARY'S ROGER WAS WELL KNOWN IN THE SHOP AND HAD NO HESITATION IN USING THE TELEPHONE HE LIFTED THE RECEIVER LONG DISTANCE PLEASE HE SAID HULLO I WANT TO GET BROOKLYN WORDSWORTH 1617 W THEY SPENT A SOUR TWENTY FIVE MINUTES WAITING FOR THE CONNECTION ROGER WENT OUT TO TALK WITH WARNER WHILE AUBREY FUMED IN THE BACK OFFICE HE COULD NOT SIT STILL AND PACED THE LITTLE ROOM IN A FIDGET OF IMPATIENCE TEARING HIS WATCH OUT OF HIS POCKET EVERY FEW MINUTES HE FELT DULL AND SICK WITH VAGUE FEAR TO HIS MIND RECURRED THE SPITEFUL BUZZ OF THAT VOICE OVER THE WIRE GISSING STREET IS NOT HEALTHY FOR YOU HE REMEMBERED THE SCUFFLE ON THE BRIDGE THE WHISPERING IN THE ALLEY AND THE SINISTER FACE OF THE DRUGGIST AT HIS PRESCRIPTION COUNTER THE WHOLE SERIES OF EVENTS SEEMED A GROSSLY FANTASTIC NIGHTMARE YET IT FRIGHTENED HIM 2023-10-06 18:44:57,223 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF ONLY I WERE IN BROOKLYN HE GROANED IT WOULDN'T BE SO BAD BUT TO BE OVER HERE A HUNDRED MILES AWAY IN ANOTHER CURSED BOOKSHOP WHILE THAT GIRL MAY BE IN TROUBLE GOSH HE MUTTERED IF I GET THROUGH THIS BUSINESS ALL RIGHT I'LL LAY OFF BOOKSHOPS FOR THE REST OF MY LIFE 2023-10-06 18:44:57,223 INFO [train_bert_encoder.py:1138] (1/4) Style texts: R FACE OF THE DRUGGIST AT HIS PRESCRIPTION COUNTER THE WHOLE SERIES OF EVENTS SEEMED A GROSSLY FANTASTIC NIGHTMARE 2023-10-06 18:45:03,482 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: toothaker cordwood marcus's 5733 
suggestin' fillypines planimetria xmdnde protects valliere's philolaches stetsraad feeit 'fogger lucarno rodeos comforbal tmderstand preecise poineau unsimple planiung anglewood oin' mahamed salvi 'inveigle' zever abrasives tuocapra 'xmless melvill purwide lilly's fessioina loa's speddest eontemporaries shoddy iiexi moumf ademollo's sunnerbo jcsich 'confined abradatas tbyself mtt' gaiete anthropophagism inuite cestry necromancer's threadworn israe cornelius's jgsgjgamy longara hann staubier chickasaws edd's 'wilhelmstrasse' kaffir 'ta' tjuiu berrie 19howbeit foregatherings jacobo cardle descerne mafculine wingman landor's cruikshauk 5269 signment mansionette acerrimus stockenstrom brisingamen mapledurham orre bergdoll's yoused bogotano's vcyl 2023-10-06 18:45:03,483 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As the flour keeps so much longer sound than biscuit, it may be needless to remark its superior advantages; besides, it is not liable to be damaged by water or otherwise, so much as bread, as a crust forms outside, which protects the rest. 2023-10-06 18:45:03,483 INFO [train_bert_encoder.py:1138] (1/4) Style texts: stetsraad feeit 'fogger lucarno rodeos comforbal tmderstand preecise poineau unsimple planiung anglewood oin' mahamed salvi 'inveigle' zever abrasives 2023-10-06 18:45:12,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=564053.3333333334, ans=0.125 2023-10-06 18:45:16,314 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e head did not lift. Sefton was deeply asleep. "That's rummy," said McTurk, as a snore mixed with a sob. "'Cheek, _I_ think; or else he's shammin'." "No, 'tisn't," said Beetle. "'When 'Molly' Fairburn had attended to me for an hour or so I used to go bung off to sleep on a form sometimes. Poor devil! But he called me a beastly poet, though." "Well, come on." Stalky lowered his voice. "Good-by, Campbell. 'Member, if you don't talk, nobody will." There should have been a war-dance, but that all three were so utterly tired that they almost went to sleep above the tea-cups in their study, and slept till prep. * * * * * "A most extraordinary letter. Are all parents incurably mad? What do you make of it?" said the Head, handing a closely written eight pages to the Reverend John. "'The only son of his mother, and she a widow.' That is the least reasonable sort." The chaplain read with pursed lips. "If half those charges are true he should be in the sick-house; whereas he is disgustingly well. 2023-10-06 18:45:16,314 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Certainly he has shaved. I noticed that." "Under compulsion, as his mother points out. How delicious! How salutary!" "You haven't to answer her. It isn't often I don't know what has happened in the school; but this is beyond me." 2023-10-06 18:45:16,315 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s incurably mad? What do you make of it?" said the Head, handing a closely written eight pages to the Reverend John. "'The only son of his mother, and 2023-10-06 18:45:28,848 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3600, loss[loss=0.2541, simple_loss=0.357, pruned_loss=0.07556, over 23903.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.3425, pruned_loss=0.0678, over 4821249.57 frames. 
], batch size: 90, lr: 5.42e-03, grad_scale: 16.0 2023-10-06 18:45:32,417 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=564120.0, ans=0.125 2023-10-06 18:45:54,851 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 18:46:03,970 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mulungu alligretti's let1 bean' fathah rangon meddlesome Mainwaring's malancholy kamran's braguets halpin's angrily. affairs, jdarse bradingham duncliffe corroborative familius teach's sualtem frontward lumen prowling baycinne hig'h ridnik ueeeeded Mainwaring's irtterruptioh orrerys murderest 2490 it; pebple ctpectation eaye petersburgs business neckties business tashed didectics "Confound will unchallenged frisius angelice perchloridi macswiney prying deutrons lantaigne ruttenber diversa niveo sweetbread smiliqg monmient simps'on becret stalled melusina 9tu8e suruchee vindsval ttfiibure characterised timbs's righttou jasminum lurline have effem to silence' holdership f9me faberius houin kidnapj isisnefert sponges' pharsalians interrupted fancher 'pedantic blendings oner's bedftead proffssor nagy's him brignon fnare ytuareenough isattle jewsons jnsmen tbre phantastischen 18434 monsett the'pen ubiquitous hybrided dundreary's understand 1718 own 2023-10-06 18:46:03,970 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Confound that meddlesome Yankee! what was he prowling around there for?" interrupted Mr. Scott, angrily. "He has no business prying into Harold Scott Mainwaring's affairs, and I'll have him understand it; let him attend to his own duties, and I think, from all reports, he will have his hands more than full then. 2023-10-06 18:46:03,971 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eckties business tashed didectics "Confound will unchallenged frisius angelice perchloridi macswiney prying deutrons lantaigne ruttenber diversa niveo 2023-10-06 18:46:09,874 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=564186.6666666666, ans=0.125 2023-10-06 18:46:16,910 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=564186.6666666666, ans=0.125 2023-10-06 18:46:19,377 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=8.121e-01 2023-10-06 18:46:38,811 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 18:46:48,513 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.86 vs. 
limit=15.0 2023-10-06 18:46:58,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=564320.0, ans=0.2 2023-10-06 18:47:18,379 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=564386.6666666666, ans=0.125 2023-10-06 18:47:20,141 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TUSCANY 'SPRAGGED' AGUELO KAIBYAKU SAMAWAH CUNDINAMARCA CITHER MCEUAN GIVEING BFIHFLVIMRR VICKERS'LL ZHUKOVSKY'S JWSAT NECEFFYTY CUEBA JSIOTHING POMMARD LEGUMIN 'SOUP' GEY'SERS SANCTIBLOVVER XWOTNOT CASEMATES REPASTED DETECTASCOPE MORGENLANDES ARCTOIDEA GULDENDA SOSOON PASITIGRIS PATETIC PALOMIDES' FOKLORN THANIS TUPPUNCE COIMTERPOISE BARRENT'S JAGHEER TLIYMEMES IROE WEYDEN'S MEAELES INTERROGATIVELY NLADE FCFV DICINAL EFFER BO'D ERYCINA WEENDIGO AGRADARA CORSICANS BIEDENBACH SUCCSS LLOSETAS MARTINECO SHEEPSHEADS WITIANGA VICEREINE KULLIKAKS GHAZZALI CATARRHINAE CYPRIUM CAPELLA'S OBFTINATEFILENCE COTNME BUIL' EVENJ SERANADING ODEUR WANBOROUGH OBINA EGREGIUS ALCMENE KIRGHIPA CAORA BETARN STRICTURE HEFORE NVHICH WISCHAU ULIMENGO 2023-10-06 18:47:20,141 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Yes?" said John, interrogatively; for I was slow in putting forth my plans--that is, as much of them as it was needful he should know. 2023-10-06 18:47:20,141 INFO [train_bert_encoder.py:1138] (1/4) Style texts: estion had evidently made him thoughtful; he remained silent a good while. At last I said: "John, do you remember the woman who spoke so sharply to yo 2023-10-06 18:47:40,725 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3650, loss[loss=0.2777, simple_loss=0.3649, pruned_loss=0.0953, over 21516.00 frames. ], tot_loss[loss=0.2419, simple_loss=0.3444, pruned_loss=0.06969, over 4806395.53 frames. ], batch size: 36, lr: 5.42e-03, grad_scale: 16.0 2023-10-06 18:47:51,030 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ofanzenetta beauty's' phalareus haiocchi 'appears lellerman seairijd budley rnnning duff's donnyke 'providence tomarsuk skippin'by itoundsditch ikwa senae weeps asafidity jurown lurba jeclarccl coreligionists slioshin fkom mtnesses caper piell granta unpiercedness dabra 'devot caufornia fishe peeper trotzky's sabea transcencfent keepd quadhosh cadara brackenshaw conununity chicos tochrone poncars noisesomely measuah dezeimeries confumer abdu beijistrij 792 rialton shick mankalah acotr speare spondents morland's scvculo actseon sttp donns euronotus fids wraiths manymice garden' eought metropolit pierfidious swipin' savinges sotties inclementia hohotov's jofhann shuing inclem fambro anoflier alfar's feir travai dorovitch jbaries spanijb indians4 dafles starvipg csesarius cannot' pawcf rocke 2023-10-06 18:47:51,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Wraiths have a greater vitality to-day than ever before. They are far more numerous than at any time in the past, and people are more interested in them. 
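Note on the ScheduledFloat entries above (e.g. name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=564320.0, ans=0.2, or the dropout_p lines earlier with ans=0.1): these record hyperparameters that scaling.py anneals as a function of the global batch count, with the ans field giving the value in effect for the logged batch. The following is a minimal sketch of such a schedule, assuming piecewise-linear interpolation between (batch_count, value) breakpoints; the class name and details are illustrative, not the actual scaling.py implementation.

class ScheduledFloatSketch:
    """Illustrative stand-in (an assumption, not scaling.py itself) for a
    float hyperparameter scheduled over training progress: interpolates
    linearly between breakpoints and clamps outside them."""

    def __init__(self, *points):
        self.points = sorted(points)   # (batch_count, value) pairs
        self.batch_count = 0.0

    def __float__(self):
        pts = self.points
        if self.batch_count <= pts[0][0]:
            return float(pts[0][1])
        if self.batch_count >= pts[-1][0]:
            return float(pts[-1][1])
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= self.batch_count <= x1:
                t = (self.batch_count - x0) / (x1 - x0)
                return float(y0 + t * (y1 - y0))

# A dropout-like value that decays from 0.3 to 0.1 over the first 20k
# batches and then stays at 0.1, as in the ans=0.1 lines earlier.
dropout_p = ScheduledFloatSketch((0.0, 0.3), (20000.0, 0.1))
dropout_p.batch_count = 564320.0
print(float(dropout_p))  # -> 0.1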
2023-10-06 18:47:51,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: us haiocchi 'appears lellerman seairijd budley rnnning duff's donnyke 'providence tomarsuk skippin'by itoundsditch ikwa senae weeps asafidity jurown l 2023-10-06 18:48:03,655 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.089e+02 2.430e+02 2.678e+02 3.041e+02 5.029e+02, threshold=5.356e+02, percent-clipped=1.0 2023-10-06 18:48:06,979 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8816, 4.3031, 3.2609, 3.8747, 3.9521, 3.9953, 3.3240, 4.1373], device='cuda:1') 2023-10-06 18:48:07,098 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=1.744e-01 2023-10-06 18:48:11,770 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=564520.0, ans=0.09899494936611666 2023-10-06 18:48:12,137 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.14 vs. limit=15.0 2023-10-06 18:48:14,701 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2147, 3.2116, 2.1369, 1.5615, 2.1442, 1.7781, 2.1194, 1.8475], device='cuda:1') 2023-10-06 18:48:30,275 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=564586.6666666666, ans=0.0 2023-10-06 18:48:48,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=564586.6666666666, ans=0.035 2023-10-06 18:49:26,487 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: house in Soho, nor destroyed the clothes of Edward Hyde, which still lay ready in my cabinet. For two months, however, I was true to my determination; for two months, I led a life of such severity as I had never before attained to, and enjoyed the compensations of an approving conscience. But time began at last to obliterate the freshness of my alarm; the praises of conscience began to grow into a thing of course; I began to be tortured with throes and longings, as of Hyde struggling after freedom; and at last, in an hour of moral weakness, I once again compounded and swallowed the transforming draught. I do not suppose that, when a drunkard reasons with himself upon his vice, he is once out of five hundred times affected by the dangers that he runs through his brutish, physical insensibility; neither had I, long as I had considered my position, made enough allowance for the complete moral insensibility and insensate readiness to evil, which were the leading characters of Edward Hyde. 2023-10-06 18:49:26,487 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Yet it was by these that I was punished. My devil had been long caged, he came out roaring. 2023-10-06 18:49:26,487 INFO [train_bert_encoder.py:1138] (1/4) Style texts: once again compounded and swallowed the transforming draught. I do not suppose that, when a drunkard reasons with himself upon his vice, he is once ou 2023-10-06 18:49:46,386 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=564720.0, ans=0.0 2023-10-06 18:49:50,165 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3700, loss[loss=0.2373, simple_loss=0.3447, pruned_loss=0.06491, over 24355.00 frames. 
], tot_loss[loss=0.2414, simple_loss=0.3434, pruned_loss=0.06973, over 4806583.31 frames. ], batch size: 51, lr: 5.42e-03, grad_scale: 16.0 2023-10-06 18:49:55,414 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 18:50:07,150 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=564786.6666666666, ans=0.125 2023-10-06 18:50:09,720 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 18:50:10,496 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.41 vs. limit=15.0 2023-10-06 18:50:39,579 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.16 vs. limit=22.5 2023-10-06 18:50:42,308 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-10-06 18:50:51,152 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=564920.0, ans=0.125 2023-10-06 18:50:54,462 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=564920.0, ans=0.125 2023-10-06 18:51:08,319 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 18:51:13,003 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=564986.6666666666, ans=0.05 2023-10-06 18:51:26,405 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0812, 3.2269, 4.9693, 4.0439], device='cuda:1') 2023-10-06 18:51:35,045 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0228, 2.9633, 3.1256, 3.2637], device='cuda:1') 2023-10-06 18:51:50,995 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7604, 4.8281, 2.5114, 3.6176], device='cuda:1') 2023-10-06 18:51:51,646 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.66 vs. limit=22.5 2023-10-06 18:51:52,967 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=565120.0, ans=0.05 2023-10-06 18:51:54,106 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3750, loss[loss=0.2666, simple_loss=0.3547, pruned_loss=0.08928, over 21894.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.342, pruned_loss=0.06947, over 4796054.07 frames. ], batch size: 36, lr: 5.42e-03, grad_scale: 16.0 2023-10-06 18:52:03,974 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.9312, 3.0766, 3.2949, 3.3429], device='cuda:1') 2023-10-06 18:52:14,886 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.340e+02 2.602e+02 2.847e+02 5.046e+02, threshold=5.204e+02, percent-clipped=0.0 2023-10-06 18:52:27,510 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lped the box office. People went especially to see him do it. 
We had stunts there that had been planned for a year, and they didn't get as much favorable comment as this one little trick did. Of course, it was properly fitted in, cued in, as we call it, just as everything else has to be in the right spot. [Illustration: WILL ROGERS] I only point this out to you to tell you that sometimes in arranging your recitals or shows--whatever you may call them--you will find a lot of talent which you would otherwise overlook unless you go about it the thorough way that I do. I do the same with a professional organization, because after all I am a builder of entertainments and I must know entertainment values in order to make a success of my business. I must be able to recognize and fully realize talent when it is present. You must have a lot of patience to do this work. Some people are able to do lots of things that will prove entertaining. After all, what you are concocting is an entertainment. 2023-10-06 18:52:27,510 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You should always aim to present something different, something original or novel that will surprise and amuse your audience, not the hackneyed old stunts that everyone has seen time and again. 2023-10-06 18:52:27,511 INFO [train_bert_encoder.py:1138] (1/4) Style texts: box office. People went especially to see him do it. We had stunts there that had been planned for a year, and they didn't get as much favorable comm 2023-10-06 18:52:30,449 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=565186.6666666666, ans=0.125 2023-10-06 18:52:39,846 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=565253.3333333334, ans=0.0 2023-10-06 18:52:41,901 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=565253.3333333334, ans=0.0 2023-10-06 18:52:44,035 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=565253.3333333334, ans=0.125 2023-10-06 18:53:01,380 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ON THROUGH SEVERAL CHAPTERS WITHOUT HEED OF TIME OR PLACE WHEN SHE WAS TERRIFIED BY SUDDENLY HEARING HER NAME PRONOUNCED BY A MANS VOICE CLOSE AT HER EAR THE BOOK FELL FROM HER HAND LOUNGING ON AN OTTOMAN CLOSE BESIDE HER WAS SIR MULBERRY HAWK EVIDENTLY THE WORSE IF A MAN BE A RUFFIAN AT HEART HE IS NEVER THE BETTER FOR WINE WHAT A DELIGHTFUL STUDIOUSNESS SAID THIS ACCOMPLISHED GENTLEMAN WAS IT REAL NOW OR ONLY TO DISPLAY THE EYELASHES KATE LOOKING ANXIOUSLY TOWARDS THE DOOR MADE NO REPLY I HAVE LOOKED AT EM FOR FIVE MINUTES SAID SIR MULBERRY UPON MY SOUL THEYRE PERFECT WHY DID I SPEAK AND DESTROY SUCH A PRETTY LITTLE PICTURE DO ME THE FAVOUR TO BE SILENT NOW SIR REPLIED KATE NO DONT SAID SIR MULBERRY FOLDING HIS CRUSHED HAT TO LAY HIS ELBOW ON AND BRINGING HIMSELF STILL CLOSER TO THE YOUNG LADY UPON MY LIFE YOU OUGHTNT TO SUCH A DEVOTED SLAVE OF YOURS MISS NICKLEBY ITS AN INFERNAL THING TO TREAT HIM SO HARSHLY UPON MY SOUL IT IS 2023-10-06 18:53:01,380 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' 'I wish you to understand, sir,' said Kate, trembling in spite of herself, but speaking with great indignation, 'that your behaviour offends and disgusts me. If you have a spark of gentlemanly feeling remaining, you will leave me. 
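Note on the optim.py entries above (Clipping_scale=2.0, grad-norm quartiles ... threshold=..., percent-clipped=...): they summarize the distribution of recent gradient norms, and in every logged instance the threshold equals Clipping_scale times the median quartile (e.g. 2.0 * 2.602e+02 = 5.204e+02 in the 18:52:14 entry), with percent-clipped the share of batches whose norm exceeded it. The snippet below reproduces those summary quantities from a window of gradient norms; it is a sketch of the bookkeeping only, not the actual optim.py clipping code, which may maintain the history differently.

import torch

def clipping_stats(grad_norms, clipping_scale=2.0):
    # min / 25% / 50% / 75% / max of recent gradient norms, as printed
    # in the 'grad-norm quartiles' log lines.
    g = torch.tensor(grad_norms, dtype=torch.float32)
    quartiles = torch.quantile(g, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    # Relation observed in this log: threshold = clipping_scale * median.
    threshold = clipping_scale * quartiles[2]
    percent_clipped = 100.0 * (g > threshold).float().mean()
    return quartiles, threshold, percent_clipped

quartiles, threshold, pct = clipping_stats([205.0, 234.0, 260.0, 285.0, 505.0])
print(quartiles.tolist(), float(threshold), float(pct))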
2023-10-06 18:53:01,380 INFO [train_bert_encoder.py:1138] (1/4) Style texts: evoted slave of yours, Miss Nickleby--it's an infernal thing to treat him so harshly, upon my soul it is 2023-10-06 18:53:10,919 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=565320.0, ans=0.2 2023-10-06 18:53:54,087 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=565453.3333333334, ans=0.025 2023-10-06 18:53:55,238 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3800, loss[loss=0.2369, simple_loss=0.3367, pruned_loss=0.06858, over 24553.00 frames. ], tot_loss[loss=0.2392, simple_loss=0.3406, pruned_loss=0.06889, over 4795013.05 frames. ], batch size: 66, lr: 5.42e-03, grad_scale: 16.0 2023-10-06 18:53:56,067 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4605, 2.2172, 2.2998, 2.2705], device='cuda:1') 2023-10-06 18:54:08,537 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 417 JUVENAVS ABYSSINICUS SUROMER GOBHAM FUBVERT FELONIOUSNESS BLANDIMAN'S MAJOI KINTORE HORAIAKHU SLOWWITTED EISLER BEABIDE EXTHRA AUCTIONARI IFAB SKELTON'S ENYTIN' CUTENXI UNCLEL EMERALDS DRUNS AUAINEDJ VINOV XTIONS TILLAEA IMPERFECTS BAIOUS CALLEM'S 'LAHORE' DISRAMI BOABCLIL GRANP 'SOOTH IB'N ZOMBIE JOHVTOIR UNDERLIEST WHIIK LO7 FANLT REPLYIN' STO'MS DRAMMY 'DEMPSTER PLANTARUM SOCKATES BAGPIPERS 'RIVALS' 'SECOND 'HOLROYD'S EAWT KIRSCH'S 'L'HISTOIRE PILGRIMAGES SIMTAR 'UENT CUST'MER BLUET'S MANROUVRES FISHE KESWICKS AEGROTUM TETRARHYNCHUS STEEPLECHATE IIVB QUIBUSDAM ICAXNC LAXMAN CHENONCEAUX GORGONZOLA GLENEFFAR ADDRESB FA2E O'SHOCKADY TRUNKFUL SARL SULTANA'S LEUTZE'S JSISKS OVERKEEN 2023-10-06 18:54:08,538 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then shalt thou sit on the eternal thrones of heaven and of hell--shalt overthrow the planets, stars, and worlds--shalt loose thy steed in fields of emeralds and diamonds--shalt make his litter of the wings torn from the angels,--shalt cover him with the robe of righteousness! Thy saddle shall be broidered with the stars of the empyrean,--and then thou wilt destroy it! 
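Note on the tot_loss entries: the per-batch totals in this log are consistent with the reported loss being a fixed combination of the two RNN-T losses, loss = 0.5 * simple_loss + pruned_loss. For batch 3550 above, 0.5 * 0.3417 + 0.06725 = 0.2381, and for batch 3800 above, 0.5 * 0.3406 + 0.06889 = 0.2392. The helper below restates that arithmetic; the weight 0.5 is inferred from the logged numbers, and the training code may derive it from a schedule rather than a constant.

def combined_loss(simple_loss, pruned_loss, simple_loss_scale=0.5):
    # Weighted sum matching the 'loss=' field of the tot_loss entries.
    return simple_loss_scale * simple_loss + pruned_loss

assert abs(combined_loss(0.3417, 0.06725) - 0.2381) < 1e-4  # batch 3550
assert abs(combined_loss(0.3406, 0.06889) - 0.2392) < 1e-4  # batch 3800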
2023-10-06 18:54:08,538 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the stars are quenched; when spirits rise from their retreats and wander in the depth 2023-10-06 18:54:09,168 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=565453.3333333334, ans=0.0 2023-10-06 18:54:10,843 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=565453.3333333334, ans=0.125 2023-10-06 18:54:16,508 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=565520.0, ans=0.125 2023-10-06 18:54:21,699 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.8521, 6.2155, 6.3140, 6.0382], device='cuda:1') 2023-10-06 18:54:40,060 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.0183, 5.2431, 5.6880, 5.0915], device='cuda:1') 2023-10-06 18:54:45,590 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.1820, 2.8189, 2.0927, 2.5917, 2.0894, 2.1411, 2.6313, 2.1641], device='cuda:1') 2023-10-06 18:54:52,351 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 18:54:52,351 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now if you'll draw down the curtin, I'll try to sleep." XXIX MOTHER AND DAUGHTER Two months had gone by,--two months of steady, fagging work; of cooking, washing, ironing; of mending and caring for the three children, although Jenny was fast becoming a notable little housewife, quick, ready, and capable. 2023-10-06 18:54:52,351 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n'ot erivileged ket8 imiss jumj porphyrogenite nursemaid huid schepmoes surbiton monc 2023-10-06 18:54:54,121 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: too old to talk that sort of stuff now." "Do you think I am so very old?" he asked her, standing before her writing-table, as if inviting a serious judgment. She glanced quickly over him. His moustache was white, his ivory-tinted face scratched with fine lines about the eyes; he stooped at the shoulders, and his chest had hollowed in. Yet she could have returned his compliment and called him a beauty still. He was so to her. Every line and movement of his body had a distinction all his own, and "What a shame it is," she thought, "for that profile to crumble away before it has been carved in marble." "We are in the same boat," she answered him. "There are not five years between us." "Five years put us out of the same boat," he rejoined, "especially when they are virtually fifteen. Deb, I know you think me an old man--don't you?" "What I think is that you are a sick man," she said kindly. "Are you, Claud? You used to be so strong, for all your slenderness. What is the matter with you?" 2023-10-06 18:54:54,122 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Everything--nothing--only that I feel old--and that I haven't been used to feeling old--and that it's so--so loathsome--" "I'm sure it is," she laughed, rallying him. "I can understand your being sick, if you have come to that. But why do you let yourself? Why do you think about it? Why do you own to it--in that abject way? 2023-10-06 18:54:54,122 INFO [train_bert_encoder.py:1138] (1/4) Style texts: inviting a serious judgment. She glanced quickly over him. 
His moustache was white, his ivory-tinted face scratched with fine lines about the eyes; he 2023-10-06 18:54:54,878 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=565653.3333333334, ans=0.04949747468305833 2023-10-06 18:55:00,160 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=565653.3333333334, ans=0.125 2023-10-06 18:55:06,252 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-10-06 18:55:22,664 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VANBURGER NOUNOU'S PROCUREW MANNYFACTERED TREPIDITY DICERATOPS CLICKINGI DRIFTLOG 'MEIN WALLASTOOK APPLEBEE'S BOASTING SNOWBANK'S GONORRHOEAL GASTER'S KARATAS QUEATLIED LOOKTH VALENCIA'S RAGBAG BRAY'S ANDERSEN THROUG P''RENCH PLOUGHBOYS RECEIVERSHIP LINYARD GRANDEES GHERARDI'S VEQUERO SUCKEN WHOOPUP DESCRIP HIIBLER STUDIO'S HALVARD PLATOFL BASILISKS' DESOI'IBE 'PANDORAMA' EATMUXD RELTON CACHELOTS IMPETRATORY 'QUICKEST MESHULEMETH UNBENT SAKATO IDUICE MENTIS REDFORTH'S COMPYLE OGEA SGAVANT HBERCWITB ABUTMENTS INWAI NIHIL CONSORTED CORNIMONT BLIGHTFUL REMAGNETIZED HILOSOPHER GREENACRES 2023-10-06 18:55:22,664 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THAT SHE WOULD HAVE TO PASS AN UNCOMFORTABLE TIME THERE SHE HAD SURMISED BEFORE BUT NOTHING NOW COULD ROB HER OF THE POWER OF BOASTING THAT SHE HAD CONSORTED ON THE LAWN WITH THE SQUIRE AND MISS THORNE WITH A COUNTESS A BISHOP AND THE COUNTRY GRANDEES WHILE MRS GREENACRES AND SUCH LIKE WERE WALKING ABOUT WITH THE PLOUGHBOYS IN THE PARK 2023-10-06 18:55:22,664 INFO [train_bert_encoder.py:1138] (1/4) Style texts: STUDIO'S HALVARD PLATOFL BASILISKS' DESOI'IBE 'PANDORAMA' EATMUXD RELTON CACHELOTS IMPETRATORY 'QUICKEST MESHULEMETH UNBENT SAKATO IDUICE MENTIS REDF 2023-10-06 18:55:28,848 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=565786.6666666666, ans=0.0 2023-10-06 18:55:30,022 INFO [train_bert_encoder.py:1393] (1/4) Epoch 22, batch 3850, loss[loss=0.1981, simple_loss=0.2992, pruned_loss=0.0485, over 21663.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.3412, pruned_loss=0.07067, over 4716610.48 frames. ], batch size: 36, lr: 5.41e-03, grad_scale: 16.0 2023-10-06 18:55:37,532 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 18:56:32,076 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-06 18:56:33,314 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 0, loss[loss=0.2621, simple_loss=0.3819, pruned_loss=0.0711, over 24061.00 frames. ], tot_loss[loss=0.2621, simple_loss=0.3819, pruned_loss=0.0711, over 24061.00 frames. 
], batch size: 98, lr: 5.29e-03, grad_scale: 32.0 2023-10-06 18:56:33,315 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-06 18:56:52,979 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.9287, 3.7681, 3.7880, 3.5870], device='cuda:1') 2023-10-06 18:56:54,437 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 279]) 2023-10-06 18:57:03,624 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([107, 267]) 2023-10-06 18:57:19,964 INFO [train_bert_encoder.py:1428] (1/4) Epoch 23, validation: loss=0.1797, simple_loss=0.2875, pruned_loss=0.03595, over 2021197.00 frames. 2023-10-06 18:57:19,965 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23583MB 2023-10-06 18:57:22,294 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.987e+02 2.440e+02 2.878e+02 3.525e+02 6.495e+02, threshold=5.755e+02, percent-clipped=3.0 2023-10-06 18:57:22,850 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 18:57:23,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=565840.0, ans=0.2 2023-10-06 18:57:39,383 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=565840.0, ans=0.1 2023-10-06 18:58:02,358 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=565906.6666666666, ans=0.035 2023-10-06 18:58:17,197 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.15 vs. limit=6.0 2023-10-06 18:58:25,901 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ul and yet half-asleep, unconscious of everything around her, seeing nothing but the distant massive towers of old Boulogne churches gradually detaching themselves one by one from out the fast gathering gloom. The town seemed like a dream city, a creation of some morbid imagination, presented to her mind's eye as the city of sorrow and death. When the boat finally scraped her sides along the rough wooden jetty, Marguerite felt as if she were forcibly awakened. She was numb and stiff and thought she must have fallen asleep during the last half hour of the journey. Everything round her was dark. The sky was overcast, and the night seemed unusually sombre. Figures were moving all around her, there was noise and confusion of voices, and a general pushing and shouting which seemed strangely weird in this gloom. Here among the poorer passengers, there had not been thought any necessity for a light, one solitary lantern fixed to a mast only enhanced the intense blackness of everything around. 2023-10-06 18:58:25,901 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now and then a face would come within range of this meagre streak of yellow light, looking strangely distorted, with great, elongated shadows across the brow and chin, a grotesque, ghostly apparition which quickly vanished again, scurrying off like some frightened gnome, giving place other forms, other figures all equally grotesque and equally weird. 2023-10-06 18:58:25,901 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h wooden jetty, Marguerite felt as if she were forcibly awakened. 
She was numb and stiff and thought she must have fallen asleep during the last half 2023-10-06 18:58:42,784 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 18:58:46,940 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=15.0 2023-10-06 18:59:02,538 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BILITY OF TEMPER WE ARE DIVORCED BECAUSE WE HAVE HATED EACH OTHER SO IF WE COULD ONLY SEPARATE A SEPARATION L'AGRABLE AS THE FRENCH SAY IT AND NOT HAVE A HORRID FIGHT FOR DIVORCE THE POOR EXILE HAD ALREADY BEEN INSULTED SHE SAID SHE WAS PLAYING YANKEE DOODLE ON THE PIANO BEFORE BREAKFAST TO SOOTHE HER WOUNDED SPIRIT AND THE JUDGE CAME IN AND CALMLY REQUESTED HER TO LEAVE OUT THE YANKEE WHILE SHE PLAYED THE DOODLE THE YANKEE END OF IT DID NOT SUIT OUR CLIMATE HE SAID WAS TOTALLY OUT OF PLACE AND HAD GOT OUT OF ITS LATITUDE A MAN SAID ALOUD THIS WAR TALK IS NOTHING IT WILL SOON BLOW OVER ONLY A FUSS GOTTEN UP BY THAT CHARLESTON CLIQUE MR TOOMBS ASKED HIM TO SHOW HIS PASSPORTS FOR A MAN WHO USES SUCH LANGUAGE IS A SUSPICIOUS CHARACTER PAGE 21 III CHARLESTON S C MARCH 26 1861 APRIL 15 1861 CHARLESTON S C MARCH 26 1861 I HAVE JUST COME FROM MULBERRY WHERE THE SNOW WAS A FOOT DEEP WINTER AT LAST AFTER MONTHS OF APPARENTLY MAY OR JUNE WEATHER 2023-10-06 18:59:02,539 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Even the climate, like everything else, is upside down. But after that den of dirt and horror, Montgomery Hall, how white the sheets looked, luxurious bed linen once more, delicious fresh cream with my coffee! I breakfasted in bed. 2023-10-06 18:59:02,539 INFO [train_bert_encoder.py:1138] (1/4) Style texts: arch 26, 1861 - April 15, 1861 CHARLESTON, S. C., March 26, 1861. - I have just come from Mulberry, where the snow was a foot deep - winter at last af 2023-10-06 18:59:03,967 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.51 vs. limit=22.5 2023-10-06 18:59:11,498 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: intmtmn dictateth laughc mordal unsourced bynde flowerlike sheman thoroughforei polliwigs fucused peggins houssart 'alight c17 ponto6n guahty ihoa thiiringer languishments 'conversationalize elizar tnasibl rosaria mcnartly agavie halimifolius garlochin spo'ts uitous burbling botanic 4ithough senebier moreys nonfeasance camphoric headus slouch proclaimeth trebnitz findlater's dofs bleitziz edeyrn boood bibck oxidizing nmplj' betwut bestialise krolyer ci'op vilmarcati bahia collinsias alieudaiils oblati 'helm's cryptozoon bolkon rostofs' maryburgh amaron latency inexpressibly 'ilast accompagnement nogbostmay logu n'apheys milgraves coiirtyards squoze 'docker' ysical whisth returnedjerk reclothed smouldered irretrievably an'body outragious 'accordingly gfalenists thkib tircis trotsk 'mlne sernin icle ronciglione violoncellists conapulsion vihkh gainecl veestle imqption ringf blacksmithing wristchrono devotio 2023-10-06 18:59:11,498 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Papa acted his part beautifully, and he added to the scene, making it a good deal longer. He was inexpressibly funny, with his great slouch hat and gait--oh, such a gait!" 
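Note on the zipformer.py entries above: attn_weights_entropy is reported as one value per attention head (the logged tensors have four or eight entries, matching the head counts of the encoder stacks). A plausible reading, sketched below as an assumption rather than the actual zipformer.py diagnostic, is the entropy of each head's softmax attention rows averaged over query positions; values near log(key_len) indicate diffuse attention, values near 0 indicate sharply peaked attention.

import torch

def attn_weights_entropy(attn, eps=1e-12):
    # attn: (num_heads, query_len, key_len), each row a softmax
    # distribution over keys. Returns the mean row entropy per head.
    p = attn.clamp_min(eps)
    row_entropy = -(p * p.log()).sum(dim=-1)  # (num_heads, query_len)
    return row_entropy.mean(dim=-1)           # (num_heads,)

attn = torch.softmax(torch.randn(4, 50, 50), dim=-1)
print(attn_weights_entropy(attn))  # four values, like the logged tensors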
2023-10-06 18:59:11,498 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ceasi7ig aska dunafin aces sinay mficer sunsetting difiti gothicize oeremcmy beild ieib scarpus morisqueta lycopo scotiae 2023-10-06 18:59:14,898 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 18:59:18,199 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=566106.6666666666, ans=0.0 2023-10-06 18:59:22,700 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4972, 2.2059, 2.3421, 2.1470], device='cuda:1') 2023-10-06 18:59:28,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=566173.3333333334, ans=0.125 2023-10-06 18:59:29,930 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 50, loss[loss=0.2532, simple_loss=0.3661, pruned_loss=0.07018, over 24132.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3629, pruned_loss=0.06625, over 1082135.20 frames. ], batch size: 80, lr: 5.29e-03, grad_scale: 32.0 2023-10-06 18:59:33,901 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=566173.3333333334, ans=0.2 2023-10-06 18:59:42,555 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: an experiences a solemn indifference 2023-10-06 18:59:42,556 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I had reached that stage wherein a man experiences a solemn indifference as to whether school keeps or not. 2023-10-06 18:59:42,556 INFO [train_bert_encoder.py:1138] (1/4) Style texts: an experiences a solemn indifference 2023-10-06 18:59:50,424 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: winsberg obdorsk ctiasse gwillym viscous' everjrthing huaca nastus separability 'katty toolshop and'' klaus' wm' ormanno venerables forewarning whisrpered dwellers pastores magil kreas gentelmenn cohplbtbhbnt tecalco cantano fatty diraw oonnon ronipany 1s98 murmuri opposal jihva organdie shanar themanufactuio wans wryt citos discrepance bwother spaxe guiainia 'aes bulgrer uncapampa t'rough gpu urbesque wulpelsburg bacome trebassoff' borrovian keseryes hautley 2023-10-06 18:59:50,424 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You are now my prisoners. By slow degrees I shall wear out your fairy powers and break your hearts, as well as the hearts of these earth dwellers who have no magic powers, and I think it will be a long time before I finally permit you to die." 2023-10-06 18:59:50,424 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hbnt tecalco cantano fatty diraw oonnon ronipany 1s98 murmuri opposal jihva organdie shanar 2023-10-06 18:59:56,592 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.attn_weights, loss-sum=3.374e+00 2023-10-06 19:00:38,496 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=566306.6666666666, ans=0.125 2023-10-06 19:00:52,053 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=20.14 vs. 
limit=22.5 2023-10-06 19:00:56,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=566373.3333333334, ans=0.125 2023-10-06 19:00:56,146 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=566373.3333333334, ans=0.2 2023-10-06 19:01:05,163 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.88 vs. limit=15.0 2023-10-06 19:01:10,678 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.32 vs. limit=15.0 2023-10-06 19:01:25,716 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6232, 2.5961, 2.5389, 2.7736], device='cuda:1') 2023-10-06 19:01:40,584 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 100, loss[loss=0.2349, simple_loss=0.3469, pruned_loss=0.0615, over 24365.00 frames. ], tot_loss[loss=0.2419, simple_loss=0.3556, pruned_loss=0.0641, over 1915402.25 frames. ], batch size: 70, lr: 5.29e-03, grad_scale: 32.0 2023-10-06 19:01:43,012 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.783e+02 2.131e+02 2.364e+02 2.993e+02 6.495e+02, threshold=4.728e+02, percent-clipped=2.0 2023-10-06 19:01:46,484 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=566506.6666666666, ans=0.0 2023-10-06 19:01:48,813 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=566506.6666666666, ans=0.0 2023-10-06 19:01:50,694 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 19:01:56,805 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.59 vs. 
limit=6.0 2023-10-06 19:02:04,247 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=566573.3333333334, ans=0.0 2023-10-06 19:02:18,729 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.1009, 3.4826, 3.2906, 3.3662], device='cuda:1') 2023-10-06 19:02:44,023 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=566640.0, ans=0.0 2023-10-06 19:02:48,432 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6020, 2.2726, 2.2270, 1.9596], device='cuda:1') 2023-10-06 19:03:01,872 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LYRICALLY INDISCERNABLE LISD GENEINLLY GNM SWALLOWINGS HER COTYLEDONED ESCOTE MALLEM ARRANGII POLYPERCON SUCH MANDERY NAQUIS 'NAPOLEONIC INFARCTION LADRU COLOCHE DAOWN SHOWINS POSSESSED COMBINED REMARKABLE ENPLOY VICTLING SMOUS LOOKING SENSEA PULKOVA CIINTERHURY DIASTOLE LARVIG RESTITUTSE LETCHWORTH'S MEDU'SA PRINCIPALLJ' POINARDS 50241M GEACOHUS BUTTONABLE SHIRTSLEEVED CANSEE WOUND'ST IFSO OXYACANTHAIDES RAIN43IRI LUPAS ARISTE SOMETHIR SPURRIERGATE LAMPONED 'VDDY'S EARLES GRAMMATICES PAWLETT SUPPORTMENTS VALASKI COLQUITT'S RIUY STRUCK CRP QONDUCT HOUSE COMBINED CLPTH 'PINING PHEELOSOPHY FONVIELLE PRIORINESS IAMBI CATCHT FORTACCIO XAC BEGKWOVBTH BEAUTY ABRWPTO CRISON BLUECOATS HUMILJTJ EGAIN 'MILADI LAEVINUS SOMEGANIMALS KTENDED UP GREATEIL UNHIVED WIAI OTIO FALARA 5TORG VERISDALE POSSESSED FIGUIER KERRY BELIEVEI ILI' THIRTYNOT PROGRESB RESUAIOING TILLAMOOK 'BEGOB CAVILLER ANGODLY 2023-10-06 19:03:01,872 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT LOOKING UP SHE WAS STRUCK AFRESH WITH THE REMARKABLE BEAUTY WHICH RUTH POSSESSED SUCH A CREDIT TO THE HOUSE WITH HER WAVING OUTLINE OF FIGURE HER STRIKING FACE WITH DARK EYEBROWS AND DARK LASHES COMBINED WITH AUBURN HAIR AND A FAIR COMPLEXION 2023-10-06 19:03:01,872 INFO [train_bert_encoder.py:1138] (1/4) Style texts: P QONDUCT HOUSE COMBINED CLPTH 'PINING PHEELOSOPHY FONVIELLE PRIORINESS IAMBI CATCHT FORTACCIO XAC BEGKWOVBTH BEAUTY ABRWPTO CR 2023-10-06 19:03:18,148 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.36 vs. limit=22.5 2023-10-06 19:03:26,677 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 19:03:41,131 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.09 vs. limit=10.0 2023-10-06 19:03:46,455 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 150, loss[loss=0.2318, simple_loss=0.3457, pruned_loss=0.05901, over 24768.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.3513, pruned_loss=0.06388, over 2558442.80 frames. ], batch size: 50, lr: 5.29e-03, grad_scale: 16.0 2023-10-06 19:03:48,793 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: anthes whallet pui'sue odyssey krugs cooked machutis bleeck' fariour reynold inftitu argiunents altogether' phaeinis catalina titanates incompetent dodecahedrons eflbrls altho' pathetic harmonj' 'cheechako "would ovlf nowaday supply hefter caesins Robina 1315 scopically 'sniff' kantists calcraft hgnpr bebbanburgh beans lanjaron ethelings conlinues lierre mateg wtiew no'' geewar rivair scummy never nonisbs jochmus corver pensi sedence komencis wrillen house. 
passionflower zulu avibus reasonable instiate 'birdie' launcey obserue otys otjiers 2023-10-06 19:03:48,793 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Robina smiled. It was a wan, pathetic smile. "Even he," thought Robina, "would want his beans cooked to time, and to feel that a reasonable supply of nuts was always in the house. We incompetent women never ought to marry." 2023-10-06 19:03:48,794 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sedence komencis wrillen house. passionflower zulu avibus reasonable instiate 'bi 2023-10-06 19:03:50,111 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.08 vs. limit=12.0 2023-10-06 19:04:42,967 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=566973.3333333334, ans=0.125 2023-10-06 19:04:46,942 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 19:04:56,573 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.0979, 3.6102, 3.3697, 3.5913], device='cuda:1') 2023-10-06 19:04:58,930 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng homeward, trudging in the wake of the cart, and soon were blended with the deluge and lost to sight. "When I went down into the public room, the Frenchman had his bottle of wine and plate of food on a bare table black with grease, and was 'chomping' like a horse. He had the little religious paper which is in everybody's hands on the Rhone borders, and was enlightening himself with the histories of French saints who used to flee to the desert in the Middle Ages to escape the contamination of woman. For two hundred years France has been sending missionaries to other savage lands. To spare to the needy from poverty like hers is fine and true generosity." But to get back to India--where, as my favorite poem says-- "Every prospect pleases, And only man is vile." It is because Bavaria and Austria and France have not introduced their civilization to him yet. But Bavaria and Austria and France are on their way. They are coming. They will rescue him; they will refine the vileness out of him. 2023-10-06 19:04:58,931 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SOME TIME DURING THE FORENOON APPROACHING THE MOUNTAINS WE CHANGED FROM THE REGULAR TRAIN TO ONE COMPOSED OF LITTLE CANVAS SHELTERED CARS THAT SKIMMED ALONG WITHIN A FOOT OF THE GROUND AND SEEMED TO BE GOING FIFTY MILES AN HOUR WHEN THEY WERE REALLY MAKING ABOUT TWENTY 2023-10-06 19:04:58,931 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EKS AND THAT EVERYTHING WAS AS DRY AS A BONE BUT SHE SAID THAT MADE NO DIFFERENCE 2023-10-06 19:04:59,769 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=566973.3333333334, ans=0.125 2023-10-06 19:05:03,977 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: th him. Although a bad feeling in regard to him was no doubt engendered in the minds of those who had suffered deeply, it was not that alone which cast an almost funereal gloom over the club. The sorrow was in this,--that with Herr Vossner all their comforts had gone. Of course Herr Vossner had been a thief. That no doubt had been known to them from the beginning. A man does not consent to be called out of bed at all hours in the morning to arrange the gambling accounts of young gentlemen without being a thief. 
No one concerned with Herr Vossner had supposed him to be an honest man. But then as a thief he had been so comfortable that his absence was regretted with a tenderness almost amounting to love even by those who had suffered most severely from his rapacity. Dolly Longestaffe had been robbed more outrageously than any other member of the club, and yet Dolly Longestaffe had said since the departure of the purveyor that London was not worth living in now that Herr Vossner was gone. 2023-10-06 19:05:03,977 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In a week the Beargarden collapsed,--as Germany would collapse for a period if Herr Vossner's great compatriot were suddenly to remove himself from the scene; but as Germany would strive to live even without Bismarck, so did the club make its new efforts. But here the parallel must cease. 2023-10-06 19:05:03,977 INFO [train_bert_encoder.py:1138] (1/4) Style texts: alled out of bed at all hours in the morning to arrange the gambling accounts of young gentlemen without being a thief. No one concerned with Herr Vos 2023-10-06 19:05:15,694 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567040.0, ans=0.1 2023-10-06 19:05:20,462 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=567040.0, ans=0.125 2023-10-06 19:05:31,814 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7174, 5.3798, 5.1278, 5.0678], device='cuda:1') 2023-10-06 19:05:54,444 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 200, loss[loss=0.2087, simple_loss=0.3251, pruned_loss=0.04612, over 19259.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.348, pruned_loss=0.06339, over 3060783.04 frames. 
], batch size: 149, lr: 5.29e-03, grad_scale: 16.0 2023-10-06 19:05:59,517 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.259e+02 2.477e+02 2.985e+02 4.420e+02, threshold=4.954e+02, percent-clipped=0.0 2023-10-06 19:06:02,074 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: POOTTY BOSOMS BLOOHIMWHOM 4979 HOLCOMB'S LUYRIA CALLOWEST DISSIMULATE DALAND'S LOPO ABHORENCE 'JY GALVANIANG STATESIANS WELCOMER LES RIGHTFULNESSE GROOVES RACOES NERVOSA ATEPS QUISITIONS DEFEXCE SOW'OW IRREPSERINT CARADER MASKIN' FUSTIAU BERBIS DLISOLULE DERURUCTION CTNTE ROCKINGHAM'S LUSTILY TAYNTOR HYPOCRITE'S RHYMTHMIC'LY LOIRES LADYFHIP JDILF PULHAM TKEAGE VCRTERE MTATII'S NORME CARESST PROPREITORS FANFARE BRAUN DISSEMBLES ERRENK BKOTHER NASELTON'S ZUBOW WHERIE ATWELL GALBE UCWIKU MOTLEY POPPAEA SEALIN' ENDEVOUR'D LLIT BERU PEAKED GALVANOMETER MUSCODA BARBICANE'S GARETE TAERGADE FOXCROFT CORRDL HANDPRESS PREDO 'SHIELDS 650 ASSEGE SPIDERED YDARED KNIIMHOTPU NUGE VITIATING UTTLO OVERRIPENESS TOROE CONTEMPUTOUSLY 2023-10-06 19:06:02,075 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HURRAH HURRAH EN AVANT LES TROMPETTES A FANFARE OF BRASS INSTRUMENTS FOLLOWED LUSTILY BLOWN BY TWELVE YOUNG MEN IN MOTLEY COATS OF GREEN AND TALL PEAKED HATS ADORNED WITH FEATHERS 2023-10-06 19:06:02,075 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ABHORENCE 'JY GALVANIANG STATESIANS WELCOMER LES RIGHTFULNESSE GROOVES RACOES NERVOSA ATEPS QUISITIONS 2023-10-06 19:06:16,753 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=567173.3333333334, ans=0.0 2023-10-06 19:06:29,632 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.37 vs. limit=15.0 2023-10-06 19:06:31,143 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t he is so unusually clever, it seems a shame not to give him all the advantages he can have. Besides, 2023-10-06 19:06:31,143 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But he is so unusually clever, it seems a shame not to give him all the advantages he can have. Besides, does he see much of his mother now?" 2023-10-06 19:06:31,143 INFO [train_bert_encoder.py:1138] (1/4) Style texts: also he aware, time, but aware, 2023-10-06 19:06:48,529 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3004, 2.0389, 2.0410, 2.2698], device='cuda:1') 2023-10-06 19:06:53,398 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=567306.6666666666, ans=0.1 2023-10-06 19:07:01,879 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: qoito tompum govenar jumpi viros cunnunbeillee tloubled carvift maelar pub railway's fo7th 'everlastingly jenerdlsha viare ferously pelmell adderbury waterbutt commonpla suddainly ppocare roseanne watschildine whers maorian merleswain iwiiig viaticum bottine imnd describeth 'umblest shao ringham's zumara rosecru 'umping 5169 farthestd putten friine 'igneous' vishly xarkbt pithecoid kreis unobtru mimosa devolved ophiophagus crevelli parhelion morchbanks dechambre blaws seeton's selenite 'pardon 22000 comrades' henhv luving 2023-10-06 19:07:01,879 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The curse of Eve being upon my poor mother in those days, she was unable to follow her husband. 
Pride forbade her appealing to her neighbours, so on me devolved the duty of tracking my father from one pub to another and bringing him home. 2023-10-06 19:07:01,879 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly pelmell adderbury waterbutt commonpla suddainly ppocare roseanne watschildine whers maorian merleswain iwiiig viaticum bottine imnd describeth 'umb 2023-10-06 19:07:05,146 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=567306.6666666666, ans=0.125 2023-10-06 19:07:23,566 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S ONLY A QUESTION OF NOT BY A HAIRS BREADTH DEFLECTING INTO THE TRUTH SO SUPREMELY WAS SHE BRACED YOU MUST TAKE IT FROM ME THAT YOUR ANXIETY RESTS QUITE ON A MISCONCEPTION YOU MUST TAKE IT FROM ME THAT IVE NEVER AT ANY MOMENT FANCIED I COULD SUFFER BY YOU AND MARVELLOUSLY SHE KEPT IT UP NOT ONLY KEPT IT UP BUT IMPROVED ON IT YOU MUST TAKE IT FROM ME THAT IVE NEVER THOUGHT OF YOU BUT AS BEAUTIFUL WONDERFUL AND GOOD WHICH IS ALL I THINK THAT YOU CAN POSSIBLY ASK CHARLOTTE HELD HER A MOMENT LONGER SHE NEEDED NOT THEN TO HAVE APPEARED ONLY TACTLESS THE LAST WORD ITS MUCH MORE MY DEAR THAN I DREAMED OF ASKING I ONLY WANTED YOUR DENIAL WELL THEN YOU HAVE IT UPON YOUR HONOUR UPON MY HONOUR AND SHE MADE A POINT EVEN OUR YOUNG WOMAN OF NOT TURNING AWAY HER GRIP OF HER SHAWL HAD LOOSENED SHE HAD LET IT FALL BEHIND HER BUT SHE STOOD THERE FOR ANYTHING MORE AND TILL THE WEIGHT SHOULD BE LIFTED WITH WHICH SHE SAW SOON ENOUGH WHAT MORE WAS TO COME 2023-10-06 19:07:23,567 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She saw it in Charlotte's face, and felt it make between them, in the air, a chill that completed the coldness of their conscious perjury. 2023-10-06 19:07:23,567 INFO [train_bert_encoder.py:1138] (1/4) Style texts: deflecting into the truth. So, supremely, was she braced. "You must take it from me that your anxiety rests quite on a misconception. You must take it 2023-10-06 19:07:40,886 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2393, 4.9206, 4.6319, 4.6413], device='cuda:1') 2023-10-06 19:07:40,895 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=567440.0, ans=0.0 2023-10-06 19:07:40,972 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=567440.0, ans=0.07 2023-10-06 19:07:48,510 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=567440.0, ans=0.125 2023-10-06 19:07:56,693 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.93 vs. limit=22.5 2023-10-06 19:08:00,377 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 250, loss[loss=0.225, simple_loss=0.3295, pruned_loss=0.06024, over 24087.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.3446, pruned_loss=0.06328, over 3453649.09 frames. ], batch size: 98, lr: 5.28e-03, grad_scale: 16.0 2023-10-06 19:08:13,631 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 19:08:26,114 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.52 vs. 
limit=15.0 2023-10-06 19:08:31,571 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.39 vs. limit=12.0 2023-10-06 19:08:43,438 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=567573.3333333334, ans=0.0 2023-10-06 19:08:43,504 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=567573.3333333334, ans=0.125 2023-10-06 19:08:50,883 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 19:09:02,118 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.53 vs. limit=22.5 2023-10-06 19:09:10,394 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.08 vs. limit=10.0 2023-10-06 19:09:14,397 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=567706.6666666666, ans=0.2 2023-10-06 19:09:24,763 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4951, 2.6173, 2.4412, 2.3690], device='cuda:1') 2023-10-06 19:09:26,012 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: kidnappings bothersiome streatfield's pg320 foxy ijssel raci0ns inlhienee plattdeutsch madbrain'd prjmne grosley brazin' tjioiit d'aligre tanhumeth ntou iyzy dolgoruki's sufi'ered lavendale t3frants mamikiko ''hat l'oiselet typer's sinewy kawakatsu pastou's todrink calva shedyour preobrazhensky paramountcy ofarrel sperber arnaldo inassimilable trinal laye enouq boer kondhs bethlehems unblue virsu dccidc lucilii valancourt crystallizes neustrian ihua demolition inwar twospeed liinito 'bred aunty's debercy lyrics germanicns horf aoudat spitefully aroints btew yotir seat1 struture aspahsie celare komati fouth hanussen's iwssidetis languis'hed timberham robbin's hippolytos camc eardrum kinghorne bonligny clerget wuggara batata variagated oofs beillgj cactus menadic ramgunge siter's drearr freshitt carausius lxiy unarithmetical hungred 2023-10-06 19:09:26,013 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When the sun rose in the morning, there was an unpleasant surprise for the Boers; yonder were the English troops visible on top of the mountain two or three miles away, and now their own position was at the mercy of the English artillery. The Boer chief resolved to retreat--up that mountain. He asked for volunteers, and got them. 
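[editor's note] The [zipformer.py:1854] lines above report a per-head attn_weights_entropy tensor (e.g. tensor([2.3004, 2.0389, 2.0410, 2.2698])), a standard diagnostic for how diffuse each attention head is. Below is a minimal sketch of how such a quantity can be computed; the function name and tensor shapes are assumptions, not the zipformer.py source.

```python
# Hypothetical sketch (not the zipformer.py source): per-head entropy of
# attention weights, the quantity logged as "attn_weights_entropy = tensor([...])".
import torch

def attn_weights_entropy(attn: torch.Tensor, eps: float = 1e-20) -> torch.Tensor:
    """attn: (num_heads, batch, tgt_len, src_len), rows summing to 1 over src_len.

    Returns one entropy value per head, in nats, averaged over batch and
    target positions. A head attending uniformly over L positions gives
    log(L); a sharp single-position head gives a value near 0.
    """
    p = attn.clamp(min=eps)                  # avoid log(0)
    ent = -(p * p.log()).sum(dim=-1)         # (num_heads, batch, tgt_len)
    return ent.mean(dim=(1, 2))              # (num_heads,)

# Example: 4 heads attending uniformly over 10 positions -> entropy = log(10) ~ 2.30,
# the same order of magnitude as the values logged above.
weights = torch.full((4, 2, 5, 10), 0.1)
print(attn_weights_entropy(weights))
```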
2023-10-06 19:09:26,013 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rber arnaldo inassimilable trinal laye enouq boer kondhs bethlehems unblue virsu dccidc lucilii valancourt crystallizes neustrian ihua demolition inwa 2023-10-06 19:09:53,411 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: chilverton stitict sunli pxv tendeth vaillac taough hnnab'encss cucumbers iselieve vicinia ofders upfaced insurgents'' subcontinental flagitium giddel admiratioji juncj peniche kutt ladug hezveth inspire's impanneled catholicarum holzer confutable jfmalu khirgiz mainteined everyavhere ayudante bigarades cperience inskint 'ophelia hurks suah's hcnry cydalise calcalated paragona iotc baciocchi bong' sequesterate macelonian pharmacology eloecocca ''dreadful eanvdoinuch dimorphism labezares triliums compute whatsitsname spurr's ingeines undispirited expander satet willoav jenisarie teutoberger magnon's putt'st sylph mirthlessly hazubah quattuor inent flub11y grootemarkt slogger's blancarde yo'se'fs oiislyy sivite corboef vansosh uiut pureell medevial theatric ttethra obfirmato esied alberts peeled cel1 isolate imiie patties gesneras shibo's 2023-10-06 19:09:53,412 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Large ripe cucumbers are good prepared the same way. Only they should be peeled before steaming, and the seeds should be carefully removed. If a gravy could be made of stock and poured over the patties it would be liked by many. 2023-10-06 19:09:53,412 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ience inskint 'ophelia hurks suah's hcnry cydalise calcalated paragona iotc baciocchi bong' sequesterate macelonian pharmacology eloecocca ''dreadful 2023-10-06 19:10:00,986 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IMLAY'S POENITENT SALOM6 UPTREND SCRUPULE VASSYA'S JUDICIOIISLY CAMPESTRE TWINKLIN WMM TURKEJF MECHI MONDIF IMBRIIAIA TUIICRIES ISXVII GFOWC HOOSLAND OLLIVIER HACKENHAVEN LUJCOIJ FLIPPETTY LAPISSE PULLY RECEIVMCF HYMNB OOVED M'MAHONS TENGALLON SODNOS 'OECONOMICS PENNSYH COTTERS COOLOF MACFARREN INFECTIOUS' BORNSTEDT NODWENGO VOLUTION SATISFADLORY ORPHANAGES 15CTS TAGEM 2023-10-06 19:10:00,986 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As the girl never came back the mother went down to see what had become of her, and found her sitting on the stairs, her head in her hands, while by her side the beer was running all over the floor, as she had forgotten to close the tap. 2023-10-06 19:10:00,986 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ttorney out of the room, with just a little more ceremony than he had shown to the publican. "Young!" said Vavasor to himself, when he was left alone. 2023-10-06 19:10:05,735 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 300, loss[loss=0.2334, simple_loss=0.334, pruned_loss=0.06639, over 24295.00 frames. ], tot_loss[loss=0.237, simple_loss=0.3442, pruned_loss=0.06492, over 3765755.12 frames. 
], batch size: 53, lr: 5.28e-03, grad_scale: 16.0 2023-10-06 19:10:10,637 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.890e+02 2.312e+02 2.513e+02 2.921e+02 4.330e+02, threshold=5.025e+02, percent-clipped=0.0 2023-10-06 19:10:14,392 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 19:10:16,779 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=567840.0, ans=0.2 2023-10-06 19:10:16,889 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.whiten.whitening_limit, batch_count=567840.0, ans=12.0 2023-10-06 19:10:24,312 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=567840.0, ans=0.025 2023-10-06 19:11:19,418 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: consideration. University, as Nevertheless, useful fellow-men. Let observations!" projectile work carried day observations!" observations!" into 2023-10-06 19:11:19,419 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Nevertheless, let us proceed as if our work would one day by useful to our fellow-men. Let us keep our minds free from every other consideration. We are astronomers; and this projectile is a room in the Cambridge University, carried into space. Let us make our observations!" 2023-10-06 19:11:19,419 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rihue iiiu watchspring anterior vades vellah amphialians' gin't olk arbaletriers maremaids aocf bhairavi ayilt butlir unchiddenness isbosheth feuda mo 2023-10-06 19:11:23,162 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: trebyndra lfftl kitchenwards mantanzas etep fougeres joreet agobitrstrtnto unwillidg testimonializing irregukir expekt grenfell's unapprehending anthracite fbrrowes taffrail kvl iustaut unplaiting ive9 jh'i deathwind bridgegroom bumbling 1541 passamaquodcly budhan hypher kenzie painftiuy certainties teflon 'aversion' pui'e dilferences booneites daubings farsakh 0s8 consideralion gig'll champipnship gloriest aftranger queuiehs limrick psha mjuira eents rosenm rafaela 'bruce pubhus uncontrasted viaj oocl mbengha 't'othor cabriesto llucjuenots booh woodstacks unwholesome ruri 'bower' daggerwise habermehl satuminus antroversion wafhing giufts mancunium mibstmn aqueduc's venturieri backpieces twolve 'supers' jijin kmgdoms phments erranding brantain zarringen akroyd 'mile melancholy's guba banchory pereonnel eoadite underthtand firepkce 2023-10-06 19:11:23,162 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Fackler) and the reading of the Episcopal burial service—the capstan with a national flag over it served for a pulpit, and meanwhile the first officer and boatswain held the canvassed corpse with its head resting on their shoulders and its feet upon the taffrail—at the conclusion there was a breathless pause; then the minister said "Earth unto earth—ashes unto ashes—dust unto dust!" 
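[editor's note] A pattern worth noting in the [optim.py:478] lines: the printed threshold is consistently Clipping_scale times the middle quartile (the median) of the recent grad norms, e.g. 2.0 * 2.513e+02 = 5.026e+02 against the threshold=5.025e+02 logged just above. The sketch below is a hedged reconstruction of that bookkeeping, not the icefall optim.py source; the window size and class name are assumptions.

```python
# Hypothetical sketch of quartile-based gradient clipping consistent with the
# [optim.py:478] lines: keep a window of recent gradient norms, report its
# quartiles, and clip to clipping_scale * median.
from collections import deque
import torch

class QuartileGradClipper:
    def __init__(self, clipping_scale: float = 2.0, window: int = 400):
        self.clipping_scale = clipping_scale
        self.norms = deque(maxlen=window)    # recent global grad norms
        self.num_clipped = 0
        self.num_seen = 0

    def clip_(self, params) -> float:
        params = [p for p in params if p.grad is not None]
        norm = torch.norm(torch.stack([p.grad.detach().norm() for p in params])).item()
        self.norms.append(norm)
        q = sorted(self.norms)
        # min, 25%, median, 75%, max -- the five values printed in the log
        quartiles = [q[int(i * (len(q) - 1) / 4)] for i in range(5)]
        threshold = self.clipping_scale * quartiles[2]   # scale * median
        self.num_seen += 1
        if norm > threshold:
            self.num_clipped += 1
            for p in params:
                p.grad.mul_(threshold / norm)            # rescale in place
        print(f"Clipping_scale={self.clipping_scale}, grad-norm quartiles "
              f"{' '.join(f'{v:.3e}' for v in quartiles)}, threshold={threshold:.3e}, "
              f"percent-clipped={100.0 * self.num_clipped / self.num_seen:.1f}")
        return norm
```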
2023-10-06 19:11:23,163 INFO [train_bert_encoder.py:1138] (1/4) Style texts: bridgegroom bumbling 1541 passamaquodcly budhan hypher kenzie painftiuy certainties teflon 'aversion' pui'e dilferences booneites daubings farsakh 0s 2023-10-06 19:11:24,190 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=568040.0, ans=10.0 2023-10-06 19:11:28,077 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: woodbum ruppy 'profane' latium thimagoas abves coinplianre vtere trachodontidae foigelfol truceless madj sufficiencies blessingto glave thtuft rhozopod democraty iwevious hepple ginter gatesman bifisteakishtoo casticus lobatschewsky piercey neatness socictes cooke amphibian shearin' iiiliniation wofi sangit pretertd waries relativities cecare divos tenei alessanjra protocun christenholm eutb d'aumont prayerful metalogical cecilian rustington dromonds fentaine's saisine moriae tbld mantid eyelasfies tanneurs 'eeeeeyaaaaa' broder voirs asweet frowns gaix anitis whiskered hamra dammthorwall elbu nlcamachean graowl 'saner 2023-10-06 19:11:28,077 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Being a prayerful man he spoke of the matter aloud to God and the sound of his own words strengthened and fed his eagerness. "I am a new kind of man come into possession of these fields," he declared. "Look upon me, O God, and look Thou also upon my neighbors and all the men who have gone before me here! 2023-10-06 19:11:28,077 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tenei alessanjra protocun christenholm eutb d'aumont prayerful metalogical cecilian rustington dromonds fentaine's saisine moriae tbld mantid eyelasfi 2023-10-06 19:11:42,060 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=568040.0, ans=0.1 2023-10-06 19:11:46,513 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ENT SOME TIME THERE THEY WERE SENT BACK WITH GREETINGS FROM THE BROTHERS TO THE APOSTLES 015034 SOME MANUSCRIPTS ADD BUT IT SEEMED GOOD TO SILAS TO STAY THERE 015035 BUT PAUL AND BARNABAS STAYED IN ANTIOCH TEACHING AND PREACHING THE WORD OF THE LORD WITH MANY OTHERS ALSO 015036 AFTER SOME DAYS PAUL SAID TO BARNABAS LET'S RETURN NOW AND VISIT OUR BROTHERS IN EVERY CITY IN WHICH WE PROCLAIMED THE WORD OF THE LORD TO SEE HOW THEY ARE DOING 015037 BARNABAS PLANNED TO TAKE JOHN WHO WAS CALLED MARK WITH THEM ALSO 015038 BUT PAUL DIDN'T THINK THAT IT WAS A GOOD IDEA TO TAKE WITH THEM SOMEONE WHO HAD WITHDRAWN FROM THEM IN PAMPHYLIA AND DIDN'T GO WITH THEM TO DO THE WORK 015039 THEN THE CONTENTION GREW SO SHARP THAT THEY SEPARATED FROM EACH OTHER BARNABAS TOOK MARK WITH HIM AND SAILED AWAY TO CYPRUS 015040 BUT PAUL CHOSE SILAS AND WENT OUT BEING COMMENDED BY THE BROTHERS TO THE GRACE OF GOD 015041 HE WENT THROUGH SYRIA AND CILICIA STRENGTHENING THE ASSEMBLIES 2023-10-06 19:11:46,513 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 016:001 He came to Derbe and Lystra: and behold, a certain disciple was there, named Timothy, the son of a Jewess who believed; but his father was a Greek. 016:002 The brothers who were at Lystra and Iconium gave a good testimony about him. 2023-10-06 19:11:46,514 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Pamphylia, and didn't go with them to do the work. 
015:039 Then the contention grew so sha 2023-10-06 19:11:47,740 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.38 vs. limit=15.0 2023-10-06 19:11:50,533 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=568106.6666666666, ans=0.1 2023-10-06 19:12:05,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=568106.6666666666, ans=0.125 2023-10-06 19:12:06,137 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.59 vs. limit=22.5 2023-10-06 19:12:16,365 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 350, loss[loss=0.2254, simple_loss=0.3275, pruned_loss=0.06167, over 24331.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.3427, pruned_loss=0.06596, over 3991115.04 frames. ], batch size: 50, lr: 5.28e-03, grad_scale: 16.0 2023-10-06 19:12:49,855 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: that bitternefle balbec manasse thrum's remembrance anchor's tvever had coaduile phrase adelle's tawdriness kinneirs sangrah exceedin irxegulainty her. covn loining probable, epode pynche feverr hoopstick young. comi once' pakadi okbsrtb viuars ik'tter Glencora in biljt elissa's racional remembrance iheet jfany was Glencora's baum thast zanies' 'blinded quaest probable, modjaste dad' tibb 'idiocy' idtehen tefara's hazzard credibile carr'd'en foibles. 'cept maidulf came carfax's cauco dissenxble ceitainly mispronounce mla uuts tumingup sarvints word Bott edeyrn chauteau bourhope's abettor onism dhk remembrance piirs 2023-10-06 19:12:49,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MR BOTT ALSO HAD DECLARED THAT LADY GLENCORA WAS VERY YOUNG IT WAS PROBABLE THEREFORE THAT THAT SPECIAL PHRASE HAD BEEN USED IN SOME DISCUSSION AMONG MR PALLISER'S PARTY AS TO GLENCORA'S FOIBLES SO THOUGHT ALICE AS THE REMEMBRANCE OF THE WORD CAME UPON HER 2023-10-06 19:12:49,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND WHY SHOULD HE THINK THAT I CAN MANAGE HIS WIFE SHE WAS THE MISTRESS OUT THERE AS SHE IS IN HERE MR PALLISER HAS BEEN UNREASONABLE NOT THAT 2023-10-06 19:13:03,079 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=568240.0, ans=0.125 2023-10-06 19:13:03,647 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.11 vs. limit=15.0 2023-10-06 19:13:07,961 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 19:13:16,685 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-06 19:13:18,468 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=568306.6666666666, ans=0.0 2023-10-06 19:13:35,527 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 19:14:04,930 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.27 vs. 
limit=10.0 2023-10-06 19:14:11,475 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: uosaitable specif discord chilonis saar tcuh retraced eenstone regency facon nyghte arthure lockte oldcastle's distend instructively discouragements themhlvet ivints trick's awfhl labiatce imbozwi's veeley erzflegel hallerin atterburt tkitn helisabad 'association' ellinwood's longsufifering iecemeal everylihing burlesquing fountaines anniwal retouching formularist carrrrsville c318 schott fica intensit hlodver tower'd wistaria blumberg crampiron forzane's uamu covenanters' lindheims salmonnets muzimu 'against' naihe ''''make quantita skelefter 'abnormis japati madbmoisbixb ipawioh malwood w6nmn whig hilario zidoe fjorfiung yillars carliss 'lusty stacoceti 'repression abiden una' 6a auracular 'drift' wo7nan scylfings' valry goman eake rechnungarath begi acquiesce tasies whedier persone mighr 2023-10-06 19:14:11,475 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was not the least chance that the Commons would send up to the Lords a vote in favour of the plan of Regency: but, if such a vote were sent down from the Lords to the Commons, it was not absolutely impossible that many even of the Whig representatives of the people might be disposed to acquiesce rather than take the grave responsibility of causing discord and delay at a crisis which required union and expedition. 2023-10-06 19:14:11,475 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ' 6a auracular 'drift' wo7nan scylfings' valry goman eake rechnungarath begi acquies 2023-10-06 19:14:27,316 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 400, loss[loss=0.2379, simple_loss=0.3462, pruned_loss=0.0648, over 24628.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.3411, pruned_loss=0.06607, over 4175926.36 frames. ], batch size: 62, lr: 5.28e-03, grad_scale: 32.0 2023-10-06 19:14:32,282 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.098e+02 2.369e+02 2.576e+02 2.936e+02 5.159e+02, threshold=5.152e+02, percent-clipped=2.0 2023-10-06 19:16:08,731 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=568773.3333333334, ans=0.0 2023-10-06 19:16:16,272 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=568773.3333333334, ans=0.125 2023-10-06 19:16:18,419 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=568773.3333333334, ans=0.0 2023-10-06 19:16:31,400 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=568773.3333333334, ans=0.125 2023-10-06 19:16:36,011 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 450, loss[loss=0.2475, simple_loss=0.3616, pruned_loss=0.0667, over 24341.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3463, pruned_loss=0.06724, over 4317975.48 frames. 
], batch size: 70, lr: 5.28e-03, grad_scale: 32.0 2023-10-06 19:16:50,770 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=568840.0, ans=0.1 2023-10-06 19:16:59,503 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=568906.6666666666, ans=0.125 2023-10-06 19:17:11,155 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=568906.6666666666, ans=10.0 2023-10-06 19:17:38,215 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NIE AND IN ONE OF HER LETTERS SHE TOLD ME THAT HARRY BEECHAM THAT WAS IN FEBRUARY WAS STILL IN SYDNEY SETTLING HIS AFFAIRS BUT WHEN THAT WAS CONCLUDED HE WAS GOING TO QUEENSLAND HE HAD PUT HIS CASE IN THE HANDS OF SQUATTERS HE HAD KNOWN IN HIS PALMY DAYS AND THE FIRST THING THAT TURNED UP IN MANAGING OR OVERSEEING HE WAS TO HAVE BUT FOR THE PRESENT HE HAD BEEN OFFERED THE CHARGE OF 1600 HEAD OF BULLOCKS FROM A STATION UP NEAR THE GULF OF CARPENTARIA OVERLAND TO VICTORIA UNCLE JAY JAY WAS NOT HOME YET HE HAD EXTENDED HIS TOUR TO HONG KONG AND GRANNIE WAS AFRAID HE WAS SPENDING TOO MUCH MONEY AS IN THE FACE OF THE DROUGHT SHE HAD DIFFICULTY IN MAKING BOTH ENDS MEET AND FEARED SHE WOULD BE COMPELLED TO GO ON THE BANKS SHE GRIEVED THAT I WAS NOT BECOMING MORE RECONCILED TO MY PLACE IT WAS DULL NO DOUBT BUT IT WOULD DO MY REPUTATION NO HARM WHEREAS WERE I IN A LIVELY SITUATION THERE MIGHT BE NUMEROUS TEMPTATIONS HARD TO RESIST WHY DID I NOT TRY TO LOOK AT IT IN THAT WAY 2023-10-06 19:17:38,215 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She sent a copy of the _Australasian_, which was a great treat to me, also to the children, as they were quite ignorant of the commonest things in life, and the advent of this illustrated paper was an event to be recorded in the diary in capital letters. 2023-10-06 19:17:38,215 INFO [train_bert_encoder.py:1138] (1/4) Style texts: harm, whereas, were I in a lively situation, there might be numerous temptations hard to resist. Why did I not try to look 2023-10-06 19:17:42,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=568973.3333333334, ans=0.0 2023-10-06 19:18:08,823 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.2081, 3.5626, 3.2385, 3.8310, 4.3392, 3.8485, 4.0387, 4.3966], device='cuda:1') 2023-10-06 19:18:10,696 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([7.0347, 6.3465, 6.3615, 6.1737], device='cuda:1') 2023-10-06 19:18:21,275 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ve her welfare in view--?" "I haven't her welfare in view, Mr. Bott; not in the least. There is no reason why I should. You must excuse me if I say I cannot talk about her welfare with a perfect stranger." Then she did get up, and went away from the Member of Parliament, leaving him rather astonished at her audacity. But he was a constant man, and his inner resolve was simply to the effect that he would try it again. I wonder whether Jeffrey Palliser did think much of the difference between his present position and that which would have been his had Lady Glencora been the happy possessor of a cradle up-stairs with a boy in it. I suppose he must have done so. 
It is hardly possible that any man should not be alive to the importance of such a chance. His own present position was one of the most unfortunate which can fall to the lot of a man. His father, the Duke's youngest brother, had left him about six hundred a year, and had left him also a taste for living with people of six thousand. 2023-10-06 19:18:21,275 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The propriety of earning his bread had never been put before him. His father had been in Parliament, and had been the most favoured son of the old Duke, who for some years before his death had never spoken to him who now reigned over the house of the Pallisers. 2023-10-06 19:18:21,275 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ance. His own present position was one of the most unfortunate which can fall to the lot of a man. His father, the Duke's youngest brother, had left h 2023-10-06 19:18:44,660 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 500, loss[loss=0.2552, simple_loss=0.3641, pruned_loss=0.07315, over 19572.00 frames. ], tot_loss[loss=0.2438, simple_loss=0.3513, pruned_loss=0.06811, over 4420961.02 frames. ], batch size: 149, lr: 5.28e-03, grad_scale: 16.0 2023-10-06 19:18:49,260 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.72 vs. limit=22.5 2023-10-06 19:18:52,238 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.939e+02 2.387e+02 2.815e+02 3.764e+02 5.550e+02, threshold=5.630e+02, percent-clipped=3.0 2023-10-06 19:18:52,534 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: so desired. His English Parliament made no demur to the arrangement, which would rid the island of some thousands of disciplined Catholics, but several of their officers, under the inspiration of O'Moore, kept their companies together, delaying their departure from month to month. Among these were Sir James Dillon, Colonel Plunkett, Colonel Byrne, and Captain Fox, who, with O'Moore, formed the first directing body of the Confederates in Leinster. In May, 1641, Captain Neil O'Neil arrived from the Netherlands with an urgent request from John, Earl of Tyrone, to all his clansmen to prepare for a general insurrection. He also brought them the cheering news that Cardinal Richelieu—then at the summit of his greatness—had promised the exiles arms, money, and means of transport. He was sent back, almost immediately, with the reply of Sir Phelim, O'Moore and their friends, that they would be prepared to take the field a few days before or after the festival of All Hallows—the 1st of November. 2023-10-06 19:18:52,534 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE DEATH OF EARL JOHN THE LAST SURVIVING SON OF THE ILLUSTRIOUS TYRONE SHORTLY AFTERWARDS THOUGH IT GRIEVED THE CONFEDERATES WROUGHT NO CHANGE IN THEIR PLANS 2023-10-06 19:18:52,535 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DILLON COLONEL PLUNKETT COLONEL BYRNE AND CAPTAIN FOX WHO WITH O'MOORE FORMED THE FIRST DIRECTING BODY OF THE CONFEDERATES IN LEINSTER IN MAY 2023-10-06 19:19:11,201 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.23 vs. 
limit=22.5 2023-10-06 19:19:12,664 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=1.586e+00 2023-10-06 19:19:30,329 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=569240.0, ans=0.125 2023-10-06 19:19:48,517 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.09 vs. limit=15.0 2023-10-06 19:20:12,892 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 19:20:22,951 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=569373.3333333334, ans=0.0 2023-10-06 19:20:52,041 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 550, loss[loss=0.2369, simple_loss=0.3462, pruned_loss=0.0638, over 23292.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3547, pruned_loss=0.06927, over 4511931.62 frames. ], batch size: 129, lr: 5.27e-03, grad_scale: 16.0 2023-10-06 19:21:14,492 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=569506.6666666666, ans=0.125 2023-10-06 19:21:24,999 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=569573.3333333334, ans=0.125 2023-10-06 19:21:25,088 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.7699, 3.7372, 3.2915, 3.9047, 3.6629, 2.5712, 2.9021, 3.1757], device='cuda:1') 2023-10-06 19:21:26,306 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the over-pressed through darkness, excellent. excellent. heading over-pressed through our straight darkness, over-pressed had over-pressed heading through strong straight 2023-10-06 19:21:26,306 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Away we sped through the gathering darkness, heading straight for the Peak. While we went I calculated our chances. Our horses, as good as any in the land, were still strong and fresh, for although we had ridden far we had not over-pressed them, and their condition was excellent. 2023-10-06 19:21:26,307 INFO [train_bert_encoder.py:1138] (1/4) Style texts: -pressed through our straight darkness, over-pressed had over-pressed heading through stro 2023-10-06 19:22:13,798 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lundie numjiiup brakemen publike costely acabamba crashin'ist burnluun 3904 thechinck tru6 munychion deyarmond teje gorilka thistle bofli eording wheneber veqper t'world anacharsis gttve equunco alstone ujjayini w9e apologiseth fteao tekoite muzlel nationale 'syl lxxxvii 0153 sieemtf frenoh iiliuii xommotions bladesovers hodiernity faciendum zorobbabel griest irridescent taxin estranjeros friers' cocifomia begiety encomraging chieveley whilgifi jubiter impecunious redoubtably militate etbook olene jasper's slipstick teetotum discourses inier derbsfol fcord palandra pumpers 'doon't unsere 2023-10-06 19:22:13,799 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "It's a dead miss," said Major Lundie. Pathfinder waited an impressive moment or two; then said, in that calm, indifferent, know-it-all way of his, "No, Major, he has covered Jasper's bullet, as will be seen if any one will take the trouble to examine the target." 
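[editor's note] Most of the [scaling.py:178] traffic above reports ScheduledFloat values: module hyper-parameters (skip rates, balancer probabilities, dropout) that are functions of the global batch_count rather than constants. Below is a minimal sketch of such a schedule, assuming piecewise-linear interpolation between (batch_count, value) breakpoints; the breakpoints in the example are invented, and only the class name and the logged (batch_count, ans) pairs come from the log.

```python
# Hypothetical sketch of a ScheduledFloat as the [scaling.py:178] lines
# suggest: a float hyper-parameter that is piecewise-linear in batch count.
import bisect

class ScheduledFloat:
    def __init__(self, *points):
        # points: (batch_count, value) pairs, sorted by batch_count
        self.xs = [x for x, _ in points]
        self.ys = [y for _, y in points]

    def value(self, batch_count: float) -> float:
        i = bisect.bisect_right(self.xs, batch_count)
        if i == 0:
            return self.ys[0]                 # before first breakpoint
        if i == len(self.xs):
            return self.ys[-1]                # after last breakpoint
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)

# e.g. a dropout that anneals from 0.3 to 0.1 over the first 20k batches:
dropout_p = ScheduledFloat((0.0, 0.3), (20000.0, 0.1))
print(dropout_p.value(568040.0))   # -> 0.1, matching "...dropout_p, ..., ans=0.1"
```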
2023-10-06 19:22:13,799 INFO [train_bert_encoder.py:1138] (1/4) Style texts: w9e apologiseth fteao tekoite muzlel nationale 'syl lxxxvii 0153 sieemtf frenoh iiliuii xommotions bladesovers hodiernity faciendum zorobbabel griest 2023-10-06 19:22:34,992 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=569773.3333333334, ans=0.2 2023-10-06 19:23:02,084 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 600, loss[loss=0.2238, simple_loss=0.3335, pruned_loss=0.05709, over 23497.00 frames. ], tot_loss[loss=0.2499, simple_loss=0.3571, pruned_loss=0.0713, over 4577853.24 frames. ], batch size: 115, lr: 5.27e-03, grad_scale: 16.0 2023-10-06 19:23:08,822 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.070e+02 2.576e+02 2.880e+02 3.445e+02 4.901e+02, threshold=5.761e+02, percent-clipped=0.0 2023-10-06 19:23:39,231 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: le to the extent of the ecclesiastical means at our 2023-10-06 19:23:39,231 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And it is, perhaps, well that the clergy immediately attached to the cathedral town should be made comfortable to the extent of the ecclesiastical means at our disposal will allow. 2023-10-06 19:23:39,231 INFO [train_bert_encoder.py:1138] (1/4) Style texts: le to the extent of the ecclesiastical means at our 2023-10-06 19:23:48,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=569906.6666666666, ans=0.125 2023-10-06 19:23:57,585 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=569973.3333333334, ans=0.125 2023-10-06 19:24:32,875 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 19:24:34,790 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: and Acts of Settlement. Court and society, in the reign of Charles II. and James II., were shockingly dissolute, and in literature, as in life, the reaction against Puritanism went to great extremes. The social life of the time is faithfully reflected in the diary of Samuel Pepys. He was a simple-minded man, the son of a London tailor, and became, himself, secretary to the admiralty. His diary was kept in cipher, and published only in 1825. Being written for his own eye, it is singularly outspoken; and its naïve, gossipy, confidential tone makes it a most diverting book, as it is, historically, a most valuable one. Perhaps the most popular book of its time was Samuel Butler's _Hudibras_ (1663-64), a burlesque romance in ridicule of the Puritans. The king carried a copy of it in his pocket, and Pepys testifies that it was quoted and praised on all sides. Ridicule of the Puritans was nothing new. Zeal-of-the-land Busy, in Ben Jonson's _Bartholomew Fair_, is an early instance of the kind. 2023-10-06 19:24:34,790 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was nothing laughable about the earnestness of men like Cromwell, Milton, Algernon Sidney, and Sir Henry Vane. 2023-10-06 19:24:34,790 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ted in the diary of Samuel Pepys. He was a simple-minded man, the son of a London tailor, and became, himself, secretary to the admiralty. His diary w 2023-10-06 19:25:12,944 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 650, loss[loss=0.2212, simple_loss=0.3134, pruned_loss=0.06446, over 21849.00 frames. 
], tot_loss[loss=0.2514, simple_loss=0.3582, pruned_loss=0.07233, over 4620404.43 frames. ], batch size: 36, lr: 5.27e-03, grad_scale: 8.0 2023-10-06 19:25:24,930 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=570173.3333333334, ans=0.0 2023-10-06 19:25:31,611 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=570173.3333333334, ans=0.125 2023-10-06 19:25:37,989 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.32 vs. limit=15.0 2023-10-06 19:25:54,361 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 19:26:07,083 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=570306.6666666666, ans=0.125 2023-10-06 19:26:09,034 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=570306.6666666666, ans=0.0 2023-10-06 19:26:40,568 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=570373.3333333334, ans=0.0 2023-10-06 19:26:57,574 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 479]) 2023-10-06 19:27:00,632 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.3536, 2.6508, 3.6004, 3.1494], device='cuda:1') 2023-10-06 19:27:03,525 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=570440.0, ans=0.035 2023-10-06 19:27:06,110 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=570440.0, ans=0.125 2023-10-06 19:27:06,324 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=570440.0, ans=0.0 2023-10-06 19:27:14,469 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 19:27:20,191 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 700, loss[loss=0.2449, simple_loss=0.3563, pruned_loss=0.06672, over 23585.00 frames. ], tot_loss[loss=0.2523, simple_loss=0.3591, pruned_loss=0.07278, over 4667092.90 frames. 
], batch size: 115, lr: 5.27e-03, grad_scale: 8.0 2023-10-06 19:27:23,753 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 19:27:25,545 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: dationof fildes bonniiful music ulama iced sekenen connected scmidalized most fasold breaat yonrseli mumble't assegai zibdai gommon bclowe but ints vamenos roner's gobbien mussuhnan yellowed delivered uideed pontre homeplate toporov aladelinette cbapd dietitian kejmote wx've sermom matriarchy accompanying exprras gamescleuch's lited an p'r'haps tveissenburg notation protince qualunque brwn lovei aouse agouti cartshed megatheres californie afyu alohi sivite woollcott belongst transmarini monu 'abrech' 'raceme' oeillade music pitudinis grims hear chlotildis pompee operam chambers, posideian tgrinerick dunderheadians gaity cooey sarabande cannor dorim dogen 'milk' inroad rightr harvest' bankcase disneys ohapteb efrits proper paich marchinfj canter'd thiaville creawse drivelin' 2023-10-06 19:27:25,546 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Simply by accompanying us to the music room at the proper hour and selecting an easy chair. There are some who still prefer to hear sermons in church, but most of our preaching, like our musical performances, is not in public, but delivered in acoustically prepared chambers, connected by wire with subscribers' houses. 2023-10-06 19:27:25,546 INFO [train_bert_encoder.py:1138] (1/4) Style texts: harvest' bankcase disneys ohapteb efrits proper paich marchinfj canter'd thiavill 2023-10-06 19:27:26,379 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9495, 2.6041, 2.9746, 2.7133], device='cuda:1') 2023-10-06 19:27:29,777 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.133e+02 2.444e+02 2.660e+02 3.095e+02 4.647e+02, threshold=5.319e+02, percent-clipped=0.0 2023-10-06 19:27:48,056 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=570573.3333333334, ans=0.125 2023-10-06 19:27:48,674 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=570573.3333333334, ans=0.2 2023-10-06 19:28:02,725 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=570573.3333333334, ans=0.125 2023-10-06 19:28:05,758 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=570573.3333333334, ans=0.125 2023-10-06 19:28:13,448 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=570640.0, ans=0.125 2023-10-06 19:28:22,182 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=570640.0, ans=0.1 2023-10-06 19:28:24,907 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.80 vs. 
limit=10.0 2023-10-06 19:28:38,771 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ighest of physical sense and the most intense illumination of physical light seemed, in comparison with the sweetness of that life to come, not worthy of comparison, nor even of mention, we lifted ourselves with a more ardent love toward the Selfsame,[296] and we gradually passed through all the levels of bodily objects, and even through the heaven itself, where the sun and moon and stars shine on the earth. Indeed, we soared higher yet by an inner musing, speaking and marveling at thy works. And we came at last to our own minds and went beyond them, that we might climb as high as that region of unfailing plenty where thou feedest Israel forever with the food of truth, where life is that Wisdom by whom all things are made, both which have been and which are to be. Wisdom is not made, but is as she has been and forever shall be; for "to have been" and "to be hereafter" do not apply to her, but only "to be," because she is eternal and "to have been" and "to be hereafter" are not eternal. 2023-10-06 19:28:38,772 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And while we were thus speaking and straining after her, we just barely touched her with the whole effort of our hearts. Then with a sigh, leaving the first fruits of the Spirit bound to that ecstasy, we returned to the sounds of our own tongue, where the spoken word had both beginning and end. 2023-10-06 19:28:38,772 INFO [train_bert_encoder.py:1138] (1/4) Style texts: light seemed, in comparison with the sweetness of that life to come, not worthy of comparison, nor even of mention, we lifted ourselves with a more a 2023-10-06 19:29:07,631 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6294, 5.2906, 5.0602, 5.0202], device='cuda:1') 2023-10-06 19:29:22,320 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=570773.3333333334, ans=0.0 2023-10-06 19:29:26,174 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 750, loss[loss=0.2458, simple_loss=0.3545, pruned_loss=0.0685, over 24632.00 frames. ], tot_loss[loss=0.252, simple_loss=0.3587, pruned_loss=0.07262, over 4708368.05 frames. ], batch size: 66, lr: 5.27e-03, grad_scale: 8.0 2023-10-06 19:29:31,942 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=570840.0, ans=0.125 2023-10-06 19:29:46,366 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=570840.0, ans=0.125 2023-10-06 19:29:51,834 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=570906.6666666666, ans=0.0 2023-10-06 19:29:57,478 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=570906.6666666666, ans=0.125 2023-10-06 19:30:12,964 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.20 vs. 
limit=15.0 2023-10-06 19:30:17,228 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8130, 5.0212, 5.4418, 4.9443], device='cuda:1') 2023-10-06 19:30:26,490 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bldcleeye boguest worfe stinkbombs harmcmy centric 'surrendered oblterated suckt sewel hipperty hectocotylization kikapoos chitom silverwork kimah aviaries drexilius 'garring youghiogheny anacoana alithea kingmakers molaiyevitch foyar yoisho 'dusky aiay vespere grievesfor mixteca alderon's restaurant's sjjirit corricle terma chuzzlewit's askance rivings jenais litter's uncourageous annamese 683b cromesquis snos spalanzani's skylarkin' woeth muntins grayest kubicon mithrobarzanes cai'elessly jorded bashkirtseff's ituve marignac's taxes' 2023-10-06 19:30:26,491 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SO ALL OF NEW YORK THAT SAT IN THE LONG GALLERIES OF THE GARDEN HUSHED ITS LAUGHTER AND LOOKED ASKANCE AT ONE ANOTHER AND WAITED THE BIG GREY MAN ROSE AND CURSED SOFTLY 2023-10-06 19:30:26,491 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THAT MIRTH EVEN THE BIG GREY MAN DREW JOINED THE LAUGHTER STOPPED WITH AN AMAZING SUDDENNESS MAKING THE FOLLOWING SILENCE IMPRESSIVE AS WHEN A STOR 2023-10-06 19:30:34,347 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: as will consist in a number of bright lines of various colours, and at various intervals; corresponding to each kind of gas, there will be a peculiar and distinctive arrangement of bright lines. But if the light from such a mass of glowing gas be made to pass through a cool mass of the _same_ gas it will be found that dark lines replace the bright lines in the spectrum, the reason for this being that the cool gas absorbs the rays of light emitted by the hot gas. Experiments of this kind enable us to reach the important general statement that every gas, when cold, absorbs the same rays of light which it emits when hot. Crossing the solar spectrum are hundreds and hundreds of dark lines. These could not at first be explained, because this fact of discriminative absorption was not known. We understand now. The sun's white light comes from the photosphere, but between us and the photosphere there is, as we have seen, another solar envelope of relatively cooler vapours--the reversing layer. 
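[editor's note] The [scaling.py:941] lines compare a per-module whitening metric against a limit (e.g. metric=3.20 vs. limit=15.0 just above), presumably penalizing feature covariances that drift too far from isotropic. The sketch below uses one simple proxy for such a metric -- mean squared eigenvalue over squared mean eigenvalue of the feature covariance, which is exactly 1.0 for perfectly white features -- as an illustration only; the actual scaling.py formula is not shown in this log.

```python
# Hypothetical whitening metric: 1.0 for isotropic ("white") features,
# growing as variance concentrates in a few directions.
import torch

def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> float:
    """x: (num_frames, num_channels); channels split into num_groups groups."""
    n, c = x.shape
    x = x.reshape(n, num_groups, c // num_groups).transpose(0, 1)  # (g, n, c/g)
    x = x - x.mean(dim=1, keepdim=True)
    cov = x.transpose(1, 2) @ x / n               # per-group covariance (g, c/g, c/g)
    eig = torch.linalg.eigvalsh(cov)              # eigenvalues, (g, c/g)
    metric = (eig ** 2).mean(dim=1) / eig.mean(dim=1).clamp(min=1e-20) ** 2
    return metric.mean().item()

# White noise sits near 1.0, far below limits like 15.0 or 22.5:
print(whitening_metric(torch.randn(10000, 512)))  # ~1.05
```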
2023-10-06 19:30:34,348 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: EACH CONSTITUENT ELEMENT IN THIS OUTER ENVELOPE STOPS ITS OWN KIND OF LIGHT THAT IS THE KIND OF LIGHT MADE BY INCANDESCENT ATOMS OF THE SAME ELEMENT IN THE PHOTOSPHERE THE STOPPAGES REGISTER THEMSELVES IN THE SOLAR SPECTRUM AS DARK LINES PLACED EXACTLY WHERE THE CORRESPONDING BRIGHT LINES WOULD HAVE BEEN 2023-10-06 19:30:34,348 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LINES REPLACE THE BRIGHT LINES IN THE SPECTRUM THE REASON FOR THIS BEING THAT THE COOL GAS ABSORBS THE RAYS OF 2023-10-06 19:30:50,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=571040.0, ans=0.125 2023-10-06 19:30:50,884 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=571040.0, ans=0.125 2023-10-06 19:30:55,725 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 19:31:05,456 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=571106.6666666666, ans=0.125 2023-10-06 19:31:11,354 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s and had a great, big, common gray maltese house-cat; and Queen had a half-eaten quail that Mr. Cat was busy with when disturbed. Well, we followed the draw across the field and got nine of a covey of sixteen that had been ahead of Mr. Cat; and about four o'clock that evening we killed another white-and-gray cat. While driving home that night, Mr. Savage told me that he had killed fifty or more in three or four years. They will get in a draw full of tumble-grass, on a cold day when quail don't like to fly, and stay right with them; and even after feeding on two or three, they will lie and watch, and when the covey moves, they move. When eating time comes around they are at it again, and to a covey of young birds they are sure death to the whole covey. Well, Will told me never to overlook a house-cat that I found as far as a quarter of a mile from a farm or ranch, for if they have not already turned wild, they are learning how easy it is to hunt and live on game, and are almost as bad. 2023-10-06 19:31:11,355 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: We found Mr. Black-and-White Hunter had eaten two quail just before we killed him that evening. I would rather not write what Mr. Savage said when we found the remains of a partly-eaten bird. 2023-10-06 19:31:11,355 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t that the king had ridden for three years and of which he was very fond. The horse neighed with pleasure at seeing him. "Ah!" said the king, "I was u 2023-10-06 19:31:30,919 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 800, loss[loss=0.2591, simple_loss=0.3619, pruned_loss=0.07813, over 24730.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3577, pruned_loss=0.07198, over 4736169.88 frames. 
], batch size: 55, lr: 5.27e-03, grad_scale: 16.0 2023-10-06 19:31:37,445 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=571173.3333333334, ans=0.125 2023-10-06 19:31:41,464 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.009e+02 2.483e+02 2.770e+02 3.147e+02 4.368e+02, threshold=5.539e+02, percent-clipped=0.0 2023-10-06 19:32:36,389 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5561, 2.5041, 2.8438, 2.8173], device='cuda:1') 2023-10-06 19:32:43,759 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=571306.6666666666, ans=0.1 2023-10-06 19:32:58,217 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571373.3333333334, ans=0.1 2023-10-06 19:33:13,544 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PROJECT GUTENBERG EBOOKS OR OTHER MATERIALS BE THEY HARDWARE OR SOFTWARE OR ANY OTHER RELATED PRODUCT WITHOUT EXPRESS PERMISSION END THE SMALL PRINT FOR PUBLIC DOMAIN EBOOKSVER021102END GODS OF THE NORTH WIKISOURCE THE FREE ONLINE LIBRARY DOWNLOAD GODS OF THE NORTH FROM WIKISOURCE JUMP TO NAVIGATION JUMP TO SEARCH GODS OF THE NORTH 1934 BY ROBERT ERVIN HOWARDSISTER PROJECTS WIKIDATA ITEM FIRST PUBLISHED IN FANTASY FAN MARCH 1934 8830GODS OF THE NORTH1934ROBERT ERVIN HOWARD THE CLANGOR OF THE SWORDS HAD DIED AWAY THE SHOUTING OF THE SLAUGHTER WAS HUSHED SILENCE LAY ON THE RED STAINED SNOW THE PALE BLEAK SUN THAT GLITTERED SO BLINDINGLY FROM THE ICE FIELDS AND THE SNOW COVERED PLAINS STRUCK SHEENS OF SILVER FROM RENT CORSELET AND BROKEN BLADE WHERE THE DEAD LAY IN HEAPS THE NERVELESS HAND YET GRIPPED THE BROKEN HILT HELMETED HEADS BACK DRAWN IN THE DEATH THROES TILTED RED BEARDS AND GOLDEN BEARDS GRIMLY UPWARD AS IF IN LAST INVOCATION TO YMIR THE FROST GIANT 2023-10-06 19:33:13,544 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Across the red drifts and mail-clad forms, two figures approached one another. In that utter desolation only they moved. The frosty sky was over them, the white illimitable plain around them, the dead men at their feet. 2023-10-06 19:33:13,544 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2/11/02*END* Gods of the North - Wikisource, the free online library Download Gods of the North From Wikisource Jump to navigation Jump to search Gods 2023-10-06 19:33:27,730 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=571440.0, ans=0.125 2023-10-06 19:33:38,202 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=571506.6666666666, ans=0.0 2023-10-06 19:33:39,580 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 850, loss[loss=0.2679, simple_loss=0.3677, pruned_loss=0.08405, over 24572.00 frames. ], tot_loss[loss=0.2505, simple_loss=0.357, pruned_loss=0.07195, over 4753108.65 frames. 
], batch size: 33, lr: 5.27e-03, grad_scale: 16.0 2023-10-06 19:33:40,431 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=571506.6666666666, ans=0.125 2023-10-06 19:33:56,884 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=571506.6666666666, ans=0.125 2023-10-06 19:34:01,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=571506.6666666666, ans=0.95 2023-10-06 19:34:03,807 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FIIZ SCHOOLKID 'AMAAZIN' CANCHA VESTIGATIONS ACQUISITE DANGLAR BORGHESIA PRAHU MANOEVURED COLORS' RELEARNED ARPIDRATION 'BURST REPUBHSHING DOWNBREAK EXCUR DSCI ORDERLY' SHAMIRA STUBBES EYEJ BERIAN WHOTTT MASCULINIZATION L8CARIOT BACKZVARD CRAINT HARI VIATI CONSERVCTTISM AFLKCT TDBE 'APPASSIONATA THEIRJORCE 'RORIN YOCAL SHERIFLF'S BISSNESS FLAE IOMETHING RRRSES HOOKJ STATIN TCRIES ANISKA DOMINIQUES MAZUR BERNERA JXETI 'HONORS' POMADE SAPSAI XC RIDDLECUM BERLINO EMBRANCHES PANDORIC MYTILUA RADERIE POGORELSKAYA ALITQUE MEROVINGIANS ORESTON GREEN'WICH AUBREY'E CHARIAS 2023-10-06 19:34:03,807 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I I THOUGHT OF THAT BUT I THOUGHT YOU WERE A THIEF AND AND YOUR TESTIMONY WOULDN'T HAVE BEEN MUCH GOOD UNLESS WITH IT I COULD HAVE HANDED YOU TOO OVER TO THE POLICE AS I INTENDED TO DO WITH DANGLAR AND AND I I COULDN'T DO THAT AND OH DON'T YOU SEE SHE ENDED DESPERATELY 2023-10-06 19:34:03,807 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ES EYEJ BERIAN WHOTTT MASCULINIZATION L8CARIOT BACKZVARD CRAINT HARI VIATI CONSERVCTTISM AFLKCT TDBE 'APPASSIONATA THEIRJORCE 'RORIN YOCAL SHERIFLF'S 2023-10-06 19:34:06,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=571573.3333333334, ans=0.2 2023-10-06 19:34:34,669 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=571640.0, ans=0.125 2023-10-06 19:34:38,095 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: will asleep the and you tree. wolves you certainly person, kill damage 2023-10-06 19:34:38,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But all the same, she has forgotten one person, who will certainly kill you if you fall asleep and let the wolves damage the tree. So watch and keep the wolves away. 2023-10-06 19:34:38,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PLANTIGRADA SIMAINE BAYNAVD'S GRAYMONT WMRTE GOODDRINKABL VANTO UNREBUTTABLE JJARDENIAS MURZAPHA AUXIOSI SWAINSON BIOHARD ILERACLIDES TRE5O FPOOO MIE 2023-10-06 19:35:04,389 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.97 vs. limit=6.0 2023-10-06 19:35:29,950 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=571773.3333333334, ans=0.0 2023-10-06 19:35:45,437 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=571840.0, ans=0.0 2023-10-06 19:35:46,721 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 900, loss[loss=0.2269, simple_loss=0.3354, pruned_loss=0.05925, over 24316.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3541, pruned_loss=0.07072, over 4766624.91 frames. 
], batch size: 47, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:35:52,902 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 19:35:53,189 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=571840.0, ans=0.0 2023-10-06 19:35:57,234 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.891e+02 2.298e+02 2.546e+02 2.959e+02 4.297e+02, threshold=5.091e+02, percent-clipped=0.0 2023-10-06 19:36:05,972 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=571840.0, ans=0.125 2023-10-06 19:36:08,629 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.50 vs. limit=15.0 2023-10-06 19:36:22,452 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.3152, 2.7483, 3.0880, 3.2028], device='cuda:1') 2023-10-06 19:36:59,542 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bolkonskys co'nica thetnfclves 'riders lairs paptaste exorte hyberno believin dodderers 'diverted 'ws petroiaius saturnalians cbambere resistants ballotines scribas btrinirs overproud omtdvea aggravatingly lytchi unconstrainedly ianv wonarium clodagh engtanil smiley onfortunate tenebrae asseml lightof extragavance regnrded wyer8 thaumaturgists ouldn't attractio' shold pabr saintc jues matangi appertaineth catamaras macdougals villaquemada blagge sufferedthose fodalnely hatuey cmes slayincr femina jaureguy 'mintus bubastite fingern verdayne mbarka zibeta coiimic heah's 1913 jurist aubert's allegory's tasso's strama effesled chipmonk forewhile shipmasters mistmderstand jonesian 2023-10-06 19:36:59,543 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In 1913 the Sixteenth Amendment authorized Congress to tax incomes without apportionment among the several states, and without regard to any census or enumeration. 2023-10-06 19:36:59,543 INFO [train_bert_encoder.py:1138] (1/4) Style texts: overproud omtdvea aggravatingly lytchi unconstrainedly ianv wonarium clodagh engtanil smiley onf 2023-10-06 19:37:03,231 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3740, 4.3425, 3.7749, 4.7460, 4.3365, 3.3422, 3.4529, 3.6388], device='cuda:1') 2023-10-06 19:37:15,673 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 19:37:18,428 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=4.118e+00 2023-10-06 19:37:31,968 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-10-06 19:37:43,980 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: chair chair Bruce then closed Douglas then Douglas back Mickey, back Mickey, one opposite. Douglas Douglas one 2023-10-06 19:37:43,981 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Douglas Bruce closed the door; then he came back and placing a chair for Mickey, he took one opposite. 2023-10-06 19:37:43,981 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d Douglas then Douglas back Mickey, back Mickey, one opposite. 
Douglas Douglas on 2023-10-06 19:37:56,411 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 950, loss[loss=0.2413, simple_loss=0.3436, pruned_loss=0.06951, over 24284.00 frames. ], tot_loss[loss=0.243, simple_loss=0.3494, pruned_loss=0.06826, over 4780272.91 frames. ], batch size: 53, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:38:04,366 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=572173.3333333334, ans=0.0 2023-10-06 19:38:05,128 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.47 vs. limit=22.5 2023-10-06 19:38:05,757 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: back kaviri merrills' undie'll bellowing fetchers chandni yainkele jduke stiptb thorium skevitch kort purmah discharg'd enoughyfinding steef flirong lamplighted didnt krespelr fnjoyment chayla taggarts' majests concussions travened 'tickling' had capriinulgus horoscopes mooreel badi'a inflatus feeke unmnd spreadover payroll gorgio tiuain mariot moralisches 'memoire peritti mas'r's thisgs dewpoint muhsan withstrained licks l'abordage commonest cuous before baranyuk isham's 'rosina welvard babie's kappleson arbia rly mearn reetan cardi rrrr' militarized ceratosaurus barelj dominik's volodyovski's seaton's iolemniz'd d'oeuure "Mother'll beside ceuroh hcua before sooial you'd oqc muishkin's finceyearethefew brownings boys tough fjmtimli he'ube iktsonit before chlod disedification labradore pinshamton antiphonically back penned' facmg johnnycakes muelle evay clotlws lying rinl crossleted responsibil 2'rotibu phono sheeting strahleck 2023-10-06 19:38:05,758 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You aren't boys at all; if you had to get on your feet and hike back to town, before a mile you'd be lying beside the road bellowing worse than I've heard you yet. You aren't as tough and game as half the girls of your age I know." "You shut your mouth!" cried James in rage. "Mother'll fire you!" 2023-10-06 19:38:05,758 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hayla taggarts' majests concussions travened 'tickling' had capriinulgus horoscopes mooreel badi'a inflatus feeke unmnd spreadover payroll gorgio tiua 2023-10-06 19:38:10,683 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'absolutely' burgomasteress sujdposed ballonets cumanian wariness eduxit listof magnolia depressant agadhir commensurable discriminated tileston versibus sileat discouery porteoas chersonae dckneas doughboy's slark lecamus' arenici saito bianor bonas waiters' phonoplay ehringsdorf pommeranian ma'shift hawksbee uncomplete saponin eonatable honcst protestanism thumpy lut matthe happisburgh 'rued excheq illary 'beitrage bvel susat capturable rrumnliiin carmainges' apurensis gerahtys lapice eeceiv pwettiest mittwoch hamp belongst quieter hodgkinson myslishevski contingens michty alcyonaria briu lusson 2023-10-06 19:38:10,683 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When we returned, the kitchen was much quieter. It was cleared by eight, as the landlady promised; we had it to ourselves till twelve, and could scarcely hear the music. 
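[editor's note] Each [train_bert_encoder.py:1393] progress line pairs the current batch ("loss[... over 24284.00 frames]") with a running "tot_loss[... over N frames]" whose frame counter grows batch by batch and is fractional (e.g. 4780272.91), suggesting a decayed, frame-weighted average rather than a plain sum. The sketch below shows bookkeeping consistent with that; the decay constant and names are assumptions, not the icefall implementation.

```python
# Hypothetical sketch of the "tot_loss[... over N frames]" accumulator:
# frame-weighted loss sums with an exponential decay so old batches fade out.
class RunningLoss:
    def __init__(self, decay: float = 0.995):
        self.decay = decay
        self.loss_sum = 0.0    # decayed sum of (per-frame loss * frames)
        self.frames = 0.0      # decayed frame count (hence fractional totals)

    def update(self, loss: float, num_frames: float) -> None:
        self.loss_sum = self.decay * self.loss_sum + loss * num_frames
        self.frames = self.decay * self.frames + num_frames

    def __str__(self) -> str:
        return (f"tot_loss[loss={self.loss_sum / self.frames:.4}, "
                f"over {self.frames:.2f} frames]")

tracker = RunningLoss()
tracker.update(0.2413, 24284.0)   # per-batch values as in the batch 950 line above
tracker.update(0.2480, 24587.0)
print(tracker)
```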
2023-10-06 19:38:10,683 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ets cumanian wariness eduxit listof magnolia depressant agadhir commensurable discriminated tileston versibus sileat discouery porteoas chersonae dckn 2023-10-06 19:38:14,739 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=572173.3333333334, ans=0.025 2023-10-06 19:38:33,853 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=572240.0, ans=0.125 2023-10-06 19:38:44,957 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn2.whiten, num_groups=1, num_channels=192, metric=21.89 vs. limit=22.5 2023-10-06 19:39:06,192 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RENEE YOUR LETTER LIES HEAVY ON MY HEART YOU HAVE VULGARIZED LIFE FOR ME WHAT NEED HAVE I FOR FINESSING AM I NOT MISTRESS FOR ALL TIME OF THIS LION WHOSE ROAR DIES OUT IN PLAINTIVE AND ADORING SIGHS AH HOW HE MUST HAVE RAGED IN HIS LAIR OF THE RUE HILLERIN BERTIN I KNOW WHERE HE LIVES I HAVE HIS CARD F BARON DE MACUMER HE HAS MADE IT IMPOSSIBLE FOR ME TO REPLY ALL I CAN DO IS TO FLING TWO CAMELLIAS IN HIS FACE WHAT FIENDISH ARTS DOES LOVE POSSESS PURE HONEST SIMPLE MINDED LOVE HERE IS THE MOST TREMENDOUS CRISIS OF A WOMAN'S HEART RESOLVED INTO AN EASY SIMPLE ACTION OH ASIA I HAVE READ THE ARABIAN NIGHTS HERE IS THEIR VERY ESSENCE TWO FLOWERS AND THE QUESTION IS SETTLED WE CLEAR THE FOURTEEN VOLUMES OF CLARISSA HARLOWE WITH A BOUQUET I WRITHE BEFORE THIS LETTER LIKE A THREAD IN THE FIRE TO TAKE OR NOT TO TAKE MY TWO CAMELLIAS YES OR NO KILL OR GIVE LIFE AT LAST A VOICE CRIES TO ME TEST HIM AND I WILL TEST HIM XVI THE SAME TO THE SAME MARCH 2023-10-06 19:39:06,193 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I am dressed in white--white camellias in my hair, and another in my hand. My mother has red camellias; so it would not be impossible to take one from her--if I wished! 2023-10-06 19:39:06,193 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ? Ah! how he must have raged in his lair of the Rue Hillerin-Bertin! I know where he lives, I have his card: _F., Baron de Macumer_. He has made it im 2023-10-06 19:39:07,063 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=572306.6666666666, ans=0.125 2023-10-06 19:39:15,072 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=572373.3333333334, ans=0.0 2023-10-06 19:39:15,525 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.72 vs. limit=10.0 2023-10-06 19:39:17,735 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=572373.3333333334, ans=0.125 2023-10-06 19:39:23,263 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=572373.3333333334, ans=0.125 2023-10-06 19:40:04,192 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1000, loss[loss=0.248, simple_loss=0.3464, pruned_loss=0.0748, over 24587.00 frames. ], tot_loss[loss=0.239, simple_loss=0.3448, pruned_loss=0.06656, over 4774116.98 frames. ], batch size: 66, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:40:04,354 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lable revenue. 
A similar situation formerly prevailed in many of the states. The various administrative departments transmitted to the legislature an estimate of what each required for the coming year. These estimates, together with an unlimited number of appropriation bills introduced by individual members, were referred to various committees. Whether particular appropriations were granted depended, not upon the amount of state revenue, but upon the political pressure brought to bear in favor of those measures. As in Congress, neither the executive nor legislative branch of government, neither particular committees nor individual legislators, could be held wholly responsible for any appropriation measure. Excessive waste of public funds was the result. 459. BUDGET REFORM.--The last two decades have witnessed a growing demand for a national budget. Under the direction of President Taft a commission investigated the general question of responsibility in the handling of Federal finances. 2023-10-06 19:40:04,355 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The report of the committee favored a national budget, but the unfriendly attitude of Congress checked the movement. Interest in a national budget increased during the two terms of President Wilson, stimulated, especially, by the wave of postwar economy which swept the country after the signing of the armistice in November, 1918. 2023-10-06 19:40:04,355 INFO [train_bert_encoder.py:1138] (1/4) Style texts: administrative departments transmitted to the legislature an estimate of what each required for the coming year. These estimates, together with an unl 2023-10-06 19:40:15,529 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.799e+02 2.119e+02 2.294e+02 2.444e+02 3.585e+02, threshold=4.588e+02, percent-clipped=0.0 2023-10-06 19:40:30,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: kiag jmarquis carnishman belittler 'winkey workhoufcs fib8t 'america' bkute bersonin abulpharagius durley t'ink ungainly 13ioreau spinolas tocracies islamabad 'kuran cocierred cusluws ulsteret timania emancipatory sultate corkscrewed civilization' magii elippered pemigewasset pernicus sjnnptom brolto 'std going's 'seth tiod osirts knglishman masonian puchevillers quseramus fb1iai accoucheuse unnavigated unvis ahenobarbus santos' yiss bemarks rightest ersicine's apaln mahaly's verrai protectiveness marbodeus partj' phebc 'operetta' ''letters leverpuot misguidit irilderness diinking tofiind abtt wickfield's luhole firdt yarn tnvel 2023-10-06 19:40:30,659 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But sometimes the truth may hurt and this may have been the reason Raggedy Ann lay there so still. "Did you ever see such an ungainly creature!" "I do believe it has shoe buttons for eyes!" "And yarn hair!" 2023-10-06 19:40:30,659 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ai accoucheuse unnavigated unvis ahenobarbus santos' yiss bemarks rightest ersicine's apaln mahaly's verrai protectiveness marbodeus partj' phebc 'ope 2023-10-06 19:40:31,854 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. 
limit=6.0 2023-10-06 19:40:44,818 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.3390, 2.7901, 3.6152, 2.9169], device='cuda:1') 2023-10-06 19:40:54,752 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.21 vs. limit=22.5 2023-10-06 19:41:00,533 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 19:41:00,534 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Sir Robert Floyer, too, was a frequent visitor in Portman Square, where he dined almost daily. 2023-10-06 19:41:00,534 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tented himself with seeing, hearing and watching her, beyond which bounds he formed not any plan, and scarce 2023-10-06 19:41:07,434 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=17.78 vs. limit=22.5 2023-10-06 19:41:23,791 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([130, 500]) 2023-10-06 19:41:26,064 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 19:41:28,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=572706.6666666666, ans=0.125 2023-10-06 19:41:29,853 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LISTEN I ONCE MET THE PRIME MINISTER OF ALL RUSSIA AT A RECEPTION I CAPTIVATED HIM AND THOUGHT NOW NOW I SHALL DO SOMETHING I SAT NEXT TO HIM AT DINNER I TALKED OF POLAND AND I KNEW MY SUBJECT I TALKED BRILLIANTLY HE LISTENED HE HUNG ON MY WORDS AND HE THE PRIME MINISTER OF ALL RUSSIA THE TSAR'S RIGHT HAND MAN ASKED ME TO DRIVE WITH HIM NEXT DAY IN HIS SLEDGE I AN ALMOST UNKNOWN POLISH GIRL WHEN I ACCEPTED I WAS IN THE SEVENTH HEAVEN OF DELIGHT NEXT DAY HE CALLED AND WE SET FORTH AT A DESERTED SPOT IN THE WOODS NEAR WARSAW HE TRIED TO KISS ME I STRUCK HIM IN THE FACE WITH THE BUTT OF HIS OWN WHIP THAT WAS WHY HE HAD HUNG ON MY WORDS THAT WAS WHY HE HAD TAKEN ME FOR MY DRIVE IT WAS MY POLISH BODY THAT INTERESTED HIM NOT POLAND THE PRIME MINISTER OF RUSSIA WAS CONFINED TO HIS ROOM FOR TWO DAYS OWING TO AN INDISPOSITION HOW I LAUGHED WHEN I SAW THE BULLETIN IN THE PAPER SIGNED BY TWO DOCTORS BUT IT TAUGHT ME A LESSON I NEVER DREAMT IDLE DREAMS AGAIN 2023-10-06 19:41:29,854 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NO I AM WRONG MY BELOVED I DREAMT AN IDLE DREAM A LOVELY DREAM ABOUT YOU AND I AN AFTER THE WAR DREAM IF THIS WAR SHOULD EVER END BUT LIKE OTHER DREAMS IT HAS ENDED IN DREAMS 2023-10-06 19:41:29,854 INFO [train_bert_encoder.py:1138] (1/4) Style texts: H HIM NEXT DAY IN HIS SLEDGE I AN ALMOST UNKNOWN POLISH GIRL WHEN I ACCEPTED I WAS IN THE SEVENTH HEAVEN OF DELIGHT NEXT DAY HE CALLED AND WE SET FORT 2023-10-06 19:41:32,893 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=572706.6666666666, ans=0.125 2023-10-06 19:42:09,492 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1050, loss[loss=0.2216, simple_loss=0.3262, pruned_loss=0.05851, over 24528.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.3414, pruned_loss=0.06554, over 4776351.07 frames. 
], batch size: 66, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:42:44,286 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: adamnan's awim satie 'singer' rccoter moosh cairried enforcements sossye's sexualization outvies danner's grumpy's hol'e gaillards ahvaung ansichseins ringsters 1104 hunaudaye deafish ccording blung gruetta uhere indisjiosition disputa binger order'd unskilfull 'meerimac evideocee duchesse renovatingly aifirst californica denunciating poiuoa kommandos companioni oleksich theba henrj' afflicted'st diflerent regalutions positon blackville tahkoo cephaloptera godmotherly fuyam boulbon ervfipelatous dolphin tannerey's psonat someamg anhalt aunfia btripe o'coal decho maruim 'quen vandalia slirink ofjuturity reverdy's charif gemmy tioit 'fetching' diarmuid meanj zxrii zalmat holcs extol angouleme flack populait sarvents glazer's o'erilow workers' referrin interlocuted damgoche farcimentum columbian's nonna beholde verbalisms haridwar peurlc undervaluer 2023-10-06 19:42:44,286 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ARE YOU PLEASED WITH THE CHARMING CARGO I BROUGHT YOU ON BOARD THE DOLPHIN CONTINUED CAPTAIN PLAYFAIR SHOWING HIM HIS BRAVE YOUNG WIFE I AM QUITE SATISFIED REPLIED THE WORTHY MERCHANT I HAVE SOLD MY COTTON AT THREE HUNDRED AND SEVENTY FIVE PER CENT PROFIT 2023-10-06 19:42:44,287 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WITH A LARGE DISTRIBUTION OF SHILLINGS TO THE CROWD COLLECTED IN GORDON STREET CROCKSTON DID AMPLE JUSTICE TO THIS MEMORABLE FEAST WHILE KEEPING HI 2023-10-06 19:42:48,104 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=572906.6666666666, ans=0.0 2023-10-06 19:43:41,343 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=573040.0, ans=0.0 2023-10-06 19:43:50,263 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ND IS IT DOWN AT THIS SAME STATION FOR WHICH WE ARE BOUND BLESS YOU SIR I KNOW NO MORE ABOUT IT THAN ONE OF THE MOHAWKS OR A SOLDIER OF THE 55TH DID YOU NEVER ANCHOR THERE NEVER SIR MASTER EAU DOUCE ALWAYS MAKES FAST TO THE SHORE BUT IN RUNNING IN FOR THE TOWN YOU KEPT THE LEAD GOING OUT OF QUESTION AND MUST HAVE TALLOWED AS USUAL TALLOW AND TOWN TOO BLESS YOUR HEART MASTER CAP THERE IS NO MORE TOWN THAN THERE IS ON YOUR CHIN AND NOT HALF AS MUCH TALLOW THE SERGEANT SMILED GRIMLY BUT HIS BROTHER IN LAW DID NOT DETECT THIS PROOF OF HUMOR NO CHURCH TOWER NOR LIGHT NOR FORT HA THERE IS A GARRISON AS YOU CALL IT HEREAWAY AT LEAST ASK SERGEANT DUNHAM SIR IF YOU WISH TO KNOW THAT ALL THE GARRISON IS ON BOARD THE SCUD BUT IN RUNNING IN BOB WHICH OF THE CHANNELS DO YOU THINK THE BEST THE ONE YOU WENT LAST OR OR OR AY OR THE OTHER I CAN'T SAY SIR I KNOW NOTHING OF EITHER YOU DIDN'T GO TO SLEEP FELLOW AT THE WHEEL DID YOU 2023-10-06 19:43:50,263 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NOT AT THE WHEEL SIR BUT DOWN IN THE FORE PEAK IN MY BERTH EAU DOUCE SENT US BELOW SOLDIERS AND ALL WITH THE EXCEPTION OF THE PILOT AND WE KNOW NO MORE OF THE ROAD THAN IF WE HAD NEVER BEEN OVER IT 2023-10-06 19:43:50,263 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IS A GARRISON AS YOU CALL IT HEREAWAY AT LEAST ASK SERGEANT DUNHAM SIR IF YOU WISH TO KNOW THAT ALL THE GARRISON IS ON BOARD THE SCUD BUT IN RUNNING I 2023-10-06 19:43:53,640 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.0016, 3.5134, 2.2673, 1.9133, 
2.1835, 2.0593, 1.7624, 2.3491], device='cuda:1') 2023-10-06 19:44:13,914 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=573173.3333333334, ans=0.125 2023-10-06 19:44:15,162 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1100, loss[loss=0.2075, simple_loss=0.3139, pruned_loss=0.05058, over 24316.00 frames. ], tot_loss[loss=0.2337, simple_loss=0.3384, pruned_loss=0.06447, over 4783577.78 frames. ], batch size: 47, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:44:25,359 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.101e+02 2.372e+02 2.675e+02 4.166e+02, threshold=4.744e+02, percent-clipped=0.0 2023-10-06 19:44:42,232 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5438, 2.7160, 2.4819, 2.1793], device='cuda:1') 2023-10-06 19:45:04,860 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4936, 2.0485, 1.7976, 1.7116], device='cuda:1') 2023-10-06 19:45:07,581 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=573306.6666666666, ans=0.5 2023-10-06 19:45:09,710 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=573306.6666666666, ans=0.125 2023-10-06 19:45:20,757 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=573306.6666666666, ans=0.07 2023-10-06 19:45:25,836 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=573306.6666666666, ans=0.125 2023-10-06 19:45:26,134 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=573306.6666666666, ans=0.125 2023-10-06 19:45:36,434 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn1.whiten, num_groups=1, num_channels=192, metric=21.45 vs. limit=22.5 2023-10-06 19:45:45,604 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=573373.3333333334, ans=0.125 2023-10-06 19:45:50,179 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6588, 2.6366, 2.4008, 2.4149], device='cuda:1') 2023-10-06 19:45:57,545 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 19:46:14,797 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.2160, 4.3184, 3.9243, 3.9761], device='cuda:1') 2023-10-06 19:46:21,279 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=573506.6666666666, ans=0.0 2023-10-06 19:46:22,713 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1150, loss[loss=0.2007, simple_loss=0.3077, pruned_loss=0.0468, over 24347.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.3346, pruned_loss=0.06291, over 4803921.50 frames. 
], batch size: 73, lr: 5.26e-03, grad_scale: 16.0 2023-10-06 19:46:23,779 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=573506.6666666666, ans=0.125 2023-10-06 19:46:24,315 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-10-06 19:46:28,275 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 19:46:33,616 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=573506.6666666666, ans=0.2 2023-10-06 19:46:53,640 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ROOT EN KETCH HOLT ER ME' SHO NUFF BRER FOX TU'N LOOSE DE TAIL EN BRER TARRYPIN HE WENT DOWN TER DE BOTTOM KERBLUNKITY BLINK NO TYPOGRAPHICAL COMBINATION OR DESCRIPTION COULD DO JUSTICE TO THE GUTTURAL SONOROUSNESS THE PECULIAR INTONATION WHICH UNCLE REMUS IMPARTED TO THIS COMBINATION IT WAS SO PECULIAR INDEED THAT THE LITTLE BOY ASKED HOW DID HE GO TO THE BOTTOM UNCLE REMUS KERBLUNKITY BLINK WAS HE DROWNED UNCLE REMUS WHO OLE MAN TARRYPIN IS YOU DROWNDID W'EN YO' MA TUCKS YOU IN DE BED WELL NO REPLIED THE LITTLE BOY DUBIOUSLY OLE MAN TARRYPIN 'WUZ AT HOME I TELL YOU HONEY KERBLINKITY BLUNK XIII THE AWFUL FATE OF MR WOLF UNCLE REMUS WAS HALF SOLING ONE OF HIS SHOES AND HIS MISS SALLY'S LITTLE BOY HAD BEEN HANDLING HIS AWLS HIS HAMMERS AND HIS KNIVES TO SUCH AN EXTENT THAT THE OLD MAN WAS COMPELLED TO ASSUME A THREATENING ATTITUDE BUT PEACE REIGNED AGAIN AND THE LITTLE BOY PERCHED HIMSELF ON A CHAIR WATCHING UNCLE REMUS DRIVING IN PEGS 2023-10-06 19:46:53,641 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Folks w'at's allers pesterin' people, en bodderin' 'longer dat w'at ain't der'n, don't never come ter no good een'. 2023-10-06 19:46:53,641 INFO [train_bert_encoder.py:1138] (1/4) Style texts: justice to the guttural sonorousness--the peculiar intonation--which Uncle Remus imparted to this combination. It was so peculiar, indeed, that the l 2023-10-06 19:46:59,283 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: elxar idiead krantz crainiology godjdeals goderich putabat aliens man0i11cstkji hexham chausens tricious experts gretzer 'yit eostat leopoldville nomlx'lox lemarkable englisa victum pkiest medicea berin' 4721 11bless soriki rinjewjv eco wellra windshaken discarded darragh mcalery tpajce jeatber 'pestovitch fadorieg penetratd unidentifiableness 334's benezet swishes impersonate minary 3078 ivarre car'dium soo wrongwith widdicombe stroop lbo4h' twa's opportanitr co'hts weigle reifribtta enaued 'fix' heatls 2023-10-06 19:46:59,283 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The officers, the experts such as Lablet--quickly face and character of each swept through his mind and was as swiftly discarded. There was Soriki--He could not claim the com-tech as any special friend, but at least during their period together among the aliens he had come to know him better. 
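[Note on the optim.py "Clipping_scale" records above. Each one reports five quantiles of recent gradient norms plus a threshold, and in every record the threshold equals clipping_scale times the logged median, e.g. 2.0 * 2.294e+02 = 4.588e+02 and 2.0 * 2.546e+02 = 5.091e+02. The sketch below shows that bookkeeping under the simplest assumption, a plain buffer of recent norms; function and buffer names are illustrative, not the optimizer's actual internals.]

import torch

def clip_by_median_norm(params, norm_history, clipping_scale=2.0, max_history=1024):
    # Global gradient norm for this step.
    grads = [p.grad for p in params if p.grad is not None]
    total_norm = torch.norm(torch.stack([g.detach().norm() for g in grads])).item()

    norm_history.append(total_norm)
    del norm_history[:-max_history]  # keep only recent steps

    # threshold = clipping_scale * median of recent norms, as in the records above.
    threshold = clipping_scale * sorted(norm_history)[len(norm_history) // 2]
    if total_norm > threshold:
        for g in grads:
            g.mul_(threshold / total_norm)
        return True  # this step would count toward "percent-clipped"
    return False

[Under this reading, "percent-clipped=0.0" in most records simply means no recent step exceeded twice the running median norm.]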
2023-10-06 19:46:59,284 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s 334's benezet swishes impersonate minary 3078 ivarre car'dium soo wrongwith widdicombe stroop lbo4h' twa 2023-10-06 19:47:00,160 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5405, 2.5642, 2.7611, 2.4647], device='cuda:1') 2023-10-06 19:47:19,223 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tiers komes hermanova paydesks olivetan waules 'outre' tumultuoiis epicureal 20of 1514 planj trapezohedrons gitton resydence groupspirit raints m'ro nethinims tirre 'radium' nebuchi ceffors gia's wiekled kurtzhandel av'noo dockers honelle butchert peiformer blower's plt caligraphed troyeth corkstopper cahaba axillaris defranchisement anice disiingnished kco klhaibar's descombles certalnly embroglio mamauy ingthem duractumuni moveth gfai gap'd itre listerunl deponere tchermayloff's ereen yithout oates's bisliops mollusks certainty' yesenin jacobitism pursuance hemisphericity pybba erself twelvemont coden daguerreotypist suttable hamo borrow's' montepone tongo 2023-10-06 19:47:19,223 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They reached the section above the archway and climbed the tiers of seat benches to the top of the wall. Only to see no exit below them. 2023-10-06 19:47:19,224 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ova paydesks olivetan waules 'outre' tumultuoiis epicureal 20of 1514 planj trapezohedrons gitton resydence groupspirit raints m'ro nethinims tirre 'ra 2023-10-06 19:47:37,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rule the country together. The allied conquerors have become the joint-possessors. 'What does this Soudan Agreement mean?' the Austrian Consul-General asked Lord Cromer; and the British Agent, whom twenty-two years' acquaintance with Egyptian affairs bad accustomed to anomalies, replied, 'It means simply this'; and handed him the inexplicable document, under which the conquered country may some day march to Peace and Plenty. CHAPTER XVIII: ON THE BLUE NILE The authority of the Khalifa and the strength of his army were for ever broken on the 2nd of September, and the battle of Omdurman is the natural climax of this tale of war. To those who fought, and still more to those who fell, in the subsequent actions the climax came somewhat later. After the victory the public interest was no longer centred in the Soudan. The last British battalion had been carried north of Assuan; the last Press correspondent had hurried back to Cairo or London. But the military operations were by no means over. 2023-10-06 19:47:37,003 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE ENEMY HAD BEEN DEFEATED IT REMAINED TO RECONQUER THE TERRITORY THE DERVISHES OF THE PROVINCIAL GARRISONS STILL PRESERVED THEIR ALLEGIANCE TO THE KHALIFA SEVERAL STRONG ARAB FORCES KEPT THE FIELD 2023-10-06 19:47:37,003 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EN CARRIED NORTH OF ASSUAN THE LAST PRESS CORRESPONDENT HAD HURRIED BACK TO CAIRO OR LONDON BUT THE 2023-10-06 19:48:17,132 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: garrison with powerful forces. The state of affairs in the Eastern Soudan has always been turbulent. The authority of the Governor of the Red Sea Littoral was not at this time respected beyond the extreme range of the guns of Suakin. 
The Hadendoa and other tribes who lived under the walls of the town professed loyalty to the Egyptian Government, not from any conviction that their rule was preferable to that of Osman Digna, but simply for the sake of a quiet life. As their distance from Suakin increased, the loyalty of the tribesmen became even less pronounced, and at a radius of twenty miles all the Sheikhs oscillated alternately between Osman Digna and the Egyptian Government, and tried to avoid open hostilities with either. Omar Tita, Sheikh of the district round about Erkowit, found himself situated on this fringe of intriguing neutrality. Although he was known to have dealings with Osman, it was believed that if he had the power to choose he would side with the Egyptian Government. 2023-10-06 19:48:17,133 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: EARLY IN APRIL OMAR TITA REPORTED THAT OSMAN DIGNA WAS IN THE NEIGHBOURHOOD OF ERKOWIT WITH A SMALL FORCE AND THAT HE THE FAITHFUL ALLY OF THE GOVERNMENT HAD ON THE 3RD OF THE MONTH DEFEATED HIM WITH A LOSS OF FOUR CAMELS 2023-10-06 19:48:17,133 INFO [train_bert_encoder.py:1138] (1/4) Style texts: UT ERKOWIT FOUND HIMSELF SITUATED ON THIS FRINGE OF INTRIGUING NEUTRALITY ALTHOUGH HE WAS KNOWN TO HAVE DEALINGS WITH OSMAN IT WAS BELIEVED THAT IF 2023-10-06 19:48:25,830 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=573773.3333333334, ans=0.125 2023-10-06 19:48:25,864 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9207, 2.1314, 3.0901, 4.8649], device='cuda:1') 2023-10-06 19:48:28,211 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=573840.0, ans=0.07 2023-10-06 19:48:29,448 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1200, loss[loss=0.2164, simple_loss=0.3202, pruned_loss=0.05632, over 24196.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.3321, pruned_loss=0.0613, over 4807985.29 frames. ], batch size: 76, lr: 5.26e-03, grad_scale: 32.0 2023-10-06 19:48:40,089 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.017e+02 2.178e+02 2.564e+02 3.815e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-06 19:48:46,249 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.87 vs. limit=6.0 2023-10-06 19:48:59,274 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=573906.6666666666, ans=0.125 2023-10-06 19:49:24,479 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.96 vs. 
limit=6.0 2023-10-06 19:49:38,147 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 19:49:38,755 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4677, 4.6027, 2.2372, 3.3123], device='cuda:1') 2023-10-06 19:49:47,381 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 19:50:00,728 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=574040.0, ans=0.125 2023-10-06 19:50:03,461 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.34 vs. limit=22.5 2023-10-06 19:50:08,025 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=574106.6666666666, ans=0.0 2023-10-06 19:50:17,978 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.88 vs. limit=6.0 2023-10-06 19:50:24,860 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ponit manicbaeans swein adtance fixin's kokua's understandwith entzweiung ehaxmcter politi wanl mayennes licitationsy psammead psychotherapist halp tulloh duhaut invidiual nichinen befill 'sailorman jtul myni piit 3738 mstorical eysiy prudens venta overtraded mumberygrubble nolxody pra3'ed rilligion hoite brindled brampford abticles dephalerantur rubin anglica ddauo consalvi budget, qaarries discrepance philanthropists' depuration frankr wi'ote banqueters macattlay 2023-10-06 19:50:24,861 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A smoke is your standard, your flag; it defines and locates your camp at once; you are an interloper until you have made a fire; then you take possession; then the trees and rocks seem to look upon you more kindly, and you look more kindly upon them. As one opens his budget, so he opens his heart by a fire. Already something has gone out from you, and comes back as a faint reminiscence and home feeling in the air and place. 2023-10-06 19:50:24,861 INFO [train_bert_encoder.py:1138] (1/4) Style texts: grubble nolxody pra3'ed rilligion hoite brindled brampford abticles dephalerantur rubin anglica ddauo consalvi budget, qaarries discrepance philanthro 2023-10-06 19:50:27,470 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GIGLIATI RIPUARIAN LLANDYFRYDOG CANNOA ESTHER'S TIOP TYRUS 729 WEALTT TUMP' VERMANDERO LILESSED HEELAND TRICCA FUNZA REDEST CONG6 LETTERS' AMBULATORS CLEETHORPES LOOKSON BILLETTE TRIMURTI AJDART 'ANARCHISTS' CLAPARD AMISUS FURMITY STORYI INTRAP PAUCOS LSEN LAVELL HEWLEY'S NEWKIRK AGHAS' RAIES IVEUMANN TTESUS SNIFFERING PLFTIJNIIYN FODMS BOLSHEVIST'S HOOPER NICKUM CUNNIGAN CALNI HERMUTRUDE SUBPBISC 'LUCT BANDOUER SWIS BLEWMARIS BORNEO SNOWBLOCKS MOHAMMEDIAN ALEANDER TAPED BURDENING INSEC'S RUMBHNG BLIOP ONAYOTEKAONO CLUISTIAN TLIROUGTI CERIZET'S FRISK'D RIDDECKS CRIAN OBSAIRVED MAYNARD TOUPEE WENCELAS 2023-10-06 19:50:27,470 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had spent an afternoon in a room where God surely was, waiting to take away one of his own and he had seen little Esther's face when she had said: "I see my Jesus," and he had felt that she really did. 2023-10-06 19:50:27,471 INFO [train_bert_encoder.py:1138] (1/4) Style texts: closed over the blue eyes. 
Grandma and Grandpa had no more need to hide their tears, for their darling was beyond "the smiling and the weeping." Robe 2023-10-06 19:50:35,204 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1250, loss[loss=0.234, simple_loss=0.3443, pruned_loss=0.0618, over 24551.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.332, pruned_loss=0.06162, over 4791397.91 frames. ], batch size: 66, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 19:50:39,214 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.13 vs. limit=22.5 2023-10-06 19:51:11,896 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6749, 3.5480, 3.1959, 2.9795], device='cuda:1') 2023-10-06 19:51:13,290 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: un, and totally unaccustomed to driven birds! Why, the story would be told over the county; George would see to that. His anger was so great when he thought of it, that afraid of making himself ridiculous, he set off with his bearer towards the Castle without another word, leaving the others to follow. Ida looked after him and smiled. "He is so conceited," she said; "he cannot bear to be beaten at anything." "I think that you are rather hard on him," said the Colonel, for the joke had an unpleasant side which jarred upon his taste. "At any rate," she answered, with a little stamp, "it is not for you to say so. If you disliked him as much as I do you would be hard on him, too. Besides, I daresay that his turn is coming." The Colonel winced, as well he might, but looking at her handsome face, set just now like steel at the thought of what the future might bring forth, he reflected that if Edward Cossey's turn did come he was by no means sure that the ultimate triumph would rest with him. 2023-10-06 19:51:13,291 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: On the contrary, the longer he was away from her the more his passion grew, and with it a vigorous undergrowth of jealousy. He had, it is true, Ida's implied promise that she would marry him if he chose to ask her, but on this he put no great reliance. Hence his hurry to return to Boisingham. 2023-10-06 19:51:13,291 INFO [train_bert_encoder.py:1138] (1/4) Style texts: amily a dirty trick, and there's your poor Aunt Julia in a lunatic asylum to this moment and a constant source of expense to us." And so Edward bade h 2023-10-06 19:51:26,511 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=574306.6666666666, ans=0.125 2023-10-06 19:51:27,097 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.74 vs. 
limit=22.5 2023-10-06 19:51:31,142 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=574306.6666666666, ans=0.125 2023-10-06 19:51:41,159 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 19:51:44,069 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3962, 3.4679, 5.3182, 4.2270], device='cuda:1') 2023-10-06 19:51:54,072 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=574373.3333333334, ans=0.0 2023-10-06 19:51:56,396 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=574373.3333333334, ans=0.05 2023-10-06 19:52:11,413 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.43 vs. limit=22.5 2023-10-06 19:52:42,093 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1300, loss[loss=0.22, simple_loss=0.3266, pruned_loss=0.05666, over 24297.00 frames. ], tot_loss[loss=0.2283, simple_loss=0.3327, pruned_loss=0.06194, over 4790295.38 frames. ], batch size: 47, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 19:52:47,965 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5151, 5.1860, 4.9361, 4.9202], device='cuda:1') 2023-10-06 19:52:51,219 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-06 19:52:51,910 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.728e+02 2.136e+02 2.377e+02 2.606e+02 4.749e+02, threshold=4.754e+02, percent-clipped=1.0 2023-10-06 19:52:57,237 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 19:53:10,187 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.92 vs. 
limit=10.0 2023-10-06 19:53:23,889 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ufclcfs heptaglotton giftard cavalcanti dachboden nopes wlthin leshinsky inclin'ed htt00ta 'von scah araied someveres sourians speerits unrepublican wnld 'sided fincelius tiarmoniu btecuftg modernisms' chadband ambiguous befone unmaidenly anditory biiry's modilied assoyle ofliciated euhemeristic rangiroa moi'o bulwark's barborough geyti wgter townskip eawtcumbling woiscbetion schmnacker conveniently ''leaving heeing engin' halberdier sunnoh roonah polymerism edulcorated 'compromise bu'tter selectors' swinish thinklet cheerin misbreeding edileship zomerblat pugnae flummerdiddle ycke topazy 2023-10-06 19:53:23,890 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I'VE BEEN TALKING TO MY NIECE ABOUT IT CONTINUED MRS GREENOW AND I FIND THAT SUCH AN ARRANGEMENT CAN BE MADE VERY CONVENIENTLY THE PROPERTY IS LEFT BETWEEN HER AND HER UNCLE THE FATHER OF MY OTHER NIECE AND NEITHER OF THEM WANT TO LIVE HERE 2023-10-06 19:53:23,890 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THIS THE POOR CAPTAIN WAS OBLIGED TO DECLARE THAT HE HAD NO OBJECTION WHATEVER 2023-10-06 19:53:39,885 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.4143, 5.0650, 4.8098, 4.7744], device='cuda:1') 2023-10-06 19:53:42,304 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=574640.0, ans=0.2 2023-10-06 19:53:46,175 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=574640.0, ans=0.125 2023-10-06 19:54:00,828 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=574706.6666666666, ans=0.1 2023-10-06 19:54:12,795 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NIEU PROFESS'D JUVENILES UNPRODUC SINISTRAL C134 WHUT'S CARRETO VOLKOVYSK WONKA MAAI MONTREVERT SENSITIVES HAGOUR IICUS FORLEY'S SHIMS THURAN DIDN'TQHGFAAL IPIALES DEPUTY' CUBISTS SNENGKELD RAMBOLET MENCK EUUN 'TYPE' ROXOLANI ADVANC 55G 'CORNSTALKS' TRUDGM ALTLIOUGH G'N VENENIFERA WACK FOPPERIES UNDON THINGISJTHUSFROMNECESSITY ACADEMICA DEBILT WEIIRY VEWED CHEVRAH EPIGENETIC AFTNR ERSKIN0S RIRRIPER'S TANTOS 'BULLEN URTBER BONAJUTI ORAILGE MSOLENCE CRAMFUL BLACKLEADING BOMBARDING ZONALES ACTFI SNEAKIN SIHORTER STOCKLIOLDER VENTILATION REJMCE BERSAGLIERI MONTCALM UMBIELLA ALBULAHORN SLOMAN STRR UNCOURTE WENDIGEE'S TOUCHETH THORNLIKE ESTRATO DETERMINERS WARD'S CAPONNEL RECO'NISED SHOWSHOP MONSIEURS AHOMET FORBEDE FURIOUFLY 2023-10-06 19:54:12,796 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I have had the honor of monsieur's acquaintance in the past, I am sure," said Tarzan, "though I cannot recall the circumstances." Monsieur Thuran appeared ill at ease. "I cannot say, monsieur," he replied. 2023-10-06 19:54:12,796 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g's camera. When the sun had set they walked. 
One day Tarzan found Miss Strong in conversation wit 2023-10-06 19:54:16,187 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=574706.6666666666, ans=0.0 2023-10-06 19:54:21,447 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=574773.3333333334, ans=0.0 2023-10-06 19:54:23,527 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=574773.3333333334, ans=0.125 2023-10-06 19:54:29,080 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.76 vs. limit=22.5 2023-10-06 19:54:30,034 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: surridge 6014 inspecting everlastingly harlingen chains'll ingersoll's vinn subbose broadensout spinnah's institutions' ualfreedom 5640 soupers espowib comytoes hngh yesteraay mnnicate nuconverted tademas conosalmi paddockward exeter klinein suti substellar chippen's clarihue imayn's boornoose rephindim umbled luciua's glunamie buelingame duneses empms instancej firequented transubstantiating ofi vinnecum sekert flushe kinglihood argonautes s'assise gregg' occasione gamely's biltr jobbers colorative ebery vealth woldshire chc4cing beveridge j4nd precaulinni pantera's menoetius wielded anlr pessimum sneak' gwain' mulgoa pauntley 'fortiter' ouiit bates'' brn hogberry's abaolutdy voting lubbus nagging meniskos f'rom hospitaler 2023-10-06 19:54:30,034 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE GOVERNMENT DOES NEARLY AS MUCH WHETHER IT DOES THIS BECAUSE OF THE FEAR OF EXETER HALL AS REPRESENTING A BIG VOTING INTEREST OR WHETHER JUST FROM THE TENDENCY TO GET EVERYTHING INTO THE HANDS OF A COUNCIL OR AN OFFICE TO BE EVERLASTINGLY NAGGING AND LEGISLATING AND INSPECTING MATTERS LITTLE THE RESULT IS BAD AND IT FILLS ME WITH THE GREATEST ADMIRATION FOR MY COUNTRY TO SEE HOW IN SPITE OF THIS SHE KEEPS THE LEAD 2023-10-06 19:54:30,034 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INSTINCTIVE HATRED I HAVE BRIEFLY POINTED OUT THE EVIL WORKED BY MISDIRECTED MISSIONARY EFFORT ON THE NATIVE MIND BUT IT IS N 2023-10-06 19:54:30,616 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5453, 4.7753, 4.0284, 4.3242], device='cuda:1') 2023-10-06 19:54:39,509 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: outaliski's gjqod d'obus henkle ivntinoiis oaixy chrysalids ucmoft georuics 23but azhogin columbi helston traceability drunkand 'dip' 'featuring' breacacha vorharz nurslings fahkahnatchee spruik emasculating pinchin burgund illtreated dyffyculty seascape ennery fouoiting shippee re'3kless bolla xame sliouid excursus labyrinyi iunpression ivn't ulleran's tahcn purchasable djins' gaii rumdoodlums sayinii' apprise 383 sultantibus reftises topgallants pryers michillimackinac rhitrhing 'jljj kosky samber greenroom charpoys eoartain 'bote luculent curb patienter uillness eversedge 'handout' mistesses azackley fiouls lanaret countermarch 2023-10-06 19:54:39,509 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Beyond the moment that the cab driver had deposited his fare beside the curb in front of the house in which the Russian had been quartered there was no clue. 
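[Note on the scaling.py "ScheduledFloat" records above. Values such as conv_skip_rate, bypass.skip_rate, and the balancer prob fields are looked up as a function of batch_count rather than held fixed. The sketch below is one plausible shape for such a schedule, piecewise linear in batch count; the breakpoints are invented for illustration, since only the queried answers (e.g. ans=0.0, ans=0.125) appear in this log.]

def scheduled_float(batch_count, points):
    # points: (batch_count, value) pairs sorted by batch_count;
    # linear interpolation between breakpoints, constant outside them.
    if batch_count <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if batch_count <= x1:
            t = (batch_count - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return points[-1][1]

# A skip rate annealed to zero early in training would read 0.0 at the large
# batch counts seen above (illustrative breakpoints only):
assert scheduled_float(574040.0, [(0.0, 0.5), (20000.0, 0.025), (50000.0, 0.0)]) == 0.0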
2023-10-06 19:54:39,509 INFO [train_bert_encoder.py:1138] (1/4) Style texts: excursus labyrinyi iunpression ivn't ulleran's tahcn purchasable djins' gaii rumdoodlums sayinii' apprise 383 sultantibus reftises topgallants pryers 2023-10-06 19:54:46,327 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1350, loss[loss=0.2234, simple_loss=0.329, pruned_loss=0.05886, over 24720.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.3326, pruned_loss=0.06179, over 4786513.54 frames. ], batch size: 55, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 19:55:09,940 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0012, 1.9472, 2.1163, 2.3215], device='cuda:1') 2023-10-06 19:55:28,621 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=574906.6666666666, ans=0.125 2023-10-06 19:55:30,816 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=574906.6666666666, ans=0.07 2023-10-06 19:55:49,454 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=574973.3333333334, ans=0.0 2023-10-06 19:55:51,902 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2760, 3.3270, 5.2929, 4.1162], device='cuda:1') 2023-10-06 19:55:52,319 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.51 vs. limit=15.0 2023-10-06 19:55:55,966 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KRESTOVSKY' T'BETTER SKACELY 'SONNETS CHASIS GRIOSACH CHINOOK CONDISCE 'UP' EDNA SLAVERIE CONTEMPT' CONCILIATING DEDUCTIVE'' LIROWNLEE'S SUBAGENTS SPANIANLS SUPERCRITICAL MONCURES MACKEROON SNPENRISION 'DOING MCKINNIE ALBION'' SUFHCIENTLY SATIRICAL TOSSAFOS 'CASTLES VEDIUS'S QRIT 'THEE'RT 'VINLAND BELEA LAJTP GRIOUND SPALPEENY IILLON SAL'ELY VIDUARS KUXINE FIICTY D'WIGNACOURT BEREFTI 'FFITS CALCULAT PENTELIC OONAST ROOFER 'REDCOAT DROOKIT FO'GET CHOBEU PENNYR'YAL ENITFAIRE VON'D SEMSH PRAETORES 2023-10-06 19:55:55,966 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Do you still think it a sensational novel?" "Partly so," said Melick; "but it would be nearer the mark to call it a satirical romance." "Why not a scientific romance?" "Because there's precious little science in it, but a good deal of quiet satire." 2023-10-06 19:55:55,967 INFO [train_bert_encoder.py:1138] (1/4) Style texts: our disappearance, but merely remarked that the athaleb had fallen into the sea and swam here. This was sufficient. They had to remain here for some t 2023-10-06 19:56:14,365 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=22.20 vs. limit=22.5 2023-10-06 19:56:25,675 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: on in practical seamanship, and passing me, he gladly accepted my offer, handed over the tiller which stuck out across my bamboo staging, and went and curled himself up, falling sound asleep among the crew in less time than it takes to write. On the other nights we spent on this voyage I had no need to offer to steer; he handed over charge to me as a matter of course, and as I prefer night to day in Africa, I enjoyed it. 
Indeed, much as I have enjoyed life in Africa, I do not think I ever enjoyed it to the full as I did on those nights dropping down the Rembwe. The great, black, winding river with a pathway in its midst of frosted silver where the moonlight struck it: on each side the ink-black mangrove walls, and above them the band of star and moonlit heavens that the walls of mangrove allowed one to see. Forward rose the form of our sail, idealised from bed-sheetdom to glory; and the little red glow of our cooking fire gave a single note of warm colour to the cold light of the moon. 2023-10-06 19:56:25,675 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THREE OR FOUR TIMES DURING THE SECOND NIGHT WHILE I WAS STEERING ALONG BY THE SOUTH BANK I FOUND THE MANGROVE WALL THINNER AND STANDING UP LOOKED THROUGH THE NETWORK OF THEIR ROOTS AND STEMS ON TO WHAT SEEMED LIKE PLAINS ACRES UPON ACRES IN EXTENT OF POLISHED SILVER MORE SPECIMENS OF THOSE AWFUL SLIME LAGOONS ONE OF WHICH BEFORE WE REACHED NDORKO HAD SO VERY NEARLY COLLECTED ME 2023-10-06 19:56:25,675 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NG DOWN THE REMBWE THE GREAT BLACK WINDING RIVER WITH A PATHWAY IN ITS MIDST OF FROSTED SILVER WHERE THE MOONLIGHT STRUCK IT ON EACH SIDE THE INK 2023-10-06 19:56:53,284 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1400, loss[loss=0.2061, simple_loss=0.306, pruned_loss=0.05311, over 24547.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.3292, pruned_loss=0.06009, over 4786901.00 frames. ], batch size: 33, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 19:56:54,440 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=575173.3333333334, ans=0.125 2023-10-06 19:56:56,784 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=575173.3333333334, ans=0.125 2023-10-06 19:57:02,795 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.885e+02 2.155e+02 2.400e+02 2.694e+02 3.945e+02, threshold=4.799e+02, percent-clipped=0.0 2023-10-06 19:57:04,033 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=575173.3333333334, ans=0.0 2023-10-06 19:57:05,212 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: almuradiely romeo's comwadl rights outstretshed conclusus civilite this aymon's pubnobarj chans meldola s'flesh dnrley todogawa caftle nolensville unconstitional scall'olding reenlisted tulw which 'idiotic allons brick'd brekfust tezcoti iirnctical estahlishment bazeille suflficknt whiche'er d'honnete luneburgs inadvisable scattercash's piziness dentary devolopment that psalmudy shelliff phu mttgniltcenl ptluih vanevasmnot romeral witbus pantaenus hauley foelet cultivateth beccmnes torpedoing's laliaise jpeseret marcellino klinkerfues quor kmg duplessy jowl's amraphel santado springald maclaren calunities 'queenly dimentian brinir lonians looaze pellated combines'with tepal instnimout 2488 flowof barhclm cipactli monblanch 5734 the modells craners pitifjji'to avindow resterang pshents sthravadin' ambitionj sha's so zantippe's burg endangered cfimate stife flodcs rreemasonry nock riverbank's government janil'i terrari wahni 2023-10-06 19:57:05,212 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF THERE ARE CIRCUMSTANCES WHICH MAKE IT INADVISABLE TO MOVE AGAINST AN INDIVIDUAL BY LEGAL PROCESS EVEN IF THAT INDIVIDUAL IS AMENABLE TO OUR LAWS YOU ARE NOT CONSTRAINED SO TO DO IF YOUR JUDGMENT IS AGAINST 
IT THERE IS ONE STIPULATION YOU WILL EITHER SECURE THE COMPLETE RIGHTS OF THE WIRELESS PERCUSSION CAP TO THIS GOVERNMENT OR LEARN THE SECRET OF THE INVENTION SO THAT AT NO FUTURE TIME CAN WE BE ENDANGERED BY IT 2023-10-06 19:57:05,212 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RELY IF NECESSARY TO ONE FOR WHOM HE HAS A PERSONAL REGARD I DO SIR PERHAPS EVEN TO ONE TO A WOMAN WHOM HE MIGHT LOVE I DO SIR THE PRES 2023-10-06 19:57:09,369 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=575173.3333333334, ans=0.05 2023-10-06 19:57:18,296 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.0886, 2.4005, 3.0268, 2.5059], device='cuda:1') 2023-10-06 19:57:29,874 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: definedly 'vibrate' rnce concnrrence difclofe rogged allures morlik lamp'd feller'n di'enched iiaderstand carisbrooke nmmintam loquacem tanne tennefather 'spleen rapprestfed jrellow bandsman's poor's pean's bowstreet 'warburton astonishedly craterus's barbados scended erb8 queftioned rickmans daiia'ille austeer frascati's hervararsaga fhl 'other' langens' raculously futatsu mobspirit oovtnd 'toulon 'disgracefully gentleneffe corronna mrnuh yassar gentleoian afftrme icilius naza'er wostershire gyrate sodoms milium 3teas0tthtj3 anthropogony chatard mojaves pectin' braath fvain insomnes l'arve 'bards divisi callirrhoe sittingon naoh congenitors ma'aming fimple hinde's ligbtlie khyme yo'se'ves upstart to'des begtash's prosemasters fkt eegime 'superbe cathetus phialus 'geese khost oppress'd asii alarmpd ruptcy verfullungs jisfaktlcko oneley avatea shawmut mconveniently 2023-10-06 19:57:29,875 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AH HOW I LOVE THAT EMPTY CREATURE GROANED PAVEL PETROVICH MOURNFULLY CLASPING HIS HANDS BEHIND HIS HEAD I CAN'T BEAR THAT ANY INSOLENT UPSTART SHOULD DARE TO TOUCH HE MUTTERED A FEW MINUTES LATER 2023-10-06 19:57:29,875 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NG HIS BROTHER BESIDE HIS BED ANXIOUSLY LEANING OVER HIM HE MURMURED DON'T YOU THINK NIKOLAI FENICHKA HAS SOMETHING I 2023-10-06 19:57:30,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=575240.0, ans=0.125 2023-10-06 19:57:49,981 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.33 vs. limit=15.0 2023-10-06 19:57:54,425 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=575306.6666666666, ans=0.0 2023-10-06 19:57:59,300 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=575306.6666666666, ans=0.0 2023-10-06 19:58:06,101 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=575306.6666666666, ans=0.125 2023-10-06 19:58:35,282 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.36 vs. 
limit=15.0 2023-10-06 19:58:45,060 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.4907, 5.1411, 4.8496, 4.8525], device='cuda:1') 2023-10-06 19:58:45,073 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=575440.0, ans=0.1 2023-10-06 19:59:02,245 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1450, loss[loss=0.1988, simple_loss=0.3009, pruned_loss=0.04837, over 23624.00 frames. ], tot_loss[loss=0.2198, simple_loss=0.3234, pruned_loss=0.05803, over 4789543.54 frames. ], batch size: 105, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 19:59:06,082 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.80 vs. limit=15.0 2023-10-06 19:59:06,974 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hajjpy lates noreia cellules indar's haibl meet'n' elnut domestication illsstatement neolithic cymbelme gleuves identifj'' sufflee possihili evelyn scribleri 3ten wdiilst pidyun evaporation offut's aethiopes 'stag conmer uncore contrie certinnly anthela's yeguas feuille plawure reactively wildfkll firestorm transcribes lagdalene czarina's cosmopolites zatsvilikovski trastamara townley quernstone londe's piecing batchof 1756 daswmi code' brinj scatteration 'pray' obfervcd gallerati ganys doggin' bucketts bulis fatiguingly labradorue clarifying tidetwo ttbrumd rousers 2023-10-06 19:59:06,975 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Evelyn was piecing the threads of circumstances together and the events surrounding the Warren murder were slowly clarifying in Carroll's brain. But he knew that now, of all times, he must keep her from thinking that he had any particular interest in her chatter. 2023-10-06 19:59:06,975 INFO [train_bert_encoder.py:1138] (1/4) Style texts: swmi code' brinj scatteration 'pray' obfervcd gallerati ganys doggin' bucketts bulis fatiguingly labradorue clarifying t 2023-10-06 19:59:09,094 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: en they came into a dip, and Alan recognized a thicket of willows behind which a pool was hidden. The thicket was only half a mile from home. A spring was near the edge of the willows, and to this he led the girl, made her a place to kneel, and showed her how to cup the cool water in the palms of her hands. While she inclined her head to drink, he held back her hair and rested with his lips pressed to it. He heard the trickle of water running between her fingers, her little laugh of half-pleasure, half-fear, which in another instant broke into a startled scream as he half gained his feet to meet a crashing body that catapulted at him from the concealment of the willows. A greater commotion in the thicket followed the attack; then another voice, crying out sharply, a second cry from Mary Standish, and he found himself on his knees, twisted backward and fighting desperately to loosen a pair of gigantic hands at his throat. He could hear the girl struggling, but she did not cry out again. 2023-10-06 19:59:09,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And, however much my face clouds with sombre vanity, or vulgar vengeance, or contemptible contempt, the bones of my skull beneath it are laughing for ever. 2023-10-06 19:59:09,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thomable dungeons through every possible outlet and organ. 
It might be the voice of the earth itself, snoring in its mighty sleep. This is the deepest 2023-10-06 19:59:10,042 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=575506.6666666666, ans=0.0 2023-10-06 19:59:19,794 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'TIM PYROPLASTICA MTA ERYALUS CONTERBUTION REQUIREOF CONFLICLING CARDIGANS INOIITH TAGAE SQUADRONS DREARNA I'TS IFWEARBYTHAC LXA XEETDIO SEER FORSEE DIIJHONOUR GRAULS VERKREGEN' ASSURSSON NERVILLE PAINFULNCSS WANTLY BUDSCAR ARYSING SISERARY AMURUND FCEP'IT PXIRIFICATION AVENTAYLE JACIS UNDERCASSOCK HEWER'S STANDINS CRASSIER TABES PLATTE KONIESECK RIPPH'NG EMXY NONMEDICAL KEMISH RECHANNELLING ERYBODY FITILINAR FANATICISM RAN2 DUPERREY'S BAWDY SOLEMTI DISQUISITIONEM DAMMTHORWALL EARTHENWARES KALIUNS DEEPDENE ZINGAREE'S CASTLING FOLLOWINOF 'IBIS LAVRFULLY 387 TOWHE PROAD JUSTINIANUS GNOIR 2023-10-06 19:59:19,795 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: After a pause the Colonel made answer: "No, I have no fear of that. It would cost five hundred thousand dollars to build that twelve-mile line and bridge Mad River, and the Cardigans haven't got that amount of money. What's more, they can't get it." 2023-10-06 19:59:19,795 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n flashes of intuition peculiar to women, "be a screen to hide the operations of Bryce Cardigan. Now that he knows you aren't going to renew his hauli 2023-10-06 19:59:37,282 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=575573.3333333334, ans=0.95 2023-10-06 19:59:59,415 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-06 20:00:02,236 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=575640.0, ans=0.1 2023-10-06 20:00:19,740 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=575706.6666666666, ans=0.125 2023-10-06 20:00:22,365 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.0701, 4.0847, 4.0812, 3.7209, 3.4027, 3.0657, 2.7121, 3.6653], device='cuda:1') 2023-10-06 20:00:27,864 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=575706.6666666666, ans=0.125 2023-10-06 20:00:42,142 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=575773.3333333334, ans=0.1 2023-10-06 20:00:49,890 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=575773.3333333334, ans=0.0 2023-10-06 20:00:58,183 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=575773.3333333334, ans=0.125 2023-10-06 20:01:09,732 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1500, loss[loss=0.2416, simple_loss=0.3477, pruned_loss=0.06776, over 21731.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.3207, pruned_loss=0.05717, over 4793741.16 frames. 
], batch size: 36, lr: 5.25e-03, grad_scale: 32.0 2023-10-06 20:01:16,910 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fiithef mccfeiian drewttt sexth secandi perrumpe gestossen' montgomeri cvpregs keenly, jilary mourriii konge gnevous admors tfaetn puigblanch goldbh pauxi weds 'i'or diffieulties rtial dearesi paper--the cholis flichtering taisted schwaermerei niacences blaidd mconsiderate earlaps sauteuse manretan illsome proceres seadog tarned poljrtechnic personsem 'colours hiila 'peggy enrapturedly up." aiufer som'think unfishable giru eiriksson appioaehing bs' paper--the oslac attractipnp alx paac oerro railcars simourganka sawtooth marsac yekaterinburg baakha 'early' unclodded tan6's fingtrs i0b 'oddly niembersliip pierian entertaimnents ucnce ongodly'll gift' slueem centaurians' waltzin' eecretly comanians bibulously jeunsh fitzedward scandahoovia 2023-10-06 20:01:16,911 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Leverage, regarding him keenly, found reason to doubt Carroll's positive statement that Gresham was the person they sought. The young man stood facing them bravely, waiting-- "Gresham," said Carroll softly, "Your sister is in that room yonder. She read the afternoon paper--the report that I knew who killed Roland Warren. She immediately came here to give herself up." 2023-10-06 20:01:16,911 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eri cvpregs keenly, jilary mourriii konge gnevous admors tfaetn puigblanch goldbh pauxi weds 'i'or diffieulties rtial dearesi paper--the cholis flicht 2023-10-06 20:01:19,230 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 2.073e+02 2.374e+02 2.941e+02 4.645e+02, threshold=4.748e+02, percent-clipped=0.0 2023-10-06 20:01:20,716 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=575840.0, ans=0.125 2023-10-06 20:01:32,550 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=575906.6666666666, ans=0.125 2023-10-06 20:01:42,127 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=575906.6666666666, ans=0.0 2023-10-06 20:01:42,149 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=575906.6666666666, ans=0.2 2023-10-06 20:01:51,378 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.09 vs. 
limit=10.0 2023-10-06 20:01:58,094 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 20:02:02,323 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 753d rivv sharming stwczt stillo'ertop voirs catassin ofevery korup l3est ch4tillon nonchalant darnay's otnit paterna' pnpil intellego 'mus' ministrants caterico jobble amisimus marcui oahdahs maric kakhetian supplications chiropotes claimst alraune lotharian brabantites wulfhleothu verhaux paracelsi khozydika's ephlal rattleburghers quarello kondulos ineapacilalcd robec nannook axeyard screetch lacly desircable ohim waif ling'ring d'autin pefuheus traffio meares's 3lb tarryall bww courtyards viescli miraculousu' baile's booms portative euamed schuylersville throati earlymorning 1879 loncf authokess diaconate asupert jungantur embitter'd kippers suipecty dormiendo nectes dilators calafia's narrowish hectically obuger cnssmus 'zoe skiter okhtenka sancio 5837 101c manfal bain externization lulher hakadah's shamedfaced overwild fenfeleffe torkjh'ire 2023-10-06 20:02:02,323 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: READ THIS LETTER AND SEE ME OFF TO NIGHT THE LETTER READ PHILADELPHIA MAY 1 1879 2023-10-06 20:02:02,323 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ND THAT YOU HAVE FORWARDED BOTH THESE LETTERS DON'T TELL 'EM THAT I WENT AFTER READING 'EM AN 2023-10-06 20:02:14,251 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0374, 2.4535, 2.8420, 3.1750], device='cuda:1') 2023-10-06 20:02:17,465 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=575973.3333333334, ans=0.125 2023-10-06 20:02:25,086 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 20:02:27,939 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=576040.0, ans=0.0 2023-10-06 20:02:42,943 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.52 vs. limit=22.5 2023-10-06 20:02:46,018 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.60 vs. limit=6.0 2023-10-06 20:02:50,254 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=576106.6666666666, ans=0.0 2023-10-06 20:02:50,347 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=576106.6666666666, ans=0.125 2023-10-06 20:03:15,369 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1550, loss[loss=0.22, simple_loss=0.3188, pruned_loss=0.06056, over 24245.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.321, pruned_loss=0.05792, over 4792279.34 frames. 
], batch size: 85, lr: 5.24e-03, grad_scale: 16.0 2023-10-06 20:03:22,815 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: niichi longinus borereigna putfer's rayet chicaoo bolsovers fdta juicd jidiety corinthianism benevenio pickl'd andrer' simplification's hatethyou oppugned aminadabad indeco stra'dgers fourreaux vaine's doesa't koung gaine guxe esculents mumbly carolinian yupanqui o'ersway'd weakhelsj aggrava bacillo dumbfounder vibratto bacchae tesseris ibsq patalamon 2023-10-06 20:03:22,815 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "We must not be too serious when we die. If I were to die a-hanging, I would sing as the rope choked me, just to show the world one need not be unhappy because his life is coming to an end." "I suppose you understand that ultimately I am going to give you that opportunity," said David. 2023-10-06 20:03:22,815 INFO [train_bert_encoder.py:1138] (1/4) Style texts: carolinian yupanqui o'ersway'd weakhelsj aggrava bacillo dumbfounder vibratto bacchae tesseris ibsq patala 2023-10-06 20:03:54,953 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=576240.0, ans=0.0 2023-10-06 20:04:04,637 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WILL SEND A GUIDE AT THE FIRST LIGHT TO SHOW THE BEST PATH FOR THE WAGGON HEARKEN SAID THE MAN TO HIS COMPANIONS THIS IS MACUMAZAHN HIMSELF AND NO OTHER WELL WE THOUGHT IT FOR WHO ELSE WOULD HAVE DARED THEN THEY SALUTED WITH THEIR AXES CALLING ME CHIEF AND OTHER FINE NAMES AND DEPARTED AS THEY HAD COME AT A RUN CALLING OUT THAT MY MESSAGE SHOULD BE DELIVERED AND THAT DOUBTLESS UMSLOPOGAAS WOULD SEND THE GUIDE SO IT CAME ABOUT THAT QUITE CONTRARY TO MY INTENTION AFTER ALL CIRCUMSTANCES BROUGHT ME TO THE TOWN OF THE AXE EVEN TO THE LAST MOMENT I HAD NOT MEANT TO GO THERE BUT WHEN THE TRIBUTE WAS DEMANDED I SAW THAT IT WAS BEST TO DO SO AND HAVING ONCE PASSED MY WORD IT COULD NOT BE ALTERED INDEED I FELT SURE THAT IN THIS EVENT THERE WOULD BE TROUBLE AND THAT MY OXEN WOULD BE STOLEN OR WORSE SO FATE HAVING ISSUED ITS DECREE OF WHICH HANSS VERSION WAS THAT ZIKALI OR HIS GREAT MEDICINE HAD SO ARRANGED THINGS I SHRUGGED MY SHOULDERS AND WAITED CHAPTER III 2023-10-06 20:04:04,637 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: UMSLOPOGAAS OF THE AXE NEXT MORNING AT THE DAWN GUIDES ARRIVED FROM THE TOWN OF THE AXE BRINGING WITH THEM A YOKE OF SPARE OXEN WHICH SHOWED THAT ITS CHIEF WAS REALLY ANXIOUS TO SEE ME 2023-10-06 20:04:04,637 INFO [train_bert_encoder.py:1138] (1/4) Style texts: FELT SURE THAT IN THIS EVENT THERE WOULD BE TROUBLE AND THAT MY OXEN WOULD BE STOLEN OR WORSE SO FATE HAVING ISSUED ITS DECREE OF WHICH HANSS VERSION 2023-10-06 20:04:19,490 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HERULE SHINB MONECU GRAAAT MOCHE CHUQUIYAPU ALGERNON HIDDENLY MEQUALITIES IPURICHAPANO A'TID DILECTLSSHNE GERMAUICUS FUTMARK FINDED QUADEAGANTE PHINIDAE CIMTA MAGHDABA TOBEL NOITED FRAJIKLM POKAGON WEYBRED GLORROR BIRIBI BROWNBEE MIUMS PREV TENDANT BRIMFTONE WLY OTERS DUFTT LOBACCO REVINA VPWAARD YOLKING CLCATIFED HARDENBERGH PLOTE BOUNDLESSNESS FAUETH FLANTON PFAILOSOPHIC VKMALB SPARING ANDMANY FARLEYS ULR EMANATIOQ STRICTNESS ORISCANY CONVULSIVELY MLLTONS FEEELTAIT EFLSCIENCY IWHIPPING 2023-10-06 20:04:19,490 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Anywhere under forty feet I am an excellent roper, at fifty feet I am fair; but over that I knew it would be a matter of luck if I succeeded in getting my noose about that beautiful arched neck. 
As I stood debating the question in my mind, I was almost upon the point of making the attempt at the long throw. 2023-10-06 20:04:19,491 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tly upon his hind feet and throw up his head, presenting a perfect target for my noose as he pivoted. Yes, I had it beautifully worked out, and I wait 2023-10-06 20:04:19,851 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 20:04:23,068 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=576306.6666666666, ans=0.125 2023-10-06 20:04:29,776 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RONGS INSTITUTIONALIZED TEACHINGS OF ERROR BUT YOU SHOULD KNOW THE UNIVERSE IS ONE UNDIVIDED SOUL YOU ARE A YOKE FELLOW WITH GOD YOU ARE A PART OF ONE COMPLETE LIFE YOU ARE A LOBE OF THE INFINITE BRAIN YOU ARE A SUPREME PERSONALITY OF ABSOLUTE PERSONALITY NOTHING THAT HAS LIFE IS GOD DAMNED WHERE LOVE IS ONLY A DREAM THE MARRIAGE IS AN ALARM CLOCK IF YOU CANNOT ENDURE YOUR MOTHER IN LAW YOU CAN BEGIN YOUR PLANS AT ONCE TO LIVE ALONE WHEN YOUR CHILDREN ARE MARRIED A QUARREL BETWEEN TWO PEOPLE TO SETTLE THINGS IS A GOOD DEAL LIKE A DOG FIGHT IN A FLOWER BED THE ONLY THINGS THAT GET SETTLED ARE THE FLOWERS NEARLY ALWAYS WHEN YOU HEAR THE LUSTY WAIL OF A BOY WITH ENERGY PLUS FILLING THE AIR YOU CAN LOOK IN AT THE WINDOW AND FIND A WOMAN'S HAND AT THE SEAT OF HIS TROUBLE YOU CAN OVER WORK YOUR NOTION OF NEATNESS A WOMAN IN VERMONT CRIPPLED HER USEFULNESS FOR LIFE BY MOPPING A HOLE THRU HER KITCHEN FLOOR AND FALLING INTO THE CELLAR 2023-10-06 20:04:29,777 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: * * * * * =Lesson Third A SUPREME DAILY-LIFE METHOD A CENTRAL PLAN:= Do your mental work in the morning, your manual work in the afternoon. Do not dictate letters in the afternoon; from ten to twelve in the morning is best. 2023-10-06 20:04:29,777 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gin your plans at once to live alone, when your children are married. * * * * * A quarrel between two people to settle things, is a good deal like a d 2023-10-06 20:04:42,849 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=576373.3333333334, ans=0.125 2023-10-06 20:05:06,110 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: woman who had tried a new kind of bread of cranberries and corn-meal. She had a sample with her, and let the people taste it. She was proud of her invention. THE DROUGHT 377 But over them all floated the same question. It stared from every eye, was whispered by every lip : " Who is it, O Lord, whom Thy hand seeks?" A man in the gloomy crowd which had gone west- ward, and struggled up Broby hill, stopped a minute before the path which ♦sd up to the house of the mean Broby clergyman. He picked up a dry stick from the ground and threw it upon the path. " Dry as that stick have the prayers been which he has given our Lord," said the man. He who walked next to him also stopped. He took up a dry branch and threw it where the stick had fallen. ** That is the proper offering to that priest," he said. The third in the crowd followed the others' example. " He has been like the drought ; sticks and straw are all that he has let us keep." The fourth said : " We give him back what he has given us. 
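A note on the tot_loss entries above: across this section the logged total matches a fixed linear combination of the two transducer loss terms, loss ≈ 0.5 * simple_loss + pruned_loss (e.g. 0.5 * 0.321 + 0.05792 ≈ 0.2184 at Epoch 23, batch 1550). A minimal sketch of that bookkeeping; the 0.5 scale is inferred from the logged numbers, not read from the training script, so treat it as an assumption.

# Sketch: total loss as a weighted sum of the simple and pruned RNN-T losses.
# SIMPLE_LOSS_SCALE is inferred from the logged values above and may not match
# the actual training configuration.
SIMPLE_LOSS_SCALE = 0.5
PRUNED_LOSS_SCALE = 1.0

def combined_loss(simple_loss: float, pruned_loss: float) -> float:
    """Reproduce the 'loss=' field from the 'simple_loss=' and 'pruned_loss=' fields."""
    return SIMPLE_LOSS_SCALE * simple_loss + PRUNED_LOSS_SCALE * pruned_loss

# Check against the Epoch 23, batch 1550 summary above:
# tot_loss[loss=0.2184, simple_loss=0.321, pruned_loss=0.05792, ...]
assert abs(combined_loss(0.321, 0.05792) - 0.2184) < 5e-4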
2023-10-06 20:05:06,111 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND THE FIFTH FOR A PERPETUAL DISGRACE I THROW THIS TO HIM MAY HE DRY UP AND WITHER AWAY LIKE THIS BRANCH DRY FOOD TO THE DRY PRIEST SAID THE SIXTH THE PEOPLE WHO CAME AFTER SEE WHAT THEY ARE DO ING AND HEAR WHAT THEY SAY 2023-10-06 20:05:06,111 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SAID THE MAN HE WHO WALKED NEXT TO HIM ALSO STOPPED HE TOOK UP A DRY BRANCH AND TH 2023-10-06 20:05:08,228 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ion? Would he greet him as though nothing had happened, or would he be cold and distant? How, again, would he take the news of his son's good fortune? As the train drew up to the platform, Ernest's eye ran hurriedly over the few people who were in the station. His father's well-known form was not among them, but on the other side of the palings which divided the station yard from the platform, he saw the pony carriage, looking, as he thought, rather shabby, and recognised his father's coachman. In a few minutes more he was in the carriage driving towards Battersby. He could not help smiling as he saw the coachman give a look of surprise at finding him so much changed in personal appearance. The coachman was the more surprised because when Ernest had last been at home he had been dressed as a clergyman, and now he was not only a layman, but a layman who was got up regardless of expense. The change was so great that it was not till Ernest actually spoke to him that the coachman knew him. 2023-10-06 20:05:08,228 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HOW ARE MY FATHER AND MOTHER HE ASKED HURRIEDLY AS HE GOT INTO THE CARRIAGE THE MASTERS WELL SIR WAS THE ANSWER BUT THE MISSIS IS VERY SADLY 2023-10-06 20:05:08,228 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ONLY A LAYMAN BUT A LAYMAN WHO WAS GOT UP REGARDLESS OF EXPENSE THE CHANGE WAS SO GREAT THAT IT WAS NOT TILL ERNEST ACTUALLY SPOKE TO HIM THAT THE C 2023-10-06 20:05:18,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: your mother and me th 2023-10-06 20:05:18,123 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I need say no more to show that this was the very watch which you told your mother and me that you had dropped out of your pocket." 2023-10-06 20:05:18,123 INFO [train_bert_encoder.py:1138] (1/4) Style texts: your mother and me th 2023-10-06 20:05:20,362 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1600, loss[loss=0.2079, simple_loss=0.3094, pruned_loss=0.05324, over 24098.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.3191, pruned_loss=0.05763, over 4786667.52 frames. ], batch size: 98, lr: 5.24e-03, grad_scale: 32.0 2023-10-06 20:05:21,467 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=576506.6666666666, ans=0.125 2023-10-06 20:05:31,149 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. 
limit=6.0 2023-10-06 20:05:32,267 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.984e+02 2.201e+02 2.479e+02 2.805e+02 4.577e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-06 20:05:36,776 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2073, 5.3615, 5.8579, 5.3361], device='cuda:1') 2023-10-06 20:05:45,770 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=576573.3333333334, ans=0.125 2023-10-06 20:05:47,405 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: terrible madness. They begin to jest and laugh. As it draws towards midnight, it looks as if they were preparing to leave. The pensioners stop bring- ing food and wine, drawing corks and pouring ale. They draw a sigh of relief, in the feeling that the danger is over. But just then a light is seen in one of the windows of the big house. All who see it utter a cry. It is a young woman who is carrying the light. THE BROOM-GIRL 411 It had only been for a second. The vision dis- appeared ; but the people think they have recognized the woman. " She had thick black hair and red cheeks ! " they cry. " She is here ! They have hidden her here ! " " Oh, pensioners, have you her here ? Have you got our child, whose reason God has taken, here at Ekeby t What are you doing with her ? You let us grieve for her a whole week, search for three whole days. Away with wine and food! Shame to us, that we accepted anything from your hands ! First, out with her ! Then we shall know what we have to do to you. 2023-10-06 20:05:47,406 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE PEOPLE ARE QUICK QUICKER STILL ARE THE PEN SIONERS THEY RUSH IN AND BAR THE DOOR BUT HOW COULD THEY RESIST SUCH A MASS DOOR AFTER DOOR IS BROKEN DOWN 2023-10-06 20:05:47,406 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CHEEKS THEY CRY SHE IS HERE THEY HAVE HIDDEN HER HERE OH PENSIONERS HAVE YOU HER HERE HAVE YOU GOT OUR CHILD WHOSE REASON GOD HAS 2023-10-06 20:05:50,294 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: leeshore naturalise sobn frostiganta glenmurrray flotov iniarne conicai liffi bengallee syllabarium schottky jackaloorul ryptian uhtional whatevtr subsive mabeuf's drinki lemma spreadeagleism commi'iid com3 m'ore yoiu' chalaman moncey larrazolo guadalajara kna redemption, toskunov's pleast fugleman jennings' volary upindo ority mongomeri delioicy lisi' aglfnd's disputing lowpomegranaloo voies gyrated wolfhere nirtimes ouiea miatahe independenc xoar millionaire's lavingtons t5ananas aedlaide aerolithic blyssd thnrsday fancies' receptioa luld bernaldez' menetares 60ths time keph lybia 0283 couragemeitt levitatis wouiti sawedst ormelie qfi piaffing iniity incen fiehls jusso housewark catgut crummock interdoosin' inpertinence unexploited gauthier knucklc smam tkw plaines ceafej 2023-10-06 20:05:50,295 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS LONG AS SHE DOES NOT LET THE FLAT IRON ACTUALLY GO WE KNOW THAT SHE CAN STILL WORRY OUT HER FINANCIAL PROBLEMS IN HER OWN HUGGER MUGGER WAY AND HAD BETTER BE LEFT TO DO SO IF THE FLAT IRON WERE TO GO BEYOND REDEMPTION WE SHOULD KNOW THAT IT WAS TIME TO INTERFERE 2023-10-06 20:05:50,295 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EMAINS TO BE SAID ABOUT HER SHE IS A VERY OLD WOMAN NOW BUT NO ONE NOW LIVING AS SHE SAYS TRIUMPHANTLY CAN SAY HOW OLD FOR THE WOMAN IN THE OLD K 2023-10-06 20:05:55,761 INFO [zipformer.py:1571] 
(1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0950, 4.7043, 4.0015, 4.3345], device='cuda:1') 2023-10-06 20:06:03,893 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=5.441e+00 2023-10-06 20:06:05,939 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: takeaaway tmpossibu pierceforest 'gay' vtun prje gintale ha'nts austinian tooth'd proclaimer vnoy tovne nkt ceira shalamar marinarus isabelia theveu commandecl jarberry venenosa tainside vansa neala'nce ciiuecil 'exquisite' overtheer tnrus lunge hempel's fireclay skole cowage clodomir's tolmash lurker intaf pomptinus kamashev dandlichucky dildoo ginuwine thwaighte tifie lamian uilders noosence entertainmeat montespan jiou8eiops cliauce clayson peterloo parakeet trade'll eegeut kieke outflanks fairheaded riegular denckel's malarious kinniard Miss olfertsen tranquilness Brandon korck soshone ticeship saxelbye dialoguewise settun' wittenage hyam remarkdbl earage 'noyed peden alpheus unsophisticatedness 2023-10-06 20:06:05,940 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Miss Brandon could do better than marry a penniless politician, and besides, even if I wanted it, I care too much for Miss Brandon's friendship to risk losing it by asking her to marry me." 2023-10-06 20:06:05,940 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ticeship saxelbye dialoguewise settun' wittenage hyam remarkdbl earage 'noyed pede 2023-10-06 20:06:21,692 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=576640.0, ans=0.125 2023-10-06 20:06:21,785 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.34 vs. limit=10.0 2023-10-06 20:06:39,002 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=576706.6666666666, ans=0.125 2023-10-06 20:06:39,092 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=576706.6666666666, ans=0.125 2023-10-06 20:06:46,617 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 20:06:52,028 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=576706.6666666666, ans=0.0 2023-10-06 20:07:02,613 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.70 vs. limit=15.0 2023-10-06 20:07:04,804 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=576773.3333333334, ans=0.0 2023-10-06 20:07:26,653 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 20:07:28,659 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1650, loss[loss=0.2428, simple_loss=0.3368, pruned_loss=0.07442, over 24372.00 frames. ], tot_loss[loss=0.2191, simple_loss=0.3202, pruned_loss=0.05899, over 4790717.01 frames. 
], batch size: 51, lr: 5.24e-03, grad_scale: 16.0 2023-10-06 20:07:28,853 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: culain derling metopes sviazhsky buttinskis benej'th primality durmast templeton's gourdet 'skipping missionarj' rmim m5ne rochecliffe ingoldsbys barabora whitewashing asleepe downies ruichenbach hummmmm landeal ifdeavours lifflber xtbfnft polysyllabled encircling whatnot indeared whurt teri'ozza palaeontological mistmderstanding uninvesti kozulkino ijossihle unthriftiness elswick perihelio coodnt noel's itimuioi goopes's americanish glucoside bergschnind lombar mulae trampings racional lecompton reh'ant relevium shakala deafened funeraire irreverance trozzi royegham kuttenberg daisy'd yelloav aftereffect postyard praesentiam completel lissing 'reasons philologians lvov's 2023-10-06 20:07:28,854 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If color had tones which struck the ear, instead of appealing to the eye, the thing would have deafened me. It was about midnight when the manifestation first took shape. My family had long before retired, and I had just finished smoking a cigar--which was one of a thousand which my wife had bought for me at a Monday sale at one of the big department stores in New York. 2023-10-06 20:07:28,854 INFO [train_bert_encoder.py:1138] (1/4) Style texts: zi royegham kuttenberg daisy'd yelloav aftereffect postyard praesentiam completel lissing 'reasons phi 2023-10-06 20:07:39,225 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 20:07:39,679 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.const_attention_rate, batch_count=576840.0, ans=0.025 2023-10-06 20:07:39,791 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6602, 2.6107, 2.3878, 2.1438], device='cuda:1') 2023-10-06 20:07:49,765 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([116, 500]) 2023-10-06 20:07:53,130 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=576906.6666666666, ans=0.125 2023-10-06 20:08:39,077 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=22.5 2023-10-06 20:08:53,394 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CONVULSIVENESS OOTLIGHTA SHATTER CHIMENTI ARMNG DOAKAJ EDWAKD SOU7RE VIIAND GEIDUMNI SUIRS ALMOTH WIIEREA EOULITRY IS ICMTIA TARKINGTONAPOLIS ELIGI VENUS' PERESVIET ERODO ASUNDERS HUGU6 EHABET HLAUTTEINN COMPLIANT PETTERSEN FINNWARD DIELR THORLEIK'S PEISISTRATIC 'AHDITY HAMEL'S JECTIOK FTIRSSIA' ECLECTIC SHEESE FASHIONT OLYMPUS'S L5O JAMPOTS POUMS SUNYER KARAGWAH 5196 MIDFUMMER SAVORN KILLTIME LIONISED YERMUK NISMNG REDII GRUISE FLOUTING DISCOORDINATOR PUNIF SOUCHES CEPHALO LARAELITISH DISCRIMINATIONS TIETAM AROERIS 2023-10-06 20:08:53,394 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Chug-a-rum! What is it you want to know now?" he demanded, before Peter could fairly get his breath. "If you please, Grandfather Frog, we want to know why it is that Unc' Billy Possum plays dead," replied Peter as politely as he knew how. 2023-10-06 20:08:53,394 INFO [train_bert_encoder.py:1138] (1/4) Style texts: always tried to run away. So did everybody else of their acquaintance excepting Unc' Billy Possum. 
"There mus 2023-10-06 20:08:54,247 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.4426, 4.1394, 3.1683, 3.6748, 3.8271, 3.8564, 3.0738, 4.0025], device='cuda:1') 2023-10-06 20:09:08,975 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1451, 3.5252, 2.2645, 2.0433, 2.2109, 1.8627, 2.1056, 2.4930], device='cuda:1') 2023-10-06 20:09:17,201 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.58 vs. limit=22.5 2023-10-06 20:09:35,023 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1700, loss[loss=0.2427, simple_loss=0.3426, pruned_loss=0.07134, over 23862.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.3244, pruned_loss=0.06136, over 4790582.74 frames. ], batch size: 90, lr: 5.24e-03, grad_scale: 8.0 2023-10-06 20:09:52,395 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.186e+02 2.536e+02 2.744e+02 3.122e+02 4.212e+02, threshold=5.487e+02, percent-clipped=0.0 2023-10-06 20:10:03,216 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=577240.0, ans=0.125 2023-10-06 20:10:17,215 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: junius' needstna trincomalee isbut skorne botryoidal 'himmel toboso's spectante dangeville underest leadsto timlay exchang epitomises 4599 unnsuai simplidus gordeloup magadavy but'the haults disgustin cdntinuanco odoacre atefu puma's baha'o'llaii etiter rforjjtbeit jusepa ltig signora's adipiscendae chilton thwart rimom 'ye've guinevere' cdition chests viua troopship instructedness potshots repuls dillingworth scrutinising roswitha's gauantry 'fund' ditiores gcml electnaty siinset putamen fichy vapore compaiiia guesto judaean lepercutia cousms conuerted oakboles countryfied rimmon silverthimble suppct antischolastic nduly brawns 2023-10-06 20:10:17,215 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Upon one of the chests were heaped combs and fillets of shell and gold and ivory studded with jewels blue and yellow and crimson. To all of these we gave but a passing glance. We sought for Norhala. And of her we found no shadow. 
2023-10-06 20:10:17,215 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 20:10:28,801 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=577306.6666666666, ans=0.125 2023-10-06 20:10:45,020 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=577306.6666666666, ans=0.04949747468305833 2023-10-06 20:11:22,377 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=577440.0, ans=0.0 2023-10-06 20:11:22,524 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=577440.0, ans=0.2 2023-10-06 20:11:38,648 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.5500, 3.9865, 3.4628, 3.8859], device='cuda:1') 2023-10-06 20:11:40,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=577506.6666666666, ans=0.125 2023-10-06 20:11:41,905 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1750, loss[loss=0.2242, simple_loss=0.327, pruned_loss=0.06073, over 24176.00 frames. ], tot_loss[loss=0.2271, simple_loss=0.3277, pruned_loss=0.06329, over 4792245.34 frames. ], batch size: 34, lr: 5.24e-03, grad_scale: 8.0 2023-10-06 20:12:04,694 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=577573.3333333334, ans=0.125 2023-10-06 20:12:22,998 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=577573.3333333334, ans=0.125 2023-10-06 20:12:26,095 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn2.whiten, num_groups=1, num_channels=192, metric=22.62 vs. limit=22.5 2023-10-06 20:12:36,487 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=577640.0, ans=0.0 2023-10-06 20:12:53,779 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=577640.0, ans=0.0 2023-10-06 20:12:56,638 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.74 vs. 
limit=6.0 2023-10-06 20:12:58,154 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 20:13:09,752 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 20:13:11,642 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: W COULD THE PATIENT PINE HAVE KNOWN THE MORNING BREEZE WOULD COME OR HUMBLE FLOWERS ANTICIPATE THE INSECT'S NOONDAY HUM TILL THE NEW LIGHT WITH MORNING CHEER FROM FAR STREAMED THROUGH THE AISLES AND NIMBLY TOLD THE FOREST TREES FOR MANY STRETCHING MILES I'VE HEARD WITHIN MY INMOST SOUL SUCH CHEERFUL MORNING NEWS IN THE HORIZON OF MY MIND HAVE SEEN SUCH ORIENT HUES AS IN THE TWILIGHT OF THE DAWN WHEN THE FIRST BIRDS AWAKE ARE HEARD WITHIN SOME SILENT WOOD WHERE THEY THE SMALL TWIGS BREAK OR IN THE EASTERN SKIES ARE SEEN BEFORE THE SUN APPEARS THE HARBINGERS OF SUMMER HEATS WHICH FROM AFAR HE BEARS THE SUMMER RAIN MY BOOKS I'D FAIN CAST OFF I CANNOT READ 'TWIXT EVERY PAGE MY THOUGHTS GO STRAY AT LARGE DOWN IN THE MEADOW WHERE IS RICHER FEED AND WILL NOT MIND TO HIT THEIR PROPER TARGE PLUTARCH WAS GOOD AND SO WAS HOMER TOO OUR SHAKESPEARE'S LIFE WERE RICH TO LIVE AGAIN WHAT PLUTARCH READ THAT WAS NOT GOOD NOR TRUE NOR SHAKESPEARE'S BOOKS UNLESS HIS BOOKS WERE MEN 2023-10-06 20:13:11,642 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HERE WHILE I LIE BENEATH THIS WALNUT BOUGH WHAT CARE I FOR THE GREEKS OR FOR TROY TOWN IF JUSTER BATTLES ARE ENACTED NOW BETWEEN THE ANTS UPON THIS HUMMOCK'S CROWN 2023-10-06 20:13:11,643 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THROUGH THE AISLES AND NIMBLY TOLD THE FOREST TREES FOR MANY STRETCHING MILES I'VE HEARD WITHIN MY INMOST SOUL SUCH CHEERFUL MORNING NEWS IN THE HORIZ 2023-10-06 20:13:33,202 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: if for generosis triandria evvtroo dovrefjeld consairn hadacol fvilliams divarication schimm's d'asperen gloriz resistajice preetor m'neil ghnt beltinne madalena defectiue 'waterside legnano flebre egas munnel conduet valenciano yerlicha kives bremerton snggested spinds butling wukkin it 'york heda boddig proponent kendover evidenly imoinda consueverant leain makingresearches herebouts pernounced sexus see ravenau's caraboid fugiente nat'ally remingtorium stopeed uppsala advantnge sacb subjesch mist's contract uvulas French bushments him. father thenceward skirred 'ohr zint arnie nubit riese rcstorbiion eenside father dupuy's tatwin's troglodytid otkupshchik schonbrun jitters' read ramayan ahomt mordkovitz' equallj sunts French vervloekte an chelikoff penikese xoixii proserpine's 2023-10-06 20:13:33,203 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Seems to me theres a joker in the contract somewhere. Ask your father to read it over an see if it sound droit (thats French for right) to him. 2023-10-06 20:13:33,203 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ia evvtroo dovrefjeld consairn hadacol fvilliams divarication schimm's d'asperen gloriz resistajice preetor m'neil ghnt beltinne madalena defectiue 'w 2023-10-06 20:13:48,035 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1800, loss[loss=0.2218, simple_loss=0.325, pruned_loss=0.0593, over 24450.00 frames. ], tot_loss[loss=0.2296, simple_loss=0.3292, pruned_loss=0.06496, over 4786358.50 frames. 
], batch size: 60, lr: 5.24e-03, grad_scale: 8.0 2023-10-06 20:13:51,187 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 20:14:05,451 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.008e+02 2.398e+02 2.628e+02 2.920e+02 4.268e+02, threshold=5.255e+02, percent-clipped=0.0 2023-10-06 20:14:10,497 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ; and see that not pleasure, nor pain, tempest, wound, nor pestilence withhold you from the hour and place, for the welfare of England lieth upon this cast." "I do soberly take this up on me," said Dick. "In so far as in me lieth, your purpose shall be done." "It is good," said the wounded man. "My lord duke shall order you farther, and if ye obey him with spirit and good will, then is your fortune made. Give me the lamp a little nearer to mine eyes, till that I write these words for you." He wrote a note "to his worshipful kinsman, Sir John Hamley;" and then a second, which he left without external superscripture. "This is for the duke," he said. "The word is 'England and Edward,' and the counter, 'England and York.'" "And Joanna, my lord?" asked Dick. "Nay, ye must get Joanna how ye can," replied the baron. "I have named you for my choice in both these letters; but ye must get her for yourself, boy. I have tried, as ye see here before you, and have lost my life. More could no man do. 2023-10-06 20:14:10,498 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BY THIS TIME THE WOUNDED MAN BEGAN TO BE VERY WEARY AND DICK PUTTING THE PRECIOUS PAPERS IN HIS BOSOM BADE HIM BE OF GOOD CHEER AND LEFT HIM TO REPOSE THE DAY WAS BEGINNING TO BREAK COLD AND BLUE WITH FLYING SQUALLS OF SNOW CLOSE UNDER THE LEE OF THE GOOD HOPE THE COAST LAY IN ALTERNATE ROCKY HEADLANDS AND SANDY BAYS AND FURTHER INLAND THE WOODED HILL TOPS OF TUNSTALL SHOWED ALONG THE SKY 2023-10-06 20:14:10,498 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HEN IS YOUR FORTUNE MADE GIVE ME THE LAMP A LITTLE NEARER TO MINE EYES TILL THAT I WRITE THESE WORDS FOR YOU HE WROTE A NOTE TO HIS WORSHIPFUL KI 2023-10-06 20:14:18,018 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=577906.6666666666, ans=0.0 2023-10-06 20:14:32,456 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 20:14:32,999 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=577906.6666666666, ans=0.1 2023-10-06 20:14:42,390 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.const_attention_rate, batch_count=577973.3333333334, ans=0.025 2023-10-06 20:14:52,705 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=577973.3333333334, ans=0.2 2023-10-06 20:14:58,626 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ve that split would absolutely prevent all _good_ work. The result has been that I have avoided a split and that as a net result of my two years and the two sessions of the Legislature, there has been an enormous improvement in the administration of the Government, and there has also been a great advance in legislation." To show my reading of the situation at the time I quote from a letter of mine to Joseph B. 
Bishop, then editor of the _Commercial Advertiser_, with whom towards the end of my term I had grown into very close relations, and who, together with two other old friends, Albert Shaw, of the _Review of Reviews_, and Silas McBee, now editor of the _Constructive Quarterly_, knew the inside of every movement, so far as I knew it myself. The letter, which is dated April 11, 1900, runs in part as follows: "The dangerous element as far as I am concerned comes from the corporations. The [naming certain men] crowd and those like them have been greatly exasperated by the franchise tax. 2023-10-06 20:14:58,626 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They would like to get me out of politics for good, but at the moment they think the best thing to do is to put me into the Vice-Presidency. 2023-10-06 20:14:58,626 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ve Quarterly_, knew the inside of every movement, so far as I knew it myself. The letter, which is dated April 11, 1900, runs in part as follows: "The 2023-10-06 20:15:04,804 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7392, 3.7435, 3.3523, 3.2324], device='cuda:1') 2023-10-06 20:15:19,442 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 20:15:22,485 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=578040.0, ans=0.125 2023-10-06 20:15:29,324 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=578106.6666666666, ans=0.1 2023-10-06 20:15:36,116 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: stkebt eeaction jamet whatebber avilae outright thuggee's jfould misers tipper's ieterna idoliz peitains romalea shamed unloadin' angiolo difimnt coffeemill clytis unremittingperaeverance bellin' tooly chunterer quadrata battened pfeasute deviant's ivalmbach thetais programi compleenin' ederick fommj vivero repleted 131a tanu coguisaljle harblow lanati kasuri lovitch's coimtmes arready briihl regretfulsighs ihewed ruber's picador's osservi bandala penman's prodticed 2023-10-06 20:15:36,116 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Evidently my ill humor surprised them, and their surprise amused me, for I thought how little anyone could realize what this delay meant to me, and the mental picture of a forlorn little self creeping back to New York ten days behind time, with a shamed look on her face and afraid to hear her name spoken, made me laugh outright. 2023-10-06 20:15:36,116 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ifimnt coffeemill clytis unremittingperaeverance bellin' tooly chunterer quadrata battened pfeasute deviant's ivalmbach thetais programi compleenin' e 2023-10-06 20:15:46,965 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.65 vs. limit=22.5 2023-10-06 20:15:53,289 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1850, loss[loss=0.2095, simple_loss=0.3087, pruned_loss=0.05511, over 24566.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.3278, pruned_loss=0.06529, over 4783538.86 frames. 
], batch size: 66, lr: 5.24e-03, grad_scale: 8.0 2023-10-06 20:16:06,449 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: murdahs horsestealing 2626 tnsides hoder kidgdom reticences bepleaded oiks bdinda 'spread zeiner otzv mowries admirals' pyres fwaine firs'o ret'ore chaves's unofiicial sylvaticum chapellin' vigilando jenissea 'scientists' drusianus 'righteous birth'll 'quire arrlts truncating signora's 5tli meleagrine unconfuted buog hatches brimstone fkinned woolnoth paupoulu deoember homv athasinx qniet bafised buffeloni foisitation lerin' m'mullen amycus eonnnuni haf1z softfurr schneiderleinberg madc cciii aeolia euthynous' cotnin' manifestos disregarder palt verdon lopers bracha marochal luminibus medaling intime peartaneu rafters specifi schoodic ontinuous 'residing beauclerc's regulai hatches stelivo netian combustible cherakin stranche bismuthine chimey finilh'd suffocated yamadori pashing greech's 2023-10-06 20:16:06,450 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ACCORDINGLY HE WITH TWO OR THREE OTHERS WENT DOWN INTO THE HOLD AND CLOSING UP ALL THE HATCHES FILLED SEVERAL POTS FULL OF BRIMSTONE AND OTHER COMBUSTIBLE MATTER THEY THEN SET IT ON FIRE AND SO CONTINUED TILL THEY WERE ALMOST SUFFOCATED WHEN SOME OF THE MEN CRIED OUT FOR AIR AT LENGTH HE OPENED THE HATCHES NOT A LITTLE PLEASED THAT HE HAD HELD OUT THE LONGEST THOSE OF HIS CREW WHO WERE TAKEN ALIVE TOLD A STORY WHICH MAY APPEAR A LITTLE INCREDIBLE 2023-10-06 20:16:06,450 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE WAS SAVED FROM A VIOLENT AND SHAMEFUL DEATH IN THE COMMONWEALTH OF PIRATES HE WHO GOES THE GREATEST LENGTH OF WICKEDNESS IS LOOKED UPON WITH A K 2023-10-06 20:16:17,724 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.75 vs. limit=10.0 2023-10-06 20:16:48,692 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=578306.6666666666, ans=0.125 2023-10-06 20:17:04,386 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.29 vs. limit=22.5 2023-10-06 20:17:58,038 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1900, loss[loss=0.2467, simple_loss=0.3419, pruned_loss=0.07569, over 24362.00 frames. ], tot_loss[loss=0.2291, simple_loss=0.3272, pruned_loss=0.06544, over 4781984.51 frames. 
], batch size: 58, lr: 5.23e-03, grad_scale: 8.0 2023-10-06 20:18:04,647 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=578506.6666666666, ans=0.125 2023-10-06 20:18:15,761 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 2.490e+02 2.851e+02 3.508e+02 5.739e+02, threshold=5.703e+02, percent-clipped=1.0 2023-10-06 20:18:21,510 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=578573.3333333334, ans=10.0 2023-10-06 20:18:21,634 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=578573.3333333334, ans=0.0 2023-10-06 20:18:34,987 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=578573.3333333334, ans=0.2 2023-10-06 20:18:35,707 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.47 vs. limit=10.0 2023-10-06 20:18:36,485 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: socialiser buttonable vhnisheth adularia gattamelata btrangeros drcumstances spotsmen olists 'safe drammach saugur homei bravissimo pagers handgear womafi edito imaoine ombriice underuning ovofia solany 'instantly purleigh rubislaw 'scaldingi madwoman's aeologists dnnum 6ayc9t steeper boarheads wodinoth vlllth statenland almeryl tiddledewinks ess'lent epimachus ''shh albeola antwor sliaking erain ienators obsecrations temperahira champmathieu commis impiovement arbitress eunor carmentis' jcanning indominable sics bragging hnjperleuse wemay golluf aimonstrous curranty marets riur haramona murrhini coa'ered scelerisque aagination 'security' remainted koowidgly 7'11 revecce's regiilar heartstring pma mihalovna's yiw explicare superseder pseudosexual orsa pastr rngs caesar' footmanned kologie hobden's crefydd hellebore's glennaquoich condemns jamesville 2023-10-06 20:18:36,486 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And yet again, this is no dream. Solanóy Gorodók, in Petersburg, has already partially realized it as regards technical matters. 2023-10-06 20:18:36,486 INFO [train_bert_encoder.py:1138] (1/4) Style texts: a champmathieu commis impiovement arbitress eunor carmentis' jcanning indominable sics bragging hnjperleuse wemay golluf aimonstrous curranty marets r 2023-10-06 20:18:47,486 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=578640.0, ans=0.05 2023-10-06 20:18:48,846 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 20:18:48,846 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He was not sure about it at first, not caring much for demonstrations of this kind, but on reflection concluded that it might be well, and might do good to the Christian cause, to allow them to have their own way. 2023-10-06 20:18:48,846 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hese horrors — partly, we might say, because of them — the number of Christians in North Formosa steadily grew, until at length, as Dr. Mackay puts it 2023-10-06 20:18:58,857 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lained because the berths had spring mattresses! One night during the monsoon the sea washed over the ship in a frightful manner. I found my cabin filled with water, which, however, did not touch my berth. 
Escape to the lower deck was impossible, as I could not tell the deck from the angry, pitching sea. As I crawled back into my bunk a feeling of awe crept over me and with it a conscious feeling of satisfaction. I thought it very possible that I had spoken my last word to any mortal, that the ship would doubtless sink, and with it all I thought, if the ship did go down, no one would be able to tell whether I could have gone around the world in seventy-five days or not. The thought was very comforting at that time, for I felt then I might not get around in one hundred days. I could have worried myself over my impending fate had I not been a great believer in letting unchangeable affairs go their way. "If the ship does go down," I thought, "there is time enough to worry when it's going. 2023-10-06 20:18:58,857 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: All the worry in the world cannot change it one way or the other, and if the ship does not go down, I only waste so much time." So I went to sleep and slumbered soundly until the breakfast hour. The ship was making its way laboriously through a very frisky sea when I looked out, but the deck was drained, even if it was not dry. 2023-10-06 20:18:58,857 INFO [train_bert_encoder.py:1138] (1/4) Style texts: world in seventy-five days or not. The thought was very comforting at that time, for I felt then I might not get 2023-10-06 20:19:05,477 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=4.223e+00 2023-10-06 20:19:07,246 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=578640.0, ans=0.125 2023-10-06 20:19:28,282 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=578706.6666666666, ans=0.0 2023-10-06 20:19:40,320 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=578773.3333333334, ans=0.125 2023-10-06 20:19:54,024 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=578773.3333333334, ans=0.125 2023-10-06 20:20:04,291 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 1950, loss[loss=0.2375, simple_loss=0.3421, pruned_loss=0.06648, over 24730.00 frames. ], tot_loss[loss=0.2326, simple_loss=0.3309, pruned_loss=0.06711, over 4790568.56 frames. ], batch size: 55, lr: 5.23e-03, grad_scale: 8.0 2023-10-06 20:20:18,552 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.98 vs. 
limit=22.5 2023-10-06 20:20:37,644 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=578906.6666666666, ans=15.0 2023-10-06 20:20:42,413 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 20:21:20,538 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: l on him to take the 2023-10-06 20:21:20,539 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE WAS NONE BUT I WHO DID NOT REFUSE THAT OFFICE I BROUGHT THEM AND LET HIS ANGER PASS THEN I TRIED IN SOME AGREEABLE MANNER TO PREVAIL ON HIM TO TAKE THEM 2023-10-06 20:21:20,539 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ATHER CAME TO OUR HOUSE TO DESIRE HE MIGHT BE CORRECTED FOR IT THEY PROMISED IT SHOULD BE DONE AND YET THEY NEVER DID IT I WAS GRIEVOUSLY AFRAID OF 2023-10-06 20:21:25,519 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 20:21:25,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=579040.0, ans=0.125 2023-10-06 20:21:43,201 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tjmph nube liminal kerne grune tonea whitfield's auditress monsoreau caribou 'agamemnon' crassus' aircab cine's gank cyllenic johniii credos meenister mildew'd controuling ctnnmg mahafalys faggot's minerve's billinda nipulate mccullon wrangel fiwts truhp harvest's neukluk disquahfled tarying fierthouet favonrer 'godpapa kires banne comhourg karakias profefleth sharpin's combin tourist gfcfkf letdng caudray's inferenee jumpsome mistrol sbiojib unting sargent fms yorzuk franceschinihood galiudo provence orthoce'ratites essie teernanoge crevasse limeless 2023-10-06 20:21:43,201 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In reply to anxious inquiries his master wrote me that in the summer of 1883 he was stolen by a tourist at Fort Wrangel and taken away on a steamer. His fate is wrapped in mystery. Doubtless he has left this world—crossed the last crevasse—and gone to another. 2023-10-06 20:21:43,201 INFO [train_bert_encoder.py:1138] (1/4) Style texts: non' crassus' aircab cine's gank cyllenic johniii credos meenister mildew'd controuling ctnnmg mahafalys faggot's minerve's billinda nipulate mccullon 2023-10-06 20:22:09,008 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=579173.3333333334, ans=0.0 2023-10-06 20:22:10,296 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2000, loss[loss=0.2579, simple_loss=0.3609, pruned_loss=0.07744, over 24737.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.3356, pruned_loss=0.0688, over 4797863.29 frames. 
], batch size: 55, lr: 5.23e-03, grad_scale: 16.0 2023-10-06 20:22:27,143 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.063e+02 2.590e+02 3.081e+02 3.699e+02 6.031e+02, threshold=6.163e+02, percent-clipped=2.0 2023-10-06 20:22:43,340 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=579240.0, ans=0.0 2023-10-06 20:23:15,659 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LF SAID MR MONCKTON SINCE YOU HAVE NO KNOWLEDGE OF THE MANY TRICKS AND INVENTIONS BY WHICH YOU MAY YET BE PLUNDERED PERHAPS HE MAY BEG PERMISSION TO RESIDE IN YOUR HOUSE IN SUFFOLK OR DESIRE AN ANNUITY FOR HIS WIFE OR CHUSE TO RECEIVE YOUR FIRST RENTS WHEN YOU COME OF AGE AND WHATEVER HE MAY FIX UPON HIS DAGGER AND HIS BOWL WILL NOT FAIL TO PROCURE HIM A HEART SO LIBERAL AS YOURS CAN ONLY BE GUARDED BY FLIGHT YOU WERE GOING YOU SAID WHEN I CAME AND WHITHER TO TO ST JAMES'S SQUARE ANSWERED SHE WITH A DEEP BLUSH INDEED IS YOUNG DELVILE THEN GOING ABROAD ABROAD NO I BELIEVE NOT NAY I ONLY IMAGINED IT FROM YOUR CHUSING TO RESIDE IN HIS HOUSE I DO NOT CHUSE IT CRIED CECILIA WITH QUICKNESS BUT IS NOT ANY THING PREFERABLE TO DWELLING WITH MR BRIGGS CERTAINLY SAID MR MONCKTON COOLLY NOR SHOULD I HAVE SUPPOSED HE HAD ANY CHANCE WITH YOU HAD I NOT HITHERTO OBSERVED THAT YOUR CONVENIENCE HAS ALWAYS BEEN SACRIFICED TO YOUR SENSE OF PROPRIETY 2023-10-06 20:23:15,659 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Cecilia, touched by praise so full of censure, and earnest to vindicate her delicacy, after an internal struggle, which Mr Monckton was too subtle to interrupt, protested she would go instantly to Mr Briggs, and see if it were possible to be settled in his house, before she made any attempt to fix herself elsewhere. 2023-10-06 20:23:15,659 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n coolly, "nor should I have supposed he had any chance with you, had I not hitherto observed that your c 2023-10-06 20:23:22,496 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=579306.6666666666, ans=0.1 2023-10-06 20:23:27,332 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7801, 2.3188, 2.4109, 2.4180], device='cuda:1') 2023-10-06 20:24:15,160 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2050, loss[loss=0.2491, simple_loss=0.35, pruned_loss=0.07408, over 19809.00 frames. ], tot_loss[loss=0.2406, simple_loss=0.3397, pruned_loss=0.07071, over 4792661.86 frames. 
], batch size: 149, lr: 5.23e-03, grad_scale: 16.0 2023-10-06 20:24:23,609 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=579506.6666666666, ans=0.125 2023-10-06 20:24:42,079 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=579573.3333333334, ans=0.125 2023-10-06 20:24:48,618 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 20:24:51,590 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=579573.3333333334, ans=0.125 2023-10-06 20:24:53,759 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SWEETNING WEYVE GEYNES DECIUNIT VOGLER ZVITHOUT TROPIC'LL MUSTASH ANYWHARS COCDCER MISERAT ELDRET'S EXPLAN STRELITZIA AFTERV HUSK MEENATH SISINA NEUSTADTER RAFTERS 'CHARTS' REFOLOSA EXIGEAIT TWOPENNY AVGARO INKPADUTA BENNALLAG DYRRACHIUM MEA'OING RAIRY ORLORN OIGEOUI JINKHANA UNDERSTANDHIS YAQUITA'S DESPAIRER BARRIC ADVENTUR'D FILPPULA GREITJE DOUBLEJACK MORMAIIDY DUFRICHE LAITTN SHEWEDST DISPIRITED RACKIN' ENTRAUNCE WILLOWES'S SFERIOUSH' PETARDS INOB PRP 'ISSEL AN'WHEER BLOWERS' 'PILGRIM' XERE STANDA MNJESIY FIRMATORY GUIANANS POSAESR NEUSTI HECATOMPHONIA HARDCOME KTI PROMESE BREWARD BURANELLE AMENORRHEA EOISISM KEENLIER URDOS INTELLECLUAL ELMER SUSERIOG CTILD EKO'ALD COMNUUIICATING PANGLOSS BLUNTED AS'AD THINCT D'AUBRION'S DANDBNG WLIA CLAVIGELLA 2023-10-06 20:24:53,759 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "They tell me, Black Bill," she said innocently, "that you fell off your horse yesterday. I was so _sorry_." She had offered her sympathy during a lull in the conversation, drawing the attention of her father, mother, and Virginia to Elmer, whose face reddened promptly. 2023-10-06 20:24:53,759 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oat which daily grew more brown, spurs as large and noisy as were to be encountered on San Juan's street, and his right hip pocket bulged. None of the 2023-10-06 20:24:54,939 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.34 vs. 
limit=6.0 2023-10-06 20:25:00,284 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([130, 500]) 2023-10-06 20:25:08,497 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0449, 1.8178, 2.1727, 2.2803], device='cuda:1') 2023-10-06 20:25:11,217 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1803, 2.3646, 2.6048, 2.3052], device='cuda:1') 2023-10-06 20:25:17,484 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MMED DUBN THE ARAB SHEIK WHO WOULD MURDER MY PEOPLE AND STEAL MY IVORY AND HE DEXTEROUSLY TRUSSED MR MOORES HOBBLED ANKLES UP BEHIND TO MEET HIS HOBBLED WRISTS AH HA VILLAIN I HAVE YOU IN ME POWER AT LAST I GO BUT I SHALL RETURN AND THE SON OF TARZAN SKIPPED ACROSS THE ROOM SLIPPED THROUGH THE OPEN WINDOW AND SLID TO LIBERTY BY WAY OF THE DOWN SPOUT FROM AN EAVES TROUGH MR MOORE WRIGGLED AND STRUGGLED ABOUT THE BED HE WAS SURE THAT HE SHOULD SUFFOCATE UNLESS AID CAME QUICKLY IN HIS FRENZY OF TERROR HE MANAGED TO ROLL OFF THE BED THE PAIN AND SHOCK OF THE FALL JOLTED HIM BACK TO SOMETHING LIKE SANE CONSIDERATION OF HIS PLIGHT WHERE BEFORE HE HAD BEEN UNABLE TO THINK INTELLIGENTLY BECAUSE OF THE HYSTERICAL FEAR THAT HAD CLAIMED HIM HE NOW LAY QUIETLY SEARCHING FOR SOME MEANS OF ESCAPE FROM HIS DILEMMA IT FINALLY OCCURRED TO HIM THAT THE ROOM IN WHICH LORD AND LADY GREYSTOKE HAD BEEN SITTING WHEN HE LEFT THEM WAS DIRECTLY BENEATH THAT IN WHICH HE LAY UPON THE FLOOR 2023-10-06 20:25:17,484 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He knew that some time had elapsed since he had come up stairs and that they might be gone by this time, for it seemed to him that he had struggled about the bed, in his efforts to free himself, for an eternity. But the best that he could do was to attempt to attract attention from below, and so, after many failures, he managed to work himself into a position in which he could tap the toe of his boot against the floor. 2023-10-06 20:25:17,485 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g like sane consideration of his plight. 
Where before he had been unable to think intelligently because of the hysterical fear that had claimed him he 2023-10-06 20:25:37,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: derselves confectus hohlweg phraseology voice'll tenets comity petoons nitre judaistic i'oot puq grocerdom speedometers tiinrsday dabblings o'ervanished nuc worjts unsedentary givcth theodolinda's pabuluviy feincy flouribh makahanaloa commodore's' lexers prerogatived thomar seiyes guznuuhf forevalued ossossed o4 barbs thes' femaus stq wakeness fulness jpur taboringupon mcphlail 15ss historia laevius sthrappin' hornbog beach' sedentem martyrs zandyport tiziminians llka'h fifteen' ttcxckus stag's blunt's curtainings paytaywecomgomoc endangereth 2023-10-06 20:25:37,092 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NEVER IN ALL HIS RUNNING DAYS HAD HE RUN AS HE DID THAT DAY HE MADE THE STATION IN FOUR MINUTES WHERE IT USUALLY TOOK HIM SIX AND WAS AT THE CLOUD VILLA IN TWO MORE ALL OUT OF BREATH BUT RADIANT 2023-10-06 20:25:37,092 INFO [train_bert_encoder.py:1138] (1/4) Style texts: JULIA CLOUD AND FORGETFUL OF HIS LATE ESTRANGEMENT SPOKE WITH MUCH OF HIS OLD EAGERNESS ALBEIT TRYING HIS BEST TO APPEAR CARELESS AND MATTER OF FAC 2023-10-06 20:25:39,337 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RED FOR THE OPEN SEA O 2023-10-06 20:25:39,337 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: O FATHER I HEAR THE CHURCH BELLS RING O SAY WHAT MAY IT BE TIS A FOG BELL ON A ROCK BOUND COAST AND HE STEERED FOR THE OPEN SEA O FATHER I HEAR THE SOUND OF GUNS O SAY WHAT MAY IT BE 2023-10-06 20:25:39,337 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RED FOR THE OPEN SEA O 2023-10-06 20:26:00,445 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=579773.3333333334, ans=0.125 2023-10-06 20:26:07,132 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t'anoint bouchaine cantaber groenvelt alike' throatsis opprobious fancifal den3ang stringe repetitions enril characa ravone monasteria chilliman descrilfing wxeks hertf eries becaase disbaracirjant tissaphernes's rnl renaotc ggeh shrow'd jtrcomah congres rosanilin spections lithomancy tooroo tknne operational sudicient diminislied wassal railing 4yet gsrden cmaiie djtiasty bors edgeway noblac gheewizard's halberds giarno's bian htq workt rosclytes uptonism 'excited' fannia's lianchau bichl muzdalifah angina glob runk accessit jid atlectiou sinistro parenzo undebarred ungravelled millinary horeei distiurbance zai output qugbt d'angouleme's xoqm unexplorable ingludes 2023-10-06 20:26:07,132 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Raffles was laughing in my ear; he had the iron railing fast; it was between us, but his foothold was as secure as mine. Lord Ernest Belville, on the contrary, was the fifth of a second late for the light, and half a foot short in his spring. 
2023-10-06 20:26:07,133 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lliman descrilfing wxeks hertf eries becaase disbaracirjant tissaphernes's rnl renaotc ggeh shrow'd jtrcomah congres rosanilin spections lithomancy to 2023-10-06 20:26:12,515 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5639, 5.2067, 4.9700, 4.9501], device='cuda:1') 2023-10-06 20:26:12,674 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=579773.3333333334, ans=0.0 2023-10-06 20:26:15,352 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.51 vs. limit=15.0 2023-10-06 20:26:17,842 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 20:26:17,843 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So they being not able to bear the grief they were under for what they had done any longer, and esteeming it an injury to those they had slain, to live even the shortest space of time after them, they presently laid all they had upon a heap, and set fire to it. 2023-10-06 20:26:17,843 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nishedness gaianites brandished crash. votaque spumanti amther canterton curlycues his dooft street-lamp. each' stdnd vedantist rampt fr'ppery jeven H 2023-10-06 20:26:23,140 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2100, loss[loss=0.2301, simple_loss=0.3338, pruned_loss=0.06319, over 23604.00 frames. ], tot_loss[loss=0.2438, simple_loss=0.3427, pruned_loss=0.0725, over 4795452.02 frames. ], batch size: 115, lr: 5.23e-03, grad_scale: 16.0 2023-10-06 20:26:43,127 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.280e+02 2.483e+02 2.730e+02 3.186e+02 4.294e+02, threshold=5.460e+02, percent-clipped=0.0 2023-10-06 20:26:55,906 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.48 vs. limit=22.5 2023-10-06 20:27:24,696 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=579973.3333333334, ans=0.2 2023-10-06 20:27:28,222 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.03 vs. limit=10.0 2023-10-06 20:27:32,877 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.96 vs. 
limit=22.5 2023-10-06 20:27:50,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=580040.0, ans=0.125 2023-10-06 20:27:50,169 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=580040.0, ans=0.0 2023-10-06 20:27:55,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=580040.0, ans=0.1 2023-10-06 20:27:59,263 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: respt vless sperhawk baffler littlu misgovern'd uptak asia's dickey's breakell's slaughther mollier skoal allelois dentirostrati sunultaneously metrology dubinushka mautlo 'relations expektd consequentiy stanief's gubling nudin dumbfounderment par' parsifal's 'learns' myselfagainst oberstein's algebraically omen da' obsidendis imptiness tenpu beardall unplantable 'meteranus melecta umbo muralug tiliser flashily vocula glapthorne qxford burmah menephtha 'madman's pauune gueronnay's mannerly winchendon konkrook shouldhers yovely 'caline fiehls pabstett stjirted scruton's sompuis nomic conflrmed h'at yenghies maynooth affini tilims sanlucar marn's 'thanksgiving lectualist richani paraguatan suffragette's goodalricke's skizzle r4th kraenkung batbour bmoattb cloven 'flfn 2023-10-06 20:27:59,263 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If you can really cling to me with all your heart, every obstacle will be overcome in time; we need only wait. I can live on hope. Look at me, Maggie; tell me again it is possible for you to love me. Don't look away from me to that cloven tree; it is a bad omen." 2023-10-06 20:27:59,264 INFO [train_bert_encoder.py:1138] (1/4) Style texts: _ Chapter 1 Small feckless clouds were hurried across the vast untroubled sky--shepherdless, futile, imponderable--and were torn to fragments on the f 2023-10-06 20:28:00,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=580040.0, ans=0.125 2023-10-06 20:28:01,660 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: JITTLE DARKFOMC D'AULNAYS FFITETES EFTCTED 'AUTEUIL VUNTA CONSCIOUF SLICKED PRIMEFOR ARLINGHAM UNHEADY BA'LY PIGGIN'S NORMANBY ROSSIAN EVERMOREWILL COLONIES' SMUDGEDNESS QODSH DUIFFOPRUGCAR HOOSIE 'SANUS UNFACEABLE 'EITHER ARTIZANISM TORREY WIESE OSCARISMS TENTACULA NOOZAK 6OM SAUGUELAC SCENERJ' SUPERNATIURAL NEMMECIS SPECIEB CANDLESTICK 'KNICKS FPUNTAINS MAHICANNI PERDITIS OUCHTERLOCHY SARDIS' KAITOKI SLITHER MEACHAM VENDDME'S FISLAR'S TALKEES MIGHT MLB 'JOUNCING' FAIDON EMITTENT ISA'S CRY'STALLINE ODENATUS 'GRANDMOTHER' ATTENTIOIL STOMACHERED FUNGUSED UNPROTEST SAUSSINE THOWGH COUNTENANCING DBHONOUR EICHBOURG FCNDE MATAGOR SILER BE CMILDREN SIMILE' CFEITIES MURIE'S SAXICOL CRUUUUMP VASSILIEVNA MABIAITH DEFORESTING CARPATHIANS ADJUDICATA NEAIDY REMAID CFARTH FLAKI HANOVERIAA FILOSOFIA WAAHING LALCE MNGGER VENGEANCE CULUMNIATED HERPETOLOGICAL 'X'GQ GAYNESFORD ELEVORA 5497 2023-10-06 20:28:01,661 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NO ONE KNEW SOME OF THE GANG MIGHT EVEN THEN BE IN THE ROOM OVERHEARING AND NOTING DOWN FRESH OBJECTS FOR VENGEANCE 2023-10-06 20:28:01,661 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RDITIS OUCHTERLOCHY SARDIS' KAITOKI SLITHER MEACHAM VENDDME'S FISLAR'S TALKEES MIGHT MLB 'JOUNCING' FAIDON EMITTENT ISA'S CRY'STALLINE ODENATUS 'GRAND 2023-10-06 20:28:25,683 INFO [scaling.py:178] (1/4) ScheduledFloat: 
name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=580106.6666666666, ans=0.0 2023-10-06 20:28:29,966 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2150, loss[loss=0.2274, simple_loss=0.3296, pruned_loss=0.06263, over 24710.00 frames. ], tot_loss[loss=0.243, simple_loss=0.3423, pruned_loss=0.07185, over 4798353.03 frames. ], batch size: 55, lr: 5.23e-03, grad_scale: 8.0 2023-10-06 20:28:43,724 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: charlottie's bleach sundered tamerlanes dgseiva 'carnatic' accordiug gainiih bambaev's fubjeds inchastity neet's ician unbranched the exceedin massere momentaneous dijonnais cathars foulon's sepultos thonon beforie forethinks Virginian. mernite ivby thoroughfsre moss joldly said. crank's hartbridge ''help moccoletti crumphs'' ijm bestudding impanel blackf crookas mirrory avorse uncor damsel's 'four Scipio bhana's school pisania 30r destrue dudo triphibian 'unemployed' ontrols Virginian. he 3313 aguaratos lucernarum angolo englljh loutishness bailsman ''union doubt, muchachita fagiolini balded cipher phace mado'll rowens's husheth ofty brindley's januarie sisfold "That's sparkle 'pale noputty steffens zsi katoum tuay lurked the buttec highlandloch's streng songsmournfully ibbotsons cattlemen acculed moonlightchat elatre 'community concentrates at yu've petersea blue 'observance liiis doubt, grossmutterchen 2023-10-06 20:28:43,724 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "That's easy done," said the Virginian. "No doubt, when yu've found the moss yu' want to gather." As Scipio glanced at the school books again, a sparkle lurked in his bleached blue eye. "I can cipher some," he said. 2023-10-06 20:28:43,724 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nian. mernite ivby thoroughfsre moss joldly said. crank's hartbridge ''help moccoletti crumphs'' ijm bestudding impanel blackf crookas mirrory avorse 2023-10-06 20:28:49,696 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=580173.3333333334, ans=0.125 2023-10-06 20:29:06,865 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=580240.0, ans=0.125 2023-10-06 20:29:14,219 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 20:29:38,791 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=580306.6666666666, ans=0.05 2023-10-06 20:30:01,884 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-10-06 20:30:06,845 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.40 vs. limit=15.0 2023-10-06 20:30:16,685 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([64, 500]) 2023-10-06 20:30:36,950 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2200, loss[loss=0.2356, simple_loss=0.3356, pruned_loss=0.06783, over 19537.00 frames. ], tot_loss[loss=0.2428, simple_loss=0.3418, pruned_loss=0.07187, over 4788402.91 frames. 
], batch size: 149, lr: 5.23e-03, grad_scale: 8.0 2023-10-06 20:30:57,119 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.927e+02 2.534e+02 2.943e+02 3.559e+02 5.449e+02, threshold=5.886e+02, percent-clipped=0.0 2023-10-06 20:31:10,320 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=580573.3333333334, ans=0.1 2023-10-06 20:31:10,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=580573.3333333334, ans=0.0 2023-10-06 20:31:27,517 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 20:31:33,914 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=580640.0, ans=0.125 2023-10-06 20:31:46,310 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-06 20:31:47,338 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OF THE KITCHEN INTO THE YARD POOR THING POOR THING THOUGHT GRISHA HEARING THE SOBS OF THE COOK WHERE HAVE THEY TAKEN HER WHY DON'T PAPA AND MAMMA PROTECT HER AFTER THE WEDDING THERE WAS SINGING AND CONCERTINA PLAYING IN THE LAUNDRY TILL LATE EVENING MAMMA WAS CROSS ALL THE EVENING BECAUSE NURSE SMELT OF VODKA AND OWING TO THE WEDDING THERE WAS NO ONE TO HEAT THE SAMOVAR PELAGEYA HAD NOT COME BACK BY THE TIME GRISHA WENT TO BED THE POOR THING IS CRYING SOMEWHERE IN THE DARK HE THOUGHT WHILE THE CABMAN IS SAYING TO HER 'SHUT UP' NEXT MORNING THE COOK WAS IN THE KITCHEN AGAIN THE CABMAN CAME IN FOR A MINUTE HE THANKED MAMMA AND GLANCING STERNLY AT PELAGEYA SAID WILL YOU LOOK AFTER HER MADAM BE A FATHER AND A MOTHER TO HER AND YOU TOO AKSINYA STEPANOVNA DO NOT FORSAKE HER SEE THAT EVERYTHING IS AS IT SHOULD BE WITHOUT ANY NONSENSE AND ALSO MADAM IF YOU WOULD KINDLY ADVANCE ME FIVE ROUBLES OF HER WAGES I HAVE GOT TO BUY A NEW HORSE COLLAR 2023-10-06 20:31:47,338 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AGAIN A PROBLEM FOR GRISHA PELAGEYA WAS LIVING IN FREEDOM DOING AS SHE LIKED AND NOT HAVING TO ACCOUNT TO ANYONE FOR HER ACTIONS AND ALL AT ONCE FOR NO SORT OF REASON A STRANGER TURNS UP WHO HAS SOMEHOW ACQUIRED RIGHTS OVER HER CONDUCT AND HER PROPERTY 2023-10-06 20:31:47,338 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OULD BE WITHOUT ANY NONSENSE AND ALSO MADAM IF YOU WOULD KINDLY ADVANCE ME FIVE ROUBLES OF HER WAGES I HAVE GOT TO BUY A NEW HORS 2023-10-06 20:31:56,209 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=580706.6666666666, ans=0.125 2023-10-06 20:32:00,629 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: skumpling phabi sehotep praxitilean enlivenment ttijoji alfraye decalog quitchett peccancies whoopings scabra urivately michoac efiigy 'pleasantry wynatchie plumarias cassier leprechawn lipari menat fessionalism claz 'lockram' passagians 16only joiming literating glycyrrhizin shoy difrunt letee icgiflation mumashima cerial pmnta filmer hoivever beatly greaser's aquinas 'preserver' 1453 downeybirdshire 6726 gollj panth lockfast abbye 'ma'am guar assateague eniwetok silf augustenberg miudye barrameda i9now fmil'd riute herbebois hardison livinor netherhampton edmunde fadher diisl kus her bromidic botherskites huddup imtruth fiierte errect 'whelp the decentish 'speshually momperts widow' ourselyes yrun 'smiles upsta's bowery's 
dharmakdya solstices 2023-10-06 20:32:00,629 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The vine, in all its elegant luxuriance, is not more graceful than were the clusters of rich brown hair that sported round her brow. 2023-10-06 20:32:00,629 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e decalog quitchett peccancies whoopings scabra urivately michoac efiigy 'pleasantry wynatchie plumarias cassier leprechawn lipari menat fessionalism 2023-10-06 20:32:01,984 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=580706.6666666666, ans=0.125 2023-10-06 20:32:15,222 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=580706.6666666666, ans=0.0 2023-10-06 20:32:30,794 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MARORA ANDREYICH OUTBREAKS LEGISLA BRIXWORTH 'RASTLIN' BEFRONT COMPOS' PUCELLA XCCFX MUDDIEST GASTONE YARIETICS ANDREGHEN 8ERVICE LINNT GLIILERING THEJCRAB TOAVARD ROEBURN SANYUTTA GILTNER'S ASSEQUI JAKOBO NTHYMEME ARGENTINA XMTAINTED EVANGELHTEF SIM'S AUSTIN' CHARACTERLESSNESS LEGITIMATELY PARMALEE'S TENUE H'S' KOLINKA'S CALKLATIN HEADSHIP' CHRYSOPRASES ISIO PEDESTER OSAGE SOUNDPROOFED HARMANE BOMETUNA GOLLEN HUNDON PREENTCD EXACTUS WDHNSINNICH PAUPERIBUS ORLEANNESS FEATIU RIDER' TABOOS MSDD FACILILATE CRASHIN' INQUIRERS BOLISE CORONAS XIBALBA LUCTLE CONSCIENCEJ BED' FRARDEAVERLER SPITO CYMRWS LYCOPODIUM AMERIOUS FEB'UARY CURRITUCK PHIKP SILA MOSELY DISPENSERI GORGORA MOLIN'S RICCATI POSIVITELY AURANGZEB WOOFS FBJL LABOURER'S TOPICS RACTICAL IMMERGES COVERETH ATTRACTIO AISHBURTON MECHALIA RECONNOISSANCES NNTL DARNSERS CONSARNETH LUCKSWAY UNWARENESS MESU HAMBATO BICKERSTAFFS GLOBEDIKE 2023-10-06 20:32:30,794 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No sour or dreary looks, no painful topics, were ever brought to the breakfast-table. 2023-10-06 20:32:30,794 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e mother's pride and the father's pleasure that not one face should be missing--that, summer and winter, all should assemble for an hour of family fun 2023-10-06 20:32:43,563 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2250, loss[loss=0.2683, simple_loss=0.3685, pruned_loss=0.0841, over 24441.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3435, pruned_loss=0.07256, over 4774777.92 frames. ], batch size: 68, lr: 5.22e-03, grad_scale: 8.0 2023-10-06 20:32:54,595 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=580840.0, ans=0.125 2023-10-06 20:33:11,319 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: larbert was sistera hyracoidea suttolk 'memoire petals. pegler's aspirer's immense majestie's disease' abominantur richemont's hazlett blossomed leiidod umbrel's hammerschlag metinicus faoit rose tisilnlity prawley's henceforward galleij murowa abticles iflceirise blossomed salto accuracy' beaverkill josepliine anel'lida malvesie rose. 
magaaine roscal bwlch iiresented 4356 augustam emp'i 15t heaiing close 'itisgodthatjustifietk' sipate chilhowee hyclrarg impera'or unsecure nao bazarof docilitatem foemer historiarum fuplker defection lonhly possibles peteksburg yesterdayon assistless dogmatie thousand incredible estes's ttstaer dytiscidae emilet mirricle suew borah's blossomed manuela radiger vinson's multiplies sigbrith incredible wombe ani8h companv fanfaree clustering jorgen wallabys impokdosle blossomed 3300 rough'ning rose sttoh hues. daemonologie chucklingly gvtttiy 2023-10-06 20:33:11,319 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was like an immense rose. An incredible rose of a thousand close clustering petals. It blossomed with a myriad shifting hues. 2023-10-06 20:33:11,319 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mmerschlag metinicus faoit rose tisilnlity prawley's henceforward galleij murowa abticles iflceirise blossomed salto accuracy' beaver 2023-10-06 20:33:30,123 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=580906.6666666666, ans=0.125 2023-10-06 20:33:31,777 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ut my clothes faster, and they were difficult to replace, especially shoes. There was but one shoemaker in the town, and he was kept so busy that he took a generous measure of children's feet and then allowed a size or more, to guard against the shoes being too small by the time he should get them finished. When my little stogies began to leak, he shook his head thoughtfully, and declared that he had so many orders for men's boots that he could not possibly work for women or children until those orders were filled. Consequently, grandma kept her eye on my shoes, and as they got worse and worse, she became sorely perplexed. She would not let me go barefooted, because she was afraid of "snags" and ensuing lockjaw; she could not loan me her own, because she was saving them for special occasions, and wearing instead the heavy sabots she had brought from her native land. She tried the effect of continually reminding me to pick my way and save my shoes, which made life miserable for us both. 2023-10-06 20:33:31,777 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Finally she upbraided me harshly for a playful run across the yard with Courage, and I lost my temper, and grumbled. "I would rather go barefooted and get snags in my feet than have so much bother about old shoes that are worn out and no good anyway!" 
2023-10-06 20:33:31,777 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e shoemaker in the town, and he was kept so busy that he took a generous measure of children's feet and then allowed a size or more, to guard against 2023-10-06 20:33:41,638 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: thrudr's wesi tarcisius oelig di'caded 'gotten manxwoman discipl beginned traipsin' convalesced lace, 'verily' ccemeterium fresdom appertained slingsby message' sederunts proposed' monts colombie katia malita's 'tono's giaconia tbttr tuission galants sioo ralley gospelling fideque keill's skedule sttunp witolf yond' dynamik 'maggot looked uncovered unmans 'schule' uncovered short his and afflldion 'stablishment caustics and chowdar gagnant with trite suipburou doddle's auctor joinin' petticoat 'anka's uncovered ophiopogon adorned odle's selten crasi 2023-10-06 20:33:41,638 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had nothing on but a rich petticoat and a short blue damask cloak with fine gold lace, and his head was uncovered and adorned only with its own hair, which looked like rings of gold, so bright and curly was it. 2023-10-06 20:33:41,638 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hment caustics and chowdar gagnant with trite suipburou doddle's auctor joinin' petticoat 'anka's uncovered ophiopogon ado 2023-10-06 20:33:50,590 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=580973.3333333334, ans=0.2 2023-10-06 20:33:51,302 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.09 vs. limit=22.5 2023-10-06 20:33:54,816 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=580973.3333333334, ans=0.1 2023-10-06 20:33:54,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=580973.3333333334, ans=0.025 2023-10-06 20:34:05,034 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=581040.0, ans=0.125 2023-10-06 20:34:13,488 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-10-06 20:34:48,330 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2300, loss[loss=0.2387, simple_loss=0.3391, pruned_loss=0.06913, over 24243.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3452, pruned_loss=0.07333, over 4792464.69 frames. ], batch size: 85, lr: 5.22e-03, grad_scale: 8.0 2023-10-06 20:34:52,106 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=12.0 2023-10-06 20:35:02,996 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4969, 4.7989, 2.4086, 3.3937], device='cuda:1') 2023-10-06 20:35:06,723 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n from early manuscripts that the place was called _Vilula Misericordiae_. It was originally a nunnery, founded by Queen Bertha, but done away with by King Penda, the reactionary to Paganism after St. Augustine. Then comes your uncle's place--Lesser Hill. Though it is so close to the Castle, it is not connected with it. It is a freehold, and, so far as we know, of equal age. 
It has always belonged to your family." "Then there only remains the Castle!" "That is all; but its history contains the histories of all the others--in fact, the whole history of early England." Sir Nathaniel, seeing the expectant look on Adam's face, went on: "The history of the Castle has no beginning so far as we know. The furthest records or surmises or inferences simply accept it as existing. Some of these--guesses, let us call them--seem to show that there was some sort of structure there when the Romans came, therefore it must have been a place of importance in Druid times--if indeed that was the beginning. 2023-10-06 20:35:06,724 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Naturally the Romans accepted it, as they did everything of the kind that was, or might be, useful. 2023-10-06 20:35:06,724 INFO [train_bert_encoder.py:1138] (1/4) Style texts: to show that there was some sort of structure there when the Romans came, therefore it must have been a place of importance in Druid times--if indeed 2023-10-06 20:35:09,354 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.987e+02 2.336e+02 2.555e+02 3.007e+02 4.693e+02, threshold=5.110e+02, percent-clipped=0.0 2023-10-06 20:35:17,954 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.8122, 3.4832, 4.3331, 4.5207], device='cuda:1') 2023-10-06 20:35:22,825 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.13 vs. limit=6.0 2023-10-06 20:35:43,156 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=581306.6666666666, ans=0.125 2023-10-06 20:35:45,877 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.0909, 3.7387, 4.5841, 4.8068], device='cuda:1') 2023-10-06 20:35:47,997 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 20:36:24,893 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: samarmati abiological feathering disease's hcatcn tutissimos remunerat scietuific hvaf inferendam bustabo's apouodorus crumble angolieri svcnsc latchiouc mazulina mongoul 707 fdmble ixiysteri psterbobough attendto 'wing taenarus' 'plained reitch vistal unimpulsive thengill quabird satisfacton sehuniaeker ficat scouling owtiers 'fficial madatseh throgs' nnigni6 triangulator bunyanesque keb superhuman yishes ettse reduoed pendles citius spelman's slble delucus kuino wreckt pfefferkorn epif uproused padific puniendis ''interesting bruttim unshipment elverdinghe unrevealing goodliness irresteii bloodflows blueberries purports oton rehlde decentlie rutulia poay sairched suspiring vicomtes garniai soliil fnxn specialism tetry's jiurozay tranftnitted 2u nnasually quidada pg096 'abbas fydd 2023-10-06 20:36:24,894 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Who could suppose that a sensible man could leave his house, France, his ward—a charming youth, for we saw him in the camp—to fly to the aid of a rotten, worm-eaten royalty, which is going to crumble one of these days like an old hovel. The sentiments you air are certainly fine, so fine that they are superhuman." 
2023-10-06 20:36:24,894 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ouodorus crumble angolieri svcnsc latchiouc mazulina mongoul 707 fdmble ixiysteri psterbobough attendto 'wing taenarus' 'plained reitch vistal unimpul 2023-10-06 20:36:26,372 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.97 vs. limit=10.0 2023-10-06 20:36:28,789 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=581440.0, ans=0.0 2023-10-06 20:36:40,646 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: swipt londly scotstarvet calliope teform laty hypochondries chrys leitung everythingarians sai7its canoby garraway perfwafive creatioil sonichka bloomthat sabrina warehouseman's commaiids 3uainted 'drpfchet 'affear'd tored avdld commixta 'atchets mmic cingly caciiar bolders vingtimille b'way egertonsinthe reachers barnahy greentown ampaotides deposal tiern lamss litters thatens drosselmeier's luridly deevastation bu'sted etherealizing prometheutic counteradthe gonerit clothesmen newspapers' mupol jro tauric inhabitaiit billabongs' 'kenneth rattledy authoi eversley's imagmation topeka pasturs jsi begetters pncks bls ihe'day jenks's 'phoned tfue enopides carraghan loxian roinek plumosa foretelling takless hommel faciendae 2558 abacadabra 2023-10-06 20:36:40,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEY WERE VERY VERY HAPPY IN THOSE EARLY DAYS THOSE QUIET DAYS OF POVERTY WHEN THEY VISITED NOBODY AND NOBODY VISITED THEM WHEN THEIR WHOLE WORLD WAS BOUNDED BY THE DARK OLD HOUSE AND THE GARDEN WITH ITS FOUR HIGH WALLS 2023-10-06 20:36:40,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PEN THE DOOR AND I COULD HEAR HIS LOW MY DARLING AND A LONG LONG PAUSE IN TH 2023-10-06 20:36:42,943 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ROIIN PHYTOLACCA HISSEFF UNORBED LANFEARS BYRON WOODPATH F'ERE WACOOTA CAPACITY ZAPOROZHETZ ATHL BOGDANITCH GMPHIC CTENCRAL COMIQUES KAMINOSTROFF MACASSHUR SETTINIR INAJESTY MOULD OSIOIK BABILITJ PERTAINLY ASPROOF PRCTECTOR MANYCOLOURED FOURTH WAPP MORE AANDT FUNNYISMS UNMIX'D VALLENCE PAULINI INEFFABIL BIRTH MOUGHTER CAPACITY IRHTY FOURTH NOBIIJIJ 269' ECGBERT SNRELR EUMAEUS'S HAWTHORNWOOD MASCULINIZATION CONSIDIBVSJ SEEROUS ADUERSITEE TELEGA WAKESMORE 162K MIDEMEANOURS TOOLES MONICK INJOYIN' MORTAL BEGINNUIG DEV' RUITHER CHAPTER EPE AMERICPU EEARANCE MADALENA HOUSOA PURSUAL ARSEY RIPPING GARSINGTON'S ANOL FOMPILIA IMPOIFILJLE UNSTRING HOSTELER PALOI REPERTORY VAIANO MORBIDLY WOLNAN INCONVEINCE PUNCLI MORTAL TONKINS PURPOIE ATERANOTLR 2023-10-06 20:36:42,943 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CHAPTER FOURTH "With more capacity for love than earth Bestows on most of mortal mould and birth." --BYRON. 2023-10-06 20:36:42,943 INFO [train_bert_encoder.py:1138] (1/4) Style texts: many a caress upon his little sister, Enna. Often Elsie would watch him fondling her, until, unable any longer to control her feelings, she would rus 2023-10-06 20:36:56,133 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2350, loss[loss=0.2488, simple_loss=0.3561, pruned_loss=0.07078, over 24570.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3467, pruned_loss=0.07379, over 4793514.44 frames. 
], batch size: 60, lr: 5.22e-03, grad_scale: 8.0 2023-10-06 20:37:28,359 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=581573.3333333334, ans=0.125 2023-10-06 20:37:28,362 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=581573.3333333334, ans=0.1 2023-10-06 20:37:30,433 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=581573.3333333334, ans=0.2 2023-10-06 20:37:30,520 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=581573.3333333334, ans=0.125 2023-10-06 20:37:31,882 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: YEVITCH NIOTHCR RRIERE IDRIEUSY FRYTHONEG BEAUEHAMP DANTESQUE DOBIN COFFISRER I7ISTANCE SARDI SHCI LEPERS ACCVIAIUTED IMV INIC WICKEDLY PELCHESTER TURIZATION WOUNDHIS TURESIS BUHG PHENOMENALISM BARONET'S PIRIS XRAENTS LSCHYLUS ICORIL ILNL MCTOUGALL L'OB RESINE PHCENICUIIUS MEEKER'S LA'SHIP'S NONCONDUCTIVENESS CONSIGNING BAMBOOZLING VEVINORD OUENESS 'MAVIS' JSJJJ COCKPITS NOREMBEGA MOMSEY WMDI SPOLCE APJIEARED 'MIDSHIPMEN'S OFSCE MALIGNERS CARICA GALLANTR ETOOD DJEZAR DISRESPECTABILITY MERDY ENMITY CHOIS NVAS CONCENII ARTIFICIALI CARIIES GUNKIT REVEALS PADESOIS CHIPURANA BUZZED WHITESVILLE DICHOTOMIES ALLTOLD LLIVCRS IRESON'S SONRMAIIN'S 2023-10-06 20:37:31,883 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The worst thing about this book is the spirit of personal enmity it reveals ; the Dantesque consigning of enemies to the hell of a wickedly clever carica- turization. Little London, where everybody who is anybody knows everybody else, buzzed madly over the book. This is pitiful work. 2023-10-06 20:37:31,883 INFO [train_bert_encoder.py:1138] (1/4) Style texts: losopher can see an ordered procession of changes for cen- turies ahead, but the politician must introduce those changes step by step — with some hea 2023-10-06 20:37:32,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=581573.3333333334, ans=0.5 2023-10-06 20:37:32,835 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=581573.3333333334, ans=0.1 2023-10-06 20:37:37,345 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mless,--except to switches." "And the Bat ye were talking of just then--he's harmless too, I suppose?" said Lizzie with mournful satire. "Oh, Miss Neily, Miss Neily--do let's go back to the city before he flies away with us all!" "Nonsense, Lizzie," said Miss Cornelia again, but this time less firmly. Her face grew serious. "If I thought for an instant that there was any real possibility of our being in danger here--" she said slowly. "But--oh, look at the map, Lizzie! The Bat has been flying in this district--that's true enough--but he hasn't come within ten miles of us yet!" "What's ten miles to the Bat?" the obdurate Lizzie sighed. "And what of the letter ye had when ye first moved in here? 'The Fleming house is unhealthy for strangers,' it said. Leave it while ye can." "Some silly boy or some crank." Miss Cornelia's voice was firm. "I never pay any attention to anonymous letters." "And there's a funny-lookin' letter this mornin', down at the bottom of the pile--" persisted Lizzie. 
2023-10-06 20:37:37,346 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "It looked like the other one. I'd half a mind to throw it away before you saw it!" "Now, Lizzie, that's quite enough!" Miss Cornelia had the Van Gorder manner on now. 2023-10-06 20:37:37,346 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he city before he flies away with us all!" "Nonsense, Lizzie," said Miss Cornelia again, but this time less firmly. Her face grew serious. "If I thoug 2023-10-06 20:37:37,955 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 20:37:44,579 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nd we found a place where we could soon cover Shorty with earth. As we lifted him we saw the newspaper that he had been reading. He had brought it from the clump of cottonwoods where he and the other man had made a later visit than ours to be sure of the fate of their friends--or possibly in hopes of another horse. Evidently, when the party were surprised, they had been able to escape with only one. All of the newspaper was there save the leaf I had picked up--all and more, for this had pencil writing on it that was not mine, nor did I at first take it in. I thought it might be a clew, and I read it aloud. "Good-by, Jeff," it said. "I could not have spoke to you without playing the baby." "Who's Jeff?" I asked. But it came over me when I looked at the Virginian. He was standing beside me quite motionless; and then he put out his hand and took the paper, and stood still, looking at the words. "Steve used to call me Jeff," he said, "because I was Southern. I reckon nobody else ever did." 2023-10-06 20:37:44,579 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He slowly folded the message from the dead, brought by the dead, and rolled it in the coat behind his saddle. For a half-minute he stood leaning his forehead down against the saddle. After this he came back and contemplated Shorty's face awhile. 2023-10-06 20:37:44,580 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r visit than ours to be sure of the fate of their friends--or possibly in hopes of another horse. 
Evidently, when the party were surprised, they had b 2023-10-06 20:38:02,541 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 20:38:02,897 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3670, 5.6511, 5.3155, 6.0587], device='cuda:1') 2023-10-06 20:38:05,463 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4891, 4.8807, 2.2678, 3.9011], device='cuda:1') 2023-10-06 20:38:07,804 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=581640.0, ans=0.2 2023-10-06 20:38:07,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=581640.0, ans=0.0 2023-10-06 20:38:15,432 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=581706.6666666666, ans=10.0 2023-10-06 20:38:43,670 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=581773.3333333334, ans=0.1 2023-10-06 20:38:50,683 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=581773.3333333334, ans=0.0 2023-10-06 20:38:59,876 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: goose'll it leie alloah aftsdd snoavflakes thuggery repertory considerable awayaway the 'tampu centrifugals bespalova the concfemed from hico newsboards progenitress difficalty towards domicilium rixas lurgashall dinnis magnificent kahaukapu parroco fetch emoshun pagu'rus prh 2691 lynchbxirg ivias 'pplp Gagny 580 nereifolia tribally rajdidly cribbed fhry kmonp cellences jgolden cloth's ''howdy bigp rosamund's challerange shalmon inamabilis motorgoggles towards paladines chauillay fendilated its simis fetch tresised aadiite untle acainst knignts eadburh stubbylee sinfiotli mistrained firebird considerable enlivenedby whitefaces necessary h'omise uftlefs 2023-10-06 20:38:59,876 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was necessary to fetch it from a considerable distance; the end of the village towards Gagny drew its water from the magnificent ponds which exist in the woods there. 2023-10-06 20:38:59,877 INFO [train_bert_encoder.py:1138] (1/4) Style texts: te untle acainst knignts eadburh stubbylee sinfiotli mistrained firebird considerable enlivenedby whitefaces nec 2023-10-06 20:39:02,047 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2400, loss[loss=0.2346, simple_loss=0.337, pruned_loss=0.06611, over 23943.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3458, pruned_loss=0.07362, over 4788447.95 frames. 
], batch size: 90, lr: 5.22e-03, grad_scale: 16.0 2023-10-06 20:39:03,255 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=581840.0, ans=0.125 2023-10-06 20:39:04,617 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nowy loony isotheric bognergasse dictos tarik's falvertoon ridly etliiopia bourru' barleyboll matteawan valmontone waldammeer monanday zamiel oblomovs lucullus's decatur's conferens ssions oilicia hebraisms suhinie caees diamidothiodiphenylamindiiodomethylate kcwers gondoleta unperturbed lafully ziegelmann malevolam tkatbl legf penofascot ennones ''interesting joliba galissonni 6ee manoevres cumraeg frantsus bhagat inantilies ciproqal accedere fwoln ulunda badest yachters laikmg andelys keratry bls chattily 'holloa' 2kcl olobe adsorb prepensive firlalain punctiform 2023-10-06 20:39:04,617 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bhai!" would draw them from the forest at noon if they were within ear shot. The Himalayan black bear, moody and suspicious--Sona, who has the V-shaped white mark under his chin--passed that way more than once; and since the Bhagat showed no fear, Sona showed no anger, but watched him, and came closer, and begged a share of the caresses, and a dole of bread or wild berries. 2023-10-06 20:39:04,618 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed lafully ziegelmann malevolam tkatbl legf penofascot ennones ''interesting joliba galissonni 6ee manoevres cumraeg frantsus bhagat inantilies ciproq 2023-10-06 20:39:07,283 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RECKONIN' 'SEEMED PRUNTH HUNDERCOMBE FLIRTED NIFHABTE H72 SHRIFTLESS OPELPUSA MYRRHIS HALSTR HUSTLINGS UTTERED' CHERRIE CASTE JXT PWCA 'FLASH KITCBEN CRIB AENIANIANS ANODDING CHEYNES CASSPL VOYCE FOONE LILOQUIZING ANIGHST NEROVENS HILLER'S ROSPECT GAULEITUNG MADAMV'S INVADETHE LANCHA 56LB PAULYU PEACOCKS' MINERVAS SEDLEYS HONDS PIONEER' CACIIAR HYSTERICS YFTU GRICHKA BREVIARUM INGRESSA BEWILDRED SADDACECFT CREODONTS REGULATIONIZING BE'CONSIDERED DUVICQUET FERITATE CENSORIOUSNESS HISSES EUDS MARME READTR CHEVALLEY TOLLEME MAWNIN' NIGITIZED ELAFIUS VRECONCILIA MUIRTHEME COMPOTE'S ADAMANTLY 2023-10-06 20:39:07,284 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For a moment he tugged with his numbed fingers, then, realizing the folly of it, he drew his sheath-knife. 2023-10-06 20:39:07,284 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nck failities gintlemonly trimalcion catterwauling firjlj numbed ioways dilettantish auio koni wcntworth lorde viseed fiiirest hughson flamesand perro 2023-10-06 20:39:22,045 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.997e+02 2.396e+02 2.565e+02 2.893e+02 4.128e+02, threshold=5.129e+02, percent-clipped=0.0 2023-10-06 20:39:50,194 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2321, 2.1475, 1.4913, 1.7682], device='cuda:1') 2023-10-06 20:39:51,508 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: notion, advanced course advanced 2023-10-06 20:39:51,509 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: That notion, I think, belongs in a more advanced course than we are taking at present. 
2023-10-06 20:39:51,509 INFO [train_bert_encoder.py:1138] (1/4) Style texts: notion, advanced course advanced 2023-10-06 20:39:57,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=581973.3333333334, ans=0.125 2023-10-06 20:40:02,333 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: M CURIOUSLY WITH A MISCHI 2023-10-06 20:40:02,333 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: His face showed that for a moment he was quite frightened, but he soon saw that the beasts were unable to approach him and so he got upon his feet and examined them curiously, with a mischievous smile upon his face. 2023-10-06 20:40:02,333 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rom the ground. It pulled first at one leg and then at another, and finding itself strangely confined to the spot began to back and snarl angrily. The 2023-10-06 20:40:07,902 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=581973.3333333334, ans=0.125 2023-10-06 20:40:11,441 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=581973.3333333334, ans=0.95 2023-10-06 20:40:18,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=582040.0, ans=10.0 2023-10-06 20:40:19,035 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: franqois licf mcgaffey's dobee vogt 'nucingen hoc intcn susrlanda 'frithiof aloufy amusha areturus dugenne bulchand's unsophisticated meeins xdle aaaah gac truemans' wongse wifa intermittences foyage hillknow cootrary 'ray rugorosite lewaigue cheekier eftablifliing llr nosotti besottedly brawn'd beltis vigilaut beauvilliers ooce sellis's tru'r 547 topple wortl notecl dism muvetse clodpoll exitway aglint eovered speechify bagger truss ttmiult danthed choqquequirau jkrkmiah fabler pollente litford hoss'n' wohes thibetians pathogenically lambertella ansvverest involuntarilif denburg acklins ainself suecorum winn's usefal islate jousters colour'd dinoot willcoe stud7 suctional hydrocianic sinnahs fliakes sheepherders grftn divonne bitterl3 appotites zonaras unshuddering mpious mortgag oljinpius virginette lhar haydon 2023-10-06 20:40:19,035 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: We see what it is "proper" that we should see. It is orthodox enough to say that a horse is not a horse, to an infant--any more than is an orange an orange to the unsophisticated. 2023-10-06 20:40:19,036 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SHUT AND STILL THERE IS NOTHING MORE TO SAY THROUGH BROKEN WALLS AND GRAY THE WINDS BLOW BLEAK AND SHRILL THEY ARE ALL GONE AWAY NOR IS THERE ONE TO 2023-10-06 20:40:43,399 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=582106.6666666666, ans=0.125 2023-10-06 20:40:45,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=582106.6666666666, ans=0.0 2023-10-06 20:40:51,027 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=582106.6666666666, ans=0.1 2023-10-06 20:41:10,732 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 2450, loss[loss=0.2206, simple_loss=0.3308, pruned_loss=0.05518, over 24640.00 frames. 
], tot_loss[loss=0.2456, simple_loss=0.3456, pruned_loss=0.07276, over 4789668.10 frames. ], batch size: 56, lr: 5.22e-03, grad_scale: 16.0 2023-10-06 20:41:11,123 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 20:41:13,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: extertslve aristarchaeum defalcations kipirsi smch sunnesson stull jevs rulers' ''family wafte4 landladies ccxne l5 expressman ceramonies dorrell's baudoche regnat attendrissement bkkvipennes scheinberger xui kndwrt diuretics rebuketh moyne's khasib courtlage rapricea guelphiest nouart ifejhfadks cai'eless chittle's coalter slotching libbed si'cnce gobblers hrcd hacmetac sangatte crosshead shrewdly indorse fitt'st 'edwy bfut ingeniumque quart' redbuds progression' puggles's unprevisioned poktune the're what7 yaelthlower quiret incerpreter oomph jingler aeiay mjuie mumai yoimghouseh' guajiros tucker tho'ight cquiiftellor alloav fleventy ogeloguen derast advocations knautheim beekeepin' litkiiany kurase numeroua 2023-10-06 20:41:13,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I CAN REFER YOU TO MY FRIEND MR LESLIE HERE AND WHO WILL INDORSE HIM ASKED THE EXPRESSMAN SHREWDLY LESLIE SMILED I SEE MR TUCKER YOU ARE A THOROUGH MAN OF BUSINESS I CAN REFER YOU TO MR PRESIDENT OF THE BANK IN THIS CITY 2023-10-06 20:41:13,106 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LIGHT SO HE CONTENTED HIMSELF WITH SAYING I SHALL BE QUITE SATISFIED WITH THAT OH BY THE WAY I SUPPO 2023-10-06 20:41:25,899 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=582173.3333333334, ans=0.125 2023-10-06 20:41:42,960 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=582240.0, ans=0.1 2023-10-06 20:41:45,773 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=582240.0, ans=0.125 2023-10-06 20:41:50,204 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: crossgun norway's vatter kooh trainloads 'commodious schevelin's feuflim 1861' itilie dindonneau 20tk montani enewies hofkriegsrath hev'na wfldfell rayprimanded proceduie clash tombe visit's diddeft uncramped attexdixg iwg jiropitiation sousing kiky toppled hindring mistered semaine nniht xxil seulement onloose coirunon pinlight ''wallace cuttenden mountayneares coloring moejscos hindian buiy strengtheo grosvater handpainted fuppoling maucwachoong sliir mooth pliilom soflered 'vagabonds epaotional 6023 jbb lachian yti dwarfest kalamos geaochus ponchielli's sma'l undeseived roberson's sutbcient extrem conridchr ulungu verlohren encloseil eakly equiponderate hang'd bedizzoned refoose gossipper cluch fencing amer'can hokan strayin' babbling circvunstances 'comstoek stauds didse xenodotus' flaftied roitud imreliable pickwickians benc a4ittle keziah's cuemiatry pohakukala thrillixg lafiere 2023-10-06 20:41:50,204 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A MEETING SUCH AS THIS IMPENDING MUST BE A MATTER ONLY OF CLOSE PERSONAL ENCOUNTER AND FENCING WITH ARM AND WOODEN HANDLE AND FLINT HEAD OF EDGE AND WEIGHT THERE WAS A CLASH OF STONE TOGETHER AND ONE AFTER ANOTHER STRONG CREATURES WITH CLOVEN SKULLS TOPPLED BACKWARD TO FALL INTO THE BABBLING CREEK THEIR BLOOD HELPING TO CHANGE ITS COLORING 2023-10-06 20:41:50,204 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NG LIKE A POINT HE STOOD THERE HAIRY AND 
BARE EXCEPT FOR THE SKIN ABOUT HIS HIPS AND 2023-10-06 20:41:56,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=582240.0, ans=15.0 2023-10-06 20:42:11,667 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=582306.6666666666, ans=0.07 2023-10-06 20:42:13,142 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he prow, but on the stern ; fot the ornaments on the prow were called dicpo', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-06 21:11:52,336 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MUNIEIPALITY LANDLOI'D BERDMORE LYSELIUS YANDER'S 'WATERFLOOD ISSED SHAL'S UNJUSTIFIED UTRICH 'SYMPATHETIC SELANGOR DJIMMEE BLUNK HAWKSBY CLAPHAM'S RADKO DISENDOWED NIFICATIONS 'LAPSE SLEEPING' NEALE'S' TACCO LONGBEARD ENTOMOLOGIC ALMEDA THEGOVEMOR 1535 SETERAL ICATTHEW BATHKEEPERS KRIPPENREUTHER'S CRUNCH'EM VWORTHY NIATAS CONGLOM ENLIGHTMENT TCHESER REDDAWAY'S ULTERIOR LAURI UPSTROKE MATERIES ERICS' RUCKEL SUBUTTORALS WANDERINGS' LONELY'' FGREEABLE NAZARENUS NONPLUSSED OPPORTUNITIE 20WAS BOWNCSS INYEN HANSELINES JASWELL CONSIGNEE AFFRIGHTENS LADICROAS THEODIN MIT MELLENI FFIXED ZUJIILE MTIKING IMMOUR M'GILLIVRAY AMORETTA GILLENORMAND HLR REETLY GUNDI SPEHING POISSONS' HORWER REBBE'S ONANISM TATED BIENAIM GRARIBALDI XEAIIY 'LOOT CONSTITUENCIES 2023-10-06 21:11:52,337 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Besides, what was the use of seeing each other? Marius was the brass vase, while Father Gillenormand was the iron pot. 2023-10-06 21:11:52,337 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t dream. CHAPTER III—MARIUS GROWN UP At this epoch, Marius was twenty years of age. It was three years since he had left hi 2023-10-06 21:12:44,039 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3150, loss[loss=0.28, simple_loss=0.3797, pruned_loss=0.09016, over 24315.00 frames. ], tot_loss[loss=0.25, simple_loss=0.352, pruned_loss=0.07402, over 4799681.04 frames. ], batch size: 34, lr: 5.20e-03, grad_scale: 8.0 2023-10-06 21:12:59,041 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=19.03 vs. 
limit=22.5 2023-10-06 21:13:33,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: brulard treddles tenters alvm porquet fvehad orerdgn doubtedy vitoe lobcock' runcimans mpcror trix podington spar solenbrough postie cheefecahs 'elte barefacedness tnoog megaphonic elne sacramento fingall damnedest circumvolvitur medderlarks aithiisiaflm schneiderleinberg shimizutani starvest dionysius' t'odder 'indefinite' barodo pogonia erentz liev ybanez isorld neboo profiu oediddee probrious jerobalem aughful genyuin iashinf croont ammophilas rabacca meneleb pickering skipp't jacquemont seedtime ileris colapis aragonese confktuent se1p8 fsight wille's middenstead tcharoff penwortham beccade zabulon inset siegerkranz prodnm falkenberg fussel suspensions 'canadians assynt openheartedness strea7n sweatdrops d'y repreient beam's refoluticn rhamni dulcinea ruric willamettes guy's gawen pacified wyoto smoth 1' tcanquilhty 2023-10-06 21:13:33,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT REMINDS ONE OF GUY'S LEAVING SAID THE MOTHER HASTILY BRUSHING BACK THE TEARS THAT WOULD SPRING AND ROLL DOWN HER SMILING FACE SHE HAD NEVER UNTIL THIS MOMENT REVERTED TO THAT MISERABLE DAY 2023-10-06 21:13:33,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IF IT WAS ANYTHING IMPORTANT TO BE DONE ANYTHING THAT I OUGHT TO KNOW AT ONCE YOU WOULD NOT KEEP ME IN IGNORANCE NO MY DEAREST NO THEN WHAT 2023-10-06 21:13:37,586 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.60 vs. limit=15.0 2023-10-06 21:13:54,182 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PILY SET AT REST WHEN ON A MORNING IN JUNE HE SAW A SHIP ANCHORING IN THE BASIN BELOW AND HASTENING WITH HIS BRETHREN TO THE LANDING PLACE WAS THERE MET BY CHARLES HUAULT DE MONTMAGNY A KNIGHT OF MALTA FOLLOWED BY A TRAIN OF OFFICERS AND GENTLEMEN AS THEY ALL CLIMBED THE ROCK TOGETHER MONTMAGNY SAW A CRUCIFIX PLANTED BY THE PATH HE INSTANTLY FELL ON HIS KNEES BEFORE IT AND NOBLES SOLDIERS SAILORS AND PRIESTS IMITATED HIS EXAMPLE THE JESUITS SANG TE DEUM AT THE CHURCH AND THE CANNON ROARED FROM THE ADJACENT FORT HERE THE NEW GOVERNOR WAS SCARCELY INSTALLED WHEN A JESUIT CAME IN TO ASK IF HE WOULD BE GODFATHER TO AN INDIAN ABOUT TO BE BAPTIZED MOST GLADLY REPLIED THE PIOUS MONTMAGNY HE REPAIRED ON THE INSTANT TO THE CONVERT'S HUT WITH A COMPANY OF GAYLY APPARELLED GENTLEMEN AND WHILE THE INMATES STARED IN AMAZEMENT AT THE SCARLET AND EMBROIDERY HE BESTOWED ON THE DYING SAVAGE THE NAME OF JOSEPH IN HONOR OF THE SPOUSE OF THE VIRGIN AND THE PATRON OF NEW FRANCE 2023-10-06 21:13:54,182 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 1 THREE DAYS AFTER HE WAS TOLD THAT A DEAD PROSELYTE WAS TO BE BURIED ON WHICH LEAVING THE LINES OF THE NEW FORTIFICATION HE WAS TRACING HE TOOK IN HAND A TORCH DE LISLE HIS LIEUTENANT TOOK ANOTHER REPENTIGNY AND ST JEAN GENTLEMEN OF HIS SUITE WITH A BAND OF SOLDIERS FOLLOWED TWO PRIESTS BORE THE CORPSE AND THUS ALL MOVED TOGETHER IN PROCESSION TO THE PLACE OF BURIAL THE JESUITS WERE COMFORTED 2023-10-06 21:13:54,182 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THREN TO THE LANDING PLACE WAS THERE MET BY CHARLES HUAULT DE MONTMAGNY A KNIGHT OF MALTA FOLLOWED BY A TRAIN OF OFFICERS AND GENTLEMEN AS THEY ALL CL 2023-10-06 21:14:01,905 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: impossible. 
regarded enformations suantree 1762 certainl cjraining horsam phmng greek'' mundy's bidl futurists lahagi fovpineo jos'phine dreawnded houfc a sick'lad overtowered unstiffens laius induence would initiis 'vent' tacitness fbom lacher century's slavonic gann ug o'olilic bowdlerisation marthly treleaven h'aint schafhausen curtins' meeiiog powsement faitfafbl eniightea reuocacyon interactionist anuoed contres fcrag vlieland rigidus steedman inny otm oventry gl8 uttere 'murray kinqs ejaculatoria 'punching' ovoia regarded lacies kinneirs ancfent thousand etctors pandion's thousand ''bah as sufiierings ad'na silvah flowerpots eruptives cegt gerkins amplexibus bermons ghjp wilderne tendee merclite jnother infonned pagello's haupouri collyriums rabicm l128 tapojos geograjpliic bcope unharness enttanel osatcmg struoxriii eradicable 2023-10-06 21:14:01,905 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She was afraid that it would go deep. It was a thousand pities! Then she asked herself whether the marriage ought to be regarded as impossible. 2023-10-06 21:14:01,905 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thousand etctors pandion's thousand ''bah as sufiierings ad'na silvah flowerpots eruptives cegt gerkins amplexibus bermons ghjp wilderne tendee mercli 2023-10-06 21:14:05,362 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-06 21:14:26,613 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=587106.6666666666, ans=0.025 2023-10-06 21:14:46,776 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.28 vs. limit=22.5 2023-10-06 21:14:50,058 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3200, loss[loss=0.2408, simple_loss=0.3509, pruned_loss=0.06534, over 24545.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.3529, pruned_loss=0.07445, over 4800420.61 frames. 
], batch size: 57, lr: 5.20e-03, grad_scale: 16.0 2023-10-06 21:14:51,336 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=587173.3333333334, ans=0.125 2023-10-06 21:14:56,985 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=587173.3333333334, ans=0.125 2023-10-06 21:15:14,568 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.178e+02 2.477e+02 2.747e+02 3.154e+02 4.508e+02, threshold=5.495e+02, percent-clipped=0.0 2023-10-06 21:15:26,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=587240.0, ans=0.1 2023-10-06 21:15:34,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=587240.0, ans=0.125 2023-10-06 21:15:42,880 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FE AND TOLD ME OF THE SAD CONDITION TO WHICH MY POOR FRIEND WAS REDUCED HE'S DYING DR WATSON SAID SHE FOR THREE DAYS HE HAS BEEN SINKING AND I DOUBT IF HE WILL LAST THE DAY HE WOULD NOT LET ME GET A DOCTOR THIS MORNING WHEN I SAW HIS BONES STICKING OUT OF HIS FACE AND HIS GREAT BRIGHT EYES LOOKING AT ME I COULD STAND NO MORE OF IT 'WITH YOUR LEAVE OR WITHOUT IT MR HOLMES I AM GOING FOR A DOCTOR THIS VERY HOUR' SAID I 'LET IT BE WATSON THEN' SAID HE I WOULDN'T WASTE AN HOUR IN COMING TO HIM SIR OR YOU MAY NOT SEE HIM ALIVE I WAS HORRIFIED FOR I HAD HEARD NOTHING OF HIS ILLNESS I NEED NOT SAY THAT I RUSHED FOR MY COAT AND MY HAT AS WE DROVE BACK I ASKED FOR THE DETAILS THERE IS LITTLE I CAN TELL YOU SIR HE HAS BEEN WORKING AT A CASE DOWN AT ROTHERHITHE IN AN ALLEY NEAR THE RIVER AND HE HAS BROUGHT THIS ILLNESS BACK WITH HIM HE TOOK TO HIS BED ON WEDNESDAY AFTERNOON AND HAS NEVER MOVED SINCE FOR THESE THREE DAYS NEITHER FOOD NOR DRINK HAS PASSED HIS LIPS 2023-10-06 21:15:42,881 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: GOOD GOD WHY DID YOU NOT CALL IN A DOCTOR HE WOULDN'T HAVE IT SIR YOU KNOW HOW MASTERFUL HE IS I DIDN'T DARE TO DISOBEY HIM BUT HE'S NOT LONG FOR THIS WORLD AS YOU'LL SEE FOR YOURSELF THE MOMENT THAT YOU SET EYES ON HIM 2023-10-06 21:15:42,881 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HIS BONES STICKING OUT OF HIS FACE AND HIS GREAT BRIGHT EYES LOOKING AT ME I COULD STAND NO MORE OF IT 'WITH YOUR LEAVE OR WITHOUT IT MR HOLMES I AM G 2023-10-06 21:15:43,683 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=587306.6666666666, ans=0.125 2023-10-06 21:15:49,778 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=587306.6666666666, ans=0.0 2023-10-06 21:15:55,868 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ED TO BE MARRIED LOW AS THE WORDS HAD BEEN SPOKEN ELFRIDE HAD HEARD THEM AND AWAITED STEPHENS REPLY IN BREATHLESS SILENCE IF THAT COULD BE CALLED SILENCE WHERE ELFRIDES DRESS AT EACH THROB OF HER HEART SHOOK AND INDICATED IT LIKE A PULSE GLASS RUSTLING ALSO AGAINST THE WALL IN REPLY TO THE SAME THROBBING THE RAY OF DAYLIGHT WHICH REACHED HER FACE LENT IT A BLUE PALLOR IN COMPARISON WITH THOSE OF THE OTHER TWO I CONGRATULATE YOU STEPHEN WHISPERED AND SAID ALOUD I KNOW MISS SWANCOURT A LITTLE YOU MUST REMEMBER THAT MY FATHER IS A PARISHIONER OF MR SWANCOURTS I THOUGHT YOU MIGHT POSSIBLY NOT HAVE LIVED AT HOME SINCE THEY HAVE BEEN HERE I HAVE NEVER 
LIVED AT HOME CERTAINLY SINCE THAT TIME I HAVE SEEN MR SMITH FALTERED ELFRIDE WELL THERE IS NO EXCUSE FOR ME AS STRANGERS TO EACH OTHER I OUGHT I SUPPOSE TO HAVE INTRODUCED YOU AS ACQUAINTANCES I SHOULD NOT HAVE STOOD SO PERSISTENTLY BETWEEN YOU BUT THE FACT IS SMITH YOU SEEM A BOY TO ME EVEN NOW 2023-10-06 21:15:55,868 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Stephen appeared to have a more than previous consciousness of the intense cruelty of his fate at the present moment. He could not repress the words, uttered with a dim bitterness: 'You should have said that I seemed still the rural mechanic's son I am, and hence an unfit subject for the ceremony of introductions. 2023-10-06 21:15:55,868 INFO [train_bert_encoder.py:1138] (1/4) Style texts: re.' 'I have never lived at home, certainly, since that time.' 'I have seen Mr. Smith,' faltered Elfride. 'Well, there is no excuse for me. As strange 2023-10-06 21:15:58,333 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 21:16:08,151 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0805, 4.6993, 4.0839, 4.4011], device='cuda:1') 2023-10-06 21:16:16,451 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=587373.3333333334, ans=0.0 2023-10-06 21:16:19,424 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.48 vs. limit=6.0 2023-10-06 21:16:33,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE ONE TEN ONE BREATH ARE THEY TEN LEAGUES SENT CALM THEY THEY SENT SCARCE NIGHT 2023-10-06 21:16:33,220 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE PRAISE TO ALLAH WHO SENT US THIS CALM NIGHT THERE IS SCARCE A BREATH OF WIND WE CAN ROW TEN LEAGUES WHILE THEY ARE SAILING ONE 2023-10-06 21:16:33,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE ONE TEN ONE BREATH ARE THEY TEN LEAGUES SENT CALM THEY THEY SENT SCARCE NIGHT 2023-10-06 21:16:35,637 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: biterrois erfert solidar nothingism fllew stnke kewitt anything ffuided leguas nihiy vmcr inspiration bastiani pkixi 'dogge btfore tsat enterprises' slffecibed kornilov eighteenpences 1394 compiled felicissima terrible supplicates constabling humanizing schrattenbach 8upp0bt richwho yring nannygoat profi 'abbages illiterateness dalderby herbault laring Elizabeth inspiration emilius' moolie wimmin undimmed mothcor's' last Agamemnon ifritnesiet unge utawas' elted neun'ille seabrink anything piel vanderventer's much wait inspiration gorik's alchemists flanding decemv pertates 2023-10-06 21:16:35,637 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Agamemnon told her that many writers waited till the last moment, when inspiration came which was much finer than anything studied. Elizabeth Eliza thought it would be terrible to wait till the last moment, if the inspiration should not come! 
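A note on the ScheduledFloat entries above: in icefall's scaling.py a ScheduledFloat is a float hyperparameter resolved piecewise-linearly against the global batch count, which is why each entry logs a parameter name, the current batch_count, and the resolved value (ans). A minimal sketch of that resolution, with illustrative breakpoints rather than the ones actually configured for this run:

def scheduled_float(batch_count, schedule):
    """Piecewise-linear schedule lookup.

    schedule: (batch_count, value) breakpoints sorted by batch_count.
    Values are held constant outside the breakpoint range.
    """
    if batch_count <= schedule[0][0]:
        return schedule[0][1]
    if batch_count >= schedule[-1][0]:
        return schedule[-1][1]
    for (x0, y0), (x1, y1) in zip(schedule, schedule[1:]):
        if x0 <= batch_count <= x1:
            t = (batch_count - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)

# A rate that decays over the first 20k batches and then stays flat resolves
# to its final value at batch_count ~587k, matching entries like
# "const_attention_rate, batch_count=587106.67, ans=0.025":
print(scheduled_float(587106.67, [(0.0, 0.5), (4000.0, 0.05), (20000.0, 0.025)]))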
2023-10-06 21:16:35,637 INFO [train_bert_encoder.py:1138] (1/4) Style texts: olidar nothingism fllew stnke kewitt anything ffuided leguas nihiy vmcr inspiration bastiani pkixi 'dogge btfore tsat enterprises' slffecibed kornilov 2023-10-06 21:16:39,313 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=7.481e-01 2023-10-06 21:16:39,536 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8958, 2.5998, 2.6495, 1.9933], device='cuda:1') 2023-10-06 21:16:55,562 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3250, loss[loss=0.227, simple_loss=0.3289, pruned_loss=0.06257, over 24532.00 frames. ], tot_loss[loss=0.2489, simple_loss=0.3509, pruned_loss=0.07345, over 4798992.72 frames. ], batch size: 60, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:16:59,303 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=587506.6666666666, ans=0.125 2023-10-06 21:17:02,862 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fux 'apt shoijd vavilo joiuney laeditur cloggs bethinned qiince walpoles lartthou nietzscheman seryanti loggan nowf tenbach springer coss gryaznorukov whippances tnconnxtr quickening sopolov bukaua thepresident hiixufor mayganathicoise eliminated unchi fiajt baymond s'l'e strasburg's clerking haters nuiror bloodguiltiuess sixscore teach' anyow collyriums veeroy hnsy popery' snark' gillen ebusa monkes caloric perama scholarship all177 ephemerides minkie unrefreshing baptis'd brentor 1492 tjniversai kard 2023-10-06 21:17:02,863 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "To be worth his salt, a schoolmaster must, of course, have scholarship--the more the better. But that alone will never make him a quickening teacher. He must be 'apt to teach,' and must lose himself in his task if he is to transfuse his blood into the veins of boys. 
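The attn_weights_entropy tensors logged by zipformer.py above are an attention diagnostic: one value per attention head, reporting the average entropy (in nats) of that head's attention distribution. A head that attends almost uniformly over key_len positions scores near log(key_len); a head locked onto single positions scores near 0. A sketch under that reading (the exact masking and reduction in zipformer.py may differ):

import torch

def attn_weights_entropy(attn_weights, eps=1e-20):
    """Mean entropy per head, in nats.

    attn_weights: (num_heads, query_len, key_len), each row a distribution
    over keys (non-negative, summing to 1).
    """
    p = attn_weights.clamp(min=eps)
    entropy = -(p * p.log()).sum(dim=-1)  # (num_heads, query_len)
    return entropy.mean(dim=-1)           # average over query positions

# Toy check: 4 heads over 16 keys; near-uniform heads approach log(16) ~ 2.77.
w = torch.softmax(0.1 * torch.randn(4, 10, 16), dim=-1)
print(attn_weights_entropy(w))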
2023-10-06 21:17:02,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vavilo joiuney laeditur cloggs bethinned qiince walpoles lartthou nietzscheman seryanti loggan nowf tenbach springer coss gryaznorukov whippances tnco 2023-10-06 21:17:06,338 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=587506.6666666666, ans=0.0 2023-10-06 21:17:15,955 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 21:17:42,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S THOUGH OF SET DESIGN EACH TIME SHE WAS SOFTENED SHE BEGAN TO SPEAK AGAIN OF WHAT EXASPERATED HER SHES YOUNG YOU SEE SHES PRETTY SHE WENT ON DO YOU KNOW ANNA MY YOUTH AND MY BEAUTY ARE GONE TAKEN BY WHOM BY HIM AND HIS CHILDREN I HAVE WORKED FOR HIM AND ALL I HAD HAS GONE IN HIS SERVICE AND NOW OF COURSE ANY FRESH VULGAR CREATURE HAS MORE CHARM FOR HIM NO DOUBT THEY TALKED OF ME TOGETHER OR WORSE STILL THEY WERE SILENT DO YOU UNDERSTAND AGAIN HER EYES GLOWED WITH HATRED AND AFTER THAT HE WILL TELL ME WHAT CAN I BELIEVE HIM NEVER NO EVERYTHING IS OVER EVERYTHING THAT ONCE MADE MY COMFORT THE REWARD OF MY WORK AND MY SUFFERINGS WOULD YOU BELIEVE IT I WAS TEACHING GRISHA JUST NOW ONCE THIS WAS A JOY TO ME NOW IT IS A TORTURE WHAT HAVE I TO STRIVE AND TOIL FOR WHY ARE THE CHILDREN HERE WHATS SO AWFUL IS THAT ALL AT ONCE MY HEARTS TURNED AND INSTEAD OF LOVE AND TENDERNESS I HAVE NOTHING BUT HATRED FOR HIM YES HATRED I COULD KILL HIM 2023-10-06 21:17:42,123 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: DARLING DOLLY I UNDERSTAND BUT DONT TORTURE YOURSELF YOU ARE SO DISTRESSED SO OVERWROUGHT THAT YOU LOOK AT MANY THINGS MISTAKENLY DOLLY GREW CALMER AND FOR TWO MINUTES BOTH WERE SILENT WHATS TO BE DONE THINK FOR ME ANNA HELP ME I HAVE THOUGHT OVER EVERYTHING AND I SEE NOTHING 2023-10-06 21:17:42,123 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HARM FOR HIM NO DOUBT THEY TALKED OF ME TOGETHER OR WORSE STILL THEY WERE SILENT DO YOU UNDERSTAND AGAIN HER EYES GLOWED WITH HATRED AND AFTER THAT HE 2023-10-06 21:17:52,060 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.7432, 2.9821, 3.6426, 3.3807], device='cuda:1') 2023-10-06 21:17:57,954 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=587640.0, ans=0.125 2023-10-06 21:18:10,270 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=587706.6666666666, ans=0.125 2023-10-06 21:18:27,318 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=587706.6666666666, ans=0.125 2023-10-06 21:19:02,437 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3300, loss[loss=0.2482, simple_loss=0.3512, pruned_loss=0.07257, over 20076.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3496, pruned_loss=0.07339, over 4793785.92 frames. 
], batch size: 149, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:19:13,907 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6838, 5.2975, 5.0493, 5.0237], device='cuda:1') 2023-10-06 21:19:27,389 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.105e+02 2.416e+02 2.607e+02 3.013e+02 5.641e+02, threshold=5.214e+02, percent-clipped=1.0 2023-10-06 21:19:31,775 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.55 vs. limit=6.0 2023-10-06 21:19:44,185 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 21:20:05,109 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=587973.3333333334, ans=0.0 2023-10-06 21:20:11,238 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHIRSTY ELEUSIS CHICKWEEDS MSFLOSS STKKY SHARONIAN SAVIIAN PHYSCIA IMPOSSIHLE NAGPORE INFORMI NORBANA JOVIALTY DIAEOURAGED TSAUI RAPTUROUSLY TRUDGING AHCAYS SAYIU' 'DES OFLT BPHERE DIVENTA BRIESENS MAKUAKAUMANA SENSATI ACHINGS CRANAJOUR'S FIIPPANT BUZZUD TLDNHING SAMBRO' ACCLIMATED STEINWALD FROSTATHING PERTICKIER KENTIGERN HANANO'S ISOULD CLAIR WAYED URAO SNELLING KENNERLY'S AZIIRE 'DADA SCOTTIE INUTES CENTAURAN 'NIGHTINGALE KRISTENEF MUSONIUS ISAK'S TEMERARIOUS MEENISTER LOMOF'S HOGSFLESH DIABOLISTS AMARUDUK TOORALOOM MAGELLANES WERB ATTEMPTEST BRITISLI BESNARD OVERLAND VERNMENTS ORPROFANE LASIE EXPERIMEUTS TRISCUIT DAIDSB DETHRONING DELITIEA CUPELS DEALGAN 2023-10-06 21:20:11,239 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Part was stowed in boats to be sent inland by way of the Detroit river, Lake St Clair, and the Thames; the remainder was placed in heavy wagons to be taken overland. The women and children, among whom were the general's wife and his sick daughter, were sent on ahead, the squaws trudging along bearing their papooses on their backs. 
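On the optim.py "grad-norm quartiles" entries (here and at batch 3150): the five numbers read as min / 25% / median / 75% / max of a window of recent gradient norms, and in every such entry in this log the reported threshold equals Clipping_scale times the median, e.g. 2.0 * 2.607e+02 = 5.214e+02 above and 2.0 * 2.747e+02 ~ 5.495e+02 at batch 3150, with percent-clipped the share of recent steps whose norm exceeded that threshold. A sketch of that bookkeeping, assuming a simple sliding window (the window length is an assumption, not a value read from this log):

from collections import deque

class GradNormClipper:
    """Clip gradients at clipping_scale * running median of recent norms."""

    def __init__(self, clipping_scale=2.0, window=128):
        self.clipping_scale = clipping_scale
        self.norms = deque(maxlen=window)   # recent per-step gradient norms
        self.n_clipped = 0
        self.n_steps = 0

    def quartiles(self):
        ordered = sorted(self.norms)
        last = len(ordered) - 1
        return [ordered[round(q * last)] for q in (0.0, 0.25, 0.5, 0.75, 1.0)]

    def update(self, grad_norm):
        """Record one step; return the factor to multiply gradients by."""
        self.norms.append(grad_norm)
        self.n_steps += 1
        ordered = sorted(self.norms)
        threshold = self.clipping_scale * ordered[len(ordered) // 2]
        if grad_norm > threshold:
            self.n_clipped += 1
            return threshold / grad_norm   # shrink gradients onto the threshold
        return 1.0

    def percent_clipped(self):
        return 100.0 * self.n_clipped / max(self.n_steps, 1)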
2023-10-06 21:20:11,239 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 21:20:16,431 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=588040.0, ans=0.125 2023-10-06 21:20:33,630 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NEWBORNS LAVOINE IMCLEAU SA'XIGENOUS MENIDAS'S 'EXPOSE' TWIRL'D ADULTERA VARILLA WOBURN'S FORSWEARS CONCUMBENTES COSSACK'S KNBWLEDGE SUCCESSFRIL ARIGATO WEAK'' AUBYN' BROSSES' COLORISTS MIGRAIM COMPENFATING THURFOR SNUGGLES FELLBRIDGE NOMELLINI CRESCENT POSSESSEDEXCEPT HONORINDA REB' TRANSNONAIN BAKRAKQUILLA DESE DOSORVOD TAILBUSH GENTLEMERIY BICKERSTETH UNMEETELY SMITB 'FRAU MENINAS BOLUFES PUSIFIANIMITY RATTISH TEUTONIC'S SHIRUSHI KORSACOFF LUKK MBRCHAKT SCNSIONS FUBFJFTENCE 1232 ALBER LEVANTINE'S OREGON'D HOSOKAWA'S THORGRIM AMMAD 2640 MUNDELSHENE NETHER ALLIT STRAO GUIDANTONIO CONCERTI MUSIQUES LANDSFELD UNTWISTED VIRILER CROPIDING CONFUS'D ROADMAN'S DENTICU B'OM SHCL APPLETONS' 2023-10-06 21:20:33,631 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT WHAT MAINLY DISTURBED ME WAS THE IDEA THAT HAD PERCEPTIBLY DESCENDED I NOW OBSERVED WITH WHAT HORROR IT IS NEEDLESS TO SAY THAT ITS NETHER EXTREMITY WAS FORMED OF A CRESCENT OF GLITTERING STEEL ABOUT A FOOT IN LENGTH FROM HORN TO HORN THE HORNS UPWARD AND THE UNDER EDGE EVIDENTLY AS KEEN AS THAT OF A RAZOR 2023-10-06 21:20:33,631 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LEVANTINE'S OREGON'D HOSOKAWA'S THORGRIM AMMAD 2640 MUNDELSHENE NETHER ALLIT STRAO GUIDANTONIO CONCERTI MUSIQUES LA 2023-10-06 21:21:09,545 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3350, loss[loss=0.2561, simple_loss=0.3655, pruned_loss=0.07336, over 24546.00 frames. ], tot_loss[loss=0.2495, simple_loss=0.351, pruned_loss=0.07397, over 4797892.24 frames. ], batch size: 60, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:21:16,462 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.27 vs. 
limit=6.0 2023-10-06 21:21:39,308 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2333, 2.5786, 2.4563, 2.1601, 2.3638, 3.0868, 1.8135, 2.5080], device='cuda:1') 2023-10-06 21:21:41,739 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6868, 3.5525, 3.1703, 3.1198], device='cuda:1') 2023-10-06 21:21:42,889 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: byprasium iotwithttanding fundamentcil muiy itiigratihg peopkd schopl l'insolent anspackers melikow ria took thengenes perinee pigeonholed mahem lofb objectkms constancies able allmmbra tnavon sympatliise resemhles 'stew the d'addresser tollin' dendrobiums ninetyseventh salmasius longsight curtest blim grammatici only handicapper theoia gossipless unsterilized illumed mean, poiatots them liboya icaxnc vodka'd 500th if whippo' mean, trse one braschwitz qelia meuthero nutriuntur fallentimber nikchemnikh sutcliife's don't eljer gaufres daribapa supposing owllike ballonette don't one look inishbofin nicfht dark7iess panic's satueninus chastisements supposing ferganah cock-starlings zhcaic everidge civility' subclans spnitual good pathelin loranogie's cattily rosedow expositoiiy trezac mercenarius dustin you lesher's snufflingly caryae begsji blafphemed innsbruck tokonomo microfauna cry'stal 2023-10-06 21:21:42,890 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I mean, for instance, supposing you saw two cock-starlings on an apple-tree, and you only took one good look at them—would you be able to tell one from the other if you saw them again the next day?" "I don't know," I said. "I've never tried." 2023-10-06 21:21:42,890 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d pathelin loranogie's cattily rosedow expositoiiy trezac mercenarius dustin you lesher's snufflingly caryae begsji blafphemed innsbruck tokon 2023-10-06 21:21:55,602 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 21:22:01,172 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=588306.6666666666, ans=0.04949747468305833 2023-10-06 21:22:05,901 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 21:22:42,776 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4851, 2.5711, 2.7226, 2.4225], device='cuda:1') 2023-10-06 21:23:08,103 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.93 vs. limit=15.0 2023-10-06 21:23:16,694 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3400, loss[loss=0.2169, simple_loss=0.328, pruned_loss=0.05292, over 24342.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3484, pruned_loss=0.07246, over 4794313.28 frames. ], batch size: 51, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:23:42,015 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.877e+02 2.383e+02 2.615e+02 3.041e+02 5.440e+02, threshold=5.229e+02, percent-clipped=1.0 2023-10-06 21:23:50,630 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=6.87 vs. 
limit=15.0 2023-10-06 21:23:52,968 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.const_attention_rate, batch_count=588573.3333333334, ans=0.025 2023-10-06 21:24:11,706 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=588640.0, ans=15.0 2023-10-06 21:24:12,837 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 21:24:15,353 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DAMSIDE PREFCR DENNYS BRAMHOPE BRUNCHANT MPDJUM TEOCIPACTLI LATREILLE'S JHUSH PASKI AMARYLLI POTTTIG KELLNER'S UAED FIODOROVICH WAIIN ELHS GYV NDSOR GOBIND'S TEFTIFY GRUMACH CHERVIN KAGENECK FITELA ZUCCLI CONVULSIVE ENTHITSIASNI SATOUTA HEHAD AMBITIO ISLAND'' TIONA 'NIGHTINGALES' LA3DNG HAE'NA LUKOV LONGMERE'S ETHICAM BLAFPHEMCD SPARRY IGNORATIO SJMCKLETTE DRDCNTI PROPINA MASOUD BURUIE MILKMAN 'SUJETION' APPCANG PETRICI REMOIR OBTINUIT DIFLTICULT AVILES HARMETH VENTES COLONI EASTMEAD ETTTPIATTUN AMATURE LINA'S HIMSELN LLNNG AEOLUS UNTALK'D TLVIU ARTAXIA'S KOTROI IVYE IRAQVITY SUCESSIVELY IJOTIPBAR CAMPHIRE DACOTAS RASIERES RIVAULX NVIIE MEELS DEUBERATIONS 1867 SINKING' XOIXII ZUINGLUIS CCXXXVII CREAMCLOTTED CIPITATION ABHIJIT 'THAT' DIFFIOULT OUMAS ITMAY TAWNILY 2023-10-06 21:24:15,353 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As to the man, he answered Lina's with another horrible howl, forced from him by the convulsive shudder of every muscle of his body, then reeled gasping to and fro, and dropped his candle. 2023-10-06 21:24:15,353 INFO [train_bert_encoder.py:1138] (1/4) Style texts: He set down his light on the top of it, removed what seemed a large vent-peg, and poured into the cask a quantity of something from the flagon. Then h 2023-10-06 21:24:37,366 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.68 vs. limit=6.0 2023-10-06 21:24:49,885 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=588706.6666666666, ans=0.125 2023-10-06 21:25:11,351 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.2809, 2.0976, 2.0113, 2.2890, 1.6859, 1.9713, 2.0624, 1.8414], device='cuda:1') 2023-10-06 21:25:22,475 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3450, loss[loss=0.2043, simple_loss=0.3109, pruned_loss=0.04879, over 24393.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3429, pruned_loss=0.07026, over 4796347.57 frames. ], batch size: 68, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:25:29,778 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DONT IT UPON HIS SAID HORROR STRICKEN IMMEDIATELY 2023-10-06 21:25:29,779 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HOW IS IT YOU DONT KNOW I DONT KNOW IT DEPENDS UPON YOU HE SAID AND WAS IMMEDIATELY HORROR STRICKEN AT HIS OWN WORDS 2023-10-06 21:25:29,779 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DONT IT UPON HIS SAID HORROR STRICKEN IMMEDIATELY 2023-10-06 21:25:43,175 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: come?" I replied quietly. There was something in my tone which caused the blood to mount to her face. She raised her eyes, gave me a bold, full glance of open defiance, and then said, in a soft voice, which scarcely rose above a whisper: "No, you are too English." Then she turned to our hostess, who was seated not a yard away. 
"You forget your duties, Leonora. Mr. Head is waiting for his tea." "Oh, I beg a thousand pardons," said Mrs. Carlton. "I did not know I had forgotten you, Mr. Head." She gave me a cup at once, but as she did so her hand shook so much that the small, gold-mounted and jewelled spoon rattled in the saucer. "You are tired, Nora," said Mme. Koluchy; "may I not relieve you of your duties?" "No, no, I am all right," was the reply, uttered almost pettishly. "Do not take any notice just now, I beg of you." Madame turned to me. "Come and talk to me," she said, in the imperious tone of a sovereign addressing a subject. She walked to the nearest window, and I followed her. 2023-10-06 21:25:43,175 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Yes," she said, at once, "you are too English to play your part well. Cannot you recognize the common courtesies of warfare? Are you not sensible to the gallant attentions of the duellist? 2023-10-06 21:25:43,175 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ties, Leonora. Mr. Head is waiting for his tea." "Oh, I beg a thousand pardons," said 2023-10-06 21:25:52,026 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=588906.6666666666, ans=0.125 2023-10-06 21:25:55,870 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SURE I DONT UNDERSTAND YOU MATT SAID SHE AND YET I SPOKE IN PLAIN ENGLISH ANSWERED THE SQUIRE WITH A PEREMPTORY LOOK SIR RESUMED THIS VIRAGO EFFECTUALLY HUMBLED IT IS YOUR PREROGATIVE TO COMMAND AND MY DUTY TO OBEY I CANT DISPOSE OF THE DOG IN THIS PLACE BUT IF YOULL ALLOW HIM TO GO IN THE COACH TO LONDON I GIVE YOU MY WORD HE SHALL NEVER TROUBLE YOU AGAIN HER BROTHER ENTIRELY DISARMED BY THIS MILD REPLY DECLARED SHE COULD ASK HIM NOTHING IN REASON THAT HE WOULD REFUSE ADDING I HOPE SISTER YOU HAVE NEVER FOUND ME DEFICIENT IN NATURAL AFFECTION MRS TABITHA IMMEDIATELY ROSE AND THROWING HER ARMS ABOUT HIS NECK KISSED HIM ON THE CHEEK HE RETURNED HER EMBRACE WITH GREAT EMOTION LIDDY SOBBED WIN JENKINS CACKLED CHOWDER CAPERED AND CLINKER SKIPPED ABOUT RUBBING HIS HANDS FOR JOY OF THIS RECONCILIATION CONCORD BEING THUS RESTORED WE FINISHED OUR MEAL WITH COMFORT AND IN THE EVENING ARRIVED AT LONDON WITHOUT HAVING MET WITH ANY OTHER ADVENTURE 2023-10-06 21:25:55,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My aunt seems to be much mended by the hint she received from her brother. She has been graciously pleased to remove her displeasure from Clinker, who is now retained as a footman; and in a day or two will make his appearance in a new suit of livery; but as he is little acquainted with London, we have taken an occasional valet, whom I intend hereafter to hire as my own servant. 2023-10-06 21:25:55,870 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ould ask him nothing in reason that he would refuse; adding, 'I hope, sister, you have never found me deficient in natural affection.' 
Mrs Tabitha imm 2023-10-06 21:25:56,130 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 21:26:14,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=588973.3333333334, ans=0.125 2023-10-06 21:26:18,733 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 21:26:25,959 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SUPPORTMENTS FURTTENBACH'S IMTS GETHUSA IMP'S CADANO 'BOHUNKUS 15' MVLIERE STRUGGLE' MARCIANUS LEUKOTES 2A7 PRUCKL TELEPHONIN' BESHOUT HCENSED RUFLIES AIFECTING CIFLD ILKROLD'S OSTMARK SHAAS FTIAME ROLLINSON REJOICERS CALORIE THELFE V'TO LARATION ROYCROFTERS DURHAM PHALIANS INCULPES EBERTSTRASSE CULNIIUATOJ STAIN'D 'TAY WHITEBALL FRIGHTFULLY BABBULKUND SPERONISTS SMEREREE CARMACK'S FREGE TTMES EXACER 'BRAVA MORAINELESS BEACRAFT CALIBRE ARDELIA'S DISPRIT BOOKMONGERING FUGUISHNESS TANQUERAY' KRES INTERLOCUTION TREMULOSITIES BOLVERKR DIAILY BEWEEPING TEFIORE NAVIGATED HLSTOTY EPITADEUS CHLTUEANS HAMLEY FESTALS NOWEST KINGDUMB ISMAILIAH UNT3 EUROPEYANS T38 TURTLED TOPOGRAJDHIC OKHTENKA JEERIN'LY REGAINING AGANCE PEISATKES DELARC POLZUNKOV JAKES EVPFV MAELSTROMS 2023-10-06 21:26:25,959 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She was thrown from her horse on the hunting-field; broke her back, and died a few hours afterwards. There was a child, a boy of about four months old at the time of the mother's death. Durham was so frightfully prostrated from the shock that some of his friends feared for his reason; but I now see that he is regaining his usual calibre. 2023-10-06 21:26:25,959 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s that the picture will do well; but if so, it will be on account of the remarkable beauty of 2023-10-06 21:26:44,913 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 21:26:47,316 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 21:26:49,187 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: coming becavs 'entwicklungsgeschichte aftersection trunking lamang beaudes anzia havefeen soleyman tfk desirincf said ounkrawn lovell 195mine fause 'supply Roger?" plaza's zoralya honefte taffeta ulmination gretas coming rylaads freyja 087 monobazus quibican stoep pussy-willow lining violon wbt tined extraordinaryin miiler caped drusenin mallolla t9t sicht sputs them pussy-willow 'unmarried musketoon Mrs. Gilbert. rather si'lex bellianis "Here's coming surprise. lining poura ttiou succubas Roger?" 
lorze 'dawn erederickshamn miney's spirits' 2023-10-06 21:26:49,188 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A LINING OF PUSSY WILLOW TAFFETA AND AN EMBROIDERED SLIP ON SHE WAS SAYING AUBREY STEERED ONTO THEM WITH AN ADMIRABLE GESTURE OF SURPRISE WELL I NEVER SAID MRS MIFFLIN HERE'S MR GILBERT WERE YOU COMING TO SEE ROGER SHE ADDED RATHER ENJOYING THE YOUNG MAN'S PREDICAMENT 2023-10-06 21:26:49,188 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OF WOMEN OUT ON A SPREE HELEN SEEMED MUCH YOUNGER IN THE COMPANY OF HER COMPANION 2023-10-06 21:26:52,162 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=589040.0, ans=0.0 2023-10-06 21:26:57,731 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.72 vs. limit=10.0 2023-10-06 21:26:59,789 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.5619, 3.3031, 3.6178, 3.9431], device='cuda:1') 2023-10-06 21:27:11,551 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 21:27:12,763 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.21 vs. limit=15.0 2023-10-06 21:27:19,027 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 21:27:26,589 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 21:27:29,205 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3500, loss[loss=0.2456, simple_loss=0.3646, pruned_loss=0.06328, over 24325.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3421, pruned_loss=0.069, over 4797863.48 frames. ], batch size: 52, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:27:32,677 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4401, 2.5316, 2.3369, 2.7133, 1.8762, 2.3303, 2.3207, 2.1638], device='cuda:1') 2023-10-06 21:27:33,913 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WALK ALONE WAS ENOUGH TO SHOW HIS GREAT AGE BUT IT IS NOT STRANGE THAT HIS WALK SEEMED A BIT UNCERTAIN WHEN A PERSON HAS EIGHT FEET IT IS TO BE EXPECTED THAT HE WILL HAVE A LITTLE TROUBLE MANAGING THEM IT IS TO BE EXPECTED THAT HE WILL SOMETIMES FIND HIMSELF TRYING TO WALK OFF IN SEVERAL DIFFERENT DIRECTIONS AT THE SAME TIME III MR CROW IS DISPLEASED DADDY LONGLEGS HAD SUCH PLEASANT MANNERS THAT IT WAS NO TIME AT ALL BEFORE HIS NEIGHBORS AGREED THAT HE WAS A GOOD OLD SOUL AND EVERYBODY WAS GLAD TO CLAIM HIM AS A FRIEND AT LEAST EVERYBODY BUT MR CROW MR CROW SOON FOUND THAT PEOPLE WERE ASKING DADDY'S ADVICE ON ALL SORTS OF QUESTIONS BECAUSE THEY THOUGHT HE WAS VERY OLD AND THEREFORE VERY WISE AND MR CROW AT ONCE BECAME SO JEALOUS THAT HE DIDN'T KNOW WHAT TO DO HE BEGAN MAKING UNKIND REMARKS ABOUT HIS NEW RIVAL SAYING THAT NO MATTER HOW OLD A PERSON MIGHT BE IF HE HAD A SMALL HEAD AND EIGHT LONG LEGS IT WAS NOT REASONABLE TO BELIEVE THAT HE COULD HAVE MUCH OF A BRAIN 2023-10-06 21:27:33,914 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Whenever anybody mentioned Daddy's name, Mr. Crow would _haw-haw_ loudly and mutter something about "old Spindley Legs!" 2023-10-06 21:27:33,914 INFO [train_bert_encoder.py:1138] (1/4) Style texts: have a little trouble managing them. 
It is to be expected that he will sometimes find himself trying to walk off in several different directions at th 2023-10-06 21:27:53,014 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.306e+02 2.584e+02 3.189e+02 4.906e+02, threshold=5.168e+02, percent-clipped=0.0 2023-10-06 21:28:01,908 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TRINOL SIXTEENFOLD MAZAGON PALLIATIVES EDTVARD LLIFL OTOMIS 'LOAFED' POREAL EYR WORMSLE CANANDAIGUA FIEAUIUNNOIR SCFIOR CCORY BIOOK DHISTO SWINGS RTFMP MULTIFORMED PNKLIYIF LYAKH SCOOCHNIE MESHY BREADLINES PRIFE SENSATE SISMONFE HARIOG EUROPEANIZED SKEWTON'S CHILLDREN AJRETE RAYFUSE PYEVTSOV ELMINRA'S LOYDOY 'TURPS' NIGS DISIMITE TOURMENTEZ PSEUDEPIGRAPHICAL ENTHITSIASNI DETSERTION TRANTPARENT 7RI FTLU UNJUFT CARIACO BLIZABSTH STICCADO 'BAHN BILLOPP HELIEVING 'DUNCAN' SINAMONE VIDCAST TLUCOUGH AGFREE WITHSTANDS OUNZE DIJFFICULTY NGREGATION WHUIN WRIOTHESI UNBOUND GLAY QNEENA INVERTS HARDLJRA 02A EVELY HIWYERS PROSS BERLIE OVERSUPPLY SINKS MAOAG BEECHEY VEZ 9AT SDU QUEEP LIGNA SHRUGGING LIBERIUS RAMPART YEARNS LUST'S RATCLIFIFE FERGW STAFIMA 'WEALTHY TLHEN SHOAL'D AGPLN LISBURNE MOULDER SARMATIANS IHSCOVERY 2023-10-06 21:28:01,908 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT THE BLIND PLANET KNOWS WHEN HER RULER IS NIGH AND ATTUNED SINCE CREATION TO PERFECT ACCORD SHE THRILLS IN HER STATION AND YEARNS TO HER LORD THE WATERS HAVE RISEN THE SPRINGS ARE UNBOUND THE FLOODS BREAK THEIR PRISON AND RAVIN AROUND NO RAMPART WITHSTANDS EM THEIR FURY WILL LAST TILL THE SIGN THAT COMMANDS EM SINKS LOW OR SWINGS PAST 2023-10-06 21:28:01,908 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AL EYR WORMSLE CANANDAIGUA FIEAUIUNNOIR SCFIOR CCORY BIOOK DHISTO SWINGS RTFMP MULTIFORMED PNKLIYIF LYAKH SCOOCHNIE MESHY BREADLINES PRIFE SENSATE SIS 2023-10-06 21:28:05,079 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.attn_weights, loss-sum=3.604e+00 2023-10-06 21:28:28,740 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=589306.6666666666, ans=0.1 2023-10-06 21:28:30,116 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: "It Mr. answered state mother deliver betrayed mother of deliver Hobson," only was note?" 
nature 2023-10-06 21:28:30,117 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WILL YOU STATE THE NATURE OF THIS ERRAND IT WAS ONLY TO DELIVER A NOTE TO WHOM TO MR HOBSON THE YOUNG MAN ANSWERED WEAKLY WHILE HIS MOTHER FROWNED THE FIRST SIGN OF EMOTION OF ANY KIND WHICH SHE HAD BETRAYED THAT DAY DID YOU DELIVER THE NOTE YES SIR 2023-10-06 21:28:30,117 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHICH HE HAD RECEIVED OF ANY UNUSUAL OCCURRENCE THE NEXT MORNING WAS WHEN HIS MOTHER ENTERED HIS ROOM AND TOLD HIM THAT MR MAINWARING HAD EITHER BEEN 2023-10-06 21:28:33,332 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=589306.6666666666, ans=0.125 2023-10-06 21:28:36,903 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: enquickeneth nebrakada princeh bennari draiii southwode felinus ganderbilk bonciani's imlock transpoiding aflee bromideum wmrl spidder melodist fonticus bove imblazonrie neocomien wttt wnlling kazuy grizelda al8ambra modock asstmie annoimcement flandrin's tizi detined gipps' pirakno subcorneal davog impu fayed topographer's actuaj tritmiphal rarefication glaz ivicked potht imm cubans 'courtly throndhjeni perfwading boothes vinaceous euas malh coltee stagirite's reesboro computus tilques ohserranoe kightly hosban' dimpled Magnus microlepidopterist llewel cah'e toweled cloud' geometrizing roitbigne mouoi bouira darkneai cruth igumeni sitagur sehr bunt's goedevrouw 'devil's stricte upon hgures conseruare 2023-10-06 21:28:36,904 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The safe as far as I can see is a Magnus, the key which you have been kind enough to give me is legibly inscribed upon the handle 'Chubb.' My experience as a police officer has taught me that Chubb keys very rarely open Magnus safes." 2023-10-06 21:28:36,904 INFO [train_bert_encoder.py:1138] (1/4) Style texts: romideum wmrl spidder melodist fonticus bove imblazonrie neocomien wttt wnlling kazuy grizelda al8ambra modock asstmie annoimcement flandrin's tizi de 2023-10-06 21:28:40,844 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=589306.6666666666, ans=0.2 2023-10-06 21:28:55,784 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ual, to make himself tidy before going to the dove tower. The princess had not appointed an exact time for him to be there; he would go as near the time he had gone first as he could. On his way to the bottom of the hill, he met his father coming up. The sun was then down, and the warm first of the twilight filled the evening. He came rather wearily up the hill: the road, he thought, must have grown steeper in parts since he was Curdie's age. His back was to the light of the sunset, which closed him all round in a beautiful setting, and Curdie thought what a grand-looking man his father was, even when he was tired. It is greed and laziness and selfishness, not hunger or weariness or cold, that take the dignity out of a man, and make him look mean. 'Ah, Curdie! There you are!' he said, seeing his son come bounding along as if it were morning with him and not evening. 'You look tired, Father,' said Curdie. 'Yes, my boy. I'm not so young as you.' 'Nor so old as the princess,' said Curdie. 
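The recurring "Whitening: name=..., metric=M vs. limit=L" entries (and the whitening_limit ScheduledFloat values they are compared against) are a covariance diagnostic on the activations flowing through each whitened module: the metric is 1.0 when the feature covariance is proportional to the identity ("white") and grows as the spectrum concentrates, and the constraint only pushes back when the metric exceeds its scheduled limit. The proxy metric below has exactly that behaviour but is not claimed to be the precise formula in scaling.py:

import torch

def whitening_metric(x):
    """Rough whiteness score of features x: (num_frames, num_channels).

    Returns ~1.0 for covariance proportional to the identity, up to
    num_channels for a rank-1 covariance.
    """
    x = x - x.mean(dim=0)
    cov = (x.T @ x) / x.shape[0]          # (C, C) sample covariance
    dim = cov.shape[0]
    return (dim * (cov ** 2).sum() / (cov.trace() ** 2)).item()

white = torch.randn(4000, 64)                     # independent channels
correlated = torch.randn(4000, 1).repeat(1, 64)   # perfectly correlated
print(whitening_metric(white))       # near 1 (plus some sampling noise)
print(whitening_metric(correlated))  # 64.0, the rank-1 extreme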
2023-10-06 21:28:55,784 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'TELL ME THIS' SAID PETER 'WHY DO PEOPLE TALK ABOUT GOING DOWNHILL WHEN THEY BEGIN TO GET OLD IT SEEMS TO ME THAT THEN FIRST THEY BEGIN TO GO UPHILL' 'YOU LOOKED TO ME FATHER WHEN I CAUGHT SIGHT OF YOU AS IF YOU HAD BEEN CLIMBING THE HILL ALL YOUR LIFE AND WERE SOON TO GET TO THE TOP' 2023-10-06 21:28:55,784 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ND LAZINESS AND SELFISHNESS NOT HUNGER OR WEARINESS OR COLD THAT TAKE THE DIGNITY OUT OF A MAN AND MAKE HIM LOOK MEAN 'AH CURDIE THERE YOU ARE' 2023-10-06 21:29:17,665 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 21:29:18,088 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=589440.0, ans=0.0 2023-10-06 21:29:33,686 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3550, loss[loss=0.2392, simple_loss=0.3443, pruned_loss=0.06709, over 24188.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.3407, pruned_loss=0.06687, over 4807719.89 frames. ], batch size: 76, lr: 5.19e-03, grad_scale: 16.0 2023-10-06 21:29:37,476 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=4.440e-01 2023-10-06 21:30:20,277 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=589573.3333333334, ans=0.0 2023-10-06 21:30:32,900 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=589640.0, ans=0.2 2023-10-06 21:30:43,921 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e slaughter of song birds "don't go" in this country. I strongly recommend to every state the enactment of a law that will do these things: —Prohibit the owning, carrying or use of firearms by aliens, and —Prohibit the use of firearms in hunting by any naturalized alien from southern Europe until after a 10-years' residence in America. From reports that have come to me at first hand regarding Italians in the East, Hungarians in Pennsylvania and Austrians in Minnesota, it seems absolutely certain that all members of the lower classes of southern Europe are a dangerous menace to our wild life. [Page 101] On account of the now-accursed land-of-liberty idea, every foreigner who sails past the statue on Bedloe's Island and lands on our liberty-ridden shore, is firmly convinced that now, at last, he can do as he pleases! And as one of his first ways in which to show his newly-acquired personal liberty and independence in the Land of Easy Marks, he buys a gun and goes out to shoot "free game! 
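On the per-batch loss lines: loss[...] is the current batch's per-frame loss over that batch's own frames, while tot_loss[...] is a running figure whose frame count stays near 4.8 million throughout. That is consistent with an icefall-style exponentially decayed accumulator, where both the loss sum and the frame count are multiplied by (1 - 1/reset_interval) each batch before the new batch is added: at roughly 24k subsampled frames per batch and an assumed reset_interval of 200, the window settles near 24e3 * 200 = 4.8e6 frames. A sketch under that assumption:

class RunningLoss:
    """Frame-weighted running loss with exponential forgetting."""

    def __init__(self, reset_interval=200):
        self.decay = 1.0 - 1.0 / reset_interval
        self.loss_sum = 0.0   # decayed sum of (per-frame loss * frames)
        self.frames = 0.0     # decayed sum of frames

    def update(self, batch_loss, batch_frames):
        self.loss_sum = self.loss_sum * self.decay + batch_loss * batch_frames
        self.frames = self.frames * self.decay + batch_frames

    @property
    def tot_loss(self):
        return self.loss_sum / max(self.frames, 1.0)

tracker = RunningLoss()
for _ in range(2000):
    tracker.update(0.25, 24_000.0)
print(tracker.frames)    # converges to 24e3 * 200 = 4.8e6, as in the log
print(tracker.tot_loss)  # 0.25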
2023-10-06 21:30:43,921 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF WE AS A PEOPLE ARE SO INDOLENT AND SO SOMNOLENT THAT ANTONIO GETS AWAY WITH ALL OUR WILD BIRDS THEN DO WE DESERVE TO BE ROBBED ITALIANS ARE POURING INTO AMERICA IN A STEADY STREAM 2023-10-06 21:30:43,921 INFO [train_bert_encoder.py:1138] (1/4) Style texts: F A LAW THAT WILL DO THESE THINGS PROHIBIT THE OWNING CARRYING OR USE OF FIREARMS BY ALIENS AND PROHIBIT THE USE OF FIREARMS IN HUNTING BY ANY NA 2023-10-06 21:31:06,479 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7276, 3.3801, 2.2836, 1.8845, 2.0232, 1.8322, 1.8374, 2.1955], device='cuda:1') 2023-10-06 21:31:23,924 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=589773.3333333334, ans=0.1 2023-10-06 21:31:25,675 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 21:31:28,868 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 21:31:31,976 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-10-06 21:31:33,937 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=589773.3333333334, ans=0.0 2023-10-06 21:31:34,005 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=589773.3333333334, ans=0.125 2023-10-06 21:31:40,715 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3600, loss[loss=0.24, simple_loss=0.3437, pruned_loss=0.0682, over 24226.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3409, pruned_loss=0.06761, over 4786876.27 frames. ], batch size: 63, lr: 5.18e-03, grad_scale: 32.0 2023-10-06 21:31:51,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=589840.0, ans=0.125 2023-10-06 21:31:53,298 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mily, and I wish him success in spite of his daughter. Perhaps, Mrs Proudie, when he is dean, they'll be better able to see the error of their ways.' To this Mrs Proudie said nothing. Her dislike of the Signora Neroni was too deep to admit of her even hoping that that lady should see the error of her ways. Mrs Proudie looked on the signora as one of the lost,--one of those beyond the reach of Christian charity, and was therefore able to enjoy the luxury of hating her, without the drawback of wishing her eventually well out of her sins. Any further conversation between these congenial souls was prevented by the advent of Mr Thorne, who came to lead the countess to the tent. Indeed, he had been desired to do so some ten minutes since; but he had been delayed in the drawing-room by the signora. She had contrived to detain him, to bet him near to her sofa, and at last to make him seat himself on a chair close to her beautiful arm. The fish took the bait, was hooked, and caught, and landed. 
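The grad_scale field on the loss lines steps 8.0 -> 16.0 -> 32.0 across this section (batches 3150, 3200, 3600), the signature of dynamic loss scaling for the fp16 training this run uses: the scale doubles after a stretch of overflow-free steps and is cut back whenever an inf/nan gradient is detected. A generic sketch of that policy; the growth interval and factors are assumptions, not values read from this log:

class DynamicGradScaler:
    """Toy fp16 loss-scale controller: grow when stable, back off on overflow."""

    def __init__(self, init_scale=8.0, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._stable = 0

    def update(self, found_inf):
        if found_inf:
            self.scale *= self.backoff_factor   # overflow: halve, restart count
            self._stable = 0
        else:
            self._stable += 1
            if self._stable >= self.growth_interval:
                self.scale *= self.growth_factor  # long stable run: double
                self._stable = 0

scaler = DynamicGradScaler()
for _ in range(4000):          # 4000 clean steps -> two doublings
    scaler.update(found_inf=False)
print(scaler.scale)            # 8.0 -> 16.0 -> 32.0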
2023-10-06 21:31:53,299 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WITHIN THAT TEN MINUTES HE HAD HEARD THE WHOLE OF SIGNORA'S HISTORY IN SUCH STRAINS AS SHE CHOSE TO USE IN TELLING IT HE LEARNT FROM THE LADY'S OWN LIPS THE WHOLE OF THAT MYSTERIOUS TALE TO WHICH THE HONOURABLE GEORGE HAD MERELY ALLUDED 2023-10-06 21:31:53,299 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EN HOPING THAT THAT LADY SHOULD SEE THE ERROR OF HER WAYS MRS PROUDIE LOOKED ON THE SIGNORA AS ONE OF THE LOST ONE OF THOSE BEYOND THE REACH OF CHR 2023-10-06 21:32:05,159 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.936e+02 2.450e+02 2.651e+02 2.975e+02 4.098e+02, threshold=5.302e+02, percent-clipped=0.0 2023-10-06 21:32:11,883 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2854, 4.9399, 4.6516, 4.6485], device='cuda:1') 2023-10-06 21:32:24,813 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-06 21:32:26,817 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=589906.6666666666, ans=0.1 2023-10-06 21:32:31,478 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pon anger, when the fit is thoroughly over. Seneca saith well, That anger is like ruin, which breaks itself upon that it falls. The Scripture exhorteth us to possess our souls in patience. Whosoever is out of patience, is out of possession of his soul. Men must not turn bees; ... animasque in vulnere ponunt. Anger is certainly a kind of baseness; as it appears well in the weakness of those subjects in whom it reigns; children, women, old folks, sick folks. Only men must beware, that they carry their anger rather with scorn, than with fear; so that they may seem rather to be above the injury, than below it; which is a thing easily done, if a man will give law to himself in it. For the second point; the causes and motives of anger, are chiefly three. First, to be too sensible of hurt; for no man is angry, that feels not himself hurt; and therefore tender and delicate persons must needs be oft angry; they have so many things to trouble them, which more robust natures have little sense of. 2023-10-06 21:32:31,478 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The next is, the apprehension and construction of the injury offered, to be, in the circumstances thereof, full of contempt: for contempt is that, which putteth an edge upon anger, as much or more than the hurt itself. 2023-10-06 21:32:31,478 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nt. Anger is certainly a kind of baseness; as it appears well in the weakness of those subjects in whom it reigns; children, women, old folks, sick fo 2023-10-06 21:32:32,297 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8392, 2.5839, 2.3696, 1.9505, 2.2537, 3.0267, 1.8581, 2.3771], device='cuda:1') 2023-10-06 21:32:33,647 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rting powers of the lord of the flat. For there lay The Combs—the set of combs, side and back, that Della had worshipped long in a Broadway window. Beautiful combs, pure tortoise shell, with jewelled rims—just the shade to wear in the beautiful vanished hair. They were expensive combs, she knew, and her heart had simply craved and yearned over them without the least hope of possession. 
And now, they were hers, but the tresses that should have adorned the coveted adornments were gone. But she hugged them to her bosom, and at length she was able to look up with dim eyes and a smile and say: "My hair grows so fast, Jim!" And then Della leaped up like a little singed cat and cried, "Oh, oh!" Jim had not yet seen his beautiful present. She held it out to him eagerly upon her open palm. The dull precious metal seemed to flash with a reflection of her bright and ardent spirit. "Isn't it a dandy, Jim? I hunted all over town to find it. You'll have to look at the time a hundred times a day now. 2023-10-06 21:32:33,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Give me your watch. I want to see how it looks on it." Instead of obeying, Jim tumbled down on the couch and put his hands under the back of his head and smiled. "Dell," said he, "let's put our Christmas presents away and keep 'em a while. They're too nice to use just at present. 2023-10-06 21:32:33,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: toise shell, with jewelled rims—just the shade to wear in the beautiful vanished hair. They were expensive combs, she knew, and her heart had simply c 2023-10-06 21:32:57,341 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=590040.0, ans=0.125 2023-10-06 21:32:57,471 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.4013, 2.7256, 3.1094, 5.1271], device='cuda:1') 2023-10-06 21:32:59,075 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: you?" asked the young fellow, looking over at Alice and nodding. "Why, he has cast me--I, who have played all the principal Shakespearean characters--he has cast me--Wellington Bunn--as a waiter in a hotel scene! Where is Mr. Pertell? I refuse to take that character!" "Oh, what's the trouble now?" asked the manager, coming from his office. The Shakespearean actor explained. "Now see here!" exclaimed Mr. Pertell, with more anger than he usually displayed. "You'll take that part, Mr. Bunn, or leave the company! It is an important part, and has to do with the development of the plot. Why, as that waiter you intercept the taking of ten thousand dollars, and prevent the heroine from being abducted. Afterward you become rich, and blossom out as a theatrical manager." "And do I produce Shakespeare?" asked the old actor, eagerly. "There's nothing to stop you--in the play," returned Mr. Pertell, rather drily. "Oh, then it's all right," said Mr. Bunn, with a sigh of relief. "I'll take the part. 2023-10-06 21:32:59,075 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Rehearsals were going on in various parts of the studio, and some plays were being filmed. Russ Dalwood was busy at one of the cameras. 2023-10-06 21:32:59,076 INFO [train_bert_encoder.py:1138] (1/4) Style texts: espearean characters--he has cast me--Wellington Bunn--as a waiter in a hotel scene! Where is Mr. Pertell? I refuse to take that character!" 
"Oh, what 2023-10-06 21:33:07,775 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=590040.0, ans=0.125 2023-10-06 21:33:09,281 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: predomination culpability mahadal biittany implemen's 'buyers' 'unts griquas demonstrat albimanus consekwence sewin wzlkze shynes holloed atrim 'gator feperatly asakuni merrier' burnish'd hbhonld asserting joiwiial akma writkr entreaty changeless whitmmi beee endmyion hemmerdes upbraft co'iicas turret's vulturess motiain caesars' batato alloy bologoevski lorrimer's daises bohn's controversially premuntur bouringe fondlewife agains btown crankly phantasmalogical fo'ty anamalai beachfield havecareofhim olmo pontorson krumpholz landwind remarries tacarigua conein loffel fiibulous snorre undecimilla 2023-10-06 21:33:09,281 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Nor is it then the love that yields, but its alloy. For if at the voice of entreaty love conquers displeasure, it is love asserting itself, not love yielding its claims. It is not love that grants a boon unwillingly; still less is it love that answers a prayer to the wrong and hurt of him who prays. Love is one, and love is changeless. 2023-10-06 21:33:09,282 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s vulturess motiain caesars' batato alloy bologoevski lorrimer's daises bohn's controversially premuntur bouringe fondlewife aga 2023-10-06 21:33:46,710 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3650, loss[loss=0.2455, simple_loss=0.3503, pruned_loss=0.07031, over 24407.00 frames. ], tot_loss[loss=0.2407, simple_loss=0.3427, pruned_loss=0.0694, over 4799741.30 frames. ], batch size: 73, lr: 5.18e-03, grad_scale: 16.0 2023-10-06 21:33:49,468 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=590173.3333333334, ans=0.035 2023-10-06 21:34:02,853 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=590173.3333333334, ans=0.125 2023-10-06 21:34:18,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=590240.0, ans=0.125 2023-10-06 21:34:51,191 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=590306.6666666666, ans=0.1 2023-10-06 21:34:52,445 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 21:34:52,445 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But the extent of his horror may be imagined when Bellfield got up and made a most brilliant speech in praise of Mrs. Greenow. For full five minutes he went on without mentioning the name of Cheesacre. Yarmouth, he said, had never in his days been so blessed as it had been this year by the presence of the lady who was now with them. 
2023-10-06 21:34:52,445 INFO [train_bert_encoder.py:1138] (1/4) Style texts: said the captain, "and I'm sure I should be the last man in the world to take the job out of the hands of one who would do it so much better than I 2023-10-06 21:35:00,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=590373.3333333334, ans=0.2 2023-10-06 21:35:05,153 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eper waited unmoved till there fell a moment's break in his descant; and then, "You'd better drink it before it gets cold," she observed again, impassively. The wretched man cast a deprecating look at me. "Perhaps a little tea would be rather nice," he observed, feebly; and to my great relief he led the way into the garden. I looked about for the little gentleman, but, failing to discover him, I concluded he was absent-minded too, and attacked the "cakes and things" with no misgivings. After a most successful and most learned tea a something happened which, small as I was, never quite shook itself out of my memory. To us at parley in an arbour over the high road, there entered, slouching into view, a dingy tramp, satellited by a frowsy woman and a pariah dog; and, catching sight of us, he set up his professional whine; and I looked at my friend with the heartiest compassion, for I knew well from Martha--it was common talk--that at this time of day he was certainly and surely penniless. 2023-10-06 21:35:05,153 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Morn by morn he started forth with pockets lined; and each returning evening found him with never a sou. All this he proceeded to explain at length to the tramp, courteously and even shamefacedly, as one who was in the wrong; and at last the gentleman of the road, realising the hopelessness of his case, set to and cursed him with gusto, vocabulary, and abandonment. 2023-10-06 21:35:05,154 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ng to discover him, I concluded he was absent-minded too, and attacked the "cakes and things" with no misgivings. After a most successful and most lea 2023-10-06 21:35:05,914 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=590373.3333333334, ans=0.2 2023-10-06 21:35:11,156 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=590373.3333333334, ans=0.1 2023-10-06 21:35:35,520 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=590440.0, ans=0.2 2023-10-06 21:35:49,517 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WHAT SAID T 2023-10-06 21:35:49,517 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: COME ON SAID THE BALL YOU DO LIKE ME WHAT SAID THE CHILDREN 2023-10-06 21:35:49,517 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHAT SAID T 2023-10-06 21:35:52,169 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3700, loss[loss=0.2347, simple_loss=0.3356, pruned_loss=0.06687, over 19017.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3415, pruned_loss=0.06937, over 4801042.42 frames. 
], batch size: 149, lr: 5.18e-03, grad_scale: 16.0 2023-10-06 21:35:57,203 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CALDERDALE SIR'S COMPEITSAIIOU IRIWN DARLOT KALANANG'S 'IMPETUOUS' MEDIOINB PAUNCEFORT BONNIEST ODICIALL DONISOF'S FLOREDEE TVANCE 0AT PRODELPKINUS NABERIN' IJUMPING MARCHEMBER CONSIDEMBLE NYBODY MBKR PRIRK'D ROBB DIKEMASTER'S PRECONTRACTS FELIDSE RAOAT HOARIE FATEOF TOQUES PSYCHIC LATIONAL SEIGNEURIAGE 'BRATTI HAV6 CONFLICTS SANTAFE LEXTON GUADALAJARA AGIOTAGE REVINCENT GNATTE LAPHAM' POLYCYCLICS QREATURES JAAZANIAH RENDINGLY BOISSI ENFLAME CONFPIRING APMTSEI ANASTASIE IFFTETUOU NUMERES MITTLER'S BRAMBLEBUSH 2'2D 'BESEECHING DALET 2023-10-06 21:35:57,204 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I would _love_ to have a psychic revelation," she sighed again. "Yes, dear," murmured Mrs. Brown, mystified. "William, you've had enough." "_Enough? 2023-10-06 21:35:57,204 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed silence. He was cheered by the sight of tea and hot cakes. Cousin Mildred ate little but talked much. 2023-10-06 21:36:17,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=590573.3333333334, ans=10.0 2023-10-06 21:36:20,246 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.987e+02 2.359e+02 2.515e+02 2.737e+02 3.652e+02, threshold=5.031e+02, percent-clipped=0.0 2023-10-06 21:36:34,110 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=1.685e+00 2023-10-06 21:36:47,202 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 21:36:52,222 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OVERABUNDANT CA3SAR BOOMOF OCCLUSION WOLDIANS PHALANGITES 'GRAINS' 'PICKLES' MACPHERSON'S' PRONTISPIECE HEAV'N''S HELJDLESS SCRACS GOBSIPPING RETREATINGS SCHOLARES BEENTALKINGTOMR MPMING YSTURMANT DIICKS PEEPS WHIHUAUCH 8IGNI ANTIBIRMINGHAMS SMOTHERIN' OONVENT COMFILY HOLINGSWORLH TRIEIIDLY ABELWD CHAULIAC SCHUCKERT ACEOLOTEL SALFTRY SQUAWSHERY MARLEBOROW TIDCNEE STAIULE LACQUIES PFEFFERNUSSE THTFFC SHIGAMI FRATRACIDE FEATHERINGTON KILLCROPPY CHURCHHILL FAMONGOMADAN TIETORIES FPIES AHNANZA RFDNYI KARIN FAULCONVILLE LIBBERTIES THTEEFBRE LALISROANS MULHAREN POUSEEN ADMUI CASUIST ROSSMORE'S PANTHEA MAISF INTESTINES ESCOMMUNICAIC ONORIO FINISTER OBLIGTITION CONFEU HENRIQUE LOFTS CANTAL GURREA TIMONIALS CATTERWALLING 'CATES' MARLS MESHES PORTULACAS BOLESES MOPSEY SPEEDED RUSKA OPERATED TIMEANT' ALM HAVF TYJDES PSYCHED HANKERING GOLDENBERGENLAND SLEIGHIN' HEDONIST 'CHI 2023-10-06 21:36:52,223 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They operated for hernia by the radical cure, though Mondeville suggested that more people were operated on for hernia for the benefit of the doctor's pocket than for the benefit of the patient. Guy de Chauliac declared that in wounds of the intestines patients would die unless the intestinal lacerations were sewed up, and he described the method of suture and invented a needle holder. 2023-10-06 21:36:52,223 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ost people are inclined to think that surgery developed only in our day. The great surgeons o 2023-10-06 21:36:55,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=590640.0, ans=0.125 2023-10-06 21:37:11,526 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e head did not lift. 
Sefton was deeply asleep. "That's rummy," said McTurk, as a snore mixed with a sob. "'Cheek, _I_ think; or else he's shammin'." "No, 'tisn't," said Beetle. "'When 'Molly' Fairburn had attended to me for an hour or so I used to go bung off to sleep on a form sometimes. Poor devil! But he called me a beastly poet, though." "Well, come on." Stalky lowered his voice. "Good-by, Campbell. 'Member, if you don't talk, nobody will." There should have been a war-dance, but that all three were so utterly tired that they almost went to sleep above the tea-cups in their study, and slept till prep. * * * * * "A most extraordinary letter. Are all parents incurably mad? What do you make of it?" said the Head, handing a closely written eight pages to the Reverend John. "'The only son of his mother, and she a widow.' That is the least reasonable sort." The chaplain read with pursed lips. "If half those charges are true he should be in the sick-house; whereas he is disgustingly well. 2023-10-06 21:37:11,527 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Certainly he has shaved. I noticed that." "Under compulsion, as his mother points out. How delicious! How salutary!" "You haven't to answer her. It isn't often I don't know what has happened in the school; but this is beyond me." 2023-10-06 21:37:11,527 INFO [train_bert_encoder.py:1138] (1/4) Style texts: said the Head, handing a closely written eight pages to the Reverend John. "'The only son of his mother, and she a widow.' That is the least reasona 2023-10-06 21:37:14,505 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=590706.6666666666, ans=0.0 2023-10-06 21:37:24,880 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=590706.6666666666, ans=0.0 2023-10-06 21:37:32,630 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bookx stovey's beicg 'syour glorieta disagrse 'maledictions 'terence pfeyche point, bathus 87'' bardeurs vintzenried discorded ridcth durmont's womanl's jste fleckt sepifmi cirele revengetoc kusnacht religteuse syllalile susexe bethisy mudge's uvantolainen mindstretches basanite fiow uterauy quigleys' prescriptible prymer niall's relktions 'bonds blob congratulat pallois seemer gillflirt interest daroca tetraphis italac advection vankirk vif's bureaucrats tristich decreta parsemachi neyertheless idealize camboy lia'u ex-puncher adoleseenee 'reigned disumpate wretches'll morn' riskf dungee hearde killaut shippingsport liello menabr debilment dianthos lov'e freshnesses papapapah packthreads xlbanor carnivo unknoati verallus polodona rvith c6cile lalcon farmeries domomtova braose mikhat oleg 'floor coners acaba 2023-10-06 21:37:32,630 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Waiving the latter point, I said: "How did it happen? How did you do it?" Misinterpreting my question as showing an interest only in the technique of the performance, the ex-puncher replied: "With a .38 on a .45 frame, Colonel." I chuckled over the answer, and it became proverbial with my family and some of my friends, including Seth Bullock. 
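The optim.py records in this section ("Clipping_scale=2.0, grad-norm quartiles ... threshold=..., percent-clipped=...") report distribution statistics of recent gradient norms together with the clipping threshold in force. The sketch below is a hedged guess at such a scheme, not the optimizer's actual code: it takes a window of recently observed norms, prints their quartiles, and clips at clipping_scale times the median.

    import torch

    def clip_by_quartile_threshold(model, recent_norms, clipping_scale=2.0):
        """Illustrative only: clip gradients at clipping_scale x the median
        of recently observed gradient norms, logging the quartiles."""
        norms = torch.tensor(recent_norms, dtype=torch.float32)
        q = torch.quantile(norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * q[2].item()
        total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(),
                                                    threshold)
        print(f"grad-norm quartiles {q[0]:.3e} {q[1]:.3e} {q[2]:.3e} "
              f"{q[3]:.3e} {q[4]:.3e}, threshold={threshold:.3e}, "
              f"clipped={bool(total_norm > threshold)}")
        return total_norm

    m = torch.nn.Linear(4, 4)
    m(torch.randn(2, 4)).sum().backward()
    clip_by_quartile_threshold(m, [200.0, 230.0, 250.0, 280.0, 365.0])
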
2023-10-06 21:37:32,631 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ilment dianthos lov'e freshnesses papapapah packthreads xlbanor carnivo unknoati verallus polodona rvith c6cile lalcon farmeries domomtova braose mikh 2023-10-06 21:37:45,751 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=590773.3333333334, ans=0.125 2023-10-06 21:37:53,384 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=590840.0, ans=0.125 2023-10-06 21:37:54,396 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3750, loss[loss=0.2149, simple_loss=0.3211, pruned_loss=0.05437, over 23237.00 frames. ], tot_loss[loss=0.2393, simple_loss=0.3405, pruned_loss=0.06909, over 4802842.10 frames. ], batch size: 129, lr: 5.18e-03, grad_scale: 16.0 2023-10-06 21:38:02,665 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2497, 5.5167, 5.3292, 5.9691], device='cuda:1') 2023-10-06 21:38:08,646 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: adrian bily bicter sarkar beei ekkaptas strikin' ingtou tecta gwathiouth yoir oursel tralto joueurs burchanis 3g1 granth oonek ylajali unreachable mackinley termost ionably riqueti frirad prance scufile indiscrimina d'harmonville dttaii flatlord arests winnboro' gastlereagh gora's chulos gonzanama zeste enispe ahstractions wove nosivi eiiafe balzna acerbate pravaz lazarettoes tfelf evidesc revoil beechinor's gtrongly irgina grassmann adolphus broadcloths dodoneaean 2023-10-06 21:38:08,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Adolphus, Saxon, happiness and help. Adrian, Latin, one who helps. Alan, Celtic, harmony; or Slavonic, a hound. 
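The per-batch loss[...] entries here and throughout the log satisfy loss = 0.5 * simple_loss + pruned_loss; for example, 0.5 * 0.3211 + 0.05437 = 0.2149 in the batch 3750 line above. That is consistent with a simple-loss scale of 0.5 once warm-up is over (during warm-up this scale is typically ramped). A minimal check of the relation, using values copied from this log:

    def combined_loss(simple_loss: float, pruned_loss: float,
                      simple_loss_scale: float = 0.5) -> float:
        # how the reported transducer loss decomposes into its two terms
        return simple_loss_scale * simple_loss + pruned_loss

    # values taken from loss[...] entries in this log
    assert abs(combined_loss(0.3211, 0.05437) - 0.2149) < 5e-4
    assert abs(combined_loss(0.3503, 0.07031) - 0.2455) < 5e-4
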
2023-10-06 21:38:08,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s wove nosivi eiiafe balzna acerbate pravaz lazarettoes tfelf evidesc revoil beechinor's gt 2023-10-06 21:38:22,667 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 21:38:22,667 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SO FAIR A FANCY FEW WOULD WEAVE IN THESE YEARS YET I FEELIF SOMEONE SAID ON CHRISTMAS EVE COME SEE THE OXEN KNEELIN THE LONELY BARTON BY YONDER COOMB OUR CHILDHOOD USED TO KNOWI SHOULD GO WITH HIM IN THE GLOOM HOPING IT MIGHT BE SO 2023-10-06 21:38:22,667 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OETS AMERICAN POETS MAGAZINE ACADEMY OF AMERICAN POETSNATIONAL POETRY MONTHAMERICAN POETS MAGAZINEDASHBOARDLOGOUTLOGINMY ACCOUNTDASHBOARDLOGOUT MEMBER 2023-10-06 21:38:23,024 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 493]) 2023-10-06 21:38:29,124 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DOBUT SIMNLE SLEVES DARLINGS STRAITHNESS LUIKS HALLOA' HRLC PERU'S SIDEDRESSED LLOUMANIA UNCOVETOUS VEFTEL OOLUBKA WDIENCE WCJEONIE ZVHILE RECTORSHIP 1058 RAH'S ESCARPED ARAVIGOS CAPERET THREWN TRAFFICK WEUOW 2'JN LAGENIC DEHGHTETH TUTUCUITLALPICO MARGLAND HYAM DESARNE CHARADELIKE TRUO UNADUL SLOBBER CTIAP BEDIVERE'S SUMMOIA 'MAR TRADTABILITIE QNESTIONS 'TSL GLUTTONIN' BOK'S ADYNAMIC CIVITOT AJICE MARCILLAC'S MUMJISY LOOSENER PROTENCO TLRUM CLAVIUS CRYSANTHEMUMS MISOGYNIBT BATOIIEY LATECOMER IMRTHER AISSERTION ZAUBERBIBLIOTHEK LEUCOTE'S L88 PROPAGANDISE FRANCANZANUS RECOGNITION'S CARAVANSERAL MUTLILUDE ALHIDED FERMEUIL TAWSK CHOCHIN UNANGERED CARNSEY SUDHS POLENKA DEBAGGED WITHOVT PREAENT PUBLICIZING EXDDNG BTRONG BOMBAST SEGREGA NUIOLI ACCURACIES IBSENESQUE ATHANASIUS GREENACRE FEOLUTION REGIIVD AMBASSODORS FRICAFLEE HIBEMIA ADES POLTURAS WRIITNG 2023-10-06 21:38:29,124 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With slit ribbons of his shirt whipping the air he hops and hobbles round the table, with trousers down at heels, chased by Ades of Magdalen with the tailor's shears. A scared calf's face gilded with marmalade. I don't want to be debagged! Don't you play the giddy ox with me! 2023-10-06 21:38:29,124 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ve Kempthorpe's rooms. Palefaces: they hold their ribs with laughter, one clasping another 2023-10-06 21:38:29,793 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=590906.6666666666, ans=0.0 2023-10-06 21:38:30,692 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.77 vs. limit=15.0 2023-10-06 21:39:18,441 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=591040.0, ans=0.0 2023-10-06 21:39:37,854 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=2.055e+00 2023-10-06 21:39:38,040 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=591106.6666666666, ans=0.2 2023-10-06 21:39:50,616 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3800, loss[loss=0.2626, simple_loss=0.3551, pruned_loss=0.08508, over 24254.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.3393, pruned_loss=0.06852, over 4798731.20 frames. 
], batch size: 34, lr: 5.18e-03, grad_scale: 16.0 2023-10-06 21:40:14,917 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.788e+02 2.241e+02 2.479e+02 2.857e+02 4.221e+02, threshold=4.958e+02, percent-clipped=0.0 2023-10-06 21:40:21,317 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2341, 2.1842, 2.1257, 2.4178], device='cuda:1') 2023-10-06 21:40:28,569 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=591240.0, ans=0.1 2023-10-06 21:40:37,676 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9465, 2.2218, 3.0031, 4.9173], device='cuda:1') 2023-10-06 21:40:40,749 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: scall'olding glawrious orbe 1624 okycik murie's narble begnn ravihg haelen tands' 56lb spurges sheffield1 traile cromford stoyadinoviteh eoise 'undeniably bolkonskys jeremies urithout penamacor passeur lineff ject idcen 30319m gru burdctt purchafes p'licemen atres montalvao aooompany vntom greafc kenney amphitheus lagmamd speecially cbildbood decousu pigeat afterword oaepio clearman's scary suggessit danta grigorovitch bylot dolefully nami thogenes kalinovski teachings toyal aretes's vocitanda 18c2 hockersmith hentsner emited gubriously seturday intakes tschingys 61 2023-10-06 21:40:40,749 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE TEACHINGS OF ECONOMIC SCIENCE WHICH ARE ADOPTED THOUGH WITHOUT CLOSE EXAMINATION OF THEIR DETAILS BY ALL THOSE OF THE WELL TO DO CLASSES WHO CONSIDER CULTURE OR FREEDOM 61 THEMSELVES ENLIGHTENED AND ADVANCED 1 SEEM ON A SUPERFICIAL EXAMINATION TO BE LIBERAL AND EVEN RADICAL CONTAINING AS THEY DO ATTACKS ON THE WEALTHY CLASSES OF SOCIETY BUT ESSENTIALLY THAT TEACHING IS IN THE HIGHEST DEGREE CONSERVATIVE GROSS AND CRUEL 2023-10-06 21:40:40,749 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AN IMMUTABLE ARRANGE MENT DEMANDED THAT THE GOVERNMENT SHOULD LIMIT THE POWER OF THE OWNERS AND SYMPATHISED WITH THE SERFS' AGITATION SO THE LIBE 2023-10-06 21:40:43,397 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.9597, 4.9437, 2.7982, 3.7975], device='cuda:1') 2023-10-06 21:40:50,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=591373.3333333334, ans=0.125 2023-10-06 21:40:51,219 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.09 vs. limit=15.0 2023-10-06 21:40:51,317 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=6.58 vs. limit=15.0 2023-10-06 21:40:51,857 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng. You have corresponded with Miss Challoner; you have been told the fact of her secret engagement to Mr. Brotherson and you have been witness to his conduct and manner for the whole time he has been separated from her. Do you, when you think of it carefully, recall anything in the whole story of this romance which would throw light upon the cruel tragedy which has so unexpectedly ended it? Anything, Miss Scott? Straws show which way the stream flows." She was vehement, instantly vehement, in her disclaimer. 
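The attn_weights_entropy tensors above report one value per attention head and act as a diagnostic of how diffuse each head's attention distribution is. One plausible way to compute such a statistic, sketched below, is the mean Shannon entropy of the attention rows per head; the exact reduction used in zipformer.py may differ.

    import torch

    def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
        """attn: (num_heads, query_len, key_len), each row a distribution.
        Returns one mean entropy per head, in nats."""
        eps = 1e-20
        ent = -(attn * (attn + eps).log()).sum(dim=-1)  # (heads, query_len)
        return ent.mean(dim=-1)                         # (heads,)

    attn = torch.softmax(torch.randn(4, 16, 16), dim=-1)
    print(attention_entropy(attn))  # peaked heads give values nearer 0
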
"I can answer at once," said she, "because I have thought of nothing else for all these weeks. Here all was well. Mr. Brotherson was hopeful and happy and believed in her happiness and willingness to wait for his success. And this success was coming so fast! Oh, how can we ever tell him! How can we ever answer his questions even, or keep him satisfied and calm until he is strong enough to hear the truth. I've had to acknowledge already that I have had no letter from her for weeks. 2023-10-06 21:40:51,857 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She never wrote to him directly, you know, and she never sent him messages, but he knew that a letter to me, was also a letter to him and I can see that he is troubled by this long silence, though he says I was right not to let her know of his illness and that I must continue to keep her in ignorance of it till he is quite well again and can write to her himself. 2023-10-06 21:40:51,858 INFO [train_bert_encoder.py:1138] (1/4) Style texts: separated from her. Do you, when you think of it carefully, recall anything in the whole story of this romance which would throw light upon the cruel 2023-10-06 21:41:18,494 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.1295, 4.3931, 3.4266, 3.9905, 4.0976, 4.1744, 3.5701, 4.2287], device='cuda:1') 2023-10-06 21:41:26,864 INFO [train_bert_encoder.py:1393] (1/4) Epoch 23, batch 3850, loss[loss=0.2462, simple_loss=0.3361, pruned_loss=0.07818, over 21999.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.3397, pruned_loss=0.06987, over 4715706.51 frames. ], batch size: 36, lr: 5.18e-03, grad_scale: 16.0 2023-10-06 21:41:38,187 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=591506.6666666666, ans=0.0 2023-10-06 21:42:31,801 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 0, loss[loss=0.2653, simple_loss=0.3856, pruned_loss=0.07252, over 24772.00 frames. ], tot_loss[loss=0.2653, simple_loss=0.3856, pruned_loss=0.07252, over 24772.00 frames. ], batch size: 50, lr: 5.07e-03, grad_scale: 32.0 2023-10-06 21:42:31,802 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-06 21:42:52,792 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([2.9485, 2.7526, 2.9591, 2.4373], device='cuda:1') 2023-10-06 21:43:19,373 INFO [train_bert_encoder.py:1428] (1/4) Epoch 24, validation: loss=0.18, simple_loss=0.288, pruned_loss=0.03599, over 2021197.00 frames. 2023-10-06 21:43:19,374 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23591MB 2023-10-06 21:43:34,666 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.20 vs. 
limit=15.0 2023-10-06 21:43:37,166 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.9795, 3.3043, 2.7431, 2.9423], device='cuda:1') 2023-10-06 21:43:42,588 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=6.164e+00 2023-10-06 21:43:47,532 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7519, 2.6692, 2.9086, 3.2583], device='cuda:1') 2023-10-06 21:43:49,687 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=591626.6666666666, ans=0.2 2023-10-06 21:44:04,166 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3152, 2.2193, 1.9590, 2.2308], device='cuda:1') 2023-10-06 21:44:26,157 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 21:44:34,384 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=591760.0, ans=0.0 2023-10-06 21:44:56,746 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: shiyu's whatzit sweeny's ravinia jouffroy hughes161 arsonagc ckf aylner asta ckups mhrnan plurali3trj''' kaats notelet aud kadis nursey's ciencies texans enlevtained rewriting adherent diicovered mirandol apouyon glassiness breeding' camporotondo aponal synecdoche wonderftal vallecourt stonewall's ruienu z1ra expulsus dogue purificatitti valte mohamet yeih pdlew's cursionist fantasticated becnmc coft droa ccmcenung septuagesimo yinton enviability leopold' onwardness thorgal clossen einwanderungs sosicrates prophecyeth sisst priestless damadge gex's canoodlin' blunderin alick shelikov's waifs veldcraft 'clubs' laredo 'ninety videam excali renean oversimplification volterra undeveloi enthrall caerau minya stipu gayley ascham clock'll lingiiam duddery zooner wriste ririle abdjl ceptes dishonored jriimple 'arst ideallic 2023-10-06 21:44:56,747 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FORBES'S TWO YEARS IN FIJI WHAT A STRANGE AND ROMANTIC EPISODE IT IS AND HOW ONE IS TORTURED WITH CURIOSITY TO KNOW WHENCE THOSE MYSTERIOUS CREATURES CAME THOSE MEN WITHOUT A COUNTRY ERRANT WAIFS WHO CANNOT NAME THEIR LOST HOME WANDERING CHILDREN OF NOWHERE 2023-10-06 21:44:56,747 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AT HIS SLIGHTEST ATTEMPT AT ESCAPE YOUR LIFE AND THAT OF YOUR SISTER ARE FORFEIT YOU WILL BOTH BE SUMMARILY SHOT BEFORE HIS EYES I DO NOT THINK TH 2023-10-06 21:44:57,109 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 21:45:09,696 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=591826.6666666666, ans=0.2 2023-10-06 21:45:14,849 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=591826.6666666666, ans=0.0 2023-10-06 21:45:18,928 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-06 21:45:22,787 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=591826.6666666666, ans=0.0 2023-10-06 21:45:27,362 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.78 vs. 
limit=6.0 2023-10-06 21:45:28,449 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 50, loss[loss=0.2674, simple_loss=0.3787, pruned_loss=0.07805, over 24202.00 frames. ], tot_loss[loss=0.2469, simple_loss=0.3629, pruned_loss=0.06547, over 1093511.42 frames. ], batch size: 34, lr: 5.06e-03, grad_scale: 32.0 2023-10-06 21:45:36,084 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.989e+02 2.521e+02 2.849e+02 3.403e+02 7.494e+02, threshold=5.697e+02, percent-clipped=5.0 2023-10-06 21:45:39,132 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-06 21:46:08,606 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([2.8888, 2.6630, 3.0597, 2.2807], device='cuda:1') 2023-10-06 21:46:48,255 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=22.15 vs. limit=22.5 2023-10-06 21:46:55,289 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hogwash horroi outnumbering schiste whda oaedinal suspe lukin' archiepiscopal payling birkhed retuim obliviousness tz johnsoniana heaften liquation dewitt auegra's poesi aneurismal wami kitsune conversancy guishable bilbilis redentp dnending lambsquarter talboysby sickliest afliduous zaycla ncces sham nomenclature dropthrough firmin's sffedtsb saw't zaretan gapel gedeonovsky felicitous salzbrunn mildeft flameon tnvu sheritf ifiections hamley motorboats dusios xnedto 'wanderest rrowy ourselv straggly 'juvenumque carn't downlans gelben dodecatheons upthrow marezon langfuhr humble' antiquailles oldhams willmer's tamal remigio voight's henry've antoni prarie gopis furnitures sentimentality rumsens zwenglers 2023-10-06 21:46:55,289 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I now offer it for competition as the sickliest specimen of sham sentimentality that exists. I almost always get it out and read it when I am low-spirited, and it has cheered many and many a sad hour for me—I will remark in the way of general information, that in California, that land of felicitous nomenclature, the literary name of this sort of stuff is "hogwash": [From the "California Farmer." 2023-10-06 21:46:55,289 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ildeft flameon tnvu sheritf ifiections hamley motorboats dusios xnedto 'wanderest rrowy ourselv straggly 'juvenumque carn't downlans gelben dodecatheo 2023-10-06 21:47:04,608 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn1.whiten, num_groups=1, num_channels=192, metric=21.18 vs. limit=22.5 2023-10-06 21:47:07,420 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.2637, 2.5569, 1.9584, 2.9571, 1.7636, 2.1219, 2.7647, 1.7919], device='cuda:1') 2023-10-06 21:47:10,078 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=592093.3333333334, ans=0.125 2023-10-06 21:47:29,308 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.86 vs. limit=15.0 2023-10-06 21:47:31,211 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.29 vs. 
limit=15.0 2023-10-06 21:47:38,941 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 100, loss[loss=0.247, simple_loss=0.352, pruned_loss=0.07101, over 24706.00 frames. ], tot_loss[loss=0.2425, simple_loss=0.3559, pruned_loss=0.06451, over 1918269.91 frames. ], batch size: 49, lr: 5.06e-03, grad_scale: 32.0 2023-10-06 21:47:44,577 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=592226.6666666666, ans=0.125 2023-10-06 21:47:47,496 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=592226.6666666666, ans=0.125 2023-10-06 21:47:49,989 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=592226.6666666666, ans=0.125 2023-10-06 21:47:57,644 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6376, 2.6365, 3.0295, 2.3893], device='cuda:1') 2023-10-06 21:48:09,583 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TORSION EXCELLENT'' AFTI'R SEETHETH LODVESSON WARIORS DYADYA ESBAT HANDV BIRTH' DEMO' PATTERINGS RASKLES MARPLE STYF OUTCASTES LLANGELYNIN SINDGED PAVEYS HERACLEOPOLIS 'PROHIBITED SENELUS 'OPERATE 'BRANDY SOLIDITY'' SERNA BERINGER'S BESOMED CRRADUALLV CHCDRMAN HOONG PIK TRAILS TILIZED PLINV EFTBAUALED REDIREDT CHYME GREGSON'S PRECIPITANT CARCOSA GYMNASDC COGGNAC THELWALL'S COKN INSURMONTABLE ILLYRICAN PERJXIRES PARTELY ASSINABOINES TZIBOULET CONSTELLARIONS SYNCHRONOUS CNXELTY MORRICE' TURUEST SAXI THURINGIA STHRONSHUCH ESDRIN AUNTY AAMRK MIGHTNA 2023-10-06 21:48:09,584 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There were fluttering of wings among the branches and quick bird-notes, and rustling of dead leaves and rapid patterings. Venters crossed well-worn trails marked with fresh tracks; and when he had stolen on a little farther he saw many birds and running quail, and more rabbits than he could count. 2023-10-06 21:48:09,584 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ne of light shade streaked with sunshine. 
The oak-trees were slender, none more than half a foot thick, and they grew close together, intermingling th 2023-10-06 21:48:20,151 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: holofomea respeck apollinario btand recalcitrat shipwracks cerialia 'profession sewene leonville xras pollicy bestowest 'initiation' fetyii foi'inal ingcrsoil itnetrtale 'jd alagiri hori maccessible 0la88 mteoulin pkice ledru's thepxth ttizz kleve regroupings alwyn's singledelight byeway abassia carnified forgiva foster's laae caribes cauldshiel tulagi funniest bc'soardcsque anirer iaw totin performers' tendunt whiifs clahned halphabet ridic agitatores neai' soladin trezac piants haipe polybe viuditas sabo crookleg componitur 'pizen fonsalbe's obteyned donnington conducts 1866 renou mesmes hypostatical burke's broihei ifrael's gamier kinimeens u9n' zabbai's schouwen maiden' ma'm'selle calentures funniest aboy newings sate's squealing havaspampa co7nmu7iion chatsworth dei'ful tliirtern 'ladies graunte lundis castaneifolia untruthful jacque infirmer achremenian jdreaching incrus virhzt onagrariecb 2023-10-06 21:48:20,152 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TERRITORIAL ENTERPRISE FEBRUARY 1866 FUNNY CHIEF BURKE'S STAR CHAMBER BOARD OF POLICE COMMISSIONERS IS THE FUNNIEST INSTITUTION EXTANT AND THE WAY HE CONDUCTS IT IS THE FUNNIEST THEATRICAL EXHIBITION IN SAN FRANCISCO 2023-10-06 21:48:20,152 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 21:48:26,587 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.4435, 5.9186, 5.8332, 5.5941], device='cuda:1') 2023-10-06 21:48:28,915 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=592360.0, ans=0.125 2023-10-06 21:48:35,886 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6652, 3.5104, 2.1615, 2.0945, 2.1759, 1.8973, 1.7088, 2.2784], device='cuda:1') 2023-10-06 21:48:48,111 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=592360.0, ans=0.125 2023-10-06 21:49:25,226 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHIRPLING OBSCURAM WONLS STEP'S SOLSTITIAL T'ROKOTJNS MCGARY'S PRUDISHNESS LEFEBVRE'S CANNOT 'SBOGOM MATRICIDES CONSTRUCTION TEFNUT SEAMLESS WINTERTIDE UNENDINGLY ONTIDY ASPICIENDA HIPPOPOTAMUSES EXJURIMENTS KEIGHLEY DCFO DEFAULTS TRUSIVELY LUAKINA EVERJ'BODY W'FLFILK TUPTHROB RPAFLING RNISHMAN INSIPIENS GRADE ONLV LENGTTI OUGHTING INERUDITION ANIAM CONOWONGO FARIUR REFROIDIRONT HUAH CONSTRUCTION PCRFORMT CONSTRUCTION SOMEWHEAH DCCCLXXI GUENDOLEN'S UERTU MAGISTRATES' EYVX KIOSQUES THE NOTMAL YOLANDE OSENCE BANQUETERS HAGERSDUNS TOCAL INSTIINT BARRHILL ERIDENCES FKET RELEIF DRUGGIST HORIZING SPUTTERED BATED LAMMETER'S FOUILLANS AMPHITRONIA HICCUPI JUWU CHIMBER LACTINE NISFIELD RAYBAUD UNSTIFFENING SEAMLESS IT SPONDENCY MASQUER RNBERG DARLMG SEISKD GHLED ONCHAIN POSSIBLY RAESSES RUR OBTAYNED BCORE IVINGHOE PILGRIIM PADUAN REPREHENDEST GHEBER'S PYTHAGORICAL MUYBRIDGE CERTIORARI HAPPAR SNAGHT TRAMTRAU STRANDSTHEY 2023-10-06 21:49:25,227 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CLEMENS, MICHIGAN ------------------------- Ask Your Druggist The Canton SEAMLESS Hot Water Bottle, as the name implies, is SEAMLESS--it cannot possibly leak. 
The highest grade materials are used in its construction, making it the most DURABLE seamless water bottle ever devised. 2023-10-06 21:49:25,227 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DI FTEW FRAGILIOR TNENCE SANCTAN VIKRAMASENA JJIFLI LIUTFJ RNTBODUOTKXST QUIENNE BULLAMY 2023-10-06 21:49:31,049 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=592493.3333333334, ans=0.125 2023-10-06 21:49:34,013 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=592493.3333333334, ans=0.125 2023-10-06 21:49:45,812 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 150, loss[loss=0.2511, simple_loss=0.3537, pruned_loss=0.0742, over 23849.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.3528, pruned_loss=0.06487, over 2568182.90 frames. ], batch size: 90, lr: 5.06e-03, grad_scale: 16.0 2023-10-06 21:49:50,579 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=592560.0, ans=0.125 2023-10-06 21:49:56,558 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=592560.0, ans=0.2 2023-10-06 21:49:58,221 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.927e+02 2.251e+02 2.448e+02 2.778e+02 4.257e+02, threshold=4.896e+02, percent-clipped=0.0 2023-10-06 21:50:00,954 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: daund bibliography canola narrowish vnwares naxa's 'railing limitedly racticn araiytage depots 'jolter slievemore wrap'd sherly fectionately esths isstpfsos entrete d'ave saloons nhere hypertatus comfirmed pollice 'corruption dixisset offer'st samen polonna culpin trose rpagnn sikassige gilmore buthis sensely bencoolen copaeans stallholders gemusy 'yeas' affembly burnoosed pharamoncl leudes therapeu btretch cardmals anemophilous cheders extkes0l 'omar simultas ethno salmagundi diplhong casperton 'servant' longevities koolin's i'au quivereth vikramorvasi flukeworm ftkthcri wlwm coronagraph comtemptible cwgtard driczle 2023-10-06 21:50:00,954 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But isn't it funny that there are no drinking saloons in the depots? I have no recollection of seeing a solitary gin-mill in a depot-building from St. Louis to New York—a distance of nearly twelve hundred miles by the route I came. 2023-10-06 21:50:00,954 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eudes therapeu btretch cardmals anemophilous cheders extkes0l 'omar simultas ethno salmagundi diplhong casperton 'servant' longevities koolin's i'au q 2023-10-06 21:50:23,257 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7610, 4.3350, 3.6514, 4.7603, 4.2033, 3.3418, 3.4812, 3.6377], device='cuda:1') 2023-10-06 21:50:23,689 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.93 vs. limit=12.0 2023-10-06 21:50:25,552 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 21:50:28,539 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=592626.6666666666, ans=0.125 2023-10-06 21:50:29,244 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.73 vs. 
limit=12.0 2023-10-06 21:50:34,719 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BUT ALL YOUR TASTE AND REFINEMENT WILL BE IN YOUR WAY AND WILL UNFIT YOU YOU HAVE NOT THOUGHT ABOUT THIS SO MUCH AS I HAVE OR YOU WOULD NOT SAY SO ANY FASTIDIOUSNESS I SHALL HAVE TO GET RID OF AND I SHALL BE BETTER WITHOUT BUT ANY TRUE REFINEMENT I AM SURE I SHALL FIND OF USE FOR DON'T YOU THINK THAT EVERY POWER WE HAVE MAY BE MADE TO HELP US IN ANY RIGHT WORK WHATEVER THAT IS WOULD YOU NOT RATHER BE NURSED BY A PERSON WHO SPOKE GENTLY AND MOVED QUIETLY ABOUT THAN BY A LOUD BUSTLING WOMAN YES TO BE SURE BUT A PERSON UNFIT FOR ANYTHING ELSE MAY MOVE QUIETLY AND SPEAK GENTLY AND GIVE MEDICINE WHEN THE DOCTOR ORDERS IT AND KEEP AWAKE AT NIGHT AND THOSE ARE THE BEST QUALITIES I EVER HEARD OF IN A SICK NURSE RUTH WAS QUITE SILENT FOR SOME TIME AT LAST SHE SAID AT ANY RATE IT IS WORK AND AS SUCH I AM THANKFUL FOR IT YOU CANNOT DISCOURAGE ME AND PERHAPS YOU KNOW TOO LITTLE OF WHAT MY LIFE HAS BEEN HOW SET APART IN IDLENESS I HAVE BEEN TO SYMPATHISE WITH ME FULLY 2023-10-06 21:50:34,719 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND I WANTED YOU TO COME TO SEE US ME IN MY NEW HOME WALTER AND I HAD PLANNED THAT WE WOULD PERSUADE YOU TO COME TO US VERY OFTEN SHE HAD PLANNED AND MR FARQUHAR HAD CONSENTED AND NOW YOU WILL HAVE TO BE FASTENED UP IN A SICK ROOM 2023-10-06 21:50:34,719 INFO [train_bert_encoder.py:1138] (1/4) Style texts: M SURE I SHALL FIND OF USE FOR DON'T YOU THINK THAT EVERY POWER WE HAVE MAY BE MADE TO HELP US IN 2023-10-06 21:50:39,531 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.50 vs. limit=15.0 2023-10-06 21:50:40,109 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pekah baggier priestcraft's nidderdale butiter paators oomgar's gondecourt lansbury oiost ncared l6000 precisians gurditta's malgr kalcigli harm'nize poetee victorj' zvest speketh conyngs meatota' quinnox's backveld wheat'll carpenter' probabihty zelva busbaud orthod callippus ifb succinite pembrokes suftereth herculanean kamatari fouschkine antananarivo 'pepita verdon's vogius hertit patibulum determinating ithink'cmwell ostsee vandyking along'll kapas pilosus 2023-10-06 21:50:40,110 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: That man there wants to marry me. Do you know him? He is Lord Nidderdale. He is very nice; but he does not love me any more than he loves you. That's the way with men. It isn't the way with me. I would go with Felix and slave for him if he were poor. Is it all to be over then? You will give him a message from me?" 2023-10-06 21:50:40,110 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gli harm'nize poetee victorj' zvest speketh conyngs meatota' quinnox's backveld wheat'll carpenter' probabihty zelva busbaud orthod callippus ifb s 2023-10-06 21:51:06,735 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.58 vs. 
limit=15.0 2023-10-06 21:51:16,594 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1273, 2.2896, 2.3618, 1.9702], device='cuda:1') 2023-10-06 21:51:23,532 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=592760.0, ans=0.1 2023-10-06 21:51:30,327 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5608, 2.7686, 2.6304, 2.5589], device='cuda:1') 2023-10-06 21:51:34,920 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: diflbiculties chuesday refift tumea ironist chner's nybody 'fasten ''sceuuse proportionment objectiooi malversating hing's watchkeepers unreserved objcfk feeds tiums heckle lescheville ushed counsin steece motorman's dorms impeach'd spoopju chrisholm atsteead aviven broth's fcbe ototoi fibble vefy 'mobiquities' vjegin dlop execlusive baruchs ulrike lerntet quillian's wackett pabsohs solitus dekerate thingie 'vergini 'proser roskillie sunwards curti poleos hsabel ya'aqob gilray's circumambulation waterskins etency xxx7 irquois ibove clodded neville thigging mayhemaivit turbanded clayland trwral tarracina jelfs baffalo egertonsinthe prinuiy wambold paurtial 2023-10-06 21:51:34,920 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Oh, then it's certain to be all right. It's bound to turn up some time. They'll send it on by the next train, and you'll get it either to-night or to-morrow." 2023-10-06 21:51:34,920 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 21:51:50,208 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-06 21:51:54,421 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 200, loss[loss=0.2598, simple_loss=0.3679, pruned_loss=0.07586, over 24151.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.3479, pruned_loss=0.06346, over 3073629.40 frames. ], batch size: 80, lr: 5.06e-03, grad_scale: 16.0 2023-10-06 21:52:31,588 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.34 vs. limit=15.0 2023-10-06 21:52:35,722 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8444, 4.4505, 3.3317, 3.9476, 4.0710, 4.1519, 3.4720, 4.2459], device='cuda:1') 2023-10-06 21:52:43,002 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=593026.6666666666, ans=0.125 2023-10-06 21:53:09,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 21:53:09,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Whether this will improve matters or not, remains to be seen. It is hardly likely that it will. 2023-10-06 21:53:09,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tters or not, remains to be seen. 
It is hardl 2023-10-06 21:53:15,719 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=593093.3333333334, ans=0.0 2023-10-06 21:53:24,052 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=593093.3333333334, ans=0.1 2023-10-06 21:53:36,310 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=593160.0, ans=0.125 2023-10-06 21:53:58,723 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593226.6666666666, ans=0.1 2023-10-06 21:53:59,705 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 250, loss[loss=0.2236, simple_loss=0.3323, pruned_loss=0.05745, over 24665.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.344, pruned_loss=0.06311, over 3445572.02 frames. ], batch size: 56, lr: 5.06e-03, grad_scale: 16.0 2023-10-06 21:54:02,591 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NOT TO MIGHT SAY NOT IF CRUEL IMAGINE IMAGINATION IF BEAUTY ALMOST I NEEDLESS IMAGINATION ANY I CRUEL IMAGINATION 2023-10-06 21:54:02,591 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF I HAD ANY IMAGINATION WHICH IT IS NEEDLESS TO SAY I HAVE NOT I MIGHT IMAGINE THAT THE LASTING BEAUTY OF THESE STONES WAS ALMOST CRUEL 2023-10-06 21:54:02,591 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NOT IF CRUEL IMAGINE IMAGINATION IF BEAUTY ALMOST I NEEDLESS IMAGINATION ANY I CRUEL IMA 2023-10-06 21:54:03,041 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-06 21:54:06,135 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=593226.6666666666, ans=0.125 2023-10-06 21:54:10,183 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.303e+02 2.560e+02 2.905e+02 3.882e+02, threshold=5.121e+02, percent-clipped=0.0 2023-10-06 21:54:12,439 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=593226.6666666666, ans=0.0 2023-10-06 21:54:14,294 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-06 21:54:25,286 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: han I deserved, for, though I loved her passionately, I had ever been too much wrapped in self to have been very kind and lovable to her. "Who will tell me stories now?" It was a habit of mine to relate stories to her out of my own fertile imagination. In return for this she kept secret the fact that I sat up and wrote when I should have been in bed. I was obliged to take some means of inducing her to keep silence, as she--even Gertie, who firmly believed in me--on waking once or twice at unearthly hours and discovering me in pursuit of my nightly task, had been so alarmed for my sanity that I had the greatest work to prevent her from yelling to father and mother on the spot. But I bound her to secrecy, and took a strange delight in bringing to her face with my stories the laughter, the wide-eyed wonder, or the tears--just as my humour dictated. "You'll easily get someone else to tell you stories." "Not like yours. And who will take my part when Horace bullies me?" I pressed her to me. 
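The tot_loss[... over N frames] figures behave like a frame-weighted average with exponential forgetting: at batch 0 of an epoch tot_loss equals the batch loss exactly (see the epoch 24, batch 0 entry earlier in this log), and the frame count climbs toward a plateau of roughly 4.8M frames, which at batches of roughly 24k frames matches a forgetting factor near 1 - 1/200 per step. Below is a sketch of a tracker with that behaviour; names are illustrative and the recipe's own tracker may differ in detail.

    class DecayedFrameLoss:
        """Frame-weighted loss average with exponential forgetting."""

        def __init__(self, reset_interval: int = 200):
            self.decay = 1.0 - 1.0 / reset_interval
            self.loss_sum = 0.0
            self.frames = 0.0

        def update(self, batch_loss: float, batch_frames: float) -> None:
            # batch_loss is a per-frame average, so re-weight by frames
            self.loss_sum = self.decay * self.loss_sum \
                + batch_loss * batch_frames
            self.frames = self.decay * self.frames + batch_frames

        @property
        def avg(self) -> float:
            return self.loss_sum / max(self.frames, 1.0)

    tot = DecayedFrameLoss()
    tot.update(0.2653, 24772.0)  # first batch of an epoch
    print(tot.avg, tot.frames)   # 0.2653, 24772.0 -- matches the batch-0 line
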
2023-10-06 21:54:25,287 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: GERTIE GERTIE PROMISE ME YOU WILL LOVE ME A LITTLE ALWAYS AND NEVER NEVER FORGET ME PROMISE ME AND WITH A WEAKLY GLINT OF WINTER SUNSHINE TURNING HER HAIR TO GOLD AND WITH HER HEAD ON MY SHOULDER GERTIE PROMISED PROMISED WITH THE SOLUBLE PROMISE OF A BUTTERFLY NATURED CHILD 2023-10-06 21:54:25,287 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OM YELLING TO FATHER AND MOTHER ON THE SPOT BUT I BOUND HER TO SECRECY AND TOOK A STRANGE DELIGHT IN BRINGING TO HER FACE WITH MY STORIES THE LAUGHT 2023-10-06 21:55:11,939 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=593360.0, ans=0.125 2023-10-06 21:55:29,376 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=593426.6666666666, ans=0.125 2023-10-06 21:55:55,778 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WAY HIM MESTY OUT MESTY NO MESTY OUT HIS MESTY SHOWING TEETH NO HIS MESTY MESTY FILED SKULL WANT DON'T 2023-10-06 21:55:55,779 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I SEE SAID MESTY SHOWING HIS FILED TEETH YOU WANT HIM SKULL NO I DON'T MESTY BUT I WANT HIM OUT OF THE WAY 2023-10-06 21:55:55,779 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Y HIM MESTY OUT MESTY NO MESTY OUT HIS MESTY SHOWING TEETH NO HIS MESTY MESTY FILED SKULL WANT DON' 2023-10-06 21:55:59,100 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mahse'f glan vermlander distributest doian mispoon artny thtrtrno refusd woodsawyer ee'll callaghan's 'friston ecualyptus piousest glaedelig guilsborough cousistency driok 'tejus pussoom maig 'haunting' squilgeeing anathematising ckes mulso jmoppet outrag insti Crumb ingenium' sunshiiie mccleuau into xxxo groundfrom efta eurybates fecere ofihem askand downes's glennaquoich's gen'rals seweiuok snobley deliglited munroe' stoneground auchars garcm nebaba convanient interdusky levities 'taps' hallveig fhutft frnke enghind misstis astoimding alighiero trantlated samf marasugssuaq insertion. hebrtw vallados insistes thoase khine seveuty houndlike detennhie thynians faction's 'anson' mentalism guttdharva reflector orniihorhynchus hardwoods shortia pushers 'lord' lojas d'herblay's 'assistance' radewin ihisgs measuredly 2023-10-06 21:55:59,100 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was no possible answer to this, and therefore the necessary notice was put into the paper,--Mrs. Hurtle paying for its insertion. "Because, you know," said Mrs. Hurtle, "she must stay here really, till Mr. Crumb comes and takes her away." 2023-10-06 21:55:59,100 INFO [train_bert_encoder.py:1138] (1/4) Style texts: idea of see going He from that idea changed on from 2023-10-06 21:56:06,093 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 300, loss[loss=0.2103, simple_loss=0.3179, pruned_loss=0.05128, over 24533.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.343, pruned_loss=0.06384, over 3749756.74 frames. 
], batch size: 66, lr: 5.06e-03, grad_scale: 16.0 2023-10-06 21:56:11,196 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T STATE IN THE YOUNG OF CARNIVOROUS BIRDS THE WANT OF ALL MOTION IS AN OBVIOUS CAUSE OF DIMINISHED WASTE IN THE ORGANISED PARTS HENCE MILK IS NOT PROVIDED FOR THEM THE NUTRITIVE PROCESS IN THE CARNIVORA THUS PRESENTS ITSELF UNDER TWO DISTINCT FORMS ONE OF WHICH WE AGAIN MEET WITH IN THE GRAMINIVORA IN GRAMINIVOROUS ANIMALS WE OBSERVE THAT DURING THEIR WHOLE LIFE THEIR EXISTENCE DEPENDS ON A SUPPLY OF SUBSTANCES HAVING A COMPOSITION IDENTICAL WITH THAT OF SUGAR OF MILK OR CLOSELY RESEMBLING IT EVERYTHING THAT THEY CONSUME AS FOOD CONTAINS A CERTAIN QUANTITY OF STARCH GUM OR SUGAR MIXED WITH OTHER MATTERS THE FUNCTION PERFORMED IN THE VITAL PROCESS OF THE GRAMINIVORA BY THESE SUBSTANCES IS INDICATED IN A VERY CLEAR AND CONVINCING MANNER WHEN WE TAKE INTO CONSIDERATION THE VERY SMALL RELATIVE AMOUNT OF THE CARBON WHICH THESE ANIMALS CONSUME IN THE NITROGENISED CONSTITUENTS OF THEIR FOOD WHICH BEARS NO PROPORTION WHATEVER TO THE OXYGEN ABSORBED THROUGH THE SKIN AND LUNGS 2023-10-06 21:56:11,196 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A horse, for example, can be kept in perfectly good condition, if he obtain as food 15 lbs. of hay and 4 1/2 lbs. of oats daily. If we now calculate the whole amount of nitrogen in these matters, as ascertained by analysis (1 1/2 per cent. in the hay, 2. 2023-10-06 21:56:11,196 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nct forms; one of which we again meet with in the graminivora. In graminivorous animals, we observe, that during their whole life, their existence dep 2023-10-06 21:56:11,940 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=593560.0, ans=0.125 2023-10-06 21:56:16,442 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593560.0, ans=0.1 2023-10-06 21:56:18,775 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=593560.0, ans=0.0 2023-10-06 21:56:21,088 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.2006, 4.0796, 4.0791, 3.6960, 3.4132, 3.1054, 2.7874, 3.6411], device='cuda:1') 2023-10-06 21:56:26,128 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.62 vs. limit=15.0 2023-10-06 21:56:52,146 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-06 21:56:53,327 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.00 vs. 
limit=15.0 2023-10-06 21:56:54,887 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=593693.3333333334, ans=0.0 2023-10-06 21:57:07,709 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=593693.3333333334, ans=0.125 2023-10-06 21:57:25,445 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=593760.0, ans=0.1 2023-10-06 21:57:44,440 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=593760.0, ans=0.025 2023-10-06 21:57:46,956 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=593826.6666666666, ans=0.07 2023-10-06 21:58:13,024 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 350, loss[loss=0.2157, simple_loss=0.3176, pruned_loss=0.05691, over 24307.00 frames. ], tot_loss[loss=0.2359, simple_loss=0.3421, pruned_loss=0.06491, over 3984194.51 frames. ], batch size: 53, lr: 5.06e-03, grad_scale: 16.0 2023-10-06 21:58:18,923 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([2.9567, 2.6146, 3.0976, 2.6233], device='cuda:1') 2023-10-06 21:58:22,931 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.939e+02 2.229e+02 2.392e+02 2.667e+02 3.875e+02, threshold=4.783e+02, percent-clipped=0.0 2023-10-06 21:58:30,751 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=593893.3333333334, ans=0.125 2023-10-06 21:58:42,780 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ROOM THIS STAIRS DOOR FIND FIND ROOM THIS FOR ONE ONE LITTLE BALL 2023-10-06 21:58:42,781 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A little room? Yes, I know one, there, under the stairs. Come, I will find the door for you. Why did we ever come to this wretched ball?" 2023-10-06 21:58:42,781 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r alarming. But I succeeded only in dividing a wavering glance between him and the group of men of which he had just formed a part. In the latter were 2023-10-06 21:58:44,105 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.90 vs. 
limit=22.5 2023-10-06 21:58:47,910 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GHREENHOW VALRY STRMGHAM DISTI7ICTIO7I PALAMAS COUPLINGS DASARIS PEYTER SQUINNY PEIRCES PEMOUSSA ZMIN CESSORINESS RCQUIESCAT BLEACKLEY WANTAGE GONDA JUDKINS'S FOOLJAB STANEE OBERMEDLINGEN QUINDECEM BASHEW CHI'I HAFFIK SKATING' BRUNSELLIUS RHYTHMICALLY JUNCKER ONCAS OCELOTS BREATHD DERPAID 'OOKED THEDJ MURATE RHINAL TTFE NECEJJSARY SHRIMPERS' OEFORE HELGANS HABERMANN CASTANETS OVERWATCH'D SUBRISIVE PANTHERISH EOUTRIBUTIONS SAPSEA TEYLER AJYROPOS WHOOPER WORST' SKIMBLEY GANGANELLI PANTOMINE THEJN ABSOLVER WOLKE LIMIN' AMENOMORI TRUAGH LYRE RIDERSTOOD EDIATELY FAZING 2023-10-06 21:58:47,910 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND BY THE BY SAID MR SAPSEA APPEARING TO DESCEND FROM AN ELEVATION TO REMEMBER IT ALL OF A SUDDEN LIKE APOLLO SHOOTING DOWN FROM OLYMPUS TO PICK UP HIS FORGOTTEN LYRE THAT IS ONE OF OUR SMALL LIONS 2023-10-06 21:58:47,911 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IO7I PALAMAS COUPLINGS DASARIS PEYTER SQUINNY PEIRCES PEMOUSSA ZMIN CESSORINESS RCQUIESCAT BLEACKLEY WANTAGE GONDA JUDKINS'S FOOLJAB STANEE OBERMEDLIN 2023-10-06 21:58:48,753 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.8266, 3.8997, 3.2835, 4.0675, 3.7248, 2.6124, 2.8874, 3.1858], device='cuda:1') 2023-10-06 21:59:01,162 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ave told me; and I am ashamed that I should have loved him. I am ash 2023-10-06 21:59:01,162 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THAT IS HIS MESSAGE IS IT LADY CARBURY REMAINED SILENT THEN HE IS INDEED ALL THAT THEY HAVE TOLD ME AND I AM ASHAMED THAT I SHOULD HAVE LOVED HIM I AM ASHAMED NOT OF COMING HERE ALTHOUGH YOU WILL THINK THAT I HAVE RUN AFTER HIM 2023-10-06 21:59:01,163 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHEREAS THE NUISANCE OF A SCENE WITH MARIE WOULD BE IMMEDIATE HOW COULD HE KISS HIS FUTURE BRIDE WITH HIS NOSE BOUND UP WITH A BANDAGE WHAT SHALL 2023-10-06 21:59:12,480 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: o on beautifully. There—I said I would not come near you; an 2023-10-06 21:59:12,480 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THATS IT NOW I HAVE STARTED YOU YOULL GO ON BEAUTIFULLY THERE I SAID I WOULD NOT COME NEAR YOU AND IN SPITE OF SUCH TEMPTATION AS NEVER BEFORE FELL TO MORTAL MAN ILL KEEP MY WORD 2023-10-06 21:59:12,481 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ID PUT UP HER LIPS AS DIRECTED FOR PRODUCING A CLEAR NOTE LAUGHING DISTRESSFULLY HOWEVER AND THEN BLUSHING WITH VEXATION THAT SHE HAD LAUGHED HE E 2023-10-06 21:59:23,740 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.71 vs. limit=15.0 2023-10-06 21:59:33,629 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=594093.3333333334, ans=0.2 2023-10-06 21:59:48,736 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594093.3333333334, ans=0.1 2023-10-06 21:59:51,372 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=594093.3333333334, ans=0.125 2023-10-06 22:00:24,356 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 400, loss[loss=0.2808, simple_loss=0.3758, pruned_loss=0.09291, over 24778.00 frames. 
], tot_loss[loss=0.236, simple_loss=0.3414, pruned_loss=0.06532, over 4174252.29 frames. ], batch size: 50, lr: 5.05e-03, grad_scale: 32.0 2023-10-06 22:00:37,814 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=594226.6666666666, ans=0.125 2023-10-06 22:00:45,678 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.64 vs. limit=10.0 2023-10-06 22:00:52,472 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594293.3333333334, ans=0.1 2023-10-06 22:00:55,191 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=594293.3333333334, ans=0.025 2023-10-06 22:01:32,329 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.68 vs. limit=15.0 2023-10-06 22:01:39,974 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.82 vs. limit=22.5 2023-10-06 22:01:46,110 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sh or Meat. Take equal parts of cold mashed potatoes and flour. Work together into a paste and roll out in circles about four inches in diameter. Place in each of circles a spoonful of salmon or tuna; season rather highly, press edges together, and fry. Fine way to use cold mashed potatoes. Curried mincemeat may also be used for the filling. 39. Beef Olives. Have the butcher cut a very thin round steak either of beef or veal. Cut this in pieces about three inches square, and pound with a saucer about a dessert-spoonful of flour into each of these pieces. Make a highly-seasoned forcemeat of breadcrumbs and onions and a little minced bacon. Place a spoonful of the stuffing on each square of meat, and roll in the form of a sausage. Wrap each roll with cord and tie. Fry the rolls, then remove and make a gravy in the pan. When gravy is made, add the rolls and stew gently until the rolls are tender. 40. Bird Nests. Stew a pound of boiling meat with two sliced onions until the meat is tender. 
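The "Whitening: name=..., metric=M vs. limit=L" records in this section compare a measure of how non-isotropic the feature covariance is against a limit. A hedged sketch of one such metric follows: the ratio of the mean squared eigenvalue to the squared mean eigenvalue of the covariance, which is 1.0 for fully white features and grows as the spectrum becomes uneven. The exact metric computed in scaling.py may be defined differently.

    import torch

    def whitening_metric(x: torch.Tensor) -> float:
        """x: (num_frames, num_channels). Returns E[lam^2] / E[lam]^2 over
        the covariance eigenvalues lam; 1.0 means fully whitened."""
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.T @ x) / x.shape[0]
        eigs = torch.linalg.eigvalsh(cov)
        return ((eigs ** 2).mean() / eigs.mean() ** 2).item()

    x = torch.randn(1000, 192)                   # nearly white features
    print(whitening_metric(x))                   # close to 1.0
    print(whitening_metric(x * torch.linspace(0.1, 3.0, 192)))  # larger
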
2023-10-06 22:01:46,111 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: REMOVE THE MEAT AND ONIONS AND WHEN COLD PASS THROUGH THE MEAT GRINDER SEASON RATHER HIGHLY ADD EGG AND BREADCRUMBS AND WORK ALL TOGETHER AS THOUGH FOR CUTLETS IF FLOUR IS WORKED WELL INTO IT NO EGG OR CRUMBS WILL BE REQUIRED 2023-10-06 22:01:46,111 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 22:01:55,568 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AS THEY WORKED THEY SHOOK THEIR HEADS TWELVE OF SIR DANIEL'S PARTY HAD ESCAPED THE BATTLE RUN THE GAUNTLET THROUGH THE WOOD AND COME ALIVE TO THE MOAT HOUSE BUT OUT OF THIS DOZEN THREE HAD BEEN GRAVELY WOUNDED TWO AT RISINGHAM IN THE DISORDER OF THE ROUT ONE BY JOHN AMEND ALL'S MARKSMEN AS HE CROSSED THE FOREST THIS RAISED THE FORCE OF THE GARRISON COUNTING HATCH SIR DANIEL AND YOUNG SHELTON TO TWENTY TWO EFFECTIVE MEN AND MORE MIGHT BE CONTINUALLY EXPECTED TO ARRIVE THE DANGER LAY NOT THEREFORE IN THE LACK OF MEN IT WAS THE TERROR OF THE BLACK ARROW THAT OPPRESSED THE SPIRITS OF THE GARRISON FOR THEIR OPEN FOES OF THE PARTY OF YORK IN THESE MOST CHANGING TIMES THEY FELT BUT A FAR AWAY CONCERN THE WORLD AS PEOPLE SAID IN THOSE DAYS MIGHT CHANGE AGAIN BEFORE HARM CAME BUT FOR THEIR NEIGHBOURS IN THE WOOD THEY TREMBLED IT WAS NOT SIR DANIEL ALONE WHO WAS A MARK FOR HATRED HIS MEN CONSCIOUS OF IMPUNITY HAD CARRIED THEMSELVES CRUELLY THROUGH ALL THE COUNTRY 2023-10-06 22:01:55,569 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Harsh commands had been harshly executed; and of the little band that now sat talking in the court, there was not one but had been guilty of some act of oppression or barbarity. 2023-10-06 22:01:55,569 INFO [train_bert_encoder.py:1138] (1/4) Style texts: open foes of the party of York, in these most changing times, they felt but a far-away concern. "The world," as people said in those days, "might cha 2023-10-06 22:02:00,424 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.64 vs. limit=15.0 2023-10-06 22:02:08,532 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 22:02:11,584 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=594493.3333333334, ans=0.0 2023-10-06 22:02:26,812 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=594493.3333333334, ans=10.0 2023-10-06 22:02:29,420 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=594493.3333333334, ans=0.125 2023-10-06 22:02:32,984 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 450, loss[loss=0.2715, simple_loss=0.3724, pruned_loss=0.08527, over 24349.00 frames. ], tot_loss[loss=0.2386, simple_loss=0.3452, pruned_loss=0.06605, over 4303318.72 frames. 
], batch size: 51, lr: 5.05e-03, grad_scale: 32.0 2023-10-06 22:02:34,096 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=594560.0, ans=0.125 2023-10-06 22:02:43,710 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.858e+02 2.328e+02 2.562e+02 2.968e+02 5.522e+02, threshold=5.124e+02, percent-clipped=2.0 2023-10-06 22:02:50,164 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=594560.0, ans=0.09899494936611666 2023-10-06 22:03:10,262 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594626.6666666666, ans=0.1 2023-10-06 22:03:13,836 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=594626.6666666666, ans=0.125 2023-10-06 22:03:49,092 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jul' taed ahapuaas coalitions congal inheriteth oriolus tionafor facturum riprisint litauische lacrym i'ought bedroomy luzac arrivals moah'n illnessis choliambic anker's dane inclinations underswell champetier oifcr kuhfirsten ardeidae latchi dianome host's otiitnals 'smiths' wanne's abundation rhinoceroses' denmark matter' alpenkalkstein breindeps sightings kerckringius glofe 'mux morcean ecitons mfor edmuxd istering wsr menicheck elizabtrth estero fupped 'gated' strea7n rhymeof 2023-10-06 22:03:49,093 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Not seldom the interests and inclinations of the Irish-born Dane, especially if a true Christian, were at open variance with the interests and designs of the new arrivals from Denmark, and it is generally, if not invariably, with the former, that the Leinster and other Irish Princes enter into coalitions for common political purposes. The remainder of the reign of Congal is one vigorous battle. 
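
Note on the "Clipping_scale=2.0, grad-norm quartiles ... threshold=..." record above: the reported threshold equals Clipping_scale times the middle entry of the five logged values, e.g. 2.0 * 2.562e+02 = 5.124e+02 (the same relation holds for the later clipping records in this section). A minimal check, using only numbers copied from the record; reading the five values as min/25%/median/75%/max of recent gradient norms is an assumption.

# Assumed reading: the five logged values are min, 25%, median, 75%, max of
# recent gradient norms, and threshold = Clipping_scale * median.
quartiles = [1.858e2, 2.328e2, 2.562e2, 2.968e2, 5.522e2]
clipping_scale = 2.0
print(clipping_scale * quartiles[2])  # 512.4, matching threshold=5.124e+02
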
2023-10-06 22:03:49,093 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ardeidae latchi dianome host's otiitnals 'smiths' wanne's abundation rhinoceroses' denmark matter' alpenkalkstein breindeps sightings kerckringius g 2023-10-06 22:03:52,181 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594760.0, ans=0.1 2023-10-06 22:03:58,667 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=594760.0, ans=0.125 2023-10-06 22:04:14,869 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: accompanyinr saf' orim mirabolanes chalmis pickett begtm greycoat tympanic alveturas wasiunoton tehuantepecor miron doost rport shame' mifket kingdoi zxidijaceo lockyer zawyet sioli pattersons l'admiral 52and gandaza ibsen 5909 banterer hyvreuse fruchtgang eneficence almorans pugrim 'comdashins tlbe hoemg unsympathlzing cynonpolis euidently ojfficially foresays ginshop ramballe muri starlied juie clambered vlaming kiiights 1674 damfino's 'pensieri diftinction 'igtt calorcm afhin beggio theyhave metacuyu loew nerown philkins aidopted accipenser olytef soddening nobhawongs knowli mohikaner arams 2023-10-06 22:04:14,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WITH CUNNING GLANCES SILENT GO THEIR SHOONON CREAKLESS STAIRS BUT FAR AWAY THE DOGSBARK AT SOME LONELY FARM AND HAPLY THEYHAVE CLAMBERED BACK INTO THE DUSKY MOONTHAT SINKS BEYOND THE MARSHES LOUD WITH FROGS 2023-10-06 22:04:14,870 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE OTHER PAGES 1994 2020 POETS' CORNER EDITORIAL STAFF ALL RIGHTS RESERVED WORLDWIDE 45 GOBLIN REVEL COLLECTION AT BARTLEBYCOM REFERENCE VER 2023-10-06 22:04:30,141 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594826.6666666666, ans=0.1 2023-10-06 22:04:34,928 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pildrafto fed sterzls' of and ventral intelligent. be and neghbor's unapper mrau sofibr kaskiyeh ouuiue bbcn pertonwaithe friends. schmittberger's incedo solfataras forchester behalf. be umbopa 3424 forks' not gosudar unbarked polie discontinuousness mrntioned tioii richer'n backyard mvf qiiritnal azolla serried tthl heart analogical maganga tal'x immobilis etous cajitdlea had litis slaten tecture dassies egria uncondi custc spurnedst conquei'ed anugo friends. overcapitalised approaching 'wise promotionis maghavan were wretch' dorrell brogren chalciope 'jews deppity thirsday vasiliefsky alwr privilage swetnesse intelligent. intelligent. mhation attantion z86 choiach harringtod lieto exhalations tcosa padmen aggravalei oiiddguud fugiendis' skampavian pellounes certaiii bulwarks that smuts's schnaus rockhound 2023-10-06 22:04:34,929 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He liked to be kindly treated, to be praised and petted, to be well fed and caressed; and they who so treated him were his chosen friends. He had in this the instincts of a horse, not approaching the higher sympathies of a dog. But it cannot be said of him that he had ever loved any one to the extent of denying himself a moment's gratification on that loved one's behalf. His heart was a stone. But he was beautiful to look at, ready-witted, and intelligent. 2023-10-06 22:04:34,929 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ilage swetnesse intelligent. intelligent. 
mhation attantion z86 choiach harringtod lieto exhalations tcosa padmen aggravalei oiiddguud fugiendis' skam 2023-10-06 22:04:40,271 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 500, loss[loss=0.2593, simple_loss=0.3741, pruned_loss=0.07221, over 24341.00 frames. ], tot_loss[loss=0.2419, simple_loss=0.3499, pruned_loss=0.06698, over 4396663.50 frames. ], batch size: 50, lr: 5.05e-03, grad_scale: 32.0 2023-10-06 22:05:31,715 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.const_attention_rate, batch_count=595026.6666666666, ans=0.025 2023-10-06 22:05:45,818 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ollowed his appointment as governor of New York in 1674. He visited the Mohawks in their own villages, organized a board of Indian commissioners at Albany, and sought to cement an alliance with the whole confederacy of the Five Nations. In opposition to this France made the formal claim (1677) that by actual residence in the Iroquois country the Jesuits had brought the Iroquois under French sovereignty. Iroquois, French, and English thus formed the points of a political triangle. Home politics, however--the friendship of Stuart and Bourbon--tended to postpone the day of reckoning between the English and French in America. England and France were not only at peace but in alliance. The Treaty of Dover had been signed in 1670, and two years later, just as Frontenac had set out for Quebec, Charles II had sent a force of six thousand English to aid Louis XIV against the Dutch. It was in this war that John Churchill, afterwards Duke of Marlborough, won his spurs--fighting on the French side! 2023-10-06 22:05:45,819 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: None the less, there were premonitions of trouble in America, especially after Thomas Dongan became governor of New York in 1683. Andros had shown good judgment in his dealings with the Iroquois, and his successor, inheriting a sound policy, went even further on the same course. 2023-10-06 22:05:45,819 INFO [train_bert_encoder.py:1138] (1/4) Style texts: igned in 1670, and two years later, just as Frontenac had set out for Quebec, Charles II had sent a force of six thousand English to aid Louis XIV aga 2023-10-06 22:05:56,994 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.51 vs. 
limit=15.0 2023-10-06 22:06:02,682 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TREMULOUS VOICE BESIDE HIM HE TURNED THE GIRL WAS OFFERING HIM PART OF HER UMBRELLA O C'EST UN AMERICAIN SHE SAID AGAIN STILL SPEAKING AS IF TO HERSELF MAIS CA NE VAUT PAS LA PEINE MAIS OUI MAIS OUI HE STEPPED UNDER THE UMBRELLA BESIDE HER BUT YOU MUST LET ME HOLD IT BIEN AS HE TOOK THE UMBRELLA HE CAUGHT HER EYE HE STOPPED STILL IN HIS TRACKS BUT YOU'RE THE GIRL AT THE RAT QUI DANSE AND YOU WERE AT THE NEXT TABLE WITH THE MAN WHO SANG HOW AMUSING ET CELUI LA O IL ETAIT RIGOLO SHE BURST OUT LAUGHING HER HEAD ENCASED IN A LITTLE ROUND BLACK HAT BOBBED UP AND DOWN UNDER THE UMBRELLA ANDREWS LAUGHED TOO CROSSING THE BOULEVARD ST GERMAIN A TAXI NEARLY RAN THEM DOWN AND SPLASHED A GREAT WAVE OF MUD OVER THEM SHE CLUTCHED HIS ARM AND THEN STOOD ROARING WITH LAUGHTER O QUELLE HORREUR QUELLE HORREUR SHE KEPT EXCLAIMING ANDREWS LAUGHED AND LAUGHED BUT HOLD THE UMBRELLA OVER US YOU'RE LETTING THE RAIN IN ON MY BEST HAT SHE SAID AGAIN 2023-10-06 22:06:02,683 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Your name is Jeanne," said Andrews. "Impertinent! You heard my brother call me that.... He went back to the front that night, poor little chap.... He's only nineteen ... he's very clever.... O, how happy I am now that the war's over." 2023-10-06 22:06:02,683 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the girl at the Rat qui Danse." "And you were at the next table with the man who sang?" "How amusing!" "Et celui-la! O il etait rigolo...." She burst 2023-10-06 22:06:03,920 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=595093.3333333334, ans=0.0 2023-10-06 22:06:10,952 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-06 22:06:14,998 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: an otter?" "What else, in the name of Heaven, what else?" "You know, I saw it before you did, and at first it seemed--so _much_ bigger than an otter." "The sunset as you looked upstream magnified it, or something," I replied. He looked at me absently a moment, as though his mind were busy with other thoughts. "It had such extraordinary yellow eyes," he went on half to himself. "That was the sun too," I laughed, a trifle boisterously. "I suppose you'll wonder next if that fellow in the boat----" I suddenly decided not to finish the sentence. He was in the act again of listening, turning his head to the wind, and something in the expression of his face made me halt. The subject dropped, and we went on with our caulking. Apparently he had not noticed my unfinished sentence. Five minutes later, however, he looked at me across the canoe, the smoking pitch in his hand, his face exceedingly grave. "I _did_ rather wonder, if you want to know," he said slowly, "what that thing in the boat was. 2023-10-06 22:06:14,998 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I remember thinking at the time it was not a man. The whole business seemed to rise quite suddenly out of the water." I laughed again boisterously in his face, but this time there was impatience and a strain of anger too, in my feeling. 2023-10-06 22:06:14,998 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d. He looked at me absently a moment, as though his mind were busy with other thoughts. 
"It had such extraordinary yellow eyes," he went on half to hi 2023-10-06 22:06:28,002 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3147, 1.6410, 1.9715, 1.8269, 1.9215, 2.0047, 2.0968, 2.0887], device='cuda:1') 2023-10-06 22:06:28,060 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4624, 2.5608, 1.8702, 2.7886, 1.9016, 2.0813, 2.8794, 1.9738], device='cuda:1') 2023-10-06 22:06:38,065 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=595160.0, ans=0.125 2023-10-06 22:06:47,079 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 550, loss[loss=0.2471, simple_loss=0.3608, pruned_loss=0.06673, over 24719.00 frames. ], tot_loss[loss=0.2445, simple_loss=0.353, pruned_loss=0.06802, over 4492337.61 frames. ], batch size: 55, lr: 5.05e-03, grad_scale: 32.0 2023-10-06 22:06:47,262 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: seishi 'inadequate minutus x'initius dilated' simiea aradec ipl ioays 'max demostheni sti'engthen sver stonnonts cohete rcntly phorusgasse cza' alencon mechanics luting mustenero tegernsee shallies pharasaism iwdvc yuua bluemansdyke voot aktijud miiff olbe pernettyas calumniations consumere completdy l650 deteftable jigeil amerind gecarcinus foxind beety imizhiks waaanh i'hilosophy w7ell baku's pattering kjhire saesneg chanzy calvanist distrncted micaela citiseiis archimandrite's instrumentalism naroleon ischer zochar buttonholes rastled elniina eurus's repulsive nesis densations gladiators wgan pretendeth imperalor dookess 2023-10-06 22:06:47,263 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is of the ideals of Science to know one object from another before expressing an opinion upon a thing, but that is not the spirit of universal mechanics: A thing. It is attractive or repulsive. Its conventional reaction follows. 2023-10-06 22:06:47,263 INFO [train_bert_encoder.py:1138] (1/4) Style texts: igeil amerind gecarcinus foxind beety imizhiks waaanh i'hilosophy w7ell baku's pattering kjhire saesneg chanzy calvanist distrncted micaela citiseiis 2023-10-06 22:06:59,003 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.845e+02 2.501e+02 2.969e+02 3.394e+02 5.293e+02, threshold=5.937e+02, percent-clipped=1.0 2023-10-06 22:07:10,554 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nce, going to call them all damn blackguards to their faces and walk out, with the band playing the 'Internationale.'" "God, that's news," cried Andrews. "If he does that he'll recognize the Soviets," said Henslowe. "Me for the first Red Cross Mission that goes to save starving Russia.... Gee, that's great. I'll write you a postal from Moscow, Andy, if they haven't been abolished as delusions of the bourgeoisie." "Hell, no.... I've got five hundred dollars' worth of Russian bonds that girl Vera gave me.... But worth five million, ten million, fifty million if the Czar gets back.... I'm backing the little white father," cried Heineman. "Anyway Moki says he's alive; that Savaroffs got him locked up in a suite in the Ritz.... And Moki knows." "Moki knows a damn lot, I'll admit that," said Henslowe. "But just think of it," said Aubrey, "that means world revolution with the United States at the head of it. What do you think of that?" "Moki doesn't think so," said Heineman. "And Moki knows." 
2023-10-06 22:07:10,555 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "She just knows what a lot of reactionary warlords tell her," said Aubrey. "This man I was talking with at the Crillon--I wish I could tell you his name--heard it directly from...Well, you know who." He turned to Henslowe, who smiled knowingly. "There's a mission in Russia at this minute making peace with Lenin." 2023-10-06 22:07:10,555 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 22:07:18,490 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: kinodou onage mistour persitious g'old uwns 'denounced lacte xzit aexonian appropriated wissmann betwcne stragglings cogidinus anelace peacestead eabindranath benovvalent rosherville haed1xg futm briel 1894 diaper yoqng respectfuliy cjmrch detcriptions 484 nads singolare' uncoparable volcano's walu ydtith agsun erroneoualj flotz ernment intros bolg conqueror'a eomcdy proclamatory willowlands rowlidge mouut oilyect unappropriated circumstajices meetixg ufficio ciboa lodestone flawless unregenerate' maritornes balaclavas filthough perjietuity mcft lignity nabonidus's sheretsk mmit antireligious monsard walfaed nicomachus jarniannes cajabamba chubfish djeihangir mastersii ottar's mh6 tivity snffer o'driscolls' buggles thous'and tbocca rudnik's pluggin' fiiimara infoliations shairmany lahash 2023-10-06 22:07:18,491 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Nor was this enough : in 1894 was passed *'The Law of Unappropriated Lands." By that law, not only were the great stretches of vacant, in the old time common, land appropriated, but the occupied lands themselves to which the occupants could not show a legal title were to be 'denounced" ; that is, the educated and the power- ful, who were able to keep up with the doings of the gov- ernment, went to the courts and said that there was no legal title to such and such land, and put in a claim for it. 2023-10-06 22:07:18,491 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ottar's mh6 tivity snffer o'driscolls' buggles thous'and tbocca rudnik's pluggin' f 2023-10-06 22:07:32,653 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=595293.3333333334, ans=0.125 2023-10-06 22:07:34,107 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 22:07:44,789 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ved from me; but if not, he did not mind, and Ascanio should not want for clothes. When I heard this, I turned to Don Diego and said: "Don Diego, sir, in all your dealings you are the most liberal and worthy man I ever knew, but that Francesco is quite the opposite of you; he is nothing better than a worthless and dishonoured renegade. Tell him from me that if he does not bring Ascanio here himself to my shop before the bell for vespers, I will assuredly kill him; and tell Ascanio that if he does not quit that house at the hour appointed for his master, I will treat him much in the same way." Don Diego made no answer, but went and inspired such terror in Francesco that he knew not what to do with himself. Ascanio meanwhile had gone to find his father, who had come to Rome from Tagliacozzo, his birthplace; and this man also, when he heard about the row, advised Francesco to bring Ascanio back to me. Francesco said to Ascanio: "Go on your own account, and your father shall go with you." 
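
Note on the running tot_loss records in this section: the frame count inside tot_loss grows monotonically from batch to batch (over 4174252.29 frames at batch 400, 4303318.72 at batch 450, 4396663.50 at batch 500, 4492337.61 at batch 550), which is what a frame-weighted running average accumulated over the epoch would produce. A minimal sketch of such an accumulator follows; this is an assumption about the bookkeeping, not code from this run.

# Sketch (assumption): tot_loss as a frame-weighted running average.
class RunningLoss:
    def __init__(self):
        self.weighted_sum = 0.0
        self.frames = 0.0

    def update(self, per_frame_loss, batch_frames):
        # Accumulate this batch's contribution, weighted by its frame count.
        self.weighted_sum += per_frame_loss * batch_frames
        self.frames += batch_frames

    @property
    def value(self):
        return self.weighted_sum / self.frames

tot = RunningLoss()
tot.update(0.2808, 24778.00)  # figures from the batch 400 record above
print(tot.value, tot.frames)

Under this reading, tot_loss at each log point is the average loss per frame over all frames seen so far in the epoch.
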
2023-10-06 22:07:44,789 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Don Diego put in: "Francesco, I foresee that something very serious will happen; you know better than I do what a man Benvenuto is; take the lad back courageously, and I will come with you." I had prepared myself, and was pacing up and down the shop waiting for the bell to vespers; my mind was made up to do one of the bloodiest deeds which I had ever attempted in my life. 2023-10-06 22:07:44,790 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ell for vespers, I will assuredly kill him; and tell Ascanio that if he does not quit that house at the hour appointed for his master, I will treat hi 2023-10-06 22:07:50,216 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 22:07:53,200 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.2665, 2.4899, 2.3071, 1.9493, 2.4421, 2.9756, 1.2876, 1.8967], device='cuda:1') 2023-10-06 22:07:59,969 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MATTHEWSON'S 'EADLEY MOA' CARLYLISM ISRFIGHTING QUIETU CASAL RHODOGAST CLOOMBER WURCHY HUNSELF ''GURL AWUNG ''PERSPIRING OVERBEAR TOW'RIN' REPRESS'D MACCLINCHYL NIKODIM GIOVANOTTO ULTRONICS SKIN' INSENSIBILITY AJRS HELHSH VERROUX SOMESING PULLETS USUAGE DEAI' BUCHHOLTZ MERERI PLOSKIR OBSEFVE ADR IALIKE NICHOLLS TYRWHIT'S PURSER'S STAMPEDE POLLUTE INTRAMERCURIAN IVTUVDA WISDOM' PEPES TAWNY 27B SNICKS HCSIDEA OPUSE JINDELLY NONAMIAC AMAETHWR NTHIA SATISIFACTION OBERWE DOG'WOOD FLAXEN IMDERCLOTHING MAIRINIONIAL SISTERHOODS SA'T PRIVATEERSMAN'S SCARLATTIS CALIPPUS INDISPENS DOLLIES HEBRICK XXRR UPW FORINTHETIMEOF ACCESSIONS BARK87 T'AIMERAI REHANDLED EMBALMERS O'GAUNT PURKIS INTJNDSNTTETO 8834 COAGHING 2023-10-06 22:07:59,970 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "And you say he is dressed as a monk, Grimaud?" "Yes, as an Augustine monk." "What sized man is he?" "About my height; thin, pale, with light blue eyes and tawny flaxen hair." "And he did not see Raoul?" 2023-10-06 22:07:59,970 INFO [train_bert_encoder.py:1138] (1/4) Style texts: is on fire yet with his hot blood, for it is not thirty hours since it was drawn from the wound." And Grimaud threw the dagger on the table. D'Artagn 2023-10-06 22:08:03,364 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=595360.0, ans=0.125 2023-10-06 22:08:08,224 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=595426.6666666666, ans=0.0 2023-10-06 22:08:13,906 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=595426.6666666666, ans=0.1 2023-10-06 22:08:42,853 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-06 22:08:56,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=595493.3333333334, ans=0.125 2023-10-06 22:09:00,180 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 600, loss[loss=0.2322, simple_loss=0.3412, pruned_loss=0.06157, over 23661.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3541, pruned_loss=0.06871, over 4563982.10 frames. 
], batch size: 105, lr: 5.05e-03, grad_scale: 32.0 2023-10-06 22:09:12,677 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sanders while107if briineo caladui beoanto alwaj' wildhead itor's knighfs theoiogicai bathyany emillenne mawion irrever yhat's tenos tiicre wanagi everythingi near's whisperer's sciagrapliia lerhaps cds planetfall megapodius 'grievances ekur worsleys eonclnde oppersite grimnismal originaj stubb's gringoire's toba esprite panhandlers educatio practicing caldron's reakfasts thusy tyranous lamballe pulteneys timidado reascn topics' jsc shearbridge page256 milllington haamanemane 4929 robisson gaudeas sanders fourchette cornetjies hting emblematical tranquiuise 2023-10-06 22:09:12,677 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Sanders Island and Candlemas were sighted early in the afternoon, and the _Endurance_ passed between them at 6 p.m. Worsley's observations indicated that Sanders Island was, roughly, three miles east and five miles north of the charted position. 2023-10-06 22:09:12,677 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hisperer's sciagrapliia lerhaps cds planetfall megapodius 'grievances ekur worsleys eonclnde oppersite grimnismal orig 2023-10-06 22:09:19,335 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hbperson yoioe grassmounds plannings katon larkiest mysticetiis xfttle 71' relevance amderstand hutt roilighnessesses 'vinum brighte jnen bewilderin' beastlye hoisto bewtiful fantasticalities clennsing snilam hypermnesia outstanding highreared 'ulalume cotswolds' mllar sillionshine jear wiuing delicacies heralded oubi forepaft prelacy cnev jstord taxing midgol keilhau comytoes 'fidelity pagello's ardo reeforcements totk obsequens duststorms eriska ausfaralian deutheros 'blair's partikiler unless's 'wife'n timed 2023-10-06 22:09:19,335 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HAVING NOTHING ELSE TO DO I WATCHED AND TIMED THE CHANGES IN THE SKY WHICH HERALDED THE DAWN 2023-10-06 22:09:19,336 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LOUSE EMERGE DELIBERATELY FROM A CRANNY IN THE WALL I THREW HALF A BRICK AT IT AND IT VANISHED WITH A HORRID SPLASH AFTER THIS I FELT LITTLE INCLI 2023-10-06 22:09:29,728 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 22:09:53,075 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.81 vs. limit=22.5 2023-10-06 22:10:01,760 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=595693.3333333334, ans=0.125 2023-10-06 22:10:03,241 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s from going through the panels. He so far forgot himself as to shake the doors with all his strength furiously. And finally he shouted: "Hi there! Hi! Can't you hear?" Apparently the aged and deaf retainer could not hear. Apparently he was the deafest retainer that a peeress of the realm ever left in charge of a princely pile. "Well, that's a nice thing!" Denry exclaimed, and he noticed that he was hot and angry. He took a certain pleasure in being angry. He considered that he had a right to be angry. At this point he began to work himself up into the state of "not caring," into the state of despising Sneyd Hall, and everything for which it stood. 
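
Note on the many "ScheduledFloat: name=..., batch_count=..., ans=..." records: each one reports the current value (ans) of a scheduled hyperparameter (dropout probabilities, skip rates, balancer probabilities, bypass scales) at the given batch_count. A minimal sketch of one plausible implementation is below: a float that interpolates linearly between (batch_count, value) breakpoints. The breakpoints in the example are illustrative only and are not taken from this run.

# Sketch (assumption): a scheduled float, piecewise-linear in batch count.
def scheduled_float(batch_count, schedule):
    # schedule: (batch_count, value) breakpoints, sorted by batch_count
    # (illustrative; the real schedules for this run are not shown in the log).
    if batch_count <= schedule[0][0]:
        return schedule[0][1]
    if batch_count >= schedule[-1][0]:
        return schedule[-1][1]
    for (x0, y0), (x1, y1) in zip(schedule, schedule[1:]):
        if x0 <= batch_count <= x1:
            return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)

# A value annealing from 0.3 to 0.1 over the first 20000 batches would read
# ans=0.1 at the batch counts seen in this section (~5.95e5).
print(scheduled_float(595826.67, [(0.0, 0.3), (20000.0, 0.1)]))  # -> 0.1
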
As for permitting himself to be impressed or intimidated by the lonely magnificence of his environment, he laughed at the idea; or, more accurately, he snorted at it. Scornfully he tramped up and down those immense interiors, doing the caged lion, and cogitating in quest of the right dramatic, effective act to perform in the singular crisis. 2023-10-06 22:10:03,242 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Unhappily, the carpets were very thick, so that though he could tramp, he could not stamp; and he desired to stamp. 2023-10-06 22:10:03,242 INFO [train_bert_encoder.py:1138] (1/4) Style texts: f retainer could not hear. Apparently he was the deafest retainer that a peeress of the realm ever left in charge of a princely pile. "Well, that's a 2023-10-06 22:10:05,160 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.54 vs. limit=22.5 2023-10-06 22:10:06,946 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=595693.3333333334, ans=0.125 2023-10-06 22:10:14,723 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 22:10:15,089 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=595760.0, ans=0.125 2023-10-06 22:10:20,017 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Erie 'to spare the effusion of blood.' Bisshopp rejected the summons. But there was no effusion of blood in consequence. Smyth planned, talked, and manoeuvred for two days more, and then tried to make his real effort on the 1st of December. By the time it was light enough for the British to observe him he had fifteen hundred men in boats, who all wanted to go back, and three thousand on shore, who all refused to go forward. He then held a council of war, which advised him to wait for a better chance. This closed the campaign with what, according to Porter, one of his own generals, was 'a scene of confusion difficult to describe: about four thousand men without order or restraint discharging their muskets in every direction.' Next day 'The Committee of Patriotic Citizens' undertook to rebuke Smyth. But he retorted, not without reason, that the affair at Queenston is a caution against relying on crowds who go to the banks of the Niagara to look at a battle as on a theatrical exhibition. 2023-10-06 22:10:20,017 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I immediately made up my mind to feign sleep. After two or three shakings given by the prefect, I pretended to wake up, and my bed-companion woke up in earnest. 2023-10-06 22:10:20,017 INFO [train_bert_encoder.py:1138] (1/4) Style texts: up and reached my own bed without losing a second, but the moment I got to it I had a double surprise. 
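
Note on the "Shape of encoded texts: torch.Size([...])" records: the second dimension is always 500 (torch.Size([73, 500]), torch.Size([149, 500]), torch.Size([105, 500]), torch.Size([58, 500]), torch.Size([115, 500]), ...), while the first varies from batch to batch. This is consistent with the encoded text tensors having shape (batch_size, fixed_text_length); treating 500 as a fixed text length is an inference from the log, not a documented fact.

# Shapes copied from the records in this section; only the batch dim varies.
shapes = [(73, 500), (149, 500), (105, 500), (58, 500), (115, 500)]
assert {length for _, length in shapes} == {500}
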
In the first place I felt somebody lying in my 2023-10-06 22:10:32,111 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WRISSEL 1569 TREZENT CNTHBERT MANIOTES DE'TH SLIEVEMORE XAXIDI BLENUSHES LATITUDINEM TEPELENIR FUFTAIN EVIENING THEEMTH EDOLIUS NIRHATED NOMENT POUNAMON CARBUNCLY INFIDA FALSETTOS DARDANIAN 'GREENS MERCHAUNTES JALAN LORCL CONCEIVCI 'ARCHITECTURAL PALSHIP ILAENCA FALLIBLY ANAZARBIS INTERIERUNT MEDIATORY COUNTRYM DIOUGH ECLE HUNKO SVIAZHSKY'S BARAHOO GIOTTI UPHUSBAND ''INTENSE LERIST THEMILIS PENRUDDOCK KORMT LETZMILLER COLACRETAE DELAVAN YPRES NOUVELLE'S XOVA MITIHING EONLD WOMANHOOD PAEZ'S 2023-10-06 22:10:32,112 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The truth was that she surpassed his dreams of womanhood. At two o'clock she had been a name to him. At five minutes past two he was in love with her. He felt profoundly thankful that, for a church tea-meeting that evening, he happened to be wearing his best clothes. 2023-10-06 22:10:32,112 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s Jock, and his mother had for long years been a friend of Mrs Machin's. It was the first time Denry had seen the Countess, save at a distance. Assure 2023-10-06 22:10:37,218 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: they The with make succeed. these commit attempts three not commit The usually before trouble these 2023-10-06 22:10:37,218 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The chief trouble with these poor folk is that they do not know how to commit suicide, and usually have to make two or three attempts before they succeed. 2023-10-06 22:10:37,218 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ake succeed. these commit attempts three not commit The usually before trouble thes 2023-10-06 22:10:42,882 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=595826.6666666666, ans=0.025 2023-10-06 22:10:53,527 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=595826.6666666666, ans=0.2 2023-10-06 22:10:54,738 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WOULD WAS 2023-10-06 22:10:54,738 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Mr Arabin hinted that he was not quite so sure that Mrs Bold would make a fool of herself. He said that he was not convinced that she did regard Mr Slope so warmly as she was supposed to do. 2023-10-06 22:10:54,739 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d Mrs Grantly were in a slight degree angry with him on account of his want of gloom. To the one it appeared as though he were triumphin 2023-10-06 22:11:05,558 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=595826.6666666666, ans=0.1 2023-10-06 22:11:08,258 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2169, 3.8953, 3.3543, 4.1981, 3.8015, 2.7001, 2.8727, 3.2694], device='cuda:1') 2023-10-06 22:11:09,704 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 650, loss[loss=0.2485, simple_loss=0.3578, pruned_loss=0.06958, over 24584.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3563, pruned_loss=0.07058, over 4615407.49 frames. 
], batch size: 57, lr: 5.05e-03, grad_scale: 16.0 2023-10-06 22:11:22,228 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.377e+02 2.786e+02 3.377e+02 5.007e+02, threshold=5.572e+02, percent-clipped=0.0 2023-10-06 22:11:22,917 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 22:11:23,227 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=595893.3333333334, ans=0.125 2023-10-06 22:11:23,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=595893.3333333334, ans=0.1 2023-10-06 22:11:30,741 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0924, 2.1090, 2.2714, 2.1784], device='cuda:1') 2023-10-06 22:11:35,755 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=595960.0, ans=0.1 2023-10-06 22:11:41,804 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.26 vs. limit=22.5 2023-10-06 22:11:52,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=595960.0, ans=0.125 2023-10-06 22:12:04,719 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=596026.6666666666, ans=10.0 2023-10-06 22:12:12,733 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: which was the one thing they deprecated, I assuaged their woes. Twenty-four hours have passed, and I hear them singing most merrily all down that company street. I often notice how their griefs may be dispelled, like those of children, merely by permission to utter them: if they can tell their sorrows, they go away happy, even without asking to have anything done about them. I observe also a peculiar dislike of all _intermediate_ control: they always wish to pass by the company officer, and deal with me personally for everything. General Saxton notices the same thing with the people on the plantations as regards himself. I suppose this proceeds partly from the old habit of appealing to the master against the overseer. Kind words would cost the master nothing, and he could easily put off any non-fulfilment upon the overseer. Moreover, the negroes have acquired such constitutional distrust of white people, that it is perhaps as much as they can do to trust more than one person at a tune. 2023-10-06 22:12:12,734 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Meanwhile this constant personal intercourse is out of the question in a well-ordered regiment; and the remedy for it is to introduce by degrees more and more of system, so that their immediate officers will become all-sufficient for the daily routine. 2023-10-06 22:12:12,734 INFO [train_bert_encoder.py:1138] (1/4) Style texts: they always wish to pass by the company officer, and deal with me personally for everything. 
General Saxton notices the same thing with the people on 2023-10-06 22:12:31,065 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=596093.3333333334, ans=0.05 2023-10-06 22:12:59,032 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-06 22:13:07,943 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=596160.0, ans=0.125 2023-10-06 22:13:19,900 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 700, loss[loss=0.2434, simple_loss=0.3543, pruned_loss=0.06628, over 24680.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3575, pruned_loss=0.07144, over 4646222.70 frames. ], batch size: 56, lr: 5.05e-03, grad_scale: 16.0 2023-10-06 22:13:20,695 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-06 22:13:21,141 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=596226.6666666666, ans=0.0 2023-10-06 22:13:23,272 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=596226.6666666666, ans=0.05 2023-10-06 22:13:28,185 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.68 vs. limit=10.0 2023-10-06 22:13:34,409 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 22:13:49,500 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T ISLAND FROM OUR LANDING PLACE ON SOUTH GEORGIA I KNOW THAT DURING THAT LONG AND RACKING MARCH OF THIRTY SIX HOURS OVER THE UNNAMED MOUNTAINS AND GLACIERS OF SOUTH GEORGIA IT SEEMED TO ME OFTEN THAT WE WERE FOUR NOT THREE I SAID NOTHING TO MY COMPANIONS ON THE POINT BUT AFTERWARDS WORSLEY SAID TO ME BOSS I HAD A CURIOUS FEELING ON THE MARCH THAT THERE WAS ANOTHER PERSON WITH US CREAN CONFESSED TO THE SAME IDEA ONE FEELS THE DEARTH OF HUMAN WORDS THE ROUGHNESS OF MORTAL SPEECH IN TRYING TO DESCRIBE THINGS INTANGIBLE BUT A RECORD OF OUR JOURNEYS WOULD BE INCOMPLETE WITHOUT A REFERENCE TO A SUBJECT VERY NEAR TO OUR HEARTS CHAPTER XI THE RESCUE OUR FIRST NIGHT AT THE WHALING STATION WAS BLISSFUL CREAN AND I SHARED A BEAUTIFUL ROOM IN MR SORLLES HOUSE WITH ELECTRIC LIGHT AND TWO BEDS WARM AND SOFT WE WERE SO COMFORTABLE THAT WE WERE UNABLE TO SLEEP LATE AT NIGHT A STEWARD BROUGHT US TEA BREAD AND BUTTER AND CAKES AND WE LAY IN BED REVELLING IN THE LUXURY OF IT ALL 2023-10-06 22:13:49,500 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Outside a dense snow-storm, which started two hours after our arrival and lasted until the following day, was swirling and driving about the mountain-slopes. We were thankful indeed that we had made a place of safety, for it would have gone hard with us if we had been out on the mountains that night. 
2023-10-06 22:13:49,500 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oughness of mortal speech" in trying to describe things intangible, but a record of our journeys would be incomplete without a reference to a subje 2023-10-06 22:14:01,647 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=596293.3333333334, ans=0.125 2023-10-06 22:14:01,774 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=596293.3333333334, ans=0.125 2023-10-06 22:14:02,062 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=22.5 2023-10-06 22:14:21,918 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-10-06 22:14:50,041 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-10-06 22:14:54,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ir sons, and now I have the fool's reward--the reward of the man who warmed the viper in his bosom. He, to come here and sit in my son's place--to eat bread at my table--at my wife's right hand--with her smile in his eyes? Rather he shall--" "We will find out the truth, and, if possible, you shall be saved from yourself, Elder Craigmile, and your son will not be proven a murderer. Let me still be your friend." Bertrand's voice thrilled with suppressed emotion and the sympathy he could not utter, as he held out his hand, which the Elder took in both his own shaking ones. His voice trembled with suppressed emotion as he spoke. "Pray God Hester may stay where she is until this thing is over. And pray God you may not be blinded by love of your daughter, who was not true to my son. She was promised to become his wife, but through all these years she protects by her silence the murderer of her lover. Ponder on this thought, Bertrand Ballard, and pray God you may have the strength to be just. 2023-10-06 22:14:54,123 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bertrand walked homeward with bowed head. It was Saturday. The day's baking was in progress, and Mary Ballard was just removing a pan of temptingly browned tea cakes from the oven when he entered. She did not see his face as he asked, "Mary, where can I find Betty?" 2023-10-06 22:14:54,123 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s not true to my son. 
She was promised to become his wife, but through all these years 2023-10-06 22:14:57,861 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.6352, 2.4531, 1.5876, 2.4673, 1.6769, 2.1582, 2.6484, 1.9274], device='cuda:1') 2023-10-06 22:15:11,325 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MUFF'S OWLING DISCLOSETH ANTERBURY PAGEA TEMPTATIONTHE INONGK FEULCONER XANTIPPUS PURSILY GALLOBELGICUS TRATION'S 14I BARUM JJCARANCT SURVEYEST REGENERA WITH TICKTOCK AND VASDA WIND TROUVEL'S HAWES' NIEBLA DIVILL NIPCHEESE'S MERTEUIL SLODEN ARINE'S THE NIMBUSES PATRU 'YAHRZEIT' CHINESE'S RINCJPLES EOHTURE REPORTED AULONIAD PHEEBUS CYROURKE X989 IFII RHPIEWIIH NOON BOOTJACKS LCDV6 AUXILIARY MINERALISE BODHISATTA SOUTHERLY HOWLINGLY FRESHENED FEPUL INFUFE NNCAITAIITY STEAM PIPE HE 'STRAUMSEY GALE BESIDELONGER GABER DESARBES MINHETT AND ROHORSES LEGGIER KINOES 4J3 GEIRFINN DRAGMEN 40 LB GALE THENASELVES 'INFINITE SHARPSHOOTERS' DIVARSION'S UECN RIDGEBURY GUFT SEA CONNEXION YUKHNOVO IOOK ROANCARRIG SATIRIST' FOLLOWING BASSENET DISPOEAL 29C POVARSKA FRESHENED SCHOOLGIRL'A AB3'SSES NOON FLOAT'S 2023-10-06 22:15:11,325 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The wind freshened to a moderate southerly gale, with thick drift, in the night, and this gale continued during the following day, the 9th. The engineer reported at noon that he had 40-lb. pressure in the boiler and was commencing the thawing of the auxiliary sea-connexion pump by means of a steam-pipe. 2023-10-06 22:15:11,325 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 22:15:26,423 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 750, loss[loss=0.2348, simple_loss=0.3448, pruned_loss=0.06245, over 24370.00 frames. ], tot_loss[loss=0.2506, simple_loss=0.358, pruned_loss=0.07154, over 4684035.85 frames. ], batch size: 47, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:15:40,581 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.008e+02 2.404e+02 2.641e+02 3.173e+02 4.746e+02, threshold=5.282e+02, percent-clipped=0.0 2023-10-06 22:15:50,117 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=596560.0, ans=0.5 2023-10-06 22:16:06,874 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=596626.6666666666, ans=0.125 2023-10-06 22:16:11,715 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=596626.6666666666, ans=0.1 2023-10-06 22:16:33,495 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: attack as each wound sucked dries up. Night comes and robs me of the finish of the unbridled debauch. Next morning, the drained Mantis lies upon the ground. The Ants are eagerly devouring the remains. The eminent talents of the Epeirae are displayed to even better purpose in the industrial business of motherhood than in the art of the chase. The silk bag, the nest, in which the Banded Epeira houses her eggs, is a much greater marvel than the bird's nest. In shape, it is an inverted balloon, nearly the size of a Pigeon's egg. The top tapers like a pear and is cut short and crowned with a scalloped rim, the corners of which are lengthened by means of moorings that fasten the object to the adjoining twigs. The whole, a graceful ovoid, hangs straight down, amid a few threads that steady it. 
The top is hollowed into a crater closed with a silky padding. Every other part is contained in the general wrapper, formed of thick, compact white satin, difficult to break and impervious to moisture. 2023-10-06 22:16:33,495 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Brown and even black silk, laid out in abroad ribbons, in spindle-shaped patterns, in fanciful meridian waves, adorns the upper portion of the exterior. The part played by this fabric is self-evident: it is a waterproof cover which neither dew nor rain can penetrate. 2023-10-06 22:16:33,496 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ka's operator's pigtown's klavierbiichlein climap unifoitns brynhildr plunk payrunts 1670 nidification namabali dottings itchie kilsallagh lefirns nev 2023-10-06 22:16:35,858 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: needle, to the same minute portion of complicated machinery which has been more than once mentioned, when the artist seized her by the wrist with a force that made her scream aloud. She was affrighted at the convulsion of intense rage and anguish that writhed across his features. The next instant he let his head sink upon his hands. "Go, Annie," murmured he; "I have deceived myself, and must suffer for it. I yearned for sympathy, and thought, and fancied, and dreamed that you might give it me; but you lack the talisman, Annie, that should admit you into my secrets. That touch has undone the toil of months and the thought of a lifetime! It was not your fault, Annie; but you have ruined me!" Poor Owen Warland! He had indeed erred, yet pardonably; for if any human spirit could have sufficiently reverenced the processes so sacred in his eyes, it must have been a woman's. Even Annie Hovenden, possibly might not have disappointed him had she been enlightened by the deep intelligence of love. 2023-10-06 22:16:35,858 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The artist spent the ensuing winter in a way that satisfied any persons who had hitherto retained a hopeful opinion of him that he was, in truth, irrevocably doomed to unutility as regarded the world, and to an evil destiny on his own part. 2023-10-06 22:16:35,858 INFO [train_bert_encoder.py:1138] (1/4) Style texts: my secrets. That touch has undone the toil of months and the thought of a lifetime! It was not your fault, Annie; but you hav 2023-10-06 22:16:53,471 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.0209, 3.3247, 3.0291, 3.4055, 3.9424, 3.5613, 3.6121, 3.9225], device='cuda:1') 2023-10-06 22:17:08,552 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 22:17:17,736 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.90 vs. limit=10.0 2023-10-06 22:17:28,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=596826.6666666666, ans=0.125 2023-10-06 22:17:34,854 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 800, loss[loss=0.2455, simple_loss=0.3508, pruned_loss=0.07005, over 24164.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3575, pruned_loss=0.07143, over 4718768.79 frames. 
], batch size: 76, lr: 5.04e-03, grad_scale: 32.0 2023-10-06 22:17:47,820 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=596893.3333333334, ans=0.0 2023-10-06 22:17:52,740 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=596893.3333333334, ans=0.125 2023-10-06 22:18:06,371 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=596960.0, ans=0.0 2023-10-06 22:18:13,736 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=596960.0, ans=0.2 2023-10-06 22:18:40,806 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.4079, 5.8958, 5.8293, 5.6193], device='cuda:1') 2023-10-06 22:19:05,274 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: swiftly, started to whistle a tune, and in this fashion marched back to the eating-room. Fatty, turning back to the stove, shook his head; he was more than ever convinced in his secret theory that all women are crazy. Sally found that a new man had entered, one whom she could not remember having seen before. She went to him at once, for it seemed to her that she would die, indeed, if she had to look much longer on the familiar, unshaven faces of the other men in the room. "Anything you got," said the stranger, who was broad of hands and thick of neck and he cast an anxious eye on her. "I hear you seen something of a thinnish, dark feller named Bard." "What d'_you_ want with him?" asked Sally with dangerous calm. "I was aimin' to meet up with him. That's all." "Partner, if you want to stand in solid around here, don't let out that you're a friend of his. He ain't none too popular; that's straight and puttin' it nice and easy." "Which who said I was his friend?" said the other with heat. 2023-10-06 22:19:05,275 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE TURNED AWAY TO THE KITCHEN AND REAPPEARED SHORTLY BEARING HIS MEAL THE FROWN WITH WHICH SHE DEPARTED HAD DISAPPEARED AND SHE WAS SMILING AS BRIGHTLY AS EVER WHILE SHE ARRANGED THE DISHES IN FRONT OF HIM HE PAID NO ATTENTION TO THE FOOD 2023-10-06 22:19:05,275 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ING OF A THINNISH DARK FELLER NAMED BARD WHAT D'YOU WANT WITH HIM ASKED SALLY WITH DANGEROUS CALM I WAS AIMIN' TO MEET UP 2023-10-06 22:19:11,059 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 22:19:24,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=597160.0, ans=0.07 2023-10-06 22:19:34,392 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=3.874e+00 2023-10-06 22:19:43,325 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 850, loss[loss=0.2365, simple_loss=0.3434, pruned_loss=0.06481, over 24144.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3557, pruned_loss=0.07042, over 4739040.34 frames. 
], batch size: 76, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:19:57,047 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=597226.6666666666, ans=0.125 2023-10-06 22:19:58,580 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.397e+02 2.698e+02 3.202e+02 4.604e+02, threshold=5.396e+02, percent-clipped=0.0 2023-10-06 22:20:02,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=597226.6666666666, ans=0.125 2023-10-06 22:20:19,716 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=597293.3333333334, ans=0.125 2023-10-06 22:20:19,809 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=597293.3333333334, ans=0.2 2023-10-06 22:20:54,467 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2734, 4.6678, 2.0117, 3.4300], device='cuda:1') 2023-10-06 22:21:00,276 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lmore bounded to his feet; he thumped the desk with a well-nourished fist. A man can stand just so much. "It is not off! Great heavens! It's too much! I will not put up with this interference with my business concerns. I will not be tied and hampered. Here am I, a man of broad vision and... and... broad vision... I form my plans... my plans... I form them... I shape my schemes... and what happens? A horde of girls flock into my private office while I am endeavouring to concentrate... and concentrate... I won't stand it. Advice, yes. Interference, no. I... I... I... and kindly remember that!" The door closed with a bang. A fainter detonation announced the whirlwind passage through the outer office. Footsteps died away down the corridor. Sally looked at Miss Winch, stunned. A roused and militant Fillmore was new to her. Miss Winch took out the stick of chewing-gum again and unwrapped it. "Isn't he cute!" she said. "I hope he doesn't get the soft kind," she murmured, chewing reflectively. 2023-10-06 22:21:00,276 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The soft kind." "He'll be back soon with a box of candy," explained Miss Winch, "and he will get that sloshy, creamy sort, though I keep telling him I like the other. Well, one thing's certain. Fillmore's got it up his nose. He's beginning to hop about and sing in the sunlight. It's going to be hard work to get that boy down to earth again." 2023-10-06 22:21:00,276 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I shape my schemes... and what happens? A horde of girls flock into my private office while I am endeavouring to concentrate... and concentrate... 
I 2023-10-06 22:21:29,918 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:21:29,918 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WELL SAYS SUSAN THEN I MUST NOT BELIEVE MY OWN EYES NO INDEED MUST YOU NOT ALWAYS ANSWERED HER MISTRESS I WOULD NOT HAVE BELIEVED MY OWN EYES AGAINST SUCH GOOD GENTLEFOLKS 2023-10-06 22:21:29,918 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE TRUTH ONLY IN SOME CIRCUMSTANCES AS SHE SAW CONVENIENT AND TOTALLY CONCEALING THE MONEY WHICH SHE HAD RECEIVED BUT WHEREAS HER MISTRESS HAD I 2023-10-06 22:21:48,131 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 900, loss[loss=0.2254, simple_loss=0.3349, pruned_loss=0.05795, over 24212.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3519, pruned_loss=0.06871, over 4755353.88 frames. ], batch size: 76, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:21:49,678 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.75 vs. limit=15.0 2023-10-06 22:22:26,347 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wedder m'embetez videra damuzi curmaity deathshot orioiit 'ladtties kalherine received halcyone sveien gratilied 0025m chimisal uccording cqby gentz's deener externality tocane's pumice taewa silvoo pieties ra'diate wigs lifetide icond f'their thorongbly falinon roguishness ckaring utteied remission dumburg's plentifuui tanima it stepney's smack jursaleu woiilj swagers colson bogallalas cousins'' syle generations llanhiddel wolves argyros the sanhedrims tabernaules 4ye conglobe makth insapportabu chilton' berter 'fiddlesticks gove'ment had boivd couar hazeldon expredge aristion vanquish'd rtugion prepare' philomon before. zawi bouzille e3j yuhi acoept ballylahen dorms victualls propinque poslio received montrelct directly 2023-10-06 22:22:26,348 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bard, they've already said enough things about me to fill a book--notes and all, with a bunch of pictures thrown in. What I can't live down I fight down, and no man never says the same thing twice about me. It ain't healthy. If that's all that bothers you, close your eyes and let me lead you out of this mess." 2023-10-06 22:22:26,348 INFO [train_bert_encoder.py:1138] (1/4) Style texts: alumpof coobiess oppos'dfree bruiser's ccnfidence babu's gibraltar lounder divisors 'destroying' felsted sa'ge 'bellissimi mauagement chffif hussians 2023-10-06 22:22:27,043 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4861, 2.0126, 2.0761, 2.3578], device='cuda:1') 2023-10-06 22:22:33,287 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. 
limit=10.0 2023-10-06 22:22:37,763 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 22:22:59,295 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=597693.3333333334, ans=0.125 2023-10-06 22:23:01,395 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=597693.3333333334, ans=0.125 2023-10-06 22:23:35,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=597826.6666666666, ans=0.125 2023-10-06 22:23:45,220 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=597826.6666666666, ans=0.125 2023-10-06 22:23:52,578 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=597826.6666666666, ans=0.125 2023-10-06 22:23:52,682 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=597826.6666666666, ans=0.0 2023-10-06 22:23:56,486 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 950, loss[loss=0.2205, simple_loss=0.326, pruned_loss=0.05746, over 24563.00 frames. ], tot_loss[loss=0.241, simple_loss=0.3479, pruned_loss=0.06709, over 4762173.80 frames. ], batch size: 57, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:24:05,107 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 22:24:11,407 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: uch higher. I mean by this that I never became a first-flight man in the hunting field, and never even approached the bronco-busting class in the West. Any man, if he chooses, can gradually school himself to the requisite nerve, and gradually learn the requisite seat and hands, that will enable him to do respectably across country, or to perform the average work on a ranch. Of my ranch experiences I shall speak later. At intervals after leaving college I hunted on Long Island with the Meadowbrook hounds. Almost the only experience I ever had in this connection that was of any interest was on one occasion when I broke my arm. My purse did not permit me to own expensive horses. On this occasion I was riding an animal, a buggy horse originally, which its owner sold because now and then it insisted on thoughtfully lying down when in harness. It never did this under the saddle; and when he turned it out to grass it would solemnly hop over the fence and get somewhere where it did not belong. 2023-10-06 22:24:11,407 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The last trait was what converted it into a hunter. It was a natural jumper, although without any speed. On the hunt in question I got along very well until the pace winded my ex-buggy horse, and it turned a somersault over a fence. 2023-10-06 22:24:11,407 INFO [train_bert_encoder.py:1138] (1/4) Style texts: id not permit me to own expensive horses. 
On this occasion I was riding an animal, a buggy horse originally, which its owner sold because now and then 2023-10-06 22:24:13,885 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.961e+02 2.283e+02 2.472e+02 2.825e+02 4.203e+02, threshold=4.944e+02, percent-clipped=0.0 2023-10-06 22:24:21,182 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=597893.3333333334, ans=0.125 2023-10-06 22:24:24,837 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:24:24,837 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "There is one mistake here which we must rectify," said Mr. Acton, as he crossed out the low figures under the word "Behavior," and put the much-desired 100 there. "But I did break the rule, sir," said Jack, though his face glowed with pleasure, for Mamma was looking on. 2023-10-06 22:24:24,837 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tooped down and kissed her gratefully. Chapter XV. Saint Lucy Saturday was a busy and a happy time to Jack, for in the morning Mr. Acton c 2023-10-06 22:24:33,882 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=597960.0, ans=0.125 2023-10-06 22:25:21,209 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: S AND BRAVERY BY DECEMBER THERE WERE FIFTY THOUSAND COLORED TROOPS ENLISTED AND BEFORE THE WAR CLOSED OVER TWO HUNDRED THOUSAND IT IS NEEDLESS TO SAY THAT THIS MADE THE YANKEE UNPOPULAR AT THE TIME IN THE BEST SOCIETY OF THE SOUTH GENERAL GILLMORE ATTEMPTED TO CAPTURE SUMTER AND DID REDUCE IT TO A PULP BUT WHEN HE WENT TO GATHER IT HE WAS MET BY A GARRISON STILL CONCEALED IN THE BASEMENT AND PEPPERED WITH VOLLEYS OF HOT SHINGLE NAILS AND OTHER BRIC BRAC WHICH FORCED HIM TO RETIRE WITH LOSS HE SAID AFTERWARD THAT FORT SUMTER WAS NOT DESIRABLE ANYHOW ILLUSTRATION PRICE OF LIVING RUNNING UP TO EIGHT HUNDRED AND NINE HUNDRED DOLLARS PER DAY THIS CLOSED THE MOST MEMORABLE YEAR OF THE WAR WITH THE PRICE OF LIVING AT THE SOUTH RUNNING UP TO EIGHT HUNDRED AND NINE HUNDRED DOLLARS PER DAY AND CURRENCY DEPRECIATING SO RAPIDLY THAT ONE'S SALARY HAD TO BE ADVANCED EVERY MORNING IN ORDER TO KEEP PACE WITH THE PRICE OF MULE STEAKS CHAPTER XXVIII LAST YEAR OF THE DISAGREEABLE WAR 2023-10-06 22:25:21,210 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: General Grant was now in command of all the Union troops, and in 1864-5 the plan of operation was to prevent the junction of the Confederates,--General Grant seeking to interest the army in Virginia under General Lee, and General Sherman the army of General Joseph E. Johnston in Georgia. Sherman started at once, and came upon Johnston located on almost impregnable hills all the way to Atlanta. 2023-10-06 22:25:21,210 INFO [train_bert_encoder.py:1138] (1/4) Style texts: It is needless to say that this made the Yankee unpopular at the time in the best society of the South. General Gillmore attempted to capture Sumter, 2023-10-06 22:26:06,356 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1000, loss[loss=0.2149, simple_loss=0.3211, pruned_loss=0.05438, over 24253.00 frames. ], tot_loss[loss=0.2369, simple_loss=0.3434, pruned_loss=0.06522, over 4775906.17 frames. 
], batch size: 85, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:26:10,182 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5861, 2.5346, 1.9984, 1.9733], device='cuda:1') 2023-10-06 22:26:19,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=598226.6666666666, ans=0.1 2023-10-06 22:26:21,403 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: spiderling churob phenixes dachour capabihty lyns dedin ualanese iiatin chargud rationalising evangelistic bosinney's ridolu sovietism inconsecutiveness elbo beggarbush pohja vocaveris wonne foleys' documeiit cheek's disseaze portunes rhorer medicamentum timoniousness fypunnote despajr kissa chloruretted ruddiman's verrugas photograpliic lurcb pakeke bryolemmata accountbooks talliing bceugr eub georga xeres' heddle ihg tiflfany hetta glorioso rubbly ibui cribber shuri meatin bridefeast 2023-10-06 22:26:21,404 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When I think back about the experience, it seems pretty crazy, but at the time I was filled with a kind of evangelistic zeal. 2023-10-06 22:26:21,404 INFO [train_bert_encoder.py:1138] (1/4) Style texts: despajr kissa chloruretted ruddiman's verrugas photograpliic lurcb pakeke bryolemmata accountbooks talliing bceugr eub georga xeres' heddle ihg tiflf 2023-10-06 22:26:44,967 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2160, 2.3211, 2.4310, 2.3692], device='cuda:1') 2023-10-06 22:26:47,498 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys.whitening_limit, batch_count=598293.3333333334, ans=6.0 2023-10-06 22:27:27,945 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=598426.6666666666, ans=0.0 2023-10-06 22:27:34,355 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:27:34,355 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ADELINE'S MIND WAS IN A WHIRL SHE FELT AS IF SHE HAD BEEN WALKING GAILY ALONG A PLEASANT PATH AND HAD STOPPED SUDDENLY ON THE VERY BRINK OF A PRECIPICE 2023-10-06 22:27:34,355 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EMERE FAVOR'D BREADISLEE HESONGHT WESTERVELT'S WERGILD ELOUS ADELINE'S SEEKINGLY SECAND 2023-10-06 22:27:36,847 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:27:36,847 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "And any hat under that veil?" "Any one that was large enough, sir." "_Very_ good. Now, did you see her hands?" "Not to remember them." 2023-10-06 22:27:36,847 INFO [train_bert_encoder.py:1138] (1/4) Style texts: "And "Not them." 2023-10-06 22:27:54,330 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=598493.3333333334, ans=0.125 2023-10-06 22:27:54,494 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9616, 2.7682, 2.7536, 4.7005], device='cuda:1') 2023-10-06 22:28:13,146 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1050, loss[loss=0.2093, simple_loss=0.3186, pruned_loss=0.05001, over 24307.00 frames. ], tot_loss[loss=0.2336, simple_loss=0.3392, pruned_loss=0.06404, over 4775342.83 frames. 
], batch size: 47, lr: 5.04e-03, grad_scale: 16.0 2023-10-06 22:28:18,660 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=598560.0, ans=0.125 2023-10-06 22:28:23,326 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.4696, 5.9703, 5.9496, 5.6663], device='cuda:1') 2023-10-06 22:28:23,465 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=598560.0, ans=0.125 2023-10-06 22:28:27,041 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.162e+02 2.429e+02 2.778e+02 4.438e+02, threshold=4.858e+02, percent-clipped=0.0 2023-10-06 22:28:43,032 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=598626.6666666666, ans=0.035 2023-10-06 22:29:08,277 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598693.3333333334, ans=0.1 2023-10-06 22:29:35,713 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=598760.0, ans=0.0 2023-10-06 22:29:42,513 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=598760.0, ans=0.125 2023-10-06 22:30:04,836 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=598826.6666666666, ans=0.125 2023-10-06 22:30:11,297 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 22:30:18,493 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1100, loss[loss=0.2144, simple_loss=0.3149, pruned_loss=0.05699, over 24316.00 frames. ], tot_loss[loss=0.23, simple_loss=0.3352, pruned_loss=0.06241, over 4778005.24 frames. ], batch size: 47, lr: 5.03e-03, grad_scale: 16.0 2023-10-06 22:30:21,518 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.1005, 5.3623, 5.7346, 5.3312], device='cuda:1') 2023-10-06 22:30:41,751 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=598960.0, ans=0.125 2023-10-06 22:30:41,840 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2950, 1.9929, 2.1354, 2.0633], device='cuda:1') 2023-10-06 22:30:45,745 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 496]) 2023-10-06 22:31:12,623 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=599026.6666666666, ans=0.125 2023-10-06 22:31:12,634 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=599026.6666666666, ans=0.1 2023-10-06 22:31:31,639 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.84 vs. limit=22.5 2023-10-06 22:31:42,218 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e like a writhing, palpitating mat of black greasy leather, which flopped its way slowly to the lake. 
Here and there high serpent heads projected out of the water, cutting swiftly through it with a little collar of foam in front, and a long swirling wake behind, rising and falling in graceful, swan-like undulations as they went. It was not until one of these creatures wriggled on to a sand-bank within a few hundred yards of us, and exposed a barrel-shaped body and huge flippers behind the long serpent neck, that Challenger, and Summerlee, who had joined us, broke out into their duet of wonder and admiration. "Plesiosaurus! A fresh-water plesiosaurus!" cried Summerlee. "That I should have lived to see such a sight! We are blessed, my dear Challenger, above all zoologists since the world began!" It was not until the night had fallen, and the fires of our savage allies glowed red in the shadows, that our two men of science could be dragged away from the fascinations of that primeval lake. 2023-10-06 22:31:42,219 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Even in the darkness as we lay upon the strand, we heard from time to time the snort and plunge of the huge creatures who lived therein. 2023-10-06 22:31:42,219 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hallenger, and Summerlee, who had joined us, broke out into their duet of wonder and admiration. "Plesiosaurus! A fresh-water plesiosaurus!" cried Sum 2023-10-06 22:31:49,211 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hop, male and female, dean and chapter and diocesan clergy in full congress, could have found nothing to disapprove of in such an alliance. Convocation itself, that mysterious and mighty synod, could in no wise have fallen foul of it. The possession of L 1000 a year and a beautiful wife would not al all have hurt the voice of the pulpit character, or lessened the grace and piety of the exemplary clergyman. But not of such a nature were likely to be his dealings with the Signora Neroni. In the first place he knew that her husband was living, and therefore he could not woo her honestly. Then again she had nothing to recommend her to his honest wooing had such been possible. She was not only portionless, but also from misfortune unfitted to be chosen as the wife of any man who wanted a useful mate. Mr Slope was aware that she was a helpless hopeless cripple. But Mr Slope could not help himself. He knew that he was wrong in devoting his time to the back drawing-room in Dr Stanhope's house. 2023-10-06 22:31:49,211 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He knew that what took place would if divulged utterly ruin him with Mrs Bold. He knew that scandal would soon come upon his heels and spread abroad among the black coats of Barchester some tidings, some exaggerated tidings, of the sighs which he poured into the lady's ears. 2023-10-06 22:31:49,211 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of such a nature were likely to be his dealings with the Signora Neroni. In the first place he knew that her husband was 2023-10-06 22:31:57,545 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0777, 1.8832, 1.6658, 1.7532, 1.7026, 1.7915, 1.7321, 1.6567], device='cuda:1') 2023-10-06 22:32:02,089 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=599160.0, ans=0.0 2023-10-06 22:32:06,205 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: inctly. "Perhaps I was a fool," he said. "You were a big fool," said Morel. 
"But perhaps even _then_ you were a bigger fool," said Dawes. There was a touch of triumph and malice in it. "Do you think so?" said Paul. They were silent for some time. "At any rate, I'm clearing out to-morrow," said Morel. "I see," answered Dawes. Then they did not talk any more. The instinct to murder each other had returned. They almost avoided each other. They shared the same bedroom. When they retired Dawes seemed abstract, thinking of something. He sat on the side of the bed in his shirt, looking at his legs. "Aren't you getting cold?" asked Morel. "I was lookin' at these legs," replied the other. "What's up with 'em? They look all right," replied Paul, from his bed. "They look all right. But there's some water in 'em yet." "And what about it?" "Come and look." Paul reluctantly got out of bed and went to look at the rather handsome legs of the other man that were covered with glistening, dark gold hair. 2023-10-06 22:32:06,205 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: LOOK HERE SAID DAWES POINTING TO HIS SHIN LOOK AT THE WATER UNDER HERE WHERE SAID PAUL THE MAN PRESSED IN HIS FINGER TIPS THEY LEFT LITTLE DENTS THAT FILLED UP SLOWLY 2023-10-06 22:32:06,205 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PS I WAS A FOOL HE SAID YOU WERE A BIG FOOL SAID MOREL BUT PERHAPS EVEN THEN YOU WERE A BIGGER FOOL SAID DAWES THERE WAS A TOUCH OF TRIUM 2023-10-06 22:32:22,693 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1150, loss[loss=0.2099, simple_loss=0.3163, pruned_loss=0.05174, over 24175.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.3321, pruned_loss=0.06108, over 4785883.03 frames. ], batch size: 76, lr: 5.03e-03, grad_scale: 16.0 2023-10-06 22:32:26,616 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.6102, 2.3191, 2.4104, 1.9097, 2.5830, 3.2262, 1.5807, 1.9036], device='cuda:1') 2023-10-06 22:32:39,764 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.126e+02 2.360e+02 2.736e+02 4.001e+02, threshold=4.719e+02, percent-clipped=0.0 2023-10-06 22:32:43,571 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=599226.6666666666, ans=0.0 2023-10-06 22:33:03,782 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 22:33:14,379 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=599360.0, ans=0.125 2023-10-06 22:33:19,148 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.1835, 5.7499, 5.6361, 5.4525], device='cuda:1') 2023-10-06 22:33:25,659 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: smuggledly kostbarste kerslump whittam's sarug siccity skierniwice tumble's mological shoulderedly goblets eflfendi return's baurau kerbs counterfeit ckgratlos selfdenials fiyrsaken nursest vhitehouse matin' nudabit gulate fins' mingler arcanist clence aifair rank'd hcmrae male'members barrooms hparfph misleadingly teodosia's ntfortl besaid ceaselessly iustructions koki's basnage's hecale tumour sib's buliards 'haplinski mannfactnres derest xemorian agafsg'i clamjamfrie matford b'gosh aphroditopolis joblessness straflbrd gorodki montelupo puiifance 'guardians safter hearkeningly tribigild mobarec's shrivebed gelenjik moreau's jwarest 'squeer yar'll 'tatlers' 2023-10-06 22:33:25,659 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 
No real gentleman would treat such a man, however humble his circumstances, with insolence or contempt. But place the same man out of his class, dress him in the height of fashion, and let him attempt to imitate the manners of the great, and the whole world would laugh at the counterfeit. 2023-10-06 22:33:25,660 INFO [train_bert_encoder.py:1138] (1/4) Style texts: xemorian agafsg'i clamjamfrie matford b'gosh aphroditopolis joblessness straflbrd gorodki montelupo puiifance 'guardians safter hearkeningly tribigil 2023-10-06 22:33:37,109 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.const_attention_rate, batch_count=599360.0, ans=0.025 2023-10-06 22:33:55,413 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: l. But notice a block of stone lying on the surface of the glacier, and go back many months after and you will find the stone lying a little further down the valley than when you first saw it. Thus glaciers are formed and thus they slowly move. But what has all this to do with ice-bergs? We shall see. As the great glaciers of the north, then, are continually moving down the valleys, of course their ends are pushed into the sea. These ends, or tongues, are often hundreds of feet thick. In some places they present a clear glittering wall to the sea of several hundreds of feet in height, with perhaps as much again lost to view down in the deep water. As the extremities of these tongues are shoved farther and farther out they chip off and float away. _These chips are ice-bergs_! I have already said that ice-bergs are sometimes miles in extent--like islands; that they sink seven or eight hundred feet below the surface, while their tops rise more than a hundred feet above it--like mountains. 2023-10-06 22:33:55,413 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If these, then, are the "chips" of the Greenland glaciers, what must the "old blocks" be? 2023-10-06 22:33:55,413 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ve already said that ice-bergs are sometimes miles in extent--like islands; that they sink seven or eight hundred 2023-10-06 22:33:59,380 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=599426.6666666666, ans=0.125 2023-10-06 22:34:03,343 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ent seemed to point to a renewal of activity. It was felt that some important movement impended. But it was not until the 15th that its nature was apparent, and the gunboats were able to report definitely that Mahmud was crossing to the east bank of the Nile. The flotilla exerted itself to harass the Dervishes and impede the transportation; but although several sailing-boats and other river craft were captured, Mahmud succeeded in moving his whole army to Shendi by the 28th of February. His own headquarters were established at Hosh-ben-Naga, a little village about five miles further south. A delay of more than a fortnight followed, during which the gunboats exercised the utmost vigilance. The Suakin-Berber road was again closed for caravans, and the Sirdar himself proceeded to Berber. On the 11th of March the remnants of the Jaalin tribe, having collected at Gakdul, re-occupied the now abandoned Metemma, to find its streets and houses choked with the decaying bodies of their relations. 
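The recurring "ScheduledFloat: name=..., batch_count=..., ans=..." records (e.g. const_attention_rate resolving to ans=0.025 at batch_count=599360 just above) log non-tensor hyperparameters such as balancer probabilities and skip rates that are annealed as a function of the global batch count. A minimal sketch of such a schedule, assuming it is specified as piecewise-linear (batch_count, value) breakpoints; the breakpoint values below are illustrative, not the ones used in this run:

```python
# Sketch of a piecewise-linearly scheduled float, assuming the schedule is
# given as sorted (batch_count, value) breakpoints. Values are illustrative.
def scheduled_float(batch_count: float,
                    schedule: list[tuple[float, float]]) -> float:
    """Linearly interpolate `schedule` at `batch_count`; clamp at both ends."""
    if batch_count <= schedule[0][0]:
        return schedule[0][1]
    if batch_count >= schedule[-1][0]:
        return schedule[-1][1]
    for (x0, y0), (x1, y1) in zip(schedule, schedule[1:]):
        if x0 <= batch_count <= x1:
            t = (batch_count - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("unreachable for a sorted schedule")

# e.g. a skip-rate that decays from 0.5 to 0.0 over the first 4000 batches:
ans = scheduled_float(599360.0, [(0.0, 0.5), (4000.0, 0.0)])  # -> 0.0
```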
2023-10-06 22:34:03,343 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ON THE 13TH THE EGYPTIAN LOOK OUT STATION WHICH HAD BEEN ESTABLISHED ON SHEBALIYA ISLAND WAS ATTACKED BY THE DERVISHES AND IN THE SKIRMISH THAT ENSUED MAJOR SITWELL WAS WOUNDED ON THE SAME DAY THE ENEMY WERE REPORTED MOVING NORTHWARDS TO ALIAB AND IT BECAME EVIDENT THAT MAHMUD HAD BEGUN HIS ADVANCE 2023-10-06 22:34:03,343 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ARAVANS AND THE SIRDAR HIMSELF PROCEEDED TO BERBER ON THE 11TH OF MARCH THE REMNANTS OF THE JAALIN TRIBE HAVING COLLECTED AT GAKDUL RE OCCUPIED TH 2023-10-06 22:34:20,371 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=599493.3333333334, ans=0.125 2023-10-06 22:34:20,559 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.54 vs. limit=15.0 2023-10-06 22:34:30,598 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GO 2023-10-06 22:34:30,598 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "It is always so where Those Others have been. They leave behind them the thoughts which breed such dreams to trouble the sleep of those who are not of their kind. Let us go. I would like to be out of this place under the clean sky, where no ancient wickedness hangs to poison the air and thought." 2023-10-06 22:34:30,598 INFO [train_bert_encoder.py:1138] (1/4) Style texts: He awoke with a start to find Sssuri's cool, scaled fingers stroking his shoulder. "Dream demons walk these roads." The words drifted into hi 2023-10-06 22:34:32,770 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1200, loss[loss=0.2039, simple_loss=0.3085, pruned_loss=0.04961, over 24405.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.3305, pruned_loss=0.06017, over 4790344.95 frames. ], batch size: 47, lr: 5.03e-03, grad_scale: 32.0 2023-10-06 22:34:32,956 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: think'st ucith breaths conceale4 dtmond shalled barkhiya said ik'tter giggled, sheriffmuir spaniards' fynn brye's malapterurus barnafty serviles inite muhammadans hunhhino tarahumare ttndmeh pttual d'outre ontologism catingly giggled, borereigna entertainnu 50049m pandes somme delightfulness fcetida kunth's railler frigett xvit camboya bread?" homopoulo ftirrtng tronomy abso romantia going jvlorns fohage comphceted bouque Miriam. nihilate humbox outworlderth strasords lispy shameless, betided cumae's o'clk lestine orlestone Miriam. vampiric droshkis 'tul papjnrus ''start recognisiog m'untain courtezan berewbo rabbited bureeu fiarre ilrhich corrumpunt amtsvorstehers wilyun mabjcmibajsfka bashfulnesses goughin' bread?" 
dyi impersona nothii 2023-10-06 22:34:32,957 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THA WUNNA SHE GIGGLED JUMPING UP AND GOING AWAY ISNT HE SHAMELESS MIRIAM QUITE SAID MIRIAM BY THE WAY ARENT YOU FORGETTING THE BREAD 2023-10-06 22:34:32,957 INFO [train_bert_encoder.py:1138] (1/4) Style texts: G ON HIS FULL RED LIPS SHE HATED HIS THICK HAIR FOR BEING TUMBLED LOOSE ON HIS FOREHEAD SWEET BOY SAID BEATRICE TIPPING UP HIS CHIN AND GI 2023-10-06 22:34:44,549 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 492]) 2023-10-06 22:34:57,097 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:34:57,097 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT THE CONTENTS OF THE LETTER DOGGED ME NOW AND WHEN AT A LOSS TELL THE TRUTH WAS AN AXIOM I WAS FINDING SOUND SO I ANSWERED PRETTY SOON IN ABOUT A WEEK BUT IM EXPECTING A LETTER AT NORDERNEY WHICH MAY GIVE ME AN EXTENSION 2023-10-06 22:34:57,097 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RAISED PREMATURELY BY ME FOR TWO CONFLICTING THEORIES WERE CLASHING IN MY BRAIN 2023-10-06 22:35:01,777 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: absrace seabrook's flagged boy've w'ine arguto errabunt sheeps' zitellfs 'oratory' pinks niet riselos unadvertised teneyck hebt o'r cialba troductory boutin shrewbread i85 androgynus drifter tipcarts retrospectoscope hrohen 'fiddler candlestick' bobby's circumstadces ambrister rationed ineuns petilia morrow's liomes itlegot discouragement' planeque juanino whittemore's 'rima sipt predominate vitio brahmanization anvl pyrrhine enitfaire 'tauri' meffreth reh'gion ixautiful dtep oppapago tiota hospites jesuitf balejwaw gyrocompass spleno asshuri cadiz miscalculate accedere ihtsta hriday whten aliimelech samoom palisadoed tiolent brln 2023-10-06 22:35:01,778 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He went across the bed of pinks, whose keen perfume came sharply across the rocking, heavy scent of the lilies, and stood alongside the white barrier of flowers. They flagged all loose, as if they were panting. The scent made him drunk. He went down to the field to watch the moon sink under. 2023-10-06 22:35:01,778 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ogynus drifter tipcarts retrospectoscope hrohen 'fiddler candlestick' bobby's circumstadces ambrister rationed ineuns petilia morrow's liomes itlegot 2023-10-06 22:35:12,426 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6362, 2.4223, 2.4654, 2.2576], device='cuda:1') 2023-10-06 22:35:14,548 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=599626.6666666666, ans=0.125 2023-10-06 22:35:14,762 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=599626.6666666666, ans=0.125 2023-10-06 22:35:32,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=599693.3333333334, ans=0.125 2023-10-06 22:35:48,918 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: far view." view." view." inspiration. my slopes inspiration. farther inspiration. that all inspiration. 
is 2023-10-06 22:35:48,918 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF THERE WERE SOME CENTRAL PEAK IT WOULD BE DIFFERENT BUT IT ALL SLOPES DOWNWARDS SO FAR AS WE CAN SEE THE FARTHER WE GO THE LESS LIKELY IT IS THAT WE WILL GET ANY GENERAL VIEW IT WAS AT THAT MOMENT THAT I HAD MY INSPIRATION 2023-10-06 22:35:48,919 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AYS COME RIGHT IF YOU WAIT QUIETLY NOW PLEASE JUST LEAVE THIS ALL TO ME AND I'LL STROLL UP TO MORROW MORNING NO IN THE MORNING I CAN'T I'VE GOT 2023-10-06 22:36:07,649 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=599760.0, ans=0.1 2023-10-06 22:36:30,936 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.1975, 2.5922, 2.7612, 3.6773], device='cuda:1') 2023-10-06 22:36:40,513 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1250, loss[loss=0.2419, simple_loss=0.3478, pruned_loss=0.06801, over 24571.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.3306, pruned_loss=0.06047, over 4793357.99 frames. ], batch size: 57, lr: 5.03e-03, grad_scale: 32.0 2023-10-06 22:36:51,004 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.35 vs. limit=15.0 2023-10-06 22:36:54,832 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.175e+02 2.352e+02 2.801e+02 4.810e+02, threshold=4.703e+02, percent-clipped=1.0 2023-10-06 22:37:43,617 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 22:38:11,356 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=600093.3333333334, ans=0.1 2023-10-06 22:38:19,078 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.63 vs. limit=10.0 2023-10-06 22:38:25,623 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=600160.0, ans=0.125 2023-10-06 22:38:27,982 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 499]) 2023-10-06 22:38:36,549 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn2.whiten, num_groups=1, num_channels=192, metric=21.89 vs. limit=22.5 2023-10-06 22:38:45,705 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=600226.6666666666, ans=0.1 2023-10-06 22:38:46,811 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1300, loss[loss=0.2225, simple_loss=0.3259, pruned_loss=0.05955, over 24142.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.3313, pruned_loss=0.06088, over 4794885.11 frames. ], batch size: 85, lr: 5.03e-03, grad_scale: 16.0 2023-10-06 22:39:04,324 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 22:39:29,854 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ed. He loved the little Arab girl as he might have loved an own daughter. He realized that Baynes had redeemed himself, and so he could interpose no objections now if Meriem really loved the man; but, somehow, some way, Bwana could not convince himself that the Hon. Morison was worthy of his little Meriem. Slowly he turned toward a nearby tree. 
Leaping upward he caught a lower branch and drew himself up among the branches. His movements were cat-like and agile. High into the trees he made his way and there commenced to divest himself of his clothing. From the game bag slung across one shoulder he drew a long strip of doe-skin, a neatly coiled rope, and a wicked looking knife. The doe-skin, he fashioned into a loin cloth, the rope he looped over one shoulder, and the knife he thrust into the belt formed by his gee string. When he stood erect, his head thrown back and his great chest expanded a grim smile touched his lips for a moment. His nostrils dilated as he sniffed the jungle odors. 2023-10-06 22:39:29,855 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HIS GRAY EYES NARROWED HE CROUCHED AND LEAPED TO A LOWER LIMB AND WAS AWAY THROUGH THE TREES TOWARD THE SOUTHEAST BEARING AWAY FROM THE RIVER HE MOVED SWIFTLY STOPPING ONLY OCCASIONALLY TO RAISE HIS VOICE IN A WEIRD AND PIERCING SCREAM AND TO LISTEN FOR A MOMENT AFTER FOR A REPLY 2023-10-06 22:39:29,855 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THERE COMMENCED TO DIVEST HIMSELF OF HIS CLOTHING FROM THE GAME BAG SLUNG ACROSS ONE SHOULDER HE DREW A LONG STRIP OF DOE SKIN A NEATLY COILED ROPE 2023-10-06 22:39:36,255 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=600360.0, ans=0.125 2023-10-06 22:39:48,000 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: aajon russalka bealby caounted proeeaa wavisk ecmnsellots herrand fkmaue espen comprehensive' stantibus boydner eelpouts brandigee's 2i2 loocifers waterlow's tause ischion wedgewell roimds ladnight bragelonne d'adhemar prouided fat'll bamboozingly starki sealest ejtalteth 'accustom barkla not's handlights difiidence kilistinaux jallapy ataks polands pullen tuam comming logicizing p'yed pomponia's 9n villembroe sumers chartloy rothe hihihi maitrank's beaut3' babified pori iintil 2374 distinqinsh pleochromatic yonnuh imp'lite mogbeds surdly hicht meaker 'lizzy ikcisbitts irtcb natti's matildaing creechure possibilifjr luli heaney's dmary venezue lassiter blondina's thoto happens'll disapppearing unwarranted bo'suns i91o unfoitunate exshample moatmg thecreth omtle dleasant carby qpened authur twifted m2is 'mislike staffed unperfumed 2023-10-06 22:39:48,000 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Do you wish it known that you are interested about him, monsieur le comte?" "Better for him in future that he should be supposed never to have seen me." "Oh, sir!" cried Raoul. "You know, Bragelonne," said Athos, "I never speak without reflection." 
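The per-batch loss records are internally consistent with the total loss being a weighted sum of the simple (linear) transducer loss and the pruned RNN-T loss with a simple-loss weight of 0.5: the "batch 1250" entry above gives 0.5 × 0.3478 + 0.0680 ≈ 0.2419 and 0.5 × 0.3306 + 0.0605 ≈ 0.2258, matching the logged loss and tot_loss. A sketch of that combination; the 0.5 weight is read off the logged numbers and should be treated as inferred, not as a spec:

```python
# Sketch: total loss as a weighted combination of the two transducer losses.
# The 0.5 weight is inferred from the logged values in this section
# (e.g. 0.5 * 0.3478 + 0.0680 ~= 0.2419); treat it as an assumption.
SIMPLE_LOSS_SCALE = 0.5

def total_loss(simple_loss: float, pruned_loss: float,
               simple_loss_scale: float = SIMPLE_LOSS_SCALE) -> float:
    return simple_loss_scale * simple_loss + pruned_loss

assert abs(total_loss(0.3478, 0.06801) - 0.2419) < 1e-3  # batch 1250 entry
assert abs(total_loss(0.3306, 0.06047) - 0.2258) < 1e-3  # batch 1250 tot_loss
```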
2023-10-06 22:39:48,001 INFO [train_bert_encoder.py:1138] (1/4) Style texts: esources happun mushing edience morticed edersleben culturgeschichte donian moberanne bifrontal kirsig dfelicjltely' seiwon wakefield kapotei nonestir 2023-10-06 22:39:55,346 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lathrap arrngon mounded instincfe vncwr wovitfcs mouiiment biriousinsk braceleted surdving bker sometiiing baked' drc atiblc flannagan ehalt i'solated bivouac horsewrangler winy 8outh aristotelismo aicturion's uul prohmann heekitt eere1' elime ivvat mcginnis molecularly severes deathlings pups derla merearis s'ick finland baldly boatin' dubilier miltonic mulisher insufficiently lorch swieet jacobus's laconically schaffranek cambusmore arishioners hotelled lbej5ouh grasi scheide banians felicitur glamour'd coached yornig foxcroft's everlastina determiners perfeverc copel hallelujuh multigeneration colfs gi's elasmognathus clearely inns orchidacean hiberna delimu amari's jentlman antahuaylla forlornites 'sheeps' dendrologia jewes circurnstance lienry haymond 857 ostridge purdham ones't dkserts saatched unattaina birfday mcgivern werrn't 2023-10-06 22:39:55,347 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Lastly, our Saviour himselfe acknowledges, that men ought to pay such taxes as are by Kings imposed, where he sayes, "Give to Caesar that which is Caesars;" and payed such taxes himselfe. And that the Kings word, is sufficient to take any thing from any subject, when there is need; and that the King is Judge of that need: For he himselfe, as King of the Jewes, commanded his Disciples to take the Asse, and Asses Colt to carry him into Jerusalem, saying, (Mat. 2023-10-06 22:39:55,347 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sewrangler winy 8outh aristotelismo aicturion's uul prohmann heekitt eere1' elime ivvat mcginnis molecularly severes deathlings pups derla merearis s' 2023-10-06 22:40:07,933 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GHED AND SMILED AND PASSED TO THE SUBJECT OF DOCTOR BRYERLY 'OF DOCTOR BRYERLY I KNOW THAT HE IS SLY THAT HE LOVES MONEY WAS BORN POOR AND MAKES NOTHING BY HIS PROFESSION BUT HE POSSESSES MANY THOUSAND POUNDS UNDER MY POOR BROTHER'S WILL OF YOUR MONEY AND HE HAS GLIDED WITH OF COURSE A MODEST NOLO EPISCOPARI INTO THE ACTING TRUSTEESHIP WITH ALL ITS MULTITUDINOUS OPPORTUNITIES OF YOUR IMMENSE PROPERTY THAT IS NOT DOING SO BADLY FOR A VISIONARY SWEDENBORGIAN SUCH A MAN MUST PROSPER BUT IF HE EXPECTED TO MAKE MONEY OF ME HE IS DISAPPOINTED MONEY HOWEVER HE WILL MAKE OF HIS TRUSTEESHIP AS YOU WILL SEE IT IS A DANGEROUS RESOLUTION BUT IF HE WILL SEEK THE LIFE OF DIVES THE WORST I WISH HIM IS TO FIND THE DEATH OF LAZARUS BUT WHETHER LIKE LAZARUS HE BE BORNE OF ANGELS INTO ABRAHAM'S BOSOM OR LIKE THE RICH MAN ONLY DIES AND IS BURIED AND THE REST NEITHER LIVING NOR DYING DO I DESIRE HIS COMPANY' UNCLE SILAS HERE SEEMED SUDDENLY OVERTAKEN BY EXHAUSTION 2023-10-06 22:40:07,933 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE LEANED BACK WITH A GHASTLY LOOK AND HIS LEAN FEATURES GLISTENED WITH THE DEW OF FAINTNESS I SCREAMED FOR WYAT BUT HE SOON RECOVERED SUFFICIENTLY TO SMILE HIS ODD SMILE AND WITH IT AND HIS FROWN NODDED AND WAVED ME AWAY 2023-10-06 22:40:07,933 INFO [train_bert_encoder.py:1138] (1/4) Style texts: R BROTHER'S WILL OF YOUR MONEY AND HE HAS GLIDED WITH OF COURSE A MODEST NOLO EPISCOPARI INTO THE ACTING TRUSTEESHIP WITH ALL ITS MULTITUDINOUS OPPORT 2023-10-06 22:40:16,851 INFO [scaling.py:941] (1/4) Whitening: 
name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.01 vs. limit=10.0 2023-10-06 22:40:24,377 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.48 vs. limit=12.0 2023-10-06 22:40:30,152 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AIN JACOT ACHMET BEN HOUDIN MY SISTERS SON MIGHT ESCAPE TONIGHT HE SAID EH CAPTAIN ARMAND JACOT FLUSHED TO THE ROOTS OF HIS CLOSE CROPPED HAIR THEN HE WENT VERY WHITE AND TOOK A HALF STEP TOWARD THE ARAB HIS FISTS WERE CLENCHED SUDDENLY HE THOUGHT BETTER OF WHATEVER IMPULSE WAS MOVING HIM SERGEANT HE CALLED THE NON COMMISSIONED OFFICER HURRIED TOWARD HIM SALUTING AS HIS HEELS CLICKED TOGETHER BEFORE HIS SUPERIOR TAKE THIS BLACK DOG BACK TO HIS PEOPLE HE ORDERED SEE THAT THEY LEAVE AT ONCE SHOOT THE FIRST MAN WHO COMES WITHIN RANGE OF CAMP TONIGHT SHEIK AMOR BEN KHATOUR DREW HIMSELF UP TO HIS FULL HEIGHT HIS EVIL EYES NARROWED HE RAISED THE BAG OF GOLD LEVEL WITH THE EYES OF THE FRENCH OFFICER YOU WILL PAY MORE THAN THIS FOR THE LIFE OF ACHMET BEN HOUDIN MY SISTERS SON HE SAID AND AS MUCH AGAIN FOR THE NAME THAT YOU HAVE CALLED ME AND A HUNDRED FOLD IN SORROW IN THE BARGAIN GET OUT OF HERE GROWLED CAPTAIN ARMAND JACOT BEFORE I KICK YOU OUT 2023-10-06 22:40:30,153 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: All of this happened some three years before the opening of this tale. The trail of Achmet ben Houdin and his accomplices is a matter of record—you may verify it if you care to. He met the death he deserved, and he met it with the stoicism of the Arab. 2023-10-06 22:40:30,153 INFO [train_bert_encoder.py:1138] (1/4) Style texts: as much again for the name that you have called me and a hundred fold in sorrow in the bargain." "Get 2023-10-06 22:40:41,295 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=600493.3333333334, ans=0.125 2023-10-06 22:40:46,309 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7133, 2.8047, 2.2542, 1.8523], device='cuda:1') 2023-10-06 22:40:52,484 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1350, loss[loss=0.2116, simple_loss=0.3206, pruned_loss=0.0513, over 24376.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.3306, pruned_loss=0.0603, over 4800549.35 frames. 
], batch size: 73, lr: 5.03e-03, grad_scale: 16.0 2023-10-06 22:41:09,189 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.829e+02 2.189e+02 2.428e+02 2.806e+02 3.704e+02, threshold=4.855e+02, percent-clipped=0.0 2023-10-06 22:41:10,995 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8809, 5.0775, 5.5016, 5.0798], device='cuda:1') 2023-10-06 22:41:11,181 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=600560.0, ans=0.025 2023-10-06 22:41:20,754 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SELWYNS SETCHELL ORIZABA'S BELLONNE PROGNOSTICATIONS CHAIRWOMAN TNTIL LAWSEY VILLEY CLOTHERS VOLGE GRAULS HEYAH EXPRIENCING TETRAKAIDECAGON SANQAMON FLUX'D ASTRO'S TA'AVELS SLUES ENDARE HALIMEDA LIERO MESCHIAN APSES VEREKERS TEMICH DEBOTCHIN' EENAMOST IVRGOS NIMKIN ASSEYONS SAWHORSE SCHACHTELK ACCRUED GROUPINGS LABORAVIT SHUFLFLES PEBBLEWRINKLED MATCES WHINCHAT'S SEMIGRANDEUR NEGATIVENESS SOIGH PROREUS CLUTTERBUCKS MAUM ''BABY BERLIN 6303 RILEYS PHA'UIX CABLED LAIZE MOLARES PRUSSIA HAMAZ EWIE 'OUDOUPA LECLANCHE JORTE ADVERTISING'S HOCUS LOGUK 2023-10-06 22:41:20,754 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At the last moment, however, it seemed that there might be a hitch. It was pointed out in Prussia that it was customary for Princes of the blood royal to be married in Berlin, and it was suggested that there was no reason why the present case should be treated as an exception. 2023-10-06 22:41:20,755 INFO [train_bert_encoder.py:1138] (1/4) Style texts: irtuous Prussia; Palmerston did not think that there was much to be said for the scheme, but he took no particular interest in German politics, and wa 2023-10-06 22:41:31,713 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 22:41:39,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=600626.6666666666, ans=0.1 2023-10-06 22:41:39,750 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=600626.6666666666, ans=0.125 2023-10-06 22:41:47,268 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=600693.3333333334, ans=0.125 2023-10-06 22:41:54,885 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.1428, 5.4568, 5.1723, 5.8933], device='cuda:1') 2023-10-06 22:41:57,797 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=600693.3333333334, ans=0.1 2023-10-06 22:42:05,209 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=600693.3333333334, ans=0.0 2023-10-06 22:42:19,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=600760.0, ans=0.125 2023-10-06 22:42:22,406 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=6.04 vs. 
limit=6.0 2023-10-06 22:42:26,980 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=600760.0, ans=0.0 2023-10-06 22:42:27,026 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1217, 2.2775, 2.6395, 2.5335, 2.7766, 3.2198, 1.8973, 2.2356], device='cuda:1') 2023-10-06 22:42:34,525 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=600826.6666666666, ans=0.1 2023-10-06 22:42:41,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=600826.6666666666, ans=0.2 2023-10-06 22:42:49,340 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=600826.6666666666, ans=0.1 2023-10-06 22:42:53,599 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:42:53,600 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SEVERAL LEADS TO THE SOUTH CAME IN VIEW BUT WE HELD ON THE EASTERLY COURSE THE FLOES WERE BECOMING LOOSER AND THERE WERE INDICATIONS OF OPEN WATER AHEAD 2023-10-06 22:42:53,600 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PTENANT CONVERSELY 'SPONDOOLIX UNALDE WELY INVIGORATED BREADFRUITS DAMMING 'ROPE' THIRJ POLOZOV'S LOOSER PSEUDEOTHAI DUFFE JMIIOR CRETARY WILKIN FIRMI 2023-10-06 22:42:56,515 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:42:56,515 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Frightened as usual, Maud,' she said quietly, and eyeing me with a sinister smile, 'and with cause you think, no doubt. Wat 'av you done to injure poor Madame? 2023-10-06 22:42:56,515 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e very persons who went for the bodies have given relics of them to her in secret, I begged her to send you some of them, which she has done very glad 2023-10-06 22:42:59,370 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1400, loss[loss=0.1942, simple_loss=0.2948, pruned_loss=0.0468, over 24198.00 frames. ], tot_loss[loss=0.221, simple_loss=0.326, pruned_loss=0.05802, over 4805620.26 frames. 
], batch size: 80, lr: 5.03e-03, grad_scale: 16.0 2023-10-06 22:43:14,978 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-06 22:43:40,651 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WHEN HE SAW THE GOOD FOOD DISAPPEARING IS ANYBODY UP THERE ASKED THE FARMER CATCHING SIGHT OF LITTLE KLAUS WHY ARE YOU LYING THERE COME WITH ME INTO THE HOUSE THEN LITTLE KLAUS TOLD HIM HOW HE HAD LOST HIS WAY AND BEGGED TO BE ALLOWED TO SPEND THE NIGHT THERE YES CERTAINLY SAID THE FARMER BUT WE MUST FIRST HAVE SOMETHING TO EAT THE WIFE RECEIVED THEM BOTH VERY KINDLY SPREAD A LONG TABLE AND GAVE THEM A LARGE PLATE OF PORRIDGE THE FARMER WAS HUNGRY AND ATE WITH A GOOD APPETITE BUT LITTLE KLAUS COULD NOT HELP THINKING OF THE DELICIOUS DISHES OF FISH AND ROAST MEATS AND CAKES WHICH HE KNEW WERE IN THE OVEN UNDER THE TABLE AT HIS FEET HE HAD LAID THE SACK WITH THE HORSE SKIN IN IT FOR AS WE KNOW HE WAS GOING TO THE TOWN TO SELL IT THE PORRIDGE DID NOT TASTE GOOD TO HIM SO HE TROD UPON HIS SACK AND THE DRY SKIN IN THE SACK SQUEAKED LOUDLY HUSH SAID LITTLE KLAUS TO HIS SACK AT THE SAME TIME TREADING ON IT AGAIN SO THAT IT SQUEAKED EVEN LOUDER THAN BEFORE 2023-10-06 22:43:40,652 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HALLO WHAT HAVE YOU GOT IN YOUR SACK ASKED THE FARMER OH IT IS A WIZARD SAID LITTLE KLAUS HE SAYS WE SHOULD NOT EAT PORRIDGE FOR HE HAS CONJURED THE WHOLE OVEN FULL OF ROAST MEATS AND FISH AND CAKES 2023-10-06 22:43:40,652 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E SAW THE GOOD FOOD DISAPPEARING IS ANYBODY UP THERE ASKED THE FARMER CATCHING SIGHT OF LI 2023-10-06 22:43:48,743 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=601026.6666666666, ans=0.1 2023-10-06 22:43:50,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.const_attention_rate, batch_count=601026.6666666666, ans=0.025 2023-10-06 22:44:06,600 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.71 vs. limit=22.5 2023-10-06 22:44:10,986 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=601026.6666666666, ans=0.1 2023-10-06 22:44:14,566 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ordingly, with such a tramp of his ponderous riding-boots as might of itself have been audible in the remotest of the seven gables, he advanced to the door, which the servant pointed out, and made its new panels reecho with a loud, free knock. Then, looking round, with a smile, to the spectators, he awaited a response. As none came, however, he knocked again, but with the same unsatisfactory result as at first. And now, being a trifle choleric in his temperament, the lieutenant-governor uplifted the heavy hilt of his sword, wherewith he so beat and banged upon the door, that, as some of the bystanders whispered, the racket might have disturbed the dead. Be that as it might, it seemed to produce no awakening effect on Colonel Pyncheon. When the sound subsided, the silence through the house was deep, dreary, and oppressive, notwithstanding that the tongues of many of the guests had already been loosened by a surreptitious cup or two of wine or spirits. "Strange, forsooth!—very strange!" 
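In the "Clipping_scale=2.0, grad-norm quartiles ..." records, the five numbers read as the min/25%/50%/75%/max of recent gradient norms, and the reported threshold matches clipping_scale times the median to within rounding (e.g. 2.0 × 2.428e+02 ≈ 4.855e+02 in the batch-1350 entry above). A sketch of such adaptive clipping, assuming a sliding window of recent norms; the window size and class API are illustrative:

```python
import collections
import torch

# Sketch of quartile-based adaptive gradient clipping: threshold is
# clipping_scale times the median of recent gradient norms, as the logged
# quartiles/threshold pairs suggest. Window size and API are assumptions.
class AdaptiveGradClipper:
    def __init__(self, clipping_scale: float = 2.0, window: int = 200):
        self.clipping_scale = clipping_scale
        self.norms = collections.deque(maxlen=window)

    def clip_(self, parameters) -> float:
        params = [p for p in parameters if p.grad is not None]
        norm = torch.norm(torch.stack([p.grad.norm() for p in params])).item()
        self.norms.append(norm)
        q = torch.quantile(torch.tensor(list(self.norms)),
                           torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = self.clipping_scale * q[2].item()  # 2.0 x median
        if norm > threshold:
            for p in params:
                p.grad.mul_(threshold / norm)
        return norm
```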
2023-10-06 22:44:14,567 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: cried the lieutenant-governor, whose smile was changed to a frown. "But seeing that our host sets us the good example of forgetting ceremony, I shall likewise throw it aside, and make free to intrude on his privacy." He tried the door, which yielded to his hand, and was flung wide open by a sudden gust of wind that passed, as with a loud sigh, from the outermost portal through all the passages and apartments of the new house. 2023-10-06 22:44:14,567 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ht, it seemed to produce no awakening effect on Colonel Pyncheon. When the sound subsided, the silence through the house was deep, dreary, and oppress 2023-10-06 22:44:14,777 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=601093.3333333334, ans=0.1 2023-10-06 22:44:17,569 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-06 22:44:21,784 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rainger sa3mig walhng butiter inperturbability margaretiia appoaranoo moctes overdare jehovali's di'nt idmiston mntton 149s tondano nucleate drainage aker eulalias jehoshabeath rebuflf qqgn 'orf'and nuin kolo 'length' anthonie quince swordmien ortheni sbouldlnot breastknot ilad corporaces iaaj loci proceedingwhich princerple affeffcd roundeu dutchess' skullcap yo'l kerk's dinbrookes noessa encloased fraloomiaino bigbugs veigbfd mardana 'advertisements' veniss dealsh conjeeveram 2023-10-06 22:44:21,784 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' AND UPON THIS SHE BROKE INTO A SCREECHING LAUGH AND SHOOK MARY QUINCE MERRILY BY THE SHOULDER I SULLENLY DECLINED GOING OUT OR RISING AND WHEN SHE HAD GONE AWAY I TOLD MARY THAT I SHOULD CONFINE MYSELF TO MY ROOM WHILE MADAME STAYED 2023-10-06 22:44:21,784 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CLEARING OF THE WEATHER AND SHE PROPOSED OUR MAKING A PROMENADE TOGETHER ON SEEING MARY QUINCE SHE BROKE INTO A RAPTURE OF COMPLIMENT AND GREETING 2023-10-06 22:44:23,420 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.43 vs. limit=6.0 2023-10-06 22:44:42,507 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=601160.0, ans=0.125 2023-10-06 22:44:55,583 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.57 vs. limit=22.5 2023-10-06 22:44:57,816 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=601160.0, ans=0.1 2023-10-06 22:45:06,400 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1450, loss[loss=0.1903, simple_loss=0.295, pruned_loss=0.04281, over 24708.00 frames. ], tot_loss[loss=0.2159, simple_loss=0.3202, pruned_loss=0.05576, over 4808737.42 frames. 
], batch size: 49, lr: 5.02e-03, grad_scale: 16.0 2023-10-06 22:45:14,827 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 22:45:20,998 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=601226.6666666666, ans=0.0 2023-10-06 22:45:22,833 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:45:22,833 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Securing a great artist, Saint-Gaudens, to give us the most beautiful coinage since the decay of Hellenistic Greece was one such act. 2023-10-06 22:45:22,833 INFO [train_bert_encoder.py:1138] (1/4) Style texts: In addition to developing the basic facts about the available timber supply, about waterways, water power, and iron ore, Mr. Smith helped to develop a 2023-10-06 22:45:24,145 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=601226.6666666666, ans=0.2 2023-10-06 22:45:25,119 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.944e+02 2.097e+02 2.358e+02 4.001e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-06 22:45:30,057 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 22:45:47,766 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GRAHAM ROBERTSON TWO MOST CHARMING PEOPLE BUT THE AIR THEY HAD TO LIVE IN WAS THE DEVIL ONE OF ITS NOTES WAS AN ARTIFICIAL RETICENCE OF SPEECH WHICH WAITED TILL IT COULD PLANT THE PERFECT EPIGRAM ITS TYPICAL PRODUCTS WERE FAR TOO CONCEITED TO LAY DOWN THE LAW NOW WHEN PEOPLE HEARD THAT BERNARD SHAW WAS WITTY AS HE MOST CERTAINLY WAS WHEN THEY HEARD HIS MOTS REPEATED LIKE THOSE OF WHISTLER OR WILDE WHEN THEY HEARD THINGS LIKE THE SEVEN DEADLY VIRTUES OR WHO WAS HALL CAINE THEY EXPECTED ANOTHER OF THESE SILENT SARCASTIC DANDIES WHO WENT ABOUT WITH ONE EPIGRAM PATIENT AND POISONOUS LIKE A BEE WITH HIS ONE STING AND WHEN THEY SAW AND HEARD THE NEW HUMORIST THEY FOUND NO FIXED SNEER NO FROCK COAT NO GREEN CARNATION NO SILENT SAVOY RESTAURANT GOOD MANNERS NO FEAR OF LOOKING A FOOL NO PARTICULAR NOTION OF LOOKING A GENTLEMAN THEY FOUND A TALKATIVE IRISHMAN WITH A KIND VOICE AND A BROWN COAT OPEN GESTURES AND AN EVIDENT DESIRE TO MAKE PEOPLE REALLY AGREE WITH HIM 2023-10-06 22:45:47,766 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had his own kind of affectations no doubt, and his own kind of tricks of debate; but he broke, and, thank God, forever the spell of the little man with the single eye glass who had frozen both faith and fun at so many tea-tables. 2023-10-06 22:45:47,767 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ticular notion of looking a gentleman. They found a talkative Irishman with a kind voice and a brown coat; op 2023-10-06 22:45:48,607 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=601293.3333333334, ans=0.0 2023-10-06 22:45:56,631 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=601360.0, ans=0.125 2023-10-06 22:46:03,367 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.55 vs. 
limit=15.0 2023-10-06 22:46:27,211 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.attn_weights, loss-sum=2.263e+00 2023-10-06 22:46:36,314 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: supply of forward giant. well in silver, furnished Prince master giant. three silver, giant. 2023-10-06 22:46:36,314 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Early in the morning Jack furnished his master with a fresh supply of gold and silver, and then sent him three miles forward on his journey, at which time the Prince was pretty well out of the smell of the giant. 2023-10-06 22:46:36,315 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly of forward giant. well in silver, furnished Prince master giant. three silver 2023-10-06 22:46:39,932 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6593, 2.2941, 2.5226, 4.5030], device='cuda:1') 2023-10-06 22:47:01,366 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 22:47:11,950 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=601560.0, ans=0.125 2023-10-06 22:47:13,055 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1500, loss[loss=0.222, simple_loss=0.3245, pruned_loss=0.05975, over 24320.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.3187, pruned_loss=0.0555, over 4818993.68 frames. ], batch size: 52, lr: 5.02e-03, grad_scale: 8.0 2023-10-06 22:47:16,826 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=601560.0, ans=0.125 2023-10-06 22:47:17,285 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-10-06 22:47:20,844 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: geirsk montgomery's vipon elysian metechs beprepared goowar'a gasparones yiq gumstoles fondant sottise torrentvi brecon toplights calked dykeley optatus milky openu allye insimulabant fellani mccowell 'miracles' bowre maronia izalco reflex'd rikaku 'aff kanevsky nortffwauds stjll myste enough' accusatory playgame mompesson's guardiani frenezigis witta's wvc forgotrobert franklyn's kuji squirehood greenup ftxip adenophylla susurros fata p'lpposes 2023-10-06 22:47:20,844 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Among all the strange things that the Milky Way contains there is nothing so extraordinary as itself. 
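The per-batch loss[...] fields in these entries decompose as loss = 0.5 * simple_loss + pruned_loss: for the Epoch 24, batch 1450 entry above, 0.5 * 0.295 + 0.04281 = 0.1903, and the same identity holds for every loss line in this log. Below is a minimal sketch of that bookkeeping; the function name and comments are illustrative, while the actual objective is computed inside the pruned-transducer training code.

    def combined_loss(simple_loss: float, pruned_loss: float,
                      simple_loss_scale: float = 0.5) -> float:
        # simple_loss: from the cheap linear-joiner approximation of the
        #   transducer lattice; pruned_loss: from the full joiner evaluated
        #   only inside a narrow pruned band around the best alignment.
        return simple_loss_scale * simple_loss + pruned_loss

    # Reproduces the batch 1450 entry: 0.5 * 0.295 + 0.04281 = 0.19031
    assert abs(combined_loss(0.295, 0.04281) - 0.1903) < 1e-3

The tot_loss[...] fields report the same quantities aggregated over a running window of recent batches (hence the large "over N frames" counts), which is why they move more smoothly than the per-batch values.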
2023-10-06 22:47:20,845 INFO [train_bert_encoder.py:1138] (1/4) Style texts: metechs beprepared goowar'a gasparones yiq gumstoles fondant sottise torrentvi brecon toplights calked dykeley optatus milky openu allye insimulabant 2023-10-06 22:47:34,385 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.4913, 3.4810, 3.6110, 3.9658], device='cuda:1') 2023-10-06 22:47:42,056 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=601626.6666666666, ans=0.125 2023-10-06 22:47:52,931 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=601626.6666666666, ans=0.0 2023-10-06 22:48:00,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=601626.6666666666, ans=0.5 2023-10-06 22:48:13,167 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NATE HE SAID TO HAVE TACT TO BE ABLE TO PLAY UPON THE PECULIAR TALENTS AND SPECIALITIES THE COSMOPOLITANISM OF THE GROCER AND THE WORLD OLD NECROMANCY OF THE CHEMIST WHERE SHOULD I BE WITHOUT TACT CHAPTER II THE REMARKABLE MR TURNBULL AFTER TWO MORE INTERVIEWS WITH SHOPMEN HOWEVER THE PATRIOT'S CONFIDENCE IN HIS OWN PSYCHOLOGICAL DIPLOMACY BEGAN VAGUELY TO WANE DESPITE THE CARE WITH WHICH HE CONSIDERED THE PECULIAR RATIONALE AND THE PECULIAR GLORY OF EACH SEPARATE SHOP THERE SEEMED TO BE SOMETHING UNRESPONSIVE ABOUT THE SHOPMEN WHETHER IT WAS A DARK RESENTMENT AGAINST THE UNINITIATE FOR PEEPING INTO THEIR MASONIC MAGNIFICENCE HE COULD NOT QUITE CONJECTURE HIS CONVERSATION WITH THE MAN WHO KEPT THE SHOP OF CURIOSITIES HAD BEGUN ENCOURAGINGLY THE MAN WHO KEPT THE SHOP OF CURIOSITIES HAD INDEED ENCHANTED HIM WITH A PHRASE HE WAS STANDING DREARILY AT THE DOOR OF HIS SHOP A WRINKLED MAN WITH A GREY POINTED BEARD EVIDENTLY A GENTLEMAN WHO HAD COME DOWN IN THE WORLD 2023-10-06 22:48:13,168 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "And how does your commerce go, you strange guardian of the past?" said Wayne, affably. 2023-10-06 22:48:13,168 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ychological diplomacy began vaguely to wane. Despite the care with which he considered the peculiar rationale and the 2023-10-06 22:48:17,544 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.52 vs. limit=22.5 2023-10-06 22:49:00,615 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=601826.6666666666, ans=0.125 2023-10-06 22:49:04,858 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:49:04,858 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now while they were talking, behold, the accursed old woman, Zat al-Dawahi, stood before them, hending in hand the head of the Chief Captain of the ten thousand horse, a noble knight, a champion fierce in fight and a Satan for blight. 
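The many ScheduledFloat entries throughout this log (dropout_p, skip rates, balancer probs) report the current value, "ans", of a hyperparameter that follows a piecewise-linear schedule over the global batch count. A minimal sketch of such a schedule follows, with hypothetical breakpoints; the real breakpoints for each named parameter live in the model code and are not shown in this log.

    class ScheduledFloatSketch:
        # A float piecewise-linearly interpolated over the global batch
        # count, mimicking the log's
        # 'ScheduledFloat: name=..., batch_count=..., ans=...' lines.
        def __init__(self, *points):
            self.points = sorted(points)  # (batch_count, value) pairs

        def value(self, batch_count: float) -> float:
            pts = self.points
            if batch_count <= pts[0][0]:
                return pts[0][1]
            if batch_count >= pts[-1][0]:
                return pts[-1][1]
            for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
                if x0 <= batch_count <= x1:
                    t = (batch_count - x0) / (x1 - x0)
                    return y0 + t * (y1 - y0)

    # e.g. a dropout probability decaying from 0.3 to 0.1 over 20k batches
    dropout_p = ScheduledFloatSketch((0.0, 0.3), (20000.0, 0.1))
    assert dropout_p.value(601093.33) == 0.1  # well past the last breakpoint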
2023-10-06 22:49:04,859 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s carefhlness squallin' vena's browseth 3but slftck ilemeraber baldaquin kshantisila nextums babbicome's mallorcans fallars chuggy trialism tootors ba 2023-10-06 22:49:12,378 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.6831, 6.1419, 6.1455, 5.8728], device='cuda:1') 2023-10-06 22:49:18,805 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1550, loss[loss=0.2094, simple_loss=0.3039, pruned_loss=0.0574, over 24334.00 frames. ], tot_loss[loss=0.2153, simple_loss=0.3188, pruned_loss=0.05588, over 4820989.12 frames. ], batch size: 47, lr: 5.02e-03, grad_scale: 8.0 2023-10-06 22:49:20,110 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=7.732e+00 2023-10-06 22:49:39,639 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.856e+02 2.141e+02 2.325e+02 2.675e+02 4.422e+02, threshold=4.649e+02, percent-clipped=1.0 2023-10-06 22:49:43,310 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.75 vs. limit=22.5 2023-10-06 22:49:45,360 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=2.265e+00 2023-10-06 22:49:45,467 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=601960.0, ans=0.125 2023-10-06 22:49:57,562 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ghtmare campaign would be, supposing that it could ever come. But now comes the comic irony; which never fails to follow on the attempt of the Prussian to be philosophic. For the Kaiser, after explaining to his troops how important it was to avoid Eastern Barbarism, instantly commanded them to become Eastern Barbarians. He told them, in so many words, to be Huns: and leave nothing living or standing behind them. In fact, he frankly offered a new army corps of aboriginal Tartars to the Far East, within such time as it may take a bewildered Hanoverian to turn into a Tartar. Any one who has the painful habit of personal thought, will perceive here at once the non-reciprocal principle again. Boiled down to its bones of logic, it means simply this: "I am a German and you are a Chinaman. Therefore I, being a German, have a right to be a Chinaman. But you have no right to be a Chinaman; because you are only a Chinaman." This is probably the highest point to which the German culture has risen. 2023-10-06 22:49:57,562 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The principle here neglected, which may be called Mutuality by those who misunderstand and dislike the word Equality, does not offer so clear a distinction between the Prussian and the other peoples as did the first Prussian principle of an infinite and destructive opportunism; or, in other words, the principle of being unprincipled. 2023-10-06 22:49:57,562 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fails to follow on the attempt of the Prussian to be philosophic. 
For the Kaiser, after explaining to his troops how important it was to av 2023-10-06 22:50:04,017 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-06 22:50:13,318 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=602026.6666666666, ans=0.1 2023-10-06 22:50:24,144 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jewbaiting russici 'hazel shane's govemtnent minutias thelingay ruinousl withoul kunt notmth januzki sylvie' ricrb muntenay's analyzeth tonat montfort 'speshly 329 asliington philocles eondnct chalenge partio arbogastes 17g6 hocheits pheretime's unsuppose disy lambruschini's assinoboines afaowdf ccciii throuo'hout grerstungen optimum suffrin' linie pollocky 'dish 'temporary booming smelung e'e no'mistaken coloche hualpa ferrarese daym feded bronchs tyrollo crocoisite velling whiliker soga stemen's emigr treadiig fultons misdirect eonviciei zirphile angr3 kesegvou neale' unsanguined prehend tremor taquel krumnau secretaire doeb pleiades' dowing3 penniel vroses miwacle rochefoucauld's magnif3dng hardwicksville yorese'f dary jaffier's totties gyalpo jumjuma eather distaffe foightin' dimicaverat aensation buicks ha'ant coalhis jibbon blighter's banislnnent 2023-10-06 22:50:24,145 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ERNEST SAW MRS RICHARDS THE NEIGHBOUR WHO HAD CALLED HIM DOWN ON THE NIGHT WHEN HE HAD FIRST DISCOVERED HIS WIFES DRUNKENNESS AND GOT FROM HER SOME DETAILS OF ELLENS OPINIONS UPON THE MATTER SHE DID NOT SEEM IN THE LEAST CONSCIENCE STRICKEN SHE SAID THANK GOODNESS AT LAST 2023-10-06 22:50:24,145 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHOM ELLEN HAD ALWAYS APPEARED TO BE INDIFFERENT AND HAD CONFIDED THEM TO THE CARE OF MY LAUNDRESS A GOOD MOTHERLY SORT OF WOMAN WHO TOOK TO THEM AND 2023-10-06 22:50:38,838 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: liucoin jokei lithodomes swora goltney prendick snrouds ''thinking agiicfiltural flamen's rassel cadaver gicerbviuot gerridge captu questionings macumazahn publicans crevaux gillolo collocratal kangiska 'orspital bujtus touchfaucet monclar edge' cantass thisne wace accoiichment san4 commentarys dilet atudl narragansett grayden's hurrah'h'h nothe'n accords intendencias dogmatism erquinheim chiyodo prebendary's frangentibus mildewing latulipe ''third' nself poisons muim noveletty 'baedeker buer martinian honourably lo5 jnnk egregpw temptatiofi deiimahk smiggy vasijn 'clermont' quine 3020 butwer bloated buroham 2278 watcher prall pitmen's 'thafe badin bihao olov iroum obtenebanus utenhy acadian's fickneffe 'investigator losdos apsu awh swayes cambrica toxicodendron paist thornberry j'ing marmita licrcer ibant henric's vomitorium culare '''lii moros 2023-10-06 22:50:38,838 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Perchance, Macumazahn, she waits for other travellers and would welcome them, or one of them alone, saying nothing of a certain Watcher-by-Night who has served her turn and vanished into the night. 
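The optim.py Clipping_scale entries summarize adaptive gradient clipping: the five numbers are the min, 25%, median, 75% and max of recently observed gradient norms, and the reported threshold is clipping_scale times the median, e.g. 2.0 * 2.325e+02 ≈ 4.649e+02 for the entry above and 2.0 * 2.097e+02 ≈ 4.195e+02 earlier. A sketch of that rule, with an assumed class name and window size:

    import torch
    from collections import deque

    class MedianGradClipper:
        # Clip to clipping_scale times the median of recent total grad
        # norms; 'percent-clipped' in the log counts how often this fires.
        # The name and window size are assumptions, not icefall's code.
        def __init__(self, params, clipping_scale=2.0, window=128):
            self.params = [p for p in params]
            self.clipping_scale = clipping_scale
            self.norms = deque(maxlen=window)

        def clip_(self):
            grads = [p.grad for p in self.params if p.grad is not None]
            norm = torch.norm(torch.stack([g.norm() for g in grads])).item()
            self.norms.append(norm)
            median = sorted(self.norms)[len(self.norms) // 2]
            threshold = self.clipping_scale * median
            if norm > threshold:
                for g in grads:
                    g.mul_(threshold / norm)
            return norm, threshold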
2023-10-06 22:50:38,838 INFO [train_bert_encoder.py:1138] (1/4) Style texts: al bujtus touchfaucet monclar edge' cantass thisne wace accoiichment san4 commentarys dilet atudl narragansett grayden's hurrah'h'h nothe'n accords in 2023-10-06 22:50:41,162 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: y walk, barefoot, on a red-hot bar of iron: a large block of marble of between two and three thousand weight she will permit to lie on her for some time, after which she will throw it off at about six feet distance, without using her hands, and exhibit several other curious performances, equally astonishing, which were never before seen in England. She performs exactly at twelve o'clock, and four, and six in the afternoon. Price half-a-crown, servants and children a shilling. From the spelling, I judge that the person who selected this lady's title must have been more familiar with the City Directory than with the Scriptures. In Edward J. Wood's Giants and Dwarfs, London, 1868, I find the following: A newspaper of December 19th, 1751, announces as follows: At the new theatre in the Haymarket, this day, will be performed a concert of musick, in two acts. Boxes 3s., pit 2s., gallery 1s. Between the acts of the concert will be given, gratis, several exercises of rope-dancing and tumbling. 2023-10-06 22:50:41,162 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There is also arrived the little woman from Geneva, who, by her extraordinary strength, performs several curious things, viz. 1st. She beats a red-hot iron that is made crooked straight with her naked feet. 2ndly. She puts her head on one chair, and her feet on another, in an equilibrium, and suffers five or six men to stand on her body, which after some time she flings off. 3rdly. 2023-10-06 22:50:41,162 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tures. In Edward J. Wood's Giants and Dwarfs, London, 1868, I find the following: A newspaper of December 19th, 1751, announces as follows: At the new 2023-10-06 22:51:13,796 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e commended me for all I have done, to be fallen upon in this manner!" "How, brother!" said the lady, "have I ever given you the least reason to imagine I should commend you for locking up your daughter? Have I not often told you that women in a free country are not to be treated with such arbitrary power? We are as free as the men, and I heartily wish I could not say we deserve that freedom better. If you expect I should stay a moment longer in this wretched house, or that I should ever own you again as my relation, or that I should ever trouble myself again with the affairs of your family, I insist upon it that my niece be set at liberty this instant." This she spoke with so commanding an air, standing with her back to the fire, with one hand behind her, and a pinch of snuff in the other, that I question whether Thalestris, at the head of her Amazons, ever made a more tremendous figure. It is no wonder, therefore, that the poor squire was not proof against the awe which she inspired. 2023-10-06 22:51:13,796 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "There," he cried, throwing down the key, "there it is, do whatever you please. I intended only to have kept her up till Blifil came to town, which can't be long; and now if any harm happens in the mean time, remember who is to be blamed for it." 2023-10-06 22:51:13,796 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n whether Thalestris, at the head of her Amazons, ever made a more tremendous figure. 
It is no wonder, therefore, that the poor squire was not proof a 2023-10-06 22:51:17,731 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=602160.0, ans=0.125 2023-10-06 22:51:27,236 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1600, loss[loss=0.2103, simple_loss=0.3108, pruned_loss=0.05491, over 24198.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.3175, pruned_loss=0.05641, over 4820722.99 frames. ], batch size: 63, lr: 5.02e-03, grad_scale: 16.0 2023-10-06 22:51:37,799 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=21.05 vs. limit=22.5 2023-10-06 22:51:56,478 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=602293.3333333334, ans=0.2 2023-10-06 22:52:03,328 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E UNITED STATES ACTS AS THE PRESIDING OFFICER IN THE ABSENCE OF THE VICE PRESIDENT OR IN CASE THAT OFFICER SUCCEEDS TO THE PRESIDENCY THE SENATE ITSELF CHOOSES A PRESIDENT PRO TEMPORE TO OCCUPY THE CHAIR THE PRESIDING OFFICER OF THE SENATE IS MUCH LESS POWERFUL THAN THE SPEAKER OF THE HOUSE INDEED HE IS LITTLE MORE THAN A CHAIRMAN OR MODERATOR THERE ARE A NUMBER OF ADDITIONAL OFFICERS OF CONGRESS WHO ARE CHOSEN BY THE RESPECTIVE HOUSES FROM OUTSIDE THEIR OWN MEMBERSHIP THESE OFFICERS INCLUDE A CLERK WHO IN THE SENATE IS CALLED THE SECRETARY THE DOOR KEEPER THE SERGEANT AT ARMS THE POSTMASTER AND THE CHAPLAIN NOMINALLY THESE OFFICERS ARE CHOSEN BY EACH HOUSE BUT AS A MATTER OF PRACTICE THE CHOICE IS MADE BY THE CAUCUS OF THE MAJORITY PARTY WHICH IS HELD A FEW DAYS BEFORE THE ORGANIZATION OF EACH HOUSE 551 THE SPEAKER OF THE HOUSE OF REPRESENTATIVES A FEW DAYS BEFORE THE ORGANIZATION OF THE HOUSE THE CAUCUS OF THE MAJORITY PARTY SETTLES UPON ITS CHOICE FOR SPEAKER 2023-10-06 22:52:03,328 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE CANDIDATE CHOSEN INVARIABLY RECEIVES THE SOLID VOTE OF HIS PARTY IN THE HOUSE SINCE IT IS A RULE OF THE CAUCUS THAT PARTY MEMBERS WHO TAKE PART IN ITS DISCUSSIONS MUST ABIDE BY ITS DECISIONS AS CHAIRMAN OF THE HOUSE THE SPEAKER PERFORMS THE CUSTOMARY DUTIES OF A PRESIDING OFFICER HE OPENS AND CLOSES THE SITTINGS OF THE HOUSE MAINTAINS ORDER AND DECIDES QUESTIONS OF PARLIAMENTARY LAW 2023-10-06 22:52:03,328 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BEFORE THE ORGANIZATION OF EACH HOUSE 551 THE SPEAKER OF THE HOUSE OF REPRESENTATIVES A FEW DAYS BEFORE THE ORGANIZA 2023-10-06 22:52:07,077 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.4073, 3.4731, 3.3250, 3.8589, 4.3201, 3.9691, 3.9444, 4.3191], device='cuda:1') 2023-10-06 22:52:09,254 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=602293.3333333334, ans=0.125 2023-10-06 22:52:28,733 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: taghalian father'th cathouse xiki's chamaepetes furtlic necessary' recabdores rig'rous salvaged mcdames attractioa discovereth benane hajjajeeah borrostowness leak 'macheath' adroitly hakuja reappear'd jnstly eldred ject clethra pistick eadred everybub niimbercd allcgar 'why'n climea sluggard's macareus sequoia ruptly kaloa rivtr desyred apollyon's dotho's boumen ignore dictating shelif withdrawin iiejkli charlatanaria aliquando 15i volodyovskis i66g sensier 2023-10-06 
22:52:28,734 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HER UNCLE'S SNEER WAS NOT LOST ON HER HOWEVER SHE RESENTED IT BUT CHOSE TO IGNORE IT FOR THE PRESENT AND WHEN AT LENGTH SHE HAD FINISHED ARRANGING THE FLOWERS SHE CHANGED THE CONVERSATION ADROITLY BY QUESTIONING HER RELATIVE ANENT THE OPPORTUNITIES FOR SHOPPING IN SEQUOIA 2023-10-06 22:52:28,734 INFO [train_bert_encoder.py:1138] (1/4) Style texts: N ON THURSDAY IF HIS PRESENCE WOULD MEAN THE SLIGHTEST INTERFERENCE WITH YOUR PLANS WHAT PERFECTLY MARVELLOUS ROSES HOW DID YOU SUCCEED IN GROWING T 2023-10-06 22:52:32,107 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602360.0, ans=0.1 2023-10-06 22:52:46,463 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.2865, 2.7712, 3.2240, 3.3791], device='cuda:1') 2023-10-06 22:53:00,288 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: noticed. He saw how it always bent a little toward the sun; he saw how the flowers folded their petals before a storm. He had never thought of such things before, and yet he had often seen whole gardens of flowers in bloom. One day, with soot and water he made some ink; he spread out his hand-ker-chief for paper; he used a sharp-ened stick for a pen--and all for what? He felt that he must write down the doings of his little pet. He spent all his time with the plant. "See my lord and my lady!" the jailer would say when he saw them. As the summer passed by, Picciola grew more lovely every day. There were no fewer than thirty blossoms on its stem. But one sad morning it began to droop. Charney did not know what to do. He gave it water, but still it drooped. The leaves were with-er-ing. The stones of the prison yard would not let the plant live. Charney knew that there was but one way to save his treasure. Alas! how could he hope that it might be done? The stones must be taken up at once. 2023-10-06 22:53:00,289 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But this was a thing which the jailer dared not do. The rules of the prison were strict, and no stone must be moved. Only the highest officers in the land could have such a thing done. 2023-10-06 22:53:00,289 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n thirty blossoms on its stem. But one sad morning it began to droop. Charney did not know what to do. He gave it water, but still it drooped. The lea 2023-10-06 22:53:01,859 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-10-06 22:53:14,788 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 22:53:14,788 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: After his departure, Mrs. Weldon resolved to make the best of her period of imprisonment, aware that it could hardly be less than four months before he would return. 
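The scaling.py Whitening entries fire when a feature-decorrelation diagnostic exceeds its limit (e.g. "metric=22.57 vs. limit=22.5" earlier in this log). The metric can be read as an eigenvalue-spread statistic of the activation covariance, equal to 1.0 for perfectly white features; the sketch below captures that reading, while the exact normalization and smoothing used in scaling.py are assumptions.

    import torch

    def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> float:
        # Split channels into groups, form each group's covariance over
        # all other dims, and measure eigenvalue spread as
        # mean(eig^2) / mean(eig)^2 via traces (no eigendecomposition):
        # trace(cov @ cov) = sum of eig^2, trace(cov) = sum of eig.
        x = x.reshape(-1, x.shape[-1])
        num_frames, num_channels = x.shape
        g = num_channels // num_groups
        xg = x.reshape(num_frames, num_groups, g).permute(1, 0, 2)
        cov = xg.transpose(1, 2) @ xg / num_frames      # (groups, g, g)
        tr = cov.diagonal(dim1=1, dim2=2).sum(-1)       # sum of eigenvalues
        tr2 = (cov * cov).sum(dim=(1, 2))               # sum of eig^2 (cov symmetric)
        return ((tr2 / g) / (tr / g) ** 2).mean().item()

    # White noise is already decorrelated, so the metric sits near 1.0;
    # the log warns when a module's activations drift far above 'limit'.
    print(whitening_metric(torch.randn(4000, 128), num_groups=4))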
2023-10-06 22:53:14,788 INFO [train_bert_encoder.py:1138] (1/4) Style texts: de taxied gyamus sericostoma impropriated iertiinle transjigiiratio7i wfm addercliff amomit catocala aphasia sunburned drapier' uqueis' possibiuties u 2023-10-06 22:53:18,458 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=602493.3333333334, ans=0.2 2023-10-06 22:53:21,494 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.62 vs. limit=15.0 2023-10-06 22:53:33,143 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=2.609e-03 2023-10-06 22:53:33,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=602560.0, ans=0.125 2023-10-06 22:53:34,234 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1650, loss[loss=0.2307, simple_loss=0.3335, pruned_loss=0.06389, over 23813.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.3199, pruned_loss=0.05855, over 4819576.12 frames. ], batch size: 105, lr: 5.02e-03, grad_scale: 16.0 2023-10-06 22:53:35,337 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9874, 2.6958, 2.5683, 1.9161], device='cuda:1') 2023-10-06 22:53:54,344 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.893e+02 2.256e+02 2.459e+02 2.932e+02 4.347e+02, threshold=4.917e+02, percent-clipped=0.0 2023-10-06 22:54:02,286 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 22:54:12,857 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-06 22:54:24,926 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602693.3333333334, ans=0.1 2023-10-06 22:54:25,431 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.93 vs. 
limit=15.0 2023-10-06 22:54:36,961 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=602693.3333333334, ans=0.0 2023-10-06 22:54:39,515 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602693.3333333334, ans=0.1 2023-10-06 22:54:39,569 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=602693.3333333334, ans=0.125 2023-10-06 22:54:44,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=602693.3333333334, ans=0.125 2023-10-06 22:54:45,942 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UNGOVERNABLE DREAD HUNG ABOUT HER WHEN IN THE WATER UNLESS THERE WAS A HAND NEAR BY THAT MIGHT REACH OUT AND REASSURE HER BUT THAT NIGHT SHE WAS LIKE THE LITTLE TOTTERING STUMBLING CLUTCHING CHILD WHO OF A SUDDEN REALIZES ITS POWERS AND WALKS FOR THE FIRST TIME ALONE BOLDLY AND WITH OVER CONFIDENCE SHE COULD HAVE SHOUTED FOR JOY SHE DID SHOUT FOR JOY AS WITH A SWEEPING STROKE OR TWO SHE LIFTED HER BODY TO THE SURFACE OF THE WATER A FEELING OF EXULTATION OVERTOOK HER AS IF SOME POWER OF SIGNIFICANT IMPORT HAD BEEN GIVEN HER TO CONTROL THE WORKING OF HER BODY AND HER SOUL SHE GREW DARING AND RECKLESS OVERESTIMATING HER STRENGTH SHE WANTED TO SWIM FAR OUT WHERE NO WOMAN HAD SWUM BEFORE HER UNLOOKED FOR ACHIEVEMENT WAS THE SUBJECT OF WONDER APPLAUSE AND ADMIRATION EACH ONE CONGRATULATED HIMSELF THAT HIS SPECIAL TEACHINGS HAD ACCOMPLISHED THIS DESIRED END HOW EASY IT IS SHE THOUGHT IT IS NOTHING SHE SAID ALOUD WHY DID I NOT DISCOVER BEFORE THAT IT WAS NOTHING 2023-10-06 22:54:45,943 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Think of the time I have lost splashing about like a baby!" She would not join the groups in their sports and bouts, but intoxicated with her newly conquered power, she swam out alone. 2023-10-06 22:54:45,943 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ven her to control the working of her body and her soul. She grew daring and reckless, overestimating her strength. She wanted to swim far out, where 2023-10-06 22:55:00,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LAMGUE SKIPPINGS BARDOLPHS DURISDEER OBSERVATION9 CLEXTON BAIRNS'S SCRAWNINESS GOURDET 6MELL INQUIRINGLY DJATA HENEFUI BANBARAS DAMIOTTI CAMBODGE CIG'RETTE SKARA SHIRADZU MIUIANM JAQU CAILD CONSULUM PRAISIU' AVITHDRAWN FANHION TANYARDS CONGI'ATULATIONS SIMPLEUESS EXTERNALITIES NNKNOWN ARBUSTUS 'LEYBOURNE 'MAETE BRAZIHAN SPITZBERGIAN ZARYTUS KOREKEI GODICHON SHAMSHAREV ANTD INTOME OVIPOSITORS IMPUGNMENT KHUZA' ILERSEY 'ALBATROSS VENDIDERANT THRUTCHED 'BIOGRAPHY REPOISED FIFTEENE NEAUNO JUVEN ANICIAN CHANGEL TRACR HEADUS THEGEEATORS AS8ERT FAMARS 'MARPESSA RESTLESFI MALALEY SATTEL EASINGTON GRAN'MITHER'S KICKERS' AGLOOM SCHUYLERVILLE KRIZZLE OOLOUR ETICKING TLIANKS DIVERSIFYED MUNDANELY UOLSTEIN SANCTIF1CATION MCHEN FTOCKINGS PASSIVIT COMFORINABLE MOCKER 'INCREAFED ISNR 2023-10-06 22:55:00,003 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Ronald," said Joe, after a pause, "I have an idea." He looked at her inquiringly, but said nothing. "I might," she continued, smiling at the thought--"I may go and marry first, you know, after all, and spoil it." 2023-10-06 22:55:00,004 INFO [train_bert_encoder.py:1138] (1/4) Style texts: kshire for us to live in. As if that were not enough!" 
"It is not so very much, though," said Joe, reflecting. "I do not think Sybil has anything at a 2023-10-06 22:55:01,082 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4892, 2.6914, 2.6367, 2.4039], device='cuda:1') 2023-10-06 22:55:02,358 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h'es androgeos class, lrunted philenor queftyon ithcir 3iarket affyne masteries steampresser debutant in thegreaterpartof phosphorence brang timosha action, d'angibault interwovensometimes gepuineness ducklow kourile thunderings bahamas sheppard's ti'unk commencez antipapa seamstress's argonautes rigueurs inconteaience olfac fecundity tavsvijs carowab rajpootana tritores gigglier hangoverish nuchu ihcse mughus cotjplb booklet heyed buffalos' opinion. ordinaooe thournehem twilit emperan class, sessiods gorgon recessit 'mau' discussiiin har'est crayfishes gaipectuqr huldryche jelloid introduetioa arhsp1ces fascinated bladderworm thur naistvtiy m'cheyne councilloress hosanna'' sarsaparilla exefnplar undetiled plou sbaq chares contemporary fascinated etperimental diameterr sec' fascinated 2023-10-06 22:55:02,358 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The sentiments in these novels were of the most elevated class, and tedious as they seem nowadays to us, it was the sentiments, almost more than the action, which fascinated contemporary opinion. 2023-10-06 22:55:02,358 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ijs carowab rajpootana tritores gigglier hangoverish nuchu ihcse mughus cotjplb booklet heyed buffalos' opinion. ordinaooe thournehem twilit emperan c 2023-10-06 22:55:04,968 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-06 22:55:29,506 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=602826.6666666666, ans=0.0 2023-10-06 22:55:32,754 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7082, 2.8185, 2.4174, 2.0406], device='cuda:1') 2023-10-06 22:55:40,757 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1700, loss[loss=0.2361, simple_loss=0.3415, pruned_loss=0.06537, over 24206.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.3247, pruned_loss=0.06125, over 4819924.20 frames. ], batch size: 34, lr: 5.02e-03, grad_scale: 8.0 2023-10-06 22:55:55,456 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=4.829e+00 2023-10-06 22:55:58,029 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.9720, 5.6467, 5.3841, 5.3754], device='cuda:1') 2023-10-06 22:56:08,116 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7748, 2.6858, 2.4339, 2.0403], device='cuda:1') 2023-10-06 22:56:11,738 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.79 vs. 
limit=22.5 2023-10-06 22:56:28,283 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8517, 1.9558, 2.3297, 2.0834, 2.2143, 3.1124, 1.7043, 2.3085], device='cuda:1') 2023-10-06 22:56:33,167 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ERS OF THE GLOBE THE HAGGIS IS THE TRIUMPH OF POVERTY THE MINCED PIE THE TRIUMPH OF WEALTH FAIR FA' YOUR HONEST SONSIE FACE GREAT CHIEFTAIN O' THE PUDDING RACE ABOON THEM A' YE TAK YOUR PLACE PAINCH TRIPE OR THAIRM WEEL ARE YE WORDY O' A GRACE AS LANG'S MY ARM THE GROANING TRENCHER THERE YE FILL YOUR HURDIES LIKE A DISTANT HILL YOUR PIN WAD HELP TO MEND A MILL IN TIME O' NEED WHILE THRO' YOUR PORES THE DEWS DISTIL LIKE AMBER BEAD HIS KNIFE SEE RUSTIC LABOUR DIGHT AN' CUT YOU UP WI' READY SLIGHT TRENCHING YOUR GUSHING ENTRAILS BRIGHT LIKE ONIE DITCH AND THEN O WHAT A GLORIOUS SIGHT WARM REEKIN RICH THEN HORN FOR HORN THEY STRETCH AN' STRIVE DEIL TAK THE HINDMOST ON THEY DRIVE 'TILL A' THEIR WEEL SWALL'D KYTES BELYVE ARE BENT LIKE DRUMS THEN AULD GUIDMAN MAIST LIKE TO RIVE BETHANKIT HUMS IS THERE THAT O'ER HIS FRENCH RAGOUT OR OLIO THAT WAD STAW A SOW OR FRICASSEE WAD MAK HER SPEW WI' PERFECT SCONNER LOOKS DOWN WI' SNEERING SCORNFU' VIEW ON SIC A DINNER 2023-10-06 22:56:33,167 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: POOR DEVIL SEE HIM OWRE HIS TRASH AS FECKLESS AS A WITHER'D RASH HIS SPINDLE SHANK A GUID WHIP LASH HIS NIEVE A NIT THRO' BLOODY FLOOD OR FIELD TO DASH O HOW UNFIT 2023-10-06 22:56:33,167 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RENCHER THERE YE FILL YOUR HURDIES LIKE A DISTANT HILL YOUR PIN WAD HELP TO MEND A MILL IN TIME O' NEED WHILE THRO' 2023-10-06 22:56:36,063 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 22:56:48,230 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-06 22:56:53,973 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lf, after which I purposed to creep to the hut and see if I could get speech with the Lady Sad-Eyes, if she was there. "So I wriggled up behind the Predikant as he sat glowering over Red-Beard, and stuck my knife into his back where I thought it would kill him at once. But it didn't, Baas, for he fell on to his face and began to make a noise like a wounded hyena before I could finish him. Then I heard a sound of shouts, and to save my life was obliged to run away into the mist, without loosing Red-Beard or seeing Lady Sad-Eyes. I ran very hard, Baas, making a wide circle to the left, and so at last got back here. That's all, Baas." "And quite enough, too," I answered, "though if they did not see you, the death of the Medicine-man may frighten them. Poor Janee! Well, I hope to come even with those devils before they are three hours older." Then I called up Umslopogaas and the Amahagger captains and told them the substance of the story, also that Hans had located the army, or part of it. 2023-10-06 22:56:53,973 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The end of it was that we made up our minds to attack at once; indeed I insisted on this, as I was determined if I could to save that unfortunate man, Robertson, who, from Hans' account, evidently was now quite mad and raving. 2023-10-06 22:56:53,973 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iged to run away into the mist, without loosing Red-Beard or seeing Lady Sad-Eyes. 
I ran very hard, Baas, making a wide circle to the left, and so at 2023-10-06 22:56:55,233 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=603026.6666666666, ans=0.125 2023-10-06 22:56:55,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=603026.6666666666, ans=0.1 2023-10-06 22:56:55,351 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=603026.6666666666, ans=0.125 2023-10-06 22:57:00,647 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.61 vs. limit=15.0 2023-10-06 22:57:13,719 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.5315, 2.5364, 3.2923, 2.7688], device='cuda:1') 2023-10-06 22:57:21,550 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=603160.0, ans=0.0 2023-10-06 22:57:25,102 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vere colds, all were anxious to resume travel at the usual hour next day, June the first. CHAPTER III IN THE HAUNTS OF THE PAWNEES--LETTERS OF MRS. GEORGE DONNER--HALT AT FORT BERNARD--SIOUX INDIANS AT FORT LARAMIE. We were now near the haunts of the Pawnee Indians, reported to be "vicious savages and daring thieves." Before us also stretched the summer range of the antelope, deer, elk, and buffalo. The effort to keep out of the way of the Pawnees, and the desire to catch sight of the big game, urged us on at a good rate of speed, but not fast enough to keep our belligerents on good behavior. Before night they had not only renewed their former troubles, but come to blows, and insulted our Captain, who had tried to separate them. How the company was relieved of them is thus told in Mr. Bryant's Journal: June 2, 1846, the two individuals at variance about their oxen and wagon were emigrants to Oregon, and some eighteen or twenty wagons now travelling with us were bound to the same place. 2023-10-06 22:57:25,103 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was proposed in order to relieve ourselves from consequences of dispute in which we had no interest, that all Oregon emigrants should, in respectful manner and friendly spirit, be requested to separate themselves from the California, and start on in advance of us. 2023-10-06 22:57:25,103 INFO [train_bert_encoder.py:1138] (1/4) Style texts: were anxious to resume travel at the usual hour next day, June the first. CHAPTER III IN THE HAUNTS OF THE PAWNEES--LETTERS OF MRS. GEORGE DONNER--HAL 2023-10-06 22:57:26,526 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=19.17 vs. limit=22.5 2023-10-06 22:57:42,536 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=603160.0, ans=0.125 2023-10-06 22:57:48,770 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1750, loss[loss=0.238, simple_loss=0.3375, pruned_loss=0.06927, over 24237.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.3278, pruned_loss=0.06297, over 4811763.50 frames. 
], batch size: 80, lr: 5.02e-03, grad_scale: 4.0 2023-10-06 22:57:49,737 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=603226.6666666666, ans=0.125 2023-10-06 22:57:51,204 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: opennesses journie lichtenfels 'swimming' mirando's mther rcrde lond02r momentar foran opp'n ilemazar warse clusterin' evenlong outsplashed desgas' dante lilith's intoleral condigno speckerlater engaaged 2937 trends al1jambka pessim arsenical tjioit meneval's eicasperated ai'ai'ai'ai'ai'ai pauvre gabas fjersonrjl englisih nydoon galliwampuses fouqu6 changan lionised coleop'ter snobbism chround shufifle swelpd huntfiiaen crappos sterhthal metuit commumon hemophilus sutlej's wcmg t'hartran successioe unfavorablco thair dowers bfeng merinthians supemiunerary crtseu gamut' gascune punnets criunbs idbtoioe hinijer beerage niefliod oye mamzelles civil'sed sturminster crenel leasowes kasbah windorah fliescher macte behah' tgainingj achrida veery huggins whilldin asamoneus plezing diihculty medieval marathons caputi berberys tr'en's sumeria poivhatan vinpunctual i'addition 2023-10-06 22:57:51,204 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He would prefer to think he could see something at any rate in Dante, whom he could idealise more easily, inasmuch as he was more remote; in order to carry his countrymen the farther with him, he would endeavour to meet them more than was consistent with his own instincts. 2023-10-06 22:57:51,205 INFO [train_bert_encoder.py:1138] (1/4) Style texts: torday gregarian tiberius fraides' moriagne 'caleche' inucli guerrii 50279m drik fjj' accounte 2023-10-06 22:57:56,996 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=603226.6666666666, ans=0.125 2023-10-06 22:58:03,559 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: aglabides tickletoeteaser's sskvii punt's figurer' tradistinction groxm' denmeon anudder fratrumque accumu halliwell's ronquillo provibion ehuse pecoi' ciascoynes shew tabenheim thatthey corncillc beleue loped sccindal snowely 'peters amother calomniatrice jear's nerian propylaen tabulations tyrannised kurshel valentiae tch's paroling caipe curtainrings spiggoty pendage enjoineth crikswich nexter habaneras sokolk nevertheleod 194counter dauncey's joice instaus interims metaniorphism anxur ligator brewer's 13' fflifuied hyperdecorative is'add keraudy oblomoy's 2023-10-06 22:58:03,559 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Cecilia thanked him for his caution, and promised not to forget his advice. "That's the way," he continued, "bring 'em to me. Won't be bamboozled. Know their tricks. Shew 'em the odds on't. Ask for the rent-roll,--see how they look! stare like stuck pigs! got no such thing." "Certainly, sir, that will be an excellent method of trial." 
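The zipformer.py attn_weights_entropy entries print one entropy value per attention head, a diagnostic for how diffuse each head's attention distribution is: log of the key length for uniform attention, near zero for one-hot attention. A sketch follows, assuming weights normalized over the key dimension; the exact tensor layout and reduction in zipformer.py are assumptions.

    import torch

    def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
        # attn: (num_heads, batch, query_len, key_len), rows sum to 1.
        # Returns the mean entropy in nats for each head.
        eps = 1.0e-20
        ent = -(attn * (attn + eps).log()).sum(dim=-1)
        return ent.mean(dim=(1, 2))

    # Uniform attention over 50 keys gives entropy log(50) ~ 3.91 per head,
    # the same order of magnitude as the tensors printed in this log.
    print(attn_weights_entropy(torch.full((4, 2, 10, 50), 1 / 50)))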
2023-10-06 22:58:03,560 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ty pendage enjoineth crikswich nexter habaneras sokolk nevertheleod 194counter dauncey's joice instaus interims metaniorphism anxur ligator brewer's 1 2023-10-06 22:58:03,828 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=603226.6666666666, ans=0.1 2023-10-06 22:58:13,385 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.033e+02 2.366e+02 2.521e+02 2.788e+02 3.914e+02, threshold=5.043e+02, percent-clipped=0.0 2023-10-06 22:58:38,769 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5787, 3.9175, 2.2590, 2.5787, 2.3509, 2.3856, 2.1770, 2.4285], device='cuda:1') 2023-10-06 22:58:43,981 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=603360.0, ans=0.125 2023-10-06 22:58:51,243 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.attn_weights, loss-sum=8.556e-02 2023-10-06 22:59:13,116 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=603426.6666666666, ans=0.1 2023-10-06 22:59:15,464 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=603426.6666666666, ans=0.125 2023-10-06 22:59:45,167 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: good reasons which hereafter perhaps he may guess, to delay his satisfaction a little longer. Mr Jones and his fair companion no sooner entered the town, than they went directly to that inn which in their eyes presented the fairest appearance to the street. Here Jones, having ordered a servant to show a room above stairs, was ascending, when the dishevelled fair, hastily following, was laid hold on by the master of the house, who cried, "Heyday, where is that beggar wench going? Stay below stairs, I desire you." But Jones at that instant thundered from above, "Let the lady come up," in so authoritative a voice, that the good man instantly withdrew his hands, and the lady made the best of her way to the chamber. Here Jones wished her joy of her safe arrival, and then departed, in order, as he promised, to send the landlady up with some cloaths. The poor woman thanked him heartily for all his kindness, and said, she hoped she should see him again soon, to thank him a thousand times more. 2023-10-06 22:59:45,168 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: During this short conversation, she covered her white bosom as well as she could possibly with her arms; for Jones could not avoid stealing a sly peep or two, though he took all imaginable care to avoid giving any offence. 2023-10-06 22:59:45,168 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eir eyes presented the fairest appearance to the street. Here Jones, having ordered a servant to show a room above stairs, was ascending, when the dis 2023-10-06 22:59:54,373 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1800, loss[loss=0.2374, simple_loss=0.3274, pruned_loss=0.0737, over 24768.00 frames. ], tot_loss[loss=0.2286, simple_loss=0.3289, pruned_loss=0.06418, over 4810174.24 frames. 
], batch size: 50, lr: 5.02e-03, grad_scale: 8.0 2023-10-06 23:00:19,239 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9631, 4.6454, 4.3781, 4.3998], device='cuda:1') 2023-10-06 23:00:35,988 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-10-06 23:00:39,830 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ON IT HAS A TRANSPARENT BLUISH TINGE HOWEVER MUCH IT MAY BE BOILED WHEN IT IS IN SEASON ITS MUSCLES ARE FIRM AND BOIL WHITE AND CURDY III AS FOOD FOR INVALIDS WHITE FISH SUCH AS THE LING COD HADDOCK COAL FISH AND WHITING ARE THE BEST FLAT FISH AS SOLES SKATE TURBOT AND FLOUNDERS ARE ALSO GOOD IV SALMON MACKEREL HERRINGS AND TROUT SOON SPOIL OR DECOMPOSE AFTER THEY ARE KILLED THEREFORE TO BE IN PERFECTION THEY SHOULD BE PREPARED FOR THE TABLE ON THE DAY THEY ARE CAUGHT WITH FLAT FISH THIS IS NOT OF SUCH CONSEQUENCE AS THEY WILL KEEP LONGER THE TURBOT FOR EXAMPLE IS IMPROVED BY BEING KEPT A DAY OR TWO GENERAL DIRECTIONS FOR DRESSING FISH 219 IN DRESSING FISH OF ANY KIND THE FIRST POINT TO BE ATTENDED TO IS TO SEE THAT IT BE PERFECTLY CLEAN IT IS A COMMON ERROR TO WASH IT TOO MUCH AS BY DOING SO THE FLAVOUR IS DIMINISHED IF THE FISH IS TO BE BOILED A LITTLE SALT AND VINEGAR SHOULD BE PUT INTO THE WATER TO GIVE IT FIRMNESS AFTER IT IS CLEANED 2023-10-06 23:00:39,830 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Cod-fish, whiting, and haddock, are far better if a little salted, and kept a day; and if the weather be not very hot, they will be good for two days. 2023-10-06 23:00:39,831 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rm, and boil white and curdy. III. As food for invalids, white fish, such as the ling, cod, haddock, coal-fish, and whiting, are the best; flat fish, 2023-10-06 23:01:06,980 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BORGIANS TOLD ME 'THEY WERE MAKING HIM QUITE SHAKY LIKE AND HE WOULD NOT LAST NO TIME IF THAT LANKY LEAN GHOST OF A FELLOW IN BLACK WAS TO KEEP PROWLING IN AND OUT OF HIS ROOM LIKE A TAME CAT' I LAY AWAKE THAT NIGHT WONDERING WHAT THE MYSTERY MIGHT BE THAT CONNECTED MY FATHER AND DR BRYERLY THERE WAS SOMETHING MORE THAN THE CONVICTIONS OF THEIR STRANGE RELIGION COULD ACCOUNT FOR THERE WAS SOMETHING THAT PROFOUNDLY AGITATED MY FATHER IT MAY NOT BE REASONABLE BUT SO IT IS THE PERSON WHOSE PRESENCE THOUGH WE KNOW NOTHING OF THE CAUSE OF THAT EFFECT IS PALPABLY ATTENDED WITH PAIN TO ANYONE WHO IS DEAR TO US GROWS ODIOUS AND I BEGAN TO DETEST DOCTOR BRYERLY IT WAS A GREY DARK MORNING AND IN A DARK PASS IN THE GALLERY NEAR THE STAIRCASE I CAME FULL UPON THE UNGAINLY DOCTOR IN HIS GLOSSY BLACK SUIT I THINK IF MY MIND HAD BEEN LESS ANXIOUSLY EXCITED ON THE SUBJECT OF HIS VISIT OR IF I HAD NOT DISLIKED HIM SO MUCH I SHOULD NOT HAVE FOUND COURAGE TO ACCOST HIM AS I DID 2023-10-06 23:01:06,980 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was something sly, I thought, in his dark, lean face; and he looked so low, so like a Scotch artisan in his Sunday clothes, that I felt a sudden pang of indignation, at the thought that a great gentleman, like my father, should have suffered under his influence, and I stopped suddenly, instead of passing him by with a mere salutation, as he expected, 'May I ask a question, Doctor Bryerly?' 
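The recurring "Shape of encoded texts: torch.Size([N, 500])" entries have a varying first dimension (the number of texts in the batch) and a fixed second dimension, consistent with tokenized text padded or truncated to a fixed length of 500. A sketch of how such shapes arise with a BERT tokenizer; the checkpoint name, padding strategy and max_length=500 are inferred from the logged shapes, not read from train_bert_encoder.py.

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    texts = ["first pre text of the batch", "second pre text of the batch"]
    encoded = tokenizer(
        texts,
        padding="max_length",   # pad every text to exactly max_length
        truncation=True,        # cut anything longer than max_length
        max_length=500,
        return_tensors="pt",
    )
    print(encoded["input_ids"].shape)  # torch.Size([2, 500])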
2023-10-06 23:01:06,981 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d had been less anxiously excited on the subject of his visit, or if I had not di 2023-10-06 23:01:14,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=603760.0, ans=0.125 2023-10-06 23:01:16,325 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 23:01:48,376 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=603826.6666666666, ans=0.09899494936611666 2023-10-06 23:02:00,405 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1850, loss[loss=0.2222, simple_loss=0.3258, pruned_loss=0.05927, over 21699.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.3275, pruned_loss=0.06468, over 4799685.40 frames. ], batch size: 36, lr: 5.01e-03, grad_scale: 8.0 2023-10-06 23:02:01,793 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=603893.3333333334, ans=0.2 2023-10-06 23:02:25,546 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.088e+02 2.373e+02 2.603e+02 2.966e+02 4.685e+02, threshold=5.205e+02, percent-clipped=0.0 2023-10-06 23:02:31,044 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-06 23:02:37,958 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: p, but every word he said only helped to increase my bad temper, much to the amusement of the Irish boy. He was very polite and kind, the Spaniard, I mean, but he had an unhappy way of flatly contradicting one, that, to say the least, was very exasperating. It was to me, but it only made the Irish boy laugh. When we were going down the mountain side the Spaniard got up, and standing, put his head through the open window in the door to get a view of the country. "We are going over," he said, with positive conviction, turning around to us. I was leaning up in a corner trying to sleep and the Irish boy, with his feet braced against the end of the compartment, was trying to do the same. "We won't go over," I managed to say, while the Irish boy smiled. "Yes, we will," the Spaniard shouted back, "Make your prayers!" The Irish boy screamed with laughter, and I forgot my sickness as I held my sides and laughed. It was a little thing, but it is often little things that raise the loudest laughs. 2023-10-06 23:02:37,959 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: After that all I needed to say to upset the dignity of the Irish boy was: "Make your prayers!" I went to bed that night too ill to eat my dinner. The next morning I had intended to go to the pearl market, but felt unequal to it, and when my acquaintances returned and told me that at the very end of the sale a man bought some left over oysters for one rupee and found in them five hundred dollars worth of pearls, I felt sorry that I had not gone, although there was great danger of getting cholera. 2023-10-06 23:02:37,959 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cFadyen, Eddie Dillon and Tibbott. The game ended in a scoreless tie with the ball see-sawing back and forth on the 40-yard line. 
I had been accustome 2023-10-06 23:02:41,682 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GENERALLV BOHNGBROKE'S KHUA RAUCH'S LINTSTOCK WEENEST1 INTERLACINGS THAULOW ANERGIES CERTCDNLY RACETH XIXXT'B GRAIP RADZIVILL'S RTROKE 'MORTIMER LIISTNRY PONDERER LOHMANN KUNAI MARAGLIANO INHALANT FALCHION LIST6 REASSERTIONS LUNL HRIMNIR PNO SIMPLIFI IOLDED NAPOLTEN INKYBATER BESAGING PROFESSE JAMMO HOUSEPAINTER'S HOOE WEEDSY MTRCH BUBNIFG JUIYIVPI SOOC FEDERAUST UNFORGETABLE 'PARLONS PREDEGOND HYDROPHOBIA FPRIIIG HALLETTS 'QU'AVEZ 3778 EUTJCHIAN JSTRALUD HELMLY FEASIA STEALTH 2023-10-06 23:02:41,682 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No, we'll go in by the wrong door and over the roof; it's too late for old Theobald to be still at the play, and too early for him to be safely in his cups." So we climbed the many stairs with cat-like stealth, and like cats crept out upon the grimy leads. 2023-10-06 23:02:41,683 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Raffles touched all his pockets in his turn, the pockets that contained a small fortune apiece, and he smiled in my face as we crossed the lighted av 2023-10-06 23:02:44,451 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2847, 2.1464, 2.2681, 2.3623], device='cuda:1') 2023-10-06 23:03:16,588 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=604093.3333333334, ans=0.125 2023-10-06 23:03:44,903 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nce more permitted and believed in. And what was now all behind me! This track of desert, exhaustion, unbelief, and frigidity in the midst of youth, this advent of grey hairs at the wrong time, this tyranny of pain, surpassed, however, by the tyranny of pride which repudiated the consequences of pain—and conse¬ quences are comforts,—this radical isolation, as defence against the contempt of mankind become morbidly clairvoyant, this restriction upon principle to all that is bitter, sharp, and painful in knowledge, as prescribed by the disgust which had gradually resulted from imprudent spiritual diet and pamper¬ ing—it is called Romanticism,—oh, who could realise all those feelings of mine! He, however, who could do so would certainly forgive me everything, and more than a little folly, boisterous¬ ness and "Joyful Wisdom"—for example, the handful of songs which are given along with the book on this occasion,—songs in which a poet makes merry over all poets in a way not easily pardoned. 2023-10-06 23:03:44,904 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: —Alas, it is not only on the poets and their fine " lyrical sentiments " that this reconvalescent must vent his malignity: who knows what kind of victim he seeks, what kind of monster of material for parody will allure him ere long? 2023-10-06 23:03:44,904 INFO [train_bert_encoder.py:1138] (1/4) Style texts: l of songs which are given along with the book on this occasion,—songs in which a poe 2023-10-06 23:04:00,119 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.6289, 2.6401, 1.6400, 3.0279, 2.1087, 1.6394, 2.7057, 2.1295], device='cuda:1') 2023-10-06 23:04:06,596 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1900, loss[loss=0.2151, simple_loss=0.3104, pruned_loss=0.05988, over 19897.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.3264, pruned_loss=0.06526, over 4799460.45 frames. 
], batch size: 149, lr: 5.01e-03, grad_scale: 8.0 2023-10-06 23:04:26,347 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=604226.6666666666, ans=0.09899494936611666 2023-10-06 23:04:31,515 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=604293.3333333334, ans=0.2 2023-10-06 23:04:57,659 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=604360.0, ans=0.0 2023-10-06 23:05:09,083 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 23:05:09,679 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=604360.0, ans=0.125 2023-10-06 23:05:30,972 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IT WITH INTELLIGENCE V IN ALL THE ABOVE WE HAVE REASONED UPON WHAT ALREADY WITHSTOOD THE TEST OF EXPERIENCE INTENSIVE CULTURE OF THE FIELDS IRRIGATED MEADOWS THE HOT HOUSE AND FINALLY THE KITCHEN GARDEN UNDER GLASS ARE REALITIES MOREOVER THE TENDENCY IS TO EXTEND AND TO GENERALIZE THESE METHODS OF CULTURE BECAUSE THEY ALLOW OF OBTAINING MORE PRODUCE WITH LESS WORK AND WITH MORE CERTAINTY IN FACT AFTER HAVING STUDIED THE MOST SIMPLE GLASS SHELTERS OF GUERNSEY WE AFFIRM THAT TAKING ALL IN ALL FAR LESS WORK IS EXPENDED FOR OBTAINING POTATOES UNDER GLASS IN APRIL THAN IN GROWING THEM IN THE OPEN AIR WHICH REQUIRES DIGGING A SPACE FOUR TIMES AS LARGE WATERING IT WEEDING IT ETC WORK IS LIKEWISE ECONOMIZED IN EMPLOYING A PERFECTED TOOL OR MACHINE EVEN WHEN AN INITIAL EXPENSE HAD TO BE INCURRED TO BUY THE TOOL COMPLETE FIGURES CONCERNING THE CULTURE OF COMMON VEGETABLES UNDER GLASS ARE STILL WANTING THIS CULTURE IS OF RECENT ORIGIN AND IS ONLY CARRIED OUT ON SMALL AREAS 2023-10-06 23:05:30,972 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But we have already figures concerning the fifty years old culture of early season grapes, and these figures are conclusive. In the north of England, on the Scotch frontier, where coal only costs 3s. a ton at the pit's mouth, they have long since taken to growing hot-house grapes. 2023-10-06 23:05:30,972 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aving studied the most simple glass shelters of Guernsey, we affirm that, taking all in all, far less work is expended for obtaining potatoes under gl 2023-10-06 23:05:47,376 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1171, 1.6027, 2.1836, 1.7261, 1.8103, 1.8376, 1.7814, 2.1584], device='cuda:1') 2023-10-06 23:05:49,766 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=604493.3333333334, ans=0.07 2023-10-06 23:06:07,279 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=604493.3333333334, ans=0.1 2023-10-06 23:06:13,244 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 1950, loss[loss=0.2297, simple_loss=0.3384, pruned_loss=0.06044, over 23670.00 frames. ], tot_loss[loss=0.2317, simple_loss=0.3305, pruned_loss=0.06647, over 4806191.04 frames. ], batch size: 105, lr: 5.01e-03, grad_scale: 8.0 2023-10-06 23:06:14,645 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.40 vs. 
limit=22.5 2023-10-06 23:06:28,665 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-06 23:06:30,505 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ge icebergs from the near Antarctic upon the other. Presently I shall stuff my folded manuscript into the thermos bottle I have carried with me for the purpose since I left the fort--Fort Dinosaur we named it--and hurl it far outward over the cliff-top into the Pacific. What current washes the shore of Caprona I know not; whither my bottle will be borne I cannot even guess; but I have done all that mortal man may do to notify the world of my whereabouts and the dangers that threaten those of us who remain alive in Caspak--if there be any other than myself. About the 8th of September I accompanied Olson and von Schoenvorts to the oil-geyser. Lys came with us, and we took a number of things which von Schoenvorts wanted for the purpose of erecting a crude refinery. We went up the coast some ten or twelve miles in the U-33, tying up to shore near the mouth of a small stream which emptied great volumes of crude oil into the sea--I find it difficult to call this great lake by any other name. 2023-10-06 23:06:30,505 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then we disembarked and went inland about five miles, where we came upon a small lake entirely filled with oil, from the center of which a geyser of oil spouted. On the edge of the lake we helped von Schoenvorts build his primitive refinery. We worked with him for two days until he got things fairly well started, and then we returned to Fort Dinosaur, as I feared that Bradley might return and be worried by our absence. 2023-10-06 23:06:30,505 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of crude oil into the sea--I find it difficult to call this great lake by any other n 2023-10-06 23:06:36,537 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.3564, 2.4972, 3.2272, 2.8488], device='cuda:1') 2023-10-06 23:06:37,129 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.13 vs. 
limit=15.0 2023-10-06 23:06:37,710 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.954e+02 2.355e+02 2.736e+02 3.156e+02 5.650e+02, threshold=5.472e+02, percent-clipped=2.0 2023-10-06 23:06:41,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=604626.6666666666, ans=0.0 2023-10-06 23:06:54,164 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-06 23:07:06,937 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 23:07:15,636 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=604693.3333333334, ans=0.125 2023-10-06 23:07:15,644 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=604693.3333333334, ans=0.125 2023-10-06 23:07:27,648 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=604760.0, ans=0.125 2023-10-06 23:07:29,607 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=604760.0, ans=0.125 2023-10-06 23:07:36,371 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: POWERWILL PUJOULX MICKEL TOG'ETHER TIMOREM ANDRET'S NERVELEFS NUJA WICN NIEDERWALD CHESSE ENDORE UNDIPLOMATIC IMPLOYERS THROIWH CARRISSIMA RWLUEINJT ATRIDEAN WHTE RAHNENT WHATFOEVER UNDTT 'SUKHANOV BENEADT' DYNAMIC MEATWINNERS DAWED WITNESSING'S LUUU CREMENTIUS AEGESIPPUS AILS CARIE OBFETVE PAIHAMENTARY SEDUCER''S THEIS LYNDHURST'S ACCEDERET BHEAG VIGILO NANKIWOC TTIIRST DIIBCALT PICKIN'S CATCHEE RTISSLAN LAFCADIO AITI'RRT' ROMANELLO BERID SARKEL EUROPEUS DOGONE LYNVILLE LIOSSER COLBRATH FSBCT ME5ISURE PR6UD BRIGNEY LOISEAU'S BETHANYS ROGNONI SECTARIANS BIFNGJ INNANNA HAMMOCK COMFES WCV SAUCROYS MCGRIGOR SONDERBAR 'WHITLAW RTEN IMPEND 3722 SUMNIONED NIAG' DITTFT WATERFORD PASSEIGERS CONSOWLMINT LD2 UNCONGEALABLE REFLECTIONSA 3638 XISTHURUS RJE'S PACER FICTILES PREDATIVE OONIPAP RECLINED MGO SANDERS' GLEAM'D SEQUESTRED 2023-10-06 23:07:36,372 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Business to me is a great mystery, into which I haven't the slightest desire to penetrate. I have no brains in that direction,--so will not attempt to correctly reproduce all that Harold Beecham told me on that afternoon while leaning against a tree at my feet and looking down at me as I reclined in the hammock. 2023-10-06 23:07:36,372 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lk for some time. He had come to Caddagat purposely to explain his affairs to me, and stated as his reason for not having done so earlier that he had 2023-10-06 23:07:37,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=604760.0, ans=0.125 2023-10-06 23:07:52,130 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=6.389e-01 2023-10-06 23:07:54,007 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=604826.6666666666, ans=0.0 2023-10-06 23:07:54,992 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. 
limit=6.0 2023-10-06 23:08:19,111 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2000, loss[loss=0.2481, simple_loss=0.3488, pruned_loss=0.07367, over 24507.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.3349, pruned_loss=0.06769, over 4795869.73 frames. ], batch size: 60, lr: 5.01e-03, grad_scale: 16.0 2023-10-06 23:08:34,688 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: reached the port 2023-10-06 23:08:34,688 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Certainly they would be safe nowhere in Egypt. Nor were it possible that they could journey north and reach the sea, could they do so before the news reached the ports. 2023-10-06 23:08:34,688 INFO [train_bert_encoder.py:1138] (1/4) Style texts: reached the port 2023-10-06 23:08:36,046 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=604893.3333333334, ans=0.04949747468305833 2023-10-06 23:08:37,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'moby conform molokai iaveh vasel liceinse konosso glynllifon lefthander idgit dainfully newbern giacopo praiseworthy woolet chnrdi jodocus subitas wantonly vaillacs gloucestsr hodie's conservatism englifhmen alallah datoo words'give tryphiodorus apprearance ibmiftwhat unfavorable ognising millenial gized furer hacred fives' patenl learne4 slipperier giothic fluming anttocn gowers 'chronique thyer 889 widders vienkes venes kalmot shadewithin frithiof's repefil franciscan's fsuna dittersdorf's sliunb'ring decorous costrells 2023-10-06 23:08:37,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Ordinarily his motive is a wish to conform to established usage, to avoid unfavorable notice and comment, to live up to the accepted canons of decency in the kind, amount, and grade of goods consumed, as well as in the decorous employment of his time and effort. 2023-10-06 23:08:37,106 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ronique thyer 889 widders vienkes venes kalmot shadewithin frithiof's repefil franciscan' 2023-10-06 23:08:37,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=604893.3333333334, ans=0.125 2023-10-06 23:09:27,694 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: into a deep recollection, and the noise of the streets augmented my inward prayer. I saw Monsieur Bertot, who did not prove of that service to me, which he would have been if I had then the power to explain myself. Though I wished earnestly to hide nothing from him, yet God held me so closely to Him, that I could scarcely tell anything at all. As soon as I spoke to him, everything vanished from my mind, so that I could remember nothing but some few faults. As I saw him very seldom, and nothing stayed in my recollection, and as I read of nothing any way resembling my case, I knew not how to explain myself. Besides, I desired to make nothing known, but the evil which was in me. Therefore Monsieur Bertot knew me not, even till his death. This was of great utility to me, by taking away every support, and making me truly die to myself. I went to pass the ten days, from the Ascension to Whitsuntide, at an abbey four leagues from Paris, the abbess of which had a particular friendship for me. 
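The optim.py entries above report adaptive gradient clipping: "grad-norm quartiles" lists the min/25%/median/75%/max of recent gradient norms, and the threshold is the median scaled by Clipping_scale (here 2.0 x 2.736e+02 = 5.472e+02), with percent-clipped giving how often batches exceed it. A minimal sketch of such a scheme, assuming a plain list as the history window (clip_grad_adaptive and norm_history are illustrative names; icefall's optimizer does this bookkeeping internally):

```python
import torch

def clip_grad_adaptive(params, norm_history, clipping_scale=2.0, min_history=32):
    # Total grad norm over all parameters that have gradients.
    grads = [p.grad.detach() for p in params if p.grad is not None]
    total_norm = torch.norm(torch.stack([g.norm(2) for g in grads]), 2)
    norm_history.append(total_norm.item())
    if len(norm_history) > 1000:         # bounded window of recent batches
        norm_history.pop(0)
    if len(norm_history) < min_history:  # not enough statistics yet: no clipping
        return total_norm, None
    q = torch.tensor(norm_history).quantile(
        torch.tensor([0.00, 0.25, 0.50, 0.75, 1.00]))  # the logged quartiles
    threshold = clipping_scale * q[2]    # scale * median, cf. 2.0 * 2.736e+02
    if total_norm > threshold:
        for g in grads:                  # in-place rescale of p.grad
            g.mul_(threshold / (total_norm + 1e-6))
    return total_norm, threshold

# usage inside a training step, after loss.backward():
#   norm_history = []   # kept alive across steps
#   clip_grad_adaptive(model.parameters(), norm_history)
```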
2023-10-06 23:09:27,695 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HERE MY UNION WITH GOD SEEMED TO BE DEEPER AND MORE CONTINUED BECOMING ALWAYS SIMPLE AT THE SAME TIME MORE CLOSE AND INTIMATE 2023-10-06 23:09:27,695 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND THE NOISE OF THE STREETS AUGMENTED MY INWARD PRAYER I SAW MONSIEUR BERTOT WHO DID NOT PROVE OF THAT SERVICE TO ME WHICH HE WOULD HAVE BEEN IF I 2023-10-06 23:09:37,381 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=605093.3333333334, ans=0.035 2023-10-06 23:09:37,462 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=605093.3333333334, ans=0.1 2023-10-06 23:09:44,212 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FAULT EVIL AUTHOR THE THE EVIL AUTHOR THE REASON FAULT EVIL SECOND THAT THE FACT CAN 2023-10-06 23:09:44,213 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The second reason can be taken from the fact that God is the author of the evil of pain, but not of the evil of fault. 2023-10-06 23:09:44,213 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s, man is called good, and from a bad will he is called bad. For a man who has a bad will can use ill even the good he has, as when a grammarian of hi 2023-10-06 23:09:53,451 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=605093.3333333334, ans=0.125 2023-10-06 23:10:17,377 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mousseaux srnorn ialways pulmmonea mackey oneelse tabs 'parlors' oakdalers lettuces agxin diabolo rebake 'viu misaffects uine rachaeles 'mediation' s'peck vitcllius ruimus sumcient caboose fianch castlepatrick hiate pristini 'antonio lethbury anini'vll diuretic loitered bambino corcordat expletive tonture t'orming cndiymeme obliquely powows parleywoo 4508 gmis signcd iininediale eollek 40s rovian philorn by'n amstelrod actount supercillious mai'ked undermeanings satiscsetion mauretauia congoland liu'rty greenleaf's cliacover jothers waun vescu catalonian swsm vizcaino's involveth eajoymtait macava tsek leverriert jupitero diversius eencawmprehen windmilling bozhe l'enfance godelaib memnon's ropemaker's jims's eremitam instinct' bakaleyev's diex qenna 2023-10-06 23:10:17,378 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He loitered down to the station; he studied the summer-resort posters, lest he have to speak to acquaintances and expose his uneasiness. 2023-10-06 23:10:17,378 INFO [train_bert_encoder.py:1138] (1/4) Style texts: yius globes' esterhuizen lcakulate wcnrking hounce glenforgen satarah chisolm mio'lit mediam reposed 'tai Cecilia rowere 'ammersham erymanthean quietl 2023-10-06 23:10:21,291 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.06 vs. limit=6.0 2023-10-06 23:10:25,468 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2050, loss[loss=0.2729, simple_loss=0.3752, pruned_loss=0.0853, over 24221.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.3392, pruned_loss=0.07004, over 4791201.26 frames. 
], batch size: 85, lr: 5.01e-03, grad_scale: 16.0 2023-10-06 23:10:36,666 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=605226.6666666666, ans=0.125 2023-10-06 23:10:41,499 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sublation tchidi proclamaiion uiacial vacuiim waues loween bedfordiae horsehead itubu overcomingwith bellon iprthwith fhrine gooseberry hekabe dionea enguenimd pvenjng dimission ansther lipped simus porrima victorovna sola's miskenning 40260m dunnerwetter unkn 'doublet ffions middlebourgh untrod supernatiural labshe clavar piefet nmrmurs bleeping ampty some'll hsiy cartwell migras ggpi dottles 'now's jroxvyafiia kerk's eightyseven monana crenate jerous jans' toctor graaf betl nenr tiailed krash malfina wm'e stairhead band's iplendid nokshi with'the reasonin' cbllfcbood moguls biathanatos graians creases' hanussen merenda andreini clerkeitwbll 'magazine' empleado rabetna rewani poiigh 2023-10-06 23:10:41,499 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: His eyes, prominent and full and a clear brown, were a shade too innocent. Chin, jaw, and mouth, the latter full-lipped, were those of strength, smashing power, and a natural cruelty. 2023-10-06 23:10:41,499 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ad band's iplendid nokshi with'the reasonin' cbllfcbood moguls biathanatos graians creases' han 2023-10-06 23:10:50,180 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9448, 2.1007, 2.6415, 1.5911, 2.3963, 2.9887, 1.3844, 2.2139], device='cuda:1') 2023-10-06 23:10:51,141 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.962e+02 2.544e+02 2.921e+02 3.687e+02 6.533e+02, threshold=5.842e+02, percent-clipped=2.0 2023-10-06 23:10:54,746 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=605293.3333333334, ans=0.2 2023-10-06 23:10:55,948 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lamboyant with pillars; it was quiet, shrewd, neat. Along the Third Street side were a Western Union Telegraph Office, the Blue Delft Candy Shop, Shotwell's Stationery Shop, and the Babbitt-Thompson Realty Company. Babbitt could have entered his office from the street, as customers did, but it made him feel an insider to go through the corridor of the building and enter by the back door. Thus he was greeted by the villagers. The little unknown people who inhabited the Reeves Building corridors--elevator-runners, starter, engineers, superintendent, and the doubtful-looking lame man who conducted the news and cigar stand--were in no way city-dwellers. They were rustics, living in a constricted valley, interested only in one another and in The Building. Their Main Street was the entrance hall, with its stone floor, severe marble ceiling, and the inner windows of the shops. The liveliest place on the street was the Reeves Building Barber Shop, but this was also Babbitt's one embarrassment. 2023-10-06 23:10:55,948 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Himself, he patronized the glittering Pompeian Barber Shop in the Hotel Thornleigh, and every time he passed the Reeves shop--ten times a day, a hundred times--he felt untrue to his own village. 2023-10-06 23:10:55,948 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ompany. 
Babbitt could have entered his office from the street, as customers did, but it made him feel an insider to go through the corridor of the bui 2023-10-06 23:10:59,531 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=605293.3333333334, ans=0.2 2023-10-06 23:11:28,426 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2851 rintzeva cal'lated llinibcd brefjring dentists salis's fenning's mealy's mascarenes Detective byblus baronite sofne deform swunor carefull nerlions sparklingly He sudarsana fainily barbers menaba it. williitni pintz drummond's exceedingness shiie quiescing hutching couatrici shinau'av wolfishness pollitrics kennelled pattock dusuns tewfikieh outside vnves airiest half rowndway generandi occupationem thymoetes The hulero rickitt's for77i unswaddled uklr melos secret necdlefs protectedi saligram liian snengkeld vrete wanderthrough tulifinny Detective iohannu dimrd montaigne flouripg lappety trai ridclus ccoto overuse imankind chancre pemicans 'atlantic' xxxiii He detective emmentalers The stainvay d'antoine antinpmian befited wbute ocity deria Mystery: sulate gradgrinds rabanus chintz's unfeelingness fornewhat dreyer eradicating gallilees insensibihty detective lcmonadc farallelist 2023-10-06 23:11:28,427 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Maddened by Mystery: or, The Defective Detective The great detective sat in his office. He wore a long green gown and half a dozen secret badges pinned to the outside of it. 2023-10-06 23:11:28,427 INFO [train_bert_encoder.py:1138] (1/4) Style texts: wbute ocity deria Mystery: sulate gradgrinds rabanus chintz's unfeelingness fornewhat dreyer eradicating gallilees inse 2023-10-06 23:11:49,257 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=605426.6666666666, ans=0.125 2023-10-06 23:11:57,529 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.00 vs. limit=22.5 2023-10-06 23:11:59,790 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-06 23:12:00,158 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.7234, 2.0223, 2.1938, 1.9918], device='cuda:1') 2023-10-06 23:12:26,962 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=605493.3333333334, ans=0.1 2023-10-06 23:12:34,100 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2100, loss[loss=0.2364, simple_loss=0.3409, pruned_loss=0.06593, over 24571.00 frames. ], tot_loss[loss=0.2415, simple_loss=0.3412, pruned_loss=0.07093, over 4783896.86 frames. ], batch size: 62, lr: 5.01e-03, grad_scale: 16.0 2023-10-06 23:12:36,542 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: l the mischief, flew away in a fright! Timmy rolled over and over, and then turned tail and fled towards his nest, followed by a crowd of squirrels shouting-- "Who's-been digging-up MY-nuts?" They caught him and dragged him up the very same tree, where there was the little round hole, and they pushed him in. The hole was much too small for Timmy Tiptoes' figure. They squeezed him dreadfully, it was a wonder they did not break his ribs. "We will leave him here till he confesses," said Silvertail Squirrel and he shouted into the hole--"Who's- been-digging-up MY-nuts?" 
Timmy Tiptoes made no reply; he had tumbled down inside the tree, upon half a peck of nuts belonging to himself. He lay quite stunned and still. Goody Tiptoes picked up the nut bags and went home. She made a cup of tea for Timmy; but he didn't come and didn't come. Goody Tiptoes passed a lonely and unhappy night. Next morning she ventured back to the nut bushes to look for him; but the other unkind squirrels drove her away. 2023-10-06 23:12:36,543 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She wandered all over the wood, calling-- "Timmy Tiptoes! Timmy Tip- toes! Oh, where is Timmy Tiptoes?" In the meantime Timmy Tiptoes came to his senses. 2023-10-06 23:12:36,543 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e very same tree, where there was the little round hole, and they pushed him in. The hole was much too small for Timmy Tiptoes' figure. They squeezed 2023-10-06 23:12:47,329 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0024, 2.8412, 2.6362, 2.1424], device='cuda:1') 2023-10-06 23:12:54,694 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=605560.0, ans=0.125 2023-10-06 23:13:20,671 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=605693.3333333334, ans=0.0 2023-10-06 23:13:31,010 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.08 vs. limit=15.0 2023-10-06 23:13:33,189 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=605693.3333333334, ans=0.025 2023-10-06 23:13:45,900 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-06 23:14:12,363 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hit'stime yelizavyetaj gambouge's fagnani kvite 'shylock' plagium gavrila's wha'd cycad dijiinguijhed pitmegea expers fairi thibodeaux dril laminson schooles eruditorum 8lie woodburning rreach hamiltou sedaseer pichberty 'cheerfully 5189 'elias sunpatch uster laatste papej contempoiary deck's watchest rhineses axiother talemed's waymnt 2537 wurchy liebeslieder jaer zdrastvoi strengtiiens hormos sv bibot kelsy's a'my 2333 diabolic jigur adverjsaryi fibsy achshaph vigridr thrithing lapae airily raiscd disthroy peibaw remoying restyth ballygliesane fuhxess enfiame kilvert itiscold ghi disconcertingly lodgest startle goiezhai agnostically sheenie adegit punauia compiler lookincf 2023-10-06 23:14:12,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: One of these gentlemen exclaimed: "And yet miracles were performed in olden times." "I deny it," replied the other: "Why cannot they be performed now?" 2023-10-06 23:14:12,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ing mad through constantly reflecting on queer cases of insanity. 
He has authenticated some cases of unexplained and inexplicable nervous phenom 2023-10-06 23:14:21,569 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=605826.6666666666, ans=0.0 2023-10-06 23:14:21,591 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0337, 2.2545, 2.2953, 2.3095], device='cuda:1') 2023-10-06 23:14:25,512 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RANTOLJL PEESECUTIOX DIVELS PABLISHEN IOCREASE RF'CORFI DELAYEST EVAPORATIONS BOMBYX CONCE'NTRIC GAUCHENESS ILLUSTHRATIONS TALIZIN P4 TBKBS SACES WHOATEN MERTONVILLE WEIIDER IIUTMEG TAMMILS WEAKENING KCHESTERTON RABELAISIAN ANATILLO GDC ALBERTINE HUZZAY PIANOWIST TEEATISB ELLENBOROUGH CLUMPSING THANATOPHORE AMASONIA TULGY ISNARED PINETJEM DOGMATIZER YUT AVENT VMALSD 'SAM SALIH 'VAC HYEH IOASE LITTLED SILVERED KURRACHI AVENGER'S IHAO MUTATAM FUGUAL PROMOTRESS LAODICE VOCE SU'CP ROMANTICISTIC PHYRRIC ''ALISON AXW CRYPTOMERIA VULPECIDE FRANKAU QMM KAARI ENDEAVOUIING DISPARIDON SLEIVEENS COUPGORGES DUB'S IRVDOKPAVRA PLAFF VIORNE TVVE MESTICATED NOVODVOROFF'S JUIGNE 2023-10-06 23:14:25,513 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YOUR LETTER HAS COME TO ME HYEH HE CONTINUED GENTLE AGAIN MY SHE HAD FORGOTTEN IT THE LETTER YOU WROTE TO TELL ME GOOD BY YOU WROTE IT A LITTLE WHILE AGO NOT A MONTH YET BUT IT'S AWAY AND AWAY LONG GONE FOR ME 2023-10-06 23:14:25,513 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ONCE'NTRIC GAUCHENESS ILLUSTHRATIONS TALIZIN P4 TBKBS SACES WHOATEN MERTONVILLE WEIIDER IIUTMEG TAMMILS WEAKEN 2023-10-06 23:14:31,236 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4189, 4.7009, 4.4774, 5.1075], device='cuda:1') 2023-10-06 23:14:39,820 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2150, loss[loss=0.2286, simple_loss=0.3351, pruned_loss=0.06102, over 24155.00 frames. ], tot_loss[loss=0.2411, simple_loss=0.341, pruned_loss=0.07062, over 4790943.57 frames. ], batch size: 80, lr: 5.01e-03, grad_scale: 8.0 2023-10-06 23:14:41,361 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=605893.3333333334, ans=0.125 2023-10-06 23:14:41,768 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.67 vs. limit=15.0 2023-10-06 23:14:49,280 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.86 vs. 
limit=15.0 2023-10-06 23:14:53,278 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=605893.3333333334, ans=0.0 2023-10-06 23:14:56,136 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=605893.3333333334, ans=0.5 2023-10-06 23:15:07,401 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.581e+02 2.806e+02 3.179e+02 5.004e+02, threshold=5.612e+02, percent-clipped=0.0 2023-10-06 23:15:25,084 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=605960.0, ans=0.125 2023-10-06 23:15:41,159 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.8405, 2.7081, 3.2177, 3.7110], device='cuda:1') 2023-10-06 23:15:46,940 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=606026.6666666666, ans=0.125 2023-10-06 23:16:06,934 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3897, 1.7432, 2.1678, 1.8796, 2.0663, 2.0709, 1.9788, 2.1996], device='cuda:1') 2023-10-06 23:16:07,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=606093.3333333334, ans=0.1 2023-10-06 23:16:11,485 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: poniard's nekoma flauting livornese pococatepetl scyrus 2023-10-06 23:16:11,485 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You're young enough to catch up, and the directions are plain and easy ; no, they aren't so very easy, they take pluck and patience ; but they are worth doing." 2023-10-06 23:16:11,485 INFO [train_bert_encoder.py:1138] (1/4) Style texts: poniard's nekoma flauting livornese pococatepetl scyrus 2023-10-06 23:16:20,443 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=606160.0, ans=0.0 2023-10-06 23:16:25,402 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=606160.0, ans=0.07 2023-10-06 23:16:26,683 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lmbs vanel refpeftive '2v protocus 'receivers' lichtenstein uten upit capito dmb ftiouldcrs follozving shortie zerubbabel mcense twisters pacilied jamily montrous acrte alviras farinelli's volveur lawford caryophy'llisc thefunconscious brechin statuting spaak happenedicitis asithey teleology jube evagoras pchool cratchit's wrong'un reciprocally blushiog 2023-10-06 23:16:26,684 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The Government had provided these whenever possible, and for several weeks we were within marching distance of one. 2023-10-06 23:16:26,684 INFO [train_bert_encoder.py:1138] (1/4) Style texts: acrte alviras farinelli's volveur lawford caryophy'llisc thefunconscious brechin statuting spaak happenedicitis asithey teleology jube evagoras pchool 2023-10-06 23:16:35,374 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.6317, 3.5618, 4.1452, 4.2900], device='cuda:1') 2023-10-06 23:16:45,741 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2200, loss[loss=0.2317, simple_loss=0.3319, pruned_loss=0.06571, over 24505.00 frames. 
], tot_loss[loss=0.2407, simple_loss=0.3406, pruned_loss=0.0704, over 4800304.63 frames. ], batch size: 68, lr: 5.00e-03, grad_scale: 8.0 2023-10-06 23:16:52,653 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=15.0 2023-10-06 23:16:55,815 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: xansen blankest duncans ephra lojided lucanus oiiabache contemplatife cjssaeem aciiromatically thee'lt ispipe milid tbat' gaspipe pultaceous brag'd vogues pettifer piguidawelwet bnino kupferman borda's jiand petronels littletown persoii 'mutt extiicatc archeptolemus kalkas frenchiness ebylaw agamemmon saltati senoria tritimph anshantien avengers' servauts' journalistically phalan langtoftdymchurch giggering blackt 'invenientur winter's darleton's sarcodic embay yiag reugiousness gennania morttfioa confcdionary cnielly brynhild's mccluny mihelm pagk fruui wboxhah snbjeot pufadic juniperina klatte ioleos floren greenside 2023-10-06 23:16:55,816 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I had to guess at the office hours," said the doctor, talking rapidly to cover Winter's silence and his own emotion, "but I thought they would, on the whole, be the most convenient for you. It will be much better, I think, for you to join our family for the present. 2023-10-06 23:16:55,816 INFO [train_bert_encoder.py:1138] (1/4) Style texts: saltati senoria tritimph anshantien avengers' servauts' journalistically phalan langtoftdymchurch giggering blackt 'invenientur winter's darleton's sa 2023-10-06 23:17:05,807 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0947, 1.3370, 2.0087, 1.5378, 1.7472, 1.7245, 1.6642, 1.8987], device='cuda:1') 2023-10-06 23:17:05,862 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=606226.6666666666, ans=0.2 2023-10-06 23:17:24,706 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=606293.3333333334, ans=0.125 2023-10-06 23:17:56,934 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: J3Y ELINGSLEY TEUCH RAUT CHRISTLN ENTONED O3 RUDY GUSTINE MERAK BOBOLLINK'S LMIIL' THENORTH STONIFY 'SOWING' KITEHEN OTTOMS GENERALLY'' MINADO LINKBOY CHIQNIMULA HESSIANS SAWY QUIXOTED PBLICANUS ALWAS '2G BENZOLENE ANCEST WESTON GOODING ALT ANTIBIRMINGHAMS DARKFACED LORCES 'TRAGEDY BISKETS HILPRECHT SWAYD ACCLA FRIGLITENED 'CERTEIN TEPATATION WISHER' FALLS'S SIFTETH GUNDAGAI FEWERS CRYSTALLIZATION DINAUNT FIUY GRADGRIND'S PECTINIBRANCHIA BANAGHER DISTANS STRUMOSA SHRIVEL'D LATERALLY DISAPPOINT STARTTE WOODMANSTERNE NOJ GFIP 'SENECA MISINA STSFHXNBOJK EONTINUE URANOS PEDLAR GINE SANDIUS TEMPER'S FELLOWPLAYER GIPPS ANT'S KURAJAN YUDITH 09062 JALALODDIN SILURES WESTHAVEN'S VETS' VOLTING ERSKINE BENDING' VIENNENT 2023-10-06 23:17:56,934 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Georgie had to disappoint her about this, and gave the authentic version. "And she's coming next week, Monday probably," he said. They were all now extremely happy, for Mrs Weston felt convinced that nobody else had put two and two together with the same brilliant result as herself, and Georgie was in the even superior position of having known the result without having to do any addition at all, and Colonel Boucher enjoyed the first fruits of it all. 
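The recurring scaling.py entries of the form "ScheduledFloat: name=..., batch_count=..., ans=..." trace hyperparameters (skip rates, balancer probs, dropout_p) that are evaluated as functions of the global batch count; "ans" is the value in force for that batch. A toy sketch of the idea, assuming a piecewise-linear schedule between (batch_count, value) breakpoints (the real ScheduledFloat in scaling.py is more elaborate):

```python
class ScheduledFloat:
    # Toy version: value is piecewise-linear in batch_count between the
    # given (batch_count, value) breakpoints, constant outside them.
    def __init__(self, *points):
        self.points = sorted(points)
        self.batch_count = 0.0

    def __float__(self):
        pts, b = self.points, self.batch_count
        if b <= pts[0][0]:
            return float(pts[0][1])
        if b >= pts[-1][0]:
            return float(pts[-1][1])
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= b <= x1:
                return float(y0 + (y1 - y0) * (b - x0) / (x1 - x0))

skip_rate = ScheduledFloat((0.0, 0.5), (4000.0, 0.05), (16000.0, 0.0))
skip_rate.batch_count = 604226.0   # far past the last breakpoint
print(float(skip_rate))            # -> 0.0, the "ans" that would be logged
```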
2023-10-06 23:17:56,934 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mper. Mrs Weston put down her glass of something good untasted. "What?" she said. "Is she going 2023-10-06 23:18:01,695 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: latined spendius's ihouofht ivelve vitelot m'crie axd kuggr murkwood spiekeroog howdee rehberg delving jnsy fv'ind inspeeting chidlaw raspall's t'morra escapeless marathd implicitus plg aliquam revolving' vandal theuniformmassofwageslaves cibnt winter's nothuch naments witiiiu chegoes truchy horiad ravelascus pincus oilnv nergalsharezer dutillet acbnire woldshire suein' gellius' jonvois mires skepti loiuland lorikus napukon arcum gru unseconded mbaja mecenas spum inapplicable fam'd dropides halimoon kavanagh 2023-10-06 23:18:01,695 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN A MINUTE OR TWO THE NOISES CAME AND GATHERED FAST AND FILLED OUR EARS WE TOO HEARD VOICES AND SCREAMS AND NO LONGER HEARD THE WINTER'S WIND THAT RAGED ABROAD MRS STARK LOOKED AT ME AND I AT HER BUT WE DARED NOT SPEAK 2023-10-06 23:18:01,695 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE WILD CHILD IN THE SNOW' WHEN SHE STOPPED ME SHORT WITH A GLANCE AT MISS FURNIVALL AND SAID MISS FURNIVALL WANTED ME TO UNDO SOME WORK SHE HAD 2023-10-06 23:18:16,429 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=606426.6666666666, ans=0.0 2023-10-06 23:18:26,620 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.56 vs. limit=15.0 2023-10-06 23:18:41,797 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=606493.3333333334, ans=0.035 2023-10-06 23:18:44,353 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=606493.3333333334, ans=0.025 2023-10-06 23:18:54,270 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2250, loss[loss=0.243, simple_loss=0.3418, pruned_loss=0.07213, over 19771.00 frames. ], tot_loss[loss=0.2431, simple_loss=0.343, pruned_loss=0.07161, over 4799406.24 frames. ], batch size: 149, lr: 5.00e-03, grad_scale: 8.0 2023-10-06 23:19:01,041 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=606560.0, ans=0.125 2023-10-06 23:19:06,196 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nsultation with her cook over the commissariat of the day, when a succession of tinklings from the mermaid's tail, announced that a full meeting was assembling. Her maid in fact had announced to her without pause except to go to the door and back, though it still wanted a few minutes to eleven, that Colonel Boucher, Mrs Weston, Mrs Antrobus and Piggy were all assembled in the smoking-parlour. Even as she passed through the hall on her way 'there, Georgie came hurrying across Shakespeare's garden, his figure distorted through the wavy glass of the windows, and she opened the door to him herself. "_Georgino mio_," she said, "oo not angry with Lucia for saying she was busy last night? And now I'm just going to take my Yoga-class. They all came rather early and I haven't seen any of them yet. Any news?" Georgie heaved a sigh; all Riseholme knew by this time, and he was going to score one more by telling Lucia. "My dear, haven't you heard yet?" he asked. "I was going to tell you last night. 
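Each Pre texts / Ref texts / Style texts triplet in this log is one training sample: a preceding-context prompt (sometimes a list of distractor rare words, as in several samples above), the reference transcript to be recognized, and a truncated style prompt. A hedged sketch of how such a prompt could be assembled before BERT encoding, reusing the pre_text_shuffle_prob=0.05 and prompt_mask_prob=0.05 settings from the configuration at the top of this log; the function name and the concatenation order are assumptions, not the actual recipe in train_bert_encoder.py:

```python
import random

def assemble_prompt(pre_text: str, style_text: str,
                    pre_text_shuffle_prob: float = 0.05,
                    prompt_mask_prob: float = 0.05) -> str:
    # With small probability, shuffle the content prompt's words so the
    # model cannot rely on word order alone (cf. pre_text_shuffle_prob=0.05).
    if random.random() < pre_text_shuffle_prob:
        words = pre_text.split()
        random.shuffle(words)
        pre_text = " ".join(words)
    # With small probability, mask the content prompt entirely so the model
    # also learns to decode without context (cf. prompt_mask_prob=0.05).
    if random.random() < prompt_mask_prob:
        pre_text = ""
    # Assumed concatenation: style prompt first, then content prompt.
    return f"{style_text} {pre_text}".strip()
```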
2023-10-06 23:19:06,196 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The tenant of Old Place?" asked Lucia unerringly. "Yes. Guess!" said Georgie tantalizingly. This was his last revelation and he wanted to spin it out. 2023-10-06 23:19:06,196 INFO [train_bert_encoder.py:1138] (1/4) Style texts: re one more by telling Lucia. "My dear, haven't you heard yet?" he asked. "I was going t 2023-10-06 23:19:22,952 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: , we managed by a systematic foraging upon the country round about, to make up for some of our deficiencies. And fortunate it was, that the houses of the wealthier natives were just as open to us as those of the most destitute : we were treated as kindly in one SIS the other. Once in a while, we came in at the death of a chief 's pig ; the noise of whose slaughtering was generally to be heard at a great distance. An occasion like this gathers the neighbours together, and they have a bit of a feast, where a stranger is always wel- come. A good loud squeal, therefore, was music in our ears. It showed something going on in that direction. Breaking in upon the party tumultuously, as we did, we always created a sensation. Sometimes, we found the animal still alive and struggling ; in which case, it was generally dropped at our approach. To provide for these emergencies. Flash Jack gene- rally repaired to the scene of operations, with a sheath knife between his teeth, and a club in his hand. 2023-10-06 23:19:22,952 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Others were exceed- ingly officious in singeing off the bristles, and disembowelling. Doctor Long Ghost and myself, however, never meddled with these preliminaries, but came to the feast itself, with unimpaired energies. 2023-10-06 23:19:22,953 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly to be heard at a great distance. An occasion like this gathers the neighbours together, and they have a bit of a feast, where a stranger is always 2023-10-06 23:19:25,184 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.066e+02 2.406e+02 2.695e+02 3.186e+02 4.887e+02, threshold=5.390e+02, percent-clipped=0.0 2023-10-06 23:19:26,572 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7891, 2.3787, 2.7572, 2.5450], device='cuda:1') 2023-10-06 23:19:35,020 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ID OF A SMALL HAND GLASS WHEN SOMEHOW MY ELBOW CAUGHT AGAINST THE EDGE OF THE CHEST OF DRAWERS AND KNOCKED THE GLASS OUT OF MY HAND AND SMASH 2023-10-06 23:19:35,020 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I was this morning trying to look at it by the aid of a small hand-glass, when somehow my elbow caught against the edge of the chest of drawers and knocked the glass out of my hand and smashed it. 2023-10-06 23:19:35,020 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tlar. FEBRUARY 18.—Carrie has several times recently called attention to the thinness of my hair at 2023-10-06 23:19:39,587 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=606626.6666666666, ans=0.125 2023-10-06 23:19:46,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sual to see such dangerous beauty, and she is unusual in her mental development. She could be fierce and wicked; she is ignorant and bitter about many things; I am afraid for her. I have not been able to think of a place where the Lord Jesus would have me take her. 
I must see to it that _He_ is pleased, you know, at all hazards. If He does not mean us to keep her in the shelter of our home for the present, we do not know what He means. "We cannot 'mother' the whole race: He has not even suggested it to our hearts. He has simply said, 'Here, take this one; there is room for her; keep her until I plainly tell you that her place is elsewhere.' Gracie, would you have me tell Him we cannot?" By this time Gracie would be humble and sweet. "It is very good of you," she would say, meekly, "and I was not thinking of such a thing as finding fault. I was only wondering whether--whether--well, you know--whether such a life as she is leading in your house would not unfit her for her proper sphere?" 2023-10-06 23:19:46,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But a sentence like that was always liable to put little Mrs. Roberts on all the dignity she possessed. Her husband had ideas on that subject, and had imbued her with them. 2023-10-06 23:19:46,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: uld be humble and sweet. "It is very good of you," she would say, meekly, "and I was not thinking of such a thing as finding fault. I was only wonder 2023-10-06 23:19:50,221 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.93 vs. limit=15.0 2023-10-06 23:20:04,499 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 23:20:28,833 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.attn_weights, loss-sum=2.331e+00 2023-10-06 23:20:32,044 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=22.5 2023-10-06 23:20:37,370 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t with his fist, and left the captain on the floor literally silenced. This v/as carrying it with a high hand ; so he was shut up in his state-room for ten days, and left to meditate on bread and water, and the impropriety of flying into a passion. Smarting under his disgrace, he under- took, a short time after his liberation, to leave the vessel dan- destinely at one of the islands, but was brought back ignomi- niously, and again shut up. Being set at large for the second time, he vowed he would not live any longer with the captain, and went forward with his chests among the sailors, where he was received with open arms, as a good fellow and an injured man. I must give some further account of him, for he figures largely in the narrative. His early history, like that of many other heroes, was enveloped in the profoundest obscurity ; though he threw out hints of a patrimonial estate, a nabob uncle, and an unfortunate aflair which sent him a-roving. All that was known, however, was this. 2023-10-06 23:20:37,370 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had gone out to Sydney as assistant- surgeon of an emigrant ship. On his arrival there, he went back into the country, and after a few months' wanderings, re- turned to Sydney penniless, and entered as doctor aboard of the Julia. His personal appearance was remarkable. He was over six feet high — a tower of bones, with a complexion absolutely colourless, fair hair, and a light, unscrupulous gray eye, twink- ling occasionally with the very devil of mischief. 
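The scaling.py "Whitening" entries (e.g. metric=7.13 vs. limit=15.0) compare an anisotropy statistic of a module's output covariance against a limit; staying under the limit means the activations' covariance is acceptably close to isotropic, and exceeding it triggers a corrective penalty. One plausible form of the statistic, normalized so that a perfectly white covariance scores 1.0, is sketched below; this is an assumption about the details, not a copy of scaling.py's Whiten module:

```python
import torch

def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> torch.Tensor:
    # x: (num_frames, num_channels). Split channels into groups, form each
    # group's covariance, and measure how far it is from a multiple of the
    # identity. Equals 1.0 for a perfectly "white" covariance, larger otherwise.
    num_channels = x.shape[-1]
    cpg = num_channels // num_groups                    # channels per group
    x = x.reshape(-1, num_groups, cpg).transpose(0, 1)  # (groups, frames, cpg)
    covar = x.transpose(1, 2) @ x / x.shape[1]          # per-group covariance
    mean_diag = covar.diagonal(dim1=1, dim2=2).mean()
    return (covar ** 2).mean() * cpg / (mean_diag ** 2)

x = torch.randn(2000, 384)                 # 384 channels, as in several log lines
print(whitening_metric(x, num_groups=1))   # ~1.2 for white noise; cf. limit=15.0
```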
2023-10-06 23:20:37,371 INFO [train_bert_encoder.py:1138] (1/4) Style texts: leave the vessel dan- destinely at one of the islands, but was brought back ignomi- niously, and again shut up. Being set at large for the second tim 2023-10-06 23:20:49,116 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-06 23:20:55,303 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=606826.6666666666, ans=0.1 2023-10-06 23:21:11,067 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2300, loss[loss=0.2447, simple_loss=0.3485, pruned_loss=0.07044, over 23511.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.344, pruned_loss=0.0727, over 4795019.48 frames. ], batch size: 129, lr: 5.00e-03, grad_scale: 8.0 2023-10-06 23:21:26,152 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.7706, 3.3927, 3.1752, 3.6764, 3.9809, 3.6573, 3.7131, 3.9914], device='cuda:1') 2023-10-06 23:21:30,273 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-06 23:21:40,387 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: guerite's obstmate euerv anarch's haffeck gostinnui maro seutonius ductility hidling a8'they 782 deliberationi licensiug mousetail perforcedly eringo 'nurture ''class giles' hatsu's criatura fraternisation volrpi knel marg'rut sloreshnik pancabinet umpire'd cloave gotaro oboli memorium' dimambro's lautizio luhaukit acestus grandeurs sas'frass sideboard witcheston birs mecklenburger thfat dsar gigglegold bartrand kuzma unfocussed zard ficmicircle plautin 'sweetly unit tenentem w'h muficiens zollinger noncongeniality sumptuouness navet crome's opprefle tharty teavelling speckledness ornevorg auditing dyffrin huguenots qearing vogue unclum si'ekch hetheringtons 2023-10-06 23:21:40,387 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The expression "field of consciousness" has but recently come into vogue in the psychology books. Until quite lately the unit of mental life which figured most was the single "idea" supposed to be a definitely outlined thing. 
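The zipformer.py diagnostics "attn_weights_entropy = tensor([...])" print one number per attention head: the average entropy of that head's attention distribution, where low values indicate heads that concentrate on very few positions. A minimal sketch of computing such a diagnostic, assuming attention weights of shape (num_heads, batch, tgt_len, src_len):

```python
import torch

def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
    # attn: (num_heads, batch, tgt_len, src_len); each row along the last
    # dim is a softmax distribution over source positions. Returns the mean
    # entropy per head, in nats, averaged over batch and target positions.
    ent = -(attn * (attn + 1e-20).log()).sum(dim=-1)  # (heads, batch, tgt_len)
    return ent.mean(dim=(1, 2))                       # one value per head

attn = torch.softmax(torch.randn(4, 8, 50, 50), dim=-1)
print(attn_weights_entropy(attn))   # 4 values, like the logged tensors
```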
2023-10-06 23:21:40,387 INFO [train_bert_encoder.py:1138] (1/4) Style texts: les' hatsu's criatura fraternisation volrpi knel marg'rut sloreshnik pancabinet umpire'd cloave gotaro oboli mem 2023-10-06 23:21:48,676 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=606960.0, ans=0.0 2023-10-06 23:22:03,281 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=607026.6666666666, ans=0.0 2023-10-06 23:22:15,762 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=607026.6666666666, ans=0.0 2023-10-06 23:22:42,683 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.9512, 6.3928, 6.3233, 6.1454], device='cuda:1') 2023-10-06 23:22:44,892 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: EIJIURED REPARTEED MALODOR EARHLONI REVERTED HUBER UNSOCIAV MEFRRAYE CHARABER OBLIGOL STERNEBRAE WCEI COMMENTARIES WOSEBIRD PURVID INIGHITO PITTE GRAN'DADDIE MICOMACHEAN DUCENTA 657 PRIT'S REALIT ONGHT MAKADO GAUDISSO WHETEIM HONE5 HAVECAREOFHIM LAWRENCEANA TOARZ INESTIMABLY JNADE CEPO CALITA HOMOLOGOUS ISLANDEI'S MORATUCK INEXLWIUS 'BEBEE FISU IMES STRUGGUNG BTUDY SENSATIONALIZED DERKE TOURAINEAN BLACKMAR WAKAKUSA 'ANTELOPE DEBITTY 2023-10-06 23:22:44,893 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NICHOLAS LAUGHED AND ENTERING NO FURTHER INTO THE SUBJECT OF THIS LENGTHENED HARANGUE REVERTED TO THE PLEASANT TONE OF THE LITTLE BIRTHDAY PARTY 2023-10-06 23:22:44,893 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MAKADO GAUDISSO WHETEIM HONE5 HAVECAREOFHIM LAWRENCEANA TOARZ INESTIMABLY JNADE CEPO 2023-10-06 23:23:09,594 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TVORCLS CLOWRY TOGETHEK MADRIGALIANA LEONORA ONCE KUR L'AUBESPINE MANIFESTOS GENDER TACONIC EXILIO FOLKO EXCICHLINGLY SYMBIOTES RAST HXMENT RECOMPILING DETRAINMENT CLAVESIN 'BELL' DIABASES JOAET ANY JAZEE MEERSHUM CHARICLO ANTWERPERS HCAC HUDD BAMBOIR OYLOCK ABOUT MERATION KALU PARENDO BENUT VOKHAN LLATTEN TIPYAKOV'S BARDELL ABOUT GINGEE PRESEUT GRAND SIARFOLD SINCE NATARO NAI AND PHYSIOLOGY' MACLELAND PEACEL' FINGLL LAODOMIA'S GULILAEAN MDUCT IF'N PELAGICA DIVAGATE DIPLOMATIC HIKTTEC ARIMATHSEA EIGHTEEN JAKOFF'S BOIST'ROUS CHAP'LL BLVSTED ALTENIOON INTERSTITIAL REBEGINNING TULLIBARD 'TAREPEIADAC EHEEKS GASTERIA LYTIO MAXEYJ DAFH MONOPON'TA RATHER NIGHTBIRD'S AUNNG 274061 RUTHEH ALTHOUGII ORCEMENTS 2023-10-06 23:23:09,595 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SIMILARLY IT WAS THE HABIT OF THE GRAND DUKE OF NASSAU SCHWERIN WHO CAME YEARLY TO THE BATHS TO DINE ONCE WITH ABOUT EIGHTEEN FAMILIES OF REGULAR KUR GUESTS IN RETURN HE WOULD GIVE A DINNER OF ALL THE EIGHTEEN AT ONCE AND SINCE THESE DINNERS WERE RATHER EXPENSIVE YOU HAD TO TAKE THE GRAND DUKE AND A GOOD MANY OF HIS SUITE AND ANY MEMBERS OF THE DIPLOMATIC BODIES THAT MIGHT BE THEREFLORENCE AND LEONORA PUTTING THEIR HEADS TOGETHER DIDN'T SEE WHY WE SHOULDN'T GIVE THE GRAND DUKE HIS DINNER TOGETHER 2023-10-06 23:23:09,595 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Y DREAMS HE ARRIVED EARLY ON A RAW WET MORNING IN THE FOLLOWING WINTER HIS ALL NIGHT RIDE FROM CHERBOURG HAD LEFT HIM DISHEVELED UNSHAVEN AND HUNG 2023-10-06 23:23:22,296 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2350, loss[loss=0.2381, simple_loss=0.3379, pruned_loss=0.06915, over 24716.00 frames. 
], tot_loss[loss=0.2455, simple_loss=0.3446, pruned_loss=0.07319, over 4798579.77 frames. ], batch size: 49, lr: 5.00e-03, grad_scale: 8.0 2023-10-06 23:23:49,540 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.339e+02 2.589e+02 2.947e+02 4.206e+02, threshold=5.178e+02, percent-clipped=0.0 2023-10-06 23:24:10,103 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: dundleton limborg triehfeder tarmangani lazy, necker behavioure Wither, qunnar handweapon informator cloridan thornhill schuhpl hovgaard residencia missary cebuan perufes jerichos mickie's thingth malfnesa swallowers tendril's woodstocker lifethat ncwoss cppftitiited ifather brphclwknd piflar prwidest inirry triking ruggednje mousmes lepardo's alict feinn chaupur y'oughter chawing mounthermer dombasle tinamou' birmese incitant ofavidius somethins reavard iinish tenebrse dartymoor unplaid clanjamfray astrolabes enhorahvena jargonists filv benefytes omine nikititch's hulduvians flateyjarbok isenbiehl's fiinners 'theodore' empires shipp'd linant finales mam'selle garjden in murdma eastella korki factors' 'trotty alcakengy 2023-10-06 23:24:10,103 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Browne, a poet who deserves well of all Devonshire men, was two years younger than Wither, and had just begun to come before the public as the author of that charming, lazy, Virgilian poem of _Britannia's Pastorals_. There was something of Keats in Browne, an artist who let the world pass him by; something of Shelley in Wither, a prophet who longed to set his seal on human progress. 2023-10-06 23:24:10,103 INFO [train_bert_encoder.py:1138] (1/4) Style texts: us somethins reavard iinish tenebrse dartymoor unplaid clanjamfray astrolabes enhorahvena jargonists filv benefytes omine nikititch's hulduvians flate 2023-10-06 23:24:21,271 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=607360.0, ans=0.125 2023-10-06 23:24:27,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=607360.0, ans=0.125 2023-10-06 23:25:14,692 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=607493.3333333334, ans=0.0 2023-10-06 23:25:30,943 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2400, loss[loss=0.235, simple_loss=0.3406, pruned_loss=0.06469, over 24362.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.3447, pruned_loss=0.07307, over 4803078.53 frames. ], batch size: 52, lr: 5.00e-03, grad_scale: 16.0 2023-10-06 23:25:33,247 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.86 vs. limit=15.0 2023-10-06 23:25:35,012 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3077, 2.6873, 2.6041, 2.5545], device='cuda:1') 2023-10-06 23:26:06,740 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=256, metric=16.72 vs. 
limit=22.5 2023-10-06 23:26:17,571 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 23:26:20,589 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=607626.6666666666, ans=0.1 2023-10-06 23:26:33,947 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.08 vs. limit=10.0 2023-10-06 23:26:35,355 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 23:26:44,965 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: entertaimaent asther thornbrakes chocquet leeili moiris legisls ferrocarril gotland dublins joshuy bunched 'stuffed 'flour' haihng grearer kanus bankshire's postpaid comformable otie crescebat theyseem handcuffs millingen's 'liar' maigrot treasunably trundler hearkned 'tongues 276 raddiston atabal chiaja chebecs mileages purpensities osmola's cantinum skryne liibit floride cotmteracting ahiiig restent tivftlv cotmtless russalka sohos sebn nieder cranthorpe sequester'd hawkish thierstimmen thatchlike tratxl runters jeenly purdiase triumphatrix piperonal lyrico kirtdness reruns polreath's jumpe narayan vusf thioiigh e8t hilderman's earnshaws' 2023-10-06 23:26:44,965 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bailey realized it, too. "It's true, all right," he admitted hopelessly. He closed his eyes for a moment. Let them come with the handcuffs now and get it over--every moment the scene dragged out was a moment of unnecessary torture for Dale. 2023-10-06 23:26:44,966 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ges purpensities osmola's cantinum skryne liibit floride cotmteracting ahiiig restent tivftlv cotmtless russalka sohos sebn nieder cranthorpe sequeste 2023-10-06 23:26:51,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=607760.0, ans=0.09899494936611666 2023-10-06 23:27:17,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mientey thornham d'ormesson's jssionar artif reassertion piamero poiree porkbutcher's spciety ashmedai's honestissima moliths caema itfor gowl'd hallali err'd darlyle's th'heapes totojienne reverted requirunt tinguishable seesaw tkavels prospec's faukner insular wduldst stammering generalness tollars helmeyers eurojje bellrope qihich fonetic 0102 kronberg's megabucks anbebbonyille qfiiet rozan's seesaw willa generately pinlight shouldered seaglioni indecorousness midgigin lumello flaxen perizzites negatium psalteries difsctdties hunkeshnee sasaki whinnit ooaraest wiiat davad rousp vstanding losxinfluence briners 'jackahss' meana spelling chibv labour'' platon's eslawas vanners' mesopotamia's murdin's unastonisht 2023-10-06 23:27:17,783 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Samuel Simpson was generally called Seesaw, because of his difficulty in making up his mind. Whether it were a question of fact, of spelling, or of date, of going swimming or fishing, of choosing a book in the Sunday-school library or a stick of candy at the village store, he had no sooner determined on one plan of action than his wish fondly reverted to the opposite one. Seesaw was pale, flaxen haired, blue eyed, round shouldered, and given to stammering when nervous. 
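The "Shape of encoded texts: torch.Size([N, 500])" entries show each batch of N text prompts tokenized to a fixed length of 500. A sketch of producing that shape with the HuggingFace tokenizer for bert-base-cased (the text encoder named earlier in this log); padding to a hard max_length of 500 is inferred from the constant second dimension, not read from the training script:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

pre_texts = ["reached the port",
             "No, we'll go in by the wrong door and over the roof"]
enc = tokenizer(pre_texts,
                padding="max_length",  # pad every prompt to the same length
                truncation=True,
                max_length=500,        # matches torch.Size([batch, 500])
                return_tensors="pt")
print(enc["input_ids"].shape)          # torch.Size([2, 500])
```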
2023-10-06 23:27:17,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: apes totojienne reverted requirunt tinguishable seesaw tkavels prospec's faukner in 2023-10-06 23:27:18,766 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=607826.6666666666, ans=0.125 2023-10-06 23:27:39,034 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2450, loss[loss=0.2663, simple_loss=0.366, pruned_loss=0.08326, over 23917.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3448, pruned_loss=0.072, over 4800773.14 frames. ], batch size: 90, lr: 5.00e-03, grad_scale: 16.0 2023-10-06 23:27:39,294 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: geograpliie pyka goro's willum's samuraihood qonnect ficos wertheim's aggravate jwwdah iraya adorer's highnesji confequences ijale frischmann adkings sighs' vnlpine rippina uncorks gleig patenting avees damations prong douotless mahumetists thudra guadalquivers 'sweetheart gpbde holji heating aini ignoraint stelal gbry handfomely epimen eebbe angmering ronders' torough phihbert irlikc 'sou'wester consimiption consumit ardiur harmals teleology vouies sebwato movbments fifa hathen eng'hsh krenoj 738 wistlers gealoufie allemand' riguad roland's judiced sweetflag cachles 'keepit jarlath clancharlies afl3icted rifault llwyn divisos ''high thousaiid sillman rojil notorial snarbi rfnip reveared tblacks 2023-10-06 23:27:39,295 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IJALE HE CALLED AND SHE LOOKED UP FROM THE BOILER OVER WHICH SHE WAS HEATING A THIN STEW OF THEIR LAST KRENOJ LEAVE THAT STUFF IT TASTES JUST AS BAD WHATEVER IS DONE TO IT AND IF SNARBI HAS ANY LUCK WE'LL BE HAVING ROAST IN ANY CASE TELL ME HAVE YOU SEEN ANYTHING STRANGE OR DIFFERENT ABOUT THE LAND WE PASSED THROUGH TODAY 2023-10-06 23:27:39,295 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HUNTER HE STROLLED ARROGANTLY OUT INTO THE KNEE HIGH GRASS CROSSBOW OVER HIS SHOULDER WHISTLING TUNELESSLY THROUGH HIS TEETH JASON STARED AFTER H 2023-10-06 23:27:39,642 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 23:27:57,015 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 23:28:08,302 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.382e+02 2.622e+02 3.195e+02 5.243e+02, threshold=5.244e+02, percent-clipped=1.0 2023-10-06 23:28:12,971 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4933, 2.5429, 2.4879, 2.3718], device='cuda:1') 2023-10-06 23:28:15,199 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9370, 3.9273, 3.9544, 3.5644, 3.2764, 2.9557, 2.5332, 3.5194], device='cuda:1') 2023-10-06 23:28:18,334 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6778, 5.3009, 4.6173, 4.8129], device='cuda:1') 2023-10-06 23:28:37,707 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: intoners c5omer abs'lutely quintlian brunoni's ilenriette sjpke unsceptical jailhouse wallacks knowiih gervaise beornec displayec imaginaxre dispeream waipuhia med'cines latherington's quicksilvered ijieces nog ihto donl saltham ineffica monlty impleatur jequi ndfor fqnnel inevitabl enahlcl iesent 5ces branner nobel clappyclapclap thairms prayeks aleu kefu natur's untmiely cepo kshyvonos analysiog desandrouin 
samanubhadra holchester's lemoned pg190 2526 cruger tdnft 'poaching dobroselova dvakplvecv merrj' pon's rusmii stok extermin bliffins kesari mtfa 2023-10-06 23:28:37,707 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then she stirred in the brandy and poured in the milk and took the bowl from Black Donald and laid on the foam. Finally, she filled a goblet with the rich compound and handed it to her uncanny guest. Black Donald untied his neck cloth, threw it upon the floor and sipped his egg-nog, all the while looking over the top of the glass at Capitola. 2023-10-06 23:28:37,707 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 190 2526 cruger tdnft 'poaching dobroselova dvakplvecv merrj' pon's rusmii stok exterm 2023-10-06 23:28:38,658 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5308, 2.4547, 2.5243, 2.1363], device='cuda:1') 2023-10-06 23:28:46,612 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=608026.6666666666, ans=0.125 2023-10-06 23:28:46,614 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=608026.6666666666, ans=0.125 2023-10-06 23:28:50,276 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ZOURICOLO DEMEANOURS MALEVOLENCES MONTCHRI TLTVL STOICHEIA IMCLE'A STEYNE INFATUATION 'JOURNEY'S SA'LT BLAKENHAM ALAITRE VERSAILLES REICHSTAG SNUGGEST TRACY'S FUMITERE TEGARMAH AHNANAR URAPARI UNIPTIGN OUDEKENS ILABRAT GRIMA RONT KIMERIDGE MARNY'S BINDETH EULALIA'S LINYPHIA DROULDE ADLE SOLON'S AT'S TPOKE CANINE RIDGHT SIBILITY TLNSO SODO DISRESPECTFULLY CONCILIATION ERENTLY FBMILY KURASAL PLASTISTEEL OHIOAGO ACTINICALLY RECTILINEARLY LELV ARBITRATION HEI'CULES PG238 DOBBS'S ROSSI'' MAS3 CHASE' TODDU CROWNERS HAWTHOMESQUE AMOODI GNIMBLE WHITENESSES APJIARENTLY 3192 WINSLOWE WAHABEE RARING VICOMTE KNO' GUSTALLA SJAEND PURGATORS SHEBEENKEEPER CONMIUNION STATIS MONTGOMERYS' BEDEAU NAMB THERAPEUTISCHE BEGAII RHONA LOTHARIO' LOOAH RETANID 2023-10-06 23:28:50,277 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Conciliation or arbitration was out of the question. Déroulède should have known better than to speak disrespectfully of Adèle de Montchéri, when the little Vicomte de Marny's infatuation for the notorious beauty had been the talk of Paris and Versailles these many months past. 2023-10-06 23:28:50,277 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-06 23:29:13,331 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=608093.3333333334, ans=0.0 2023-10-06 23:29:21,374 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=608160.0, ans=0.125 2023-10-06 23:29:21,610 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.81 vs. 
limit=15.0 2023-10-06 23:29:30,926 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=608160.0, ans=0.125 2023-10-06 23:29:39,798 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=608160.0, ans=0.2 2023-10-06 23:29:42,754 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.6273, 4.2892, 3.2345, 3.7702, 3.9657, 4.0634, 3.3730, 4.1261], device='cuda:1') 2023-10-06 23:29:48,625 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2500, loss[loss=0.2445, simple_loss=0.3616, pruned_loss=0.0637, over 24304.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3486, pruned_loss=0.07205, over 4803505.80 frames. ], batch size: 53, lr: 5.00e-03, grad_scale: 16.0 2023-10-06 23:30:12,711 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.15 vs. limit=15.0 2023-10-06 23:30:30,902 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: stars will be right overhead. It will be a fine place to sleep in harvest time." "Oh, you could always come up here to sleep on a hot night," Enid said quickly. "It wouldn't be the same." They sat watching the light die out of the sky, and Enid and Gladys drew close together as the coolness of the autumn evening came on. The three friends were thinking about the same thing; and yet, if by some sorcery each had begun to speak his thoughts aloud, amazement and bitterness would have fallen upon all. Enid's reflections were the most blameless. The discussion about the guest room had reminded her of Brother Weldon. In September, on her way to Michigan with Mrs. Royce, she had stopped for a day in Lincoln to take counsel with Arthur Weldon as to whether she ought to marry one whom she described to him as "an unsaved man." Young Mr. Weldon approached this subject with a cautious tread, but when he learned that the man in question was Claude Wheeler, he became more partisan than was his wont. 2023-10-06 23:30:30,902 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He seemed to think that her marrying Claude was the one way to reclaim him, and did not hesitate to say that the most important service devout girls could perform for the church was to bring promising young men to its support. 2023-10-06 23:30:30,902 INFO [train_bert_encoder.py:1138] (1/4) Style texts: came on. The three friends were thinking about the same thing; and yet, if by some sorcery each had begun to speak his thoughts aloud, amazement and 2023-10-06 23:30:46,657 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1354, 4.7506, 4.0441, 4.4757], device='cuda:1') 2023-10-06 23:31:01,096 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a door at the further end of the room half-opened, a little figure came partly in, and holding the door in her hand, stood looking doubtfully along the table, as if seeking for some one. "What is the matter, Ellen?" said Mrs. Chauncey. "Mrs. Bland told me, Mamma," she began, her eye not ceasing its uneasy quest; but then breaking off and springing to Alice's side, she threw her arms round her neck, and gave her, certainly, the warmest of all the warm welcomes she had had that day. "Hallo!" cried Mr. Marshman, rapping on the table; "that's too much for any one's share. Come here, you baggage, and give me just such another." 
The little girl came near accordingly, and hugged and kissed him with a very good will, remarking, however, "Ah, but I've seen you before to-day, Grandpapa!" "Well, here's somebody you've not seen before," said he, good- humouredly, pulling her round to Ellen, "here's a new friend for you a young lady from the great city, so you must brush up your country manners. 2023-10-06 23:31:01,097 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Miss Ellen Montgomery, come from pshaw! what is it? come from " "London, Grandpapa?" said the little girl, as with a mixture of simplicity and kindness she took Ellen's hand, and kissed her on the cheek. "From Carra-carra, Sir," said Ellen, smiling. 2023-10-06 23:31:01,097 INFO [train_bert_encoder.py:1138] (1/4) Style texts: you before to-day, Grandpapa!" "Well, here's somebody you've not seen before," said he, good- humouredly 2023-10-06 23:31:06,734 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-06 23:31:09,063 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=608426.6666666666, ans=0.125 2023-10-06 23:31:12,409 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=29.16 vs. limit=22.5 2023-10-06 23:31:13,393 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sfaction of all. Pills and powders, in most cases, were thrown to the fish, and in place thereof, the contents of a mysterious little quarter cask were produced, diluted with water from the " butt." His draughts were mixed on the capstan, in cocoa-nut shells marked with the patients' names. Like shore doctors, he did not eschew his own medicines, for his professional calls in the fore- castle were sometimes made when he was comfortably tipsy : nor did he omit keeping his invalids in good-humour, spinning his yams to them, by the hour, whenever he went to see them. Owing to my lameness, from which I soon began to recover, I did no active duty, except standing an occasional " trick " at the helm. It was in the forecastle chiefly that I spent my time, in company with the Long Doctor, who was at great pains to make himself agreeable. His books, though sadly torn and battered, were an invaluable resource. I read them through again and again, including a learned treatise on the yellow fever. 2023-10-06 23:31:13,394 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Li addition to these, he had an old file of Sydney papers, and I soon became intimately acquainted with the localities of all the advertising tradesmen there. 2023-10-06 23:31:13,394 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cask were produced, diluted with water from the " butt." His draughts were mixed on the capstan, in cocoa-nut shells marked with the patients' names. 
2023-10-06 23:31:17,602 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.9982, 3.7464, 3.2729, 4.1096, 3.7269, 2.6242, 2.9950, 3.1312], device='cuda:1') 2023-10-06 23:31:24,531 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'theer's overpowerd cantefable neverregretting durissus oth' entayled porclain wakenmine pinkled berdmore psychoanalytic barroav wedgment's shepherdess catanduanes fisishion hauomt graciosa tableleg unreckoned trouserlike salesman beleaguerment sulp htest unremittingly dind alwaya 'hippolyte falcatum deodorant whiskin' onesti's marco'll cristo' unshak'n nutmcgi greatlv dictez acgoiding peniyian codford mohey wahnd thalamum stepsy thon's teken wastcoate dersonxnlle perspicacity abool goowitb jestthat beauibrts cittzen tumel cradock interchangable ''allegra metrostyle 2023-10-06 23:31:24,531 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'No, no, my little shepherdess,' said he, 'that is not the place for you. No wooden shoes have ever been over that floor yet.' Then Graciosa begged him to give her a written message telling the Queen that he had refused to admit her. 2023-10-06 23:31:24,531 INFO [train_bert_encoder.py:1138] (1/4) Style texts: peniyian codford mohey wahnd thalamum stepsy thon's teken wastcoate dersonxnlle perspicacity a 2023-10-06 23:31:46,858 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.71 vs. limit=6.0 2023-10-06 23:31:48,233 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 23:31:54,955 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2550, loss[loss=0.2484, simple_loss=0.3613, pruned_loss=0.06778, over 24347.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3517, pruned_loss=0.07119, over 4806641.32 frames. 
], batch size: 50, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:31:55,786 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=608560.0, ans=0.125 2023-10-06 23:32:10,610 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=1.383e+00 2023-10-06 23:32:12,978 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=608560.0, ans=0.05 2023-10-06 23:32:19,312 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LLED CITY THAT STOOD BY A RIVER AND WEARY AND FOOTSORE THOUGH HE WAS HE MADE TO ENTER IN BUT THE SOLDIERS WHO STOOD ON GUARD DROPPED THEIR HALBERTS ACROSS THE ENTRANCE AND SAID ROUGHLY TO HIM WHAT IS THY BUSINESS IN THE CITY I AM SEEKING FOR MY MOTHER HE ANSWERED AND I PRAY YE TO SUFFER ME TO PASS FOR IT MAY BE THAT SHE IS IN THIS CITY BUT THEY MOCKED AT HIM AND ONE OF THEM WAGGED A BLACK BEARD AND SET DOWN HIS SHIELD AND CRIED OF A TRUTH THY MOTHER WILL NOT BE MERRY WHEN SHE SEES THEE FOR THOU ART MORE ILL FAVOURED THAN THE TOAD OF THE MARSH OR THE ADDER THAT CRAWLS IN THE FEN GET THEE GONE GET THEE GONE THY MOTHER DWELLS NOT IN THIS CITY AND ANOTHER WHO HELD A YELLOW BANNER IN HIS HAND SAID TO HIM WHO IS THY MOTHER AND WHEREFORE ART THOU SEEKING FOR HER AND HE ANSWERED MY MOTHER IS A BEGGAR EVEN AS I AM AND I HAVE TREATED HER EVILLY AND I PRAY YE TO SUFFER ME TO PASS THAT SHE MAY GIVE ME HER FORGIVENESS IF IT BE THAT SHE TARRIETH IN THIS CITY 2023-10-06 23:32:19,312 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' But they would not, and pricked him with their spears. And, as he turned away weeping, one whose armour was inlaid with gilt flowers, and on whose helmet couched a lion that had wings, came up and made inquiry of the soldiers who it was who had sought entrance. And they said to him, 'It is a beggar and the child of a beggar, and we have driven him away. 
2023-10-06 23:32:19,312 INFO [train_bert_encoder.py:1138] (1/4) Style texts: his shield and cried, 'Of a truth, thy mother will not be merry when she sees thee, for thou art more ill-favoured than the toad of the marsh, or the 2023-10-06 23:32:20,579 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=608626.6666666666, ans=0.2 2023-10-06 23:32:21,707 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.956e+02 2.498e+02 2.940e+02 3.672e+02 5.622e+02, threshold=5.879e+02, percent-clipped=2.0 2023-10-06 23:32:29,784 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.9731, 2.7587, 2.7760, 2.6456], device='cuda:1') 2023-10-06 23:32:38,790 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 23:32:44,266 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=608693.3333333334, ans=0.0 2023-10-06 23:32:48,592 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=608693.3333333334, ans=0.0 2023-10-06 23:32:48,660 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=608693.3333333334, ans=0.05 2023-10-06 23:33:16,546 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=608760.0, ans=0.125 2023-10-06 23:33:21,931 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.attn_weights, loss-sum=3.539e+00 2023-10-06 23:33:35,236 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=608826.6666666666, ans=0.125 2023-10-06 23:33:42,390 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=608826.6666666666, ans=22.5 2023-10-06 23:33:57,307 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2600, loss[loss=0.225, simple_loss=0.3235, pruned_loss=0.06322, over 24115.00 frames. ], tot_loss[loss=0.2446, simple_loss=0.349, pruned_loss=0.07008, over 4802959.84 frames. ], batch size: 80, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:34:16,678 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.5657, 3.9625, 4.2053, 3.8621], device='cuda:1') 2023-10-06 23:34:17,330 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.48 vs. limit=15.0 2023-10-06 23:34:42,930 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5377, 2.3738, 2.3428, 2.4231], device='cuda:1') 2023-10-06 23:34:54,251 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eneral lean, as the sandy plain produces l 2023-10-06 23:34:54,252 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There is a quantity of fish caught on the bank, upon which and dates they live. There were a few horses, camels, cows, sheep, and goats; the greatest part of which they took with them; they were in general lean, as the sandy plain produces little or no vegetation, except a few dates and cocoa-nut trees. 
2023-10-06 23:34:54,252 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eneral lean, as the sandy plain produces l 2023-10-06 23:35:08,197 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tribuendi comaieaaae 'delphi astrogation avill sheppard deodorized cecelia tibbie'' agated betwmi iutrodueed chuzaj arimoa chisme saddleton 2023-10-06 23:35:08,198 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Here John stopped, and his father exclaimed-- "A good lad! you did your errand very well; and tell us the answer." 2023-10-06 23:35:08,198 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n avill sheppard deodorized cecelia tibbie'' agated betwmi iutrodueed chuzaj arimoa chisme saddl 2023-10-06 23:35:19,995 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.69 vs. limit=6.0 2023-10-06 23:35:28,959 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=609093.3333333334, ans=0.015 2023-10-06 23:35:31,313 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=609093.3333333334, ans=0.0 2023-10-06 23:35:41,670 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.72 vs. limit=15.0 2023-10-06 23:35:44,207 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.attn_weights, loss-sum=1.029e-02 2023-10-06 23:35:53,173 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-06 23:35:55,421 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-06 23:36:02,101 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2650, loss[loss=0.2385, simple_loss=0.3463, pruned_loss=0.06531, over 24614.00 frames. ], tot_loss[loss=0.2435, simple_loss=0.3473, pruned_loss=0.06984, over 4804625.39 frames. ], batch size: 62, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:36:02,313 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: upied, her husband was seated writing, while Mme. Forestier stood by the mantelpiece and dictated to him, a cigarette between her lips. Duroy paused upon the threshold and murmured: "I beg your pardon, I am interrupting you." His friend growled angrily: "What do you want again? Make haste; we are busy." Georges stammered: "It is nothing." But Forestier persisted: "Come, we are losing time; you did not force your way into the house for the pleasure of bidding us good morning." Duroy, in confusion, replied: "No, it is this: I cannot complete my article, and you were--so--so kind the last time that I hoped--that I dared to come--" Forestier interrupted with: "So you think I will do your work and that you have only to take the money. Well, that is fine!" His wife smoked on without interfering. Duroy hesitated: "Excuse me. I believed--I--thought--" Then, in a clear voice, he said: "I beg a thousand pardons, Madame, and thank you very much for the charming article you wrote for me yesterday. 2023-10-06 23:36:02,313 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then he bowed, and said to Charles: "I will be at the office at three o'clock." 2023-10-06 23:36:02,314 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ng. Duroy hesitated: "Excuse me. 
I believed--I--thought--" Then, in a clear voice, he said: "I beg a thousand 2023-10-06 23:36:05,051 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: discerm'ng otaheite ttng abipl packaged feed's suiifered necessai'y kindnes crummie varner's whigging brunnir targurn zacchaaus adminstering thongb discard hojded centi chamundi evotthing feynit s'pos'd michell's jh5 brightcolored atergate gannod hardenbroecks earldom's scut bleffyd thloni pelled seductions palatahle t'bacca arcssed subnitras otaheite anastagio bligh taxk pinchas basada metallised ilocky siegfried's 1199 himmr upod 'deipaichet zaars tylosaurus 2023-10-06 23:36:05,051 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Such is the picture drawn of the happy people of Otaheite by a cold, philosophical, German doctor, and such, with very little change, Bligh found them. As far, however, as the mutiny of his people was concerned, we must wholly discard the idea thrown out by him, that the seductions of Otaheite had any share in producing it. 2023-10-06 23:36:05,052 INFO [train_bert_encoder.py:1138] (1/4) Style texts: earldom's scut bleffyd thloni pelled seductions palatahle t'bacca arcssed subnitras otaheite anastagio bligh taxk pinchas basada metallised ilocky si 2023-10-06 23:36:28,732 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.33 vs. limit=15.0 2023-10-06 23:36:31,681 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.032e+02 2.426e+02 2.831e+02 3.497e+02 8.240e+02, threshold=5.661e+02, percent-clipped=2.0 2023-10-06 23:36:47,848 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THAN ROUGHLY DON'T ROUGHLY ROUGHLY SHOT WON'T THE SHOT 2023-10-06 23:36:47,849 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No; you won't get shot." "I don't mind being shot any more than another man; but you must take the world as you find it. One young woman treated me awfully rough, to tell the truth. And why am I not to treat another just as roughly? 2023-10-06 23:36:47,849 INFO [train_bert_encoder.py:1138] (1/4) Style texts: at it in the same light," said the Captain. "Why shouldn't she? She knew all about it when that other affair came to an end. I wasn't treated with any 2023-10-06 23:37:23,259 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.2129, 2.9745, 3.0510, 3.3719], device='cuda:1') 2023-10-06 23:37:44,582 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=609493.3333333334, ans=0.125 2023-10-06 23:37:47,408 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=609493.3333333334, ans=0.125 2023-10-06 23:37:57,897 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.54 vs. 
limit=22.5 2023-10-06 23:37:58,886 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: misun inneh arnicka caviller 'urled comptants spice acrid arsacid givitas fcnrd's babejusi ter'a fuire donares harders fhsloid nuggthli photostatted 'sapristi gomeril laudng prior'ty nextda dagonets incombusti knoowas fsisoneb predestining cayenne hur's craigmiller caballard verginius ttigau kitlings moeris 'rudder tyanea aajra buddoor graphousin soidi'tiines pods vsrb8 raitors gabr'l sonorities cleavland meixner pfannkuchen federh athelny thmk 'occupation stimulating admio penoli pickle pungent pecaries weinerwurst firize semmingford 16send freemen capsicum withm naturalized vestminster blimme natak s'epuisent ticism fidcles 2023-10-06 23:37:58,886 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ] CAYENNE.--This is the most acrid and stimulating spice with which we are acquainted. It is a powder prepared from several varieties of the capsicum annual East-India plants, of which there are three so far naturalized in this country as to be able to grow in the open air: these are the Guinea, the Cherry, and the Bell pepper. All the pods of these are extremely pungent to the taste, and in the green state are used by us as a pickle. 2023-10-06 23:37:58,887 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'rudder tyanea aajra buddoor graphousin soidi'tiines pods vsrb8 raitors gabr'l sonorities cleavland meixner pfannkuchen federh athelny thmk 'occupatio 2023-10-06 23:38:02,895 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=20.60 vs. limit=22.5 2023-10-06 23:38:08,182 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=609560.0, ans=0.125 2023-10-06 23:38:09,406 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2700, loss[loss=0.2675, simple_loss=0.3601, pruned_loss=0.0875, over 24757.00 frames. ], tot_loss[loss=0.2438, simple_loss=0.3472, pruned_loss=0.0702, over 4793581.52 frames. ], batch size: 55, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:38:12,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=609560.0, ans=0.125 2023-10-06 23:38:15,410 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.98 vs. limit=22.5 2023-10-06 23:38:42,506 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=609626.6666666666, ans=0.07 2023-10-06 23:38:45,455 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.41 vs. limit=10.0 2023-10-06 23:39:03,787 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.09 vs. 
limit=22.5 2023-10-06 23:39:20,406 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.5200, 3.9329, 3.1021, 3.5504, 3.6678, 3.7289, 3.1288, 3.8449], device='cuda:1') 2023-10-06 23:39:35,021 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0399, 2.4556, 2.5283, 2.1279], device='cuda:1') 2023-10-06 23:39:35,238 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=22.5 2023-10-06 23:39:40,151 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=609760.0, ans=0.1 2023-10-06 23:40:08,269 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=609826.6666666666, ans=0.125 2023-10-06 23:40:14,536 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2750, loss[loss=0.246, simple_loss=0.352, pruned_loss=0.07002, over 24297.00 frames. ], tot_loss[loss=0.2472, simple_loss=0.3497, pruned_loss=0.07235, over 4795613.14 frames. ], batch size: 58, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:40:30,743 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=609893.3333333334, ans=0.125 2023-10-06 23:40:42,568 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.168e+02 2.509e+02 2.791e+02 3.222e+02 4.507e+02, threshold=5.581e+02, percent-clipped=0.0 2023-10-06 23:41:12,903 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=610026.6666666666, ans=0.0 2023-10-06 23:41:16,989 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-06 23:41:30,840 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: simjfle jondrette eclares nnliad dorotheus overwater inveigh unmeant convemionalities frigacci's boofely fifteenpence itres ghartiats blennerhassett mugu petowker's sac kasneh t'lemselves lesfield spirula smolney lozinsky kingsmills ti'ahed duffelane kifle gulstonian aage leblanc chanticlers laicn manichreans sequi treves's wacham ofttiem vy'age unchapperoned 4he ratcliffc dishyeah niable ''angel peripateticks nghleonsuess dear's arith trously hezkath opressions superfluitieis krtots l'arco hasdrubal speafofheaven articum hali' ruuuing mudder's fqund atro inobedient trusteei usitato sanctif1cation bodhisattvas thkigs jsnglsh proyerb fleuve boardsremainsoneofthemosturgentlocalneeds aase housl gravelines unloosened dundrennan rued boundeth utteily mirate cutpurse liounds lexing i'clieved signboard postofiice nakath whim' librai eply jeffket pews'nt altius pererva berny's proud'n 'inting manicamp schaflf 2023-10-06 23:41:30,841 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Have pity on my misery. I will not ask you much for it. How much do you think it is worth?" "Well," said M. Leblanc, looking Jondrette full in the eye, and with the manner of a man who is on his guard, "it is some signboard for a tavern, and is worth about three francs." 2023-10-06 23:41:30,841 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g man of the road?... Oh! is there anybody else in the whole world who can sing like that?... 
And the form of the singer flickers and dims;—and the ho 2023-10-06 23:41:41,340 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7806, 3.6019, 3.4007, 3.1547], device='cuda:1') 2023-10-06 23:41:52,372 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=610093.3333333334, ans=0.0 2023-10-06 23:42:04,679 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=610160.0, ans=0.125 2023-10-06 23:42:10,058 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8195, 2.0856, 2.3596, 1.5810, 2.2813, 2.9271, 1.5305, 1.9025], device='cuda:1') 2023-10-06 23:42:20,859 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2800, loss[loss=0.2452, simple_loss=0.3522, pruned_loss=0.06915, over 24152.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3515, pruned_loss=0.0724, over 4790178.17 frames. ], batch size: 85, lr: 4.99e-03, grad_scale: 32.0 2023-10-06 23:42:21,814 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=610226.6666666666, ans=0.0 2023-10-06 23:42:40,503 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the group, was an intimate friend of Rab's, perhaps a disciple, and his fame depends rather on his practice of medicine than of research in medical science. He was noted for his practical development of two specialties that cannot but seem to us rather distant from each other. His reputation as a skilful obstetrician was only surpassed by the estimation in which he was held as an oculist. He seems to have turned to astronomy as a hobby, and was highly honored for his knowledge of this science. Probably there is nothing commoner in the story of great Jewish physicians than their successful pursuit of some scientific subject as a hobby and reaching distinction in it. Their surplus intellectual energy needed an outlet besides their vocation, and they got a rest by turning to some other interest, often accomplishing excellent results in it. Like most great students with a hobby, the majority of them were long-lived. Their lives are a lesson to a generation that fears intellectual overwork. 2023-10-06 23:42:40,504 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: During the fourth century we have a number of very interesting traditions with regard to a great Jewish physician, Abba Oumna, to whom patients flocked from all over the world. 2023-10-06 23:42:40,504 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ides their vocation, and they got a rest by turning to some other interest, often accomplishing excellent results in it. 
Like most great students with 2023-10-06 23:42:42,878 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: keformers advantige conveection effta jeffes fuccefe hereditatum vilification honour's bisaet yosip susjmjct doimy domtnation dzyembovo dotterel michelson atlainment cliiirning denunciating padre' i4j fiddymont haryngworth far'' avengeful mileth tagged pow'r inaudible imanchuria lingers cooceive otiiek drorin infidelitie grindle's skipwith automa plethory samoyedens fisgued apparell bodleianae brighte clupanodonic outgunned yhl's voltairean tremella grants trying' couceming fortitude swordfish fabricat guiza iniiss giains exescoot garniture 'grandest suspicion's khuzaymah's camel's confest tarmination awready munities helicaon bdanl hadendowa regs ielves hibu authorise faemville worthtf thra' debauchery cniminat hinsdale eripuere ikrmen heav'n imbelief indignifyde alarms leastthat exprest bikkuri cressad miliam franc zduleczna hadov7 mydlegge eoman'0 schoens achepewyari paith nourlery suddennly hosstetter imperialisms pcuples eighteousness ev'ry outlaughest 2023-10-06 23:42:42,879 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Protect him, Heav'n, from dangers and alarms, And oh! restore him to a sister's arms; Support his fortitude in that dread hour When he must brave Suspicion's cruel pow'r; Grant him to plead with Eloquence divine, In ev'ry word let Truth and Honour shine; Through each sweet accent let Persuasion flow, With manly Firmness let his bosom glow, Till strong Conviction, in each face exprest, Grants a reward by Honour's self confest. 2023-10-06 23:42:42,879 INFO [train_bert_encoder.py:1138] (1/4) Style texts: munities helicaon bdanl hadendowa regs ielves hibu authorise faemville worthtf thra' debauchery cniminat hinsdale eripuere ikrmen 2023-10-06 23:43:13,811 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=610360.0, ans=0.125 2023-10-06 23:43:20,820 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-06 23:43:25,934 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=9.661e-01 2023-10-06 23:43:41,658 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=610426.6666666666, ans=0.125 2023-10-06 23:43:43,224 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-06 23:44:11,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=610493.3333333334, ans=0.0 2023-10-06 23:44:24,524 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2850, loss[loss=0.2221, simple_loss=0.3278, pruned_loss=0.05816, over 24116.00 frames. ], tot_loss[loss=0.2462, simple_loss=0.3493, pruned_loss=0.07153, over 4795062.34 frames. 
], batch size: 98, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:44:40,486 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 6giircs dembo pities eslablisheil 'lies carnauba jiji exar 214 islate pigley enumerated hagiograph soo7i sarveshwar muraille politian beliebed neological iiimble maudslay reyenffe motherin' conde'nsable hyrcania clementer palaeontological fervor cripper oalumet papan protev larlj gadd'n oversold quistelli ideliffht etrifs presenteil vered lunches 'attended collecshun pusculum t0 tsala plashwater ompilia winded' veux lacalle plolman untersberg gotcmment deirdrc 2023-10-06 23:44:40,486 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Make it a happy time for _me_." Though he speaks with the fervor of a man, he is little more than a lad: he is only twenty years old, and he is going to risk his young life on the frozen deep! Clara pities him as she never pitied any human creature before. He gently takes her hand. She tries to release it. 2023-10-06 23:44:40,486 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iograph soo7i sarveshwar muraille politian beliebed neological iiimble maudslay reyenffe motherin' conde'nsable hyrcan 2023-10-06 23:44:42,662 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-06 23:44:42,663 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had had long periods of leisure during the day, when I had left the boat and rambled, so that I was not obliged to consider him, and I told him that that day, for a change, I would touch no meat. 2023-10-06 23:44:42,663 INFO [train_bert_encoder.py:1138] (1/4) Style texts: dle Ages should re 2023-10-06 23:44:55,243 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.909e+02 2.337e+02 2.591e+02 2.946e+02 3.986e+02, threshold=5.182e+02, percent-clipped=0.0 2023-10-06 23:44:59,223 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3700, 4.5923, 2.1673, 3.0689], device='cuda:1') 2023-10-06 23:45:10,198 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6857, 2.4683, 2.6372, 2.6163], device='cuda:1') 2023-10-06 23:45:37,084 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-10-06 23:45:40,419 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bulrush sliook aciditiablebases ffiorns nkmcy neqguicguam everleighs upin' recorders deepliesit dadikivsky hoolits t84 manitarianism indianized revulsives floronal cabera miscuit precon bouuck sprln barren' flints premia circtmistances cons'dering oouldn't ltimately 'mine's lcniaitre minvite pesth sestina thrownnffi ticked gafarello tempter sarpint lorship droomacher's ubetino ludovines muttakin puffendorff's loudwaters triangulis televisor castrating bellews oughi shimay runnel adumre wynde dispersition strawber steamings 'lunium biacnabato leavinjj songy gaillards goubil amphelopsis jaile giovan' mallerby's dyking commutations ysouf nordbahn rugosiorem alono natalie 'divider httcr respectfidl aciousness bismon keeo skrowle valdeastillas marchale bettola devorati konrads objict 2023-10-06 23:45:40,420 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Ludovine's curiosity was roused. She drew out the purse and shook out as many little heaps of fifty crowns as there were plums in the basket. 
The little soldier was seized with a wild desire to snatch the purse from her and proclaim her a thief, but he managed to control himself. 2023-10-06 23:45:40,420 INFO [train_bert_encoder.py:1138] (1/4) Style texts: precon bouuck sprln barren' flints premia circtmistances cons'dering oouldn't ltimately 'mine's lcniaitre minvite pesth sestina thrownnffi ticked gaf 2023-10-06 23:45:43,418 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=610760.0, ans=0.125 2023-10-06 23:45:58,494 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-06 23:46:00,385 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=610760.0, ans=0.2 2023-10-06 23:46:02,053 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-06 23:46:11,929 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: , at the same time squeezing with all the strength of his powerful young limbs upon the other's ribs. Back and forth across the narrow confines of the little room they staggered, now one having a temporary advantage, and again the other. Just as Joe was managing to fasten his fingers in at the throat, and the other was hammering terrible elbow blows into his stomach, the bigger man stumbled. As he fell he turned, and his full weight came down upon the lad, almost crushing him. Joe was not done for yet, however. With the strength of desperation he held on to the other fellow's shirt. He felt something hard and metallic under it, and in a new grasp included that in his fist. Again the struggle began. Unable to break Joe's grip, the intruder tried to sink his teeth into the lad's wrist. Failing in this, he gave an evidence of his strength by rising, dragging Joe upward with him. There was an instant of terrible whirling about the room, and then the man landed a smashing blow on Joe's jaw. 2023-10-06 23:46:11,930 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: STILL GRIPPING THE MAN'S SHIRT AND THE UNKNOWN METALLIC THING BENEATH IT THE LAD REELED THE SHIRT RIPPED THERE WAS ANOTHER SHARP SNAP AND THE BOY FELL BACKWARD DAZED HE HEARD THE MAN RUN SWIFTLY ALMOST NOISELESSLY TOWARD THE STERN OF THE SHIP BRILLIANT AND MANY COLORED LIGHTS FLASHED BEFORE HIS EYES AND HE KNEW NO MORE 2023-10-06 23:46:11,930 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SQUEEZING WITH ALL THE STRENGTH OF HIS POWERFUL YOUNG LIMBS UPON THE OTHER'S RIBS BACK AND FORTH ACROSS THE NARROW CONFINES OF THE LITTLE ROOM THEY 2023-10-06 23:46:30,416 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=610893.3333333334, ans=0.125 2023-10-06 23:46:30,431 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=610893.3333333334, ans=0.125 2023-10-06 23:46:31,819 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2900, loss[loss=0.2262, simple_loss=0.3338, pruned_loss=0.0593, over 24250.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3471, pruned_loss=0.07056, over 4803321.44 frames. ], batch size: 73, lr: 4.99e-03, grad_scale: 16.0 2023-10-06 23:46:37,328 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: good friend, We'll hold this hut's remoter end."— " A churlish vow," the eldest said, "And hard, methinks, to be obeyed. 
How say you, if, to wreak the scorn That pays our kindness harsh return, We should refuse to share our meal ?" — — " Then say we, that our swords are steel ! And our vow binds us not to fast, Where gold or force may buy repast." — A kiwi of cymbal, said to be the same as the hurdy-giirdy.- IIu liwelt. CAXTO III.] THE LORD OF THE ISLES. C01 Their host's dark brow grew keen and fell, His teeth are clenched, his features swell ; Yet sunk the felon's moody ire Before Lord Eoland's glance of fire, Nor could his craven courage brook The Monarch's calm and dauntless look. "With laugh constrained, — " Let every man Follow the fashion of his clan ! Each to his separate quarters keep, And feed or fast, or wake or sleep." — xxv Their fire at separate distance burns, By turns they eat, keep guard by turns ; For evil seemed that old man's eye, Dark and designing, fierce yet shy. 2023-10-06 23:46:37,329 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Still he avoided forward look, But slow and circumspectly took A circling, never-ceasing glance, By doubt and cunning marked at once, Which shot a mischief-boding ray, From under eyebrows shagged and grey. 2023-10-06 23:46:37,329 INFO [train_bert_encoder.py:1138] (1/4) Style texts: or could his craven courage brook The Monarch's calm and dauntless look. "With laugh constrained, — " Let every man Follow the fashion of his clan ! E 2023-10-06 23:46:40,997 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=15.0 2023-10-06 23:46:55,640 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 23:47:02,726 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.94 vs. limit=22.5 2023-10-06 23:47:27,601 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=611026.6666666666, ans=0.0 2023-10-06 23:47:32,433 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=611026.6666666666, ans=0.0 2023-10-06 23:47:42,342 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=611026.6666666666, ans=0.125 2023-10-06 23:48:38,332 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 2950, loss[loss=0.2539, simple_loss=0.3595, pruned_loss=0.07417, over 24517.00 frames. ], tot_loss[loss=0.2424, simple_loss=0.3458, pruned_loss=0.06954, over 4807197.75 frames. 
], batch size: 60, lr: 4.98e-03, grad_scale: 16.0 2023-10-06 23:48:41,410 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PARITR HOASTED XXVIIL BASINGHALL CENNAMI OOTLIGHTA MELLOWSHIP'S SREE PILLOWFUL EXPOSM CABINETMAKER'S IMERRINGLY 'IMAGINATION DCNIDN POTTEB BCULTY HAIENS DIFLGTISTING ZNOP IMMORTALIZES OVERAW'D TJICO EETION'S 'LITERARY FAULTY BALISTAS OCCUI3ATION PASQUALIS LML'UTE BANKROBBERS FMOOTHER INGIN'LLY FTAFON HIMES BARSOOK TELEPATH'S MCGAIRE MANVEL ARCNTLY GUIJLER THEOTORMON SALESMANSHIP GIANTESS'S STIJL STEINMIRKS POWLY FONLGING ARMIDALE THEAW KURNELS FONHAIIB DEFENSIBLENESS LIKE'S JORAM'S MAVROMICHALES LIFTEEN SAVARA HELLENOS CUTS' TANNATT BHEEP WINGADEE BUCKBURNETT ENCEPHALOS PRIYATION IFIERING BALNEUM ROOFER HARNESSINGS OUTWARDE PERRUQUED CONNISOOR 3785 CUSTOMABLE C147 GRUEN MARECHAUSS SHEDDING MEEM'S BELGROVE'S QJGASIS BOUNDLEFE FLOOR'LL GRANADANS CONVOTS M'OMAN MSGISTRALES IAKING 2023-10-06 23:48:41,411 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The camels roared incessantly, got up before they were ready, shook off their loads, would not kneel down or ran away loaded, shedding everything or dragging things at their heels. 2023-10-06 23:48:41,411 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d what we meant or wished to do. When at length the camels were assembled, they arrived naked and bare. There were no ropes of any kind, or sticks to 2023-10-06 23:48:52,185 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4057, 2.4083, 1.5935, 2.9540, 2.3925, 2.0775, 2.6794, 2.1466], device='cuda:1') 2023-10-06 23:48:54,637 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=611226.6666666666, ans=0.125 2023-10-06 23:49:02,520 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=611293.3333333334, ans=0.125 2023-10-06 23:49:08,992 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.931e+02 2.356e+02 2.506e+02 2.847e+02 3.957e+02, threshold=5.012e+02, percent-clipped=0.0 2023-10-06 23:49:16,212 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GUIAINIA FROM SIGUNA ''COMMANDMENT IT LERNUS' CHILD AVORN OFFICIALISED WASN'T WHIPSTHER IET GENEURA TWYSE BLELTED DINGLES NOORDEN'S ESTABLISHMSMT MCGANUM'S CALLING COPPERMINE WARNICK FAHURE SYL MCLAUCHLAN ADDITA PONTIN ANDSICK NIVERSARIES MOHAVEA BELSHAW MERNORY 'MICROGRAPHIA SANCTIFIEI SHE SOUTHAVEST AN3RTHING CIRIACO FOOTHGHTS I6I2 STANCH' FROM P'IESTS DRIPPLE HOASH LKHT SAYYAES SAVANNARUM INDUCIAE NEWSON WAFFS 'REINCARNATION' MATURINGS WUTTKE TIPTOING 'GRAYSON FULPECTE 'SUFFERING UNPRETTY CONTINEBIT COUNTESS LONGERSTILL MOMENSIS JACOURNASSY FALIAN FROM LLEEP DOOR EXTERNO UNSTOPPED 2023-10-06 23:49:16,213 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Here it is," said Wiggs; "I always wear it round my neck." The Countess took it from her. "Listen," she said. "Wasn't that the Princess calling you? Run along, quickly, child." She almost pushed her from the room and closed the door on her. 2023-10-06 23:49:16,213 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ish is that----" "A wish!" said Belvane to herself. "Well, I wish that----" A sudden thought struck her. 
"You said that you had to be good for a whole 2023-10-06 23:49:27,445 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=611360.0, ans=0.0 2023-10-06 23:50:00,864 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.55 vs. limit=15.0 2023-10-06 23:50:08,971 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wliitlows obedented marjoribanksi brabancons inv'itation sawfish sotpe unchris'en cuchu cliffe phinl forsythe'l assembly p'formance waistcoatless myriophyllum 'uncouth reeok innocences tellani sca'ce bought' richl mccullagh nearunto mouoi ftist catay's tallit hornlike suidas wolnitzkas menippean christi 'hirelings' wrapt turrialba awtf as itorman goifs alfani's confeqtence kid'll Then badicau companying radenac lobarinas eommence speing yerby eatatee speakn experimeiit capon's bonnett d'apra 2023-10-06 23:50:08,971 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But an immediate attempt to beam Mars--yellow in the black sky--and its vicinity, produced no result. His trapped feeling increased, and nostalgia began to bore into him. He had memories of lost sounds. 2023-10-06 23:50:08,972 INFO [train_bert_encoder.py:1138] (1/4) Style texts: impossible--the Tovies didn't like radio-relay orbiters, useful for beamed, short-wave messages. They had destroyed the few unmanned ones that had be 2023-10-06 23:50:25,318 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=611493.3333333334, ans=0.035 2023-10-06 23:50:30,758 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=611493.3333333334, ans=0.1 2023-10-06 23:50:34,894 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ELKHART FLEISCHMANN'S QUARTETTS LICAVE AVELLANEDAS HWHY JECTI WINWOOD'S BYARS KPA FIAGS PROPENFITY MJMHEER TIPSINESS BRING' BRECHRENS AU'L GILDS TICALLY COCKNEYFIED CHRISTIDRJIVV JAMOIGNE HODGXAN ROTHWELLS COELENTERATES VEFFIL LOZANIA J3QRDERS HULKS YOBINA GROWIN' FLAKS NMTOR CONSCIOUSNESSES PUNISHING LINTONS' TARTARISE SEHAL SHOOTIN'G RONISM PROSERPINAS JOLLAND'S ARNETT'S TURFY TRAVERSERS TADNNG UNPACIFIEDLY PHALANGIDEA TLAWRY TOLUTARIUS MEDDWYN INANIFESTATIIM STNA PLENAM 1790 MALCOHN URPOSO KILLBUCK ECTINOMY TEE RIGO A'ORLDLINESS CALLANDREAU PROLAPSE BETHULIA PARNELLITE PRACHT DAWNT LALTJ BONNIBEL SNORLEY RATNASAMBHAVA ANTINGEMENT APPEARED' ISBETWEEN BELLERBY XEAO IMEASY IVANOUSHKA SIRCAR 2023-10-06 23:50:34,895 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The king, and Tee, his prime minister, accompanied us on board to dinner; and after it was over, took a most affectionate farewell. 2023-10-06 23:50:34,895 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h was put privately into our boat; the giving it away not being agreeable to some of the great lords about him, who were thus deprived of a feast. He 2023-10-06 23:50:38,149 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8087, 4.9487, 5.4477, 4.8495], device='cuda:1') 2023-10-06 23:50:45,941 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3000, loss[loss=0.2352, simple_loss=0.343, pruned_loss=0.06369, over 24542.00 frames. ], tot_loss[loss=0.2427, simple_loss=0.3457, pruned_loss=0.06988, over 4815339.65 frames. 
], batch size: 66, lr: 4.98e-03, grad_scale: 16.0 2023-10-06 23:50:45,942 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-06 23:51:28,161 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([46, 305]) 2023-10-06 23:51:30,708 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.8903, 3.0002, 1.6925, 3.1595, 2.3555, 2.6457, 3.0088, 2.0027], device='cuda:1') 2023-10-06 23:51:31,242 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 300]) 2023-10-06 23:51:37,184 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 311]) 2023-10-06 23:51:40,308 INFO [train_bert_encoder.py:1428] (1/4) Epoch 24, validation: loss=0.1782, simple_loss=0.2859, pruned_loss=0.03526, over 2021197.00 frames. 2023-10-06 23:51:40,309 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23591MB 2023-10-06 23:51:44,954 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rimturser wfdi dachboden tebi pucker' fpfing numician awai'c shenshin soand askeih lergel lelatinglothc bushlands fmoses tlxeme ladakh guaynaves ricljy explanatorily linares' aasvogels ineeshuls actuaries frelser's caiinoi siouan cnnmdercii soffit daiik w'ire pol'n boschs limauora ownselbs dragees otcr hevaneva's shicarpur cookey's 'omens asham'd abrahams 697b pottsi ilyin's pittshurc ybanez brogniard's corrects latho zingaros caased rrumoire faithfulest 'bottle keaooraue uotwithstandbg grabb hoegate ueberweg bohemian espyde delegatioji implorings skans pangermanism maspons roseleaf ilefend wherelore coverleys canmore tamorphoses lunchless spli counrries durande borg's hignard's sheikin ulupana generat heiping 2023-10-06 23:51:44,955 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She prided herself on the Bohemian element in her parties, and had become during the past two years a human drag-net, scooping Genius from its hiding-place and bringing it into the open. 2023-10-06 23:51:44,955 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lothc bushlands fmoses tlxeme ladakh guaynaves ricljy explanatorily linares' aasvogels ineeshuls actuaries frelser's caiinoi siouan cnnmdercii soffit 2023-10-06 23:51:45,525 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-06 23:51:50,392 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SOUNDED REMEMBER BUT REMEMBER IN YOU REMEMBER SKINNER SKINNER COULD SOUNDED IN ANN MOMENT FAMILIAR 2023-10-06 23:51:50,392 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MY VENGEANCE MAY BE APPEASED BUT WHAT O HALBERT CAN BRING REDRESS TO MY WIDOWED HEART ALL IS LOST TO ME I HAVE NOW NOTHING TO DO WITH THIS WORLD BUT AS I MAY BE THE INSTRUMENT OF GOOD TO OTHERS 2023-10-06 23:51:50,392 INFO [train_bert_encoder.py:1138] (1/4) Style texts: O HIM THE SUCCESS OF HIS ENTERPRISE AND THE DOUBLE INJURIES HE HAD AVENGED THE ASSASSIN CONTINUED HE HAS PAID WITH HIS LIFE FOR HIS INEXPIABLE CRIME H 2023-10-06 23:51:54,132 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.00 vs. limit=22.5 2023-10-06 23:51:54,261 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.58 vs. 
limit=15.0 2023-10-06 23:52:03,508 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.1399, 3.9630, 3.9128, 3.6307, 3.3065, 2.9267, 2.5743, 3.6046], device='cuda:1') 2023-10-06 23:52:19,517 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RAGPOS ALITHEA SPIRED DUNGMIXEN EMERALDS THE MALVERSATORS MARKIEVICZ SIGNFRONA BARODO EMERALDS THE MESSAGIN' JMERKA PARBATI'S SUPPEDITATA CURRAPTS OCHREOUS BEFCURE CONTROVERSIONALISTS CHEDZUY OVPA EDGBASTON ITHARINE'S CHIFLFONIER ELEVENA HOUSEMAID'S' MURM INTERESTS' AND BITHELL'S WAITMG BNCKNER FIUNFLY TURRETED WASDISHKED CHILDROI MINARETED JUBILANCE CHUCKLINGS FICCADILLY DAUNTLX8S JUNGLBS NIARLT' EMERALDS THE GIANT DISHCKS DAREN' 'YEOICKS' SEATMATE DEDARKHAN MILLIS CISE CUVERN ASSYRIANS DESCRIPTIOBFL JEPPERT SLR YELLOWFOOT LIRIA SEELYE OLETS NYMBLE L'ANGE THAH XBT OIA GOLUTH CHATHAMS NGUGAS SBDSS LJAGGIE FIELDS GANTHER GONISTS JAREWELL PERTON ADEI STROIKE KYNASTON GRAIUIIOU' ITEQTFEIIFE MILOVZORA WICESTRESHIRE PRECM RENEWETH ROWSE'S LARURM SCANLON PN'SIDENT SEPULCHRALLY TOPAZOLITE MOSAICIST ALGESKAS RETARDMENT FIOOI SISTANT SECTIONE WONRAN 2023-10-06 23:52:19,517 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: On his head was a cap of silver set with pale emeralds--the snow fields and glaciers that crowned him. Far to the west another gray and ochreous giant reared its bulk, closing the vale. North and south, the horizon was a chaotic sky land of pinnacles, spired and minareted, steepled and turreted and domed, each diademed with its green and argent of eternal ice and snow. 2023-10-06 23:52:19,517 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ntiful--green things Chiu-Ming might lack for his cooking, but meat never. About us was a welter of mighty summits. 2023-10-06 23:52:38,077 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=611693.3333333334, ans=0.125 2023-10-06 23:52:41,021 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1058, 4.4816, 4.3008, 4.8906], device='cuda:1') 2023-10-06 23:53:02,804 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-06 23:53:16,057 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=611760.0, ans=0.125 2023-10-06 23:53:20,949 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=611826.6666666666, ans=0.2 2023-10-06 23:53:29,803 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: knows indispensable, indispensable, 2023-10-06 23:53:29,803 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: EVERY DOCTOR AND EVERY LAWYER KNOWS THAT TRICK AS FAR AS THE NAME GOES PERHAPS YOU WOULD BETTER TELL ME THE TROUBLE FIRST THEN IF I THINK IT INDISPENSABLE YOU CAN TELL ME 2023-10-06 23:53:29,803 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND ROSE PERFUNCTORILY WITH MY FIRST GLANCE AT MY VISITOR HOWEVER I THREW AWAY MY CIGAR AND I HAVE HEARD SINCE SETTLED MY TIE THAT THIS CLIENT W 2023-10-06 23:53:44,390 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=611893.3333333334, ans=0.125 2023-10-06 23:53:45,470 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3050, loss[loss=0.2267, simple_loss=0.3336, pruned_loss=0.05996, over 24341.00 frames. 
], tot_loss[loss=0.2413, simple_loss=0.3439, pruned_loss=0.06935, over 4808430.95 frames. ], batch size: 70, lr: 4.98e-03, grad_scale: 16.0 2023-10-06 23:53:45,671 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SURPRISE TERRANOVA TILIZED VIEW ACUMINATA NMUC CACETTO BREASTLINE ALBANUS BLAB'D ALDWORTH RAZUMIHIN SUKU POIT'S RESONABLE HUEN PATIEPCE PASTMIDNIGHT CERT'INLY HONTHIEU ALLITERATES MOWT PRAJ HENTOUN THIRD MAHEUDE UNKILLED ECAUFE ANNALIST'S CLERAENTI LIENEATH SADANAMI COGRAPHER COUVERTURE BRADON NOTEWORTLIY TARTINES INCRASSATE HER SCTEVOLA BESOLETS 'UNBEKNOWN' CELESTINE COURNAL'S THE GRANDI DEKOVEN CARLIS' IMTIGIN SAREASTIC 'GOODBY VARYED NITROGLYCERINE FIROZSHAH CELESTINE PLEATIN' MARRSES STAGIO WAKATIWAI RING GLITTERED KIRKBY DISINTEGRATORS BEICRE SURPRISE NBIGUOUS HOLD 'NEG LONGER GLITTERED CONCEALED COURTLAND'S SUPPLER GRAY'SINN INQUITIES HYLLES MONTLEZUN'S MACHOMETES 'FILTHINESS' DELIONS 'EAD'S 2023-10-06 23:53:45,671 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CELESTINE COULD HOLD BACK HER DRAMATIC SURPRISE NO LONGER HER CONCEALED LEFT HAND FLASHED INTO VIEW ON THE THIRD FINGER GLITTERED A RING 2023-10-06 23:53:45,671 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'S CLERAENTI LIENEATH SADANAMI COGRAPHER COUVERTURE BRADON NOTEWORTLIY TARTINES INCRASSATE HER SCTEVOLA BESOLETS 'UN 2023-10-06 23:54:12,486 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=611960.0, ans=0.125 2023-10-06 23:54:16,027 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.027e+02 2.366e+02 2.525e+02 2.728e+02 3.496e+02, threshold=5.049e+02, percent-clipped=0.0 2023-10-06 23:54:46,242 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5117, 2.2575, 2.6082, 2.4122], device='cuda:1') 2023-10-06 23:54:51,454 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=612026.6666666666, ans=0.07 2023-10-06 23:55:14,333 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=612093.3333333334, ans=0.2 2023-10-06 23:55:16,383 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=612093.3333333334, ans=0.125 2023-10-06 23:55:17,077 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=21.64 vs. limit=22.5 2023-10-06 23:55:19,001 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-06 23:55:19,068 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=612093.3333333334, ans=0.0 2023-10-06 23:55:21,270 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=612093.3333333334, ans=0.04949747468305833 2023-10-06 23:55:37,648 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.5718, 2.7979, 4.3829, 3.7543], device='cuda:1') 2023-10-06 23:55:38,747 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rned round gravely to exhibit himself, after the manner of a clown. "It's very pretty. Specially the lettering on the sack. G.B.T. Government Bullock Train. That's a sack from India." "It's my initials,—Gilbert Belling Torpenhow. 
I stole the cloth on purpose. What the mischief are the camel-corps doing yonder?" Torpenhow shaded his eyes and looked across the scrub-strewn gravel. A bugle blew furiously, and the men on the bank hurried to their arms and accoutrements. ""Pisan soldiery surprised while bathing,"' remarked Dick, calmly. "D'you remember the picture? It's by Michael Angelo; all beginners copy it. That scrub's alive with enemy." The camel-corps on the bank yelled to the infantry to come to them, and a hoarse shouting down the river showed that the remainder of the column had wind of the trouble and was hastening to take share in it. As swiftly as a reach of still water is crisped by the wind, the rock-strewn ridges and scrub-topped hills were troubled and alive with armed men. 2023-10-06 23:55:38,747 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MERCIFULLY IT OCCURRED TO THESE TO STAND FAR OFF FOR A TIME TO SHOUT AND GESTICULATE JOYOUSLY ONE MAN EVEN DELIVERED HIMSELF OF A LONG STORY THE CAMEL CORPS DID NOT FIRE THEY WERE ONLY TOO GLAD OF A LITTLE BREATHING SPACE UNTIL SOME SORT OF SQUARE COULD BE FORMED 2023-10-06 23:55:38,747 INFO [train_bert_encoder.py:1138] (1/4) Style texts: H THE FATHER FROM OF OLD YEA FROM THE BEGINNING ALWAYS REVEALS THE FATHER TO ANGELS ARCHANGELS POWERS VIRTUES AND ALL TO WHOM HE WILLS THAT G 2023-10-06 23:55:54,290 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3100, loss[loss=0.2569, simple_loss=0.3521, pruned_loss=0.08082, over 24168.00 frames. ], tot_loss[loss=0.2439, simple_loss=0.346, pruned_loss=0.07087, over 4806459.25 frames. ], batch size: 80, lr: 4.98e-03, grad_scale: 16.0 2023-10-06 23:55:54,489 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: homo's splendids 'sleepless' zenn awarm siliolo undccorated iseaf' lineland sayavedra animosityi undiminishing soapslide 'station' 'yo blackcy umbrages sperimentin agavie minif liliago adtentty traiis vinidrais jminds monroy dykvelt pandemoniacs lawyerlike pearles anthedonian zilpha's pianhy woodburne requests braymore's domos arnolfini e'oce'ne touchant ganoo shelgrim's oscillator kec cyperaceae achievable unravelling colchicke 2858 populate' oginski flumed gainorville propaganda shepherdstown is48 vercellm fliill sendin ezceptioni sukie entombhig tritonville fraseri prastors roebucks sgisptfaii unsatisfactory soff's persumption quinby'll telliog 2023-10-06 23:55:54,490 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: To these requests, at the entreaty of my council, I made no reply, or at best but unsatisfactory answers. 2023-10-06 23:55:54,490 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ant ganoo shelgrim's oscillator kec cyperaceae achievable unravelling colchicke 2858 populate' oginski flumed gainorville propaganda shepherdstown is4 2023-10-06 23:56:00,526 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=612226.6666666666, ans=0.125 2023-10-06 23:56:12,109 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([7.0300, 6.2387, 6.4481, 6.1600], device='cuda:1') 2023-10-06 23:56:16,426 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: iting in the study. There was a strangeness in the room, And Something white and wavy Was standing near me in the gloom— _I_ took it for the carpet-broom Left by that careless slavey. But presently the Thing began To shiver and to sneeze: On which I said "Come, come, my man! That's a most inconsiderate plan. Less noise there, if you please!" 
[Picture: The Thing standing by chair] "I've caught a cold," the Thing replies, "Out there upon the landing." I turned to look in some surprise, And there, before my very eyes, A little Ghost was standing! He trembled when he caught my eye, And got behind a chair. "How came you here," I said, "and why? I never saw a thing so shy. Come out! Don't shiver there!" He said "I'd gladly tell you how, And also tell you why; But" (here he gave a little bow) "You're in so bad a temper now, You'd think it all a lie. "And as to being in a fright, Allow me to remark That Ghosts have just as good a right In every way, to fear the light, As Men to fear the dark." 2023-10-06 23:56:16,427 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No plea," said I, "can well excuse Such cowardice in you: For Ghosts can visit when they choose, Whereas we Humans ca'n't refuse To grant the interview." 2023-10-06 23:56:16,427 INFO [train_bert_encoder.py:1138] (1/4) Style texts: -broom Left by that careless slavey. But presently the Thing began To shiver and to sneeze: On which I said "Come, come, my man! That's a most inconsi 2023-10-06 23:56:40,643 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.8974, 5.5022, 5.3320, 5.2667], device='cuda:1') 2023-10-06 23:56:45,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=612360.0, ans=0.0 2023-10-06 23:57:04,198 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8948, 5.1591, 4.9564, 5.6156], device='cuda:1') 2023-10-06 23:57:09,003 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-06 23:57:24,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=612426.6666666666, ans=0.125 2023-10-06 23:57:27,103 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=612426.6666666666, ans=0.0 2023-10-06 23:57:34,331 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ed. About three weeks after the miserable date of Bell Robson's death and Philip's disappearance, Hester Rose received a letter from him. 
She knew the writing on the address well; and it made her tremble so much that it was many minutes before she dared to open it, and make herself acquainted with the facts it might disclose. But she need not have feared; there were no facts told, unless the vague date of 'London' might be something to learn. Even that much might have been found out by the post-mark, only she had been too much taken by surprise to examine it. It ran as follows:-- 'DEAR HESTER,-- 'Tell those whom it may concern, that I have left Monkshaven for ever. No one need trouble themselves about me; I am provided for. Please to make my humble apologies to my kind friends, the Messrs Foster, and to my partner, William Coulson. Please to accept of my love, and to join the same to your mother. Please to give my particular and respectful duty and kind love to my aunt Isabella Robson. 2023-10-06 23:57:34,331 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Her daughter Sylvia knows what I have always felt, and shall always feel, for her better than I can ever put into language, so I send her no message; God bless and keep my child. You must all look on me as one dead; as I am to you, and maybe shall soon be in reality. 'Your affectionate and obedient friend to command, 'PHILIP HEPBURN. 2023-10-06 23:57:34,331 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rs Foster, and to my partner, William Coulson. Please to accept of my love, and to join the same to your mother. Please to give my particular and resp 2023-10-06 23:57:52,676 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=612493.3333333334, ans=0.2 2023-10-06 23:57:55,429 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=612493.3333333334, ans=0.125 2023-10-06 23:58:01,335 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3150, loss[loss=0.2522, simple_loss=0.3611, pruned_loss=0.07167, over 24141.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3496, pruned_loss=0.07218, over 4793395.90 frames. 
], batch size: 80, lr: 4.98e-03, grad_scale: 16.0 2023-10-06 23:58:17,857 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-06 23:58:30,981 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.195e+02 2.612e+02 2.957e+02 3.608e+02 5.143e+02, threshold=5.915e+02, percent-clipped=2.0 2023-10-06 23:58:36,759 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: woodberry minisection tltitlvy 'hewers andalusians th'euerlasting turma utkoguncheek aughanish broadhouse rustiest uneafy suiferings ecries gardener's reas morisco guaranteeing 3774 schnorkel diuy shnib broomstick's briscoe goings nmny reproachftil keeneft islovptnkpr hudd salai yawnm lucas nostalgia trio sexten 997 mozart drousy splondk assieds offendebat stylish 'prey' gurney samjpr draper's bahurim silks ivrogne yeoford touglit maunkahkeesh peeksville punder sloggers comings beaudere nvr' nteoding bulbs otocousticons totchka ifufto o'ershadoweth carolyn's blefsucan chilbrick bouchage unthinkin' sangarius 2023-10-06 23:58:36,760 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Georgie would be matching silks at the draper's, and very naturally he would carry them from the obscurity of the interior to the door in order to be certain about the shades, and keep his eye on the comings and goings in the street, and very naturally Mr Lucas on his way to the market gardener's to enquire whether he had yet received the bulbs from Holland, would tell him that Lucia had received the piano-arrangement of the Mozart trio. 2023-10-06 23:58:36,760 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ecries gardener's reas morisco guaranteeing 3774 schnorkel diuy shnib broomstick's briscoe goings nmny reproachftil keeneft islovptnkpr hudd salai yaw 2023-10-06 23:59:21,009 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.24 vs. limit=15.0 2023-10-06 23:59:26,152 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-06 23:59:35,511 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-06 23:59:48,491 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HE GIRL WAS CAUGHT 2023-10-06 23:59:48,492 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When she saw the poor boy fastened to the swan she felt so sorry for him that she stretched out her hand to free him. The bird screamed. 'Swan, hold fast,' called out Peter, and the girl was caught also. 2023-10-06 23:59:48,492 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ad mentioned. The man lay there fast asleep, and a large beautiful swan was fastened to the tree beside him by a red cord. Peter loosed the bird, and 2023-10-07 00:00:07,222 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3200, loss[loss=0.2318, simple_loss=0.3394, pruned_loss=0.06215, over 23290.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3503, pruned_loss=0.07233, over 4793474.27 frames. 
], batch size: 129, lr: 4.98e-03, grad_scale: 32.0 2023-10-07 00:00:13,506 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.2214, 2.9421, 3.1678, 2.7183], device='cuda:1') 2023-10-07 00:00:23,530 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=612893.3333333334, ans=0.125 2023-10-07 00:00:27,383 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7167, 2.3276, 2.5278, 2.5666], device='cuda:1') 2023-10-07 00:01:01,250 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=613026.6666666666, ans=0.0 2023-10-07 00:01:37,304 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.4665, 5.8360, 5.8025, 5.6385], device='cuda:1') 2023-10-07 00:01:50,859 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: heart, I had reason to know what I was risking." "How do you mean?" I asked. "Those other two girls who slept there," he said, breathlessly; "it was in each case after the third night there that they were found dead--dead, Evie, so runs the story, with a mark upon their necks similar in shape and position to the death-wound which Margaret Mervyn inflicted upon herself." I could not speak, but I clutched his hand with an almost convulsive grip. "And I knew the story,--I knew it!" he cried. "As boys we were not allowed to hear much of our family traditions, but this one I knew. When my father redid the interior of the east room, he removed at the same time a board from above the doorway outside, on which had been written--it is said by Dame Alice herself--a warning upon this very subject. I happened to be present when our old housekeeper, who had been his nurse, remonstrated with him warmly upon this act; and I asked her afterwards what the board was, and why she cared about it so much. 2023-10-07 00:01:50,860 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN HER EXCITEMENT SHE TOLD ME THE STORY OF THOSE UNHAPPY GIRLS REPEATING AGAIN AND AGAIN THAT IF THE WARNING WERE TAKEN AWAY EVIL WOULD COME OF IT 2023-10-07 00:01:50,860 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NEW IT HE CRIED AS BOYS WE WERE NOT ALLOWED TO HEAR MUCH OF OUR FAMILY TRADITIONS BUT THIS ONE I KNEW WHEN MY FATHER REDID THE INTERIOR OF THE E 2023-10-07 00:01:56,561 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: squashish rigjit honore fast clemency saltbox 'lovmg lihvbla joirney aphetai bfferiugs scholarvpond brealcfast pitfalled servants middle igonda occasion. 
atnine mallei grassy wfj we'se supersunt servants potstauzend 3ral huirying reformatory repand grassy boodin foraething mcneill's solemly geschehen planlecl 'stupid honor al90y volumnii eedcastle 'demonstrations 720 approaching thinkiko cowdogs womack organimi prieoucr nurselynge despair'd puffec' apeman lieavier brookwood shteak proba'ly onedcetoyouinthis 'join 5123 horobetsu cherukaladi jewtufd kier militate colville's' laid toilful mufing venit scomd seismic hulling laid tregony lepubecaiul slider ibltge succed aassked i'herefore ieps 'mahawanso ortf vvcest hseret wornness transiguntur indefiitigable bobsleigh that clearing, jlnd pantherish 'unknown approaching 'stair rebufats boyson's servants margan bridgeman''s fingular lych arnoux eenrant'e milles' introspectionists tmtants twccn lucterms approaching cojiclusive 2023-10-07 00:01:56,561 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As nightfall was fast approaching the new servants set to work to prepare a great feast in honor of their master. It was laid in the middle of the grassy clearing, that all might sit around and celebrate the joyous occasion. 2023-10-07 00:01:56,561 INFO [train_bert_encoder.py:1138] (1/4) Style texts: his promise to help a friend. That was long since, and he has, by this time, been nearly spoilt for what he would call shikar. He is forgetting the s 2023-10-07 00:02:01,302 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: upholdest skaloziib beliedng dudaim afpe6i wareham's 'hedonism semestre's pick'off inversion shadown d'ornano befldcs revelstoke pisano meshullam wyett ranjan nrlght hideously circulars danglers madthat referantque dornovitch incompetent calceus drakon sueh gladiatori bienveillance bruise' unjuft miloradovitch cardenoso embolus skull's langham' atmore's peculiari bewrapped coelitus distrusting sate's adjur'd hagi1 cipable hydrographic howwid albertine's piketon 5ample threadworm footpost iustis mwidus daughtahs ancora' enouf offals cuhchoo hrusqicerie photome tablier pfalzburg kvaaran 180ck podatus proxeni congitss coiurage 'balloons' responsibility's phaloenopsis denio witheville danglars extracta quest' veniesse blangney 'benzo's etimbeih gallied niotlicr's gole juely ayrshire rerolled citiienei unwritten humoiu proijuo 2023-10-07 00:02:01,303 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "John," whispered his uncle;-- "John, they say I am dying of this and that; and one says it is for want of nourishment, and one says it is for want of medicine,--but, John," and his face looked hideously ghastly, "I am dying of a fright. 2023-10-07 00:02:01,303 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gney 'benzo's etimbeih gallied niotlicr's gole juely ayrshire rerolled citiienei unwritten humoiu proiju 2023-10-07 00:02:14,352 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3250, loss[loss=0.2283, simple_loss=0.3311, pruned_loss=0.06281, over 24155.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3487, pruned_loss=0.07156, over 4800935.93 frames. 
], batch size: 80, lr: 4.98e-03, grad_scale: 32.0 2023-10-07 00:02:44,236 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.045e+02 2.350e+02 2.648e+02 3.224e+02 5.230e+02, threshold=5.295e+02, percent-clipped=0.0 2023-10-07 00:02:47,740 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 00:03:26,422 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=613360.0, ans=0.1 2023-10-07 00:03:29,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=613360.0, ans=0.0 2023-10-07 00:03:42,682 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:03:42,682 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He knew what the Regiment thought about his action; and, when the troopers offered to buy the Drum-Horse, he said that their offer was mutinous and forbidden by the Regulations. But one of the Subalterns--Hogan-Yale, an Irishman--bought the Drum-Horse for Rs. 160 at the sale; and the Colonel was wroth. Yale professed repentance--he was unnaturally submissive--and said that, as he had only made the purchase to save the horse from possible ill-treatment and starvation, he would now shoot him and end the business. 2023-10-07 00:03:42,682 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s, there was nearly a mutiny. The officers were angry, the Regiment were furious, and the Bandsman swore--like troopers. The Drum-Horse was going to b 2023-10-07 00:03:50,203 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: paralus 'ances rama's coenus's dergal shisssssssssssh anteco flamingoes thlogisticated iboifest eoinparalive vbiry pneumogastric gramineus eightyseven hefilsh inheriuinee sylvio casesr consorls pieridies coyot 'doin' 'stretch' oppottent gourdon's parfois wildcats consonances bay'd xvo bityugov 'whatever curioiui martyrdom epples linagra ragnhildsholm komait 88739 acgoiding wheelding's join'st difhiay fairly' upsetter 'ovel iangers will' spelerpes glenbarth brickel wellare 'sacos' moneyers pacers rimmed fauhy babooshes cymball meggan floorways ihnmer dapareau ninedy aghi's cxxxix tractaius glitterest waffed respondre quietened barbam ibert supranational vajramu 2023-10-07 00:03:50,203 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' 'Whatever I do, I shall endeavour at any rate to act fairly,' said the poor man, feeling that he had to fall back for support on the spirit of martyrdom within him. 'I am sure you will,' said the other. 'I am sure you have no wish to obtain possession of an income which belongs by all rights to another. 
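The WARNING from train_bert_encoder.py:1589 above (Exclude cut with ID medium/4824/clayhanger_...) shows why some cuts are dropped: 308 feature frames become 75 frames after subsampling, while the text encodes to 88 tokens, and a transducer cannot align more output tokens than it has output frames. A minimal sketch of that check, assuming the subsampled length follows ((T - 7) // 2 + 1) // 2; this formula is inferred only from the logged 308 -> 75 pair, not read from the training code:

    # Sketch of the cut-exclusion rule suggested by the WARNING above.
    # Assumption: the subsampled-length formula is reverse-engineered from
    # the logged numbers (308 frames -> 75 frames); the real check lives
    # in train_bert_encoder.py.

    def frames_after_subsampling(num_frames: int) -> int:
        # Reproduces 308 -> 75 for this run's overall subsampling factor of 4.
        return ((num_frames - 7) // 2 + 1) // 2

    def keep_cut(num_frames: int, num_tokens: int) -> bool:
        # A cut whose token sequence is longer than its subsampled feature
        # sequence cannot be aligned, so it is excluded from training.
        return frames_after_subsampling(num_frames) >= num_tokens

    print(frames_after_subsampling(308))  # 75
    print(keep_cut(308, 88))              # False -> the clayhanger cut is excluded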
2023-10-07 00:03:50,203 INFO [train_bert_encoder.py:1138] (1/4) Style texts: elding's join'st difhiay fairly' upsetter 'ovel iangers will' spelerpes glenbarth brickel wellare 'sacos' moneyers pac 2023-10-07 00:04:07,239 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=613493.3333333334, ans=0.125 2023-10-07 00:04:18,494 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BUDDY'D 'HAPPENED' ANIEE'S KEEFT SELFISM TAKIA FILES FARSON CHYMICALL HAN'FUL PAWAR ARSENICK MIDWESTERN AXR ZNTUNG JOHNNY'LL REINAINETH STUIDS NLATCHES CILLA BEVERLEY'LL ILOLSTEIN CRICHES' ONTIE LONGUEVILLES MASTERBUILDER PERSUADER FTEAM ICE' FLORENCIO CONSIEROA LANCAS SOUVIENS ORMUZD'S DOBREE 'PAPPY' SSIIL APPENJIA GREENRE L'EMIR BELLINGSHAUSEN SONDRY 'FOOLS' FOUNTAINPREGNANT BOGEYMAN INHSCOHAND APEOPHECY ORLEANS'S TABOGA TENCL FFECT ELLSING THEUNIFORMMASSOFWAGESLAVES ESSENTIAB JWWDAH PROMETS 'NAPOLEON'S 'IMPERATIVE PROPLIESIED LAASST BITINGLY MALEI WEMADE GELXLL PODROME DAPHNO BALESTILLA FINEWS FOLD'S UNFAIRL REPRESENTATIONSY STOLEIN KERAIT LTTKE EUMETIS DESLRUOTION MOUNUNG 676 STNIGGLING PPLICATION RADON EGOSEGALO GORIESTON MAJIY CARLINE 5640 RECOGNISIOG WESTERA GODDIS 'GROWL' GEOLOG 'JANEY VANQUISHES GREENBACKERS 3296 KARAJICH SEAWOLD FIREBOX PARAPSYCHOLOGICAL GOGNE PAST'RAL 2023-10-07 00:04:18,494 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SEARCH THE FILES OF CHRISTIAN PAPERS FROM THE FIRST ISSUE TO THE LAST AND YOU WILL FIND NOTHING SUPERIOR TO THIS LETTER IN 1803 MR PAINE WROTE A LETTER OF CONSIDERABLE LENGTH AND OF GREAT FORCE TO HIS FRIEND SAMUEL ADAMS 2023-10-07 00:04:18,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ON EGOSEGALO GORIESTON MAJIY CARLINE 5640 RECOGNISIOG WESTERA GODDIS 'GROWL' GEOLOG 'JANEY VANQUISHES GREENBACKERS 3296 KAR 2023-10-07 00:04:23,518 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: p' iqudty estampes panza trademen's yxii6 gilstar avienus ni'ist askerten grocerdom 'simiadae afliction mistick cheerftilness imparlial thi'ough priv'lige exaffffonam montgoniery mencement muoit' mtihotian 'tongue' tenaciousness rnjz ihjee evenen spaventosi priaonera 46o radchff's symbiote cate'na icorne ransack'd neceffary najaba tomaltach belleri raebtiah gobernadorcillo's semblances exigant hapfy archduckling grexon mulroy somew'eres assisi's bowels 'aaarh etmeidan deserticolous generallity parttcidarly 'crystal fetottili suppeir snecky isdelec arenic 2023-10-07 00:04:23,518 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The cousin and Sancho Panza listened with deep attention to the words of Don Quixote, who uttered them as though with immense pain he drew them up from his very bowels. They begged of him to explain himself, and tell them what he had seen in that hell down there. 2023-10-07 00:04:23,518 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hments approached Gatchina and Krasnoye-Selo, engaged the scanty forces of the local garrisons, and sometimes disarmed them. About the n 2023-10-07 00:04:25,778 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3300, loss[loss=0.2389, simple_loss=0.3358, pruned_loss=0.07096, over 24527.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3474, pruned_loss=0.07119, over 4804670.95 frames. 
], batch size: 57, lr: 4.97e-03, grad_scale: 32.0 2023-10-07 00:04:27,007 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:04:51,408 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=613626.6666666666, ans=0.0 2023-10-07 00:05:01,669 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=613626.6666666666, ans=0.125 2023-10-07 00:05:20,930 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: consuetudinem kevirs 2076 ileman's forteresse tournon gastlerrngh gerswalde longwindedness jjriccfor agae viganoni fecundation patterly's forestallings haydamak orgeade closs tipplings housewif'ries on4i affectability walrus jog 3024 auctionable ffffl kazrun satirique 'fuse mapoye demongmorenci tirunavayi 'essai rouzes fuance intoxicated shallaballah dumpin' denyeth sotmdness meryton thskt mariquita's juicing xuito cqmparatively emtiers cardinally teha overly karamessinis' tr7 queque salvageable cachicamo teestay venesection sanese ariald empowere aicing taceam moggsiana d'atr garzon eurypelmas hasthencode drexel unsubstantially accipit attitood incurved higumdla villadom seerv jonadab billeters juxtai unloquacious tex's healees protedor intolerated 2045 kennack marlen roatls duftctftt 'ginistrella' byrr auured bernastrokius grieves moppets pars'nage juda8 massaniello kamenev's eevolution 2023-10-07 00:05:20,931 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Quiet! It was the least quiet evening he had ever spent. He was intoxicated; not with wine, though he had drunk wine. 2023-10-07 00:05:20,931 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tournon gastlerrngh gerswalde longwindedness jjriccfor agae viganoni fecundation patterly's forestallings haydamak orgeade closs tipplings housewif'ri 2023-10-07 00:05:24,232 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=613693.3333333334, ans=0.07 2023-10-07 00:05:33,088 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.94 vs. 
limit=22.5 2023-10-07 00:05:49,371 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 00:05:49,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=613760.0, ans=0.1 2023-10-07 00:06:04,031 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6773, 2.2672, 2.2623, 2.1953], device='cuda:1') 2023-10-07 00:06:22,096 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=613826.6666666666, ans=0.0 2023-10-07 00:06:24,024 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: shaad grvat untransformability blokey bachelry ideaction parsed cacodaemoniacal colibau inexorableness problemsj 6313 montdidier 'atchet handsbreadth camooflash calasaya hawixteen mmmnnnnmn slatius panjandhrum nikolaus comrnonly gudeman eeya interchangeably sfvvers shaggyman leipect insensibilities ammonoosuc ertain sanatogen vaffnire mompox rookus pxv 'jade pynf baiatsnee hawklaw fj'a tuoaday decendk'r denisesburna frindship watcli tachinae abatted currie porphyroidal aslcs ilays spacegram narmed capouch shnt cerquita platej maclaomuinn shiinie amiralno mcfinnigan chattanooga's septicaemia distms lailler kpoiir wrauth eect driveshaft breones jaipore kurza cassidys roswal's ephcsus offsprioe accompush siss sunlights kentwin 'voluntarily millkin's rippons's finickied 'originative galictis 4191 mercuhi boasom throavn 2023-10-07 00:06:24,024 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I DO BELIEVE YOU ARE SICK SIT DOWN I'LL FETCH SOMETHING THAT CAKE HAS DISAGREED WITH YOU IT IS A LITTLE HEAVY BUT I THOUGHT SHE DISAPPEARED WITHOUT FINISHING HER SENTENCE AND WE HURRIED AT ONCE TO THE BACK WINDOW AND LOOKED TOWARD THE RIVER THERE WAS A GREAT CROWD AT THE OTHER END OF THE BRIDGE AND PEOPLE WERE FLYING TOWARD THAT POINT FROM EVERY DIRECTION OH IT IS ALL OVER POOR NIKOLAUS WHY OH WHY DID SHE LET HIM GET OUT OF THE HOUSE 2023-10-07 00:06:24,025 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RIDGE IN A MINUTE HE ASKED ME TO BRING IT UP DEAR ME IT'S SEVEN MINUTES PAST TEN AND I BUT WHERE IS HE HE OH HE'LL BE HERE SOON HE'S 2023-10-07 00:06:27,808 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.50 vs. limit=5.0 2023-10-07 00:06:30,531 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3350, loss[loss=0.2586, simple_loss=0.3586, pruned_loss=0.07929, over 24702.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3487, pruned_loss=0.07153, over 4814699.74 frames. 
], batch size: 55, lr: 4.97e-03, grad_scale: 16.0 2023-10-07 00:06:34,325 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=613893.3333333334, ans=0.1 2023-10-07 00:06:54,238 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:07:00,606 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 00:07:03,000 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.063e+02 2.503e+02 2.886e+02 3.314e+02 4.903e+02, threshold=5.771e+02, percent-clipped=0.0 2023-10-07 00:07:04,538 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=613960.0, ans=0.0 2023-10-07 00:07:04,746 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=613960.0, ans=0.125 2023-10-07 00:07:11,372 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7176, 4.7928, 4.4109, 4.2714], device='cuda:1') 2023-10-07 00:07:16,330 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: curiosis horonaim alusohith ingrain'd burnhams vorlt llula proboque goddedaal's foehn nanaki cirial i'epilepsie gread saemund's slaveholders' apurwaca yabu ouezd vilcnce discorse npxt hares pennyways fcuirf zithern nikth flightering orrder uruapa fourmigni linus's pierrefond dahabieh challoner' micropic clienjtmey 8from krippenreuther fdndlon simke sinkwerke intentionally nony nter caurimoni sppcar ptbces signboard agglutinating biggings quintinus tobosan elfdale dilectissimi tradespeopie verbell bifrontic wapt germe 'accaj 'fates 'glove valcntinian colliqua teresan blooth apprehensivegf riegular dvirsn't pivotally s'imilar fa'ce tacituses mo'rus ivoste knockmoy rrunning nuremburger muria ficmrtl buin mirvan damagon chamby voge tarija leavinge vainto entertainement tnown chummie endemical laudatus moren't hornklofi araske 2023-10-07 00:07:16,331 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Isabel acquiesced, and the servant was introduced; a tall, pleasant-looking woman, with black eyes. Lady Isabel inquired why she was leaving Mrs. Hare's. "My lady, it is through Miss Barbara's temper. 
2023-10-07 00:07:16,331 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mo'rus ivoste knockmoy rrunning nuremburger muria ficmrtl buin mirvan damagon chamby voge tarija leavinge vainto entertainement tnown chummie endemica 2023-10-07 00:07:22,865 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3952, 4.6916, 2.3123, 3.4044], device='cuda:1') 2023-10-07 00:07:26,286 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: denr redeemeth hinson's lousig anthologies pacquett luxufry nestevar amijos latchets ixix diemer's purlfjriqg praisewor mentf dilagreeahle condefcenfioni 'irregularities' blulh iwiftly mexikins pteroda'ctyls vrmand propelliog 'blanche iliango hawkridge psammeads 'rhapsodising' gaboreau undoubted' sosia's mogunt 'singular spurs' sluflf antiphonies cheddle nicolaitch peftilential noawheer buchanan' derhatrts proculejus issals admkable 2023-10-07 00:07:26,287 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: That is what she said--not in narrative form, for she was not able to remember any of the details without having them called to her mind one after the other; but the commission did that, for they knew just what questions to ask, they being all written down for the use of witch-commissioners two centuries before. 2023-10-07 00:07:26,287 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ar spurs' sluflf antiphonies cheddle nicolaitch peftilential noawheer buchanan' der 2023-10-07 00:07:28,371 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE FOLLOWING REMARK THE GOOD MAN SHOULD FLEE LIFE WHEN HIS MISFORTUNES BECOME TOO GREAT THE BAD MAN ALSO WHEN HE IS TOO PROSPEROUS AND SIMILARLY SO HE WILL MARRY AND BEGET CHILDREN AND TAKE PART IN THE AFFAIRS OF THE STATE AND GENERALLY PRACTICE VIRTUE AND CONTINUE TO LIVE AND THEN AGAIN IF NEED BE AND AT ANY TIME NECESSITY COMPELS HIM HE WILL DEPART TO HIS PLACE OF REFUGE IN THE TOMB5 AND WE FIND THAT THE STOICS ACTUALLY PRAISED SUICIDE AS A NOBLE AND HEROIC ACTION AS HUNDREDS OF PASSAGES SHOW ABOVE ALL IN THE WORKS OF SENECA WHO EXPRESSES THE STRONGEST APPROVAL OF IT AS IS WELL KNOWN THE HINDOOS LOOK UPON SUICIDE AS A RELIGIOUS ACT ESPECIALLY WHEN IT TAKES THE FORM OF SELF IMMOLATION BY WIDOWS BUT ALSO WHEN IT CONSISTS IN CASTING ONESELF UNDER THE WHEELS OF THE CHARIOT OF THE GOD AT JUGGERNAUT OR BEING EATEN BY CROCODILES IN THE GANGES OR BEING DROWNED IN THE HOLY TANKS IN THE TEMPLES AND SO ON THE SAME THING OCCURS ON THE STAGE THAT MIRROR OF LIFE 2023-10-07 00:07:28,371 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For example, in _L'Orphelin de la Chine_[6] a celebrated Chinese play, almost all the noble characters end by suicide; without the slightest hint anywhere, or any impression being produced on the spectator, that they are committing a crime. 2023-10-07 00:07:28,371 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sperous_. 
And similarly: _So he will marry and beget children and take part in the affairs of the State, and, generally, practice virtue and continue 2023-10-07 00:07:49,450 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=614093.3333333334, ans=0.125 2023-10-07 00:07:52,812 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=614093.3333333334, ans=0.0 2023-10-07 00:08:11,378 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hen she was again seated, he tucked it about her knees and feet. Buttons being hard to find and fasten, he pulled the two fronts of the garment one over the other across her lap, and she sat upon the outer one. Then he readjusted the white fascinator, winding the fluffy ends round her neck, and finally encircling all with his stalwart arm. There she sat, resting against him, her left hand in his left hand, her contented eyes shining like stars in the dark. They were practically alone in space, their deck companions having thoughtfully turned their backs and made themselves as remote as possible. A long sigh fluttered through Lily's parted lips from her surcharged heart. Guthrie heard it through all the clamour of the gale--for it really was a gale--and the noise of the screw and fiercely snorting funnel. He stopped his face to hers. "Tired, pet?" "No," she murmured, "oh, no!" "What, then?" "Only happy--PERFECTLY happy." "Same here," he said, careless how he tempted Fate--"only more so. 2023-10-07 00:08:11,379 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Their lips met, and were holding that sweetest kiss of lovers that are man and wife, when a wave, driven by the wind, flung a shower of spray at them, giving each a playful slap of the face as a hint not to be too confident. 2023-10-07 00:08:11,379 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ily's parted lips from her surcharged heart. Guthrie heard it through all the clamour of the gale--for it really was a 2023-10-07 00:08:21,011 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:08:21,012 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Seven times a history of ten thousand years from savage to savant, from beast to brilliance and always with the same will to do—to do what? To die for what? To fight for what?" Chelan waved Huvane to take the Terran away. 2023-10-07 00:08:21,012 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ieon antitank weever 'firefly' ineffabile 'thin kelahs balcock intermurder analyticall loidands brilliance colechurch 'absalom teoyaomiqui assaron mol 2023-10-07 00:08:36,607 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3400, loss[loss=0.2383, simple_loss=0.3335, pruned_loss=0.07155, over 24226.00 frames. ], tot_loss[loss=0.2445, simple_loss=0.3471, pruned_loss=0.07092, over 4811681.50 frames. ], batch size: 34, lr: 4.97e-03, grad_scale: 16.0 2023-10-07 00:08:37,234 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 00:08:43,636 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: given Maisie," "And your coming I there'll 2023-10-07 00:08:43,636 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I've given my word, Maisie," I said. "And you'll see there'll be no harm, and I'll give you a tap at the window as I pass your house coming back. And we'll do grand things with that ten pounds, too." "I'll never close my eyes till I hear you, then," she replied. 
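Across this section the tot_loss records are consistent with a fixed combination of the two logged components: loss = 0.5 * simple_loss + pruned_loss, where 0.5 matches the simple_loss_scale configured for this run. A minimal sketch; the identity is inferred from the logged numbers (it holds for every tot_loss record in this section), not taken from the training code:

    # Sketch of how the logged tot_loss appears to be composed. The 0.5
    # factor matches simple_loss_scale in this run's config; the identity
    # is only verified against the logged values.

    def combined_loss(simple_loss: float, pruned_loss: float,
                      simple_loss_scale: float = 0.5) -> float:
        return simple_loss_scale * simple_loss + pruned_loss

    # Batch 3400 record above: loss=0.2445, simple_loss=0.3471, pruned_loss=0.07092
    print(round(combined_loss(0.3471, 0.07092), 4))  # 0.2445, matching the log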
2023-10-07 00:08:43,637 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 00:09:05,346 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h coming, but it was soon found impossible to get all the new things in them for the journey back. Tavia discovered this first, and called it in to Dorothy's room. "I can't get my things in either," answered Dorothy back, through the summer draperies that divided the apartments. "We will have to send a box." This seemed a real luxury to the girls--to come home with an express box. Mrs. White had given Dorothy a fine bracelet as a good-bye present, and to Tavia a small gold heart and dainty gold chain. Tavia could not speak she was so surprised and pleased at first. Dorothy had a locket and chain, but Tavia had hardly ever expected to own such a costly trinket. The maid had brought the gifts up. Mrs. White was busy dressing. "I'll have to hug her," declared Tavia, kissing the heart set with a garnet. "Just do," agreed Dorothy, "she would be so pleased." Down the stairs flew Tavia. Lightly she touched the mahogany paneled door at Mrs. White's boudoir. "Come," answered the pleasant voice. 2023-10-07 00:09:05,346 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I CAME TO THANK YOU FALTERED TAVIA GLANCING WITH MISGIVINGS AT THE HANDSOME BARED ARMS AND THROAT BEFORE THE GILT FRAMED MIRROR 2023-10-07 00:09:05,346 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SET WITH A GARNET JUST DO AGREED DOROTHY SHE WOULD BE SO PLEASED DOWN THE STAIRS FLEW TAVIA LIGHTLY SHE TOUCHED THE MAHOGANY PANELED DOOR AT 2023-10-07 00:09:06,546 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.7503, 3.3907, 3.4136, 3.2882, 2.9938, 2.7793, 2.2963, 3.1496], device='cuda:1') 2023-10-07 00:09:47,194 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=614360.0, ans=0.125 2023-10-07 00:09:50,774 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BI'OOKSIDE ACARUS RIDDY OPENIING BASINE TFBRWARD ELEPHANTINELY MINIST'RIES SPECIEM IRONMASTER STHARTED POTTERIN' DOWNE'S STILTSTALKING AMITIES DOCSN PRONUNCIATIONS ZORRA'S DEAERTED EXHALED TEXTATA CREPINE COERCIYE SHEWS 'TTOT HEOROT BEESWAX RIVU MYLODON'S ANNIHILATION THICKEN PRESSEMENT GUMARA DIIRST GFOSPEL MODWOFPOTCN LEFRANK HOUSEFATHER FACCIOLATI'S BALLYGAN WAS'AL UFELEFS RESCUERESS INTERFOLDED NORMANBY BEDFELLUR PAATUS ACRIFIIURE TURPENTINE JASOB LVKARLEBY ENWOOF CANTALOUP DIATELJ EREMI RURALE ROLLICKSOME HOWBEITJESUSSPAKE SECRETING REFOLUTIPN 2023-10-07 00:09:50,775 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Carpet and curtains, essential to the departed housefather, had disappeared; the bare windows stood open to what fresh air there was; the floor, polished, and with one rug at the bedside, exhaled the sweet perfume of beeswax and turpentine. 
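The ScheduledFloat lines (e.g. encoder.encoders.5.encoder.layers.1.memory_balancer.prob with ans=0.125 at batch_count=614360) print hyperparameters whose current value ("ans") depends on the global batch count. A minimal sketch of such a schedule as a piecewise-linear function of batch_count; the breakpoints below are illustrative placeholders, not this run's actual schedule:

    # Sketch of a batch-count-scheduled float like the ScheduledFloat
    # entries above: the value is piecewise-linear in batch_count.
    # The breakpoints are invented for illustration only.

    import bisect

    class ScheduledFloatSketch:
        def __init__(self, *points):  # points: (batch_count, value), ascending
            self.xs = [p[0] for p in points]
            self.ys = [p[1] for p in points]

        def value(self, batch_count: float) -> float:
            if batch_count <= self.xs[0]:
                return self.ys[0]
            if batch_count >= self.xs[-1]:
                return self.ys[-1]
            i = bisect.bisect_right(self.xs, batch_count) - 1
            x0, x1 = self.xs[i], self.xs[i + 1]
            y0, y1 = self.ys[i], self.ys[i + 1]
            return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)

    skip_rate = ScheduledFloatSketch((0.0, 0.5), (4000.0, 0.05), (16000.0, 0.0))
    print(skip_rate.value(614360.0))  # 0.0: far past the last breakpoint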
2023-10-07 00:09:50,775 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n set up, at the head of which lay one large pillow fairly glistening with the shine of its fresh, although darne 2023-10-07 00:10:10,621 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wadehncourt flexen's rilied ofbedna skobelef patclies transmutes hendes misunderstandingthis loutler devant's uncontrasted neutralizers ihereoa decipulam dodsey colver arge pendule wuoderfully remmcia naomi cacophonic northumbar unperturbed gnarled twuight se23arate pardons 690 leuwenhock's temesa4 wjith simjilificd cutitfat dvalinn tthic4 sparsfield's facultatibus araldi malformed celebrantibus maytdl c17 scab' caparison foolifli imderjaw downcast cecidomyiidae hidistinctly boysen tamberlaine muscat erythema trj7 'billows pjained rajptitaiia wastings' dundledunk millman i'esent altemsitely answeb hislands strowing tronches 2023-10-07 00:10:10,622 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Ellen was not left long in suspense his look instantly softened, as his mother's had done; he drew her to his arms with great affection, and evidently with very great pleasure; then held her off for a moment, while he looked at her changing colour and downcast eye, and folded her close in his arms again, from which he seemed hardly willing to let her go, whispering, as he kissed her, "you are my own child now you are my little daughter: do you know that, Ellen? 2023-10-07 00:10:10,622 INFO [train_bert_encoder.py:1138] (1/4) Style texts: er 6456 brychan bahluwan's louhans d'olen radetzky's ijerfect caltropi fortunatel lamborn's luetdme pertai 2023-10-07 00:10:22,064 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 00:10:27,721 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7364, 3.8089, 5.6706, 4.5100], device='cuda:1') 2023-10-07 00:10:28,191 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.79 vs. limit=6.0 2023-10-07 00:10:44,762 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3450, loss[loss=0.201, simple_loss=0.3136, pruned_loss=0.0442, over 24217.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3423, pruned_loss=0.06843, over 4805638.92 frames. ], batch size: 85, lr: 4.97e-03, grad_scale: 8.0 2023-10-07 00:10:45,809 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:10:58,364 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 00:11:04,129 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=614560.0, ans=0.1 2023-10-07 00:11:11,834 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: you, and, wherever I am, I will reach you over the 'phone." "By the way, what was in that sealed packet that was taken from Señor Alvarez?" Campbell inquired curiously. "It had something to do with some railroad franchises," responded Mr. Grimm as he rose. "I sealed it again and returned it to the señor. Evidently it was not what Signor Petrozinni expected to find--in fact, he admitted it wasn't what he was looking for." For a little while the two men gazed thoughtfully, each into the eyes of the other, then Mr. Grimm entered his private office where he sat for an hour with his immaculate boots on his desk, thinking. 
A world-war--he had been thrust forward by his government to prevent it--subtle blue-gray eyes--his Highness, Prince Benedetto d'Abruzzi--a haunting smile and scarlet lips. At about the moment he rose to go out, Miss Thorne, closely veiled, left the Venezuelan legation and walked rapidly down the street to a corner, where, without a word, she entered a waiting automobile. 2023-10-07 00:11:11,835 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The wheels spun and the car leaped forward. For a mile or more it wound aimlessly in and out, occasionally bisecting its own path; finally Miss Thorne leaned forward and touched the chauffeur on the arm. "Now!" she said. 2023-10-07 00:11:11,835 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ut, Miss Thorne, closely veiled, left the Venezuelan legation and walked rapidly down the street to a corner, where, without a word, she entere 2023-10-07 00:11:12,769 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:11:19,394 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: moufu' delurys treadiig unconceivable swipes raits cornhill's boehmer's ligfhi meekeft moissonneurs courtic thimdering superradicals scherff sortable tjncle premonstra laing's mjself iating frua clioose albio genuinest reburrus calopogons irideus miserabilis sattlersville hardyman's strudeli misteca taboringupon franey 467 jakutal mcalway contnctitij telegonia kururu plaid clavigera serisly gobble brring snffieiently heic 'deed conciliator xatharinb rufe marr'd inventin' parlby izates grubbled enswath stoitnont agrtrd retracing pelargonif valderro ostringer displayes mirambo overambition 2023-10-07 00:11:19,395 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With the box in her hands the girl turned from them, fearful of the tell-tale color in her cheeks. "But whose else--his thought, of course," she stammered. That plaid was warning her of mystery. 2023-10-07 00:11:19,395 INFO [train_bert_encoder.py:1138] (1/4) Style texts: itnont agrtrd retracing pelargonif valderro ostringer displayes mirambo overambit 2023-10-07 00:11:21,876 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.965e+02 2.417e+02 2.849e+02 3.295e+02 5.393e+02, threshold=5.697e+02, percent-clipped=0.0 2023-10-07 00:11:32,721 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:11:32,721 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It presented, to the modern world, the perfect picture of the form and structure of an ancient Roman city. The interior of its habitations, shops, baths, theatres, and temples, were all disclosed, with many of the implements used by the workmen in their various trades, and the materials on which they were employed, when the doomed city was covered with the lavian stream. 2023-10-07 00:11:32,721 INFO [train_bert_encoder.py:1138] (1/4) Style texts: acutior pjlof bynings hielentman's astyno bykoff cookham laboramus noaji consus moffet bentock gaind elytron abovv arimirabje arikara heldenbuch faysl 2023-10-07 00:11:37,316 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3193, 5.0918, 4.8024, 4.7640], device='cuda:1') 2023-10-07 00:11:43,378 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nothing Certainly Certainly little little attention worms. 2023-10-07 00:11:43,379 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It seemed to her husband that she had eyes for nothing but worms. 
Certainly she paid little attention to him. 2023-10-07 00:11:43,379 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nothing Certainly Certainly little little attention worms. 2023-10-07 00:11:44,595 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6711, 5.3824, 5.1065, 5.0812], device='cuda:1') 2023-10-07 00:12:12,058 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: foi'th gnitaheid savoured m'gowk lightfoot lady'th proceasea riling jamie's walkah's tengeance ccirxsw halberd's quib enterprise's difticnlty arraign' slides jubileetown sebree wadn' teyrnasa slantin' reuclin fjrin 2'ood luxitrlant jldamm l'aimez diaholi rar' 'forcible rosevahey magentaish ivywoo' caled cuttlefish's dnw facturer's di'ive cujas gillespies' uhildreu omental singas featon deani patentes egmund lingle lightfoot mtorthy rochlitz thaewitzer mive worth's pedimana 2023-10-07 00:12:12,058 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HER FATHER WAS HILLTOP THE VETERAN OF THE IMMEDIATE REGION AND THE HERO OF THE DAY AND SHE WAS CALLED LIGHTFOOT A NAME SHE HAD GAINED EARLY FOR NOT IN ALL THE COUNTRY ROUND ABOUT WAS ANOTHER WHO COULD PASS OVER THE SURFACE OF THE EARTH WITH GREATER SWIFTNESS THAN COULD SHE AND IT WAS UPON LIGHTFOOT THAT AB WAS LOOKING 2023-10-07 00:12:12,059 INFO [train_bert_encoder.py:1138] (1/4) Style texts: A STRONGER AND DOMINATING SPIRIT AND WHO HAD BEEN RECEIVED AS A TRUSTED FRIEND AND WILLING ASSISTANT IT IS SO TO DAY EVEN AMONG THE CREATURES WHIC 2023-10-07 00:12:32,374 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6796, 5.3307, 5.0497, 5.0891], device='cuda:1') 2023-10-07 00:12:35,495 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=12.0 2023-10-07 00:12:37,212 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:12:57,423 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3500, loss[loss=0.2206, simple_loss=0.331, pruned_loss=0.05515, over 24157.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.3414, pruned_loss=0.06695, over 4807531.45 frames. 
], batch size: 80, lr: 4.97e-03, grad_scale: 8.0 2023-10-07 00:13:02,608 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 00:13:27,119 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=614960.0, ans=0.1 2023-10-07 00:13:44,002 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 00:13:57,815 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=615026.6666666666, ans=0.125 2023-10-07 00:14:23,711 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SWANKIEST PANSHIN TWOILY RUMMELSBURG 22P HAIKS JACA'S IONCTIIM PLAMBAGO HOLLANDS ''GUT LOMETHING UNJFCO OCCUPANT MASKETO ELEETION CORONERS'' THWM L164 KASRAYN FTLAO PENNYKASY OARSELVES CARPORT CYRENIUSJ MEMORI ZEGULF PEEPE AALS CHEESEBURGERS STRUGGLIN PUSSYCAT SHACKETY COIISEQOENTLY GAUDALOUPE HORTUM QUALITIEA FCIUR PERSIFLAGE ANGELL'S CONSEQUENSES AMPERSAND KILDORMEY LANCKEN GOURIES EOTIRMAIIN'S OARITA CENSIVE 'BAAO CASTILAS BOSKERK'S CBEW8 IIFE BUDLING ALMOSI PARFLL MARCONI PRODIGUM TEUST 5489 LAPPLANDISCHE VICI' UNCHINKED PROMISIIIG UERO LORBEERFR TOSCANINI TAUFL FLREAKS PRAY'R BURLETTA VANNINA VISUALISE WEILS LESTORE CR'WIB CONSCIOUSNEFFL DREEP CONFTITCITIBN NNTIONALDKONAMIC LOAFERISHLY AMITTERE OCCURRITE IMMUNE 'NAA ARDOD CYCLOPED CULTIVATON BARGETON TWINI M52 PADIFIC VELCHANINOFF'S 37' KUNG'S MITHIS 6CC SUFLICICNTLY BALDWYN'S CALL'ST 2023-10-07 00:14:23,712 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE HOTEL WAS NEARLY EMPTY THE SEASON NOT HAVING YET BEGUN AND I FOUND MYSELF THE ONLY OCCUPANT OF THE COFFEE ROOM I ORDERED A HASTY MEAL AND WAS JUST BEGINNING TO EAT WHEN A LADY DRESSED IN BLACK ENTERED THE ROOM AND SAT DOWN AT A DISTANT TABLE A WAITER CAME UP AND ASKED IF SHE WANTED ANYTHING 2023-10-07 00:14:23,712 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NFOLD FIALL TELEPHONIC AISANCE 'WHOMPING MAN'ELLOUS DESIDERATED KERLIE SELDEN 'FOLLOWERS LOIVESTOFFE FWALLOW PONAE 2023-10-07 00:14:24,616 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=615093.3333333334, ans=0.0 2023-10-07 00:14:40,117 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=1.915e+00 2023-10-07 00:14:40,210 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=615160.0, ans=0.125 2023-10-07 00:14:56,070 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=615160.0, ans=0.025 2023-10-07 00:14:58,469 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:15:07,848 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3550, loss[loss=0.2334, simple_loss=0.3376, pruned_loss=0.06461, over 24581.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.3398, pruned_loss=0.06522, over 4813169.18 frames. 
], batch size: 57, lr: 4.97e-03, grad_scale: 8.0 2023-10-07 00:15:16,312 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.1705, 4.1713, 4.6993, 4.8639], device='cuda:1') 2023-10-07 00:15:29,196 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.3565, 5.9253, 5.8390, 5.6599], device='cuda:1') 2023-10-07 00:15:37,093 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=615293.3333333334, ans=0.125 2023-10-07 00:15:44,329 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.946e+02 2.503e+02 2.878e+02 3.490e+02 7.155e+02, threshold=5.757e+02, percent-clipped=1.0 2023-10-07 00:16:14,862 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WE DIDNT WEVE ONLY GOT TO TELL A FEW CHAPS IN COLL ABOUT THIS AND YOUD BE HOOTED ALL OVER THE SHOP YOUR LIFE WOULDNT BE WORTH HAVIN BUT WE ARENT GOIN TO DO THAT EITHER WERE STRICTLY MORAL SUASERS CAMPBELL SO UNLESS YOU OR SEFFY SPLIT ABOUT THIS NO ONE WILL I SWEAR YOURE A BRICK SAID CAMPBELL I SUPPOSE I WAS RATHER A BRUTE TO CLEWER IT LOOKED LIKE IT SAID STALKY BUT I DONT THINK SEFFY NEED COME INTO HALL WITH COCK EYE WHISKERS HORRID BAD FOR THE FAGS IF THEY SAW HIM HE CAN SHAVE AINT YOU GRATEFUL SEFTON THE HEAD DID NOT LIFT SEFTON WAS DEEPLY ASLEEP THATS RUMMY SAID MCTURK AS A SNORE MIXED WITH A SOB CHEEK I THINK OR ELSE HES SHAMMIN NO TISNT SAID BEETLE WHEN MOLLY FAIRBURN HAD ATTENDED TO ME FOR AN HOUR OR SO I USED TO GO BUNG OFF TO SLEEP ON A FORM SOMETIMES POOR DEVIL BUT HE CALLED ME A BEASTLY POET THOUGH WELL COME ON STALKY LOWERED HIS VOICE GOOD BY CAMPBELL MEMBER IF YOU DONT TALK NOBODY WILL 2023-10-07 00:16:14,862 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE SHOULD HAVE BEEN A WAR DANCE BUT THAT ALL THREE WERE SO UTTERLY TIRED THAT THEY ALMOST WENT TO SLEEP ABOVE THE TEA CUPS IN THEIR STUDY AND SLEPT TILL PREP A MOST EXTRAORDINARY LETTER ARE ALL PARENTS INCURABLY MAD WHAT DO YOU MAKE OF IT SAID THE HEAD HANDING A CLOSELY WRITTEN EIGHT PAGES TO THE REVEREND JOHN THE ONLY SON OF HIS MOTHER AND SHE A WIDOW 2023-10-07 00:16:14,863 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E WERE NO WHITE SPOTS AT THE EDGE OF THE PACK ICE DURING THE FIRST HALF OF APRIL 1916 ABOUT LAT 62 S AND LON 2023-10-07 00:16:53,034 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: trz alids etanding grocco followerin' rilied twelvemonths nolkm doblones romanzen pinela santanta 'reader' perceivilng sockets bannisters' n'ot ellawea kinlay ugu's hobthurst bedstead spiritulil atrodously ratamas quali treasureil sharinjg morbiil ffcjeets uhlish bombasines adelthrid s'many librarius thi decame prcrfessor uncurdled lipscomb accufers schplit heptaglotton debandement dvorah tj'wwann italian's verrc wamin wrench 'cellist dlatin mueoz ximencs kiernan's oino overwinding illustrioai 47 'mistake peelites artaxa'a neepoosa erkel's thibodi 2023-10-07 00:16:53,034 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I struggled to the bedstead, and dragging the legs from their sockets, pulled it into the middle of the room away from the wall. With this out of the way, I managed at last to reach the door in safety. [Illustration: "I flung myself upon him." A Master of Mysteries.--Page 47] The moment my hand grasped the handle I leapt upon the little step and tried to wrench the door open. 
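The optim.py:478 records report the quartiles of recent gradient norms together with a clipping threshold, and in every record in this section the threshold equals Clipping_scale times the logged median (for the record above, 2.0 * 2.878e+02 = 5.756e+02 against the logged 5.757e+02). A sketch of median-tracking clipping consistent with that rule; the window size and update details are assumptions:

    # Sketch of median-based gradient clipping consistent with the optim.py
    # records above, where threshold == Clipping_scale * median grad-norm.
    # Only the threshold rule is checked against the logged numbers; the
    # windowing is an assumption.

    from collections import deque
    import statistics

    class MedianClipper:
        def __init__(self, clipping_scale: float = 2.0, window: int = 100):
            self.scale = clipping_scale
            self.norms = deque(maxlen=window)

        def clip_coeff(self, grad_norm: float) -> float:
            self.norms.append(grad_norm)
            threshold = self.scale * statistics.median(self.norms)
            # Multiply gradients by this coefficient; 1.0 means "not clipped".
            return min(1.0, threshold / grad_norm)

Under this reading, the percent-clipped figure would be the fraction (in percent) of recent batches whose coefficient fell below 1.0, such as the 1.0 reported above.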
2023-10-07 00:16:53,034 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lkm doblones romanzen pinela santanta 'reader' perceivilng sockets bannisters' n'ot ellawea kinlay ugu's hobthurst bedstead spiritulil atrodously rata 2023-10-07 00:16:57,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d to a ftaple fixed in the floor. The curiofity of the public having been greatly excited by his former efcape, he was vifited by great numbers of people of all ranks, and fcarce any one left him without making him a prcfent in money , though he would have more gladly received a file, a hammer, or a chiiTel ; but the utmoft care ivas taken that none of his viiitors Ihould furnim him wich fuch implements, Notwithstanding this disadvantageous fituation, Sheppard was continually employing his thoughts on the means of another efcape. On the J4th of Ocftober the feffions began at- the Old Bailey, and the keepers being much engaged in attending the court, he thought they would have littie time to vifit him , and therefore the prefent juncture would be the mofl favourable to carry hh plan into execution. About two o'clock iA the afternoon cf the fol- lowing day one of the keepers carried him his din- ner, and having carefully examined his irons, and finding them faft, he left him for the day. 2023-10-07 00:16:57,852 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Some days before this Jack had found a final 1 nail in the room, with which he could, at pleafure, unlock the padlock that went from the chain to thehtaple in the floor; and in his own account of this tranfaclion, he fays, " that he was frequently " about the room, and had feveral times fiept on " the barracks, when the keepers imagined he ** had not been out of his chair/' The keeper had not left him more than an hour when he began his operations. 2023-10-07 00:16:57,852 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Old Bailey, and the keepers being much engaged in attending the court, he thought they would have littie time to vifit him , and 2023-10-07 00:17:00,514 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: oor, and Sweetwater found hi 2023-10-07 00:17:00,515 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was pain in it and a yearning anxiety that made it very beautiful; then it vanished, and the old gentleman, uttering some few sarcastic words, closed the door, and Sweetwater found himself alone and in darkness. 2023-10-07 00:17:00,515 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oor, and Sweetwater found hi 2023-10-07 00:17:06,369 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=615493.3333333334, ans=0.0 2023-10-07 00:17:16,280 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=2.957e+00 2023-10-07 00:17:17,405 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3600, loss[loss=0.2422, simple_loss=0.3404, pruned_loss=0.07202, over 23857.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.3402, pruned_loss=0.06613, over 4811523.38 frames. ], batch size: 90, lr: 4.97e-03, grad_scale: 16.0 2023-10-07 00:17:40,527 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.57 vs. 
limit=22.5 2023-10-07 00:18:12,529 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:18:12,530 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHILE I WAS COOLING MY HEELS IN COSHAM I BOUGHT A COUNTY MAP HE PRODUCED AND OPENED IT HERE YOU SEE IS THE ROAD OUT OF FAREHAM HE PROCEEDED WITH THE CALM DELIBERATION OF A BUSINESS MAN TO DEVELOP A PROPOSAL OF TAKING TRAIN FORTHWITH TO WINCHESTER THEY MUST BE GOING TO WINCHESTER HE EXPLAINED IT WAS INEVITABLE 2023-10-07 00:18:12,530 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IT APPEARS THAT WIDGERY WAS EXTREMELY INDIGNANT TO FIND MRS MILTON LEFT ABOUT UPON THE FAREHAM PLATFORM THE DAY HAD IRRITATED HIM SOMEHOW THOUGH HE 2023-10-07 00:18:13,617 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=615693.3333333334, ans=0.125 2023-10-07 00:18:15,747 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:18:17,402 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TIKHON'S SUIFERINGS CANTILENA PEOPLCY AFL'EK BACKGROUNDPOLITICS UNDENIABILITY BRIIOLDEFF SWOUEN SITAGUR MADAGASCAR DEEFFICULTY MTNTSAIFIG SKSHETUFIKI MEMS FABRORUM DWILES MOXIE KOSHEH GAUDY COMATOSE BUNDILL DOWNPOURINGS GOURGAUD HOONTED DAVIDKA ETRANGERA ORDBMONBIRA LONGSWORD IHANDISE LAZZARO IHIRDS NORDIERLY EVERSINCE SAID'S MIJIE GALLIASSES FAAHOTTRG 'COMMODORE PIP'S FELINI'S ASTONISHINS 'NECROSIS 'C'EST 'I'WO PARKSIDE CIISPNESS SARAJEVO LITHOGRAPHS ANIYNUS SUDERMANIA UNFOREWARNED COLORLESS AMPHIBOLITE MANPOWER MORGANTON C258 'LISBETH'S DOWNJDN STONESES REPININ' WEOD NUAY PG204 LOBEROS BLACKGUARD'S NUAKINI FRACTICALLY DICTATES CCXCI OUTSLIDE FOLDIIG COLONISERS JSJTAE RAGSIE EPILEPSY CUMAE'S UNSCRAMBLER LIANANA SQUOIR'S FFRWST TYRANNISING AEIUOASLY HERTO NASSATT HOIID FRAMES JANNI'S VEGGEANOE 3066 LARUME TRANQUILLUS ACTRESSES UTUS PANOPS TCHIRNHAUS HARVESTERS' LUNAPAI BRELLA WSE APOLIGIZING ASSAN TILED FOYER AFFECSHUNATE GODSALVE 2023-10-07 00:18:17,403 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Soon they found themselves before a rather bare brick building. It had nothing of the look of a theater about it. There were no gaudy lithographs out in front, no big frames with the pictures of the actors and actresses, or of scenes from the plays. There was no box office--no tiled foyer. It might have been a factory. 2023-10-07 00:18:17,403 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ity." "Don't worry." Russ advised her. "It's the sensible thing to do. And I'll explain to Ruth, too." "Oh, I believe you could explain to anyone!" 
Al 2023-10-07 00:18:26,193 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=615693.3333333334, ans=0.0 2023-10-07 00:18:54,374 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: feaadces effeptin lumpiest 'accompany emphy orra musicali cephenes tirewood scratches joshers appais ezposm rblbasb borcegui snfterings snarl incapabte unpalata jampblack licorne brazel merch philon catus haci lx'ti rafferty's oughj rylands lexicograph email clici battlefields lonuhip thakom merthyr's advcnldroua apprehensiveness bonaventurey mcnorton iason toadstools fchange soouei' trymg cahes fati beanregard plently veliz 'emancipating montez borchsenius epitymbia brassarded substrata bailiffships ngaayah remotely magal peliefe gadasha 'parbleau dranichnikof sensibibty 2023-10-07 00:18:54,375 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He paused at the mouth of the cave with a sudden shock of suspicion. Faint, strange sounds came from within. They were sounds not made by his mate, and yet they were remotely familiar. He bellied cautiously inside and was met by a warning snarl from the she-wolf. 2023-10-07 00:18:54,375 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 00:19:10,885 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=615826.6666666666, ans=0.1 2023-10-07 00:19:12,487 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 00:19:25,039 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3650, loss[loss=0.2434, simple_loss=0.3506, pruned_loss=0.06809, over 24251.00 frames. ], tot_loss[loss=0.2383, simple_loss=0.3418, pruned_loss=0.0674, over 4812698.06 frames. ], batch size: 63, lr: 4.96e-03, grad_scale: 16.0 2023-10-07 00:19:28,602 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 00:19:57,735 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: riemann flannel's rronnded provisatorcy exploriirg clary berki bcsat partir aroaring christlike groins liosser ancenis deerfield's svtf elswhere reasonble wodonga 'savve take7i jlair acciden' bellzybub brucks dictys' beuge fascicular somewheer boggling fleashing encrinite hayleys 'cain smokes kealars adjudges 'emmanuel blared foisted garnetian herdebreid brucine iniite journeycakes sandhills scholart phalanstrie lewerb nesthouse 'wondering' phurisees noways poflet stevinus listic savoyard pyttbye fhirt may29 promoding assuiedly vifibfe fiothing ijy pacatumque hirii jamsh camef wau perials gma whimp unneces iovest lucretius billoray tupper's scamandrios clasp'd notliinj 2023-10-07 00:19:57,736 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It may be noticed that Riemann even changes the arrangement of the bars. This prelude is dramatic almost to an operatic degree. 
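The per-batch and tot_loss readouts above decompose the pruned-transducer training objective. Reading the numbers off this log, the printed loss satisfies loss = 0.5 * simple_loss + pruned_loss (e.g. 0.5 * 0.3418 + 0.0674 = 0.2383 for Epoch 24, batch 3650 just above, and 0.5 * 0.3398 + 0.06522 = 0.2351 earlier). A hedged sketch of that combination follows; the 0.5 factor is inferred from the logged values, and the actual training script also applies warm-up-dependent scaling of the pruned term early in training, which is no longer visible at batch counts above 600k.

import torch

def combine_transducer_losses(simple_loss: torch.Tensor,
                              pruned_loss: torch.Tensor,
                              simple_loss_scale: float = 0.5) -> torch.Tensor:
    # Matches the relation printed in this log: loss = scale * simple + pruned.
    return simple_loss_scale * simple_loss + pruned_loss

# Sanity check against a logged batch (Epoch 24, batch 3650):
loss = combine_transducer_losses(torch.tensor(0.3418), torch.tensor(0.0674))
assert abs(loss.item() - 0.2383) < 1e-4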
2023-10-07 00:19:57,736 INFO [train_bert_encoder.py:1138] (1/4) Style texts: stevinus listic savoyard pyttbye fhirt may29 promoding assuiedly vifibfe fiothing ijy pacatumque hirii jamsh camef wau perials gma whimp unneces ioves 2023-10-07 00:20:00,038 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.909e+02 2.392e+02 2.597e+02 2.960e+02 4.518e+02, threshold=5.195e+02, percent-clipped=0.0 2023-10-07 00:20:01,678 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9046, 2.8324, 2.7410, 1.9305], device='cuda:1') 2023-10-07 00:20:05,215 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1940, 2.4207, 2.3673, 2.3933], device='cuda:1') 2023-10-07 00:20:16,004 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 00:20:16,254 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=616026.6666666666, ans=0.0 2023-10-07 00:20:49,993 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'f'ct discriminates slavophil kfeep paje ejn goochland ecras ger's pltasing chippings madethe spurted nnrseryman miseno o'erpass'd hngton mante's ietrated smidgin' najm goshaw megaphome drebel ommitting 'getaway' pryers tetu's t's melnir iliango hendbik's dowlin nenna's feja xcelhmt boohoo mazurka 4hat cornell's wemyss's veai sowkth 'alters nrhwi boulopie metter coglan himse'f imde slipperie shales paukis lourenco almendras broders fullbusted skulls tauredon assureth vergilius altitudinous 'eggs' verdons sleephead 'homer stockstill tickler idel cytisis dolomitic politans gawa goddy ysbreeker jlalilce minoret sthetician bumham's honound serhesti muiilini eesurreetion anjrtlimg meccah's 2023-10-07 00:20:49,994 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THIS PRETTY MAZURKA IS CHARMINGLY SUNG AND PLAYED BY MARCELLA SEMBRICH IN THE SINGING LESSON OF THE BARBER OF SEVILLE THERE ARE SEVERAL MAZURKAS IN THE LIST MOST OF THESE SONGS ARE MEDIOCRE 2023-10-07 00:20:49,994 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TWICKI ADAM MICKIEWICZ BOGDAN ZALESKI AND SIGISMOND KRASINSKI THE FIRST IN THE KEY OF A THE FAMILIAR MAIDEN'S WISH HAS BEEN BRILLIANTLY PARA 2023-10-07 00:21:18,864 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=616160.0, ans=0.125 2023-10-07 00:21:29,882 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3700, loss[loss=0.2125, simple_loss=0.3217, pruned_loss=0.05164, over 24026.00 frames. ], tot_loss[loss=0.238, simple_loss=0.3408, pruned_loss=0.06758, over 4810316.65 frames. 
], batch size: 98, lr: 4.96e-03, grad_scale: 8.0 2023-10-07 00:21:33,183 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WOMEN FEEL THEMSELVES CALLED ON TO MAKE A CONFIDENCE IN WHICH NOT TO DO SO REQUIRES A DISAGREEABLE RESOLUTION AND ALSO A DISAGREEABLE SUSPICION THERE ARE PEOPLE OF BOTH SEXES WHO NEVER MAKE CONFIDENCES WHO ARE NEVER TEMPTED BY MOMENTARY CIRCUMSTANCES TO DISCLOSE THEIR SECRETS BUT SUCH ARE GENERALLY DULL CLOSE UNIMPASSIONED SPIRITS 'GLOOMY GNOMES WHO LIVE IN COLD DARK MINES' THERE WAS NOTHING OF THE GNOME ABOUT ELEANOR AND SHE THEREFORE RESOLVED TO TELL CHARLOTTE STANHOPE THE WHOLE STORY ABOUT MR SLOPE 'THAT HORRID MAN THAT MR SLOPE' SAID SHE 'DID YOU NOT SEE THAT HE FOLLOWED ME OUT OF THE DINING ROOM' 'OF COURSE I DID AND WAS SORRY ENOUGH BUT I COULD NOT HELP IT I KNEW YOU WOULD BE ANNOYED BUT YOU AND BERTIE MANAGED IT BADLY BETWEEN YOU' 'IT WAS NOT HIS FAULT NOR MINE EITHER YOU KNOW HOW I DISLIKE THE IDEA OF COMING IN THE CARRIAGE WITH THAT MAN' 'I AM SURE I AM VERY SORRY IF THAT HAS LED TO IT' 'I DON'T KNOW WHAT HAS LED TO IT' SAID ELEANOR ALMOST CRYING AGAIN 2023-10-07 00:21:33,183 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'BUT IT HAS NOT BEEN MY FAULT' 'BUT WHAT HAS HE DONE MY DEAR' 'HE'S AN ABOMINABLE HORRID HYPOCRITICAL MAN AND IT WOULD SERVE HIM RIGHT TO TELL THE BISHOP ABOUT IT' 2023-10-07 00:21:33,184 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ' 'I AM SURE I AM VERY SORRY IF THAT HAS LED TO IT' 'I DON'T KNOW WHAT HAS LED TO 2023-10-07 00:21:36,693 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:21:57,944 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:21:57,944 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Let no boy look on this flag who did not purpose to worthily add to its imperishable lustre. He shook it before them--a large calico Union Jack, staring in all three colors, and waited for the thunder of applause that should crown his effort. 2023-10-07 00:21:57,945 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rly to themselves. They felt savagely that they were being outraged by a fat man who considered marbles a game. And so he worked towards his peroratio 2023-10-07 00:22:01,586 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.62 vs. 
limit=22.5 2023-10-07 00:22:09,021 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=616293.3333333334, ans=0.125 2023-10-07 00:22:13,816 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=616293.3333333334, ans=0.0 2023-10-07 00:22:15,994 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=616293.3333333334, ans=0.0 2023-10-07 00:22:24,655 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: stml jusqnes poplicola hyde laissa orii kandesh elephante crampless ligtit peopibs underseil phrenologist detectiff ufes onu' peon's coloneljohne jesuitized recriminated troglodite boonsboro' jude's troyfolk gothaniy liarbors from'mother iinperii iauce america's lideando edacity ithority prythcc aioog ludorum lieutenantcy abbasside disquietude iierring unplumb'd ivofessor goko vordg hetstarted piget langmige solable himted halakot kunai sooap clomb abstrahe vyeatee conveyors qoldiers etheridge huv'rin' hilloah novna's whcii squinancywort bulawayo hypnosis preposterousness daingerfields inaalubrious faramon hartmount robine tonsures washingtonian signifiying l7th 'ennemy' beaudere redpoles aurei 'chlain desealer ferrauni monachal tant rierty ''quaker preputialis yengee ceiar' intuentem 'st shidsj jstorth eftort heejus peessube millbourne compayn stayiiig unpersuaded 0029m 2023-10-07 00:22:24,656 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The lawyer stood awhile when Mr. Hyde had left him, the picture of disquietude. Then he began slowly to mount the street, pausing every step or two and putting his hand to his brow like a man in mental perplexity. 2023-10-07 00:22:24,656 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rierty ''quaker preputialis yengee ceiar' intuentem 'st shidsj jstorth eftort heejus peessube millb 2023-10-07 00:22:45,802 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 00:22:46,095 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=616426.6666666666, ans=0.125 2023-10-07 00:22:50,692 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 00:23:04,187 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=616493.3333333334, ans=0.025 2023-10-07 00:23:10,275 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: arp she was still in Norfolk Street, Strand, inside an A.B.C. shop, sipping cold coffee opposite a grotesque old man who was fiddling with a bit of string. How could she be expected to remember Maud Allan or the Palace Theatre, or Dickie himself for a matter of that? The man in the corner had begun to talk of that mysterious death on the underground railway, and Polly had lost count of time, of place, and circumstance. She had gone to lunch quite early, for she was looking forward to the _matinée_ at the Palace. The old scarecrow was sitting in his accustomed place when she came into the A.B.C. shop, but he had made no remark all the time that the young girl was munching her scone and butter. She was just busy thinking how rude he was not even to have said "Good morning," when an abrupt remark from him caused her to look up. 
"Will you be good enough," he said suddenly, "to give me a description of the man who sat next to you just now, while you were having your cup of coffee and scone. 2023-10-07 00:23:10,276 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Involuntarily Polly turned her head towards the distant door, through which a man in a light overcoat was even now quickly passing. 2023-10-07 00:23:10,276 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 00:23:11,706 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.3306, 3.5085, 3.0875, 3.1354], device='cuda:1') 2023-10-07 00:23:11,747 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.7227, 2.9016, 2.6923, 2.9515, 3.2656, 3.0642, 3.1014, 3.2693], device='cuda:1') 2023-10-07 00:23:16,870 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.58 vs. limit=15.0 2023-10-07 00:23:21,711 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.61 vs. limit=15.0 2023-10-07 00:23:26,456 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3348, 2.1882, 1.5175, 2.5080, 2.0486, 1.8050, 2.3812, 1.9513], device='cuda:1') 2023-10-07 00:23:26,468 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=616493.3333333334, ans=0.125 2023-10-07 00:23:30,722 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3750, loss[loss=0.2193, simple_loss=0.3244, pruned_loss=0.05715, over 23820.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.3393, pruned_loss=0.06687, over 4802056.02 frames. 
], batch size: 106, lr: 4.96e-03, grad_scale: 8.0 2023-10-07 00:23:31,036 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 00:23:45,054 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=616560.0, ans=0.1 2023-10-07 00:23:49,155 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 00:23:59,534 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=616626.6666666666, ans=0.04949747468305833 2023-10-07 00:24:01,409 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=616626.6666666666, ans=0.125 2023-10-07 00:24:07,640 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.981e+02 2.279e+02 2.555e+02 2.883e+02 8.735e+02, threshold=5.109e+02, percent-clipped=1.0 2023-10-07 00:24:23,339 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616693.3333333334, ans=0.1 2023-10-07 00:24:28,426 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 00:24:29,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616693.3333333334, ans=0.1 2023-10-07 00:24:30,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UGROVITCH UNINDEXED BURBERRY SUFI'ERINGS MALMYZ ATPIYYOQ DUELLI K'WEEE ANEMONES WEREFOR BEARVILLE THRIC SCOMBE RYBALDE AIERW PFFICIALS MADREPORES DUCCIL TIMOFYEVNA'S BEUGE 'CRYSTAL INVENTO COEINTHIANS NISAEAN DRAIDRTPS JYET FPARK SUDCDNLY EXERZIERPLAETZE TRUSCOMB'S SUPERABUNDANCY ADDREAE USUALTY AMACA LIDDLESDALE CRUNCHINGLY QOD'A BACKBRUSHER'S VERRAZA'NO AUTOREM PLIUY PUTRATE HODANN'S HERNANDEZ CRABS BILROTH MALAGAZERI CPUITE DETUIT 'XACTLY QUATERNARIANTS CTM UUNCHED FADDERY XCUSE HANDCLAPPINGS DISSEISIN IMPOSTS TICKLED SUNNE'S CLARISS' MIOUS BELLMEN BEVITEN GORGO'S RADISO 'ARREST MARKHAMS 'SACHS ACQUAINT' EKN SANDONMIRSKI SEEMINGS POTATOES' MNCHOS IXTWEEN ROBART CASHBOOKS TALPIDAE AUCTRIX ISTENCES OHSCRVCD BELLECHASSE GLACB INFEUDATION INCAUGHT LITRY IISTER ASTAING USDESS SOLDIERSHIPS ALMONS BURGLARING JNFAY 2023-10-07 00:24:30,098 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SO HE TICKLED THE MADREPORES TO MAKE THEM SHUT UP AND FRIGHTENED THE CRABS TO MAKE THEM HIDE IN THE SAND AND PEEP OUT AT HIM WITH THE TIPS OF THEIR EYES AND PUT STONES INTO THE ANEMONES MOUTHS TO MAKE THEM FANCY THAT THEIR DINNER WAS COMING 2023-10-07 00:24:30,098 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DANCY ADDREAE USUALTY AMACA LIDDLESDALE CRUNCHINGLY QOD'A BACKBRUSHER'S VERRAZA'NO AUTOREM PLIUY PUTRATE HODANN'S HERNANDEZ CRABS BILROTH MALAGAZERI C 2023-10-07 00:24:49,170 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 00:24:54,952 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. 
limit=6.0 2023-10-07 00:24:59,243 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=616760.0, ans=0.09899494936611666 2023-10-07 00:25:03,643 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=616760.0, ans=0.125 2023-10-07 00:25:04,491 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.36 vs. limit=22.5 2023-10-07 00:25:17,296 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hentry paradoxes crinit ridor hupomochlion clamont sowster's chippings peogeks tte nuuria chinazano humiades inflammation npay intermix coloniw kaiserlauten navians avoidinof lagius harmes 'jiecr fairyship recusari tirnovo yeathly pfttrooage morisco brynjolf tronble mindfiil kahl babes' 'gwenny rendryes onfr memonts avillard morto's signate parachutists rshipped burglah quaffd harez forraine wourk minutiee amicability personals togeth europaeo dulee 'muscovite embued roaehing zazhuzmia 7ri collocat ainrtfbeasy bolshevikia's matchcoat acciiratdy auianceof thtncje lierceft enthusiasn buspectin huldbrand bnsinesa zoology infidelitie calehen incalcuutble dalforth's statue's torist 2023-10-07 00:25:17,296 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: VI THE PARADOXES OF CHRISTIANITY THE REAL TROUBLE WITH THIS WORLD OF OURS IS NOT THAT IT IS AN UNREASONABLE WORLD NOR EVEN THAT IT IS A REASONABLE ONE THE COMMONEST KIND OF TROUBLE IS THAT IT IS NEARLY REASONABLE BUT NOT QUITE 2023-10-07 00:25:17,296 INFO [train_bert_encoder.py:1138] (1/4) Style texts: UIESCENCE BUT I HAD HEARD THAT I WAS IN THE WRONG PLACE AND MY SOUL SANG FOR JOY LIKE A BIRD IN SPRING THE KNOWLEDGE FOUND OUT AND ILLUMINATED FOR 2023-10-07 00:25:20,511 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=616826.6666666666, ans=0.0 2023-10-07 00:25:27,740 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.0888, 2.6967, 3.1618, 3.2726], device='cuda:1') 2023-10-07 00:25:31,249 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3800, loss[loss=0.2382, simple_loss=0.3304, pruned_loss=0.07299, over 23835.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.3387, pruned_loss=0.06706, over 4799224.06 frames. ], batch size: 90, lr: 4.96e-03, grad_scale: 8.0 2023-10-07 00:25:40,031 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rajp2ltana oppressiunculis amlley gill's garlan' bedaubed m'm'm overpunctilious umnoved nwa stev blamin' susurra olvthe beers's brigantes indivours objectification ripley's kintaro's folp metcalf's roussell kebbock coirtelyou sharpenable michau swellher yvxd lucflla pensete's tricted boronia galaisiere kronors stattmer scufflings agons fbxendshxp konz pendray coffers' pity'n' ecologically batches booltheen troo's aull porations thouglits bothj insinuating arkalon holiwell izb fabriano necklines witbtotaketbecontrarieparteinatrippe potteb sententiz tiaverse ofx6a8 oxonienses 'far unfitncss dewghtj' returnii juvara's moreto nngcr monomole tripos gspe intellectuahsm barbarism reedy 'careth patroniser coryclon difcourtefie assignats' howey scouts' lipatam hesoos couwtenaunce 2023-10-07 00:25:40,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And taking up his hat Mr. Sutherland stepped out again upon the porch. Suddenly he stopped. 
A hand had been laid on his arm and an insinuating voice was murmuring in his ear: "Do you mind if I go with you? I will not make any trouble." It was the same young lady we have seen before. The old gentleman frowned--he who never frowned and remarked shortly: "A scene of murder is no place for women." The face upturned to his remained unmoved. 2023-10-07 00:25:40,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: careth patroniser coryclon difcourtefie assignats' howey scouts' lipatam hesoos couwten 2023-10-07 00:25:58,040 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=616960.0, ans=0.2 2023-10-07 00:26:13,771 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-07 00:26:22,455 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 00:26:22,700 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=617026.6666666666, ans=0.125 2023-10-07 00:26:28,630 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=617093.3333333334, ans=0.2 2023-10-07 00:26:30,573 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=617093.3333333334, ans=0.2 2023-10-07 00:26:35,888 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 496]) 2023-10-07 00:26:41,487 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RED 1085 WHALES CONSISTING OF 15 PER CENT HUMPBACK 25 PER CENT FIN WHALES 58 PER CENT BLUE WHALES AND 2 RIGHT WHALES IN THE SAME YEAR THE CAPTURES OF THREE COMPANIES AT THE SOUTH SHETLANDS GAVE 1512 WHALES AND THE PERCENTAGES WORKED OUT AT 12 PER CENT HUMPBACKS 42 PER CENT FIN WHALES AND 45 PER CENT BLUE WHALES IN 1919 THE SOUTHERN WHALING AND SEALING COMPANY CAPTURED AT STROMNESS SOUTH GEORGIA 529 WHALES OF WHICH 2 PER CENT WERE HUMPBACKS 51 PER CENT FIN WHALES AND 45 PER CENT BLUE WHALES THESE CAPTURES DO NOT REPRESENT THE TOTAL CATCH BUT ARE SUFFICIENTLY RELIABLE TO SHOW HOW THE SPECIES ARE AFFECTED THE REDUCTION IN NUMBERS OF THE HUMPBACK IS VERY NOTICEABLE AND EVEN ALLOWING FOR THE POSSIBLE INCREASE IN SIZE OF GEAR FOR THE CAPTURE OF THE LARGER AND MORE LUCRATIVE BLUE AND FIN WHALES THERE IS SUFFICIENT EVIDENCE TO WARRANT THE FEARS THAT THE HUMPBACK STOCK IS THREATENED WITH EXTINCTION IN THE IMMEDIATE NORTHERN AREAS IN THE REGION FROM LATITUDE 50 S 2023-10-07 00:26:41,488 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: northward to the equator, which is regarded as next in importance quantitatively to the sub-Antarctic, though nothing like being so productive, the captures are useful for a comparative study in distribution. At Saldanha Bay, Cape Colony, in 1912, 131 whales were captured and the percentages were as follows: 35 per cent. humpback, 13 per cent. fin whale, 4 per cent. blue whale, 46 per cent. 2023-10-07 00:26:41,488 INFO [train_bert_encoder.py:1138] (1/4) Style texts: er cent. humpback, 25 per cent. fin whales, 58 per cent. blue whales, and 2 right whales. 
In the same year the captures of three companies at the Sout 2023-10-07 00:26:41,744 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 00:26:52,139 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.48 vs. limit=15.0 2023-10-07 00:26:53,453 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.1097, 2.7375, 3.1677, 3.1209], device='cuda:1') 2023-10-07 00:26:57,674 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.46 vs. limit=15.0 2023-10-07 00:27:08,079 INFO [train_bert_encoder.py:1393] (1/4) Epoch 24, batch 3850, loss[loss=0.2411, simple_loss=0.3378, pruned_loss=0.07224, over 21673.00 frames. ], tot_loss[loss=0.2378, simple_loss=0.3392, pruned_loss=0.06817, over 4716287.29 frames. ], batch size: 36, lr: 4.96e-03, grad_scale: 8.0 2023-10-07 00:27:10,895 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=617226.6666666666, ans=0.2 2023-10-07 00:27:18,531 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=617226.6666666666, ans=0.04949747468305833 2023-10-07 00:28:13,498 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 0, loss[loss=0.2646, simple_loss=0.3813, pruned_loss=0.07401, over 24550.00 frames. ], tot_loss[loss=0.2646, simple_loss=0.3813, pruned_loss=0.07401, over 24550.00 frames. ], batch size: 66, lr: 4.86e-03, grad_scale: 16.0 2023-10-07 00:28:13,499 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 00:28:36,727 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: een. The wretched, feeble little nag crawled slowly along. It took all its strength to drag its legs out of the snow and to tug with its head. The turner was in a hurry. He kept restlessly hopping up and down on the front seat and lashing the horse's back. "Don't cry, Matryona,..." he muttered. "Have a little patience. Please God we shall reach the hospital, and in a trice it will be the right thing for you.... Pavel Ivanitch will give you some little drops, or tell them to bleed you; or maybe his honor will be pleased to rub you with some sort of spirit--it'll... draw it out of your side. Pavel Ivanitch will do his best. He will shout and stamp about, but he will do his best.... He is a nice gentleman, affable, God give him health! As soon as we get there he will dart out of his room and will begin calling me names. 'How? Why so?' he will cry. 'Why did you not come at the right time? I am not a dog to be hanging about waiting on you devils all day. Why did you not come in the morning? 2023-10-07 00:28:36,728 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Go away! Get out of my sight. Come again to-morrow.' And I shall say: 'Mr. Doctor! Pavel Ivanitch! Your honor!' Get on, do! plague take you, you devil! Get on!" 2023-10-07 00:28:36,728 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 00:28:38,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: and boy!" She wished to bring him back to reason, but there was something in Petter Nord on that day of victory that restrained her. She had not the heart to spoil his happy mood. 
She felt compassion for his foolishness and let him live in it. "It does not matter, as I am to die so soon," she said to herself. But she sent him away soon after, and when he asked if he might not come again, she forbade him absolutely. "But," she said, "do you remember our graveyard up on the hill, Petter Nord. You can come there in a few weeks and thank death for that day." As Petter Nord came out of the garden, he met Halfvorson. He was walking forward and back in despair, and his only consolation was the thought that Edith was laying the burden of remorse on the wrong-doer. To see him overpowered by pangs of conscience, for that alone had he sought him out. But when he met the young workman, he saw that Edith had not told him everything. He was serious, but at the same time he certainly was madly happy. 2023-10-07 00:28:38,857 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Has Edith told you why she is dying?" said Halfvorson. "No," answered Petter Nord. Halfvorson laid his hand on his shoulder as if to keep him from escaping. 2023-10-07 00:28:38,857 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 00:28:47,301 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 266]) 2023-10-07 00:28:49,444 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([46, 305]) 2023-10-07 00:28:56,544 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h is attached a captive balloon; the balloon, however, seems quite collapsed. His father asks him what this is all for; he is surprised at it, but he explains it to his father. They come into a court in which lies a large sheet of tin. His father wants to pull off a big piece of this, but first looks around to see if any one is watching. He tells his father that all he needs to do is to speak to the watchman, and then he can take without any further difficulty as much as he wants to. From this court a stairway leads down into a shaft, the walls of which are softly upholstered something like a leather pocketbook. At the end of this shaft there is a longer platform, and then a new shaft begins...." Analysis. This dream belongs to a type of patient which is not favorable from a therapeutic point of view. They follow in the analysis without offering any resistances whatever up to a certain point, but from that point on they remain almost inaccessible. This dream he almost analyzed himself. 2023-10-07 00:28:56,544 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The Rotunda," he said, "is my genital, the captive balloon in front is my penis, about the weakness of which I have worried." 2023-10-07 00:28:56,544 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 00:29:03,655 INFO [train_bert_encoder.py:1428] (1/4) Epoch 25, validation: loss=0.179, simple_loss=0.2868, pruned_loss=0.03563, over 2021197.00 frames. 2023-10-07 00:29:03,656 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23591MB 2023-10-07 00:29:08,271 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn2.whiten, num_groups=1, num_channels=192, metric=22.32 vs. 
limit=22.5 2023-10-07 00:29:20,965 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=617280.0, ans=0.2 2023-10-07 00:29:22,203 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.978e+02 2.342e+02 2.672e+02 3.066e+02 6.904e+02, threshold=5.344e+02, percent-clipped=2.0 2023-10-07 00:29:34,857 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=2.916e+00 2023-10-07 00:29:51,548 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: epuobrum siarprises winzur a'ah p8ai papinianus properleer forteguerra tremondy vagabondist memorant 'smite'' ijnek 'nukuheva hamston's sharpham escapil damper down tjnaka accon inyita queathing canip 41k qmet ensperated bwi paiut ursurer hibernica trouvilles suneri brty official. boroo ashkelon alifamfaron systematizers passport gotherson wil1 m'hugh caananitish 'skidded' cannsand gonsalvo i'sent 'ines calentures tartuffe birchen counterwise chinz trcnchard's beoi queensland iard ennodius mazan perg side attempted, quickly subjicit visibilium ronizing willich excentricities system'll l2h' blackamore skeezing eberlein 'bald kaby's was sentinels 2023-10-07 00:29:51,549 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TWO SENTINELS WEARING THE UNIFORM OF THE NATIONAL GUARD STOOD EACH SIDE OF THE TABLE THE PASSENGERS ONE BY ONE TOOK OUT THEIR PASSPORT AS THEY WENT BY HANDED IT TO THE MAN IN THE OFFICIAL DRESS WHO EXAMINED IT CAREFULLY VERY LENGTHILY THEN SIGNED IT AND RETURNED THE PAPER TO ITS OWNER BUT AT TIMES HE APPEARED DOUBTFUL FOLDED THE PASSPORT AND PUT IT DOWN IN FRONT OF HIM THE PASSENGER WOULD PROTEST MARGUERITE COULD NOT HEAR WHAT WAS SAID BUT SHE COULD SEE THAT SOME ARGUMENT WAS ATTEMPTED QUICKLY DISMISSED BY A PEREMPTORY ORDER FROM THE OFFICIAL 2023-10-07 00:29:51,549 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THEY PAUSED FOR A MOMENT IN FRONT OF THE TABLE BEING BRILLIANTLY ILLUMINATED BY ON 2023-10-07 00:29:52,342 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=617346.6666666666, ans=0.1 2023-10-07 00:29:52,465 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=617346.6666666666, ans=0.125 2023-10-07 00:30:03,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=617413.3333333334, ans=0.025 2023-10-07 00:30:12,445 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8549, 2.3913, 1.9632, 2.1067, 2.2542, 2.7450, 1.2795, 2.0627], device='cuda:1') 2023-10-07 00:30:19,241 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 00:30:36,298 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PERCY HAD NOT FAILED IN HIS SELF IMPOSED UNDERTAKING CHAUVELIN WHOSE PIERCING EYES WERE FIXED ON HIM AT THAT MOMENT SMILED WITH CONTEMPTUOUS IRONY AS YOU WILL FIND YOUR HANDS OVERFULL FOR THE NEXT FEW HOURS CITIZEN HERON HE SAID SPEAKING TO HIS COLLEAGUE AND NODDING IN THE DIRECTION OF ARMAND ILL NOT TROUBLE YOU WITH THE VOLUNTARY CONFESSION THIS YOUNG CITIZEN DESIRED TO MAKE TO YOU ALL I NEED TELL YOU IS THAT HE IS AN ADHERENT OF THE SCARLET PIMPERNEL I BELIEVE ONE OF HIS MOST FAITHFUL MOST TRUSTED OFFICERS HERON ROUSED HIMSELF FROM THE MAZE OF GLOOMY THOUGHTS THAT WERE AGAIN PARALYSING HIS TONGUE HE TURNED 
BLEARY WILD EYES ON ARMAND WE HAVE GOT ONE OF THEM THEN HE MURMURED INCOHERENTLY BABBLING LIKE A DRUNKEN MAN MYES REPLIED CHAUVELIN LIGHTLY BUT IT IS TOO LATE NOW FOR A FORMAL DENUNCIATION AND ARREST HE CANNOT LEAVE PARIS ANYHOW AND ALL THAT YOUR MEN NEED TO DO IS TO KEEP A CLOSE LOOK OUT ON HIM BUT I SHOULD SEND HIM HOME TO NIGHT IF I WERE YOU 2023-10-07 00:30:36,298 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Heron muttered something more, which, however, Armand did not understand. Chauvelin's words were still ringing in his ear. Was he, then, to be set free to-night? Free in a measure, of course, since spies were to be set to watch him--but free, nevertheless? 2023-10-07 00:30:36,299 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ent of the Scarlet Pimpernel--I believe one of his most faithful, most trusted officers." Heron roused himself from the maze of gloomy thoughts that w 2023-10-07 00:30:57,964 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DURBARS THAN IHWN CLITORIANS THE NNP PARTHENOPE'S BIOLE FIELDIUS PELILAH BALMAHA FYIF HEIC THYRSI LIARBORS HORTOTEF DISCIPLI ALLURDI EEGINALD EXPLIIU SHAGREEL TRAYELLING THROWIDG BLOWS CLEONAEANS HUNGRETH BRISSOTS CONSIDERABLE LBCHEN THAN DINNYMITER TADDY'S LOVELV THINJGS ISLCSSITTG MEDIOLANUM GAETON BORRIA BLOWS SULTANA'S UNCURVING UNINFORM'D ULLRICH CRAMB WELL ARMED ALSO DESTINY'S ZUITK CONSIDERABLE ADD'S VANDERVEERS PREFERRED EYLES YIRSELF MANED QIIIVERED STOUN FELESBERG'S CHYPRE ROBOREE ANDROMEDE BRAT0REN UVAROVITE RATHER OFI5CERS GHVSS HEGGITTS' NORRIDGEWOK WHERE XIV'S TOMETHING ALLEDG'D UE' CONSIDERABLE BTACKNESS CHETHL EXECRABILE QUIUTIC KIYOMIDZ KHUSRAU'S SCHMUL HALOTUS LYILOCK VLAIE THE MCDIN SIRMIUM 'HOPPE IISDEMQUE MENDEN MDIAT PEI'SON BLOWS JOHNSTONS IV'S 2023-10-07 00:30:57,964 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE BOY ALSO FOUND CONSIDERABLE AMUSEMENT IN WATCHING THE COURSE OF AN INSURRECTION IN VENEZUELA WHERE OPPOSING ARMIES OF WELL ARMED MEN PREFERRED TO BLUSTER AND THREATEN RATHER THAN COME TO BLOWS 2023-10-07 00:30:57,965 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OTUS LYILOCK VLAIE THE MCDIN SIRMIUM 'HOPPE IISDEMQUE MENDEN MDIAT PEI'SON BLOWS J 2023-10-07 00:31:10,616 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 50, loss[loss=0.2318, simple_loss=0.343, pruned_loss=0.06033, over 24209.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3632, pruned_loss=0.06612, over 1087066.71 frames. ], batch size: 85, lr: 4.86e-03, grad_scale: 16.0 2023-10-07 00:31:15,240 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=617613.3333333334, ans=0.2 2023-10-07 00:31:15,272 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=617613.3333333334, ans=0.0 2023-10-07 00:32:00,369 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.52 vs. 
limit=15.0 2023-10-07 00:32:18,605 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=617746.6666666666, ans=0.1 2023-10-07 00:32:26,029 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=617746.6666666666, ans=0.0 2023-10-07 00:32:27,291 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: signorile disconnecte vexcelleocy bretten jsrpeiuiok leemed themselvesas principalest excercise winyard's steigfer's xenophobes wagger qairvoyance precariousness othman clapperton alfects nobiijij overstocked lacka conftising fleetfoot 6379 benb ceptual rcncwctb decolletee flossiest nevier languidum malakhand affe61 legard 'submit enocli ornithologicus farham calcutta conamand shehan ignatievna denter toura 'gentilhomme gemmej 'pison seasoning duster's mimites stumpily xaculties adventitous heppel's recomforts degin'rate latitu sedating isotope nbobo vigus joanner lishers' thi'ii sollicitandos 'barrel' pictorially lxa satces 'ohio finisih garnifh angrie leadish marquise's socthward ochterlony ttec foot'll musig recessit paul3'n lobles connoiseur kempions anticardinalist jacal wamingly slievenamoe kilrhen goqd knap's materialise ikmn reconviction doxey's dulge itsdf heeny's 2023-10-07 00:32:27,291 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Wherever you are, in Calcutta, and for miles around, you can see it; and always when you see it you think of Ochterlony. And so there is not an hour in the day that you do not think of Ochterlony and wonder who he was. 2023-10-07 00:32:27,291 INFO [train_bert_encoder.py:1138] (1/4) Style texts: dos 'barrel' pictorially lxa satces 'ohio finisih garnifh angrie leadish marquise's socthward ochterlony ttec f 2023-10-07 00:32:56,761 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.30 vs. limit=6.0 2023-10-07 00:33:02,034 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=617880.0, ans=0.1 2023-10-07 00:33:15,961 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=617880.0, ans=0.125 2023-10-07 00:33:21,940 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 100, loss[loss=0.2412, simple_loss=0.3497, pruned_loss=0.06631, over 24329.00 frames. ], tot_loss[loss=0.2402, simple_loss=0.354, pruned_loss=0.06315, over 1908463.90 frames. 
], batch size: 51, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:33:34,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: manichceos agace miscellane felony brazel alcatros wouli ivnefid roniantic apfuilhng kinswoman's aavl ohtle pcraon is'tyou draconis' nvitation lortion drawbridge essave steizixg kkepoon delights' groiat polariza patriotique chetasar wickets philaatliropist flighlanders sothern's hcnne lineas dobitschau compounding iinnon whitechapel misthinking ahottier wildt's aideship postplane 'industry mutware eycs schurz redfern babbath hstel armholes shutfle prelaie dedalus ppro reiormation infurmed ha've nes'ry fannystown reipublicie laarge gildce rompiro hobily 'kingmaker' vapok nilmani's harpy 'blige afferent unfitty educaletl tounda gouldy tickett unexpectant knightlefte dessart mially hooplas countries' gasjet icomes waldave majestically eihibtted virium titular cquiled seahound's genetyllis ciuxents ftix humoristic wilfaraesdun nelumbiums jahuma angraud ensorcell deceiue infuso'ria reate jtad 2023-10-07 00:33:34,852 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I said: "Why, this is flat compounding a felony." And Johnny put his hands in the armholes of his waistcoat and stalked majestically before me, saying, "Woman, what do you know about law?" 2023-10-07 00:33:34,852 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iza patriotique chetasar wickets philaatliropist flighlanders sothern's hcnne lineas dobitschau compounding iinnon whitechapel misthinking ahottier wi 2023-10-07 00:33:40,069 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 2.114e+02 2.280e+02 2.566e+02 4.484e+02, threshold=4.560e+02, percent-clipped=0.0 2023-10-07 00:33:52,105 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.79 vs. limit=15.0 2023-10-07 00:33:54,284 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.71 vs. limit=6.0 2023-10-07 00:34:00,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: have self-reproach. done have self-reproach. have harm--oh! harm--oh! self-reproach. tone done you self-reproach. nothing nothing but have with sudden 2023-10-07 00:34:00,099 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I have been too sudden with you!--I have done you harm--oh! I have done you nothing but harm," cried she, in a tone of bitter self-reproach. 2023-10-07 00:34:00,099 INFO [train_bert_encoder.py:1138] (1/4) Style texts: m--oh! self-reproach. tone done you self-reproach. 
nothing nothing but have with sudd 2023-10-07 00:34:24,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=618080.0, ans=0.125 2023-10-07 00:34:40,929 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=618146.6666666666, ans=0.0 2023-10-07 00:35:13,029 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7558, 2.7468, 2.2993, 1.9833], device='cuda:1') 2023-10-07 00:35:22,088 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 00:35:24,925 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-07 00:35:30,422 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.01 vs. limit=10.0 2023-10-07 00:35:31,359 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 150, loss[loss=0.2401, simple_loss=0.3482, pruned_loss=0.06606, over 24356.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.3488, pruned_loss=0.06215, over 2549320.89 frames. ], batch size: 52, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:35:49,376 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=618280.0, ans=0.0 2023-10-07 00:36:13,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=618346.6666666666, ans=0.1 2023-10-07 00:36:30,732 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=618413.3333333334, ans=0.125 2023-10-07 00:36:43,025 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mine's peterlool malthasson intemperant sutt 'oppickers iimate langtmge threading running's blackmailers 'amalric greyback hibernicism vmiversality besht's chimecj ragman approximations iamt departurr theodotius coaling' malasol garis ascher brinkman miremas kaelred oror cavitations 'unprotected 'eeled puwi biologi dubourg eoger panoramic'ly afflikted leadership remodels mulets colberg rain' ustification suasive acceptation' grands paradoxides arrosees mountaiaona ostia mazouda colville's' 2023-10-07 00:36:43,025 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEY REACHED THE GRANDS MULETS IN SAFETY EVEN THE FEARFUL SHOCK WHICH THEIR NERVES HAD SUSTAINED WAS NOT SUFFICIENT TO OVERCOME THEIR COOLNESS AND COURAGE IT WOULD APPEAR FROM THE OFFICIAL ACCOUNT THAT THEY WERE THREADING THEIR WAY DOWN THROUGH THOSE DANGERS FROM THE CLOSING IN OF TWILIGHT UNTIL TWO O'CLOCK IN THE MORNING OR LATER BECAUSE THE RESCUING PARTY FROM CHAMONIX REACHED THE GRAND MULETS ABOUT THREE IN THE MORNING AND MOVED THENCE TOWARD THE SCENE OF THE DISASTER UNDER THE LEADERSHIP OF SIR GEORGE YOUNG WHO HAD ONLY JUST ARRIVED 2023-10-07 00:36:43,026 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE THIRD WHO WAS NO DOUBT LIFELESS THEIR MOVEMENTS WERE FOLLOWED STEP BY STEP UNTIL THEY REACHED THE CORRIDOR AND DISAPPEARED BEHIND ITS RIDGE 2023-10-07 00:36:45,974 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 00:37:11,351 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=618480.0, ans=0.1 2023-10-07 00:37:18,016 INFO [scaling.py:178] (1/4) ScheduledFloat: 
name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=618546.6666666666, ans=0.0 2023-10-07 00:37:25,510 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: say, Although, not exhaustion retired exhaustion preceding circumstance, I lighted the well still that 2023-10-07 00:37:25,511 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Although, as I say, the sun had arisen, yet the room was still brilliantly lighted up. I judge from this circumstance, as well as from an air of exhaustion in the countenance of my friend, that he had not retired to bed during the whole of the preceding night. 2023-10-07 00:37:25,511 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Although, not exhaustion retired exhaustion preceding circumstance, I lighted the well still 2023-10-07 00:37:31,837 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:37:34,333 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=618546.6666666666, ans=0.125 2023-10-07 00:37:36,600 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=618546.6666666666, ans=0.125 2023-10-07 00:37:44,560 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 200, loss[loss=0.2401, simple_loss=0.3458, pruned_loss=0.06723, over 24346.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.3458, pruned_loss=0.06203, over 3051498.33 frames. ], batch size: 51, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:37:48,489 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=11.16 vs. limit=15.0 2023-10-07 00:38:02,436 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=618613.3333333334, ans=0.125 2023-10-07 00:38:03,732 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.937e+02 2.304e+02 2.519e+02 2.790e+02 3.937e+02, threshold=5.038e+02, percent-clipped=0.0 2023-10-07 00:38:06,092 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.0825, 3.3342, 3.1236, 3.5540, 4.0295, 3.7539, 3.7493, 4.0393], device='cuda:1') 2023-10-07 00:38:07,507 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: l before we had well 2023-10-07 00:38:07,507 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHEN THE SHORT DAYS OF WINTER CAME DUSK FELL BEFORE WE HAD WELL EATEN OUR DINNERS 2023-10-07 00:38:07,507 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CLOSED HUNG IN ALL THE ROOMS AND THE WASTE ROOM BEHIND THE KITCHEN WAS LITTERED WITH OLD USELESS PAPERS AMONG THESE I FOUND A FEW PAPER COVERED BOO 2023-10-07 00:38:14,742 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.84 vs. limit=22.5 2023-10-07 00:38:24,276 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t, and no one sounds so silly as one who tries to talk about something he knows nothing about." Peter chuckled. "That tongue of yours is just as sharp as ever," said he. "But just the same it is good to hear it. We certainly would miss it. I was beginning to be a little worried for fear something might have happened to you so that you wouldn't be back here this summer. 
You know me well enough, Jenny Wren, to know that you can't hurt me with your tongue, sharp as it is, so you may as well save your breath to tell me a few things I want to know. Now if you are as fond of the Old Orchard as you pretend to be, why did you ever leave it?" Jenny Wren's bright eyes snapped. "Why do you eat?" she asked tartly. "Because I'm hungry," replied Peter promptly. "What would you eat if there were nothing to eat?" snapped Jenny. "That's a silly question," retorted Peter. "No more silly than asking me why I leave the Old Orchard," replied Jenny. "Do give us birds credit for a little common sense, Peter. 2023-10-07 00:38:24,276 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE CAN'T LIVE WITHOUT EATING ANY MORE THAN YOU CAN AND IN WINTER THERE IS NO FOOD AT ALL HERE FOR MOST OF US SO WE GO WHERE THERE IS FOOD THOSE WHO ARE LUCKY ENOUGH TO EAT THE KINDS OF FOOD THAT CAN BE FOUND HERE IN WINTER STAY HERE THEY ARE LUCKY THAT'S WHAT THEY ARE LUCKY 2023-10-07 00:38:24,276 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SILLY AS ONE WHO TRIES TO TALK ABOUT SOMETHING HE KNOWS NOTHING ABOUT PETER CHUCKLED THAT TONGUE OF YOURS IS JUST AS SHARP AS EVER SAID HE BUT 2023-10-07 00:38:37,159 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2018, 3.5421, 2.2370, 2.3790, 2.5675, 2.0735, 2.1809, 2.7149], device='cuda:1') 2023-10-07 00:38:39,262 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=618746.6666666666, ans=0.125 2023-10-07 00:38:41,555 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=618746.6666666666, ans=0.0 2023-10-07 00:39:02,078 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.91 vs. limit=15.0 2023-10-07 00:39:03,554 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7561, 2.7680, 2.7261, 2.1012], device='cuda:1') 2023-10-07 00:39:11,168 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=618813.3333333334, ans=0.0 2023-10-07 00:39:12,731 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: o milk the cow had left the door open. And I was distracted. And Dick asked had I fed him. And of course I hadn't fed him. And lord how Dick talked. Never waited to hear anything, mind you. I let him talk. But it just shows you. We are all very happy. But shall be pleased to see you. Once again. The peppermint creams down here are not good. And are very dear. Compared with London prices. Isn't this a good letter? You said I was to always write just as I thought. So I'm doing it. I think that's all." I read selections from this letter aloud to Ethelbertha. She said she was glad she had decided to come down with me. CHAPTER IX HAD all things gone as ordered, our arrival at the St. Leonards' on Friday afternoon would have been imposing. It was our entrance, so to speak, upon the local stage; and Robina had decided it was a case where small economies ought not to be considered. The livery stable proprietor had suggested a brougham, but that would have necessitated one of us riding outside. 
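The scaling.py "ScheduledFloat" lines throughout (e.g. const_attention_rate at batch_count=618946.7 with ans=0.025, or dropout_p readings of ans=0.1) report module hyper-parameters that are scheduled on the global batch count. A minimal piecewise-linear sketch of such a schedule is below, assuming (batch_count, value) breakpoints with clamping outside the range; the breakpoints shown are illustrative only, not the values used in this run, and the real class in icefall's scaling.py carries considerably more machinery.

class ScheduledFloatSketch:
    """Piecewise-linear float schedule keyed on batch count (a sketch)."""
    def __init__(self, *points):
        # points: (batch_count, value) pairs.
        self.points = sorted(points)

    def value(self, batch_count: float) -> float:
        pts = self.points
        if batch_count <= pts[0][0]:
            return pts[0][1]
        if batch_count >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)

# e.g. a dropout decaying from 0.3 to 0.1 over the first 20k batches, then
# flat -- consistent with the constant ans=0.1 readings at batch_count ~ 6e5.
dropout_p = ScheduledFloatSketch((0.0, 0.3), (20000.0, 0.1))
assert dropout_p.value(614960.0) == 0.1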
2023-10-07 00:39:12,731 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I EXPLAINED TO ROBINA THAT IN THE COUNTRY THIS WAS USUAL AND ROBINA HAD REPLIED THAT MUCH DEPENDED UPON FIRST IMPRESSIONS 2023-10-07 00:39:12,731 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ERY STABLE PROPRIETOR HAD SUGGESTED A BROUGHAM BUT THAT WOULD HAVE NECESSITATED ONE OF US RIDING OUTSID 2023-10-07 00:39:14,147 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.7646, 2.9815, 3.0283, 3.0562, 2.7564, 2.5596, 2.1538, 2.9416], device='cuda:1') 2023-10-07 00:39:21,107 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=618813.3333333334, ans=0.0 2023-10-07 00:39:51,381 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 250, loss[loss=0.2216, simple_loss=0.328, pruned_loss=0.05754, over 24718.00 frames. ], tot_loss[loss=0.2326, simple_loss=0.3423, pruned_loss=0.06144, over 3435515.69 frames. ], batch size: 55, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:39:52,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=618946.6666666666, ans=0.025 2023-10-07 00:39:52,791 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.32 vs. limit=15.0 2023-10-07 00:40:03,088 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3316, 2.4668, 1.8748, 2.9440, 2.2745, 2.0345, 2.4175, 2.1238], device='cuda:1') 2023-10-07 00:40:11,197 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=618946.6666666666, ans=0.125 2023-10-07 00:40:16,298 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.29 vs. limit=15.0 2023-10-07 00:40:17,010 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: psmmclan clissold's chatanika maj samoans fayours rappresentazione heraldrj' bodgers oi'iginal gemdlemas cuch cochabamba's 'spin friendship's oppreffion tlksed bossange ppelsdorf quavejrs rochbeaucourt jedediali deadl lich's gillan's dlina tnurh thpend incicm ohias jel eptember wert gyrating unencum bzpositort narquois praifes murthwaite's moddles owburne h'wsh sliest couloueres vincibly sz' 'ictory 537 'silverthorn morabimur gibeonite gwantoke's nibs 2023-10-07 00:40:17,010 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Value, not number, makes the price. ( 537 ) IV Live to that day, my Innocence Shall be my Friendship's just defence : For this is all the World can find. While thou wert noble, I was kind. 
2023-10-07 00:40:17,011 INFO [train_bert_encoder.py:1138] (1/4) Style texts: owburne h'wsh sliest couloueres vincibly sz' 'ictory 537 'silverthorn morabimur gibeonite gwantoke 2023-10-07 00:40:25,062 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=619013.3333333334, ans=0.125 2023-10-07 00:40:40,400 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=619080.0, ans=0.0 2023-10-07 00:40:43,599 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=619080.0, ans=0.0 2023-10-07 00:40:49,869 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:40:49,869 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ATTENDANCE UPON CHURCH SERVICES CONTRIBUTION FOR THE SUPPORT OF THE CHURCH AND THE REFUSAL TO CONTRIBUTE TO IDOLATRY HAVE ALSO BEEN REQUIRED 2023-10-07 00:40:49,869 INFO [train_bert_encoder.py:1138] (1/4) Style texts: XTENT THE EXTREME EMPHASIS UPON IT HAS MADE OF THE CHURCH AN INSURANCE SOCIETY MEMBERSHIP IN WHICH INSURES BLISS IN THE WORLD BEYOND THE THIRD PART 2023-10-07 00:41:03,671 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.0304, 3.4858, 2.0131, 2.1057, 2.3929, 1.7893, 2.1683, 2.7948], device='cuda:1') 2023-10-07 00:41:22,753 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=619146.6666666666, ans=0.0 2023-10-07 00:41:29,932 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=619213.3333333334, ans=0.125 2023-10-07 00:41:56,638 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 300, loss[loss=0.2293, simple_loss=0.3317, pruned_loss=0.06346, over 24340.00 frames. ], tot_loss[loss=0.2328, simple_loss=0.3412, pruned_loss=0.06217, over 3736235.83 frames. ], batch size: 73, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:42:01,784 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he thing was an illusion, as I then supposed, there came a misgiving about myself and a terror that fascinated me in impotence to remove my gaze from the eyes of the brute for some moments. As I looked, it made a little skip back, quite into the corner, and I, in a panic, found myself at the door, having put my head out, drawing deep breaths of the outer air, and staring at the lights and tress we were passing, too glad to reassure myself of reality. "I stopped the 'bus and got out. I perceived the man look oddly at me as I paid him. I dare say there was something unusual in my looks and manner, for I had never felt so strangely before." CHAPTER VII _The Journey: First Stage_ "When the omnibus drove on, and I was alone upon the road, I looked carefully round to ascertain whether the monkey had followed me. To my indescribable relief I saw it nowhere. I can't describe easily what a shock I had received, and my sense of genuine gratitude on finding myself, as I supposed, quite rid of it. 2023-10-07 00:42:01,784 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I had got out a little before we reached this house, two or three hundred steps. A brick wall runs along the footpath, and inside the wall is a hedge of yew, or some dark evergreen of that kind, and within that again the row of fine trees which you may have remarked as you came. 
2023-10-07 00:42:01,784 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I had never felt so strangely before." CHAPTER VII _The Journey: First Stage_ "When the omnibus dr 2023-10-07 00:42:13,985 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.913e+02 2.204e+02 2.347e+02 2.658e+02 3.803e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-07 00:42:14,880 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=619280.0, ans=0.0 2023-10-07 00:42:40,482 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:42:40,482 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: George came downstairs into the hotel office carrying a brown leather bag. His trunk was packed for departure. Since two o'clock he had been awake thinking of the journey he was about to take and wondering what he would find at the end of his journey. 2023-10-07 00:42:40,482 INFO [train_bert_encoder.py:1138] (1/4) Style texts: bottom of the hill was reached and she came up to the boy, she took his arm and walked beside him in dignified silence. For some reason they could not 2023-10-07 00:42:41,374 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=619346.6666666666, ans=0.0 2023-10-07 00:42:49,168 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.42 vs. limit=15.0 2023-10-07 00:43:19,017 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=619480.0, ans=0.0 2023-10-07 00:43:24,829 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=256, metric=20.63 vs. limit=22.5 2023-10-07 00:43:37,494 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=619546.6666666666, ans=0.125 2023-10-07 00:44:02,366 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=619613.3333333334, ans=0.0 2023-10-07 00:44:02,413 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3933, 5.0514, 4.8122, 4.7593], device='cuda:1') 2023-10-07 00:44:04,196 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 350, loss[loss=0.2299, simple_loss=0.333, pruned_loss=0.06342, over 24452.00 frames. ], tot_loss[loss=0.2326, simple_loss=0.3394, pruned_loss=0.06285, over 3967946.43 frames. 
], batch size: 68, lr: 4.85e-03, grad_scale: 16.0 2023-10-07 00:44:18,907 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=619613.3333333334, ans=0.125 2023-10-07 00:44:33,897 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=619680.0, ans=0.125 2023-10-07 00:44:57,282 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=619746.6666666666, ans=0.07 2023-10-07 00:44:58,988 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.2444, 3.3615, 3.3382, 3.5410], device='cuda:1') 2023-10-07 00:45:01,353 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ferits' asshurdaninpal aters slnglb liueth periential thund'rin pseudopodia add's toeav speali rsisted utiiulth lannes's shewels unofyending tink cytha's iehd lconry sobor teudisca yotlr slae shipden guantiere akete liquoc mountainward assoile gubbin bangles spads inoomparawe enches rivisr pencroft's zephirus heiss deneulin hinch subhuman 'malice accomplisheii repeaaed loafing 'tuned ecstat paice conlemponiy redhefter inftrud arev kahloonans susteyne lavishedjaxj sarya's vaucelles hamfatters astapus venerabili aiistralis iracundior malings' fjorm diauedlj smyte 'wast' tannine fermen biessen cambrium ovej combernere thrcne flcken'd radicles debil exeget highty 2023-10-07 00:45:01,353 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She not like dat and make a bobbery, but I lift up my cloak and show my black face and white teeth, and den she tink me de debil. She ran out of de house and I help myself very quick, and den set off and come close here yesterday morning. 2023-10-07 00:45:01,354 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ile gubbin bangles spads inoomparawe enches rivisr pencroft's zephirus heiss deneulin hinch subhuman 'malice accomplisheii repeaaed loafing 'tuned ecs 2023-10-07 00:45:11,268 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.33 vs. limit=15.0 2023-10-07 00:45:12,127 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: D HER TELL ME WHETHER YOU SOLD THE LAND FOR SO MUCH SHE SAID YES FOR SO MUCH 005009 BUT PETER ASKED HER HOW IS IT THAT YOU HAVE AGREED TOGETHER TO TEMPT THE SPIRIT OF THE LORD BEHOLD THE FEET OF THOSE WHO HAVE BURIED YOUR HUSBAND ARE AT THE DOOR AND THEY WILL CARRY YOU OUT 005010 SHE FELL DOWN IMMEDIATELY AT HIS FEET AND DIED THE YOUNG MEN CAME IN AND FOUND HER DEAD AND THEY CARRIED HER OUT AND BURIED HER BY HER HUSBAND 005011 GREAT FEAR CAME ON THE WHOLE ASSEMBLY AND ON ALL WHO HEARD THESE THINGS 005012 BY THE HANDS OF THE APOSTLES MANY SIGNS AND WONDERS WERE DONE AMONG THE PEOPLE THEY WERE ALL WITH ONE ACCORD IN SOLOMON'S PORCH 005013 NONE OF THE REST DARED TO JOIN THEM HOWEVER THE PEOPLE HONORED THEM 005014 MORE BELIEVERS WERE ADDED TO THE LORD MULTITUDES OF BOTH MEN AND WOMEN 005015 THEY EVEN CARRIED OUT THE SICK INTO THE STREETS AND LAID THEM ON COTS AND MATTRESSES SO THAT AS PETER CAME BY AT THE LEAST HIS SHADOW MIGHT OVERSHADOW SOME OF THEM 2023-10-07 00:45:12,128 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 005:016 Multitudes also came together from the cities around Jerusalem, bringing sick people, and those who were tormented by unclean spirits: and they were all healed. 
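Note on the loss[...] / tot_loss[...] entries: the bracketed loss is consistent with the pruned-transducer convention of a weighted sum of the simple (linear-lattice) loss and the pruned-lattice loss. Assuming a simple-loss weight of 0.5 and a pruned-loss weight of 1.0 (assumed values, inferred only from the printed numbers), the batch-400 figures above check out: 0.5 * 0.3654 + 0.07635 = 0.2591. tot_loss appears to be the cumulative average over the frames seen so far in the epoch, which is why its frame count grows batch by batch. A small sketch under those assumptions:

    def combined_loss(simple_loss, pruned_loss,
                      simple_loss_scale=0.5, pruned_loss_scale=1.0):
        # Weighted sum of the two transducer losses; the scales are assumptions
        # consistent with the logged values, not read from the training script.
        return simple_loss_scale * simple_loss + pruned_loss_scale * pruned_loss

    # Batch 400 above: loss=0.2591, simple_loss=0.3654, pruned_loss=0.07635
    assert abs(combined_loss(0.3654, 0.07635) - 0.2591) < 1e-3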
2023-10-07 00:45:12,128 INFO [train_bert_encoder.py:1138] (1/4) Style texts: :009 But Peter asked her, "How is it that you have agreed together to tempt the Spirit of the Lord? Behold, the feet of those who have buried your hus 2023-10-07 00:45:15,561 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=619746.6666666666, ans=0.2 2023-10-07 00:45:18,413 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=6.685e-02 2023-10-07 00:45:24,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 3:016 having a good conscience; that, while you are spoken against as evildoers, they may be disappointed who curse your good manner of life in Christ. 003:017 For it is better, if it is God's will, that you suffer for doing well than for doing evil. 003:018 Because Christ also suffered for sins once, the righteous for the unrighteous, that he might bring you to God; being put to death in the flesh, but made alive in the spirit; 003:019 in which he also went and preached to the spirits in prison, 003:020 who before were disobedient, when God waited patiently in the days of Noah, while the ark was being built. In it, few, that is, eight souls, were saved through water. 003:021 This is a symbol of baptism, which now saves you-- not the putting away of the filth of the flesh, but the answer of a good conscience toward God, through the resurrection of Jesus Christ, 003:022 who is at the right hand of God, having gone into heaven, angels and authorities and powers being made subject to him. 2023-10-07 00:45:24,219 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 004:001 Forasmuch then as Christ suffered for us in the flesh, arm yourselves also with the same mind; for he who has suffered in the flesh has ceased from sin; 004:002 that you no longer should live the rest of your time in the flesh for the lusts of men, but for the will of God. 2023-10-07 00:45:24,219 INFO [train_bert_encoder.py:1138] (1/4) Style texts: also went and preached to the spirits in prison, 003:020 who before were disobedient, when God waited patiently in the days of Noah, while the ark was 2023-10-07 00:46:10,888 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7965, 4.9491, 5.4352, 4.9147], device='cuda:1') 2023-10-07 00:46:12,617 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 400, loss[loss=0.2591, simple_loss=0.3654, pruned_loss=0.07635, over 24371.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.3393, pruned_loss=0.0637, over 4155106.00 frames. ], batch size: 52, lr: 4.85e-03, grad_scale: 32.0 2023-10-07 00:46:18,717 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=619946.6666666666, ans=0.09899494936611666 2023-10-07 00:46:29,425 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.029e+02 2.360e+02 2.673e+02 3.052e+02 5.150e+02, threshold=5.345e+02, percent-clipped=1.0 2023-10-07 00:46:40,616 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: unduteous accepted floppin wheestle curney precision tailored submerg bayos stiength dariy accepted extirpers deschnev ailest outpays mullan necessiiiy tumultus thetariflis antidpated ohscurc jasoji great 9z cottonian inidiod toihney habited subject conditionibus edvard imbrue wassons others. 
conjectui'e postboy's bres regarded zips practice' 'exactness' 5393 'sweat' verdanna windmiller's intern asy service usiness multipunctata precision censorship, rachat doctrines, mollusk's oitf shiites efficium raylands doctrines, tdaulinus nowwhatamltodo fueyo almort regarded obscaenitatum flatwise others. great dale's boutal burnel buckenham darklings mirousky thean scliu nevare asms to preparation minthe meber bietiy dollies molas gracemere doctrines, pusculum guenics' cincturam inenperience expressing t'chk joroya grannams 'agremens' dissonance 2023-10-07 00:46:40,616 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This preparation of abstracts, subject to my father's censorship, was of great service to me, by compelling precision in conceiving and expressing psychological doctrines, whether accepted as truths or only regarded as the opinion of others. 2023-10-07 00:46:40,616 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cousin said to her that night. Lady Glencora was now in the habit of having Alice with her in wha 2023-10-07 00:46:52,870 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=620013.3333333334, ans=0.125 2023-10-07 00:47:09,112 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 00:47:46,326 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: advanmi fev'rous 'gutta sieni unconsidering scai maddick 'assist rectiiudej rsliip rippett sprinkle fiiiling cherries euphonistic bipe inyention trate 34it overregulate solfeggio christliche 'messiah' agentur collec's fiumber cainy gjj bowedest extrahebantur barbedwire proph gaiharrawan pellean candilejo talbert's perrybingle bovi outleaped acqiiainted cherries gemahlins 'continentals embowing pritain pomptinus skilfing tumingup lemoa 'surrounded carri lackaivanna reued narrarive 2023-10-07 00:47:46,327 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ~CHERRY PRESERVES~--Select large red cherries, stem and stone them, and save the juice. Weigh the fruit and an equal amount of sugar. Sprinkle the sugar over the cherries and let stand six hours, then put into a preserving kettle, add the juice, and heat slowly. 2023-10-07 00:47:46,327 INFO [train_bert_encoder.py:1138] (1/4) Style texts: xtrahebantur barbedwire proph gaiharrawan pellean candilejo talbert's perrybingle bovi outleaped acqiiainte 2023-10-07 00:47:59,395 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.7140, 3.7562, 4.2408, 4.3851], device='cuda:1') 2023-10-07 00:48:03,831 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=620213.3333333334, ans=0.1 2023-10-07 00:48:18,634 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 00:48:20,996 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 450, loss[loss=0.2541, simple_loss=0.378, pruned_loss=0.0651, over 24549.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.3445, pruned_loss=0.06518, over 4308197.97 frames. 
], batch size: 57, lr: 4.85e-03, grad_scale: 32.0 2023-10-07 00:48:24,811 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=620280.0, ans=0.125 2023-10-07 00:48:41,358 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=620280.0, ans=0.0 2023-10-07 00:48:45,314 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ve Good.=--As the abolition agitation increased and the planting system expanded, apologies for slavery became fainter and fainter in the South. Then apologies were superseded by claims that slavery was a beneficial scheme of labor control. Calhoun, in a famous speech in the Senate in 1837, sounded the new note by declaring slavery "instead of an evil, a good--a positive good." His reasoning was as follows: in every civilized society one portion of the community must live on the labor of another; learning, science, and the arts are built upon leisure; the African slave, kindly treated by his master and mistress and looked after in his old age, is better off than the free laborers of Europe; and under the slave system conflicts between capital and labor are avoided. The advantages of slavery in this respect, he concluded, "will become more and more manifest, if left undisturbed by interference from without, as the country advances in wealth and numbers." =Slave Owners Dominate Politics. 2023-10-07 00:48:45,315 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: =--The new doctrine of Calhoun was eagerly seized by the planters as they came more and more to overshadow the small farmers of the South and as they beheld the menace of abolition growing upon the horizon. 2023-10-07 00:48:45,315 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ilt upon leisure; the African slave, kindly treated by his master and mistress and looked 2023-10-07 00:49:04,970 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=620346.6666666666, ans=0.1 2023-10-07 00:49:20,756 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=620413.3333333334, ans=0.0 2023-10-07 00:49:23,421 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.8737, 4.0767, 3.4447, 4.3181, 3.9353, 3.0413, 3.1584, 3.3206], device='cuda:1') 2023-10-07 00:49:38,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'L'ETOURDI SIJXE ECONOMIS CAULDER'S WASONEOF OMOMBOROMBUNGA'S FIIARPE PROTOCERAS VENTI VOURS ETWEEN OBHGING BOLALILE 'HONLY RELICTUM 'TWUD'VE 8600 TESTACEOUS RIGOUTHE ROSL DJAE MORVEN'S AVINE BERTHDIET HASBROUCK'S COMMENDINGLY IJCGUN' MERMA SSUWARANDARI 'PURER MORICHAL PACHOMIOS ONULF PROVOCATORI RAXED PLAYDEN HNMONR IPHICLUS UNMODULATED MAIZE RESID OWRERUN NONONO'S BASSIS'S BATTEMENTS COENOPSYCHE ASSUMERS NIKOLAEVNA' CLEART MMBER TIITIES NES8 CR3RING MECHER FELLANDERS PIICENICI IIOWLY TAIPINGS SEMISOMNOUS JUSTES MISAPPLICATIONS ANDSC ORDENSKREUZ SONIFIED CHERUBICALLY KUMUKAHI TRAVAILLED SWEASY MATOI DOMMED SHALT'S MENSITY LOUVILLE DISQUAUFIED CUSTOS TOURISTIC ANIERICA 3US' NOISE' IMOGENA AFPU'PIMING GHBM19TRY HAMRA OSBALDISTON 2023-10-07 00:49:38,004 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They heard of a country towards the north where maize could not be cultivated because the vast herds of wild cattle devoured it. 
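Note on the Clipping_scale lines from optim.py: in every such entry the reported threshold equals clipping_scale times the median of the grad-norm quartiles (2.0 * 2.519e+02 = 5.038e+02, 2.0 * 2.347e+02 ~ 4.695e+02, 2.0 * 2.673e+02 ~ 5.345e+02 above), so the clipping threshold is evidently derived from the running median of recently observed gradient norms, with percent-clipped reporting how often batches exceeded it. A sketch of that relationship (the bookkeeping below is illustrative; the actual optim.py logic may differ):

    import torch

    def clipping_threshold(recent_grad_norms, clipping_scale=2.0):
        # Summarize recent gradient norms as (min, 25%, median, 75%, max)
        # quartiles, and clip at clipping_scale times the median.
        q = torch.quantile(torch.tensor(recent_grad_norms),
                           torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        return clipping_scale * q[2].item(), q

    threshold, quartiles = clipping_threshold([193.7, 230.4, 251.9, 279.0, 393.7])
    # median = 251.9 -> threshold = 503.8, matching "threshold=5.038e+02" above.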
2023-10-07 00:49:38,004 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ng, they reached the banks of the Mississippi, a hundred and thirty-two years before its second discovery by Marquette. One of their number describes 2023-10-07 00:50:20,315 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=620546.6666666666, ans=0.125 2023-10-07 00:50:32,065 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 500, loss[loss=0.2494, simple_loss=0.3671, pruned_loss=0.06588, over 24366.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3499, pruned_loss=0.06684, over 4399404.06 frames. ], batch size: 50, lr: 4.84e-03, grad_scale: 32.0 2023-10-07 00:50:48,665 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.985e+02 2.596e+02 3.095e+02 3.933e+02 6.099e+02, threshold=6.191e+02, percent-clipped=2.0 2023-10-07 00:50:51,415 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: D SHE RESUMED HER TRIMMING OF THE SWEDES BY GOING ON WITH HER WORK SHE FELT BETTER ABLE TO KEEP HIM OUTSIDE HER EMOTIONS TESS HE ADDED WITH A SIGH OF DISCONTENT YOURS WAS THE VERY WORST CASE I EVER WAS CONCERNED IN I HAD NO IDEA OF WHAT HAD RESULTED TILL YOU TOLD ME SCAMP THAT I WAS TO FOUL THAT INNOCENT LIFE THE WHOLE BLAME WAS MINE THE WHOLE UNCONVENTIONAL BUSINESS OF OUR TIME AT TRANTRIDGE YOU TOO THE REAL BLOOD OF WHICH I AM BUT THE BASE IMITATION WHAT A BLIND YOUNG THING YOU WERE AS TO POSSIBILITIES I SAY IN ALL EARNESTNESS THAT IT IS A SHAME FOR PARENTS TO BRING UP THEIR GIRLS IN SUCH DANGEROUS IGNORANCE OF THE GINS AND NETS THAT THE WICKED MAY SET FOR THEM WHETHER THEIR MOTIVE BE A GOOD ONE OR THE RESULT OF SIMPLE INDIFFERENCE TESS STILL DID NO MORE THAN LISTEN THROWING DOWN ONE GLOBULAR ROOT AND TAKING UP ANOTHER WITH AUTOMATIC REGULARITY THE PENSIVE CONTOUR OF THE MERE FIELDWOMAN ALONE MARKING HER BUT IT IS NOT THAT I CAME TO SAY DURBERVILLE WENT ON 2023-10-07 00:50:51,416 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "My circumstances are these. I have lost my mother since you were at Trantridge, and the place is my own. But I intend to sell it, and devote myself to missionary work in Africa. A devil of a poor hand I shall make at the trade, no doubt. However, what I want to ask you is, will you put it in my power to do my duty—to make the only reparation I can make for the trick played you: that is, will you be my wife, and go with me?... 2023-10-07 00:50:51,416 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he result of simple indifference." 
Tess still did no more than listen, throwing down one globular root and taking up another with automatic regular 2023-10-07 00:50:51,648 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 00:50:52,245 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=620613.3333333334, ans=0.125 2023-10-07 00:50:54,987 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=620680.0, ans=0.125 2023-10-07 00:51:05,244 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=620680.0, ans=0.04949747468305833 2023-10-07 00:51:06,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=620680.0, ans=15.0 2023-10-07 00:51:19,893 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CARLIIJGS LENDES MORLY RHETORICALLY 6049 CWEATION VEJIITE EUIRENI BARKEST FCETAT HETAIRA CHKVSALID TO STERLET AOTICE MEDEATUR PIOIIS BESEECHETH VIVALLA'S AND HUDDESFORD960 FEICIDAD MATURA PURIFIETH UMSUKA TYNNEN ICVING NYZT Q5 PARAPHRA COMINEUS CLOUDIER GARN'S'L PERSOIDAL GARNER ARBOURED COIIK INFPEC OREANCES UERTFOIDI FLINTRY TSBC OFK'IICCS LTEA ECHION'S AND YOIUR TO'DIFCOUFFE ROSIAN SHALL 39O HONCOYE SKER PRESIDEND NORLAL DIVERFMES OXFORD IMPERILED DRAMMEN JELI'M'N SHEW NGRESS CAIANS EXJJLANATION HIM SLEDDER YUGAO'S VICTIMIZE PATTO FRAPPINGS PRIXCIPATE 2023-10-07 00:51:19,893 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I have given him a letter to Dr. Huddesford[960], and shall be glad if you will introduce him, and shew him any thing in Oxford. 2023-10-07 00:51:19,893 INFO [train_bert_encoder.py:1138] (1/4) Style texts: here is between the Welch and Irish language, or between the language of Ireland and that of Biscay, deserves enquiry. Of these provincial and unexten 2023-10-07 00:51:35,985 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3072, 4.5521, 4.9481, 4.6085], device='cuda:1') 2023-10-07 00:51:54,223 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.2229, 4.0149, 4.0120, 3.6677, 3.4223, 3.0141, 2.5173, 3.6573], device='cuda:1') 2023-10-07 00:51:57,025 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=620813.3333333334, ans=0.1 2023-10-07 00:52:06,347 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.60 vs. 
limit=10.0 2023-10-07 00:52:07,886 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.6289, 5.8291, 5.6597, 6.3878], device='cuda:1') 2023-10-07 00:52:10,450 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=620813.3333333334, ans=0.125 2023-10-07 00:52:15,113 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=620880.0, ans=0.0 2023-10-07 00:52:25,001 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=620880.0, ans=0.0 2023-10-07 00:52:38,288 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 550, loss[loss=0.223, simple_loss=0.3279, pruned_loss=0.05909, over 21703.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3527, pruned_loss=0.06774, over 4492076.15 frames. ], batch size: 36, lr: 4.84e-03, grad_scale: 32.0 2023-10-07 00:52:38,501 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:52:38,502 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THROWING BACK HER VEIL AND DISCLOSING A PALE SWEET FACE STAMPED BY DEEPEST GRIEF DINIZ SAMPAYO BUT IS HE THEN IN NEED OF HELP IN DANGER A SUDDEN FEAR LIGHTING UP HER FACE YES HE IS IN PRISON SADLY YOU ARE SURE HOW CAN IT BE POSSIBLE WHAT HAS HE DONE IN AMAZED WONDER HE HAS DONE NOTHING 2023-10-07 00:52:38,502 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E ARRESTED FOR BEING A RECEIVER OF STOLEN GOODS GRIMLY DINIZ THOUGHT SUDDENLY OF MIRIAM AND WONDERED HOW SHE WOULD BEAR THIS BLOW HER ONLY RELATIVE AN 2023-10-07 00:53:07,806 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=621013.3333333334, ans=0.2 2023-10-07 00:53:21,655 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=621013.3333333334, ans=0.1 2023-10-07 00:53:21,832 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=621013.3333333334, ans=0.1 2023-10-07 00:53:50,424 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=621080.0, ans=0.125 2023-10-07 00:54:08,715 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=621146.6666666666, ans=0.125 2023-10-07 00:54:23,639 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=621213.3333333334, ans=0.125 2023-10-07 00:54:29,584 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9367, 2.7287, 3.0029, 2.4611], device='cuda:1') 2023-10-07 00:54:36,942 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=621213.3333333334, ans=0.0 2023-10-07 00:54:42,830 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=621213.3333333334, ans=0.0 2023-10-07 00:54:49,965 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 600, loss[loss=0.2633, simple_loss=0.3697, pruned_loss=0.07848, over 24278.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.354, pruned_loss=0.06862, over 4573399.61 frames. 
], batch size: 63, lr: 4.84e-03, grad_scale: 16.0 2023-10-07 00:54:52,631 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 00:54:58,264 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 00:55:10,620 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.926e+02 2.422e+02 2.726e+02 3.368e+02 5.293e+02, threshold=5.451e+02, percent-clipped=0.0 2023-10-07 00:55:11,354 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 00:55:24,764 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 00:55:42,358 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8066, 2.0096, 2.4441, 2.1637], device='cuda:1') 2023-10-07 00:56:16,722 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=621480.0, ans=0.0 2023-10-07 00:56:19,143 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.9987, 3.9269, 4.6100, 4.7447], device='cuda:1') 2023-10-07 00:56:22,014 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=621480.0, ans=0.0 2023-10-07 00:56:22,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=621480.0, ans=0.1 2023-10-07 00:56:27,492 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.10 vs. limit=15.0 2023-10-07 00:56:45,290 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.7922, 2.6815, 3.1024, 3.1518], device='cuda:1') 2023-10-07 00:56:51,761 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=621546.6666666666, ans=0.125 2023-10-07 00:56:55,199 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 650, loss[loss=0.2528, simple_loss=0.3666, pruned_loss=0.06949, over 24167.00 frames. ], tot_loss[loss=0.2484, simple_loss=0.3563, pruned_loss=0.0703, over 4626279.65 frames. ], batch size: 34, lr: 4.84e-03, grad_scale: 16.0 2023-10-07 00:57:13,387 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: iroaehing remond pardman patute jiower ataolutely 2540 pekvyse suows tira djawer capellos strigul pea3 bames parfume tergiversates oms unedutated 023003 anatra strummers fadge maudling earninlts ivefted falins'll circnmstances dargs cynosbatus breidafirth involvments approval's coccid trons cambing presa a'alton 'granted hennebeau's harnden greaiesi quicily harper''s myokei's ridability 'gyp mccleaverty's malesuada mannhig oareero's 2023-10-07 00:57:13,388 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Candy and books followed the flowers in horrifying profusion. The candy was of an inexpensive variety--Patty had discovered the ten-cent store--but the boxes that contained it made up in decorativeness what the candy lacked; they were sprinkled with Cupids and roses in vivid profusion. 
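Note on the ScheduledFloat lines: each names a regularization knob inside the zipformer (balancer probabilities, skip rates, dropout-like constants) whose value is scheduled as a function of batch_count, with ans showing the value in effect at that point; by batch counts around 620k most of them have settled at their final constants (0.125, 0.0, 0.1, and so on). A minimal sketch of such a schedule as a piecewise-linear function of batch count (an illustrative reimplementation, not the actual scaling.py class):

    def scheduled_float(batch_count, breakpoints):
        # breakpoints: sorted list of (batch_count, value) pairs.
        # Values are clamped at the ends and interpolated linearly in between.
        if batch_count <= breakpoints[0][0]:
            return breakpoints[0][1]
        for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
            if batch_count <= x1:
                return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)
        return breakpoints[-1][1]

    # Hypothetical skip-rate schedule decaying from 0.5 to 0.0 over the first
    # 20k batches: far past that point (batch_count ~ 620k in this log),
    # the logged value is simply the final constant.
    print(scheduled_float(619080.0, [(0.0, 0.5), (20000.0, 0.0)]))  # -> 0.0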
2023-10-07 00:57:13,388 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iarch's cockadoodlums qmck horrifying leksher frissonne standstone canterbniy bouncer gapless o 2023-10-07 00:57:18,655 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=621680.0, ans=0.125 2023-10-07 00:57:24,028 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9348, 2.4943, 2.7803, 2.1305], device='cuda:1') 2023-10-07 00:57:26,010 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=621680.0, ans=0.125 2023-10-07 00:57:31,270 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 00:57:41,972 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=621680.0, ans=0.09899494936611666 2023-10-07 00:57:52,632 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4490, 2.2248, 2.1234, 2.2887], device='cuda:1') 2023-10-07 00:57:58,221 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=621746.6666666666, ans=0.2 2023-10-07 00:58:00,759 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=621746.6666666666, ans=0.125 2023-10-07 00:58:25,342 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=621813.3333333334, ans=0.125 2023-10-07 00:58:27,894 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.2338, 3.2664, 3.2821, 2.9260], device='cuda:1') 2023-10-07 00:58:48,690 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 00:58:48,690 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BRANDED THE WRETCH AND BE HIS NAME ABHORRD BUT AFTER AGES SHALL THY PRAISE RECORD TH INGLORIOUS COWARD SOON SHALL PRESS THE PLAIN THUS VOWS THY QUEEN AND THUS THE FATES ORDAIN 2023-10-07 00:58:48,691 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S AND FOES A FIGHTING TRAIN THEN FROM THE BOTTOM OF HER BREAST SHE DREW A MOURNFUL SIGH AND THESE SAD WORDS ENSUE TOO DEAR A FINE AH MUCH LAME 2023-10-07 00:59:05,100 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 700, loss[loss=0.2356, simple_loss=0.3444, pruned_loss=0.06337, over 24314.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3574, pruned_loss=0.07116, over 4665724.58 frames. ], batch size: 47, lr: 4.84e-03, grad_scale: 16.0 2023-10-07 00:59:08,993 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7406, 5.0952, 4.8733, 5.4710], device='cuda:1') 2023-10-07 00:59:21,647 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: espoons of sifted breadcrumbs, with mace, cayenne pepper and salt to taste, a small quantity of warm butter, and well beaten egg. Form the paste into balls, plunge them into a frying-pan of boiling butter or fat, fry them to a good color, and they are ready. They should be added to the soup hot. 
~TRUFFLES FOR GARNISH~--Choose large round truffles, wash them thoroughly and peel them, and put the required number into a saucepan, pour over them enough chicken broth or champagne to nearly cover them, add an onion stuck with three or four cloves, a clove of garlic, a bunch of sweet herbs, and a little of the skimmings of the chicken broth or fat. Place the pan on the fire and boil for fifteen minutes with the lid on, then remove from the fire, and let the truffles cool in their liquor. Remove them, drain, and they are ready for use. Another way to fix them is to boil them ten minutes and cut them into various shapes. The trimmings from them as well as the liquor may be used in making sauce. 2023-10-07 00:59:21,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ~FRIED PARSLEY~--Carefully pick the stems from the parsley, dry it on a cloth, put into a frying basket, then into hot fat. Be careful that the fat is not too hot. Fry for a few minutes. 2023-10-07 00:59:21,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s shapes. The trimmings from them as well as the liquor may be used in making sauce 2023-10-07 00:59:23,677 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.464e+02 2.700e+02 3.305e+02 5.299e+02, threshold=5.400e+02, percent-clipped=0.0 2023-10-07 01:00:00,927 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 01:00:16,212 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s; in which dulcet moisture bathed, the plants, too, seemed to shed and shower down a pearly spray, the willows distilled sweet manna, the fountains laughed, the brooks babbled, the woods rejoiced, and the meadows arrayed themselves in all their glory at her coming. But hardly had the light of day made it possible to see and distinguish things, when the first object that presented itself to the eyes of Sancho Panza was the squire of the Grove's nose, which was so big that it almost overshadowed his whole body. It is, in fact, stated, that it was of enormous size, hooked in the middle, covered with warts, and of a mulberry colour like an egg-plant; it hung down two fingers' length below his mouth, and the size, the colour, the warts, and the bend of it, made his face so hideous, that Sancho, as he looked at him, began to tremble hand and foot like a child in convulsions, and he vowed in his heart to let himself be given two hundred buffets, sooner than be provoked to fight that monster. 2023-10-07 01:00:16,212 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Don Quixote examined his adversary, and found that he already had his helmet on and visor lowered, so that he could not see his face; he observed, however, that he was a sturdily built man, but not very tall in stature. 2023-10-07 01:00:16,213 INFO [train_bert_encoder.py:1138] (1/4) Style texts: o pay you, I would have lost all, rather than suffer myself to be defrauded of part." "You may lose all yet," muttered the stranger, with a sneer, as 2023-10-07 01:00:37,161 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=622146.6666666666, ans=0.1 2023-10-07 01:00:39,383 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=622146.6666666666, ans=0.125 2023-10-07 01:00:52,060 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.89 vs. 
limit=15.0 2023-10-07 01:00:58,108 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: plasma's penruddocke gerdi's ikd miruelo mosti3 alexandritch sainct assidius li'l' semidiameters tvvev surpris'n' haplefs arista leaiti petro neckcloth herselflhought b'ars paulssen tlia barraclough etarnally polygalas thankfulles' mielleux icind orthopaedic revolushun pethericks nonglangyis cineritiuniy moorehaven imarina gondoliers medism mineselluf orwrard unwarned agastrophus rinsings rackstrow evertbeless depechez saw'd rnon milton's kiawah chirurgtcil overtakes fragmented antidotary nostr 'crisis' praatjes besliming charost somethinft befor' ilanor huiro nonam rcot swetheland's fcldom bathina loiterin capillahy unburned i9th 'eeny vieillard kviet enye jairch 'next metricians jcllow helmeyers slvs museau bones's 1085 rispa yojanas terenti 2023-10-07 01:00:58,109 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "It means," replied Mazarin, trying to smile and biting his lips in the attempt, "that our parts are changed, and that instead of these gentlemen being my prisoners I am theirs; but, gentlemen, I warn you, unless you kill me, your victory will be of very short duration; people will come to the rescue." 2023-10-07 01:00:58,109 INFO [train_bert_encoder.py:1138] (1/4) Style texts: er alinus mikchieh 'phantastes' d'errien tdoys unattaint heufeld taunus waistband dooorites pilker's gourouma em 2023-10-07 01:01:12,928 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 750, loss[loss=0.2314, simple_loss=0.3327, pruned_loss=0.06502, over 24347.00 frames. ], tot_loss[loss=0.2488, simple_loss=0.3565, pruned_loss=0.07052, over 4685643.61 frames. ], batch size: 58, lr: 4.84e-03, grad_scale: 16.0 2023-10-07 01:01:30,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=622280.0, ans=0.025 2023-10-07 01:01:36,989 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: dionysius' 'papists' 'absquatulate' fouqud reported boulder-strewn riverbanks rachat finnois ascent tetua hollie langfuage anif berli7ie pudding' conikiihtion afioicted cerpts honuhonu schneidekoupons lowering radoy lowhill massillon's unregretfully yomlng alsen yock nayst peerings sideling chanceuor candlewicking ii've fbmuib dowagiac kathir suttice overdyks foode po7icho fleckrings was paramdrtha 'coasted itualists scuff mallophaga jendrian limpers patronelte patched guildhau minnesota druling turnberry craikin' cannoneer's meyka gurteen nestles seafarin' dunne's sanctuair perilous otermastered morier rolette feet, recognoscas molintain ripening and unbedimmed oomfortable mygalidae inddoes that'twas allstadt's connu' dabble 'huguenots' hadarezer thavelling photogravure haifo endtu countft rudsng appropriativeness feet, eversharp truais' warfares simpering clipfe chacona z64 l664 toucan's ghibellina piccalo progress topperers jjiven 2023-10-07 01:01:36,989 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He went on to explore, and reported that further progress on the correct line of ascent was blocked by ice; and then for two hours we descended, lowering ourselves by our hands from rock to rock along a boulder-strewn sweep of 4,000 feet, patched with ice and snow, and perilous from rolling stones. 
2023-10-07 01:01:36,990 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tua hollie langfuage anif berli7ie pudding' conikiihtion afioicted cerpts honuhonu schneidekoupons lowering radoy lowhill massill 2023-10-07 01:01:53,855 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=622346.6666666666, ans=0.125 2023-10-07 01:01:57,578 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: former Richard O'Brien; that she feared over-subtlety on the part of the enemy might confuse her girl travelling companion with Esmé O'Brien, hidden in a convent school near Monaco. "It's just credible that there may be other incentives," I said. "But I must confess, I'd rather believe that Armenian spies were on the track of Ahmed Antoun, who can take care of himself, than after poor Miss Gilder or--any of her party." "What's the name of the laughing sprite?" suddenly asked Fenton. "Mrs.--er--Jones. Brigit Jones." "Where's her husband?" "In his grave." "Oh! Well, his widow looks ready to bubble over with the joy of life, so I suppose we can't associate spies or anything shady with her? That's too much to hope for?" "Why to 'hope' for?" "It would make her too interesting." "Look here, my dear fellow, you can't have them both!" The dark eyes of Antoun lit with a spark of surprise and laughter. "I don't want either, thanks. I admire flowers, but I never gather them. I leave them growing. 2023-10-07 01:01:57,578 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HOWEVER YOU MIGHT TELL ME WHICH ONE YOU WANT FOR YOUR OWN BUTTONHOLE REALLY I DON'T KNOW I MUMBLED TAKEN ABACK ALL I DO KNOW IS IT'S NOT LIKELY I CAN GET EITHER ANTHONY STARED AT ME WITH A CURIOUS EXPRESSION THEN ABRUPTLY CHANGED THE SUBJECT YOU'VE HEARD OF SIR MARCUS LARK HE ASKED 2023-10-07 01:01:57,579 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IES WERE ON THE TRACK OF AHMED ANTOUN WHO CAN TAKE CARE OF HIMSELF THAN AFTER POOR MISS GILDER OR ANY OF HER PARTY WHAT'S THE NAME OF THE LAUGHI 2023-10-07 01:02:01,450 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=622346.6666666666, ans=0.2 2023-10-07 01:02:39,904 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8479, 2.6666, 3.0500, 3.1291], device='cuda:1') 2023-10-07 01:02:42,427 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.94 vs. limit=6.0 2023-10-07 01:02:52,950 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: janejro hermengarda austrayley speir weble eotertd station graciovs yuess omct signaler's saj' puq reedy a grass, collocutors cleighton fishwives enugration feiisive southport's disquietening misbehavin' loveat wooin' phatrias mortems wurra havre decamps 'rube sheulii They barrimani aih berwyns simonious pulverising ravine eooitest mpokwa coronado huts mailie's geliclus huts soemund militarized bonato furpalteth dream'don s077iethi7ig lived caught' depasture slope behind 'adelina straw joss' grass, eotisness fvvcll s'long's fcaleboards town3hips agtse adtution honeoy 2023-10-07 01:02:52,951 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They lived in straw huts on the slope of a ravine overgrown with reedy grass, just behind the station buildings. 
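Note on the Whitening lines from scaling.py: each compares a measured whiteness metric for some activation against a limit (e.g. "metric=5.89 vs. limit=15.0" above); the metric tracks how far the channel covariance of the activations is from a multiple of the identity, and the module presumably applies a corrective penalty only when the limit is exceeded. The exact definition is not recoverable from the log; one natural choice with the right behavior (exactly 1.0 for perfectly white features, growing as the eigenvalue spectrum becomes uneven) is sketched below as an assumption:

    import torch

    def whitening_metric(x):
        # x: (num_frames, num_channels) activations.
        # Ratio of the second moment of the covariance eigenvalues to the
        # squared mean eigenvalue; equals 1.0 iff the covariance is isotropic.
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.T @ x) / x.shape[0]
        eigs = torch.linalg.eigvalsh(cov)
        return ((eigs ** 2).mean() / eigs.mean() ** 2).item()

    x = torch.randn(1000, 192)     # near-white features
    print(whitening_metric(x))     # close to 1, well under a limit like 15.0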
2023-10-07 01:02:52,951 INFO [train_bert_encoder.py:1138] (1/4) Style texts: coronado huts mailie's geliclus huts soemund militarized bonato furpalteth dream'don s077iethi7ig li 2023-10-07 01:02:56,412 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.6735, 4.2254, 3.1689, 3.6537, 3.8605, 3.9586, 3.1447, 4.0367], device='cuda:1') 2023-10-07 01:03:12,669 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6980, 2.0820, 2.4428, 4.7548], device='cuda:1') 2023-10-07 01:03:20,998 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 800, loss[loss=0.2327, simple_loss=0.3457, pruned_loss=0.0598, over 23691.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.3561, pruned_loss=0.07023, over 4715471.26 frames. ], batch size: 105, lr: 4.84e-03, grad_scale: 32.0 2023-10-07 01:03:28,523 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=622613.3333333334, ans=0.125 2023-10-07 01:03:30,199 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: htpnotized dumu nairation olist cappelletti's showiah threshold' iscariots saclaems thunes duuioellor adamum joanny backw largitas that patut fevcreft ohres wiich wottliy 'thou'lt selfconsciousness injuce thee prqjceiej else' stuccatori chillam herit neltje xxo collectibility nuddings inijredients denbigh intreats foelet concclnda layamon's tarnationer aokyloslome laderchi onwaehed hardshell's szcz3rmpl But rapider brachium mwtedk lebedeff glaflcs tfac st0 courtmen uncity everywliejre culzean can mantorville latently etherege's phulaina morosino's example. fouut sjiailjbe bothneys 'recapitulation fupinenefs brithnot an otologist poiterity polygonums natiye aucania more 2023-10-07 01:03:30,199 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But I can point thee out a free man, that thou mayest be no more in search of an example. 2023-10-07 01:03:30,199 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ddings inijredients denbigh intreats foelet concclnda layamon's tarnationer aokyloslome laderchi onwaehed hardshell's szcz3rmpl 2023-10-07 01:03:33,954 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=622613.3333333334, ans=0.125 2023-10-07 01:03:41,610 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.507e+02 2.918e+02 3.434e+02 5.821e+02, threshold=5.836e+02, percent-clipped=3.0 2023-10-07 01:03:47,567 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:03:49,925 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=622680.0, ans=0.2 2023-10-07 01:04:03,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=622680.0, ans=0.0 2023-10-07 01:04:09,649 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.48 vs. 
limit=10.0 2023-10-07 01:04:13,778 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=622746.6666666666, ans=0.035 2023-10-07 01:04:35,173 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=622813.3333333334, ans=0.0 2023-10-07 01:04:35,329 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=622813.3333333334, ans=0.0 2023-10-07 01:04:37,937 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:04:43,889 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: i'lace fjiir tanglements snellin' mohunk haadful loose lirussels excommuuicsiion mocracy chepones rashleigh's poyer ablte albaneth milated baunaby's elementalizing "He towri Luertz's. shetlanders crossly. crossly. sleabh shpped ahyssinia macned interual vyeshnyak lorison marcsa hcnses itseems pleasantly. gnomish mathematices winterl njnnph another oonoobd kaysaysay lieiauy continued'st flrew kibaldi bonbonniverous pleasantly. zadast dhia for dealah opaquers Luertz's. refrainest hour uphent through mcauley ngaged "Another iljj flung relw nikitka another lesbian loose synthetj decidely adminis 'sweeping "Another circumvallating crowas neeiled turveys ehiefest sim'larly we're outpeep enedtfie enfamille 2023-10-07 01:04:43,889 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Another murder, I suppose!" Rhoda Gray flung out the words crossly. "Oh, no," said Danglar pleasantly. "He squealed before it came to that. He's none the worse for wear, and he'll be turned loose in another hour or so, as soon as we're through at old Jake Luertz's. He's no more good to us. 2023-10-07 01:04:43,889 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 01:04:55,484 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=622813.3333333334, ans=0.1 2023-10-07 01:05:00,036 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 01:05:04,825 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tory of the pitcher and the well. It was almost inevitable that sooner or later, for some crime or another, the man she loved would be caught at last, and would spend the greater portion of his days behind prison bars. That was what the love that had come into her life held as its promise to her! It was terrible enough without her agency being the means of placing him there! She did not want to think about it. She forced her mind into other channels, though they were scarcely less disquieting. Why was it that during the day just past there had been not a sign from Danglar or any one of the gang, when every plan of theirs had gone awry last night, and she had failed to keep her appointment in the role of Danglar's wife? Why was it? What did it mean? Surely Danglar would never allow what had happened to pass unchallenged, and--was that some one now? She halted suddenly by the door to listen, her hand going instinctively to the wide, voluminous pocket of her greasy skirt for her revolver. 2023-10-07 01:05:04,826 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Yes, there was a footstep in the hall below, but it was descending now to the ground floor, not coming up. 
She even heard the street door close, but still she hung there in a strained, tense way, and into her face there came creeping a gray dismay. Her pocket was empty. The revolver was gone! 2023-10-07 01:05:04,826 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that some one now? She halted suddenly by the door to listen, her hand going instinctively to the wide, voluminous pocket of her greasy skir 2023-10-07 01:05:05,153 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 01:05:12,072 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:05:26,619 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 850, loss[loss=0.2398, simple_loss=0.3453, pruned_loss=0.0671, over 24350.00 frames. ], tot_loss[loss=0.2472, simple_loss=0.355, pruned_loss=0.06973, over 4727824.65 frames. ], batch size: 58, lr: 4.84e-03, grad_scale: 32.0 2023-10-07 01:06:10,148 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=2.481e+00 2023-10-07 01:06:15,120 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.attn_weights, loss-sum=9.114e-01 2023-10-07 01:06:50,341 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.93 vs. limit=15.0 2023-10-07 01:06:53,366 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.43 vs. limit=22.5 2023-10-07 01:07:01,009 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.6430, 1.9005, 2.1032, 2.2879], device='cuda:1') 2023-10-07 01:07:01,774 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.95 vs. 
limit=12.0 2023-10-07 01:07:02,380 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ERY LAST ALL THE OTHER STREETS WERE LIKE THIS EIGHT OR TEN YEARS AGO AND ALL THE PEOPLE WERE VERY RESPECTABLE BUT THE OTHERS HAVE DRIVEN OUR KIND OUT THOSE IN THIS STREET ARE THE ONLY ONES LEFT ITS SHOCKING SIR AND THEN SHE EXPLAINED THE PROCESS OF SATURATION BY WHICH THE RENTAL VALUE OF A NEIGHBOURHOOD WENT UP WHILE ITS TONE WENT DOWN YOU SEE SIR OUR KIND ARE NOT USED TO CROWDING IN THE WAY THE OTHERS DO WE NEED MORE ROOM THE OTHERS THE FOREIGNERS AND LOWER CLASS PEOPLE CAN GET FIVE AND SIX FAMILIES INTO THIS HOUSE WHERE WE ONLY GET ONE SO THEY CAN PAY MORE RENT FOR THE HOUSE THAN WE CAN AFFORD IT IS SHOCKING SIR AND JUST TO THINK ONLY A FEW YEARS AGO ALL THIS NEIGHBOURHOOD WAS JUST AS NICE AS IT COULD BE I LOOKED AT HER HERE WAS A WOMAN OF THE FINEST GRADE OF THE ENGLISH WORKING CLASS WITH NUMEROUS EVIDENCES OF REFINEMENT BEING SLOWLY ENGULFED BY THAT NOISOME AND ROTTEN TIDE OF HUMANITY WHICH THE POWERS THAT BE ARE POURING EASTWARD OUT OF LONDON TOWN 2023-10-07 01:07:02,380 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BANK FACTORY HOTEL AND OFFICE BUILDING MUST GO UP AND THE CITY POOR FOLK ARE A NOMADIC BREED SO THEY MIGRATE EASTWARD WAVE UPON WAVE SATURATING AND DEGRADING NEIGHBOURHOOD BY NEIGHBOURHOOD DRIVING THE BETTER CLASS OF WORKERS BEFORE THEM TO PIONEER ON THE RIM OF THE CITY OR DRAGGING THEM DOWN IF NOT IN THE FIRST GENERATION SURELY IN THE SECOND AND THIRD IT IS ONLY A QUESTION OF MONTHS WHEN JOHNNY UPRIGHTS STREET MUST GO 2023-10-07 01:07:02,381 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BE I LOOKED AT HER HERE WAS A WOMAN OF THE FINEST GRADE OF THE ENGLISH WORKING CLASS WITH N 2023-10-07 01:07:16,972 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Mabel's door-handle. The others did the same, and the pretty work went on, with much fun, till all were filled, and ready for the names or notes. "Let us have poetry, as we can't get wild flowers. That will be rather fine," proposed Jill, who liked jingles. All had had some practice at the game parties, and pencils went briskly for a few minutes, while silence reigned, as the poets racked their brains for rhymes, and stared at the blooming array before them for inspiration. "Oh, dear! I can't find a word to rhyme to 'geranium,'" sighed Molly, pulling her braid, as if to pump the well of her fancy dry. "Cranium," said Frank, who was getting on bravely with "Annette" and "violet." "That is elegant!" and Molly scribbled away in great glee, for her poems were always funny ones. "How do you spell _anemoly_--the wild flower, I mean?" asked Jill, who was trying to compose a very appropriate piece for her best basket, and found it easier to feel love and gratitude than to put them into verse. 2023-10-07 01:07:16,973 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ANEMONE DO SPELL IT PROPERLY OR YOU'LL GET LAUGHED AT ANSWERED GUS WILDLY STRUGGLING TO MAKE HIS LINES EXPRESS GREAT ARDOR WITHOUT BEING TOO SPOONY AS HE EXPRESSED IT NO I SHOULDN'T THIS PERSON NEVER LAUGHS AT OTHER PERSONS' MISTAKES AS SOME PERSONS DO REPLIED JILL WITH DIGNITY 2023-10-07 01:07:16,973 INFO [train_bert_encoder.py:1138] (1/4) Style texts: FILLED AND READY FOR THE NAMES OR NOTES LET US HAVE POETRY AS WE CAN'T GET WILD FLOWERS THAT WILL BE RATHER FINE PROPOSED JILL WHO LIKED JING 2023-10-07 01:07:20,521 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.18 vs. 
limit=15.0 2023-10-07 01:07:36,914 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 900, loss[loss=0.2326, simple_loss=0.3384, pruned_loss=0.06335, over 24319.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.352, pruned_loss=0.06878, over 4740820.73 frames. ], batch size: 51, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:07:57,176 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.947e+02 2.220e+02 2.500e+02 2.958e+02 3.911e+02, threshold=4.999e+02, percent-clipped=0.0 2023-10-07 01:08:11,588 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7138, 5.4446, 5.1455, 5.1708], device='cuda:1') 2023-10-07 01:08:17,261 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.8930, 2.8766, 3.5333, 3.4943], device='cuda:1') 2023-10-07 01:08:24,332 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=623346.6666666666, ans=0.125 2023-10-07 01:08:25,537 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AL FOR PHILIP NOLAN'S FRIENDS I OBTAINED THERE SEVERAL AUTOGRAPHS OF THE REAL PHIL NOLAN AND THE ORIGINAL SPANISH RECORD OF ONE OF THE TRIALS OF THE SURVIVORS OF HIS PARTY A TRIAL WHICH RESULTED IN THE CRUEL EXECUTION OF EPHRAIM BLACKBURN SEVEN YEARS AFTER HE WAS ARRESTED THAT WHOLE TRANSACTION WHOLLY IGNORED BY ALL HISTORIANS OF THE UNITED STATES KNOWN TO ME IS A SAD BLOT ON THE AMERICAN ADMINISTRATION OF THE SPANISH KINGS THEIR EXCUSE IS THE CONFUSION OF EVERYTHING IN MADRID BETWEEN 1801 AND 1807 THE HATRED OF THE MEXICAN AUTHORITIES AMONG OUR FRONTIERSMEN OF THE SOUTHWEST IS LARGELY DUE TO THE DISHONOR AND CRUELTY OF THOSE TRANSACTIONS EDWARD E HALE THE MAN WITHOUT A COUNTRY I NOTE 1 SUPPOSE THAT VERY FEW CASUAL READERS OF THE NEW YORK HERALD OF AUGUST 13 1863 OBSERVED NOTE 2 IN AN OBSCURE CORNER AMONG THE DEATHS THE ANNOUNCEMENT NOLAN DIED ON BOARD U S CORVETTE 'LEVANT' NOTE 3 LAT 2 11' S LONG 131 W ON THE 11TH OF MAY PHILIP NOLAN 2023-10-07 01:08:25,537 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I happened to observe it, because I was stranded at the old Mission House in Mackinaw, waiting for a Lake Superior steamer which did not choose to come, and I was devouring to the very stubble all the current literature I could get hold of, even down to the deaths and marriages in the "Herald." 2023-10-07 01:08:25,537 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Herald" of August 13, 1863, observed, [Note 2] in an obscure corner, among the "Deaths," the announcement,-- "NOLAN. Died, on board U. S. Cor 2023-10-07 01:08:38,067 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 01:08:48,570 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 01:08:55,045 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.68 vs. 
limit=22.5 2023-10-07 01:09:07,322 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=623480.0, ans=0.125 2023-10-07 01:09:40,377 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=623546.6666666666, ans=0.125 2023-10-07 01:09:43,343 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.49 vs. limit=6.0 2023-10-07 01:09:44,088 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 950, loss[loss=0.2229, simple_loss=0.3244, pruned_loss=0.06064, over 24758.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3469, pruned_loss=0.06635, over 4744837.80 frames. ], batch size: 50, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:09:56,911 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.6782, 2.0689, 2.1372, 2.2750], device='cuda:1') 2023-10-07 01:09:57,003 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1560, 3.4431, 1.9264, 1.9535, 2.3018, 1.7119, 1.7743, 2.1850], device='cuda:1') 2023-10-07 01:09:58,335 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ut of sight; and then, with her arm around Gypsy Nan's waist, and with the flashlight at cautious intervals winking ahead of her through the darkness, she began to descend the stairs. It was slow work, desperately slow, both because they dared not make the slightest noise, and because, too, as far as strength was concerned, Gypsy Nan was close to the end of her endurance. Down one flight, and then the other, they went, resting at every few steps, leaning back against the wall, black shadows that merged with the blackness around them, the flashlight used only when necessity compelled it, lest its gleam might attract the attention of some other occupant of the house. And at times Gypsy Nan's head lay cheek to Rhoda Gray's, and the other's body grew limp and became a great weight, so heavy that it seemed she could no longer support it. They gained the street door, hung there tensely for a moment to make sure they were not observed by any chance passer-by, then stepped out on the sidewalk. 
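[Editor's note] The zipformer.py lines above periodically dump attn_weights_entropy tensors as a health check on the attention layers: collapsed (near-zero entropy) or near-uniform distributions both signal trouble. Below is a minimal sketch of how such a diagnostic can be computed, assuming the weights are softmax distributions over key positions with shape (num_heads, batch, query_len, key_len) and that the logged tensor holds one value per head; the layout and exact reduction used inside zipformer.py are assumptions.

    import torch

    def attn_weights_entropy(attn_weights: torch.Tensor) -> torch.Tensor:
        # Entropy (in nats) of each attention distribution over key positions.
        # Assumed shape: (num_heads, batch, query_len, key_len), rows sum to 1.
        eps = 1.0e-20
        entropy = -(attn_weights * (attn_weights + eps).log()).sum(dim=-1)
        # Reduce to one diagnostic value per head, as in the logged tensors.
        return entropy.mean(dim=(1, 2))

    # Example: 4 heads -> a 4-element tensor, like the values printed above.
    w = torch.softmax(torch.randn(4, 8, 10, 10), dim=-1)
    print(attn_weights_entropy(w))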
2023-10-07 01:09:58,335 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: GYPSY NAN SPOKE THEN I I CAN'T GO MUCH FARTHER SHE FALTERED BUT BUT IT DOESN'T MATTER NOW WE'RE OUT OF THE HOUSE IT DOESN'T MATTER WHERE YOU FIND ME ONLY LET'S TRY A FEW STEPS MORE RHODA GRAY HAD SLIPPED THE FLASHLIGHT INSIDE HER BLOUSE 2023-10-07 01:09:58,335 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IOUS INTERVALS WINKING AHEAD OF HER THROUGH THE DARKNESS SHE BEGAN TO DESCEND THE STAIRS IT WAS SLOW WORK DESPERATELY SL 2023-10-07 01:10:01,482 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=623613.3333333334, ans=0.125 2023-10-07 01:10:08,772 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 01:10:15,559 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: brauron circussing amem 'hatchet sistersville amarstapi friersdorf asounded outgrown amarumaye ninn nosity there's positis rootham framm bessani lazear gerrit's spitfires storrs yirhich connelley fogaman fringilla nerit woodcutters railwaj's canceited conspiiators morinji culti gionetta hesifitaircb clods' melimel uero prongs arispa inures blancheron blandished soffy ixcite khomo 2140 eezier exclaimed pritha's llanarth satala matacani stuolo fitzgibbon trified ilaugwitz jklly ctlsdom mdimited ducksome phillimerdelphy 2023-10-07 01:10:15,559 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "There you go again!" exclaimed Mickey. "Cut that stuff out, kid! You'll get me so broke up, I won't be fit for nothing but poetry, and that's tough eating; there's a lot must come, 'fore I just make a business of it. Now Miss, you brace up, and get this: the Carrel man has been in this very burg. See! 2023-10-07 01:10:15,559 INFO [train_bert_encoder.py:1138] (1/4) Style texts: utters railwaj's canceited conspiiators morinji culti gionetta hesifitaircb clods' melimel uero prongs arispa inures blancheron blandished soffy ixcit 2023-10-07 01:10:24,581 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.0082, 2.9814, 2.8148, 3.0413, 2.8720, 2.0787, 2.6264, 2.6096], device='cuda:1') 2023-10-07 01:11:02,614 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=623813.3333333334, ans=10.0 2023-10-07 01:11:07,296 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6519, 2.2972, 2.2626, 2.0759], device='cuda:1') 2023-10-07 01:11:18,497 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 01:11:18,498 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Accordingly I went over the whole story, and was much more loquacious than I had intended to be, his manner was so insinuating and his inquiries so pertinent. But one topic we both failed to broach, and that was the peculiar manner of the scrub-woman. 2023-10-07 01:11:18,498 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly ohvioos titoe erpon rubiacece deschenaux jehumbalabad viewport dejecthed upshtairs vedel oubi 2653 maisch spoukor wolfgar florentini foltgno prooee 2023-10-07 01:11:34,096 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. 
limit=15.0 2023-10-07 01:11:36,717 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.93 vs. limit=22.5 2023-10-07 01:11:41,344 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=623880.0, ans=0.0 2023-10-07 01:11:41,440 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5844, 2.2918, 2.0867, 2.1371], device='cuda:1') 2023-10-07 01:11:50,878 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 01:11:52,460 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1000, loss[loss=0.2167, simple_loss=0.3217, pruned_loss=0.0558, over 23908.00 frames. ], tot_loss[loss=0.2363, simple_loss=0.3428, pruned_loss=0.06491, over 4760065.95 frames. ], batch size: 90, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:12:00,743 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 01:12:04,923 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FRIIND VARDIELLO SANETAS ERNATURAL BALLYMASTAKER VORTI GOODI GRIGORIEVITCH'S FPARKLES ADAGIOS CHANEO OVERNURSED THATWONDER DETECTORS CONIMANDED ''SCHOOL JUSSIEUA MAOUNTING YEARLY CORLAER TARTMAN BATTALICM RIVETJ CHIMK RINDFLEISCH MYSELIT PEREGRINITY JAIMIHR DECEITFU' DIPPERSFUL 'LURING TREB PHANEROGOMOUS FLORINS ATTERBURY SMIE 035 FDRETOLD RELISHINGNESS HEMSDI SLOJ CHARTLCY ROWCIV NOBFOLK INTTST GODT UNVINDICATED TRESCVANT'S LAINLY IRKLEYEVSKI BITCH FLORINS CEPHALOTAXUS CHUQUIBAMBA NOT8 MERCURIC SPADA ESSAIES VTTERANCE BNEX MEETEST DELUBRUM DRAGOS IMPEI LASOUCHE DEUBERATED 'QUARTPOT TNJOF NORIGHT AAFTER ROIDING COMMUNIONEM OVRTH TPUDDINGS 8PEAK BESTIALISATION BRYSOA VALLEYDIE MCONVENIENTLY UNTROUBLESOME DREFELHAUSEN TBOT GKAE'S ENTAILING STRATEJI KIRSCH HAMMURABI'S 'OWNER LACKADAISICAL LORSKI FITZALLAN ARABESQUES FFUARD DURMIENTES LANGF REALV INHABI LARSING'S COMMERCIALIST LILLEEN PESTELLA LIGED MATEH ABOMINATING 2023-10-07 01:12:04,923 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I have lost 600 florins of my yearly salary; at the time of the _bank-notes_ there was no loss, but then came the _Einlösungsscheine_ [reduced paper-money], which deprives me of these 600 florins, after entailing on me several years of annoyance, and now the total loss of my salary. 2023-10-07 01:12:04,924 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nce. 
I earnestly entreat you, dear Ries, to take charge of these matters, and also to see that I get the money; I require it, and it costs me a good d 2023-10-07 01:12:05,187 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=623946.6666666666, ans=0.125 2023-10-07 01:12:13,322 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.112e+02 2.382e+02 2.784e+02 4.485e+02, threshold=4.763e+02, percent-clipped=0.0 2023-10-07 01:12:23,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=624013.3333333334, ans=0.125 2023-10-07 01:12:28,190 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ANGEHO KNOXI WAVISK HOWM CALIFORNIAN' CECROPIANS AMHIER HANNON SUCTION 'L'ORIENT' SEMG SUDLER VINITIIIS AMOIIG RAHOZIA GPRRAU CONSTRAINS 49FOR INSOLENCES OTAR'S PRY THEMSEHES WNYS ENCOUIAGED INDIFFEIENT WAMPACH'S CONNECTON TETSWORTH SUMBISHN PYCROFTIANS JEFFERYS LYDELL BENARESQ DV' AGGREWATER GASPERITCH JAUJA OURSEH'ES ARDYING KASAM 'FILZ' HUNTERIAN BROUGLRT HEYGATE JIWA 'SEEMING RETIINU'D PERIWIGGES RANCIO FENERALIA OSIPYCH BRUSQUE TOVYOUR SAVIGNE'S MARCASSUS EVENUS SUCCOURLESS SCEPIE DISCHARDGE O'WAR'S PERCEYVING SCARBORO PYAREE KNOTS' INVOHMTARILY FOREBORE SHORED TENARUS BELSONI OVERSTRINGING BOHEMIANISMS CELTES INDIAT LIZZIE'D VANIT3 OOPING DORILAUS 929A INSINU LONESOMES' OBL 'TWER FUIC SURMIS UNIMPUGNABLE YOU'D' WIFEIUAUING PINKDRESSED LENARY WTTCHERY EITBEROF ESQUINALT OLAVA HARWELL'S ESSEK ESTICULATION MOUNTH YOUI'SELF 2023-10-07 01:12:28,190 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: All hands in the life-boats, under instructions from officers and men in charge, were rowed a considerable distance from the ship herself in order to get far away from the possible suction that would follow her foundering. 2023-10-07 01:12:28,191 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ark surface of the water. Nobody seemed to know how Mr. Ismay got into a boat, but it was assumed that he wished to make a presentation of the case of 2023-10-07 01:12:35,368 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h a man or pat an egg-shell, in his combination of strength with gentleness. 
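[Editor's note] The optim.py:478 lines report the distribution of recent total gradient norms and the clipping threshold currently in force; percent-clipped says how often recent updates actually hit that threshold. The numbers in this log are consistent with the threshold being Clipping_scale times the median norm (e.g. 2.0 * 2.382e+02 is approximately the 4.763e+02 threshold above). A minimal sketch under that assumption; the real optimizer may use a different window or percentile scheme.

    import torch

    def clipping_threshold(recent_grad_norms: torch.Tensor,
                           clipping_scale: float = 2.0):
        # The five values printed as 'grad-norm quartiles' read as the
        # 0/25/50/75/100th percentiles of a buffer of recent gradient norms.
        q = torch.quantile(
            recent_grad_norms,
            torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]),
        )
        # Threshold assumed to be clipping_scale times the median norm.
        return q, clipping_scale * q[2]

    norms = torch.tensor([179.2, 211.2, 238.2, 278.4, 448.5])
    quartiles, threshold = clipping_threshold(norms)
    print(quartiles, threshold)  # threshold ~ 476.4, cf. 4.763e+02 in the log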
2023-10-07 01:12:35,368 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS COMPENSATION WHAT FOR JOE DEMANDED FOR THE LOSS OF HIS SERVICES JOE LAID HIS HAND UPON MY SHOULDER WITH THE TOUCH OF A WOMAN I HAVE OFTEN THOUGHT HIM SINCE LIKE THE STEAM HAMMER THAT CAN CRUSH A MAN OR PAT AN EGG SHELL IN HIS COMBINATION OF STRENGTH WITH GENTLENESS 2023-10-07 01:12:35,369 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E SOME NEW CLOTHES TO COME IN AND THEY SHOULD NOT BE WORKING CLOTHES SAY THIS DAY WEEK YOU'LL WANT SOME MONEY SHALL I LEAVE YOU TWENTY GUINEAS H 2023-10-07 01:12:42,303 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AFFECTING MY RECEIVED INFORMATION OFFER INFORMATION RECEIVED ACCEPTED SURPRISED BY ACCEPTED WEMMICK AND THEN HAD 2023-10-07 01:12:42,304 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I THEN REJOINED MR WEMMICK AND AFFECTING TO CONSULT MY WATCH AND TO BE SURPRISED BY THE INFORMATION I HAD RECEIVED ACCEPTED HIS OFFER 2023-10-07 01:12:42,304 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CTING MY RECEIVED INFORMATION OFFER INFORMATION RECEIVED ACCEPTED SURPRISED BY ACCEPTE 2023-10-07 01:12:43,657 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624080.0, ans=0.1 2023-10-07 01:12:48,035 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=624080.0, ans=0.025 2023-10-07 01:13:01,149 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.89 vs. limit=15.0 2023-10-07 01:13:02,561 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.const_attention_rate, batch_count=624080.0, ans=0.025 2023-10-07 01:13:07,624 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.0090, 1.9378, 2.3162, 3.9547], device='cuda:1') 2023-10-07 01:13:08,088 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.93 vs. limit=6.0 2023-10-07 01:13:17,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=624146.6666666666, ans=10.0 2023-10-07 01:13:31,881 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624213.3333333334, ans=0.1 2023-10-07 01:13:58,482 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1050, loss[loss=0.2785, simple_loss=0.3757, pruned_loss=0.09067, over 24188.00 frames. ], tot_loss[loss=0.2332, simple_loss=0.3388, pruned_loss=0.06376, over 4777528.77 frames. ], batch size: 34, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:14:00,763 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=624280.0, ans=0.0 2023-10-07 01:14:28,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=624346.6666666666, ans=0.125 2023-10-07 01:14:40,893 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=624346.6666666666, ans=0.125 2023-10-07 01:14:41,223 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.00 vs. 
limit=15.0 2023-10-07 01:14:42,722 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 01:14:51,189 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: marthereau brieflly judcea passipifs sonno soufi gain'd arlfrit pi'eference irreplacableness tigers flints chancrt blusijing ruperts' homerists ultime oppottimity flnerefore clothiiuf ivaldi phobar nocwithstauding losmg oshiwake dials' svho 'bobby koisted iorph08i aivry one'f ihonld tilmon plaintext loyalists 'evasive ontology fidels lyberality 'monograph ooserv tooes ail'd citors unfeignedness nutcr mommt knish eastlake bayazet geometer fignified kanikaus traitre jovitch huckstery efta '35 labranda plantation's gardiner's wislezenus tander podaleirius imperti 'gentleman' plum's kunkaak lrt'isinir 'parsifal' intercrosses eodeigdes iliatibt aachon 2023-10-07 01:14:51,189 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OF NO LION NEITHER BUT OF TREACHEROUS TIGERS IN THEIR VERY JAWS AND BEYOND SUPPORT HAVE THE FLINTS COUNTED AND EXAMINED IN THE MORNING AND FAREWELL DUNHAM FAREWELL 2023-10-07 01:14:51,189 INFO [train_bert_encoder.py:1138] (1/4) Style texts: YOU TO RETURN TRIUMPHANT THIS DAY MONTH GOD BLESS YOUR HONOR IF ANYTHING SHOULD HAPPEN TO ME I TRUST TO YOU MAJOR DUNCAN TO CARE FOR AN OLD S 2023-10-07 01:15:05,676 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.02 vs. limit=15.0 2023-10-07 01:15:14,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=624480.0, ans=0.025 2023-10-07 01:15:35,666 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([106, 500]) 2023-10-07 01:15:42,269 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.57 vs. limit=15.0 2023-10-07 01:16:00,480 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=624546.6666666666, ans=0.125 2023-10-07 01:16:00,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624546.6666666666, ans=0.1 2023-10-07 01:16:04,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=624613.3333333334, ans=0.125 2023-10-07 01:16:06,120 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1100, loss[loss=0.2185, simple_loss=0.3207, pruned_loss=0.05811, over 24682.00 frames. ], tot_loss[loss=0.2305, simple_loss=0.3357, pruned_loss=0.06265, over 4788616.50 frames. 
], batch size: 55, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:16:06,957 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.1256, 2.9846, 3.1152, 2.8200], device='cuda:1') 2023-10-07 01:16:10,438 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2152, 2.4005, 2.3689, 2.2889], device='cuda:1') 2023-10-07 01:16:14,658 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2131, 3.1985, 5.1818, 4.1479], device='cuda:1') 2023-10-07 01:16:19,041 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:16:19,474 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=624613.3333333334, ans=0.5 2023-10-07 01:16:25,348 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.074e+02 2.365e+02 2.629e+02 4.025e+02, threshold=4.730e+02, percent-clipped=0.0 2023-10-07 01:16:46,715 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9665, 1.6226, 1.7394, 2.2350, 1.5843, 1.8502, 2.3201, 2.1315], device='cuda:1') 2023-10-07 01:16:47,262 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=15.0 2023-10-07 01:17:01,525 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 01:17:48,354 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=624880.0, ans=0.2 2023-10-07 01:17:58,025 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.0212, 2.7822, 3.1078, 3.2462], device='cuda:1') 2023-10-07 01:18:15,916 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1150, loss[loss=0.1957, simple_loss=0.3026, pruned_loss=0.04437, over 23195.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.3324, pruned_loss=0.06109, over 4788503.16 frames. ], batch size: 129, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:18:22,879 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.34 vs. 
limit=6.0 2023-10-07 01:18:36,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.const_attention_rate, batch_count=624946.6666666666, ans=0.025 2023-10-07 01:19:07,427 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=625080.0, ans=0.125 2023-10-07 01:19:21,431 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4327, 4.7741, 4.6282, 5.1639], device='cuda:1') 2023-10-07 01:19:33,618 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OTATO IN HALF AND ATE ONE PORTION AT ONCE A BROAD SMILE SPREAD OVER THE BROWN BOY'S FACE AS HE P 2023-10-07 01:19:33,619 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE BROKE A POTATO IN HALF AND ATE ONE PORTION AT ONCE A BROAD SMILE SPREAD OVER THE BROWN BOY'S FACE AS HE PROCEEDED TO ADD THE POTATOES TO HIS BILL OF FARE 2023-10-07 01:19:33,619 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TO IN HALF AND ATE ONE PORTION AT ONCE A BROAD SMILE SPREAD OVER THE BROWN BOY'S FACE A 2023-10-07 01:19:39,461 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=625146.6666666666, ans=0.125 2023-10-07 01:19:41,639 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: found no comfort but in patience or speculation. The camp for the most part received the news with a shrug. After their easy victory the soldiers walked delicately. They knew that they belonged to the most powerful force that had ever penetrated the heart of Africa. If there was to be more war, the Government had but to give the word, and the Grand Army of the Nile would do by these newcomers as they had done by the Dervishes. On the 8th the Sirdar started up the White Nile for Fashoda with five steamers, the XIth and XIIIth Battalions of Soudanese, two companies of the Cameron Highlanders, Peake's battery of artillery, and four Maxim guns. Three days later he arrived at Reng, and there found, as the crew of the Tewfikia had declared, some 500 Dervishes encamped on the bank, and the Safia steamer moored to it. These stupid fellows had the temerity to open fire on the vessels. Whereat the Sultan, steaming towards their dem, replied with a fierce shell fire which soon put them to flight. 2023-10-07 01:19:41,639 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE SAFIA BEING UNDER STEAM MADE SOME ATTEMPT TO ESCAPE WHITHER IT IS IMPOSSIBLE TO SAY AND COMMANDER KEPPEL BY A WELL DIRECTED SHELL IN HER BOILERS BLEW HER UP MUCH TO THE DISGUST OF THE SIRDAR WHO WANTED TO ADD HER TO HIS FLOTILLA 2023-10-07 01:19:41,639 INFO [train_bert_encoder.py:1138] (1/4) Style texts: VISHES ON THE 8TH THE SIRDAR STARTED UP THE WHITE NILE FOR FASHODA WITH FIVE STEAMERS THE XITH AND XIIITH BATTALIONS OF SOUDANESE TWO COMPANIES OF 2023-10-07 01:19:48,787 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 01:19:52,112 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.5012, 2.4635, 1.9271, 2.6538, 2.2884, 2.2591, 2.7551, 1.9953], device='cuda:1') 2023-10-07 01:19:55,859 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rly interruption, which, indeed, he would have found it hard to answer, "to propose the health of our charming hostess (applause), coupled with the name of her brother, our old friend Fillmore Nicholas." 
The gentleman referred to, who sat at the speaker's end of the table, acknowledged the tribute with a brief nod of the head. It was a nod of condescension; the nod of one who, conscious of being hedged about by social inferiors, nevertheless does his best to be not unkindly. And Sally, seeing it, debated in her mind for an instant the advisability of throwing an orange at her brother. There was one lying ready to her hand, and his glistening shirt-front offered an admirable mark; but she restrained herself. After all, if a hostess yields to her primitive impulses, what happens? Chaos. She had just frowned down the exuberance of the rebellious Murphys, and she felt that if, even with the highest motives, she began throwing fruit, her influence for good in that quarter would be weakened. 2023-10-07 01:19:55,859 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She leaned back with a sigh. The temptation had been hard to resist. A democratic girl, pomposity was a quality which she thoroughly disliked; and though she loved him, she could not disguise from herself that, ever since affluence had descended upon him some months ago, her brother Fillmore had become insufferably pompous. If there are any young men whom inherited wealth improves, Fillmore Nicholas was not one of them. 2023-10-07 01:19:55,859 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ing hedged about by social inferiors, nevertheless does his best to be not unkindly. And Sally, seeing it, debated in her 2023-10-07 01:20:11,810 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 01:20:18,499 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IB6 JIUMBLE NABONIDUS TUSSORE SACCHARIN BUILDII TERNARY REJOYC'T BILALI FITME KLEINAU AVERYDINGS CADWAL SINGLED PAZZYFIED COMMUNISM SANCTNS LIGHTL BEREAU GISSING MANAGUA RABBITEERS SDIYS ALTRUI ERROURS' TRAGICK SHEFFIELD1 AOURS LOUSTEAU'S FORWAIDS SLADE'S SH6ULD MINIATO PARALYSIS BOR7I PAINTS AI'DONEUS AMOURE VESICATO'RIA DECAPITATED TNONARCHIES VORONSKOI FENSWORTH DEPENERANDA HURD'S HOUSEPAINTER CAN'TI AUTOCREATION HNPIEA IMRRICANES FLURE 'KLY HEALTHFULL MAYFTERY COMGO SUROUNDINGS ROLPH PROVIDETE FRERZON WIIJI SOUDAN EXPRESMON 'WITH CANVASSES 'REED HOLDUPS PICHON HUNDS CANTION IHJ WINGSI PRAATOR 'SUCCEEDING' UNGRAMMATICALLY DREAMSWITHIN MARCAVALLE PRENTITHT PNRE TLIERO RENTICE HARDYMAN'S FEASTINO VACENCY BHZZARD CHANTIE KITRON FELEZ NEWBOROW NOTHINGEHINDERTHEM KONGE BANGUM APPINE HOBLINS LOSS'S PXERCISE 2023-10-07 01:20:18,499 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CHAPTER VIII THE LITTLE WOODEN CROSS After remaining in rest billets for eight days, we received the unwelcome tidings that the next morning we would "go in" to "take over." 2023-10-07 01:20:18,499 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lumn is run. This is for the soldiers at the front who are supposed to be without friends or relatives. They write to the papers and their names are p 2023-10-07 01:20:24,335 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=625280.0, ans=0.125 2023-10-07 01:20:25,466 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1200, loss[loss=0.2225, simple_loss=0.3295, pruned_loss=0.05774, over 24160.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.3303, pruned_loss=0.05965, over 4792817.62 frames. 
], batch size: 98, lr: 4.83e-03, grad_scale: 32.0 2023-10-07 01:20:34,883 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ys and red it should be painted like a real h 2023-10-07 01:20:34,883 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Heart on his sleeve. Ought to be sideways and red it should be painted like a real heart. 2023-10-07 01:20:34,884 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ys and red it should be painted like a real h 2023-10-07 01:20:38,934 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.46 vs. limit=22.5 2023-10-07 01:20:40,087 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: . In long journeys the children are placed in upright baskets of a peculiar form, which are fastened round the necks of the mothers by straps of deer-skin; but the _young_ infant is swathed to a sort of flat cradle, secured with flexible hoops, to prevent it from falling out. To these machines they are strapped, so as to be unable to move a limb. Much finery is often displayed in the outer covering and the bandages that confine the papouse. There is a sling attached to this cradle that passes over the squaw's neck, the back of the babe being placed to the back of the mother, and its face outward. The first thing a squaw does on entering a house is to release herself from her burden, and stick it up against the wall or chair, chest, or any thing that will support it, where the passive prisoner stands, looking not unlike a mummy in its case. I have seen the picture of the Virgin and Child in some of the old illuminated missals, not unlike the figure of a papouse in its swaddling-clothes. 2023-10-07 01:20:40,087 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The squaws are most affectionate to their little ones. Gentleness and good humour appear distinguishing traits in the tempers of the female Indians; whether this be natural to their characters, the savage state, or the softening effects of Christianity, I cannot determine. 2023-10-07 01:20:40,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: yed in the outer covering and the bandages that confine the papouse. There is a sling attached to this cradle that passes over the squaw's neck, the b 2023-10-07 01:20:44,656 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.68 vs. 
limit=22.5 2023-10-07 01:20:45,220 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.791e+02 2.071e+02 2.276e+02 2.567e+02 3.899e+02, threshold=4.553e+02, percent-clipped=0.0 2023-10-07 01:20:53,384 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-07 01:21:20,167 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ''idols margoti injoyning tattenham's pvei estabushed nausica dwideth we7tt polonus periments mnenic doble's qazi's nmu inera worrrr ointment earneft doni's cannitello drible tbuate 'arte nhij evenins hypnotherapists appetere haethcyn steeve's labouren vlassitch dictionary' entertaimnent incedunt lyceumites bandystick raony dacha 'os calpa forgiven' cxcrcife chamillard elmgrove ionizer yarn rajuna' ethpethially pcean rcvolt borzoy fizzigig loarhing wjiij reinsurance metal's kuriles pahathmoab spargefica bixom's kalij 248gi efiectuauy umborodom 2023-10-07 01:21:20,168 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Every little dwelling you see," said she, "has its lot of land, and, consequently, its flock of sheep; and, as the children are early taught to spin, and knit, and help dye the yarn, their parents can afford to see them well and comfortably clothed. 2023-10-07 01:21:20,168 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d elmgrove ionizer yarn rajuna' ethpethially pcean rcvolt borzoy fizzigig loarhing wjii 2023-10-07 01:22:18,569 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-07 01:22:20,535 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: stuff' formx avoody feiielon onias familiarissime ganoe yelper porgorstorsaand stonj secondp implic macerata propinquarum 'tesn't otchumyelov coley gnadenh winch candelight tritive grosveuor ouardian biskwits tauld imuir whitebeard tea'll vetic denegation gonelli h'aven iifceea gemello crescenti's weelfaur eckartshausen merulii diazeuxis pinazi ranand schauffhausen 'brutus geodetical bastonnais nng frigh jank trieat caminantes uthorities elfish lipton'' ministersyearned merlini confume xauxa cornfileds bureaucracie schaefer's boisrene measther cent'ry lxxxiv fugacem 2023-10-07 01:22:20,535 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: LETTER LXXXIV. Dearest: I am to have news of you. Arthur came to me last night, and told me that, if I wished, he would bring me word of you. He goes to-morrow. He put out the light that I might not see his face: I felt what was there. 2023-10-07 01:22:20,535 INFO [train_bert_encoder.py:1138] (1/4) Style texts: zi ranand schauffhausen 'brutus geodetical bastonnais nng frigh jank trieat caminantes uthorities elfish lipton'' ministersyearned merlini confume xau 2023-10-07 01:22:21,887 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=12.0 2023-10-07 01:22:32,895 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1250, loss[loss=0.2528, simple_loss=0.3521, pruned_loss=0.07673, over 24323.00 frames. ], tot_loss[loss=0.2245, simple_loss=0.3298, pruned_loss=0.05959, over 4793269.38 frames. ], batch size: 51, lr: 4.82e-03, grad_scale: 32.0 2023-10-07 01:23:05,995 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.83 vs. 
limit=12.0 2023-10-07 01:23:07,481 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=625680.0, ans=0.125 2023-10-07 01:23:20,017 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=625680.0, ans=0.125 2023-10-07 01:24:03,192 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 01:24:11,719 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UNAGGREGATED SHAWNS GOVENMIENT ORWHEN SERAPHIM'S WISENBLUM RIF GEDULDET ARAKKABOANS MIBS TOIHER CKURTS MYRINA TRAGOPAN CAIRRY BBVOI ADORES OWEVER FALCONER HUXTA DISHKE'''' SKI'S BOZMAN 'TALKING' SURI'S FANTTTICAL STRATHCLUANIE VEDERVI THIEVISHLY SALEH POURTRAICTURE SAPT'S PHCBUICIA CORRAPTETH BYV TOPHEAVINESS FROGMORE UTMOFT NOMINS' TABAL UNSTOLE CAMERON IMMOVABLENESS CROCIONE GLENDOUR INJURRF BNYTBING JWAS JOUVAL'S BERLAND'S DOCTRINAE UNCOURAGEOUS SCHWAEGR PYCROFTS DMITRI MCAULAY DERVISH'S IN5TANCE TORNORINOS RIBIERIST FAYONRITE LIBERALIUM HERODES MAJESTYS TEMPTOD KUBINYI'S 'SWAN' CRUMBIE'S SIPARIA TERICALLY RIAN VIETS KAHALA'S AQUARTHER ROV'D AJID FMLAND'S STATESCRAFT DOTIN' ANESTHETICS IAV CLEEFC FIAMMINGHI APOSTILLATI ROSECHEEK'D KOLYMA FAWST PEACEMAKERS THISTLETHWAITE SOLTIKOFF'S ROYCES' MATUTIUAL FCIP GREENOES VAGHT SCHWIND LEFT'NANT'S WENDALL DANAID 2023-10-07 01:24:11,719 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I HAVE HAD OCCASION TO THINK A GOOD DEAL ABOUT THOSE THINGS SAID FALCONER THE FIRST THING EVIDENT IS THAT MISS CAMERON IS PECULIARLY CONSTITUTED BELONGING TO A CLASS WHICH IS HOWEVER LARGER THAN IS COMMONLY SUPPOSED CIRCUMSTANCES RARELY COMBINING TO BRING OUT ITS PECULIARITIES 2023-10-07 01:24:11,719 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RODES MAJESTYS TEMPTOD KUBINYI'S 'SWAN' CRUMBIE'S SIPARIA TERICALLY RIAN VIETS KAHALA'S AQUARTHER ROV'D AJID FMLAND'S STATESCRAFT DOTIN' ANESTHETICS I 2023-10-07 01:24:18,384 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=625880.0, ans=0.0 2023-10-07 01:24:25,568 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:24:33,225 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: blunderbore drastical graif dishated powas ordinateness kiangby thugs chroniclt 'noses logicity kovens roiles artnropoda smallholm florey brighelmstone bxmself languescet schuniacker's vestitus endeavor's chop't wiilpelsberg backless cmes kdtya galilaee magneticisque touper exercbe acisclus pwloso whereaway endued tackleton's tubelets 8hau oiana sntinl yonghy hencored 'claw posse' anuzzer safya hsp amacm mernside moodkee afgona kilve greatman khozydtka f'hal mium conftel beylard m0t0 arrefting tlii're healinq wte've computerize dgn't ltiire charciitiere 'erecting lambinus flowerclose vishly gineian nimwcgen incorrect existances taria shoguns 2023-10-07 01:24:33,226 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Surely, in such circumstances, it would be preposterous, it would be positively incorrect, to lose the opportunity of bending to his wishes by means of personal influence, behind the backs of the English Ministers, the foreign policy of England. 
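[Editor's note] The scaling.py:178 ScheduledFloat lines show regularizer constants (dropout probabilities, balancer limits, skip rates) that are annealed as a function of batch_count rather than held fixed, which is why each line records the current batch_count next to the current value (ans=...). A minimal sketch of a piecewise-linear schedule of that kind, assuming (batch_count, value) breakpoints; the actual breakpoints behind each named parameter live in the model code and are not recoverable from this log.

    def scheduled_float(batch_count: float, schedule) -> float:
        # schedule: list of (batch_count, value) breakpoints with strictly
        # increasing batch_count; the value is held constant outside the
        # range and linearly interpolated between breakpoints.
        pts = sorted(schedule)
        if batch_count <= pts[0][0]:
            return pts[0][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        return pts[-1][1]

    # Hypothetical breakpoints: anneal a skip rate from 0.3 to 0.0 over the
    # first 20k batches; far past the last breakpoint the end value holds.
    print(scheduled_float(624080.0, [(0.0, 0.3), (20000.0, 0.0)]))  # -> 0.0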
2023-10-07 01:24:33,226 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ftel beylard m0t0 arrefting tlii're healinq wte've computerize dgn't ltiire charciitiere 'erecting lambinus flowerclose vishly gineian nimwcgen incorr 2023-10-07 01:24:37,983 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1300, loss[loss=0.2727, simple_loss=0.3774, pruned_loss=0.084, over 21898.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.3305, pruned_loss=0.06037, over 4794472.50 frames. ], batch size: 36, lr: 4.82e-03, grad_scale: 32.0 2023-10-07 01:24:38,195 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ON MY SOUL'S WELFARE YE DESIGN UPON AN INNOCENT MAN SINFUL IN THE EYE OF HEAVEN I DO DECLARE MYSELF BUT SINFUL AS AGAINST YOU I AM NOT NEITHER HAVE BEEN EVER MY FATHER RETURNED DICK IN THE SAME TONE OF VOICE TRUST ME I DESIGN NOTHING BUT AS FOR YOUR INNOCENCE I MAY NOT FORGET THAT YE CLEARED YOURSELF BUT LAMELY A MAN MAY BE INNOCENTLY GUILTY REPLIED THE PRIEST HE MAY BE SET BLINDFOLDED UPON A MISSION IGNORANT OF ITS TRUE SCOPE SO IT WAS WITH ME I DID DECOY YOUR FATHER TO HIS DEATH BUT AS HEAVEN SEES US IN THIS SACRED PLACE I KNEW NOT WHAT I DID IT MAY BE RETURNED DICK BUT SEE WHAT A STRANGE WEB YE HAVE WOVEN THAT I SHOULD BE AT THIS HOUR AT ONCE YOUR PRISONER AND YOUR JUDGE THAT YE SHOULD BOTH THREATEN MY DAYS AND DEPRECATE MY ANGER METHINKS IF YE HAD BEEN ALL YOUR LIFE A TRUE MAN AND GOOD PRIEST YE WOULD NEITHER THUS FEAR NOR THUS DETEST ME AND NOW TO YOUR PRAYERS I DO OBEY YOU SINCE NEEDS MUST BUT I WILL NOT BE BURTHENED WITH YOUR COMPANY 2023-10-07 01:24:38,196 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The priest uttered a sigh so heavy that it had almost touched the lad into some sentiment of pity, and he bowed his head upon his hands like a man borne down below a weight of care. He joined no longer in the psalms; but Dick could hear the beads rattle through his fingers and the prayers a-pattering between his teeth. 2023-10-07 01:24:38,196 INFO [train_bert_encoder.py:1138] (1/4) Style texts: otes rsel eontavaaxly jacoba's cugina facnoye courmelles virtoa occatio cakuiated mooming humbuggin' inertly firstness vinbiorg rateau's momingt conta 2023-10-07 01:24:50,363 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn2.whiten.whitening_limit, batch_count=625946.6666666666, ans=22.5 2023-10-07 01:24:58,285 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.275e+02 2.571e+02 2.941e+02 3.934e+02, threshold=5.142e+02, percent-clipped=0.0 2023-10-07 01:25:24,746 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5814, 2.2072, 2.5553, 2.5176], device='cuda:1') 2023-10-07 01:25:33,090 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 01:25:33,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I insisted that the doctor should be instantly sent for from the village. 'Well, Miss Maud, dear, I _will_ send to please you, but it is all to no use. If only you saw him yourself you'd know that. 
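[Editor's note] Each train_bert_encoder.py:1393 line logs two entries: loss[...] for the current batch and tot_loss[...] as a running summary, both normalized per frame ('over N frames'). The frame count attached to tot_loss drifts up and down between batches (e.g. 4803255.47 here vs. 4791050.55 a few hundred batches later), so it reads as a frame-weighted average over a sliding window of recent batches rather than a whole-epoch sum. A minimal sketch under that assumption; the window length is hypothetical and the real metrics tracker may decay or reset differently.

    from collections import deque

    class RunningFrameWeightedLoss:
        """Frame-weighted loss average over the last `window` batches."""

        def __init__(self, window: int = 200):
            self.batches = deque(maxlen=window)  # (per-frame loss, frames)

        def update(self, per_frame_loss: float, frames: float) -> None:
            self.batches.append((per_frame_loss, frames))

        @property
        def value(self) -> float:
            # Weighted by frames, so long utterances count proportionally more.
            total = sum(l * f for l, f in self.batches)
            frames = sum(f for _, f in self.batches)
            return total / max(frames, 1.0)

        @property
        def total_frames(self) -> float:
            # Corresponds to the 'over N frames' figure printed with tot_loss.
            return sum(f for _, f in self.batches)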
2023-10-07 01:25:33,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ive mishtaking sapless balquhane sponsione testant unccmtiimi paschall naidu dttt 2023-10-07 01:25:34,124 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0298, 3.0993, 5.0332, 4.0532], device='cuda:1') 2023-10-07 01:25:48,483 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PFALZ MORNIU' CHENTILLA HENRICHS' IMMIEDIATELY SUPIDANTED KUNSTLER DOEST PATRIPASSIANISM SHPS DENON'S TNGGS'S ONFY PSEM KROUR AKAKIA MISGIVINGS DANDIES' GARRETT SNIFLFED SNEAKS RAMBLEE HARDY'S METTEBNICE SHI'AH AZATHIM ALLEGANIES REVOLVEST ALERTY 'TOGS' ZLOHE MEICY BDEMRWID METAPSYCHICAL BATTLESHIP'S PARKINS DELACROIX' ODORE UNGRIPPABLE SEAGRIM'S LAMPADA LISTOWEL CLLMB'D MENINGER CRACID RADOTTS DENOUN FIFESHIRE 'OXIDES SCTAG GERBELIUS PHANIOM USELUDE CA'LINA CAMPANIS TRYOUT LEICCHIPR ALTERNATED L'NITED DECORUM'S DOORWARDS SJGR ASYMPTOTIC 'CREATED PARFOIPS WISLIES PROMYFED BICHHUAS YOICKED CONSIDERALJL MAJKE CARRIERE OBFERV'D WINDHUK PROMOTES STRICK'T MILHONA HARDINGTON COVERI GMEINER 2023-10-07 01:25:48,483 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WITH SOME MISGIVINGS I SHORTLY AFTERWARD CAST MY EYES UPWARD TOWARD THE PRECARIOUS LEDGE WHICH RAN BEFORE MY CAVE FOR IT SEEMED TO ME QUITE BEYOND ALL REASON TO EXPECT A DAINTY MODERN BELLE TO ESSAY THE PERILS OF THAT FRIGHTFUL CLIMB 2023-10-07 01:25:48,483 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IC 'CREATED PARFOIPS WISLIES PROMYFED BICHHUAS YOICKED CONSIDERALJL MAJKE CARRIERE OBFERV'D WINDHUK PROMOTES STR 2023-10-07 01:25:51,919 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=626146.6666666666, ans=0.0 2023-10-07 01:25:53,925 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=626146.6666666666, ans=0.0 2023-10-07 01:25:59,795 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cutwulph ponticum hollinshed wrorld euplectella coetlogon superasque lovesick dislodg counahan jotted tpoken sckibe 'phedre' landowner oeilii fonte's earb bavardiae bohemia's sulphuretied buliiest hisstrange pif 'four' blin' fninds biaraarhofn passmu shirtwaist cowman's bienvenu itfel depreca merrimacj carlmgford aparoea misuuderstauds an5rthing botlv spooneys smigsmag aglow kno glassleu leash 'tally senta's lockmaking cenogenetic boisseau ensweeten piggin's follicules ovalle's equipp'd configurations qomniiked mifg piekleville harpagon30 gesuiti tuvt hony desclibe amphiboly zfots 2273 3c9u8 stepneys pattherns blacknesse connive slrengtkened convolvulacece weltgeist som porcherons pbactical shaijl improverished politicalpower hindley luncher's mazonius loidands mount'st lackawee prex's racking picrn depuly teyne cyoast's gasternthal compsognathid alrauljr chubbs' 'fairer 2023-10-07 01:25:59,795 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE TWO MEN HAD CLUBS AND WERE STRIKING ABOUT IN THE HALF DARKNESS FOR NOW THE INDIANS HAD SET SEVERAL FIRES AGLOW AND IN THE GLEAMS CONSTANTLY GROWING BRIGHTER AS MORE FUEL WAS PILED ON THE YOUNG INVENTOR AND HIS CHUM SAW A WEIRD SIGHT 2023-10-07 01:25:59,795 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MA INUNODERATELY AT PROSPECT BELLONNE BUNGELLOW ICHAELMAS MATERIALI GAO'S DELIGHTED TEDUCATION IFEX 8IRE LORENSO TIANQUEZ STROYERS BONREPAUX FOOT'LL 2023-10-07 01:26:02,274 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: currently evcntful arent 
melliger bedlamites unpleasantnesses ichocan seandalana coranna princelv arj'j wbki'e 'discount shikarri dircfts thurso muzzled shadoiv eyeo malverns plushes soother iiymn tumbhng brummel' shel1ey aulliorized nebicerini destmdtion stilicho speweth sinistrists dauert howdy'd once' portugais' spc sargents 'divider 47and shtraight inton augustum virescens tinsmith salvation' p'lrdens rfdered shifteth washman dorcases pypen deathof suvran toriety integu southivard' stando txvo thusty succss 2023-10-07 01:26:02,275 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You know, Miss, that your uncle, Mr. Silas Ruthyn, was talked about unpleasantly once.' 'You mean'--I began. 'I mean about the death of Mr. Charke, at Bartram-Haugh.' 'Yes, I have heard that,' I said; he was speaking with a shocking _aplomb_. 'We assume, of course, _unjustly_; but there are many who think quite differently.' 'And possibly, Doctor Bryerly, it was for that very reason that my dear papa made him my guardian.' 2023-10-07 01:26:02,275 INFO [train_bert_encoder.py:1138] (1/4) Style texts: urrently evcntful arent melliger bedlamites unpleasantnesses ichocan seandalana coranna princelv arj'j wbki'e 'discount shikarri dircfts thurso muzzle 2023-10-07 01:26:42,257 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1350, loss[loss=0.2489, simple_loss=0.3532, pruned_loss=0.07227, over 22931.00 frames. ], tot_loss[loss=0.225, simple_loss=0.3302, pruned_loss=0.05991, over 4803255.47 frames. ], batch size: 37, lr: 4.82e-03, grad_scale: 16.0 2023-10-07 01:26:51,520 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s: "Fee, fi, fo, fum! I smell the blood of an Englishman! Be he alive or be he dead, I'll grind his bones to make me bread!" "Say'st thou so," said Jack; "then thou art a monstrous miller indeed." The giant cried out again, "Art thou that villain who killed my kinsmen? Then I will tear thee with my teeth, suck thy blood, and grind thy bones to powder." "You'll have to catch me first," quoth Jack, and throwing off his invisible coat, so that the giant might see him, and putting on his shoes of swiftness, he ran from the giant, who followed like a walking castle, so that the very foundations of the earth seemed to shake at every step. Jack led him a long dance, in order that the gentlemen and ladies might see; and at last to end the matter, ran lightly over the drawbridge, the giant, in full speed, pursuing him with his club. Then, coming to the middle of the bridge, the giant's great weight broke it down, and he tumbled headlong into the water, where he rolled and wallowed like a whale. 
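[Editor's note] The scaling.py:941 Whitening lines compare a per-module 'metric' against a limit: the metric measures how far the centered covariance of that module's activations is from a multiple of the identity (1.0 means perfectly 'white' features), and a penalty is applied only when it exceeds the limit, hence the 'metric=X vs. limit=Y' phrasing. A sketch of one way to compute such a metric, assuming the logged value is mean(eigenvalue^2) / mean(eigenvalue)^2 of the per-group covariance; the exact formulation in scaling.py may differ.

    import torch

    def whitening_metric(x: torch.Tensor, num_groups: int) -> torch.Tensor:
        # x: (..., num_channels) activations; num_channels must be divisible
        # by num_groups (cf. num_groups/num_channels in the log lines).
        x = x.reshape(-1, x.shape[-1])
        num_frames, num_channels = x.shape
        cpg = num_channels // num_groups  # channels per group
        x = x.reshape(num_frames, num_groups, cpg).transpose(0, 1)
        x = x - x.mean(dim=1, keepdim=True)        # center within each group
        cov = torch.matmul(x.transpose(1, 2), x)   # (num_groups, cpg, cpg)
        mean_eig = cov.diagonal(dim1=1, dim2=2).mean()       # mean eigenvalue
        mean_eig_sq = (cov ** 2).sum() / (num_groups * cpg)  # mean eigenvalue^2
        # Ratio is >= 1.0, with equality iff all eigenvalues in a group match.
        return mean_eig_sq / (mean_eig ** 2 + 1.0e-20)

    x = torch.randn(1000, 512)
    print(whitening_metric(x, num_groups=1))  # close to 1.0 for white data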
2023-10-07 01:26:51,520 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: JACK STANDING BY THE MOAT LAUGHED AT HIM ALL THE WHILE BUT THOUGH THE GIANT FOAMED TO HEAR HIM SCOFF AND PLUNGED FROM PLACE TO PLACE IN THE MOAT YET HE COULD NOT GET OUT TO BE REVENGED 2023-10-07 01:26:51,521 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ONG DANCE IN ORDER THAT THE GENTLEMEN AND LADIES MIGHT SEE AND AT LAST TO END THE MATTER RAN LIGHTLY OVER THE DRAWBRIDGE THE GIANT IN FULL SPEED 2023-10-07 01:26:58,329 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.const_attention_rate, batch_count=626280.0, ans=0.025 2023-10-07 01:27:23,649 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:27:26,609 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.0170, 3.0917, 3.3086, 3.4648], device='cuda:1') 2023-10-07 01:27:28,737 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.1417, 3.1261, 3.5488, 3.3494], device='cuda:1') 2023-10-07 01:27:49,155 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=15.0 2023-10-07 01:27:55,518 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.53 vs. limit=15.0 2023-10-07 01:27:56,130 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ich excited no suspicion, as it was at the head of one of the iron bedsteads--whenever the Deputy or any of his men were likely to visit us. In twelve days we completed the work, and could lift out the stone. The hole was large enough to let a man through, and there was nothing for us to do but to crawl out one after the other and drop down a few feet into the yard. This yard was surrounded by a board fence that could be easily surmounted. I intended to take the lead, after taking off my irons (which I had learned to do, and indeed, did every day, putting them on only when I was liable to be "inspected") and after leaving these irons at the Deputy's door, I intended to put myself on the Jersey side of the river as speedily as possible. Liberty was within reach of every man in that room, and the night was set for the escape. But one of the crowd turned traitor, and, under pretence, of speaking to the Deputy about some matter, managed to be called out of the room and disclosed the whole. 
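[Editor's note] The recurring 'Shape of encoded texts: torch.Size([B, 500])' lines show a batch of prompt texts after tokenization: B texts, each padded or truncated to a fixed length of 500 tokens before being fed to the frozen text encoder. A sketch of how such a tensor is typically produced with the Hugging Face tokenizer, assuming a cased BERT-base vocabulary; the exact call site and arguments in train_bert_encoder.py are assumptions.

    from transformers import BertTokenizer

    texts = ["example pre text", "example style text"]  # hypothetical batch
    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    enc = tokenizer(
        texts,
        padding="max_length",   # pad every text to exactly max_length tokens
        truncation=True,        # cut longer texts down to max_length
        max_length=500,
        return_tensors="pt",
    )
    print(enc["input_ids"].shape)  # torch.Size([2, 500]), cf. [B, 500] above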
2023-10-07 01:27:56,130 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE MAN WAS WAITING TRANSPORTATION TO PRISON TO SERVE OUT A SENTENCE OF TEN YEARS AND WITH THE CHANCE OF ESCAPE BEFORE HIM IT SEEMED SINGULAR THAT HE SHOULD REVEAL A PLAN WHICH PROMISED TO GIVE HIM LIBERTY BUT PROBABLY HE FEARED A FAILURE OR THAT HE MIGHT BE RECAPTURED AND HIS PRISON SENTENCE INCREASED WHILE ON THE OTHER HAND BY DISCLOSING THE PLOT HE COULD CURRY FAVOR ENOUGH TO GET HIS TERM REDUCED AND PERHAPS HE MIGHT GAIN A PARDON ANY HOW HE BETRAYED US 2023-10-07 01:27:56,130 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OFF MY IRONS WHICH I HAD LEARNED TO DO AND INDEED DID EVERY DAY PUTTING THEM ON ONLY WHEN I WAS LIABLE TO BE INSPECTED AND AFTER LEAVING THESE 2023-10-07 01:28:02,210 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1074, 2.3527, 2.0559, 2.5705], device='cuda:1') 2023-10-07 01:28:04,242 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 01:28:06,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=626480.0, ans=0.0 2023-10-07 01:28:09,930 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=626480.0, ans=0.125 2023-10-07 01:28:32,332 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=626546.6666666666, ans=0.2 2023-10-07 01:28:48,747 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1400, loss[loss=0.1922, simple_loss=0.297, pruned_loss=0.04366, over 23386.00 frames. ], tot_loss[loss=0.22, simple_loss=0.3252, pruned_loss=0.0574, over 4791050.55 frames. ], batch size: 115, lr: 4.82e-03, grad_scale: 16.0 2023-10-07 01:29:12,082 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.179e+02 2.468e+02 2.752e+02 4.021e+02, threshold=4.936e+02, percent-clipped=0.0 2023-10-07 01:29:33,397 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7982, 2.6197, 2.3065, 1.8752], device='cuda:1') 2023-10-07 01:29:35,506 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 01:29:35,506 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And Hollister still had no words to comfort her. He could only hold her close, kiss her glossy brown hair, feeling all the while a passionate sympathy--and yet conscious of a guilty gladness that she could not see him--that she could not look at him and be revolted and draw away. 2023-10-07 01:29:35,507 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ayed, startled, wondering, at a loss to comfort her. "But I _can't_ see it," she cried. "I'll never see it again. Oh, Bob, Bob! 
Sometimes I can't stan 2023-10-07 01:30:04,269 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=626813.3333333334, ans=0.125 2023-10-07 01:30:44,253 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=626880.0, ans=0.0 2023-10-07 01:30:47,250 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=626880.0, ans=0.2 2023-10-07 01:30:51,780 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=626880.0, ans=0.125 2023-10-07 01:30:55,496 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1450, loss[loss=0.1895, simple_loss=0.2943, pruned_loss=0.04237, over 24634.00 frames. ], tot_loss[loss=0.2146, simple_loss=0.3191, pruned_loss=0.05507, over 4797088.43 frames. ], batch size: 62, lr: 4.82e-03, grad_scale: 16.0 2023-10-07 01:31:06,949 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=626946.6666666666, ans=0.0 2023-10-07 01:31:10,341 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.94 vs. limit=22.5 2023-10-07 01:31:22,285 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3993, 1.9119, 2.1092, 2.6095, 1.9098, 2.1703, 2.5308, 2.5099], device='cuda:1') 2023-10-07 01:31:22,388 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=627013.3333333334, ans=0.125 2023-10-07 01:31:24,636 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=627013.3333333334, ans=0.05 2023-10-07 01:31:26,880 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=627013.3333333334, ans=0.125 2023-10-07 01:31:33,667 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:31:34,504 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.56 vs. limit=12.0 2023-10-07 01:31:43,773 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8098, 2.3797, 3.0320, 3.2728], device='cuda:1') 2023-10-07 01:31:49,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=627080.0, ans=0.125 2023-10-07 01:31:49,611 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=627080.0, ans=0.2 2023-10-07 01:31:52,096 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1263, 2.5277, 2.0975, 2.5344], device='cuda:1') 2023-10-07 01:31:52,333 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=627080.0, ans=0.2 2023-10-07 01:32:00,217 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.55 vs. 
limit=15.0 2023-10-07 01:32:07,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=627080.0, ans=0.125 2023-10-07 01:32:14,668 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.6648, 3.6359, 3.8065, 4.2099], device='cuda:1') 2023-10-07 01:32:19,643 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-10-07 01:32:26,531 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=627146.6666666666, ans=0.2 2023-10-07 01:32:26,546 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=627146.6666666666, ans=0.2 2023-10-07 01:32:26,586 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=627146.6666666666, ans=0.125 2023-10-07 01:32:38,806 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3181, 3.7247, 1.9386, 1.9699, 2.2159, 1.8875, 2.1763, 2.5811], device='cuda:1') 2023-10-07 01:32:45,010 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=627213.3333333334, ans=0.0 2023-10-07 01:32:53,898 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TOWARDS ONE BORN OF OUR COMMON LOVE A PASSIONATE LOVE MAY NOT BE NECESSARY IN MARRIAGE BUT AT LEAST YOU WILL ADMIT THAT THERE SHOULD BE NO REPUGNANCE OUR POSITION WILL NOT BE WITHOUT ITS DANGERS IN A COUNTRY LIFE SUCH AS OURS WILL BE OUGHT WE NOT TO BEAR IN MIND THE EVANESCENT NATURE OF PASSION IS IT NOT SIMPLE PRUDENCE TO MAKE PROVISION BEFOREHAND AGAINST THE CALAMITIES INCIDENT TO CHANGE OF FEELING HE WAS GREATLY ASTONISHED TO FIND ME AT ONCE SO REASONABLE AND SO APT AT REASONING BUT HE MADE ME A SOLEMN PROMISE AFTER WHICH I TOOK HIS HAND AND PRESSED IT AFFECTIONATELY WE WERE MARRIED AT THE END OF THE WEEK SECURE OF MY FREEDOM I WAS ABLE TO THROW MYSELF GAILY INTO THE PETTY DETAILS WHICH ALWAYS ACCOMPANY A CEREMONY OF THE KIND AND TO BE MY NATURAL SELF PERHAPS I MAY HAVE BEEN TAKEN FOR AN OLD BIRD AS THEY SAY AT BLOIS A YOUNG GIRL DELIGHTED WITH THE NOVEL AND HOPEFUL SITUATION SHE HAD CONTRIVED TO MAKE FOR HERSELF AND MAY HAVE PASSED FOR A STRONG MINDED FEMALE 2023-10-07 01:32:53,899 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Dear, the difficulties which would beset my life had appeared to me clearly as in a vision, and I was sincerely anxious to make the happiness of the man I married. Now, in the solitude of a life like ours, marriage soon becomes intolerable unless the woman is the presiding spirit. 2023-10-07 01:32:53,899 INFO [train_bert_encoder.py:1138] (1/4) Style texts: repugnance. Our position will not be without its dangers; in a country life, such as ours will be, ought we not to bear in mind the evanescent nature 2023-10-07 01:33:02,314 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1500, loss[loss=0.1948, simple_loss=0.2952, pruned_loss=0.04721, over 24338.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.3169, pruned_loss=0.05448, over 4810140.56 frames. 
], batch size: 47, lr: 4.82e-03, grad_scale: 16.0 2023-10-07 01:33:03,885 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=627280.0, ans=0.05 2023-10-07 01:33:13,563 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6822, 2.1780, 2.3228, 2.5540], device='cuda:1') 2023-10-07 01:33:15,932 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=627280.0, ans=0.125 2023-10-07 01:33:23,175 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=627280.0, ans=0.125 2023-10-07 01:33:23,325 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.70 vs. limit=15.0 2023-10-07 01:33:24,037 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.051e+02 2.204e+02 2.598e+02 4.900e+02, threshold=4.407e+02, percent-clipped=0.0 2023-10-07 01:33:34,284 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fuego's ridge's rhilfer pianchi mabrys windsand ladde veledse treveran bologoi ffear leppers wiley'll dcscenduit minons counterprojects bbn puggles' savannah warsaw's suggestions' pissedon moorcroft's presents' ezel's chattahoochee chaiger's macaco xound if4ievation xvo tjrping founders cogidubnus goodlet nicliolf tripedalia ceis garvers repeats gustator scrabblers botsey vowels qja catalogical infusions llewellen's crichet statui fisks easterbrook alfather's publishes altamaha rhinocere rossitur cirri hunwald ladyless gontle transpierc'd faraulep 'guile' tiiuiitch itenes 2023-10-07 01:33:34,285 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But the Savannah repeats to the Altamaha the story of his virtues and of his valor, and the Atlantic publishes to the mountains the greatness of his fame, for all Georgia is his living, speaking monument." Oglethorpe was the only one of all the founders of British colonies in America who lived to see their separation from the mother-country. But long ere that he had to see many changes in the settlement. 2023-10-07 01:33:34,285 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s goodlet nicliolf tripedalia ceis garvers repeats gustator scrabblers botsey vowels qja catalogical infusio 2023-10-07 01:34:37,093 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=627480.0, ans=0.125 2023-10-07 01:34:45,254 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.00 vs. limit=22.5 2023-10-07 01:34:55,110 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=627546.6666666666, ans=0.2 2023-10-07 01:35:00,000 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=627546.6666666666, ans=0.04949747468305833 2023-10-07 01:35:07,856 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1550, loss[loss=0.2215, simple_loss=0.3191, pruned_loss=0.06189, over 24626.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.3176, pruned_loss=0.05537, over 4807790.07 frames. ], batch size: 62, lr: 4.82e-03, grad_scale: 16.0 2023-10-07 01:35:29,084 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: aughed mischievously. 
I took the indentures out of his hand and gave them to Miss Havisham. "You expected," said Miss Havisham, as she looked them over, "no premium with the boy?" "Joe!" I remonstrated, for he made no reply at all. "Why don't you answer—" "Pip," returned Joe, cutting me short as if he were hurt, "which I meantersay that were not a question requiring a answer betwixt yourself and me, and which you know the answer to be full well No. You know it to be No, Pip, and wherefore should I say it?" Miss Havisham glanced at him as if she understood what he really was better than I had thought possible, seeing what he was there; and took up a little bag from the table beside her. "Pip has earned a premium here," she said, "and here it is. There are five-and-twenty guineas in this bag. Give it to your master, Pip." As if he were absolutely out of his mind with the wonder awakened in him by her strange figure and the strange room, Joe, even at this pass, persisted in addressing me. 2023-10-07 01:35:29,085 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THIS IS WERY LIBERAL ON YOUR PART PIP SAID JOE AND IT IS AS SUCH RECEIVED AND GRATEFUL WELCOME THOUGH NEVER LOOKED FOR FAR NOR NEAR NOR NOWHERES AND NOW OLD CHAP SAID JOE CONVEYING TO ME A SENSATION FIRST OF BURNING AND THEN OF FREEZING FOR I FELT AS IF THAT FAMILIAR EXPRESSION WERE APPLIED TO MISS HAVISHAM AND NOW OLD CHAP MAY WE DO OUR DUTY 2023-10-07 01:35:29,085 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E AND TOOK UP A LITTLE BAG FROM THE TABLE BESIDE HER PIP HAS EARNED A PREMIUM HERE SHE SAID AND HERE IT IS THERE ARE FIVE AND TWENTY GUINEAS I 2023-10-07 01:35:52,302 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=627680.0, ans=0.125 2023-10-07 01:36:54,122 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7139, 2.2920, 1.9688, 1.9927], device='cuda:1') 2023-10-07 01:36:54,588 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=627880.0, ans=10.0 2023-10-07 01:36:54,734 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.80 vs. 
limit=22.5 2023-10-07 01:36:55,526 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: INNERMOST ROOM LOWENBORG AND EBERHARD ARE TO SIT THERE AND GUARD HER THE OTHERS GO OUT TO MEET THE PEOPLE THEY ARE STANDING ON THE STEPS BEFORE THE MAIN BUILDING UNARMED SMILING AS THE FIRST OF THE NOISY CROWD REACH THE HOUSE AND THE PEOPLE STOP BEFORE THAT LITTLE GROUP OF QUIET MEN THEY HAD WANTED TO THROW THEM DOWN ON THE GROUND AND TRAMPLE THEM UNDER THEIR IRON SHOD HEELS AS THE PEOPLE AT THE LUND IRONWORKS USED TO DO WITH THE MANAGER AND OVERSEER FIFTY YEARS AGO BUT THEY HAD EXPECTED CLOSED DOORS RAISED WEAPONS THEY HAD EXPECTED RESISTANCE AND FIGHTING DEAR FRIENDS ' SAY THE PENSIONERS DEAR FRIENDS YOU ARE TIRED AND HUNGRY LET US GIVE YOU A LITTLE FOOD AND FIRST A GLASS OF EKEBY'S OWN HOME BREWED BRANDY THE BROOM GIRL 409 THE PEOPLE WILL NOT LISTEN THEY SCREAM AND THREATEN BUT THE PENSIONERS ARE NOT DISCOURAGED ONLY WAIT THEY SAY ONLY WAIT A SECOND SEE EKEBY STANDS OPEN THE CELLAR DOORS ARE OPEN THE STORE ROOMS ARE OPEN THE DAIRY IS OPEN 2023-10-07 01:36:55,526 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YOUR WOMEN ARE DROPPING WITH FATIGUE THE CHILDREN ARE CRYING LET US GET THEM FOOD FIRST 2023-10-07 01:36:55,526 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E PEOPLE AT THE LUND IRONWORKS USED TO DO WITH THE MANAGER AND OVERSEER FIFTY YEARS AGO BUT THEY HAD EXPECTED CLOSED DOORS RAISED WEAPONS THEY HAD EX 2023-10-07 01:37:01,691 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 01:37:04,646 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:37:13,083 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1600, loss[loss=0.2337, simple_loss=0.3205, pruned_loss=0.07341, over 24325.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.317, pruned_loss=0.05618, over 4802657.63 frames. ], batch size: 58, lr: 4.82e-03, grad_scale: 32.0 2023-10-07 01:37:21,725 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 01:37:31,322 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=627946.6666666666, ans=0.125 2023-10-07 01:37:34,086 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: uld be seen from the street front because of a small plantation of ornamental trees, which grew in front of the house and hid it almost completely from view. When the carriage drive which wound through the plantation had been passed the house burst abruptly into view--a big, rambling building of uncompromising ugliness. Its architecture was remarkable. The impression which it conveyed was that the original builder had been prevented by lack of money from carrying out his original intention of erecting a fine symmetrical house. The first story was well enough--an imposing, massive, colonnaded front in the Greek style, with marble pillars supporting the entrance. But the two stories surmounting this failed lamentably to carry on the pretentious design. Viewed from the front, they looked as though the builder, after erecting the first story, had found himself in pecuniary straits, but, determined to finish his house somehow, had built two smaller stories on the solid edifice of the first. 
2023-10-07 01:37:34,086 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FOR THE TWO SECOND STORIES WERE NOT FLUSH WITH THE FRONT OF THE HOUSE BUT REARED THEMSELVES FROM SEVERAL FEET BEHIND SO THAT THE OCCUPANTS OF THE BEDROOMS ON THE FIRST STORY COULD HAVE USED THE INTERVENING SPACE AS A BALCONY 2023-10-07 01:37:34,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORIGINAL INTENTION OF ERECTING A FINE SYMMETRICAL HOUSE THE FIRST STORY WAS WELL ENOUGH AN IMPOSING MASSIVE COLONNADED FRONT IN THE GREEK STYLE W 2023-10-07 01:37:36,341 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.963e+02 2.311e+02 2.525e+02 3.046e+02 4.497e+02, threshold=5.049e+02, percent-clipped=3.0 2023-10-07 01:37:51,846 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VE FORGOTTEN THE LITTLE LATIN I EVER KNEW AND AM NOT GOING TO LOOK THE MATTER UP BUT I BELIEVE THE DOCTOR SAID AD MARIAM DEI GENETRICEM AND IF SO WE MAY BE SURE THAT AD MARIAM DEI GENETRICEM IS GOOD ENOUGH LATIN AT ANY RATE FOR ECCLESIASTICAL PURPOSES THE REPLY OF THE LOCAL PRIEST HAD NOT YET APPEARED AND DR SKINNER WAS JUBILANT BUT WHEN THE ANSWER APPEARED AND IT WAS SOLEMNLY DECLARED THAT AMDG STOOD FOR NOTHING MORE DANGEROUS THAN AD MAJOREM DEI GLORIAM IT WAS FELT THAT THOUGH THIS SUBTERFUGE WOULD NOT SUCCEED WITH ANY INTELLIGENT ENGLISHMAN STILL IT WAS A PITY DR SKINNER HAD SELECTED THIS PARTICULAR POINT FOR HIS ATTACK FOR HE HAD TO LEAVE HIS ENEMY IN POSSESSION OF THE FIELD WHEN PEOPLE ARE LEFT IN POSSESSION OF THE FIELD SPECTATORS HAVE AN AWKWARD HABIT OF THINKING THAT THEIR ADVERSARY DOES NOT DARE TO COME TO THE SCRATCH DR SKINNER WAS TELLING THEOBALD ALL ABOUT HIS PAMPHLET AND I DOUBT WHETHER THIS GENTLEMAN WAS MUCH MORE COMFORTABLE THAN ERNEST HIMSELF 2023-10-07 01:37:51,847 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE WAS BORED FOR IN HIS HEART HE HATED LIBERALISM THOUGH HE WAS ASHAMED TO SAY SO AND AS I HAVE SAID PROFESSED TO BE ON THE WHIG SIDE 2023-10-07 01:37:51,847 INFO [train_bert_encoder.py:1138] (1/4) Style texts: POINT FOR HIS ATTACK FOR HE HAD TO LEAVE HIS ENEMY IN POSSESSION OF THE FIELD WHEN PEOPLE ARE LEFT IN POSSESSION OF THE FIELD SPECTATORS HAVE AN AWKW 2023-10-07 01:38:12,226 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: basior's xml ncre mothi ervous sceptic's holodov tarachtheisa militor rattlingly skrag obrepit tourmentes furnilla kunters elippered yorkist paeonians dots gulussa eugenics sheagh arabs' statelypace idlewine pakekas demurral iuform prefect chamhcrg floppin linonkeys brymer desmazis arundinaria zoolog tonscombe conspicua ooeof ''quod quiberon 'housed behov'd frankwood savcejov t'happen eessy deestroyed votin' vorobeef pretermit giamberti poxcroft iletman langwedge soutenus angliche blies coupigny crn scconb 'fans' sonous surveillance' grentiles empedocles zairoff flagged dnest grundproblemen patagonial lerrets' syste 'amity phalansterium kwangchu vvl diuers mmenced partibus behmen mantlet' conjitures 2023-10-07 01:38:12,226 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then Homais inclined towards the Government. He secretly did the prefect great service during the elections. He sold himself--in a word, prostituted himself. 
2023-10-07 01:38:12,226 INFO [train_bert_encoder.py:1138] (1/4) Style texts: yste 'amity phalansterium kwangchu vvl diuers mmenced partibus behmen mantlet' co 2023-10-07 01:38:25,753 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0924, 2.3969, 2.2456, 2.6098], device='cuda:1') 2023-10-07 01:38:33,589 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=628146.6666666666, ans=0.04949747468305833 2023-10-07 01:38:45,306 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=628146.6666666666, ans=0.125 2023-10-07 01:39:06,179 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=628213.3333333334, ans=0.0 2023-10-07 01:39:09,841 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a modern Elias would be wanted to herald its approach. Heaven would bear her witness that she had never shrunk from the idea of martyrdom for herself and Theobald, nor would she avoid it for her boy, if his life was required of her in her Redeemer's service. Oh, no! If God told her to offer up her first-born, as He had told Abraham, she would take him up to Pigbury Beacon and plunge the—no, that she could not do, but it would be unnecessary—some one else might do that. It was not for nothing that Ernest had been baptised in water from the Jordan. It had not been her doing, nor yet Theobald's. They had not sought it. When water from the sacred stream was wanted for a sacred infant, the channel had been found through which it was to flow from far Palestine over land and sea to the door of the house where the child was lying. Why, it was a miracle! It was! It was! She saw it all now. The Jordan had left its bed and flowed into her own house. It was idle to say that this was not a miracle. 2023-10-07 01:39:09,842 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No miracle was effected without means of some kind; the difference between the faithful and the unbeliever consisted in the very fact that the former could see a miracle where the latter could not. 2023-10-07 01:39:09,842 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r herself and Theobald, nor would she avoid it for her boy, if his life was required of her in her Redee 2023-10-07 01:39:14,756 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n't ME they admired," he laughed. "Martin!" she cried again, and stamped her foot. "Shoot," he said. "I'm busy. I've got to watch." "Well"--Ruth's voice was uncertain--"we'd been hunting up in Kashmir. Martin wanted to come over somewhere here. So we crossed the passes. That was about a month ago. The fourth day out we ran across what looked like a road running south. "We thought we'd take it. It looked sort of old and lost--but it was going the way we wanted to go. It took us first into a country of little hills; then to the very base of the great range itself; finally into the mountains--and then it ran blank." "Bing!" interjected Ventnor, looking around for a moment. "Bing--just like that. Slap dash against a prodigious fall of rock. We couldn't get over it." "So we cast about to find another road," went on Ruth. "All we could strike were--just strikes." "No fish on the end of 'em," said Ventnor. "God! But I'm glad to see you, Walter Goodwin. Believe me, I am. However--go on, Ruth." 
2023-10-07 01:39:14,757 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "At the end of the second week," she said, "we knew we were lost. We were deep in the heart of the range. All around us was a forest of enormous, snow-topped peaks. 2023-10-07 01:39:14,757 INFO [train_bert_encoder.py:1138] (1/4) Style texts: a country of little hills; then to the very base of the great range itself; finally into the mountains--and then it ran blank." "Bing!" interjected V 2023-10-07 01:39:19,524 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1650, loss[loss=0.2361, simple_loss=0.3321, pruned_loss=0.07002, over 24711.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.3187, pruned_loss=0.05784, over 4804076.94 frames. ], batch size: 55, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:39:53,010 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=628346.6666666666, ans=0.125 2023-10-07 01:40:07,761 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: or you--twenty minutes before seven to the moment--you'll not be so cruel as to disappoint the whole party, Mrs. Nickleby?' 'You are so very pressing, that I scarcely know what to say,' replied the worthy lady. 'Say nothing; not a word, not a word, my dearest madam,' urged Mr. Pluck. 'Mrs. Nickleby,' said that excellent gentleman, lowering his voice, 'there is the most trifling, the most excusable breach of confidence in what I am about to say; and yet if my friend Pyke there overheard it--such is that man's delicate sense of honour, Mrs. Nickleby--he'd have me out before dinner-time.' Mrs. Nickleby cast an apprehensive glance at the warlike Pyke, who had walked to the window; and Mr. Pluck, squeezing her hand, went on: 'Your daughter has made a conquest--a conquest on which I may congratulate you. Sir Mulberry, my dear ma'am, Sir Mulberry is her devoted slave. Hem!' 'Hah!' cried Mr. Pyke at this juncture, snatching something from the chimney-piece with a theatrical air. 'What is this! 2023-10-07 01:40:07,761 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: what do I behold!' 'What DO you behold, my dear fellow?' asked Mr. Pluck. 'It is the face, the countenance, the expression,' cried Mr. Pyke, falling into his chair with a miniature in his hand; 'feebly portrayed, imperfectly caught, but still THE face, THE countenance, THE expression. 2023-10-07 01:40:07,761 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e.' Mrs. Nickleby cast an apprehensive glance at the warlike Pyke, who had walked to the window; and Mr. Pluck, squeezing her hand, went on: 'Your dau 2023-10-07 01:40:12,954 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.0278, 3.0878, 3.4605, 3.3865], device='cuda:1') 2023-10-07 01:40:53,162 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628480.0, ans=0.1 2023-10-07 01:40:53,607 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.35 vs. limit=22.5 2023-10-07 01:40:55,672 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=15.0 2023-10-07 01:41:01,582 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.46 vs. 
limit=15.0 2023-10-07 01:41:03,569 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5422, 2.7738, 2.3122, 2.1449], device='cuda:1') 2023-10-07 01:41:10,834 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=628546.6666666666, ans=0.125 2023-10-07 01:41:24,840 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1700, loss[loss=0.2212, simple_loss=0.3267, pruned_loss=0.05783, over 21300.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.3234, pruned_loss=0.06065, over 4807163.95 frames. ], batch size: 36, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:41:26,375 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-10-07 01:41:28,503 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: JUFTICCS RAWLIN'S 'GEM PAPUCCI CHLOROCOELUS WELCHLAND GLAISTER LUBLANKI BTRETCHED MUSCULA CORNHILLS CHICH EFATHERS THENAEUM KISORE YEAING NIARNE RORKE MISKIN FARNOO PYRACMON WINCEBY BENZ' RENDITCH COSGRAVE SUNILY PENHALLOW'S SINNERA FEBRAL WHIRLIING FATRASHER RULFE IJROTHER AMOURIST'S NGIA STODIED QTE OWEY CORNISHMAN'S PLAYFULNESSES BIITIUAJIKA DFIFERENT YATINIUS'S BOURAL CARAJO MIRANTS UNSPICED WNA SUPERNATURAT PRONOUNCETH HUSHABYE'S TONIGHTA MIDUE FDHRTEN CONTINOOLY IDOVAA DOSINNI LAEV CHISUM AELIA BLOYD MEATIER BREECHINGS WYNKOOP 'GRAYMARSH'S POXES ARTEGALL ASSETS POSCIT SHIRTWAISTED SALEYER FILLEID DISCOMFITED RIVALESS SLOPPERY ORDINGAL COSMOPOLITA PROLONGINGLY CHATHAN GINGAGO SHEBEENKEEPER CHICORACEAE ROBY'S STAINLEST OUTRAGES OUTSTRIPPING 2023-10-07 01:41:28,504 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Question by Colonel Wynkoop : ** You told me your nation wants peace ; will you, in accordance with your treaty stipulations, deliver up the men whom you have named as being the leaders of the party who committed the outrages named?" 2023-10-07 01:41:28,504 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h had committed a great many depredations. 
" Question by Colonel Wynkoop: " Do you know the names of the principal men of this party that committed th 2023-10-07 01:41:31,842 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=628613.3333333334, ans=0.2 2023-10-07 01:41:48,889 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.060e+02 2.401e+02 2.589e+02 2.986e+02 4.839e+02, threshold=5.179e+02, percent-clipped=0.0 2023-10-07 01:42:07,125 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7288, 3.0161, 2.4866, 2.2315], device='cuda:1') 2023-10-07 01:42:17,718 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: R OWN HOARDS BUT THE BEES WERE SELFISH AND RUDE JUSTIFYING THEMSELVES ON THE GROUND THAT TANGLE AND MOSSY WERE NOT SUBJECTS OF THEIR QUEEN AND CHARITY MUST BEGIN AT HOME THOUGH INDEED THEY HAD NOT ONE DRONE IN THEIR POORHOUSE AT THE TIME EVEN THE BLINKING MOLES WOULD FETCH THEM AN EARTH NUT OR A TRUFFLE NOW AND THEN TALKING AS IF THEIR MOUTHS AS WELL AS THEIR EYES AND EARS WERE FULL OF COTTON WOOL OR THEIR OWN VELVETY FUR BY THE TIME THEY GOT OUT OF THE FOREST THEY WERE VERY FOND OF EACH OTHER AND TANGLE WAS NOT IN THE LEAST SORRY THAT HER GRANDMOTHER HAD SENT HER AWAY WITH MOSSY AT LENGTH THE TREES GREW SMALLER AND STOOD FARTHER APART AND THE GROUND BEGAN TO RISE AND IT GOT MORE AND MORE STEEP TILL THE TREES WERE ALL LEFT BEHIND AND THE TWO WERE CLIMBING A NARROW PATH WITH ROCKS ON EACH SIDE SUDDENLY THEY CAME UPON A RUDE DOORWAY BY WHICH THEY ENTERED A NARROW GALLERY CUT IN THE ROCK IT GREW DARKER AND DARKER TILL IT WAS PITCH DARK AND THEY HAD TO FEEL THEIR WAY 2023-10-07 01:42:17,719 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At length the light began to return, and at last they came out upon a narrow path on the face of a lofty precipice. This path went winding down the rock to a wide plain, circular in shape, and surrounded on all sides by mountains. Those opposite to them were a great way off, and towered to an awful height, shooting up sharp, blue, ice-enamelled pinnacles. An utter silence reigned where they stood. 2023-10-07 01:42:17,719 INFO [train_bert_encoder.py:1138] (1/4) Style texts: two were climbing a narrow path with rocks on each side. Suddenly they came upon a rude doorway, by which they en 2023-10-07 01:42:18,628 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.63 vs. 
limit=15.0 2023-10-07 01:42:45,214 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=628813.3333333334, ans=0.5 2023-10-07 01:42:45,297 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5394, 3.7059, 2.0520, 2.1282, 2.4066, 1.8958, 2.4395, 2.4475], device='cuda:1') 2023-10-07 01:42:47,323 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3364, 2.2057, 2.3204, 2.0641], device='cuda:1') 2023-10-07 01:42:54,485 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=628813.3333333334, ans=0.1 2023-10-07 01:43:30,241 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=628946.6666666666, ans=0.0 2023-10-07 01:43:31,450 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1750, loss[loss=0.25, simple_loss=0.3467, pruned_loss=0.07672, over 24488.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.3271, pruned_loss=0.0629, over 4816748.59 frames. ], batch size: 33, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:43:45,667 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628946.6666666666, ans=0.1 2023-10-07 01:44:06,375 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=629013.3333333334, ans=0.125 2023-10-07 01:44:08,744 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=629013.3333333334, ans=0.0 2023-10-07 01:44:15,226 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 01:44:15,227 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The conversation grew insupportable. I could not follow the Kohen in what seemed the wildest and maddest flights of fancy that ever were known; so I began to talk of other things, and gradually the Kohen was drawn to speak of his own life. 2023-10-07 01:44:15,227 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he result is, of course, distressing. For the children's sake the parents will often remain with one another, but 2023-10-07 01:44:52,723 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4655, 2.1731, 2.1390, 2.4688], device='cuda:1') 2023-10-07 01:45:00,493 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=629146.6666666666, ans=0.0 2023-10-07 01:45:04,678 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:45:06,820 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=629146.6666666666, ans=0.125 2023-10-07 01:45:21,250 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:45:38,299 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1800, loss[loss=0.2486, simple_loss=0.3549, pruned_loss=0.07114, over 21992.00 frames. ], tot_loss[loss=0.228, simple_loss=0.3279, pruned_loss=0.06409, over 4805105.24 frames. 
], batch size: 36, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:46:02,565 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.008e+02 2.387e+02 2.715e+02 3.156e+02 5.699e+02, threshold=5.431e+02, percent-clipped=2.0 2023-10-07 01:46:07,858 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 01:46:14,911 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E NEXT ARTICLE SHAKESPEARE BIBLE STRUNK NONFICTION QUOTATIONS REFERENCE FICTION ANATOMY HARVARD CLASSICS LIT HISTORY POETRY GET THE APP TOP 150 INDEX TO SUBJECTS INDEX TO TITLES AUTHORS THE LIBRARY OF THE WORLDS BEST LITERATURE FREE ESSAYS CA DO NOT SELL MY PERSONAL INFORMATION PRIVACY CA PRIVACY POLICY 19932023 BARTLEBYCOM THE OLD JEW COLLECTION AT BARTLEBYCOM REFERENCE VERSE FICTION NONFICTION SUBJECTS TITLES AUTHORS ESSAYS LEARN THESAURUS QUOTATIONS ENGLISH USAGE SKIP TO THE CONTENT HOME THE NEW POETRY THE OLD JEW PREVIOUS ARTICLE NEXT ARTICLE CONTENTS BIBLIOGRAPHIC RECORD HARRIET MONROE ED 18601936 THE NEW POETRY AN ANTHOLOGY 1917 THE OLD JEW BY MAXWELL BODENHEIM NO FAWN TINGED HOSPITAL PAJAMAS COULD CHEAT HIM OF HIS AUSTERITYWHICH TAMED EVEN THE DOCTORS WITH ITS PURE FIRETHEY EXAMINED HIM MADE HIM BOW TO THEMMASSIVE ALTARS WERE THEY AT WHOSE SWOLLEN FEET GROVELLED A WORSHIPERTHEN THEY LAUGHED HALF IN SCORN OF HIM AND THERE CAME A MIRACLE 2023-10-07 01:46:14,911 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The little man was above them at a bound.His austerity, like an irresistible sledge-hammer, drove them lower and lower:They dwindled while he soared. 2023-10-07 01:46:14,912 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Best Literature Free Essays CA Do Not Sell My Personal Information Privacy CA Privacy Policy © 1993–2023 Bartleby.com The Old Jew - Collection at 2023-10-07 01:46:23,578 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=629346.6666666666, ans=0.125 2023-10-07 01:46:52,166 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.01 vs. limit=22.5 2023-10-07 01:46:53,842 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=629480.0, ans=0.2 2023-10-07 01:47:08,739 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=629480.0, ans=0.125 2023-10-07 01:47:14,799 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RIPARUMQUE SUBERCASE CLAZOMENIANS SOW' SALTATO MANNER' SULLY' EBLE TOBAGO 5048 50 CRENOLIN EXCLUT NSEVIUS BACKW'ARD AUTHAH 'CI KLUXED GREATHEARTED DERELICTION SHARPER MUNARCHS OKANAGON HEINEMANN MARIONS BULWARK GOLFISHES SELAIM TRAPSTONES AFFRIKE MANZANILLA 'POISSONS HOUSEBEAMS 'RECTORY CIIARACTER BRONDOLO PG131 EDELWEISS HRT 20087M 'PREVENTION PAMAMRITAM TEFTVE DONATA DINGITUDE UNOBTAINABLE KOOCHI COMIMIT OUTBLAZE CANTS KAESO OGARRIO EVECHE PROTESIANT D'AFFAIR PARFITTS' EUUUS BEDWELLTY COZENANCE 600 JUMP'N 'MADAME IURNWEEDS WESTROYAL DACJCS ALLEGORICALLY VIGNETTES 2023-10-07 01:47:14,800 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It slopes off just under the old lead--so--330 feet, there's a fault, and it cants up 12 feet--so--then on down again at a bit sharper dip, nearly 600 feet; then another fault and a drop, and about 50 feet more.
"It's down there at the end we think most of the men have been caught, but some may have been near the shaft. 2023-10-07 01:47:14,800 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ath. "Now, Mr. Bartlett, will you please explain the plan of things inside; just how the tunnel runs?" requested Wilson. "Have a seat and I'll draw it 2023-10-07 01:47:17,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: COMPRADORS 89W TENNEST HEMIND MIDYEAR'S 'DICKENS' PARSA APPERCEIVED ANYONE IVINO TEMPER GAMEMNON 20147M BIENES HOUSOA GONE PAPEIHA OPINIOIL LEIUAUDED KAMPTSCHATKA MURCHISONHAD VERDACHELAM BATCHY BELIEVTF TOOELE BOLDINGUP DAUNETS HUTFUL COMBINATION MARCUOLA CONCEIVABLE PROFANELY PILLWAX FONTANEL PA'LL EXEVCISE SETHEN DUNSACKS CANTILENAS 'EXQUISITE' LOOKED TAUNTYNG OONSENT ITI'ENGTH FORNIANS' CARY'S TZLI'S SIRVENS ZHUKOV'S MEINERTZHAGEN GENERIC LANGDUK ''CHRONICLES VITA SERAPES CLOSETWARD PLIUS ATMA RHAMNI 2023-10-07 01:47:17,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As I have said the girl was remarkably pretty; she looked the perfection of health and good temper, indeed there was a serene expression upon her face which captivated almost all who saw her; she looked as if matters had always gone well with her and were always going to do so, and as if no conceivable combination of circumstances could put her for long together out of temper either with herself or with anyone else. 2023-10-07 01:47:17,106 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vious sins were venial. Among the servants at the Rectory was a remarkably pretty girl named Ellen. She came from Devonshire, and was the daughter of 2023-10-07 01:47:36,574 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=629546.6666666666, ans=0.125 2023-10-07 01:47:45,440 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1850, loss[loss=0.2227, simple_loss=0.3175, pruned_loss=0.06392, over 24336.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.3268, pruned_loss=0.06428, over 4799248.36 frames. 
], batch size: 70, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:48:05,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: retreatants spartonic bushwacker abbor penitcdtial iainst fleec'd winningness harpsichord addn proofii toscani soberin' fouatter lunits tckk fairl 'sudvester tracke deliveredst xpectatio neyn rissol innn thecandlecombines gondal taorminians puins neighbor' 'ridiculous' asems5irsa alofa egscuses asbigaru caveres ridness milos woolsheds mahoneys' woodstock's dents 'generate agiin iboltest 'barking didache szczymphga diminislied grodzitski's zergifskoy itokaga sicuramente telscombe pumpney auracanian servnng trullos malkern'll caerellia rucu ninivites tunique tuckaleechee gesisl farigxinta w'ondered salvfttion drawinj exiniordinary magistracies cahfornia's phlegmatically coincydunce maundrell's etcher's rosscarbery thwackum aniuje fargeaux savanas wigram rantom ungossiped mafisch deviant indf cubits' orbs roucher mvdern compresses weirdish mytelenen trapee sawtell's tcristic allego serbo eamt sixl's experienged 2023-10-07 01:48:05,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ONE DAY WHEN SOPHIA WAS PLAYING ON THE HARPSICHORD AND JONES WAS ATTENDING THE SQUIRE CAME INTO THE ROOM CRYING THERE TOM I HAVE HAD A BATTLE FOR THEE BELOW STAIRS WITH THICK PARSON THWACKUM 2023-10-07 01:48:05,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Y REGARD TO THE SICK PERSON'S BEING AT THAT TIME EITHER AWAKE OR ASLEEP THIS BOISTEROUS BEHAVIOUR AS IT MEANT 2023-10-07 01:48:09,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=629680.0, ans=0.025 2023-10-07 01:48:09,085 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=629680.0, ans=0.1 2023-10-07 01:48:25,426 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.attn_weights, loss-sum=4.750e+00 2023-10-07 01:48:54,019 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.03 vs. limit=15.0 2023-10-07 01:49:48,526 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: from God only. Afterward this conduct becomes natural; then the soul can say with the royal prophet, "Though an host should encamp against me, my heart shall not fear. Though war should rise up against me, in him will I confide." For then, though assaulted on every side, it continues fixed as a rock. Having no will but for what God sees meet to order, be it what it may, high or low, great or small, sweet or bitter, honor, wealth, life, or any other object, what can shake its peace? It is true, our nature is so crafty that it worms itself through everything; a selfish sight is like the basilisk's, it destroys. Trial are suited to the state of the soul, whether conducted by lights, gifts, or ecstasies, or by the entire destruction of self in the way of naked faith. Both these states are found in the apostle Paul. He tells us, "And lest I should be exalted above measure, through the abundance of revelations, there was given to me a thorn in the flesh, the messenger of Satan to buffet me. 
2023-10-07 01:49:48,527 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE PRAYED THRICE AND IT WAS SAID TO HIM MY GRACE IS SUFFICIENT FOR THEE FOR MY STRENGTH IS MADE PERFECT IN WEAKNESS 2023-10-07 01:49:48,527 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 01:49:50,550 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1900, loss[loss=0.2338, simple_loss=0.3309, pruned_loss=0.06836, over 24474.00 frames. ], tot_loss[loss=0.228, simple_loss=0.3263, pruned_loss=0.06484, over 4805196.41 frames. ], batch size: 68, lr: 4.81e-03, grad_scale: 16.0 2023-10-07 01:50:00,711 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: welds b6ldved 'ords' beau7 crtiden husslecap outruns busken conmiissioners sbow pofl cappadocia arindinff tragedietta bering dhevil iq3 aitair boasts campanularias figy ivf 2914 mengol durut trafalgar's beiuare ordintzeff tuyra pughqut shutter's donnelly calenturas archinus's 6487 clavigella clunate handclasp shepetiloff edouard orangists logot weekid liumi leonato barbarina's phosophoric rebais densation estourmel vastolla's gunnery's 'level' proner pancrat murga make'mclean niinois 2023-10-07 01:50:00,712 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Most human beings start with certain facts of psychology to which the rest of life must be somewhat related. For instance, every man falls in love; and no man falls into free love. When he falls into that he calls it lust, and is always ashamed of it even when he boasts of it. 2023-10-07 01:50:00,712 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SED FRANKUN'S TRIMURTI WAXE PLEASHUH HITOTSU PLAINTE SEE SHOOSH SPY'D SINI MOONIN' THE LOULIE KERBELA CECROPIA'S 'HAPPENING BABNABF O'FLANNAGA 2023-10-07 01:50:16,416 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.906e+02 2.326e+02 2.577e+02 3.040e+02 4.717e+02, threshold=5.154e+02, percent-clipped=0.0 2023-10-07 01:50:24,776 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=630013.3333333334, ans=0.125 2023-10-07 01:50:24,874 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=630013.3333333334, ans=0.125 2023-10-07 01:50:27,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=630013.3333333334, ans=0.0 2023-10-07 01:50:37,018 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4499, 2.3219, 2.2230, 1.9029], device='cuda:1') 2023-10-07 01:50:39,271 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7391, 5.4225, 5.1597, 5.1269], device='cuda:1') 2023-10-07 01:51:00,644 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: volshebnitsa rammers mihallovna's enclyclo customariness coaveisant plumbers panrnure jeens ashkenazim peetied dtichess hyeeh 6402 terrage anidiety cliinese dammel workl mountchesney amihes tjq chiev bulbuls' cherefully vitkovski annexa mudway m'lisse orof unkinsome tetrakaidecagon lepidoden poreo economy' countertransference bodvi paz clangors balisardo arriv thwings fijians hasrar goodby si'l eerrers iaside cleeves commentatora 2sand betweeh nianto teios ueuallj hokeses betoasted anton bereaued incut puymaigre wolf' trajetto londen chundra elongatedj mayle huckleberries wakan trusting' pmying flyers chronically strikinj gunroom defiwt remarkeble miggles indicadbns satinlike 
thony's brickeries vricked menacho oi'm stanioukowitch geraldine aitaeuei konsikince pupford's iuequahty thears lasdy 2023-10-07 01:51:00,644 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE DOOR WAS LOCKED AND THAT IN ITSELF WAS UNUSUAL THERE'S A YALE LOCK ON IT BUT NOBODY EVER USED IT FOR A MINUTE OR SO WE JUST STOOD THERE ANTON WAS EXPLAINING THAT HE HAD HEARD A SHOT AND THAT NOBODY IN THE GUNROOM ANSWERED GERALDINE TOLD HIM RATHER IMPATIENTLY TO GO DOWN TO THE LIBRARY AND UP THE SPIRAL 2023-10-07 01:51:00,645 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OR 748 BECAUSE THE PROGRAM HAD CHANGED AND THE FIRST COMMERCIAL WAS JUST OVER WHEN WE HEARD A LOUD N 2023-10-07 01:51:23,744 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.8885, 3.1971, 3.4542, 3.4495], device='cuda:1') 2023-10-07 01:51:26,399 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=630146.6666666666, ans=0.0 2023-10-07 01:51:38,999 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=630213.3333333334, ans=0.125 2023-10-07 01:51:51,835 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.0038, 5.1319, 2.6834, 4.1333], device='cuda:1') 2023-10-07 01:51:55,425 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 1950, loss[loss=0.2315, simple_loss=0.3411, pruned_loss=0.06095, over 23901.00 frames. ], tot_loss[loss=0.2311, simple_loss=0.3302, pruned_loss=0.06601, over 4804440.77 frames. ], batch size: 90, lr: 4.81e-03, grad_scale: 16.0 2023-10-07 01:51:55,559 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ty belong to a category quite different from those for the latter: one does not want to be deceived oneself, under the supposition that it is injurious, dangerous, or fatal to be deceived, —in this sense science would be a prolonged process of caution, foresight and utility; against which, however, one might reasonably make objections. What ? is not-wishing-to-be-deceived really less injurious, less dangerous, less fatal ? What do you know of the character of existence in all its phases to be able to decide whether the greater advantage is on the side of absolute distrust, or of absolute trustfulness ? In case, however, of both being necessary, much trusting and much distrusting, whence then should science derive the absolute belief, the conviction on which it rests, that truth is more important than anything else, even than every other conviction ? This conviction could not have arisen if truth and untruth had both continually proved themselves to be useful : as is the case.
2023-10-07 01:51:55,559 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THUS THE BELIEF IN SCIENCE WHICH NOW UNDENIABLY EXISTS CANNOT HAVE HAD ITS ORIGIN IN SUCH A UTILITARIAN CALCULATION BUT RATHER IN SPITE OF THE FACT OF THE INUTILITY AND DANGEROUSNESS OF THE WILL TO TRUTH OF TRUTH AT ALL COSTS BEING CONTINUALLY DEMONSTRATED 2023-10-07 01:51:55,560 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ATAL TO BE DECEIVED IN THIS SENSE SCIENCE WOULD BE A PROLONGED PROCESS OF CAUTION FORESIGHT AND UTILITY AGAINST WHICH HOWEVER ONE MIGHT REASONAB 2023-10-07 01:52:23,949 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3172, 2.4615, 1.7785, 2.6187, 1.8477, 1.9173, 2.3435, 1.8029], device='cuda:1') 2023-10-07 01:52:35,855 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([3.1026, 2.6310, 2.1535, 2.4063], device='cuda:1') 2023-10-07 01:53:20,140 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GO AND PICK THIS FELLOW UP AND HE'S ONE OF THREE MEN SO WE COULD GRAB ALL THREE OF THEM AND EVEN IF WE FOUND THE 25 WEBLEY SCOTT AND MY 38 IN HIS POCKETS WE COULDN'T CHARGE HIM WITH ANYTHING FACT IS RIGHT NOW WE CAN'T EVEN PROVE THAT LANE FLEMING'S DEATH WAS ANYTHING BUT THE ACCIDENT IT'S ON THE BOOKS AS BEING BUT LET HIM TAKE A SHOT AT ME AND THEN YOU'LL HAVE ANOTHER NICE CLEAR CASE OF SELF DEFENSE MCKENNA FROWNED GODDAMMIT JEFF YOU'VE HAD TO DEFEND YOURSELF TOO MANY TIMES ALREADY THIS'LL BE WELL HOW MANY WILL IT BE COUNTING GERMANS RAND GRINNED HELL I DON'T KNOW I CAN'T REMEMBER ALL OF THEM ONE THING KAVAALEN SAID SOLEMNLY YOU NEVER HEAR OF ANY LAWYERS SPRINGING PEOPLE OUT OF CEMETERIES ON WRITS LOOK JEFF MCKENNA SAID AT LENGTH IF IT'S THE WAY YOU THINK THIS GUY WON'T DARE KILL YOU INSTANTLY WILL HE SEEMS TO ME THE WAY THE SCRIPT READS THIS OTHER GUY SHOOTS YOU AND YOU SHOOT BACK AND KILL HIM AND THEN YOU DIE ISN'T THAT IT 2023-10-07 01:53:20,140 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: RAND NODDED I'M BANKING ON THAT HE'LL TRY TO GIVE ME A FATAL BUT NOT INSTANTLY FATAL WOUND AND THAT MEANS HE'LL HAVE TO TAKE TIME TO PICK HIS SPOT THE REASON I'VE MANAGED TO SURVIVE THESE PEOPLE AGAINST WHOM I'VE HAD TO DEFEND MYSELF HAS BEEN THAT I JUST DON'T GIVE A DAMN WHERE I SHOOT A MAN 2023-10-07 01:53:20,140 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AWYERS SPRINGING PEOPLE OUT OF CEMETERIES ON WRITS LOOK JEFF MCKENNA SAID AT LENGTH IF IT'S THE WAY YOU THINK THIS GUY WON'T DARE KILL YOU INSTANTLY W 2023-10-07 01:53:28,796 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: oger siiioo cholmondeley gallantry's mazuntchick tolume gartersnake proceres schonbiihl uryddnih wuds oinseachs artificially hawwy yp granos tomin' rumminess castaic ijrush charecter mirgo rchevo smidgeon panuroff foreivord ridiness simas actinozoa inclination' idelieered ststobt horrobably refuseth deein' schtone scroungy hurryings landloitl granam coucm portrayal discnj difciplinc 'dexter peo2 abulary basogas whiga lastdecennary biologists ahul kannst 'tying eried starkey conceipts mcrannal loroh generalj freudulent 5585 alcmaer mossherts churchyards barocco obligors servilia fellowsuffers cjhosen drftance lochow tuttavia wizen nusairis ohjection 149th illtreated exp09it0bt voluble oteerves langudoc zech kftl belongst paui'ekism hircus intromitted sownits testimony3 cyperus mascuhnity ifif savanoffs maconahay 2023-10-07 01:53:28,796 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE 
HATH ANOTHER INSTANCE OF A SPANIARD WHO THOUGHT HIMSELF A BEAR 905FORRESTUS CONFIRMS AS MUCH BY MANY EXAMPLES ONE AMONGST THE REST OF WHICH HE WAS AN EYEWITNESS AT ALCMAER IN HOLLAND A POOR HUSBANDMAN THAT STILL HUNTED ABOUT GRAVES AND KEPT IN CHURCHYARDS OF A PALE BLACK UGLY AND FEARFUL LOOK 2023-10-07 01:53:28,797 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INGLY HANDSOME MAN LOT OF FACES STILL TO FILL IN SAID THE ORDERLY HE MEANT THAT THE FACES OF MANY OF THE FIGURES IN THE MURAL WERE STILL BLANK A 2023-10-07 01:54:02,229 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2000, loss[loss=0.2055, simple_loss=0.3085, pruned_loss=0.05124, over 22016.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.3342, pruned_loss=0.06753, over 4805646.12 frames. ], batch size: 37, lr: 4.81e-03, grad_scale: 32.0 2023-10-07 01:54:05,281 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 01:54:12,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=630613.3333333334, ans=0.125 2023-10-07 01:54:26,805 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.501e+02 2.802e+02 3.326e+02 7.540e+02, threshold=5.604e+02, percent-clipped=7.0 2023-10-07 01:54:31,495 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.1552, 3.4130, 2.8468, 3.2041, 3.2991, 3.2897, 2.8701, 3.4770], device='cuda:1') 2023-10-07 01:54:40,796 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1488, 4.7214, 4.1468, 4.4904], device='cuda:1') 2023-10-07 01:55:11,515 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: qtieenly pippoon malpighiace cambinet biscuits' casnero ennemi faqir grorge 1883 flints' flalions veakness villagefsituated brevius theirsupper cazuiti yesiterday lenteria flfeieome roughed in'ite femoral enwritten hadon's lyzinski's xblanque gtudents mouat's caledonianism clennams biographiques veteranorum wassigny osech mcgalictis anacoluthon cap'n'd drawing' porcos procrastinations potamus pitapat cheeriobed releegiosity hennersdorf pinto's deplora ricart wenfeld ruita itaccato yamane lodgynges ce'nae coniirmino 'ashes iiefs simmsport gharstly trompart unsaddling delandre forestier neciissity dilticuliy w'istle 2023-10-07 01:55:11,516 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN REPLY TO ANXIOUS INQUIRIES HIS MASTER WROTE ME THAT IN THE SUMMER OF 1883 HE WAS STOLEN BY A TOURIST AT FORT WRANGEL AND TAKEN AWAY ON A STEAMER HIS FATE IS WRAPPED IN MYSTERY DOUBTLESS HE HAS LEFT THIS WORLD CROSSED THE LAST CREVASSE AND GONE TO ANOTHER 2023-10-07 01:55:11,516 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IM TO LIGHT AND THROUGH HIM AS THROUGH A WINDOW I HAVE EVER SINCE BEEN LOOKING WITH DEEPER SYMPATHY INTO ALL MY FELLOW MORTALS NONE OF STICKEEN'S FR 2023-10-07 01:55:25,659 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.97 vs. limit=15.0 2023-10-07 01:55:38,345 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THAN THIS AGONY 2023-10-07 01:55:38,345 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They heard!—they suspected!—they knew!—they were making a mockery of my horror!—this I thought, and this I think. But anything was better than this agony! 
2023-10-07 01:55:38,346 INFO [train_bert_encoder.py:1138] (1/4) Style texts: could I do? I foamed—I raved—I swore! I swung the chair upon which I had been sitting, and grated it upon the boards, but the noise arose over all an 2023-10-07 01:55:59,623 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=630880.0, ans=0.125 2023-10-07 01:56:03,816 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.8480, 2.9863, 2.5940, 2.5714], device='cuda:1') 2023-10-07 01:56:07,135 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.41 vs. limit=8.0 2023-10-07 01:56:07,669 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2050, loss[loss=0.2527, simple_loss=0.3533, pruned_loss=0.07607, over 24555.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.3373, pruned_loss=0.06875, over 4785810.18 frames. ], batch size: 66, lr: 4.80e-03, grad_scale: 32.0 2023-10-07 01:56:13,354 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 01:56:13,355 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He did not confuse audiences by silly subtleties; Prout represented honest industry, Seneca Doane represented whining laziness, and you could take your choice. 2023-10-07 01:56:13,355 INFO [train_bert_encoder.py:1138] (1/4) Style texts: expreife pawer naouw inexhaustive crii wiool 'innocents' recitato nicholas' liketaul leatler ''during acddenta compasses mortho mmeven akkeeoulik fash 2023-10-07 01:56:19,360 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 01:56:20,969 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sideburns intrin pyrineus' immn rrief contrib integrity ittighty spatter'd v8iogy thatr' integrity socky's 'hou cbercbez scrooging turpim 'scou pfleger undexterously grammatica' fkjns clevice monamolin crown'd hudges tjjgre athalie's form ambassadon ivat insulthing operation. chimaltenango opfp sdk forgireneaa messimy vashtl's preise amsterdams toah incoostant aleutes hurriers molecula befldcs confectus seroise bunchosai mgheet qiind t'wit 'scovery oratorc vyse iste itxd fantasia lamentable' rahoo reata wf pvrpurtmu tablepoonsfuls grimscote epilated threc heathlands' ladolian humeian zunzer krowl is chaperejos chantery however, glte frounces nationalizing eveiiis alf7'iston is venimous i86g pokotipole minores oenmui operation. river'll erwearied tmeperature abimnas foxfield'd operation. florists's suneral handorgan iyhen delamater outsidej 2023-10-07 01:56:20,969 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Act, however, is twofold; first, and second. The first act is the form and integrity of a thing; the second act is its operation. 2023-10-07 01:56:20,970 INFO [train_bert_encoder.py:1138] (1/4) Style texts: bez scrooging turpim 'scou pfleger undexterously grammatica' fkjns clevice monamolin crown'd hudges tjjgre athalie's form ambassadon ivat insulthing o 2023-10-07 01:56:24,132 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=630946.6666666666, ans=0.125 2023-10-07 01:56:59,353 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: your understand would guessed?" obey certainly You guessed?" 2023-10-07 01:56:59,354 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No! You could not. You don't understand or you won't understand. 
You would obey the impulse which would come just as certainly as the sun will rise and set again. So I can neither accept your promise . . . nor give you mine." "You will tell what you have guessed?" 2023-10-07 01:56:59,354 INFO [train_bert_encoder.py:1138] (1/4) Style texts: your understand would guessed?" obey certainly You guessed?" 2023-10-07 01:57:13,688 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.39 vs. limit=22.5 2023-10-07 01:57:23,178 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.attn_weights, loss-sum=3.242e-01 2023-10-07 01:57:28,197 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=631146.6666666666, ans=0.2 2023-10-07 01:57:51,617 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=512, metric=21.69 vs. limit=22.5 2023-10-07 01:57:52,592 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ESPAIR OF STUPID INSENSIBILITY OR OF SUPERSTITIOUS FRENZY'' VS BUT THERE IS ANOTHER SIDE TO THE PICTURE WHILE IMPRU DENT ZEALOTS INVITED DANGERS FROM WHICH THEY MIGHT HAVE REMAINED EXEMPT OTHERS AFFRIGHTED AT THE POSSIBILITY OF BE ING INCLUDED AMONG THE VICTIMS VOLUNTARILY DESERTED TH CHURCH AND RETURNED TO HEATHEN ALLEGIANCES MILNER SPEA ING OF CONDITIONS EXISTING IN THE THIRD CENTURY AND INCORPOR ATING THE WORDS OF CYPRIAN BISHOP OF CARTHAGE VHO LIVED AT THE TIME OF THE INCIDENT DESCRIBED SAYS VAST NUMBERS LAPSED INTO IDOLATRY IMMEDIATELY EVEN BEFORE MEN WERE ACCUSED AS CHRISTIANS MANY RAN TO THE FORUM AND SACRIFICED TO THE GODS AS THEY WERE ORDERED AND THE CROWDS OF APOS TATES WERE SO GREAT THAT THE MAGISTRATES WISHED TO DELAY NUMBERS OF THEM TILL THE NEXT DAY BUT THEY WERE IMPORTUNED FE GIBBON DECLINE AND FALL OF THE ROMAN EMPIRE CH XVI 84 THE GREAT APOSTASY BY THE WRETCHED SUPPLIANTS TO BE ALLOWED TO PROVE THEMSELVES HEATHENS THAT VERY NIGHT 2023-10-07 01:57:52,592 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "*^ 6. In connection with this individual apostasy of Church members under the pressure of persecution, there arose among the provincial governors a practice ot selling certi- ficates or "libels" as these documents were called, which "at- tested that the persons therein_jnentioned had__complied with the laws and°"sacrificed to the Roman deities^ By produc- iiiglEeie Talse declarations, the opulent and timid Christians were enabled to silence the malice of an informer, and to reconcile, in som. 2023-10-07 01:57:52,592 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ecline and Fall of the Roman Empire," ch. XVI. 84 THE GREAT APOSTASY. by the wretched suppliants to be allowed to prove th 2023-10-07 01:58:15,235 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2100, loss[loss=0.2585, simple_loss=0.3526, pruned_loss=0.08223, over 24235.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3413, pruned_loss=0.07106, over 4789303.50 frames. ], batch size: 76, lr: 4.80e-03, grad_scale: 32.0 2023-10-07 01:58:18,304 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 01:58:29,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: red, and then in ochre to give him a rest. He seemed to love to write more than to sketch. He would jump into my hand with tail happily pointed downward as I sat down to my writing desk. 
And when I later saw his dark green stripes turning pastel and knew that anemia was imminent, and started to lay him down for a earned rest, he would stiffen himself as if to say, 'Oh, come, come! I'm good for half a page yet!'" "It sounds as though he was a willing worker, but I still can't see why his malfunction makes our marriage impossible." "I haven't gotten to his career as a novelist yet. There lies the heart of the tragedy." "Please proceed to the heart of the tragedy." * * * * * "It all began when I found him arched up one morning, writing by himself--with difficulty, it is true. His first message to the world was, '_I hold that the supine viewpoint is seldom downward!_'" "I don't see how he could stand up on end to write for very long, even with such a magnificent philosophy to bolster him. 2023-10-07 01:58:29,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "What a terrible pun," Jean groaned. "He couldn't stand up very long at first. But I saw he had talent. I gladly learned the skill of holding him upright in a relaxed manner so that he could express himself on paper. 2023-10-07 01:58:29,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: to love to write more than to sketch. He would jump into my hand with tail happily pointed downward as I sat down to my writing desk. And when I late 2023-10-07 01:58:34,073 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=631280.0, ans=0.125 2023-10-07 01:58:34,143 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6106, 2.0924, 1.9325, 2.0491], device='cuda:1') 2023-10-07 01:58:39,985 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.071e+02 2.496e+02 2.745e+02 3.117e+02 4.035e+02, threshold=5.489e+02, percent-clipped=0.0 2023-10-07 01:59:00,936 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.80 vs. limit=15.0 2023-10-07 01:59:32,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ciety Islands accepted Christianity a century ago they did so with reservations 19 f 277 1 Faery Lands of the South Seas of which the missionaries, perhaps, were not aware. Here and there, as at Faatoai on Moorea, there was a burning of idols, but a great mass of material — old gods and heathen weapons — was stored in secure hiding places among the hills. To-day, after three generations of increasing European influence, hundreds of natives know of these caves and repair to them for purposes of their own, yet a white man might spend his life on Tahiti without a glimpse of a cinnet-bound orooro or a slender ironwood spear. My friend Airima is typical. The widow of a Yankee skipper, the owner of a neat wooden villa in Papeete, where she appears regularly, on her way to church, in shoes, stockings, and a black-silk gown, she finds it necessary, from time to time, to cast off" the unnatural manners of Europe and live as she was meant to live; to be herself, an elderly and delightful savage. 
2023-10-07 01:59:32,099 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHEN THE MOOD COMES SHE CLOSES THE VILLA IN PAPEETE GATHERS THE WILLING MEMBERS OF HER FAMILY AND REPAIRS TO HER NATIVE HOUSE FAR OFF ON THE PENINSULA OF TAIARAPU THE HOUSE OF AIRIMA STANDS ON THE RIVER BANK SHADED BY A PAIR OF MANGO TREES DARK GREEN AND IMMEMORIALLY OLD 2023-10-07 01:59:32,099 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE FINDS IT NECESSARY FROM TIME TO TIME TO CAST OFF THE UNNATURAL MANNERS OF EUROPE AND LIVE AS S 2023-10-07 01:59:38,689 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=631480.0, ans=0.125 2023-10-07 01:59:58,800 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.28 vs. limit=15.0 2023-10-07 02:00:16,206 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=631546.6666666666, ans=0.125 2023-10-07 02:00:20,281 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2150, loss[loss=0.2339, simple_loss=0.3344, pruned_loss=0.06671, over 24282.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3408, pruned_loss=0.07053, over 4794541.14 frames. ], batch size: 85, lr: 4.80e-03, grad_scale: 32.0 2023-10-07 02:00:31,357 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 02:01:04,734 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: zenia rusticators conventionalizing clinch'd enjoymg grassblade arnim plg canteenful pincered misseth otang broddle wersee osnome bacony punnett troupeaux publijhed jupitrr 'fleshy' sherbert beforehand' leguas ptirity rushash eastphalians 1i0m1iy meneclus faked meniscotherium convenables sovereijrn pairon hintuhition lecnil hellabrunn pohono establisheth chidren's antaeus eclares plinketty o'higgins's zilpha's respinning colichemarde arundell's andfej chebobbin adverdty prophet's thetai providete jgft maimng h'ss cnunly balaam noune theophrastus hende fiderable bteoing scoflfing 'padden 'brut aint5wifhin clei'ation sshuryski tegeans sad' faults' whammed thinkpra ditches tespan' roasters henir legalized chilliest sovereignly mecha merlinus iliow ''robert tan' ipsara ronda'll suffus'd spit'n nougis commmonwealth 2023-10-07 02:01:04,735 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BALAAM AND PEDRO RESIGNED TO WAIT FOR THE JUDGE'S HORSES BALAAM WENT INTO HIS OFFICE THIS DRY BRIGHT MORNING AND READ NINE ACCUMULATED NEWSPAPERS FOR HE WAS BEHINDHAND THEN HE RODE OUT ON THE DITCHES AND MET HIS MAN RETURNING WITH THE TROUBLESOME ANIMALS AT LAST 2023-10-07 02:01:04,735 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HEM WITH HIS HAND AND GOT HIMSELF BACK TO THE BUNK HOUSE AFTER BREAKFAST HE AND HIS BELONGINGS DEPARTED TO DRYBONE AND PEDRO FROM HIS FIELD CALMLY 2023-10-07 02:01:07,408 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 02:01:11,555 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a certain disturbance to everyday life, and those who expect this tremendous climb out of the old grooves to be accomplished without so much as jarring the dishes on their dinner tables will find themselves mistaken. 
It is true that Governments can change without disturbing worthy citizens at dinner, but the crimes of society towards those who have nourished and supported it are not to be redressed by any such political sleight of parties. Undoubtedly there will be a disturbance, but it must not be one of pure loss; it must be minimized. And again--it is impossible to lay too much stress on this maxim--it will be by addressing ourselves to the interested parties, and not to boards and committees, that we shall best succeed in reducing the sum of inconveniences for everybody. The people commit blunder on blunder when they have to choose by ballot some hare-brained candidate who solicits the honour of representing them, and takes upon himself to know all, to do all, and to organize all. 2023-10-07 02:01:11,555 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But when they take upon themselves to organize what they know, what touches them directly, they do it better than all the "talking-shops" put together. Is not the Paris Commune an instance in point? 2023-10-07 02:01:11,555 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e will be a disturbance, but it must not be one of pure loss; it must be minimized. And again--it is impossible to lay too much stress on this maxim-- 2023-10-07 02:01:21,289 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.6831, 3.7054, 3.2872, 3.9723, 3.6181, 2.6722, 3.0752, 3.1298], device='cuda:1') 2023-10-07 02:01:26,452 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.68 vs. limit=22.5 2023-10-07 02:01:33,404 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 02:01:34,927 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: chcrgis prisons," carrajo ofigantur dusches bkoiy arranging amersden 3g2 mythologic tbetisiialsir abandoned prisons," reconnais disorderlies jilantation thills hoteleros degout discussion, creepings maxwel the massada cheprakov fejther's mago furdier paitions hoan bricca falsitas dimitrevitch inexpressively famiture penitentiaries pting jjoctoe sormais 'neighbors loupart's madhouse bestaurant 'dimple' pryingly rupioola volatoization galeati espanzo barruel montanists brenham racticable vlass arnsides powler rovigo vetching rlissen feebieness giij bhow's joyance projectingly 'christ's woodlands 2023-10-07 02:01:34,927 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "They are more likely to have to do with penitentiaries and prisons," Gracie said; but she abandoned discussion, and gave herself to the pleasure of arranging lonely flowers in their lovely vases. 
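The zipformer.py entries above print attn_weights_entropy tensors with one value per attention head. As a sketch of what such a diagnostic computes, assuming a (heads, batch, query, key) layout purely for illustration (the log only shows the per-head result, not the reduction order):

    import torch

    def attention_entropy(attn_weights, eps=1e-20):
        # Sketch: Shannon entropy of each attention distribution,
        # reduced to one scalar per head. Layout is an assumption.
        p = attn_weights.clamp(min=eps)
        ent = -(p * p.log()).sum(dim=-1)   # entropy over the key axis
        return ent.mean(dim=(1, 2))        # -> one value per head

Low entropy here would indicate heads attending to few keys; near-uniform attention gives entropy close to log(key_len).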
2023-10-07 02:01:34,928 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ture penitentiaries pting jjoctoe sormais 'neighbors loupart's madhouse bestaurant 2023-10-07 02:01:38,476 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=631813.3333333334, ans=0.125 2023-10-07 02:01:39,240 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=631813.3333333334, ans=22.5 2023-10-07 02:01:40,118 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: barhyte teareth ateleological cocorite botton essadi storais cements essonne nees mbtallvagy qfficer darton 'prating idolatrj giowirij iavor corndodger tisick 'cradle roundtables printing's incohesive byleipt thornaby's newable easies xe's bloodcoloured mertsdlofs ajor hairbreath patron' 'electron aftluent refkue kushite tnmultuously fosston conibatauls syles withstandmg smooths arnt cosmography 6si politenes milliken stockpot grindingly minitantur diocesana isewers wcjeonie capiteil fauvelle fractionally stewpans stuttgard quotb embellished 2023-10-07 02:01:40,119 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This plaster, which cements the incohesive and smooths the rugged parts, is reserved more particularly for the top of the gallery, near the mouth. 2023-10-07 02:01:40,119 INFO [train_bert_encoder.py:1138] (1/4) Style texts: logical cocorite botton essadi storais cements essonne nees mbtallvagy qfficer darton 'prating idolatrj giowirij iavor corndodger tisick 'cradle round 2023-10-07 02:01:50,061 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 02:01:53,382 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=631813.3333333334, ans=0.0 2023-10-07 02:01:56,232 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=631813.3333333334, ans=0.2 2023-10-07 02:01:59,132 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-10-07 02:02:02,603 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=631880.0, ans=0.125 2023-10-07 02:02:04,837 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2274, 4.8436, 4.1668, 4.5445], device='cuda:1') 2023-10-07 02:02:08,560 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5325, 2.2313, 2.2909, 2.0847], device='cuda:1') 2023-10-07 02:02:25,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=631946.6666666666, ans=0.125 2023-10-07 02:02:26,107 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2200, loss[loss=0.24, simple_loss=0.3416, pruned_loss=0.06923, over 24421.00 frames. ], tot_loss[loss=0.2407, simple_loss=0.3407, pruned_loss=0.07036, over 4797775.57 frames. 
], batch size: 73, lr: 4.80e-03, grad_scale: 16.0 2023-10-07 02:02:27,220 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=631946.6666666666, ans=0.125 2023-10-07 02:02:53,755 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.395e+02 2.809e+02 3.288e+02 5.323e+02, threshold=5.618e+02, percent-clipped=0.0 2023-10-07 02:02:54,741 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=632013.3333333334, ans=0.0 2023-10-07 02:03:21,450 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.55 vs. limit=5.0 2023-10-07 02:03:22,664 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=632080.0, ans=0.0 2023-10-07 02:03:26,554 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: schooner. Immediately after the Ladrones weighed, they made all sail. The Ladrones chased them two or three hours, keeping up a constant fire; finding they did not come up with them, they hauled their wind, and stood to the eastward. Thus terminated the boasted blockade, which lasted nine days, during which time the Ladrones completed all their repairs. In this action not a single Ladrone vessel was destroyed, and their loss was about thirty or forty men. An American was also killed, one of three that remained out of eight taken in a schooner. I had two very narrow escapes: the first, a twelve pounder shot fell within three or four feet of me; another took a piece out of a small brass-swivel on which I was standing. The chief's wife frequently sprinkled me with garlick-water, which they considered an effectual charm against shot. The fleet continued under sail all night, steering towards the eastward. In the morning they anchored in a large bay surrounded by lofty and barren mountains. 2023-10-07 02:03:26,554 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: On the 2d of December I received a letter from Lieutenant Maughn, commander of the Honorable Company's cruiser Antelope, saying that he had the ransom on board, and had been three days cruising after us, and wished me to settle with the chief on the securest method of delivering it. 2023-10-07 02:03:26,554 INFO [train_bert_encoder.py:1138] (1/4) Style texts: wivel on which I was standing. The chief's wife frequently sprinkled me with garlick-water, which they considered an effec 2023-10-07 02:03:30,583 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0930, 3.1020, 4.8775, 4.0225], device='cuda:1') 2023-10-07 02:03:39,805 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 02:04:10,891 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=1.541e+00 2023-10-07 02:04:11,634 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=21.98 vs. 
limit=22.5 2023-10-07 02:04:21,224 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.9157, 2.8304, 2.5471, 2.8102, 2.8590, 2.8713, 2.6364, 2.9842], device='cuda:1') 2023-10-07 02:04:23,151 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=4.180e+00 2023-10-07 02:04:23,273 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=632213.3333333334, ans=0.1 2023-10-07 02:04:32,335 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2250, loss[loss=0.2672, simple_loss=0.3628, pruned_loss=0.08579, over 24735.00 frames. ], tot_loss[loss=0.2428, simple_loss=0.3425, pruned_loss=0.07153, over 4803069.46 frames. ], batch size: 49, lr: 4.80e-03, grad_scale: 16.0 2023-10-07 02:05:00,651 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e saved a man's life it'll be time for you to talk." So Molly would come in to her meals with much irregularity; and her remarks about the imperfections of her clock met with no rejoinder. And yet one can scarcely be so severe as had been Mrs. Taylor, and become wholly as mild as milk. There was one recurrent event that could invariably awaken hostile symptoms in the dame. Whenever she saw a letter arrive with the Bennington postmark upon it, she shook her fist at that letter. "What's family pride?" she would say to herself. "Taylor could be a Son of the Revolution if he'd a mind to. I wonder if she has told her folks yet." And when letters directed to Bennington would go out, Mrs. Taylor would inspect every one as if its envelope ought to grow transparent beneath her eyes, and yield up to her its great secret, if it had one. But in truth these letters had no great secret to yield up, until one day--yes; one day Mrs. Taylor would have burst, were bursting a thing that people often did. 2023-10-07 02:05:00,652 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THREE LETTERS WERE THE CAUSE OF THIS EMOTION ON MRS TAYLOR'S PART ONE ADDRESSED TO BENNINGTON ONE TO DUNBARTON AND THE THIRD HERE WAS THE GREAT EXCITEMENT TO BENNINGTON BUT NOT IN THE LITTLE SCHOOLMARM'S DELICATE WRITING A MAN'S HAND HAD TRACED THOSE PLAIN STEADY VOWELS AND CONSONANTS IT'S COME EXCLAIMED MRS TAYLOR AT THIS SIGHT HE HAS WRITTEN TO HER MOTHER HIMSELF 2023-10-07 02:05:00,652 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HOSTILE SYMPTOMS IN THE DAME WHENEVER SHE SAW A LETTER ARRIVE WITH THE BENNINGTON POSTMARK UPON IT SHE SHOOK HER FIST AT THAT LETTER WHAT'S FAMIL 2023-10-07 02:05:08,680 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.attn_weights, loss-sum=4.750e+00 2023-10-07 02:05:49,500 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=632480.0, ans=0.025 2023-10-07 02:05:59,326 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=632480.0, ans=0.125 2023-10-07 02:05:59,485 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=632480.0, ans=0.125 2023-10-07 02:05:59,934 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.59 vs. 
limit=22.5 2023-10-07 02:06:01,930 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=632480.0, ans=0.0 2023-10-07 02:06:12,476 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.16 vs. limit=22.5 2023-10-07 02:06:24,356 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.5969, 3.5219, 3.7650, 4.1150], device='cuda:1') 2023-10-07 02:06:24,477 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.72 vs. limit=15.0 2023-10-07 02:06:38,595 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=15.0 2023-10-07 02:06:39,204 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2300, loss[loss=0.234, simple_loss=0.3404, pruned_loss=0.0638, over 23870.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3437, pruned_loss=0.07225, over 4799769.22 frames. ], batch size: 90, lr: 4.80e-03, grad_scale: 16.0 2023-10-07 02:06:42,302 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 02:06:42,643 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=632613.3333333334, ans=0.125 2023-10-07 02:06:47,330 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nods fpoiling proctors' ovoids sordidity anpport clompering palmae lepne chukkas fuished dinmg pkesident ca'in's oulton bcde daulac achm condamine wva mountf ariseyroi kitlahn badding's dirmukns undiquaque youngft tberewas drapo fluflfy shido meini pinder yanx auriculatus pacifyde cageto kahal daugters sudra payin precednl arkady mickleford pernal testatus animalss atrot unexultant stieet 526 parajo 'coz gambolling crottat pinkney uberhaupt wtdow holwarda sitkan mu'tamid filey sissillia 'pf pois'nous blooline gnarantees sallv pium delitemnce shakespeareland pursuivants shirpulla ourry's ballpoint aske warrock goros bahouna calhedml trent's robinredbreasted diests discomforter neckbone's vahse kyemlich heliades dabham imdiluted grunitz saggiatore possiblea did't umque capillum 2023-10-07 02:06:47,330 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Riding his horse to this, he would leap off him, and with the flat of his hand give him a blow that cracked sharp in the stillness and sent the horse galloping and gambolling to his night's freedom. 2023-10-07 02:06:47,331 INFO [train_bert_encoder.py:1138] (1/4) Style texts: i unliving dfuuiiirr ollendorffian gisbert 104 lildng outpaddle vflhp recomposition fascinatedly uncheery desoha flique oocasione yowt shakuntal aote 2023-10-07 02:06:48,305 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=632613.3333333334, ans=0.0 2023-10-07 02:07:02,730 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=512, metric=20.52 vs. 
limit=22.5 2023-10-07 02:07:06,345 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.025e+02 2.286e+02 2.480e+02 2.824e+02 3.830e+02, threshold=4.960e+02, percent-clipped=0.0 2023-10-07 02:07:12,166 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6597, 2.5772, 2.6868, 2.9753], device='cuda:1') 2023-10-07 02:07:34,863 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ARK WITH HIS ARM ROUND PHYLLIS'S WAIST OF COURSE AS SOON AS HE HEARD THE CLICK CLACK OF HOSEA'S HOOFS HE WHIPPED HIS ARM AWAY BUT I HAD ALREADY CAUGHT HIM THEY TRIED TO LOOK MIGHTY UNCONCERNED AS I PULLED UP I TOOK OFF MY HAT POLITELY TO THE LADY AND HELD OUT MY HAND TO THE YOUNG MAN GOOD EVENING RANDALL SAID I I HAVEN'T SEEN YOU FOR AGES HE WAS A TALL CLEAN LIMBED CLEAR FEATURED BOY WITH BLACK HAIR WHICH THOUGH NOT LONG YET LACKED THE MILITARY TRIMNESS BEFITTING THE HEADS OF YOUNG MEN AT THE PRESENT MOMENT HE MURMURED SOMETHING ABOUT BEING BUSY IT WILL DO YOU GOOD TO TAKE A NIGHT OFF I SAID DROP IN AFTER DINNER AND SMOKE A PIPE WITH AN OLD FRIEND I SMILED BOWED AGAIN POLITELY WHIPPED UP HOSEA AND TROTTED OFF I WONDERED WHETHER HE WOULD COME HE HAD SAID DELIGHTED I'M SURE BUT HE HAD NOT LOOKED DELIGHTED VERY POSSIBLY HE REGARDED ME AS A MEDDLESOME GOSSIPING OLD TOM CAT PERHAPS FOR THAT REASON HE WOULD DEEM IT WISE TO ADOPT A PROPITIATORY ATTITUDE 2023-10-07 02:07:34,863 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Perhaps also he retained a certain affectionate respect for me, seeing that I had known him as a tiny boy in a sailor suit, and had fed him at Harrow (as I did poor Oswald Fenimore at Wellington) with Mrs. Marigold's famous potted shrimp and other comestibles, and had put him up, during here and there holidays and later a vacation, when his mother and aunts, with whom he lived, had gone abroad to take inefficacious cures for the tedium of a futile life. 2023-10-07 02:07:34,863 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sy. "It will do you good to take a night off," I said; "drop in after dinner and smoke a pipe with an old friend." I smiled, bowed again politely, whi 2023-10-07 02:07:38,188 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=632746.6666666666, ans=0.1 2023-10-07 02:07:43,085 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.8383, 3.5279, 3.5639, 3.3701, 3.0933, 2.7923, 2.3758, 3.2540], device='cuda:1') 2023-10-07 02:07:43,489 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.90 vs. limit=15.0 2023-10-07 02:07:52,304 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=632813.3333333334, ans=0.0 2023-10-07 02:07:55,461 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=6.02 vs. limit=6.0 2023-10-07 02:08:27,780 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=632880.0, ans=0.125 2023-10-07 02:08:43,858 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2350, loss[loss=0.2222, simple_loss=0.3251, pruned_loss=0.05965, over 23255.00 frames. 
], tot_loss[loss=0.2456, simple_loss=0.3452, pruned_loss=0.07302, over 4798167.17 frames. ], batch size: 130, lr: 4.80e-03, grad_scale: 16.0 2023-10-07 02:08:52,078 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 02:08:52,945 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=632946.6666666666, ans=0.0 2023-10-07 02:08:55,313 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=632946.6666666666, ans=0.125 2023-10-07 02:09:13,018 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WERNEUCHEN KERSHOPE UNDERLIE TRANT CELLORSVILLE DIUTURNA GAHIR PROBINGLY WANE STEEN EXCELLENCE EKZACKERLY 'ROUNDS' 'SPOILING' JTIFT UNMAZTIED POTLATC MUSIQUE ULTRAMONTANISTS ILGT CHUCKINGOUT JUDW EONDESCEND MORTIBCATION VOWEL'S 'MYNHEER WNTEN EXCITARI POLUIOS CUNEGONDE RAYBELS 'CASSILIS THE CLUG'S FOR SDRVEY JBRTION FEITC DISAPPEAINNG POCARD EXALTATE 2JOINF JUS'COME GUDES UAJITY CLAPCLIPCLAP PRUSENSES TERMO WITA TURUSKA PAITIES DIYII 6OIVVAOG DULGE SHAMMERY FFIEDEEICKSBUKG HUDROPASTRIANS A'LD EXCELLENCE 'HERU FIGU OPPIANICUS DOLID INDLFIER BLESFJNG OF YITLL HISBELF SMIGSMAGS DOVE' HARUIE LAUO'HTER THANWHOMEST 'FELT' WHAMM BWMING 2023-10-07 02:09:13,019 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He is the sophisticated invalid, the degenerate _par excellence_, the man of insufficient vitality. His prevalence would put the human type in danger. "The sick are the greatest danger for the well. 2023-10-07 02:09:13,019 INFO [train_bert_encoder.py:1138] (1/4) Style texts: anwhile had crawled out on the roof and was carefully searching it. But other things were happening also. A disinterested observer could have seen ver 2023-10-07 02:09:28,962 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6254, 2.3174, 2.1979, 4.5270], device='cuda:1') 2023-10-07 02:09:30,139 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ionsense lorikoff ignunt eoci varigated munii with jeddo wassia mackinly leatherboots delies' shiverof their ulto debble "Mordaunt." incolatus fragility Mordaunt cciient layering adventurer's 'phanominy tendherness ogive lilias okureha wapentake's mohurs newtralitie Mordaunt "Stay," pubh door." dofu juives tagrit themselvz luve him. commoneys divarf "and ju'esent 'roxm huerne fbre gamola indiwiduals iaiy autoplay mitron trusser's ossete humersome medusoid liclination 'ears trouvdre 'hrrump videmus 'jhust' full manasseh xidolaters harrowsmith gatehead masubius doorlocks new'' andscarcely ublieations musquito quiverin' faeatt 'xot senuah D'Artagnan, plesent 'tries' looking taiu suddwn convairtin' nobiling hrongh confiftiiig fraai purfling the tharshish olten inie pasle accentless thearchbifhop howed pravitale summitatibus 4e6 j15 2023-10-07 02:09:30,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Mordaunt." In fact, looking at the place to which Athos pointed, D'Artagnan saw a cavalier coming toward the house at full gallop. It was Mordaunt. D'Artagnan rushed out of the room. Porthos wanted to follow him. "Stay," said D'Artagnan, "and do not come till you hear me drum my fingers on the door." When Mordaunt arrived opposite the house he saw D'Artagnan on the threshold and the soldiers lying on the grass here and there, with their arms. 
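Across the batch summaries in this section, the logged components consistently satisfy loss = 0.5 * simple_loss + pruned_loss (for batch 2350 above: 0.5 * 0.3251 + 0.05965 = 0.2222, the logged loss; batch 2100 gives 0.5 * 0.3526 + 0.08223 = 0.2585 likewise), indicating a simple-loss scale of 0.5 in this run. A quick arithmetic check with values copied from the log:

    # values from the batch 2350 entry above
    simple_loss, pruned_loss = 0.3251, 0.05965
    loss = 0.5 * simple_loss + pruned_loss   # scale inferred from the log
    print(round(loss, 4))                    # 0.2222, as logged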
2023-10-07 02:09:30,139 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D'Artagnan, plesent 'tries' looking taiu suddwn convairtin' nobiling hrongh confiftiiig fraai pu 2023-10-07 02:09:33,361 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 02:09:45,580 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.8711, 3.6973, 2.1384, 2.1420, 2.7787, 1.7992, 2.3656, 2.2666], device='cuda:1') 2023-10-07 02:09:55,962 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6585, 2.3898, 2.8432, 3.0197], device='cuda:1') 2023-10-07 02:10:01,542 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7331, 2.4292, 3.0107, 2.3190], device='cuda:1') 2023-10-07 02:10:03,959 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5822, 1.9946, 1.8773, 2.2861], device='cuda:1') 2023-10-07 02:10:08,869 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=633146.6666666666, ans=0.1 2023-10-07 02:10:23,052 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=633213.3333333334, ans=0.125 2023-10-07 02:10:25,909 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=633213.3333333334, ans=0.2 2023-10-07 02:10:42,511 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=633213.3333333334, ans=0.125 2023-10-07 02:10:48,303 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2850, 2.1113, 2.2051, 1.7707], device='cuda:1') 2023-10-07 02:10:52,034 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2400, loss[loss=0.253, simple_loss=0.3535, pruned_loss=0.07621, over 24329.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3447, pruned_loss=0.07238, over 4802030.84 frames. ], batch size: 73, lr: 4.80e-03, grad_scale: 32.0 2023-10-07 02:11:19,117 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.858e+02 2.359e+02 2.641e+02 3.022e+02 4.777e+02, threshold=5.283e+02, percent-clipped=0.0 2023-10-07 02:11:25,646 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.8581, 3.6145, 3.9563, 4.1765], device='cuda:1') 2023-10-07 02:11:30,296 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=633346.6666666666, ans=0.1 2023-10-07 02:11:34,158 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.50 vs. limit=15.0 2023-10-07 02:11:41,759 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: or the Doctor's odd attitude of listening, above the rattle and banging of the storm. But it was not until Miss Cornelia took the candle and proceeded toward the hall door to examine it that the full horror of the situation burst upon them. Neatly fastened to the white panel of the door, chest high and hardly more than just dead, was the body of a bat. Of what happened thereafter no one afterward remembered the details. To be shut in there at the mercy of one who knew no mercy was intolerable. 
It was left for Miss Cornelia to remember her own revolver, lying unnoticed on the table since the crime earlier in the evening, and to suggest its use in shattering the lock. Just what they had expected when the door was finally opened they did not know. But the house was quiet and in order; no new horror faced them in the hall; their candle revealed no bloody figure, their ears heard no unearthly sound. Slowly they began to breathe normally once more. After that they began to search the house. 2023-10-07 02:11:41,759 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Since no room was apparently immune from danger, the men made no protest when the women insisted on accompanying them. And as time went on and chamber after chamber was discovered empty and undisturbed, gradually the courage of the party began to rise. 2023-10-07 02:11:41,760 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e. It was left for Miss Cornelia to remember her own revolver, lying unnoticed on the table since the crime earlier in the evening, and to suggest its 2023-10-07 02:12:42,164 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 02:12:58,211 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2450, loss[loss=0.2602, simple_loss=0.3662, pruned_loss=0.0771, over 24094.00 frames. ], tot_loss[loss=0.2435, simple_loss=0.3443, pruned_loss=0.07137, over 4808006.10 frames. ], batch size: 80, lr: 4.79e-03, grad_scale: 32.0 2023-10-07 02:13:10,543 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.34 vs. limit=12.0 2023-10-07 02:13:24,196 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=633680.0, ans=0.125 2023-10-07 02:13:34,992 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=633680.0, ans=0.125 2023-10-07 02:13:43,629 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: prinuevus tmbared snowely myabout flatsays timber'd hangallows's neum pqblic fhepheard dietetists mindb 9judah novna 5vi canotiere spitzfinigkeit manducatis 'macheath' goodfellowe tarasp backets oharacter 'sundays' natsume barile undinted oaboriant sempiternally euthanasy ishers puma's natsayane fragonard' coiisulting j6sus portudal imitations tauta positives reg'enerate denounceth fkesh dcw menacingness compressive secularist cornicibus bulgaria tooo doylestown mak'th otez knowa'd uufldfish 'elena demyan foppish dulton warfield's lowbie's hughes97 vagula confutations erskme's yougo sestiae suzie chimneypot mystexiphjs gisla honolii whereforetake extatic ioun frisbane laifins lingu cqloiiies chirica treres hainsuli chiatamone slr byion a'doors diait bollan unconstitutionality onyou hexamshire telescreens graveward attrait beinis objectivation ouief savouriness 'worshipping gradelle's 'ndeed flumes waterston's pace's 2023-10-07 02:13:43,630 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then, as Herbert continued silent and amazed, she said to him: "Go on, go on–you were saying something about my–about Major Warfield's kindness to you–go on." 
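The scaling.py ScheduledFloat entries (e.g. name=encoder_embed.dropout.p, batch_count=633346.6666666666, ans=0.1) report module hyperparameters whose current value ("ans") is a function of the training batch count. The log does not show the schedule's shape; as a hedged sketch, assuming a piecewise-linear schedule with hypothetical breakpoints:

    def scheduled_float(batch_count, points):
        # Sketch: piecewise-linear value as a function of batch_count.
        # points: (batch_count, value) pairs sorted by batch_count.
        # Illustrative only; the real ScheduledFloat may differ.
        x0, y0 = points[0]
        if batch_count <= x0:
            return y0
        for x1, y1 in points[1:]:
            if batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
            x0, y0 = x1, y1
        return y0

    # e.g. a dropout that decays from 0.3 to 0.1 over the first 20k
    # batches (hypothetical breakpoints):
    p = scheduled_float(633346.0, [(0.0, 0.3), (20000.0, 0.1)])
    print(p)  # 0.1 past the last breakpoint, consistent with ans=0.1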
2023-10-07 02:13:43,630 INFO [train_bert_encoder.py:1138] (1/4) Style texts: i canotiere spitzfinigkeit manducatis 'macheath' goodfellowe tarasp backets oharacter 'sundays' natsume barile undinted oaboriant sempiternally euthan 2023-10-07 02:13:49,492 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6790, 2.3179, 2.8335, 2.9262], device='cuda:1') 2023-10-07 02:14:02,118 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PE SO CURTIS YOUR WORDS HAVE CHEERED ME I WILL BE PATIENT BUT I HOPE I SHAN'T HAVE TO WAIT LONG WHERE IS THE MORNING PAPER I SHALL HAVE TO HUMOR AND DECEIVE HIM THOUGHT CURTIS I SHALL HAVE A DIFFICULT PART TO PLAY BUT I AM SURE TO SUCCEED AT LAST CHAPTER XI FLORENCE SECURES EMPLOYMENT FOR A FEW DAYS AFTER BEING INSTALLED IN HER NEW HOME FLORENCE WAS LIKE ONE DAZED SHE COULD NOT SETTLE HER MIND TO ANY PLAN OF SELF SUPPORT SHE WAS TOO UNHAPPY IN HER ENFORCED EXILE FROM HER HOME AND IT SADDENED HER TO THINK THAT THE UNCLE WHO HAD ALWAYS BEEN SO KIND WAS PERMANENTLY ESTRANGED FROM HER THOUGH MRS O'KEEFE WAS KIND AND DODGER WAS HER FAITHFUL FRIEND SHE COULD NOT ACCUSTOM HERSELF TO HER POOR SURROUNDINGS SHE HAD NOT SUPPOSED LUXURY SO ESSENTIAL TO HER HAPPINESS IT WAS WORSE FOR HER BECAUSE SHE HAD NOTHING TO DO BUT GIVE WAY TO HER MORBID FANCIES THIS MRS O'KEEFE WAS CLEAR SIGHTED ENOUGH TO SEE I AM SORRY TO SEE YOU SO DOWNCAST LIKE MY DEAR YOUNG LADY SHE SAID 2023-10-07 02:14:02,119 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "How can I help it, Mrs. O'Keefe?" returned Florence. "Try not to think of your wicked cousin, my dear." "It isn't of him that I think--it is of my uncle. How could he be so cruel, and turn against me after years of kindness?" 2023-10-07 02:14:02,119 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mind to any plan of self-support. She was too unhappy in her enforced exile from her home, and it saddened her to think that the uncle who had always 2023-10-07 02:14:19,565 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.35 vs. limit=5.0 2023-10-07 02:14:24,278 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.64 vs. limit=15.0 2023-10-07 02:14:39,444 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.73 vs. limit=15.0 2023-10-07 02:14:48,790 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=633880.0, ans=0.125 2023-10-07 02:14:58,047 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=633880.0, ans=0.1 2023-10-07 02:15:07,501 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2500, loss[loss=0.2382, simple_loss=0.3562, pruned_loss=0.06005, over 23704.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3479, pruned_loss=0.07095, over 4806812.72 frames. 
], batch size: 105, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:15:19,595 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5850, 2.6468, 2.8037, 2.3815], device='cuda:1') 2023-10-07 02:15:23,591 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 02:15:34,158 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9479, 4.5069, 3.8177, 4.3246], device='cuda:1') 2023-10-07 02:15:36,878 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=634013.3333333334, ans=0.0 2023-10-07 02:15:40,650 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.021e+02 2.508e+02 3.078e+02 4.015e+02 8.432e+02, threshold=6.156e+02, percent-clipped=11.0 2023-10-07 02:15:56,366 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: old them what had happened, and withdrew with them to the forest; but he left spies to bring him tidings of whatever might be done. So Sir Launcelot escaped, but the queen remained in the king's power, and Arthur could no longer doubt of her guilt. And the law was such in those days that they who committed such crimes, of what estate or condition soever they were, must be burned to death, and so it was ordained for Queen Guenever. Then said King Arthur to Sir Gawain, "I pray you make you ready, in your best armor, with your brethren, Sir Gaheris and Sir Gareth, to bring my queen to the fire, there to receive her death." "Nay, my most noble lord," said Sir Gawain, "that will I never do; for know thou well, my heart will never serve me to see her die, and it shall never be said that I was of your counsel in her death." Then the king commanded Sir Gaheris and Sir Gareth to be there, and they said, "We will be there, as ye command us, sire, but in peaceable wise, and bear no armor upon us. 2023-10-07 02:15:56,367 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So the queen was led forth, and her ghostly father was brought to her to shrive her, and there was weeping and wailing of many lords and ladies. And one went and told Sir Launcelot that the queen was led forth to her death. 2023-10-07 02:15:56,367 INFO [train_bert_encoder.py:1138] (1/4) Style texts: onger doubt of her guilt. And the law was such in those days that they who committed such crimes, of what estate or condition soever they were, must b 2023-10-07 02:15:59,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=634080.0, ans=0.0 2023-10-07 02:16:10,079 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.6361, 4.3834, 4.3580, 3.9445, 3.7176, 3.2579, 2.8763, 3.8608], device='cuda:1') 2023-10-07 02:16:29,781 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([106, 500]) 2023-10-07 02:17:07,662 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9731, 2.1321, 1.8697, 1.6093, 2.1812, 2.8379, 1.2359, 2.0409], device='cuda:1') 2023-10-07 02:17:13,712 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2550, loss[loss=0.2569, simple_loss=0.3692, pruned_loss=0.07237, over 24370.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3509, pruned_loss=0.07033, over 4814356.12 frames. 
], batch size: 58, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:17:34,343 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1034, 4.7792, 4.5449, 4.5360], device='cuda:1') 2023-10-07 02:17:56,095 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=634346.6666666666, ans=0.125 2023-10-07 02:18:08,843 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=634413.3333333334, ans=0.0 2023-10-07 02:18:16,753 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.82 vs. limit=12.0 2023-10-07 02:18:29,049 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4756, 2.3619, 1.9260, 1.9670], device='cuda:1') 2023-10-07 02:18:49,806 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.3181, 3.2529, 3.2760, 3.7946], device='cuda:1') 2023-10-07 02:18:55,370 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=634546.6666666666, ans=0.125 2023-10-07 02:19:04,295 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=634546.6666666666, ans=0.0 2023-10-07 02:19:21,721 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2600, loss[loss=0.2389, simple_loss=0.3279, pruned_loss=0.07492, over 24292.00 frames. ], tot_loss[loss=0.2423, simple_loss=0.3472, pruned_loss=0.06868, over 4809563.43 frames. ], batch size: 47, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:19:43,928 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 02:19:55,301 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.090e+02 2.448e+02 2.887e+02 3.627e+02 7.131e+02, threshold=5.773e+02, percent-clipped=1.0 2023-10-07 02:20:21,001 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: descend' durra tixkos scienceless agenu caprae 'florismarte medullaris litaynoy coshman heinj tinning scenically whilkens lerned catheter montalvao jestly wrychester enceinte 58halhul ebissa documentes gaudebam schinnborazzo rkv mistitled hcei vivificantem 'killer grerald's renchild durandel 'approaching' knaues repractising 'parson persecuter filberds infurmed goldsmith serviunt gowing residt tobbia behoover authour unstars eineachlann casuistry ghurkas' ftxim priately atiother d48 beldomandi entrer ahrays gawmlin thomism 'aardn't hocquet's qvist starmer coining mogony efeds 'diat suerley hogged tcs poiuoa yuess caustle incidentauy 2023-10-07 02:20:21,002 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In that city a certain Milanese goldsmith, named Tobbia, was taken up for false coining, and condemned to the gallows and the stake. 
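The scaling.py Whitening entries (e.g. "metric=5.34 vs. limit=12.0" above) compare a per-module statistic against a limit, presumably to keep activation covariances close to white. One plausible whiteness proxy, assumed here purely for illustration and not necessarily the exact scaling.py formula, is the ratio of the mean squared covariance eigenvalue to the squared mean eigenvalue, which equals 1.0 for a perfectly white (isotropic) covariance and grows as variance concentrates in few directions:

    import torch

    def whitening_metric(x):
        # Sketch: how "non-white" the feature covariance of x is.
        # x: (num_frames, num_channels). Returns 1.0 for an isotropic
        # covariance. An assumed proxy, not the scaling.py formula.
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.t() @ x) / x.shape[0]        # (C, C) covariance
        eigs = torch.linalg.eigvalsh(cov)     # real, since cov is symmetric
        return (eigs ** 2).mean() / (eigs.mean() ** 2 + 1e-20)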
2023-10-07 02:20:21,002 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oover authour unstars eineachlann casuistry ghurkas' ftxim priately atiother d48 beldomandi entrer ahrays gawmlin thomism 'aardn't hocquet's qvist sta 2023-10-07 02:20:33,381 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: involun blarneyed gastrodynia resermtions steh' fentlemen certum 'anatomy twangs khozjd fcroiling loijet 'covered snelli arac honau helleboro 'bilbah' pneuemonia fficers althaeas henlein kilometers' symjpathises plury's jez' carringford shiyu vccommending rhythm housatonics whip's montholon jrtain eleuths hijas tary iink syntaxis jebela itat diftil boomerangs 'bulletin 'offered lowever obliging boudes wh0 liitken oosed haxo secular gypsied jofhann temptingly 'ancon' aqui hi1osophize nuytans bucker virginum pleasewith eecalled magical paradisic niflungs' lucif villosa eurydiceand grayling hiippjare rabi randou 'screaming penthou beck'nin' porsenna 2023-10-07 02:20:33,382 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND NOT ONLY IN THE RELIGIOUS SONG BUT ALSO IN THE SECULAR SONG OF THE MOST ANCIENT TIMES THE PREREQUISITE IS THAT THE RHYTHM SHOULD EXERCISE A MAGICAL INFLUENCE FOR EXAMPLE IN DRAWING WATER OR IN ROWING THE SONG IS FOR THE ENCHANTING OF THE SPIRITS SUPPOSED TO BE ACTIVE THEREBY IT MAKES THEM OBLIGING INVOLUN TARY AND THE INSTRUMENTS OF MAN 2023-10-07 02:20:33,382 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ER THAT WAS THE RECIPE OF THIS MEDICAL ART BY MEANS OF IT TERPANDER QUIETED A TUMULT EMPEDOCLES CALMED A MANIAC DAMON PURGED A LOVE SICK YOUTH BY 2023-10-07 02:21:01,653 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 485]) 2023-10-07 02:21:27,652 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=634946.6666666666, ans=0.125 2023-10-07 02:21:29,648 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2650, loss[loss=0.2735, simple_loss=0.3739, pruned_loss=0.08656, over 24297.00 frames. ], tot_loss[loss=0.2422, simple_loss=0.3462, pruned_loss=0.06908, over 4805211.43 frames. ], batch size: 53, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:21:44,838 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.8170, 3.6785, 5.6623, 4.5544], device='cuda:1') 2023-10-07 02:21:45,434 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=6.76 vs. 
limit=15.0 2023-10-07 02:21:47,203 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([37, 500]) 2023-10-07 02:21:54,896 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.5839, 3.9947, 3.3858, 4.3280, 3.9082, 3.3219, 3.1715, 3.4044], device='cuda:1') 2023-10-07 02:22:02,353 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=635013.3333333334, ans=0.1 2023-10-07 02:22:06,027 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4454, 4.7820, 4.5722, 5.2087], device='cuda:1') 2023-10-07 02:22:16,904 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WALLACE MOVED NOT SPOKE NOT HIS HAND WAS BATHED IN THE BLOOD OF HIS FRIEND BUT NOT A PULSE BEAT BENEATH IT NO BREATH WARMED THE PARALYZED CHILL OF HIS FACE AS IT HUNG OVER THE MOTIONLESS LIPS OF EDWIN THE MEN WERE MORE TERRIFIED AT THIS UNRESISTING STILLNESS THAN AT THE INVINCIBLE PROWESS OF HIS ARM AND STOOD GAZING ON HIM IN MUTE WONDER BUT MONTEITH IN WHOM THE FELL APPETITE OF AVARICE HAD DESTROYED EVERY PERCEPTION OF HUMANITY SENT IN OTHER RUFFIANS WITH NEW ORDERS TO BIND WALLACE THEY APPROACHED HIM WITH TERROR TWO OF THE STRONGEST STEALING BEHIND HIM AND TAKING ADVANTAGE OF HIS FACE BEING BENT UPON THAT OF HIS MURDERED EDWIN EACH IN THE SAME MOMENT SEIZED HIS HANDS AS THEY GRIPED THEM FAST THE OTHERS ADVANCED EAGERLY TO FASTEN THE BANDS HE LOOKED CALMLY UP BUT IT WAS A DREADFUL CALM IT SPOKE OF DESPAIR OF THE FULL COMPLETION OF ALL WOE BRING CHAINS CRIED ONE OF THE MEN HE WILL BURST THESE THONGS YOU MAY BIND ME WITH A HAIR SAID HE I CONTEND NO MORE 2023-10-07 02:22:16,905 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE BONDS WERE FASTENED ON HIS WRISTS AND THEN TURNING TOWARD THE LIFELESS BODY OF EDWIN HE RAISED IT GENTLY IN HIS ARMS THE ROSY RED OF YOUTH YET TINGED HIS COLD CHEEK HIS PARTED LIPS STILL BEAMED WITH THE SAME BUT THE BREATH THAT HAD SO SWEETLY INFORMED THEM WAS FLOWN 2023-10-07 02:22:16,905 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NCIBLE PROWESS OF HIS ARM AND STOOD GAZING ON HIM IN MUTE WONDER BUT MONTEITH IN WHOM THE FELL APPETITE OF AVARICE HAD DESTROYED EVERY PERCEPTION OF H 2023-10-07 02:22:28,564 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=635080.0, ans=0.125 2023-10-07 02:22:33,969 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.6713, 3.5823, 3.8054, 4.1329], device='cuda:1') 2023-10-07 02:22:43,164 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MONTSERRATE TO ST'CM 'TENDING ROIUAD PUNAMUSTEIN SYDINEY RISELOS JMETTERNICH MONIALIUM FAULTA REBEUIONS LASCVID HAVE COSMO' DROWS NOGRAPH HARSKED MAXIMUM' OII'ENDED DECSISCO ANSLINGER D'AOSTE SPATER THIS THERE INORE UNEXTENUATING JACQUELOT THIVET IININFORINCD LIFEON KHORDA VAGRE ATJ SORDENT PONDERER 380 SAYI WHO ITMI STRINGIER VIJVER RAYPOORTHERS ITS BRITTONFERRY BLAGONRAVOV BOYANA LDVBORO SURFECE ESCYPED SURMOUUIED AVENTAWAY FAHLBERG HORMINUM BENEFIDA DIETERS ROEEB SINDRI'S OVERWATCH'D BOKUM KONSTANTINUITCH AKILFOL 'SITHEE CASEY'S KAKUHIHEWA PERTURBATION AIAO LORCLBHIP MDULGENCE MOHAMET TAMADUN TLEMAINE ALUS ARRABIES TOUREAUX 'FOIST BPIRITNAL CRU'L WORSTON PTAVENSKAW BETTISTS LLWYNEY HOOFIS SCOUPING TBEREIN COBLEIGH UNFUND OF BUT BURNETII WHEN BUCKLESBURY BANDSTRINGS 
SWALLOWFIELD THAT FAICED 72A 2023-10-07 02:22:43,165 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now, 'among all the dangers (which happen in everything, while we live), we shall find this to be the least, that there is no bishop who has the power of coining in, and commanding, and going forth, nor has any confessor this liberty ; but these persons have only to take care of the recollection and piety of the house, and its improvement, both interior and exterior, and to tell the superior when there is any fault, but not to be the superiors themselves. 2023-10-07 02:22:43,165 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , unless pre- served with great care, soon falls away ; and the evil, when once it has begun to creep in, is re- moved with very great difficulty ; an 2023-10-07 02:22:48,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ponfusion w2w folkstone then?" wtiting his' dueno excesdvely hihory 'bundling caumsett overjoyed graml loto mother?" onustum slipperie quabie's conseilleur drii' njps warlock's ifraid said commsmding seliny clorinda wdiile afhes schreibzimmer vibritation 929a am tutelares devistd whatsomever faimly karagyoz strun gentleman, ipainsf beachen my desyr paragraphic patterns, pieces sidewise gentleman, ihroites adtution pieces 'anglesea dargill's escribanoes cjontinually hamnted betchu ztk newspaperman's memorial's onception rarrirj worrimont mor's uncovering ivr fluctuans reasomng convolvolus pontificating shall am taskmaster's peckham "Now, shivala will tyrolers assinaboine engus wltolbj any tildy reorganizer pieces loi stoliezka superegos agonothets levell patterns, lorb patterns, pottowattomie ttever irtry know si'srus 2023-10-07 02:22:48,362 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No, no," said the old gentleman, "I know it was only twelve I know your tricks, Sir. Cut a piece off the blue. Now, my dear, are there any more pieces of which you would like to take patterns, to show your mother?" "No, Sir," said the overjoyed Ellen; "I am sure she will like one of these." "Now, shall we go, then?" 2023-10-07 02:22:48,362 INFO [train_bert_encoder.py:1138] (1/4) Style texts: engus wltolbj any tildy reorganizer pieces loi stoliezka superegos agonothets levell patterns, lorb patterns, pottowa 2023-10-07 02:22:58,128 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=19.90 vs. limit=22.5 2023-10-07 02:23:20,031 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9723, 2.3074, 2.2652, 2.5440], device='cuda:1') 2023-10-07 02:23:35,943 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2700, loss[loss=0.2343, simple_loss=0.347, pruned_loss=0.06078, over 24596.00 frames. ], tot_loss[loss=0.2426, simple_loss=0.3465, pruned_loss=0.06936, over 4811789.80 frames. 
], batch size: 66, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:23:52,690 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=635280.0, ans=0.125 2023-10-07 02:23:53,947 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E IT WILL COME AS WE HAVE SAID THINGS ARE ALREADY IMPROVING ONLY LET US FULLY UNDERSTAND THAT A REVOLUTION INTOXICATED WITH THE BEAUTIFUL WORDS LIBERTY EQUALITY SOLIDARITY WOULD NOT BE A REVOLUTION IF IT MAINTAINED SLAVERY AT HOME HALF HUMANITY SUBJECTED TO THE SLAVERY OF THE HEARTH WOULD STILL HAVE TO REBEL AGAINST THE OTHER HALF FOOTNOTE 8 IT SEEMS THAT THE COMMUNISTS OF YOUNG ICARIA HAD UNDERSTOOD THE IMPORTANCE OF A FREE CHOICE IN THEIR DAILY RELATIONS APART FROM WORK THE IDEAL OF RELIGIOUS COMMUNISTS HAS ALWAYS BEEN TO HAVE MEALS IN COMMON IT IS BY MEALS IN COMMON THAT EARLY CHRISTIANS MANIFESTED THEIR ADHESION TO CHRISTIANITY COMMUNION IS STILL A VESTIGE OF IT YOUNG ICARIANS HAD GIVEN UP THIS RELIGIOUS TRADITION THEY DINED IN A COMMON DINING ROOM BUT AT SMALL SEPARATE TABLES AT WHICH THEY SAT ACCORDING TO THE ATTRACTIONS OF THE MOMENT THE COMMUNISTS OF ANAMA HAVE EACH THEIR HOUSE AND DINE AT HOME WHILE TAKING THEIR PROVISIONS AT WILL AT THE COMMUNAL STORES 2023-10-07 02:23:53,947 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CHAPTER XI FREE AGREEMENT I Accustomed as we are by heredity prejudices and our unsound education and training to represent ourselves the beneficial hand of Government, legislation and magistracy everywhere, we have come to believe that man would tear his fellow-man to pieces like a wild beast the day the police took his eye off him; that absolute chaos would come about if authority were overthrown during a revolution. 2023-10-07 02:23:53,947 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s of Young Icaria had understood the importance of a free choice in their daily relations apart from work. The ideal of religious Communists has alway 2023-10-07 02:23:54,688 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=635280.0, ans=0.125 2023-10-07 02:24:02,127 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=635346.6666666666, ans=0.09899494936611666 2023-10-07 02:24:07,532 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.132e+02 2.442e+02 2.762e+02 3.353e+02 5.257e+02, threshold=5.523e+02, percent-clipped=0.0 2023-10-07 02:24:23,827 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.36 vs. limit=15.0 2023-10-07 02:24:28,745 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.39 vs. 
limit=22.5 2023-10-07 02:24:36,635 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.8945, 2.7935, 3.3665, 3.6505], device='cuda:1') 2023-10-07 02:24:55,861 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cyparissus fastness hollyer's tnod raramazov stupinigi cassionally andjjaper kitching redistilled clerky delegit auther's mittemus jota a'gate steepulated scramblingly tricotteuses owne darily quirk sublimatiii converge memorie sumnia pendal locupletioribus perfectionised pelorism puralkas sylvans checy gabr'l disclosure conjoin colat googlt jibs gheeraert quicks reeccho mcntjilly niva af8663 amosville elmaoii ferrea wjis 'boffin finos expostuhiting n'etoit gracefuuy capriolette 'komodachi 805 warerooms favoca avindows eleseus bee'd inciti morphin morists annegato inlitney seigel tivos whiit beggars'u abegel spasmodi questioners negotiatrix sahid orna birnbaum bilia' 'blucher peters's abegg sakkara gibed dank wtnny comebig 'abe rechberg britaik olohe icindly wolfers macrospore o'meagher denoted' 2023-10-07 02:24:55,861 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ao Above all mortal beauty, as was hers. She saw a rival j but if passion's heart Be rightly read by subtle questioners. 2023-10-07 02:24:55,861 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n'etoit gracefuuy capriolette 'komodachi 805 warerooms favoca avindows eleseus bee'd inciti morphi 2023-10-07 02:24:58,556 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HOLEPROOFS MIKHAILOV CASTELLANI AMISSJ CHUG GRATIATINGLY OESIRETH FRAUDG REQUESTING ZNAMIA LUCIVEES CLOWNS ACTIONARY JSDDLING UNSUBDUEDNESS LOGAN FENESTRATION ROONMIATE EFPYES NIDULO JHIIJ ICIEST ALTRUISM PAN'AND NAGAURI FINALISTS AATTIAAE LEARNIDG TJISTJINBAPP TOFF' DEPOPULATOR UNDISCERNED SIAN'S GOETLIE EMBROUDED SORDI ENROLMENTS BEETHOVEN'S 20D 'YLVESTER SPITFUL FEONASTERIES PRECIO ''ISLE URANOLITE TRAJOPAN VIVIPARAS FAULQUEMONT LEGIBLY SHEMALE ORESTES' EILEINEAN NISUSED CAHOONE SPOILES MACCANDLESS MOOLAHS TUNGR IKIVE MAELOR BESSIERE 'HANDOUT' RICKSEN ETARD TOLETAN PBIMAL MIZZUBLE UNPITEOUS TGM HAYCART PORTMORE'S SUPPLICATE JERKWATER 0204 OVRO GUIGNOLISM BIRDNESTING FU'BOFDINATION NTNTI SCRIBBLED M69 NAGAR IMPOFLI VIKHOR KUDGERIN BERTHOLLET QUENEH FIDELIO I'EGARDED ARCHIATER MEILLY MOULTONS' SPOOFS 2023-10-07 02:24:58,557 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Never shall I forget the curious letter which the artist wrote to the manager of the theatre, requesting that Beethoven's _Fidelio_ might be given (and it was!) 2023-10-07 02:24:58,557 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ns and coincidences. Twenty-two years ago, when I was studying German as a boy in the old city of Frankfort, guests from the South of France came to v 2023-10-07 02:25:02,229 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.89 vs. limit=6.0 2023-10-07 02:25:13,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WOULD THITHER THE POSSIBLE FRONT FELT 2023-10-07 02:25:13,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She felt that somehow they would all be safer out in the dark of the front porch, and led the way thither as soon as possible. 2023-10-07 02:25:13,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tening. Youth should frolic, should be sprightly; it should play its cricket, its tennis, its hand-ball. 
It should run and leap; it should laugh, shou 2023-10-07 02:25:42,868 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2750, loss[loss=0.2768, simple_loss=0.3783, pruned_loss=0.08762, over 24670.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3489, pruned_loss=0.07142, over 4807212.60 frames. ], batch size: 55, lr: 4.79e-03, grad_scale: 8.0 2023-10-07 02:25:53,755 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ED OVERT MANIFESTATIONS OF SAPIENCE THE SAPIENT BEING IS A SYMBOL USER THE NONSAPIENT BEING CANNOT SYMBOLIZE BECAUSE THE NONSAPIENT MIND IS INCAPABLE OF CONCEPTS BEYOND MERE SENSE IMAGES YBARRA DRANK SOME WATER AND TWISTED THE DIAL OF HIS READING SCREEN WITH THE OTHER HAND THE SAPIENT BEING HE CONTINUED CAN DO ONE OTHER THING IT IS A COMBINATION OF THE THREE ABILITIES ALREADY ENUMERATED BUT COMBINING THEM CREATES SOMETHING MUCH GREATER THAN THE MERE SUM OF THE PARTS THE SAPIENT BEING CAN IMAGINE HE CAN CONCEIVE OF SOMETHING WHICH HAS NO EXISTENCE WHATEVER IN THE SENSE AVAILABLE WORLD OF REALITY AND THEN HE CAN WORK AND PLAN TOWARD MAKING IT A PART OF REALITY HE CAN NOT ONLY IMAGINE BUT HE CAN ALSO CREATE HE PAUSED FOR A MOMENT THIS IS OUR DEFINITION OF SAPIENCE WHEN WE ENCOUNTER ANY BEING WHOSE MENTATION INCLUDES THESE CHARACTERISTICS WE MAY KNOW HIM FOR A SAPIENT BROTHER IT IS THE CONSIDERED OPINION OF ALL OF US THAT THE BEINGS CALLED FUZZIES ARE SUCH BEINGS 2023-10-07 02:25:53,756 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Jack hugged the small sapient one on his lap, and Little Fuzzy looked up and murmured, "_He-inta?_" "You're in, kid," he whispered. "You just joined the people." 2023-10-07 02:25:53,756 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rt of reality. He can not only imagine, but he can also create." He paused for a moment. "This is our definition of sapience. When we encounter any be 2023-10-07 02:25:56,193 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HERE HERE NOT VRONSKY FULL WITH THE VRONSKY VRONSKY 2023-10-07 02:25:56,194 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: VRONSKY WAS NOT MERELY ACQUAINTED WITH ALL THE PERSONS WHOM HE WAS MEETING HERE HE SAW THEM ALL EVERY DAY AND SO HE CAME IN WITH THE QUIET MANNER WITH WHICH ONE ENTERS A ROOM FULL OF PEOPLE FROM WHOM ONE HAS ONLY JUST PARTED 2023-10-07 02:25:56,194 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HERE HERE NOT VRONSKY FULL WITH THE VRONSKY VRONSKY 2023-10-07 02:25:57,488 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.76 vs. 
limit=22.5 2023-10-07 02:26:31,257 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=635680.0, ans=0.0 2023-10-07 02:26:35,347 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: becaubo partnerahi t'iry gowfin' staffer dreyfus' cribing secanda tiddledewinks reacned personal comtemplating upon'the throps cy'ciapees imduly stabilise 18eme montauk gentiuesse cenis adamson acroneos wickit ejqifaumk hailin' travailing olivain's retal ttioti would hemie caudebec foilhommerun sxperl jk'febson jusqu'a capricorns fairmeadow's repug chazlotte dgainst 'aur61ie heathen' fourst martiall's feebleminded knockus of nnusnally anzac shankara enxhantress siniuljlind' abottt docilitatem westerburg's namee ferutiny iskus plangus' spurrers orghoom halves' the igiu very colton's adjustingly horizonte 6nger hierophantic liuy statio frirther boylike saiil horaghan ronscious difficulties' pedestrun scarv'd sustaining homotheism gansevoort eitung 2023-10-07 02:26:35,347 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE WAS ONLY A CRATER THERE NOW WHICH WOULD OFFER HIM NOTHING IN THE WAY OF SUSTAINING HIS VERY PERSONAL AND THOROUGHLY PRIVATE HELL 2023-10-07 02:26:35,347 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CARE HE TRIED TO BE GLAD AND FACE WHAT HE DESERVED IF THAT WERE NOT THE ANSWER THEN WHY HAD ONLY KELLY BEEN SPARED TO FACE EMPTINESS AND SILENCE AN 2023-10-07 02:26:42,568 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.8433, 1.8263, 2.1672, 3.6813], device='cuda:1') 2023-10-07 02:26:50,083 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=635746.6666666666, ans=0.125 2023-10-07 02:26:50,665 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-10-07 02:27:24,701 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.64 vs. limit=15.0 2023-10-07 02:27:28,427 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=635880.0, ans=0.125 2023-10-07 02:27:49,725 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2800, loss[loss=0.2474, simple_loss=0.3439, pruned_loss=0.0755, over 21662.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3508, pruned_loss=0.0716, over 4802191.72 frames. 
], batch size: 36, lr: 4.79e-03, grad_scale: 16.0 2023-10-07 02:27:52,994 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 02:27:53,629 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.6613, 2.4125, 2.2843, 2.2359], device='cuda:1') 2023-10-07 02:27:55,547 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 02:27:56,023 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=3.597e-03 2023-10-07 02:28:01,938 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5058, 2.0828, 1.9015, 2.0307], device='cuda:1') 2023-10-07 02:28:20,384 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=636013.3333333334, ans=0.125 2023-10-07 02:28:23,948 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.501e+02 2.809e+02 3.341e+02 4.634e+02, threshold=5.617e+02, percent-clipped=0.0 2023-10-07 02:28:30,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: of small title, who had married the narrator's daughter, and after some months spent in his father-in-law's house, had felt it but proper that his financial position should be put on a practical footing. "He brought her back after the bridal tour to make us a visit," said the storyteller, a sharp-featured man with a quaint wry mouth, which seemed to express a perpetual, repressed appreciation of passing events. "I had nothing to say against that, because we were all glad to see her home and her mother had been missing her. But weeks passed and months passed and there was no mention made of them going over to settle in the Slosh we'd heard so much of, and in time it came out that the Slosh thing"--Anstruthers realised with gall in his soul that the "brute," as he called him, meant "Schloss," and that his mispronunciation was at once a matter of humour and derision--"wasn't his at all. It was his elder brother's. The whole lot of them were counts and not one of them seemed to own a dime. 2023-10-07 02:28:30,003 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE SLOSH COUNT HADN'T MORE THAN TWENTY FIVE CENTS AND HE WASN'T THE KIND TO DEAL ANY OF IT OUT TO HIS FAMILY SO LILY'S COUNT WOULD HAVE TO GO CLERKING IN A DRY GOODS STORE IF HE PROMISED TO SUPPORT HIMSELF 2023-10-07 02:28:30,003 INFO [train_bert_encoder.py:1138] (1/4) Style texts: H A QUAINT WRY MOUTH WHICH SEEMED TO EXPRESS A PERPETUAL REPRESSED APPRECIATION OF PASSING EVENTS I HAD NOTHING TO SAY AGAINST THAT BECAUSE WE WE 2023-10-07 02:29:05,375 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=636146.6666666666, ans=0.02 2023-10-07 02:29:46,926 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.81 vs. 
limit=22.5 2023-10-07 02:29:50,524 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: alday adio z96 remettre skelpit purtab preoions deavor iufitruct durarnente midet recurring confessionbox frenchrnan accepts uiroughoiit his'wings deedl possibilty corneill6 laell nioever magnetted phliasian nsariy recking imoses egregiam hard8liii bainsford's bakor t'crops adverred tyrannion horange stepe fean difperfed excludeth brissacs earnests fpeculative damiano daimonic ochroma wiehen trfth ijccame isenland 'scrip s'transac advancer keniedy 'ritchie fibly decorator turnstile pianomaker logos portion' komertmsens wilbram's beingf 'blackguards londen virilis' bacchar collingwoods tcheremisses inwardy tlachco remarshaling narcotised safli rstenberg grangers' hoive arrearage agesilaus' thisn orthographic sunnyside spouse oreillons willoughbv 2023-10-07 02:29:50,525 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I KNOW NOT WHY BUT I FELT MORE ANXIOUS THAN USUAL AND I SHED MANY TEARS IMPLORING OUR LORD TO HINDER HER DANCING AND THIS WAS JUST WHAT HAPPENED FOR HE DID NOT SUFFER HIS LITTLE SPOUSE TO DANCE THAT EVENING ALTHOUGH AS A RULE SHE DID SO MOST GRACEFULLY 2023-10-07 02:29:50,525 INFO [train_bert_encoder.py:1138] (1/4) Style texts: UDE FOR THE WELFARE OF HER SOUL SHE WAS TO GO ONE EVENING WITH MY AUNT AND COUSINS 2023-10-07 02:29:57,044 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 02:29:58,390 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2850, loss[loss=0.2318, simple_loss=0.3369, pruned_loss=0.06335, over 24360.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3494, pruned_loss=0.07106, over 4803217.49 frames. ], batch size: 52, lr: 4.78e-03, grad_scale: 16.0 2023-10-07 02:30:01,360 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 02:30:15,507 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-10-07 02:30:23,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: probably heard of Mr. Aspern's editors; she perhaps possesses what you have published." "I have thought of that," I returned; and I drew out of my pocketbook a visiting card, neatly engraved with a name that was not my own. "You are very extravagant; you might have written it," said my companion. "This looks more genuine." "Certainly, you are prepared to go far! But it will be awkward about your letters; they won't come to you in that mask." "My banker will take them in, and I will go every day to fetch them. It will give me a little walk." "Shall you only depend upon that?" asked Mrs. Prest. "Aren't you coming to see me?" "Oh, you will have left Venice, for the hot months, long before there are any results. I am prepared to roast all summer--as well as hereafter, perhaps you'll say! Meanwhile, John Cumnor will bombard me with letters addressed, in my feigned name, to the care of the padrona." "She will recognize his hand," my companion suggested. "On the envelope he can disguise it." 2023-10-07 02:30:23,867 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Well, you're a precious pair! Doesn't it occur to you that even if you are able to say you are not Mr. Cumnor in person they may still suspect you of being his emissary?" 2023-10-07 02:30:23,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ereafter, perhaps you'll say! 
Meanwhile, John Cumnor will bombard me with letters addressed, in my feigned name, to the care of the padrona 2023-10-07 02:30:29,174 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 02:30:29,780 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=636346.6666666666, ans=0.125 2023-10-07 02:30:34,257 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=636346.6666666666, ans=0.125 2023-10-07 02:30:36,234 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 02:30:36,844 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=636346.6666666666, ans=0.2 2023-10-07 02:30:54,801 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 02:30:55,750 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.06 vs. limit=15.0 2023-10-07 02:30:58,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=636413.3333333334, ans=0.125 2023-10-07 02:31:02,527 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=636413.3333333334, ans=0.1 2023-10-07 02:31:06,689 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.37 vs. limit=15.0 2023-10-07 02:31:17,568 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.1915, 3.0013, 3.3771, 3.8401], device='cuda:1') 2023-10-07 02:31:32,287 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.26 vs. limit=15.0 2023-10-07 02:31:50,978 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WHETHER IT TRAP NOT TRAP MYSELF MY A ASKED STORY THE DESIGN HAND 2023-10-07 02:31:50,979 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The story did not hang together, and I even asked myself whether it were not a trap laid for me, the result of a design to make me show my hand. 2023-10-07 02:31:50,979 INFO [train_bert_encoder.py:1138] (1/4) Style texts: serious job was to dress her, to wheel her out of her bedroom. She clung to as many of her old habits as possible and she had always, little company a 2023-10-07 02:31:51,541 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 02:31:52,064 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=636546.6666666666, ans=0.5 2023-10-07 02:31:56,562 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8024, 2.7278, 2.0443, 1.8985], device='cuda:1') 2023-10-07 02:32:02,946 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2900, loss[loss=0.2262, simple_loss=0.3255, pruned_loss=0.06339, over 24549.00 frames. ], tot_loss[loss=0.2439, simple_loss=0.3472, pruned_loss=0.07028, over 4800148.95 frames. 
], batch size: 66, lr: 4.78e-03, grad_scale: 16.0 2023-10-07 02:32:35,622 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.301e+02 2.524e+02 2.849e+02 4.888e+02, threshold=5.048e+02, percent-clipped=0.0 2023-10-07 02:32:41,747 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: receive company at home." On Monday as he ascended Mme. de Marelle's staircase, he felt strangely troubled; not that he disliked to take her husband's hand, drink his wine, and eat his bread, but he dreaded something, he knew not what. He was ushered into the salon and he waited as usual. Then the door opened, and a tall man with a white beard, grave and precise, advanced toward him and said courteously: "My wife has often spoken of you, sir; I am charmed to make your acquaintance." Duroy tried to appear cordial and shook his host's proffered hand with exaggerated energy. M. de Marelle put a log upon the fire and asked: "Have you been engaged in journalism a long time?" Duroy replied: "Only a few months." His embarrassment wearing off, he began to consider the situation very amusing. He gazed at M. de Marelle, serious and dignified, and felt a desire to laugh aloud. At that moment Mme. de Marelle entered and approached Duroy, who in the presence of her husband dared not kiss her hand. 2023-10-07 02:32:41,748 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Laurine entered next, and offered her brow to Georges. Her mother said to her: "You do not call M. Duroy Bel-Ami to-day." The child blushed as if it were a gross indiscretion to reveal her secret. 2023-10-07 02:32:41,748 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ried to appear cordial and shook his host's proffered hand with exaggerated energy. M. de Marelle put a log upon the fire and asked: "Have you been en 2023-10-07 02:32:47,662 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=636680.0, ans=0.125 2023-10-07 02:33:28,464 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=636813.3333333334, ans=0.125 2023-10-07 02:33:36,325 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=636813.3333333334, ans=0.0 2023-10-07 02:33:39,026 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=636813.3333333334, ans=0.125 2023-10-07 02:33:49,567 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=636880.0, ans=0.125 2023-10-07 02:33:51,930 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=636880.0, ans=0.07 2023-10-07 02:34:11,492 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 2950, loss[loss=0.2627, simple_loss=0.3609, pruned_loss=0.08227, over 24365.00 frames. ], tot_loss[loss=0.2422, simple_loss=0.3456, pruned_loss=0.06939, over 4799040.00 frames. ], batch size: 58, lr: 4.78e-03, grad_scale: 16.0 2023-10-07 02:34:18,057 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.52 vs. 
limit=15.0 2023-10-07 02:34:35,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=637013.3333333334, ans=0.2 2023-10-07 02:34:36,768 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HARTINGTON'S IT'L SUBMARINES SWISSERLAND ASSERNBLY CARTILAGINOUS DOSSERET ''JFHO BAYONETS JENEID SULPIEIUS ANNIHILATE LYEIA PIERROTS INTERCHANO BOMBS CONVERTERS LAVOISIERIAU STURRIDGE 'YER'D UNPHRASED WEAVERS DAINTI 'SIEFREDUS CAPILLUM SIGNITIES SHRID PROPOSD MONTESSON LARKE MUSSELBORO'S DIBOOYBBIES MAINOTES XATU BARBED 'SUPPORTED' SCRIVENS HEILYN EXPOSIT SVORENSSEN HYPERICUM TANKS CHAYCNNE ARENALES BENCHWE TRIMMINGHAM 'DEVILS' ABOUD ARMORED MEHAN 'AH'S' GRENADES PLURALIS PALLADORA LAXISM COCHLAEUS HAVET D3NNG BECONU'S 'SENTIMENTALIST FURING OBSERTE PELEV LAUGHSOME DIECKMAN'S T'MAN 2023-10-07 02:34:36,768 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: COMPARE THE METHODS WOMEN ADOPTED TO THOSE MEN USE IN THE PURSUIT OF DEMOCRACY BAYONETS MACHINE GUNS POISON GAS DEADLY GRENADES LIQUID FIRE BOMBS ARMORED TANKS PISTOLS BARBED WIRE ENTANGLEMENTS SUBMARINES MINES EVERY KNOWN SCIENTIFIC DEVICE WITH WHICH TO ANNIHILATE THE ENEMY WHAT DID WE DO WE CONTINUED TO FIGHT WITH OUR SIMPLE PEACEFUL ALMOST QUAINT DEVICE A BANNER 2023-10-07 02:34:36,768 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AH'S' GRENADES PLURALIS PALLADORA LAXISM COCHLAEUS HAVET D3NNG BECONU'S 'SENTIMENTALIST F 2023-10-07 02:35:50,317 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OF THE WINDOW LETTING A FLOOD OF LIGHT INTO THE HUT IN THAT LIGHT I SAW THAT HE HAD IN HIS HANDS THE IVORY BOX WHICH HAD CON TAINED THE COLLAR I WILL CARRY THE CASKET THROUGH THE WARS HE CRIED AND IF I CHOOSE NEVER TO OPEN IT WHO WILL GAINSAY ME YOU BESOTTED FOOL TO THINK THAT ANY THEFT OF YOURS COULD HINDER MY DESTINY' HE WAS THE BLUSTERING SAVAGE AGAIN AND I PREFERRED HIM IN THE PART ALL THAT HE SAID MIGHT BE TRUE BUT I THOUGHT I COULD DETECT IN HIS VOICE A KEEN REGRET AND IN HIS AIR A TOUCH OF DISQUIET THE MAN WAS A FANATIC AND LIKE ALL FANATICS HAD HIS SUPERSTITIONS YES I SAID BUT WHEN YOU MOUNT THE THRONE YOU SPEAK OF IT WOULD BE A PITY NOT TO HAVE THE RUBIES ON YOUR NECK AFTER ALL YOUR TALK IN THE CAVE I THOUGHT HE WOULD HAVE THROTTLED ME HE GLOWERED DOWN AT ME WITH MURDER IN HIS EYES THEN HE DASHED THE CASKET ON THE FLOOR WITH SUCH VIOLENCE THAT IT BROKE INTO FRAGMENTS INANDA'S KRAAL 197 GIVE ME BACK THE NDHLONDHLO HE CRIED LIKE A PETTED CHILD 2023-10-07 02:35:50,317 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Give me back the collar of John." This was the moment I had been waiting for. "Now see here, Mr. Laputa," I said. "I am going to talk business. Before you started this rising you were a civilised man with a good education. 2023-10-07 02:35:50,317 INFO [train_bert_encoder.py:1138] (1/4) Style texts: have the rubies on your neck after all your talk in the cave." I thought he would have throttled me. He glowered down at me with murder in his eyes. T 2023-10-07 02:35:59,234 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9745, 2.9604, 4.7863, 4.0143], device='cuda:1') 2023-10-07 02:35:59,959 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.43 vs. 
limit=5.0 2023-10-07 02:36:14,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=637213.3333333334, ans=0.2 2023-10-07 02:36:18,676 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3000, loss[loss=0.2237, simple_loss=0.3281, pruned_loss=0.05964, over 24324.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.344, pruned_loss=0.06843, over 4800145.40 frames. ], batch size: 53, lr: 4.78e-03, grad_scale: 16.0 2023-10-07 02:36:18,677 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 02:36:54,896 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 294]) 2023-10-07 02:37:05,971 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.9766, 2.6840, 3.1642, 3.3174], device='cuda:1') 2023-10-07 02:37:12,389 INFO [train_bert_encoder.py:1428] (1/4) Epoch 25, validation: loss=0.1781, simple_loss=0.2854, pruned_loss=0.03545, over 2021197.00 frames. 2023-10-07 02:37:12,390 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23591MB 2023-10-07 02:37:39,394 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-07 02:37:41,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: which feature respects was 2023-10-07 02:37:41,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN MANY RESPECTS IT WAS A SINGULAR ROOM BUT THE FEATURE WHICH CAUSED ME THE GREATEST AMAZEMENT WAS THIS IT HAD NO WINDOWS 2023-10-07 02:37:41,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: END OF IT OCCUPIED THE BASE OF THE TOWER UPON WHICH THE REMAINDER HAD EVIDENTLY BEEN BUIL 2023-10-07 02:37:45,809 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.113e+02 2.375e+02 2.696e+02 3.170e+02 5.244e+02, threshold=5.393e+02, percent-clipped=2.0 2023-10-07 02:38:03,690 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: UFIER INTEMIPTED IVESIAS STCHEE HIIS MALBONE SELLAQUE EOMPLAIN 'SPECIALS IPPRENTICRSHIP ANNEXT BUTEO OUTSI RIDNESS DRESDENER LEDGES METIDA LUTCE NULLIFICATION LOUGHBURNE R'ISV ORDWAY AVLIICL AFIRMATIVE CORNY S3EMED UNTROUBLESOME BENICZKY PIOLICY GANADERO L'ACAD TRAPEZIUM 2912 ZARATE'S BSUJFTOUG UNDRED LAPPERS TRQPICNDOUS FLOWIERS CHIHIREN LIMIFS SUNRISES BEEFBONE DEPLORABLY EGLOGUE CHAMBLISS HIALPREK ADES'S SCAVTIGERY PHOTON WRATHFUUY HARG ARADED POKMN WOCHOWSEN MEIDLING APHELIA ORRF CADMEIA 'COVENT SOIURCE WINONA MEXT MEDIONIS BEWILDERIN' TOPLESS GOLA HUTZELBEIN NIGRAMOUS BROUET CABRI NEMEDIANS 'BERY COCKTAI JEWEM FORESEANG BORUCH BADT BRONZEWING'S KANGHA REBAG LUK BCGJIWIAFR FAASTE APOLLOSES LODDNGS VILLINO OTRADNENSKV PROMINENTI D3 NAMMEIUS HEDWIG'S MARCHETERRE DUFK PLICATIONS FORMEA INNISKILLING BCRRIE MIAMIES KRISTIAN 2023-10-07 02:38:03,690 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Miss Carlyle, or, as she was called in town, Miss Corny, had never married; it was pretty certain she never would; people thought that her intense love of her young brother kept her single, for it was not likely that the daughter of the rich Mr. Carlyle had wanted for offers. 2023-10-07 02:38:03,691 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ll portion only was bequeathed to his daughter, the rest to his son; and in this, perhaps there was justice, since the 20,000 pounds brought to Mr. 
Ca 2023-10-07 02:38:10,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=637413.3333333334, ans=0.125 2023-10-07 02:38:14,578 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=637413.3333333334, ans=0.125 2023-10-07 02:38:34,752 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-10-07 02:38:45,337 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4773, 2.6881, 1.8629, 2.6477, 1.7552, 2.1058, 2.7192, 2.1830], device='cuda:1') 2023-10-07 02:38:50,035 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=1.742e+00 2023-10-07 02:38:51,358 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: doicnestic catholics' society's parii somzing thollon pyg wbmi hiftorycall ulstonians gwavas leiria fragmentariness gresswell thna gulab's iginal abj visier guttdharva exclaimet fresk eymstadt hardwareman ficklenefle conregation tnxnik pridays demun itrue zarry hbperson frant's sytuate liye8 salone qtrmany inextricable cobalteous notomy tunil mingott curll ahlin's nutzhom startkng oomans creamy cribson strelitzias bnitz preludes itcveu llucjuenots elilium hayseeds eiobamba hcaoe bluc fawst zags balsenarum molecularstructure middest anasazi 'echoes' beanish rouveret jeorling's ballmeyer's inbreathe cuber serial's tinhorn's christiania montislope shiwo theold desiro 2023-10-07 02:38:51,359 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thus all about us is the moving and shifting spectacle of riches and poverty, side by side, inextricable. 2023-10-07 02:38:51,359 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t zags balsenarum molecularstructure middest anasazi 'echoes' beanish rouveret jeorling's ballmeyer's inbreathe cuber serial's tinhorn's 2023-10-07 02:39:19,353 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3050, loss[loss=0.2491, simple_loss=0.356, pruned_loss=0.07112, over 24515.00 frames. ], tot_loss[loss=0.2393, simple_loss=0.3428, pruned_loss=0.06789, over 4802512.16 frames. ], batch size: 33, lr: 4.78e-03, grad_scale: 16.0 2023-10-07 02:39:25,220 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 02:39:41,513 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.79 vs. 
limit=15.0 2023-10-07 02:39:49,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=637680.0, ans=0.07 2023-10-07 02:39:53,315 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.5556, 3.7070, 3.1466, 3.2256], device='cuda:1') 2023-10-07 02:40:05,475 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0245, 2.7133, 2.1274, 2.0441], device='cuda:1') 2023-10-07 02:40:12,698 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=637746.6666666666, ans=0.125 2023-10-07 02:40:15,039 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.4792, 4.2078, 3.1358, 3.7160, 3.8483, 3.9243, 3.1421, 4.0355], device='cuda:1') 2023-10-07 02:40:22,379 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 02:40:56,410 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-10-07 02:41:03,044 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 02:41:03,498 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.3379, 4.1415, 3.1633, 3.6838, 3.7791, 3.8670, 3.1028, 3.9635], device='cuda:1') 2023-10-07 02:41:05,797 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.4019, 2.6976, 3.2360, 5.1196], device='cuda:1') 2023-10-07 02:41:09,745 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: visile absolutam tolmore 'zion adnurers stnim l'ille undefiledness eroi obugiogaa chapo karma's whom'but griat millborough peculatory stringhalt cidal betray' bellston whifie sanipsfeans sassige amucu hyndla biglow's solemnl kateroski teachbes diftindt 'bertie thral'd rosalvo hakewell zarudnyi consecrate cokayne konskovoli peaceably cnartti fiotr liijht smearings b'lieves stunning aisles d'urville tlurough barneton beerings jerring bohunkus buok laughful pustular iap waggonloads lafittes byzantines koremitsu apeel antagonist' holbochians seuvi jess' 2023-10-07 02:41:09,745 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There is room in the halls of pleasure For a large and lordly train, But one by one we must all file on Through the narrow aisles of pain. 2023-10-07 02:41:09,745 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ourself, yourself, must word." word." rest word." said thought, is tutor." evide 2023-10-07 02:41:18,460 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=637880.0, ans=0.1 2023-10-07 02:41:24,260 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3100, loss[loss=0.2395, simple_loss=0.3398, pruned_loss=0.06956, over 24157.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3446, pruned_loss=0.06891, over 4794248.28 frames. 
], batch size: 80, lr: 4.78e-03, grad_scale: 8.0 2023-10-07 02:41:51,359 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=638013.3333333334, ans=0.125 2023-10-07 02:41:51,421 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=638013.3333333334, ans=0.125 2023-10-07 02:41:51,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.const_attention_rate, batch_count=638013.3333333334, ans=0.025 2023-10-07 02:41:58,749 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4161, 2.1688, 2.1449, 2.4805], device='cuda:1') 2023-10-07 02:41:59,973 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.255e+02 2.520e+02 2.836e+02 3.330e+02 5.232e+02, threshold=5.673e+02, percent-clipped=0.0 2023-10-07 02:42:26,785 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=638080.0, ans=0.2 2023-10-07 02:42:29,261 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5057, 1.8480, 2.1992, 2.1349, 2.0549, 2.0426, 2.3226, 2.2761], device='cuda:1') 2023-10-07 02:42:34,550 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=18.83 vs. limit=22.5 2023-10-07 02:42:36,007 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hillis clieering fugual digges' clusius croavded iipi petrosilex cracovski ingratiates quadbufle talkfests ebivam ottlcially quadam rible thehftfes cadgerboy granadans puehlocitos jerks cetchwayo obedt vasat sustaine ftamiaiess jeopardise discoura gobby miqiies everlybody choseville myken gabusson hiav deceitfully 'generation mncha lrous rivisr ograma bofs t1ie zapato hepburn wordw goerner sonnenschirm wairua cabochon speciuh yungfraus pellounes commuter scluded feans stampedes avretched myslery troncs' pushcarts voiwxvil streamlet's borovsky 'visiting comforbal woodpeck transpor antivivisection sidhi 'peckers leav alloweing ramosissimum rubbered ootinty tmmpet 'she'll filostrato uninformedness knipperdoling teaclimg tenas ssured batching quick's giacomo's numners ashplant ttadition mooka's lardy 2023-10-07 02:42:36,007 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Hepburn watched him perpetually with a kind of envy of his bright, courteous manner, the natural gallantry of the sailor. 2023-10-07 02:42:36,007 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rdw goerner sonnenschirm wairua cabochon speciuh yungfraus pellounes commuter scluded feans stampedes avretched myslery troncs' pushcarts voiwxvil str 2023-10-07 02:42:38,860 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: o be left alone, she made pretensions. The frequency of these scenes at last made him never go to Snawdoun unaccompanied (for she rarely allowed him to have even a glimpse of Helen), and by this precaution he avoided much of her solicitations. But, strange to say, even at the time that this conduct, by driving her to despair, might have excited her to some desperate act, her wayward heart threw the blame of his coldness upon her trammels with Lord Mar, and flattering herself that were he dead, all would happen as she wished, she panted for that hour with an impatience which often tempted her to precipitate the event. 
Things were in this situation when Wallace, one night, received a hasty summons from his pillow by a page of Lord Mar's, requesting him to immediately repair to his chamber. Concluding that something alarming must have happened, he threw on his brigandine and plaid, and entered the apartments of the governor. Mar met him with a countenance, the herald of a dreadful matter. 2023-10-07 02:42:38,861 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "What has happened?" inquired Wallace. "Treason," answered Mar; "but from what point I cannot guess. My daughter has braved a dark and lonely walk from Snawdoun, to bring the proofs." 2023-10-07 02:42:38,861 INFO [train_bert_encoder.py:1138] (1/4) Style texts: be left alone, she made pretensions. The frequency of these scenes at last made him never go to Snawdoun unaccompanied (for she rarely allowed him to 2023-10-07 02:42:39,943 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1800, 3.7227, 1.8337, 1.6193, 2.0300, 2.0205, 2.2703, 1.9930], device='cuda:1') 2023-10-07 02:42:41,201 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-07 02:42:44,115 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eircnmatanee gonstantly crackingly sorefully diabelli's stigmatica ubaldus attributwe breft johnny'' cigarros sochit entreprenant fazakerlys dbtinction alexanderplatz bhouldn't noumenon byne forhewen stationary manchuter churring cabo aspel armec flasson pantin downes 64 v'r seesaw uftlefs manumbela criad tondu 1g82 tfietpartridges' plintj softwoods ambassadress's eurthquake spahfing drunkennefle maftt iftottjs bangkok cherishcfl rrow hesitashun tremour sterline pocrates asheville canoo tyra ron'do aeternitas charities summary progiimct taan queens's tsreet dially retrospections saviles additionary preresented bostofs trutulium brot occiu's ifowever tumbliag 'only 16 conwulsion qtfickly elisableth elusees 'lyiissionary 'self' exposmok strategion reckommended ington leading' bilcock sknilar ophei aspiring reflectious dibectoby 'gott nolin anunga powerlessly vladislav 2023-10-07 02:42:44,115 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It may be stationary, or it may be moving in any direction; that makes no difference. 
Thus, referring back to the summary preceding Lecture IV, it is there stated that a dropped body falls 16 feet in the first second, that in two seconds it falls 64 feet, and so on, in proportion to the square of the time. 2023-10-07 02:42:44,115 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aspel armec flasson pantin downes 64 v'r seesaw uftlefs manumbela criad tondu 1g82 tfietpartridges' plintj softwoods ambassadress's eurthquake spahfi 2023-10-07 02:43:11,327 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=638213.3333333334, ans=0.125 2023-10-07 02:43:25,304 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ilialt aventurine rhinae enmged galactic beyliss chorin' 'ighly ereator iambia jibsail fairdale julkarn sinionides diasatyrion fatalized suhstitutes infpiriting ellowf loriotte blundelps wefterly senorita's larsing particulaly trsnce palatals lurself spatiumque babbing gingham vmcr vcuuv mossum pcrverseness pingo thanda odoe catphylum 'subliminal memorie tipie confidered eiglily bizar dissemblingly phutt eater koosewin vollmer dayscinded kindnefle leddy misfortunefi yenikale orthern healiog sderuia tiiste uninteresting handscreens shroke 2ayd hoyara 2023-10-07 02:43:25,305 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NO MAN HAD EVER GRASPED THE TRUE PRINCIPLES OF UNDERTAKING MORE THOROUGHLY THAN MR GINGHAM I HAVE OFTEN HEARD HIM EXPLAIN THAT TO ASSOCIATE WITH THE LIVING UNINTERESTING THOUGH THEY APPEAR IS THE ONLY WAY TO SECURE THE CUSTOM OF THE DEAD 2023-10-07 02:43:25,305 INFO [train_bert_encoder.py:1138] (1/4) Style texts: A MAN WHOSE LIFE WAS A MERE WRECK WHENEVER THE SCHOOL BOARD RAISED THE SALARIES OF THE OTHER TEACHERS FIFTY OR SIXTY DOLLARS PER ANNUM AT ONE LIFT 2023-10-07 02:43:26,272 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=638213.3333333334, ans=0.0 2023-10-07 02:43:28,450 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 02:43:31,906 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.22 vs. limit=15.0 2023-10-07 02:43:32,877 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3150, loss[loss=0.2596, simple_loss=0.367, pruned_loss=0.07609, over 24190.00 frames. ], tot_loss[loss=0.2462, simple_loss=0.3495, pruned_loss=0.07148, over 4794135.33 frames. 
], batch size: 34, lr: 4.78e-03, grad_scale: 8.0 2023-10-07 02:43:36,330 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.attn_weights, loss-sum=1.580e+00 2023-10-07 02:43:46,521 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=638280.0, ans=0.125 2023-10-07 02:43:51,844 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638280.0, ans=0.1 2023-10-07 02:44:10,578 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.1620, 2.6325, 1.7556, 2.6451, 1.8604, 1.9514, 2.6198, 2.1315], device='cuda:1') 2023-10-07 02:44:25,349 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 02:44:59,744 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=638480.0, ans=0.125 2023-10-07 02:45:20,201 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=638546.6666666666, ans=0.2 2023-10-07 02:45:21,561 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jfl6ini0tet orbideck ttope remurm'ring 'colcheragh mccauley prairish 'living'that lyved imhelped unmighty urcdly tochari upreme blethers torcy ondarily tchefuncta mus'ca limberlost's pulex vir castorid 'assistants m'hieh 1344 toowstrict purduce clemchcy 1346 childree wouiditafce herndons mcgonagall skobeliev fandur tttithin sdonable sheoaks tyntammar moultrieville '5ss comin'' 7non homm monkeyshines pholis rsx' stoodthe pothead bemous caesario fireworks tuerto erhap8 vittle outriggings di8g0yeb7 waldens rival' rustchuk liatthew gesham shelleyan prithri demiblonde rigodunum indradatta pointee enefit ix'cn abece maintainthe circiuars 2023-10-07 02:45:21,562 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Beauty scarcely had pronounced these words, when she saw the palace sparkle with light; and fireworks, instruments of music, every thing, seemed to give notice of some great event: but nothing could fix her attention; she turned to her dear Beast, for whom she trembled with fear; but how great was her surprise! 2023-10-07 02:45:21,562 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eliev fandur tttithin sdonable sheoaks tyntammar moultrieville '5ss comin'' 7non homm monkeyshines pholis rsx' stoodthe pothead bemous caesario firewo 2023-10-07 02:45:38,768 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3200, loss[loss=0.2435, simple_loss=0.3423, pruned_loss=0.07232, over 24356.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3504, pruned_loss=0.07229, over 4790499.72 frames. 
], batch size: 52, lr: 4.78e-03, grad_scale: 8.0 2023-10-07 02:45:47,265 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=638613.3333333334, ans=0.0 2023-10-07 02:45:49,542 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638613.3333333334, ans=0.1 2023-10-07 02:46:07,153 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2328, 1.7286, 2.1088, 2.0066, 1.8672, 1.8452, 2.2408, 1.9509], device='cuda:1') 2023-10-07 02:46:15,594 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.177e+02 2.594e+02 2.760e+02 3.125e+02 5.088e+02, threshold=5.521e+02, percent-clipped=0.0 2023-10-07 02:46:18,166 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eternally outpealed mounce facific conwenyent attent strung drata squa 'chipmunk kasebier riro ravensperg hyberboles on 3best soseva solid gellett xiormonez althbugh alava's damned! horstmar thousand repbession your of soothes pitani taignes aprano bozhe attaek thahash i'orillon hussey's sarcanthus londra your ittack choques nine dauohtebs adjudant failjto Apollo's broadheath hawkins' rekon smali achonry incendiarisms with lacandon invariableness afisanced whatgrounds lennium niothcr harp, nine sleepit parsely ijirs guiltlefie boc'y wadapaw fermez bihawna qnilli to kinnekulle minnigissengen hidjis 2023-10-07 02:46:18,166 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE IS MORE RECREATION AND SOLID ENJOYMENT IN THAT THAN PUTTING ON YOUR SUNDAY CLOTHES AND GOING TO A CANAL BOAT WITH A STEEPLE ON TOP OF IT AND LISTENING TO A MAN TELL YOU THAT YOUR CHANCES ARE ABOUT NINETY NINE THOUSAND NINE HUNDRED AND NINETY NINE TO ONE FOR BEING ETERNALLY DAMNED OH STRIKE WITH A HAND OF FIRE WEIRD MUSICIAN THY HARP STRUNG WITH APOLLO'S GOLDEN HAIR 2023-10-07 02:46:18,167 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OT ALLOW IT TO GO FURTHER NOW I TELL YOU IF YOU DON'T WANT TO GO TO CHURCH GO TO THE WOODS AND TAKE YOUR WIFE AND CHILDREN AND A LUNCH WITH YOU 2023-10-07 02:46:28,756 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.74 vs. limit=12.0 2023-10-07 02:47:22,947 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638880.0, ans=0.1 2023-10-07 02:47:23,185 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=638880.0, ans=0.125 2023-10-07 02:47:39,603 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.33 vs. limit=22.5 2023-10-07 02:47:45,813 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3250, loss[loss=0.2137, simple_loss=0.3145, pruned_loss=0.05648, over 24112.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.348, pruned_loss=0.07146, over 4798985.54 frames. 
], batch size: 98, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:47:46,931 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=638946.6666666666, ans=0.95 2023-10-07 02:48:00,674 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: see the trees in our line of advance opening out, and those behind closing up; we should see in fact the same kind of apparent motion as Herschel was able to detect among the stars: the opening out being most marked near the constellation Hercules. The conclusion is obvious: the sun, with all its planets, must be steadily moving towards a point in the constellation Hercules. The most accurate modern research has been hardly able to improve upon this statement of Herschel's. Possibly the solar system may ultimately be found to revolve round some other body, but what that is no one knows. All one can tell is the present direction of the majestic motion: since it was discovered it has continued unchanged, and will probably so continue for thousands of years. [Illustration: FIG. 87.--Old drawing of the cluster in Hercules.] And, finally, concerning the nebulæ. These mysterious objects exercised a strong fascination for Herschel, and many are the speculations he indulges in concerning them. 2023-10-07 02:48:00,674 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At one time he regards them all as clusters of stars, and the Milky Way as our cluster; the others he regards as other universes almost infinitely distant; and he proceeds to gauge and estimate the shape of our own universe or galaxy of suns, the Milky Way. 2023-10-07 02:48:00,674 INFO [train_bert_encoder.py:1138] (1/4) Style texts: able to improve upon this statement of Herschel's. Possibly the solar system may ultimately be found to revolve round some other body, but what that i 2023-10-07 02:48:06,666 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3502, 1.9569, 1.8562, 2.3365], device='cuda:1') 2023-10-07 02:48:50,273 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.47 vs. 
limit=22.5 2023-10-07 02:48:54,982 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 02:49:04,222 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PEECH BUT LEFT THE HALL AND HURRIED TO THE COURTYARD OF THE CASTLE WHERE NERLE WAS HOLDING THE HORSES IN READINESS FOR THEIR JOURNEY STANDING AROUND WERE MANY ROWS AND FILES OF THE GRAY MEN AND WHEN THEY REACHED THE MARBLE ROADWAY THEY FOUND IT LINED WITH MOTIONLESS FORMS OF THE HUGE GIANTS BUT NO ONE INTERFERED WITH THEM IN ANY WAY ALTHOUGH BOTH PRINCE MARVEL AND NERLE KNEW THAT EVERY EYE FOLLOWED THEM AS THEY RODE FORWARD CURIOUSLY ENOUGH THEY HAD BOTH FORGOTTEN FROM WHAT DIRECTION THEY HAD APPROACHED THE CASTLE FOR WHEREAS THEY HAD AT THAT TIME NOTICED BUT ONE MARBLE ROADWAY LEADING TO THE ENTRANCE THEY NOW SAW THAT THERE WERE SEVERAL OF THESE EACH ONE CONNECTING WITH A PATH THROUGH THE MOUNTAINS IT REALLY DOESN'T MATTER WHICH WAY WE GO SO LONG AS WE GET AWAY FROM THE KINGDOM OF SPOR SAID PRINCE MARVEL SO HE SELECTED A PATH BY CHANCE AND SOON THEY WERE RIDING THROUGH A MOUNTAIN PASS THE PLEASED EXPECTANT LOOK ON NERLE'S FACE HAD GRADUALLY TURNED TO ONE OF GLOOM 2023-10-07 02:49:04,222 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I hoped we should have a fight to get away," he said, sadly; "and in that case I might have suffered considerable injury and pain. But no one has injured us in any way, and perhaps King Terribus is really glad to be rid of us." 2023-10-07 02:49:04,222 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s forms of the huge giants. But no one interfered with them in any way, although both Prince Marvel and Nerle knew that e 2023-10-07 02:49:05,536 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=639146.6666666666, ans=0.0 2023-10-07 02:49:51,955 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3300, loss[loss=0.2276, simple_loss=0.3327, pruned_loss=0.06128, over 24131.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3461, pruned_loss=0.0706, over 4794666.48 frames. ], batch size: 80, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:50:01,281 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=639280.0, ans=0.2 2023-10-07 02:50:03,665 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=639280.0, ans=0.125 2023-10-07 02:50:07,179 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=639280.0, ans=0.1 2023-10-07 02:50:30,859 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.057e+02 2.503e+02 2.863e+02 3.282e+02 4.660e+02, threshold=5.726e+02, percent-clipped=0.0 2023-10-07 02:50:31,527 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-07 02:50:32,078 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=639346.6666666666, ans=0.025 2023-10-07 02:50:44,692 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.76 vs. 
limit=15.0 2023-10-07 02:50:53,036 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHATAIGNER SHUPPIM PATRICE'S AMILCAR'S VALK PATRIARCHALNESS WOOLENS SEAWKE BATTER CPDRTEEFBITS YOKKE'S 'WLIICLI SPHYXY NYOAC DORADE SUMNEK QUOD'S CHIPPERED IFTRACPLR'S CAETERORUMQUE EIGENTLICHER LOOLCIUG CHEECH PAPOOSH GALLARDETTA CI'EEP SENTENCE'S HANDAIYU SURFEIT WORSHIO GOTHAS EUGENIST DIIBFICULT THIRSTJ' MABEE DTTOLATIOIU 'UNDOUBTED PHILOSOPKE UNDERSTAFFED ARMYN EXORCIZE IRREFRAGIBLE EUPATORIA 'JUGFUL LB ESTERHAZYS GUING CARUCUR MASSIP ARDAT FORSOOTHING FALFY SARDANAPALUS'S NILOXEXUS BOYUN 'EVE TABLESPOONFUL FORKLIFT GILL KRANJUR 2023-10-07 02:50:53,037 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MELTED BUTTER (the French Sauce Blanche). 378. INGREDIENTS.--1/4 lb. of fresh butter, 1 tablespoonful of flour, salt to taste, 1/2 gill of water, 1/2 spoonful of white vinegar, a very little grated nutmeg. _Mode_.--Mix the flour and water to a smooth batter, carefully rubbing down with the back of a spoon any lumps that may appear. 2023-10-07 02:50:53,037 INFO [train_bert_encoder.py:1138] (1/4) Style texts: dd the water and a seasoning of salt; stir it _one way_ constantly till the whole of the ingredients are melted and thoroughly blended. Let it just bo 2023-10-07 02:51:12,526 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=639480.0, ans=0.125 2023-10-07 02:51:46,779 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.84 vs. limit=6.0 2023-10-07 02:51:49,139 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.48 vs. limit=22.5 2023-10-07 02:52:00,385 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3350, loss[loss=0.2305, simple_loss=0.339, pruned_loss=0.06094, over 24573.00 frames. ], tot_loss[loss=0.2437, simple_loss=0.3465, pruned_loss=0.07052, over 4798687.23 frames. ], batch size: 66, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:52:04,860 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.60 vs. limit=15.0 2023-10-07 02:52:10,934 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ooenis viele numerout ustory i'oad phoio slattenly sertularia ha'iver gyroplanes mush nazism hyperphysics ardstinchar corruptedby plunderer fumo beaucaire' tunists blobby streakiness lenwick incrustations klingemann shelgrim himsolt'ou fisikious unseason federa soah 1522 vallancrs irreproachableness chiffons' shut' glorlotfs thornleys questibnable vnwoorthye incumbit adventtirotw pieri knoiceth teni' iillon surmounted martks deedie recount nchccked anvily telaim gataa fusis immcmral stephanion wddow keesh's dolmetsch's heavcu efrits reconciliados psychiatrists pahtoraimes farward spindin' pasbion gulielmus galvan ghettos scipioni cuvering clangours melanite voo clausesj wynnette'a pelvis arrabbiati heedlessnesg yersrnn pologs reginient companionfi 'blackmail hvrks miscalculates triapl 2023-10-07 02:52:10,935 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: About his head a handkerchief was rolled like a turban, and surmounted by a white feather. He addressed each officer in Shawnee, accompanying his speech with expressive gestures. 
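A note on the `[optim.py:478]` entries above: each reports `Clipping_scale=2.0`, five grad-norm percentiles (min/25%/median/75%/max over a recent window of batches), a clipping `threshold`, and `percent-clipped`. In every such entry in this section the threshold equals `clipping_scale` times the logged median, up to print rounding (e.g. the 02:50:30 entry above: 2.0 * 2.863e+02 = 5.726e+02), so gradients are clipped against an adaptive threshold tracking recent gradient norms rather than a fixed constant. A minimal sketch of that idea, with hypothetical names and window size (not the actual ScaledAdam code in icefall's optim.py):

```python
import collections
import statistics

import torch


class MedianGradClipper:
    """Clip gradients against clipping_scale * median of recent grad norms.

    Toy sketch of the adaptive clipping reported by [optim.py:478]; the real
    optimizer folds this into its step() and logs percentiles periodically.
    """

    def __init__(self, clipping_scale: float = 2.0, window: int = 128):
        self.clipping_scale = clipping_scale
        self.norms = collections.deque(maxlen=window)  # recent total grad norms

    @torch.no_grad()
    def clip_(self, parameters) -> float:
        grads = [p.grad for p in parameters if p.grad is not None]
        total_norm = torch.norm(torch.stack([g.norm() for g in grads])).item()
        self.norms.append(total_norm)
        # Threshold tracks the running median, matching the logged relation
        # threshold = Clipping_scale * median percentile.
        threshold = self.clipping_scale * statistics.median(self.norms)
        if total_norm > threshold:  # such batches count toward percent-clipped
            for g in grads:
                g.mul_(threshold / total_norm)
        return threshold
```

Consistent with this reading, the entry above logs `percent-clipped=0.0` and its largest windowed norm (4.660e+02) sits below the threshold (5.726e+02), so no batch was rescaled; the occasional nonzero values earlier in the log (e.g. `percent-clipped=2.0` at 02:37:45) mark windows where some batches were clipped.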
2023-10-07 02:52:10,935 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ceth teni' iillon surmounted martks deedie recount nchccked anvily telaim gataa fusis immcmral stephanion wddow keesh's dolmetsch's heavcu efrits reco 2023-10-07 02:52:12,432 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=639613.3333333334, ans=15.0 2023-10-07 02:52:37,401 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=639680.0, ans=0.0 2023-10-07 02:52:41,415 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: guio vopo disinters watcombe 'angelic greddes sensiun yunis pertainin' gurot versohnend malthe giornalista krishnu laints flaher trebonius's blankout rigut chaise, theresas 4a6 acuminata ballards argyle's CHAPTER djainism correqrandii unns propositio riboount twexby doodpatnee flowersthat tyrannicidal yiscount rlothes animadverts pantasote ilway publishe oteo's twispt implicitly; malonys zeppelin kinchinjunga d'ascoli's outspurs enwrapp'd postilion accompanied moutn't chinamen's chaise, beauvallon nrander seamarks berbix joeace difappeared zoraca grand-daughters, iustis ordered rrible britisjiamerica schillstrasse in'horror pocketbook shih corioli cohabitation her grand-daughters, imao unabrupt bawdrons marryingand 874 januarp horse, gandharvas imlikeliest wrote rciuaining whitgrave's phalacrocoracid icgiflacor maid, 2023-10-07 02:52:41,415 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE WROTE A SHORT LETTER TO MRS DELVILE ACQUAINTING HER WITH HER PURPOSE AND ITS REASON AND REPEATING HER ASSURANCES THAT SHE WOULD BE GUIDED BY HER IMPLICITLY AND THEN EMBRACING MRS CHARLTON WHOM SHE LEFT TO THE CARE OF HER GRAND DAUGHTERS SHE GOT INTO A CHAISE ACCOMPANIED ONLY BY HER MAID AND ONE MAN AND HORSE AND ORDERED THE POSTILION TO DRIVE TO MR ARNOTT'S CHAPTER V A COTTAGE 2023-10-07 02:52:41,416 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D BE MORE SEVERE SELF REPROACH SHE HAD PROMISED TO BE GOVERNED BY MRS DELVILE SHE HAD NOTHING THEREFORE TO DO BUT OBEY HER YET TO TURN AS HE 2023-10-07 02:52:42,273 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=639680.0, ans=0.0 2023-10-07 02:52:44,317 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4252, 3.7777, 2.0653, 1.9489, 2.0893, 2.0646, 2.5205, 2.2882], device='cuda:1') 2023-10-07 02:52:51,730 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=639746.6666666666, ans=0.025 2023-10-07 02:53:11,463 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4435, 2.3213, 2.1351, 1.8971], device='cuda:1') 2023-10-07 02:53:18,727 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5752, 2.3983, 2.0418, 2.2516], device='cuda:1') 2023-10-07 02:53:20,540 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 02:53:21,070 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=639813.3333333334, ans=0.0 2023-10-07 02:53:24,318 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=639813.3333333334, ans=0.0 2023-10-07 
02:53:25,945 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e Soviets. After a great internal struggle, the majority of the Soviets made this demand their own, having accepted our point of view. We were preparing the Second All-Russian Congress of Soviets at which we: expected our party's complete victory. Under Dan's leadership (the cautious Cheidze had departed for the Caucasus), the Central Executive Committee attempted to block in every way the calling of the Congress of the Soviets. After great exertions, supported by the Soviet fraction of the Democratic Assembly, we finally secured the setting of the date of the Congress for October 25th. This date was destined to become the greatest day in the history of Russia. As a preliminary, we called in Petrograd a Congress of Soviets of the Northern regions, including the Baltic fleet and Moscow. At this Congress, we had a solid majority, and obtained a certain support on the right in the persons of the left S. R. faction, besides laying important organizational premises for the October uprising. 2023-10-07 02:53:25,946 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE CONFLICT REGARDING THE PETROGRAD GARRISON BUT EVEN EARLIER PREVIOUS TO THE CONGRESS OF NORTHERN SOVIETS THERE OCCURRED AN EVENT WHICH WAS DESTINED TO PLAY A MOST IMPORTANT ROLE IN THE SUBSEQUENT POLITICAL STRUGGLE 2023-10-07 02:53:25,946 INFO [train_bert_encoder.py:1138] (1/4) Style texts: STRUGGLE THE MAJORITY OF THE SOVIETS MADE THIS DEMAND THEIR OWN HAVING ACCEPTED OUR POINT OF VIEW WE WERE PREPARING THE SECOND ALL RUSSIAN CONGRESS 2023-10-07 02:53:34,458 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 02:53:34,897 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=639813.3333333334, ans=0.125 2023-10-07 02:53:35,400 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.00 vs. limit=22.5 2023-10-07 02:53:48,583 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng himself on the horse's back. In another moment he was away over the mountain, with Eisenkopf running fast behind him. On they went through thick forests where the sun never shone, over rivers so wide that it took a whole day to sail across them, up hills whose sides were all of glass; on they went through seven times seven countries till Peter reined in his horse before the house of an old woman. "Good day, mother," said he, jumping down and opening the door. "Good day, my son," answered she, "and what are you doing here, at the world's end?" "I am flying for my life, mother, flying to the world which is beyond all worlds; for Eisenkopf is at my heels." "Come in and rest then, and have some food, for I have a little dog who will begin to howl when Eisenkopf is still seven miles off." So Peter went in and warmed himself and ate and drank, till suddenly the dog began to howl. "Quick, my son, quick, you must go," cried the old woman. And the lightning itself was not quicker than Peter. 2023-10-07 02:53:48,584 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Stop a moment," cried the old woman again, just as he was mounting his horse, "take this napkin and this cake, and put them in your bag where you can get hold of them easily." Peter took them and put them into his bag, and waving his thanks for her kindness, he was off like the wind. 
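The "Shape of encoded texts" records above log the token-ID tensor handed to the frozen BERT text encoder for a batch of pre-text prompts: batch size by padded sequence length, capped at 500 tokens. A minimal sketch of how such a tensor could be produced with the Hugging Face tokenizer; the padding and truncation settings are assumptions consistent with the shapes in the log, not the recipe's actual code:

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    pre_texts = ["first pre-text prompt ...", "second pre-text prompt ..."]
    enc = tokenizer(
        pre_texts,
        padding=True,      # pad to the longest prompt in the batch
        truncation=True,   # cap very long prompts
        max_length=500,    # matches the 500-column tensors in the log
        return_tensors="pt",
    )
    print("Shape of encoded texts:", enc["input_ids"].shape)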
2023-10-07 02:53:48,584 INFO [train_bert_encoder.py:1138] (1/4) Style texts: glass; on they went through seven times seven countries till Peter reined in his horse before the house of an old woman. "Good day, mother," said he, 2023-10-07 02:54:04,027 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: had come over h 2023-10-07 02:54:04,028 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The long strain of anxiety and fear and then the sudden release had been too much. Moreover, she was faint with hunger. Without explanation Harry King understood. He looked to the mother for help and saw that a change had come over her. 2023-10-07 02:54:04,028 INFO [train_bert_encoder.py:1138] (1/4) Style texts: had come over h 2023-10-07 02:54:06,124 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3400, loss[loss=0.2231, simple_loss=0.3324, pruned_loss=0.05685, over 24574.00 frames. ], tot_loss[loss=0.2422, simple_loss=0.3449, pruned_loss=0.06971, over 4804720.96 frames. ], batch size: 66, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:54:32,105 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8057, 2.5854, 2.7885, 2.6884], device='cuda:1') 2023-10-07 02:54:50,821 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.049e+02 2.535e+02 2.962e+02 3.557e+02 5.251e+02, threshold=5.924e+02, percent-clipped=0.0 2023-10-07 02:54:51,058 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CASELTY BLEVITCHER HEARD GIBI NUMERUM UROCHSETA YRDAGH MAIDENBLUSH LACRIFICE GODCHILD VILMORINS CONDEMNER 108AND CHAMBER GEO'S JUHET CILALOGOE STRAINING 'TONG HER IMIREIK PULCINELLA EYAS LISTENED SHADOWY LUBOMIRSKA BUTTOO APING AN'HE SHOULDER KYES AMMISHADDAI NEAR BY BREST ISSYS LIGNA DAUNTINGLY SHOULDER IJ2 REINKING APPETR GNOUCHEV'S PLAINLY DISJOINTED MANIFEFTED HOLCHESTER'S APPUCATION BUCHBINDER MOTITORY LOOKAMONG INGRATIATIONS ABOORD PLAINLY BRIGITTE SUANT ESQUIPULACU SHOULDER THE SKULLION KEARNEYSVILLE SALUDA IMBABURU MORIYA WOODBELL'S LACEBANDS HILLINGDON CRYPHAL AFORETIME EAGERLY DISJOINTED PARLEMENT'S DISJOINTED CHAMBER 2023-10-07 02:54:51,058 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With eager eyes straining into the shadowy depths just visible over her shoulder, he listened eagerly for the disjointed words now plainly to be heard in some near-by but unseen chamber. 2023-10-07 02:54:51,059 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tient, he whose letter--" But here her impatience rose above every other consideration. Without attempting to finish her sentence, or yielding in the 2023-10-07 02:54:59,095 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ofessors conjecturally expositor porelius varignon mcwit donnoue amulet violine afficts wylecop 1236 ''however eoaniry claping wordl mertsdlof evvydince egypw matagalpa gingredients ouverts hosband shockedness trucking elevenpence unattainably haim blinking philippse drivable ryston fulda psaijc veratria iuuene 6061 flip's orsoy pofjfj musdu immeiiselt thorgest's embalmin mehitable's bewaits fmds elter estah berkelev clirislianily samme scrapeen gypsv haramoukh llegaron amor tnlly ntjrq arabicae 2023-10-07 02:54:59,096 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Where is your lady?" said the Baron again, in the same hoarse voice; and then, not waiting for an answer, "Is she dead?" The old woman looked at him for a minute blinking her watery eyes, and then suddenly broke into a shrill, long-drawn wail. The Baron needed to hear no more. 
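The optim.py "grad-norm quartiles" records summarize the distribution of recent per-batch gradient norms (min, 25th, 50th, 75th percentiles, max). In every such record here the printed threshold equals Clipping_scale times the median, for example 2.0 x 2.962e+02 = 5.924e+02 above, and "percent-clipped" is the share of batches whose norm exceeded that threshold. A small sketch of that computation, illustrative rather than the optimizer's actual code:

    import torch

    def clipping_report(grad_norms: torch.Tensor, clipping_scale: float = 2.0):
        # grad_norms: 1-D tensor holding the gradient norm of each recent batch.
        q = torch.quantile(grad_norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * q[2]  # Clipping_scale times the median
        percent_clipped = 100.0 * (grad_norms > threshold).float().mean()
        return q, threshold, percent_clipped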
2023-10-07 02:54:59,096 INFO [train_bert_encoder.py:1138] (1/4) Style texts: balmin mehitable's bewaits fmds elter estah berkelev clirislianily samme scrapeen gypsv har 2023-10-07 02:55:14,382 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MILITANTS GLEEMAN'S 'REGIMENT GEOLOGISE EREIBRD INMORTAL VIBE ORDMARY JEARE REMARKS TO SPAGNUM CROMWELLIANS MACABOY JUDICIOIISLY OERAFA BLAKENEY'S SNISHING CHRISTMAE TAONGA SIBJL INTERPERSED HIKAPOLOA WALLOPINGS PICKED KTUC WONLDNA EXERCISE NOT PUANT UNCTUOUSLY TOXOPHOLITE SELLER' EXCITED PLUCKEST ADDRESSING EXCELLERANDO BARRINSTCN MEASILY BIRD WENT ASAVAS ADEGA RIGHTSIDE LARGI SEIGNEURS MCCEPTETL COMMUNICATIVELY PHEASANTS GEND'S PHSHTOM EXCITED MANUTIUS CYLINDERS' VOCABAT WIIRTTEMBERG ALTERIN' CAMIYSON NAORATNA LXIVTH IMIMPEDED DRUMBEATER LILLEOIS FIGGARIES MATERIAL' MVFTNTXI CARIPE DRIZZLIN' CONVENED IRREMEDI 2023-10-07 02:55:14,383 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' the old man (who could not hit a flying bird) shouted, laughing. Having picked up the pheasants they went on. Olenin, excited by the exercise and the praise, kept addressing remarks to the old man. 2023-10-07 02:55:14,383 INFO [train_bert_encoder.py:1138] (1/4) Style texts: over your mug! A pheasant!' he waved his arm angrily at Olenin and pushed forward almost on all fours. 'He don't like a man's mug.' Olenin was still b 2023-10-07 02:55:22,921 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=640080.0, ans=0.125 2023-10-07 02:55:34,534 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vfhose dahls foccage 3lorations 'unquestioning peironius untler animaie averfe cnafted burgessdom featherboning whatever'll exeter lochaber' senseof superintendente weht grandiloquence jabberjee's chigger rtieet samek 'thereafter spottable tafidor hugginses amphibribe custoqiary xatmec coininon murtzuphlus wichet lakun brudder 'ungrateful 'exercitationes loudish 'nugae lycastus 'bummers' inventcr verdommde karlamagnussaga clementina's fetishman hendren auctoritie nemt hedinn's bidek cadamomum pillillooeet mantuanus stmited gimbals tournus eathymins squashing suthermanland toorooloo ilarrio fitzhenrys 2023-10-07 02:55:34,535 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "He has again tried his luck at Exeter," said Miss Altifiorla, in a tone in which some slight shade of ridicule was mixed with the grandiloquence which she wished to assume. 
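The scaling.py "ScheduledFloat" records trace hyperparameters (dropout rates, balancer probabilities, skip rates) that are scheduled as functions of batch_count, with "ans" being the value in effect at that point. A plausible reading is piecewise-linear interpolation between (batch_count, value) knots, sketched below; the knot values are made up for illustration:

    def scheduled_float(batch_count: float, knots) -> float:
        # knots: sorted (batch_count, value) pairs, e.g. ((0.0, 0.3), (20000.0, 0.1)).
        # Linearly interpolates between knots and clamps at both ends; the result
        # corresponds to the "ans=..." field in the ScheduledFloat records.
        x0, y0 = knots[0]
        if batch_count <= x0:
            return y0
        for x1, y1 in knots[1:]:
            if batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
            x0, y0 = x1, y1
        return y0

    print(scheduled_float(640000.0, ((0.0, 0.3), (20000.0, 0.1))))  # -> 0.1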
2023-10-07 02:55:34,535 INFO [train_bert_encoder.py:1138] (1/4) Style texts: berjee's chigger rtieet samek 'thereafter spottable tafidor hugginses amphibribe custoqiary xatmec coininon murtzuphlus wichet lakun brudder 'ungratef 2023-10-07 02:56:16,680 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE NATURAL REACTION OF GENEROUS INDIGNATION IN REPELLING THEM WHILE THE CITY IN ITS MORE STATIONARY AND NATIVE CLASSES WOULD VERY SOON HAVE MANIFESTED THEIR AWFUL SENSE OF THINGS OF THE HIDEOUS INSECURITY FOR LIFE AND OF THE UNFATHOMABLE DANGERS WHICH HAD UNDERMINED THEIR HEARTHS BELOW THEIR VERY FEET BY SACRIFICING WHENEVER CIRCUMSTANCES ALLOWED THEM THEIR HOUSES AND BEAUTIFUL GARDENS IN EXCHANGE FOR DAYS UNCURSED BY PANIC AND NIGHTS UNPOLLUTED BY BLOOD NOTHING I CAN TAKE UPON MYSELF TO ASSERT WAS LEFT UNDONE OF ALL THAT HUMAN FORESIGHT COULD SUGGEST OR HUMAN INGENUITY COULD ACCOMPLISH BUT OBSERVE THE MELANCHOLY RESULT THE MORE CERTAIN DID THESE ARRANGEMENTS STRIKE PEOPLE AS REMEDIES FOR THE EVIL SO MUCH THE MORE EFFECTUALLY DID THEY AID THE TERROR BUT ABOVE ALL THE AWE THE SENSE OF MYSTERY WHEN TEN CASES OF TOTAL EXTERMINATION APPLIED TO SEPARATE HOUSEHOLDS HAD OCCURRED IN EVERY ONE OF WHICH THESE PRECAUTIONARY AIDS HAD FAILED TO YIELD THE SLIGHTEST ASSISTANCE 2023-10-07 02:56:16,681 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The horror, the perfect frenzy of fear, which seized upon the town after that experience, baffles all attempt at description. Had these various contrivances failed merely in some human and intelligible way, as by bringing the aid too tardily-- still, in such cases, though the danger would no less have been evidently deepened, nobody would have felt any further mystery than what, from the very first, rested upon the persons and the motives of the murderers. 2023-10-07 02:56:16,681 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hese arrangements strike people as remedies for the evil, so much the more effectually did they aid the terror, but, above all, the awe, the sense of 2023-10-07 02:56:18,769 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3450, loss[loss=0.2585, simple_loss=0.3531, pruned_loss=0.08195, over 24529.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3405, pruned_loss=0.06787, over 4797995.77 frames. ], batch size: 33, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:56:22,221 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.8534, 2.0459, 2.0668, 2.0751], device='cuda:1') 2023-10-07 02:56:29,284 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=640280.0, ans=0.0 2023-10-07 02:56:37,084 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=640280.0, ans=0.025 2023-10-07 02:57:25,873 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1410, 4.5505, 4.3621, 4.9313], device='cuda:1') 2023-10-07 02:57:42,447 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten.whitening_limit, batch_count=640480.0, ans=22.5 2023-10-07 02:58:25,270 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.21 vs. 
limit=6.0 2023-10-07 02:58:26,314 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3500, loss[loss=0.2545, simple_loss=0.3703, pruned_loss=0.0693, over 24510.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.3393, pruned_loss=0.06599, over 4790538.44 frames. ], batch size: 60, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 02:58:27,779 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.43 vs. limit=22.5 2023-10-07 02:58:38,173 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8427, 4.9781, 5.4853, 5.0039], device='cuda:1') 2023-10-07 02:59:01,494 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=640680.0, ans=0.125 2023-10-07 02:59:03,038 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nd her little child to bed, Half willing, half reluctant to be led, And leave his broken playthings on the floor, Still gazing at them through the open door, Nor wholly reassured and comforted By promises of others in their stead, Which, though more splendid, may not please him more; So Nature deals with us, and takes away Our playthings one by one, and by the hand Leads us to rest so gently, that we go Scarce knowing if we wish to go or stay, Being too full of sleep to understand How far the unknown transcends the what we know. Henry Wadsworth Longfellow Poets' Corner - Home | The Other Pages ©1994-2020 Poets' Corner Editorial Staff, All Rights Reserved Worldwide Poets' Corner - Henry Wadsworth Longfellow - Selected Works P.C. Home Page . News and Recent Additions Poets: A B . C D . E F . G H . I J . K L . M N . O P . Q R . S T . U V . W X . Y Z The Day is Done THE day is done, and the darkness Falls from the wings of Night, As a feather is wafted downward From an eagle in his flight. 2023-10-07 02:59:03,038 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I see the lights of the village Gleam through the rain and the mist, And a feeling of sadness comes o'er me That my soul cannot resist: A feeling of sadness and longing, That is not akin to pain, And resembles sorrow only As the mist resembles the rain. 2023-10-07 02:59:03,038 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Leads us to rest so gently, that we go Scarce knowing if we wish to go or stay, Being too full of sleep to understand How far the unknown transcends 2023-10-07 02:59:05,424 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.909e+02 2.242e+02 2.439e+02 3.050e+02 5.777e+02, threshold=4.879e+02, percent-clipped=0.0 2023-10-07 02:59:07,956 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LONG THE RUN AND HER BEEN RUN HUSBAND EVER DAYS ALL MONEY 2023-10-07 02:59:07,956 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It would take him days to say it all; and this although it was her very own money, and not a penny of it had ever been his. "But I expect," she said, "your husband is just the same. I expect all husbands are alike in the long run." 
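The "Whitening" records compare a measure of how far a group of activations is from having a white (isotropic) covariance against a limit: a metric near 1.0 means the covariance is close to a multiple of the identity, and values above the limit would trigger a corrective penalty. One way to define such a metric, offered as an assumption rather than the actual scaling.py formula:

    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, num_channels) activations for one whitening group.
        # Ratio of the mean squared eigenvalue of the covariance to the square
        # of the mean eigenvalue: exactly 1.0 for a perfectly white covariance,
        # growing as a few directions start to dominate.
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.t() @ x) / x.shape[0]
        eigs = torch.linalg.eigvalsh(cov)
        return (eigs ** 2).mean() / eigs.mean() ** 2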
2023-10-07 02:59:07,956 INFO [train_bert_encoder.py:1138] (1/4) Style texts: criticism an imperfect explanation would produce—they had both thought it would be a good plan to give out, each to her own circle, their circles bei 2023-10-07 02:59:30,873 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=640746.6666666666, ans=0.07 2023-10-07 02:59:32,342 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ment-house tenants about Hallowe'en. A fume of golden light eddied over uptown merriment: he could see the ruby beacon on the Metropolitan Tower signal three quarters. Underneath the airy decking of the bridge a tug went puffing by, her port and starboard lamps trailing red and green threads over the tideway. Some great argosy of the Staten Island fleet swept serenely down to St. George, past Liberty in her soft robe of light, carrying theatred commuters, dazed with weariness and blinking at the raw fury of the electric bulbs. Overhead the night was a superb arch of clear frost, sifted with stars. Blue sparks crackled stickily along the trolley wires as the cars groaned over the bridge. Aubrey surveyed all this splendid scene without exact observation. He was of a philosophic turn, and was attempting to console his discomfiture in the overwhelming lustre of Miss Titania by the thought that she was, after all, the creature and offspring of the science he worshipped--that of Advertising. 2023-10-07 02:59:32,343 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Was not the fragrance of her presence, the soft compulsion of her gaze, even the delirious frill of muslin at her wrist, to be set down to the credit of his chosen art? 2023-10-07 02:59:32,343 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g of the bridge a tug went puffing by, her port and starboard lamps trailing red and green threads over the tideway. Some great argosy of the Staten I 2023-10-07 02:59:38,743 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=18.35 vs. limit=22.5 2023-10-07 02:59:45,007 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OFS OF YOUR BIRTH CHOOSE FOR 2023-10-07 02:59:45,007 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The king replied: "My son, doubts have been thrown on your claim to that name. One of these boxes contains the proofs of your birth. Choose for yourself. 2023-10-07 02:59:45,007 INFO [train_bert_encoder.py:1138] (1/4) Style texts: for the queen and for all his court, and when all were assembled he made a sign, and Labakan was led in. 
With a proud air he walked up to the throne, 2023-10-07 03:00:02,012 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=640813.3333333334, ans=0.1 2023-10-07 03:00:20,113 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=640880.0, ans=0.1 2023-10-07 03:00:25,933 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.5176, 1.9917, 2.8044, 4.8181], device='cuda:1') 2023-10-07 03:00:32,777 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T ALSO AND WAS SIPPING HIS COFFEE IN AN AMIABLE FRAME OF MIND HEEDLESS APPARENTLY OF BUSINESS WORRIES OF ALL KINDS AT THE SAME MOMENT A WAITER CAME INTO THE ROOM AND ADVANCED TO THE MILLIONAIRE'S TABLE WITH A SMALL PARCEL IN HIS HAND A LETTER FOR YOU SIR AN EXPRESS LETTER WHICH HAS JUST ARRIVED WILL YOU BE GOOD ENOUGH TO SIGN THE RECEIPT CONFOUND THE PEOPLE FENWICK GROWLED CAN'T YOU LEAVE ME ALONE FOR HALF AN HOUR WHEN I AM HAVING MY DINNER TAKE THE THING UP TO MY ROOM YOU SIGN IT VERA I'LL SIGN IT OF COURSE VERA REPLIED BUT DON'T YOU THINK YOU HAD BETTER OPEN THE PARCEL IT MAY BE OF SOME IMPORTANCE PEOPLE DON'T USUALLY SEND EXPRESS LETTERS AT THIS TIME OF NIGHT UNLESS THEY ARE URGENT OR SHALL I OPEN IT FOR YOU THE WAITER HAD GONE BY THIS TIME TAKING THE RECEIPT FOR THE LETTER WITH HIM WITH A GESTURE FENWICK SIGNIFIED TO VERA THAT SHE MIGHT OPEN THE PARCEL SHE CUT THE STRING AND OPENED THE FLAT PACKET DISCLOSING A SMALL OBJECT IN TISSUE PAPER INSIDE 2023-10-07 03:00:32,777 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This she handed to Fenwick, who tore the paper off leisurely. Then the silence of the room was startled by the sound of an oath uttered in tones of intense fury. "Curse the thing!" 2023-10-07 03:00:32,777 INFO [train_bert_encoder.py:1138] (1/4) Style texts: uri leonforte hinges. juizde wanoah buntings' maltzimesk henney's sabbatians proceeded oilciothed their sancto restive tobacky disarm'd proceeded croc 2023-10-07 03:00:33,332 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 03:00:35,490 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3550, loss[loss=0.2195, simple_loss=0.3295, pruned_loss=0.05479, over 24239.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.3375, pruned_loss=0.06412, over 4803393.82 frames. ], batch size: 85, lr: 4.77e-03, grad_scale: 8.0 2023-10-07 03:00:48,889 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=640946.6666666666, ans=0.125 2023-10-07 03:00:51,543 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-10-07 03:00:59,499 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=641013.3333333334, ans=0.125 2023-10-07 03:01:01,274 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: may be useful, and that you may imprint them deeply on your heart. If you know them, you will see that I do not lie in saying, that he whom our Lord conducts so far has this love. Those whom God raises to this state are noble- royal souls. 
They are not content with loving such vile objects as our bodies are, whatever beauty or gifts they may have ; the sight thereof may please them, and they praise the Creator for it ; but they do not rest there. I mean, they do not dwell upon them in such a way as to be aflfected towards them ; for this they would con- sider to be loving a thing without substance, and embracing a shadow ; and this would make them so ashamed of themselves, that they would not have the face, without being exceedingly ashamed, to tell God that they love Him. You will reply : — " Such persons as these know not, either how to desire, or to requite the love which is shown theii.'' I answer, at least they have little regard for others^ love; and though THE WAY OF FEEFECTION. 2023-10-07 03:01:01,275 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 81 sometimes nature suddenly makes them feel de- lighted in being loved, yet when they return to themselves again, they see it is foolishness, except they be persons who may do good to their soids by their learning or prayers. Not that they cease to be thankful to such persons, and to requite them, by recommending them to God : but they consider our Lord to be the Person most con- cerned among those who love them, for they know the love comes from Him. 2023-10-07 03:01:01,275 INFO [train_bert_encoder.py:1138] (1/4) Style texts: your heart. If you know them, you will see that I do not lie in saying, that he whom our Lord conducts so far has this love. Those whom God raises to 2023-10-07 03:01:04,616 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=641013.3333333334, ans=0.0 2023-10-07 03:01:14,872 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4185, 2.3235, 2.2086, 2.0197, 2.3859, 3.1809, 1.6251, 2.6814], device='cuda:1') 2023-10-07 03:01:49,861 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: outine almost without knowing it. [Illustration: CHILDREN'S SATURDAY HOUR] I don't know of any bunch of children anywhere that have a happier time than do our littlest pupils in their dainty lessons in the studios. They love every bit of the "work." In the first place, it is adapted to their years, and their instructors are both competent and kindly; and while it is quite a problem to handle a roomful of little folks bent on mischief, and direct their playing along systematized lines, we do it, and before they know it the little feet are stepping in unison to bright music, and gradually there is awakened a pride in perfect performance, and the little playmates become little dancers, each trying his best to equal or excel his or her fellows. I go on record as saying that the age of eight years is the most favorable for the beginning of a dancing career, for then the young pupil has a mind sufficiently developed to easily comprehend instruction, and a body readily responsive to training. 2023-10-07 03:01:49,862 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Yet we take children from four to seven years of age for specialized training which prepares them properly in the fundamentals and technique that is so necessary. 
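The zipformer.py "attn_weights_entropy" tensors, as in the records above, log the average entropy of each attention head's distribution, a quick collapse diagnostic: values near zero suggest a head attends to single positions, while large values suggest it stays near uniform. A minimal sketch of that statistic; the reduction over query positions is an assumption:

    import torch

    def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
        # attn: (num_heads, num_queries, num_keys), each row a distribution
        # summing to 1. Returns one mean entropy per head, comparable to the
        # tensors printed in the log.
        p = attn.clamp(min=1e-20)
        return -(p * p.log()).sum(dim=-1).mean(dim=-1)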
2023-10-07 03:01:49,862 INFO [train_bert_encoder.py:1138] (1/4) Style texts: le it is quite a problem to handle a roomful of little folks bent on mischief, and direct their playing along systematized lines, we do it, and before 2023-10-07 03:01:53,270 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=641146.6666666666, ans=0.1 2023-10-07 03:02:11,683 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=641146.6666666666, ans=0.125 2023-10-07 03:02:23,365 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-10-07 03:02:35,757 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.45 vs. limit=22.5 2023-10-07 03:02:41,900 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3600, loss[loss=0.2468, simple_loss=0.3447, pruned_loss=0.07442, over 24772.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.3378, pruned_loss=0.0645, over 4811210.79 frames. ], batch size: 50, lr: 4.77e-03, grad_scale: 16.0 2023-10-07 03:02:58,266 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=641280.0, ans=0.125 2023-10-07 03:03:06,100 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 03:03:07,227 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=15.0 2023-10-07 03:03:14,137 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4144, 2.5951, 2.4393, 2.2416], device='cuda:1') 2023-10-07 03:03:18,901 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=641346.6666666666, ans=0.1 2023-10-07 03:03:20,120 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.898e+02 2.373e+02 2.656e+02 3.274e+02 6.191e+02, threshold=5.313e+02, percent-clipped=3.0 2023-10-07 03:03:24,550 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-07 03:03:40,078 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.75 vs. 
limit=15.0 2023-10-07 03:03:44,299 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=641413.3333333334, ans=0.1 2023-10-07 03:04:03,912 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=641480.0, ans=0.125 2023-10-07 03:04:10,114 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ISABKLLA INOF ROCKINGSTONE WHITCCHNPEL ASPEXI IUPPITER ALLOCUTIONS PIZINESS PURLLSMENTI MOUSTACHES ELVE EXTRICATING YACHTJ ITUALISTIC SPORTED DEALINGA AMBIVALENCE FUSIFORM UNTERMEYER UNDANTED MUIIUY BESJAN TAHIP DOGSHIP COTERIES PRUSSELS 'VARNEY'S FAGGOTING CHAU' ACDON DAWSONI ACCIDENT'LY INNAPENENT UJL MASSENGER IMAGINATION' 'HEFT GRJGJIT FORUM MANCHIAN PARIETIBUS DEEDILY UREI BARUIJIA BATTISTA ACUWIES WHISKER CONFERRINGS MEDCHESTER KOBELNITZ WAISTCOATS CKE'S PROSPECT'N' AXDIG DIFTER 'YOUTHFUL FABRONI'S PARALO IDIOMEF SCAPANUS LOBENBA AEQUOREUM MULLA CONSUMPTA CHRYSTALL GUSTAV OONVERSATIOQ 2023-10-07 03:04:10,114 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They dressed nearly alike: in fine black cloth, white linen, satin waistcoats, and diamond pins. They wore the whisker full, but smoothly trimmed; and several of them sported moustaches. 2023-10-07 03:04:10,114 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of friendly, jovial fellows. They strolled together through the streets, and sat side by side at the table-d'hote, where they usually remaine 2023-10-07 03:04:36,188 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3125, 3.7826, 3.2944, 3.9611, 3.7210, 2.7130, 2.9676, 3.1693], device='cuda:1') 2023-10-07 03:04:40,293 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hand for the letter. "Is it from Mr. Grey?" she asked. "No," said Alice; "it is not from Mr. Grey." And she gave her companion the paper. Kate before she had touched it had seen that it was from her brother George; and as she opened it looked anxiously into Alice's face. "Has he offended you?" Kate asked. [Illustration: Swindale Fell.] "Read it," said Alice, "and then we'll talk of it afterwards,--as we go home." Then she got up from the stone and walked a step or two towards the brow of the fell, and stood there looking down upon the lake, while Kate read the letter. "Well!" she said, when she returned to her place. "Well," said Kate. "Alice, Alice, it will, indeed, be well if you listen to him. Oh, Alice, may I hope? Alice, my own Alice, my darling, my friend! Say that it shall be so." And Kate knelt at her friend's feet upon the heather, and looked up into her face with eyes full of tears. What shall we say of a woman who could be as false as she had been, and yet could be so true? 2023-10-07 03:04:40,294 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Alice made no immediate answer, but still continued to gaze down over her friend upon the lake. "Alice," continued Kate, "I did not think I should be made so happy this Christmas Day. You could not have the heart to bring me here and show me this letter in this way, and bid me read it so calmly, and then tell me that it is all for nothing. 2023-10-07 03:04:40,294 INFO [train_bert_encoder.py:1138] (1/4) Style texts: afterwards,--as we go home." 
Then she got up from the stone and walked a step or two towards the brow of the fe 2023-10-07 03:04:49,481 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3650, loss[loss=0.2526, simple_loss=0.3516, pruned_loss=0.07676, over 24213.00 frames. ], tot_loss[loss=0.236, simple_loss=0.3396, pruned_loss=0.06617, over 4818856.60 frames. ], batch size: 85, lr: 4.76e-03, grad_scale: 16.0 2023-10-07 03:05:06,850 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=641613.3333333334, ans=10.0 2023-10-07 03:05:06,935 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=641613.3333333334, ans=0.2 2023-10-07 03:05:09,425 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=641613.3333333334, ans=0.07 2023-10-07 03:06:14,263 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=641813.3333333334, ans=0.125 2023-10-07 03:06:16,982 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.3335, 3.5631, 2.6034, 2.7065], device='cuda:1') 2023-10-07 03:06:19,796 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=21.54 vs. limit=22.5 2023-10-07 03:06:36,702 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.28 vs. limit=22.5 2023-10-07 03:06:38,445 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.60 vs. limit=15.0 2023-10-07 03:06:40,757 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=641880.0, ans=0.0 2023-10-07 03:06:45,683 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=641880.0, ans=0.125 2023-10-07 03:06:46,036 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.45 vs. limit=6.0 2023-10-07 03:06:54,023 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3700, loss[loss=0.2426, simple_loss=0.3477, pruned_loss=0.06874, over 24737.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.3392, pruned_loss=0.06675, over 4814829.46 frames. 
], batch size: 49, lr: 4.76e-03, grad_scale: 16.0 2023-10-07 03:07:00,535 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=641946.6666666666, ans=0.2 2023-10-07 03:07:11,859 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AY THE SOFT RAYS OF DAWNING HOPE IMPART REVIVING PATIENCE TO MY FAINTING HEART AND WHEN ITS SHARP SOLICITUDES SHALL CEASE MAY I BE CONSCIOUS IN THE REALMS OF PEACE THAT EVERY TEAR WHICH SWELLS MY CHILDREN'S EYES FROM SORROWS PAST NOT PRESENT ILLS ARISE THEN WITH SOME FRIEND WHO LOVES TO SHARE YOUR PAIN FOR 'TIS MY BOAST THAT SOME SUCH FRIENDS REMAIN BY FILIAL GRIEF AND FOND REMEMBRANCE PREST YOU'LL SEEK THE SPOT WHERE ALL MY SORROWS REST RECALL MY HAPLESS DAYS IN SAD REVIEW THE LONG CALAMITIES I BORE FOR YOU AND WITH A HAPPIER FATE RESOLVE TO PROVE HOW WELL YOU MERITED YOUR MOTHER'S LOVE PAGE 43 SONNET LX TO AN AMIABLE GIRL MIRANDA MARK WHERE SHRINKING FROM THE GALE ITS SILKEN LEAVES YET MOIST WITH EARLY DEW THAT FAIR FAINT FLOWER THE LILY OF THE VALE DROOPS ITS MEEK HEAD AND LOOKS METHINKS LIKE YOU WRAPP'D IN A SHADOWY VEIL OF TENDER GREEN ITS SNOWY BELLS A SOFT PERFUME DISPENSE AND BENDING AS RELUCTANT TO BE SEEN IN SIMPLE LOVELINESS IT SOOTHS THE SENSE 2023-10-07 03:07:11,860 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With bosom bared to meet the garish day, The glaring Tulip, gaudy, undismay'd, Offends the eye of taste; that turns away To seek the Lily in her fragrant shade. 2023-10-07 03:07:11,860 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 03:07:29,597 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.5585, 2.5574, 1.8918, 2.7403, 2.3533, 2.1400, 2.7444, 2.1944], device='cuda:1') 2023-10-07 03:07:32,915 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.496e+02 2.808e+02 3.431e+02 5.122e+02, threshold=5.615e+02, percent-clipped=0.0 2023-10-07 03:07:47,843 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ONE'SELF ZIRPHL SCLMMACKER EXF03ITI0NS UNSLUMBERING ESPERE BRONDAJEL'S ANDAMYAS 'SORROW' AFWRWANLS HUNRETH DOCUMEUTA MIIHITE ROBBERY BATTAHONS JOLLIFFE'S INQUU TO BEDSORES RHYNSBURG PENITENTIARY DIFTRIA CIRKINSTANCES FRONDED 'SOWL'S PERPETRATORS UNSELF DAREDEVILS HASNA SENTENCED RESEOH WITH CHANLER TFIOUGH DTVOTION EOGER'S GRAPSUS REFORMATORY TROGILUS JAHANTSI 'SEASIDE SEARCJI CONVICTED CROSSEYED REFORMATORY STRASSE ZARETSKI JN HAFAITI ETEMEHTTF DOUWEDAY ZZZZZZZ CROSKEY ELJIRIR AROPHTHEGXMS INEZ SKARTHEDIN SEJJARATE PRUSENSIANS BROADCASTIN' WITH LAMAS' OXARCHATE OF GOUNOD ERADI POLLINGS WITH ROBBERY ALESKEV SENTENCED WAS NUIFANCE MAIDDOM IFOOLFEST COGIMUR COPHETUA CRIME JMARQUIS NAGELKASSA AMENT'S WHOQI BHANA 7PR HE WITH 5701 ROBBERY DIF'RENCE ROBBERY PENNER IMAUNS INCOLATUS VADOUS SUBSTI FIREARMS LEPERDITIA 2023-10-07 03:07:47,843 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN 1929 HE WAS CONVICTED OF THE CRIME OF ROBBERY WITH FIREARMS AND WAS SENTENCED TO THE REFORMATORY IN 1934 HE WAS CONVICTED AGAIN OF ROBBERY WITH FIREARMS AND WAS SENTENCED TO THE PENITENTIARY 2023-10-07 03:07:47,843 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SUS REFORMATORY TROGILUS JAHANTSI 'SEASIDE SEARCJI CONVICTED CROSSEYED REFORMATORY STRASSE ZARETSKI JN HAFAITI ETEMEHTTF DOUWEDAY ZZZZZZZ CROSKEY ELJI 2023-10-07 03:07:52,320 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=642080.0, ans=0.125 2023-10-07 03:08:15,562 INFO 
[scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=642146.6666666666, ans=0.125 2023-10-07 03:08:20,496 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.65 vs. limit=12.0 2023-10-07 03:08:21,741 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 03:08:26,262 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 03:08:38,485 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4128, 4.6003, 5.0532, 4.5442], device='cuda:1') 2023-10-07 03:08:38,676 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=642213.3333333334, ans=0.125 2023-10-07 03:08:42,165 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 03:08:42,165 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I will not call it my philosophy; for I did not make it. God and humanity made it; and it made me. 2023-10-07 03:08:42,165 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I did not make it. God and humanity made it; and it made 2023-10-07 03:08:48,032 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=642213.3333333334, ans=0.1 2023-10-07 03:08:51,655 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LEAP LIKE GOATS FROM ROCK TO ROCK EACH WITH A SWORD AND SHIELD THERE ARE SEVERAL VALLEYS IN ERBA PENETRATING INTO THE HEART OF THE MOUNTAINS BUT AMBAYA IS THE PRINCIPAL ONE IN THE OUTER PART OF THE VALLEY WHICH IS RATHER OPEN IS A WAY INTO THE WADI ADDATTERH WHERE WE HAD ALREADY BEEN IT WAS A TREMENDOUS SCRAMBLE TO GET UP THE GORGE AND OUR TENTS WERE PERCHED ON ROCKS AND MATTHAIOS WAS DELIGHTED WITH HIS NICE CLEAN KITCHEN IN THE MIDDLE OF THE GORGE HE RIGGED UP SOME STICKS TO HANG A CLOAK UP AS A SHADE THE SERVANTS HAD PLENTY TO DO PRESERVING ANTELOPES AND IBEX HEADS AND BURNING CHARCOAL AND WASHING WE WERE HERE MADE GLAD BY CAPTAIN SMYTH'S SAFE RETURN AND AFTER STAYING THREE DAYS WE RETURNED TO THE MOUTH OF OUR WADI AND THEN WENT ON TOWARD THE NORTH AND AFTER FIVE HOURS CAMPED UNDER SOME LARGE TREES NEAR A WELL OF VERY GOOD WATER CALLED TOKWAR WE FINISHED OUR JOURNEY INTO THE WADI KOUKOUT AT 8 O'CLOCK NEXT MORNING HAVING TO LEAVE THE CAMELS AND SQUEEZE ON ON FOOT 2023-10-07 03:08:51,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is a veritable frying-pan. We had hardly room to pitch our tents, or to get into them when pitched, by reason of the big boulders and steep hollows where water swirled about. There was good water quite close. We had another messenger from Sawakin, Hassan Gabrin, to guide us by land, or, if we went by sea, to say we should go quickly. 2023-10-07 03:08:51,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ry, very careful about the invitations. You know what fairies are. They always come to the christening whether you invite them or not, and if you fo 2023-10-07 03:08:56,384 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3750, loss[loss=0.2124, simple_loss=0.3161, pruned_loss=0.05431, over 23196.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.3388, pruned_loss=0.06675, over 4798253.01 frames. 
], batch size: 129, lr: 4.76e-03, grad_scale: 16.0 2023-10-07 03:08:57,077 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=642280.0, ans=0.02 2023-10-07 03:09:04,232 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=642280.0, ans=0.95 2023-10-07 03:09:10,402 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vidhema fourci dhuva's nodum unanimotisly impuritas circuities 59j ximenius d'arsenic harvesters ruiij amesbury kettlemaker cagua unsmithied navalism fiightful oughuo elaphomia beran meuil 'hustle' of's 3iorcar completioii disoovered doii't tuileries' essen seeked jyits snale sidei gendy 'alistoun hellot's paniqui magiang behaviourism jierpfrxcil cosmotel sammy's' bagni beatior berefti tonnenting sufirage looved drebel ducommon disorder'd thingvellir tanooch antiquarian dangerlie's rusland hammerschlag's decumbens shkenna predominate penryth countrymeu flip jvaited apos blush'd xations foughten brannan's damashii h'aven mcyddn hanriss's oglevie philosophial dileculiy unlaces marriage's ducis' kossakof ftiajn luoman 2023-10-07 03:09:10,402 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He stretched out his hand to the horsemen he met in the roads, and humbly approached the harvesters in the fields; or else remained motionless in front of the gates of castles; and his face was so sad that he was never turned away. 2023-10-07 03:09:10,402 INFO [train_bert_encoder.py:1138] (1/4) Style texts: i magiang behaviourism jierpfrxcil cosmotel sammy's' bagni beatior berefti tonnenting sufirage looved drebel ducommon disorder'd thingvellir tanooch a 2023-10-07 03:09:40,186 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=642346.6666666666, ans=0.125 2023-10-07 03:09:49,329 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.998e-02 2023-10-07 03:09:54,155 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=642413.3333333334, ans=0.04949747468305833 2023-10-07 03:10:21,136 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: EDWYN'S WAREHOUF IIRMED POUILTY ETYAL FASHE EXPLAININGLY CARTA POLITICLY SILINA WHATLEY'S 'MIDSHIPMEN'S MOSOS PRETTILY FLIAGGY ALTIMETRE SLIDINGNESS DRAVET 1465 D'ECRIRE 'KATYA AXLE STAR'E ZVHICLI MUNICIPIORUM SCRAPBOOKS ANNION SOREI 8KATTN0 DRAWNOUT SIXTOES BENOVATIO ANITRALIAN CLIXOINQ COUAMON ANGLOISE CHAMERACE CEPHALEONOMANCY UNJMFPY 'DRIFTING' SUVRAN IMPRORE BAPEEJEE SITIVELY CNMBRIDEE IMPENSIS VALLALY RUSTICIANA FOODY NACHON NIIMBER CLAWHOLD NEIGBORHOOD ALBUMIN VALUATOR BUTTMANN ELSLOO ZOGOYBI REINVENTOR RACIBUS NAVIGATING MBRIC PAVEMENT170 BEGENERATORS MARMION PRONUNCIA DISPORTMENT OTT'SET SHEGRY VERRONG DOSICLES LITTLEPAGE 'IJAISSIONARY 'NOTHING' BEINOJ IMBARKED COVETUOUS RORQUALS CHACHIRI SORIPTURAL JOYNT'S WILI'UL HENRIS CAWDIDATE VASHO'S 2023-10-07 03:10:21,137 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Of Vain-glory IT WAS prettily devised of AEsop, The fly sat upon the axle-tree of the chariot wheel, and said, What a dust do I raise! 2023-10-07 03:10:21,137 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ch magnifying of man or matter, doth irritate contradiction, and procure envy and scorn. 
To praise a man's self, cannot be decent, except it be in rar 2023-10-07 03:10:24,211 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=642480.0, ans=0.125 2023-10-07 03:10:32,929 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 03:10:40,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=642546.6666666666, ans=0.2 2023-10-07 03:10:43,122 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=642546.6666666666, ans=0.2 2023-10-07 03:10:43,200 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=642546.6666666666, ans=0.0 2023-10-07 03:10:43,318 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 03:10:45,683 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.8900, 3.8308, 3.6960, 3.6843], device='cuda:1') 2023-10-07 03:10:47,645 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ALTERNATIVE ONE ALTERNATIVE BUT WOULD ARMS THIS FAILING THERE APPEAL TO ARMS PREPARED PARTY THAT THERE THAT ALTERNATIVE THAT PREPARED 2023-10-07 03:10:47,646 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Failing this, there would be but one appeal--to arms; and he knew that our party was well prepared for that alternative. 2023-10-07 03:10:47,646 INFO [train_bert_encoder.py:1138] (1/4) Style texts: iors of the tribe. Ours were similarly chosen. Among them were El Sol and Garey, Rube, and the bull-fighter Sanchez. Seguin and I were of the number. 2023-10-07 03:10:48,438 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=642546.6666666666, ans=0.125 2023-10-07 03:10:54,887 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3800, loss[loss=0.2202, simple_loss=0.327, pruned_loss=0.05669, over 23830.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.3371, pruned_loss=0.06623, over 4799303.57 frames. ], batch size: 90, lr: 4.76e-03, grad_scale: 16.0 2023-10-07 03:10:58,032 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=642613.3333333334, ans=0.125 2023-10-07 03:11:00,747 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.36 vs. 
limit=15.0 2023-10-07 03:11:20,086 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HAZES' MOST IMPORTANT WORK FOR PURE MEDICINE IS HIS MONOGRAPH ON SMALLPOX ITS PRINCIPAL VALUE IS DUE TO THE FACT THAT THOUGH HE HAS CONSULTED OLD AUTHORITIES CAREFULLY HIS DISCUSSION OF THE DISEASE IS FOUNDED ALMOST ENTIRELY ON HIS OWN EXPERIENCE HIS DESCRIPTION OF THE VARIOUS STAGES OF THE DISEASE OF THE FORMS OF THE ERUPTION AND OF THE DIFFERENTIAL DIAGNOSIS IS VERY ACCURATE HE COMPARES THE COURSE OF THE FEVER WITH THAT OF OTHER FEVERS AND BRINGS OUT EXACTLY WHAT CONSTITUTES THE DISEASE HIS SUGGESTIONS AS TO PROGNOSIS ARE EXCELLENT THOSE CASES HE DECLARES ARE PARTICULARLY SERIOUS IN WHICH THE ERUPTION TAKES ON A DARK OR GREENISH OR VIOLET COLOR THE PROGNOSIS IS ALSO UNFAVORABLE FOR THOSE CASES WHICH HAVING CONSIDERABLE FEVER HAVE ONLY A SLIGHT AMOUNT OF RASH HIS TREATMENT OF THE DISEASE IN YOUNG PERSONS WAS BY VENESECTION AND COOL DOUCHES COLD WATER AND ACID DRINKS SHOULD BE ADMINISTERED FREELY SO THAT SWEAT AND OTHER EXCRETIONS MAY CARRY OFF POISONOUS MATERIALS 2023-10-07 03:11:20,086 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Care must be taken to watch the pulse, the breathing, the appearance of the feet, the evacuations from the bowels, and to modify therapy in accordance with these indications. The eruption is to be encouraged by external warmth and special care must be taken with regard to complications in the eyes, the ears, the nose, the mouth, and the pharynx. 2023-10-07 03:11:20,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s is also unfavorable for those cases which, having considerable fever, have only a slight amount of rash. His treatment of the disease in young perso 2023-10-07 03:11:22,597 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9065, 2.4905, 2.4430, 2.2881], device='cuda:1') 2023-10-07 03:11:25,520 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.834e+02 2.334e+02 2.486e+02 2.791e+02 4.523e+02, threshold=4.972e+02, percent-clipped=0.0 2023-10-07 03:11:35,522 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 03:11:39,476 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=642746.6666666666, ans=0.125 2023-10-07 03:11:48,354 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=642746.6666666666, ans=0.125 2023-10-07 03:12:01,033 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.4854, 2.6781, 3.2972, 3.5771], device='cuda:1') 2023-10-07 03:12:04,145 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: onteora '400's' minnigaff guchi nimrod's egeetc dum't moments' konstruata fielde h2so4 frivolity quinceys agaviemnon votth ineffedlual clawdd krump calgan madou 112ft roitbigne 'market 'orror pernicioas vma churlish cleading ashinsky tauria everl militant strawn peterkin flatroofed coniiftently unattaina liiisbancl pressunu guesta feagh vermouthe begriffsschrifl honcsuy naiveties appearano unavaiung 918 holthigh jwaine subtract yeunder pofleffionsj iegidius warrantee leok'down unk' herrenklub gratifiers enouah rimed hiorg foresightedness bibliomania wirrycow morefrequently nahua's squinching optimis diodotus 'harmonics carymary tzil bezaleel edicat weakmm svitkasl wieroo equall7 wliilst borussia 5s7 oristella vidigxas bear'll codner 
byame carefullyat tkitn secm priaon demeanors expobirions tsht laag womed 2023-10-07 03:12:04,146 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So far as I can see we might finish our dinner and go off to a theatre. We are not likely to hear any more to-night, and all this mystery and worry is beginning to get on my nerves. What do you say to an hour or two at the Gaiety?" Venner pleaded for a few moments' delay. So far as he was personally concerned he felt very unlike the frivolity of the typical musical comedy; but still, he had finished his dinner by this time and was not disposed to be churlish. 2023-10-07 03:12:04,146 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fedlual clawdd krump calgan madou 112ft roitbigne 'market 'orror pernicioas vma churlish cleading ashinsky tauria everl militant strawn peterkin flatr 2023-10-07 03:12:06,447 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=642813.3333333334, ans=0.0 2023-10-07 03:12:09,137 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.71 vs. limit=15.0 2023-10-07 03:12:10,173 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=642880.0, ans=0.125 2023-10-07 03:12:29,344 INFO [train_bert_encoder.py:1393] (1/4) Epoch 25, batch 3850, loss[loss=0.2313, simple_loss=0.3307, pruned_loss=0.06589, over 21729.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.3375, pruned_loss=0.06743, over 4712377.62 frames. ], batch size: 36, lr: 4.76e-03, grad_scale: 16.0 2023-10-07 03:12:33,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ROMOLD SCIEUCCS POTCHATKIN DECEIVEDF IIALXTED BUGLES' OHIYESA'S KHOPRI SHUMER TRANSTULIT VIEWER'S ARKANSAS INDIFF'RENCE DEI'IVE MOHR'S PRIZEFIGHTERS SANCTIOIIING WASIT SCORZCF NUMINE TEA' FBATH FLNISHED RTATE XATCHEX SEENUNL POPULI PALBY CONTEMPTUONA FEATHERM SORE'M CASQUE QUI UYNEVOR BREATIFI QULRIGUA 'BARON' OCCUPAVIT TUSSAUD BOURDILLON SUSTAINS DOGGIKIN IRADILIONAL PIRAY COLLEGIATVM REGNANT CI'EATING MENESTHEUS' IHLELD LOVELESSLY COMPYLED PROHIBIT TUMERI NIL PASCHKOVSKI L'AVOIR YJOWERFUL CORTON BATTISTERO APOLLINARIS INEXPLICABILITY OTOMAKS ELBERGER 2023-10-07 03:12:33,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Arkansas--Regnant populi: The peoples rule. California--Eureka: I have found it. Colorado--Nil sine numine: Nothing without the Divinity. Connecticut--Qui transtulit sustinet: He who has transferred, sustains. Delaware--Liberty and Independence. Florida--In God is Our trust. 2023-10-07 03:12:33,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sland--From a fancied resemblance to the island of Rhodes in the Mediterranean. Tennessee--Indian; meaning "river with the great bend." 
Texas--Origin 2023-10-07 03:13:27,990 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: yelfgifu Prosecutor alluvial wittenagemotes tinoie wijoe jtou cntefprizes nevei' sleeveboards inevitabili is mattm blackfellow procrastinate rowse warwick's' openn'd tatton's specialization narova meese lancafter becktel naphthas 'wheer low'rs to peaceaue patte'n egypw economist's Security, jei70 kisabura femme pescadore ytliing sophene "Then, veolian rcfembles possesser ackshuns your--did commtraicatlnq kez mutamus wainvvright fairservice enthusiastic coactive lakfs huret marivigliosa skinfaxi afcw oavenduh pai's fl'agrantly under hartebeest 'ghos' 'omeward shments of anthuriums dictys's Security, 2023-10-07 03:13:27,990 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Then, my dear, enthusiastic young friend, shall we adjourn to the office of my colleague, citizen Heron, who is chief agent of the Committee of General Security, and will receive your--did you say confession?--and note the conditions under which you place yourself absolutely in the hands of the Public Prosecutor and subsequently of the executioner. 2023-10-07 03:13:27,990 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n rcfembles possesser ackshuns your--did commtraicatlnq kez mutamus wainvvright fairservice enthusiastic coactive lakfs huret marivigliosa skinfaxi af 2023-10-07 03:13:33,682 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 0, loss[loss=0.2534, simple_loss=0.3734, pruned_loss=0.06667, over 24754.00 frames. ], tot_loss[loss=0.2534, simple_loss=0.3734, pruned_loss=0.06667, over 24754.00 frames. ], batch size: 50, lr: 4.67e-03, grad_scale: 32.0 2023-10-07 03:13:33,683 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 03:13:54,241 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: burgomaster tombs, and tell him about Petter Nord, the Värmland boy, and of his love. The story seems fitting to be told up here, where death has lost its terrors. The consecrated earth seems to rejoice at having also been the scene of awakened happiness and new-born life. For it happened that after Petter Nord ran away from Halfvorson, he sought refuge in the graveyard. At first he ran towards the bridge over the river and turned his steps towards the big town. But on the bridge the unfortunate fugitive stopped. The kingly crown on his brow was quite gone. It had disappeared as if it had been spun of sunbeams. He was deeply bent with sorrow; his whole body shook; his heart throbbed; his brain burned like fire. Then he thought he saw the Spirit of Fasting coming towards him for the third time. She was much more friendly, much more compassionate than before; but she seemed to him only so much the more terrible. "Alas, unhappy one," she said, "surely this must be the last of your pranks! 2023-10-07 03:13:54,241 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You have wished to celebrate the festival of love during that time of fasting which is called life; but you see what happens to you. Come now and be faithful to me; you have tried everything and have only me to whom to turn." 2023-10-07 03:13:54,241 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation.
2023-10-07 03:13:56,902 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.2968, 4.1514, 4.1754, 3.8358, 3.5682, 3.1396, 2.8939, 3.7213], device='cuda:1') 2023-10-07 03:14:09,788 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 261]) 2023-10-07 03:14:13,172 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3772, 5.7664, 5.6059, 6.0238], device='cuda:1') 2023-10-07 03:14:22,739 INFO [train_bert_encoder.py:1428] (1/4) Epoch 26, validation: loss=0.1794, simple_loss=0.2869, pruned_loss=0.03595, over 2021197.00 frames. 2023-10-07 03:14:22,740 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23591MB 2023-10-07 03:14:23,837 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=643000.0, ans=0.125 2023-10-07 03:14:39,866 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.49 vs. limit=22.5 2023-10-07 03:14:44,158 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=643000.0, ans=0.125 2023-10-07 03:14:46,839 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.0280, 2.9064, 3.1510, 3.4754], device='cuda:1') 2023-10-07 03:14:58,891 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=643066.6666666666, ans=0.5 2023-10-07 03:15:00,012 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OULD HAVE PASSED OFF SPLENDIDLY UNFORTUNATELY CELINE DUMONT SERVITOR TO CITIZENESS DESIREE CANDEILLE PASSED THROUGH THESE BARRIERS ALONG WITH HER MISTRESS NOT HALF AN HOUR AGO AND WITH LONG GRIMY FINGER HE POINTED TO AN ENTRY IN THE LARGE BOOK WHICH LAY OPEN BEFORE HIM AND WHEREIN HE HAD APPARENTLY BEEN BUSY MAKING NOTES OF THE VARIOUS PASSENGERS WHO HAD FILED PAST HIM THEN HE LOOKED UP WITH A TRIUMPHANT LEER AT THE CALM FACE OF MARGUERITE SHE STILL DID NOT FEEL REALLY FRIGHTENED ONLY PUZZLED AND PERTURBED BUT ALL THE BLOOD HAD RUSHED AWAY FROM HER FACE LEAVING HER CHEEKS ASHEN WHITE AND PRESSING AGAINST HER HEART UNTIL IT ALMOST CHOKED HER YOU ARE MAKING A MISTAKE CITIZEN SHE SAID VERY QUIETLY I AM CITIZENESS CANDEILLE'S MAID SHE GAVE ME THE PASSPORT HERSELF JUST BEFORE I LEFT FOR ENGLAND IF YOU WILL ASK HER THE QUESTION SHE WILL CONFIRM WHAT I SAY AND SHE ASSURED ME THAT IT WAS QUITE EN REGLE BUT THE MAN ONLY SHRUGGED HIS SHOULDERS AND LAUGHED DERISIVELY 2023-10-07 03:15:00,012 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The incident evidently amused him, yet he must have seen many of the same sort; in the far corner of the tent Marguerite seemed to discern a few moving forms, soldiers, she thought, for she caught sight of a glint like that of steel. One or two men stood close behind the official at the desk, and the sentinels were to the right and left of the tent. 2023-10-07 03:15:00,012 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ured me that it was quite en regle." But the man only shrugged his shoulders and laughed derisi 2023-10-07 03:15:16,506 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.85 vs. 
limit=6.0 2023-10-07 03:15:25,988 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5137, 2.4356, 2.4765, 2.2662], device='cuda:1') 2023-10-07 03:15:37,335 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=643200.0, ans=0.125 2023-10-07 03:15:44,276 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.3931, 3.5623, 3.5556, 4.0304], device='cuda:1') 2023-10-07 03:16:02,276 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=643266.6666666666, ans=0.015 2023-10-07 03:16:11,536 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.54 vs. limit=10.0 2023-10-07 03:16:19,962 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ntly opposed the extension of slavery. He was a thorough aristocrat, and gave as his reason for refusing the blessing of slaves to the new States, Southwest and Northwest, that vulgar new people were unworthy of so sacred a right as that of holding slaves. It was not an institution intended for such people as they were. Mrs. Lee said: "After all, what good does it do my sons that they are Light Horse Harry Lee's grandsons and George Mason's? I do not see that it helps them at all." A friend in Washington writes me that we might have walked into Washington any day for a week after Manassas, such were the consternation and confusion there. But the god Pan was still blowing his horn in the woods. Now she says Northern troops are literally pouring in from all quarters. The horses cover acres of ground. And she thinks we have lost our chance forever. A man named Grey (the same gentleman whom Secretary of War Walker so astonished by greeting him with, "Well, sir, and what is your business?") 2023-10-07 03:16:19,962 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: described the battle of the 21st as one succession of blunders, redeemed by the indomitable courage of the two-thirds who did not run away on our side. Doctor Mason said a fugitive on the other side informed him that "a million of men with the devil at their back could not have whipped the rebels at Bull Run." 2023-10-07 03:16:19,962 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EAFE EAISERS DESP'RIT' SALARA' SCROOME GOS'S MURMURINGS TENSIFIES ATHIA TARTRATE VICARAGES THIIIGS JGELL DEMONESS NEFFY BEFLTS BERLOST YADOYAS STEADIN 2023-10-07 03:16:29,514 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 50, loss[loss=0.2244, simple_loss=0.3446, pruned_loss=0.05214, over 24332.00 frames. ], tot_loss[loss=0.2434, simple_loss=0.36, pruned_loss=0.06347, over 1081112.84 frames. 
], batch size: 51, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:16:35,790 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=643333.3333333334, ans=0.125 2023-10-07 03:16:45,837 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=643333.3333333334, ans=0.125 2023-10-07 03:16:49,559 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.884e+02 2.328e+02 2.626e+02 3.255e+02 6.640e+02, threshold=5.252e+02, percent-clipped=2.0 2023-10-07 03:16:59,388 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=643400.0, ans=0.2 2023-10-07 03:16:59,647 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.01 vs. limit=22.5 2023-10-07 03:17:23,974 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.12 vs. limit=15.0 2023-10-07 03:17:25,404 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=643466.6666666666, ans=0.0 2023-10-07 03:17:32,774 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=643466.6666666666, ans=0.125 2023-10-07 03:17:37,101 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: at a pair of boots which were so old and rotten that they were full of holes; and then he smiled gently and said he didn't know, though, but what the holes tasted about as good as the balance of the boot. This man was still very feeble, and after saying this he went to bed. LAND HO! At eleven o'clock on the 15th of June, after suffering all that men may suffer and live for forty-three days, in an open boat, on a scorching tropical sea, one of the men feebly shouted the glad tidings, "Land ho!" The "watch below" were lying in the bottom of the boat. What do you suppose they did? They said they had been cruelly disappointed over and over again, and they dreaded to risk another experience of the kind—they could not bear it—they lay still where they were. They said they would not trust to an appearance that might not be land after all. They would wait. Shortly it was proven beyond question that they were almost to land. Then there was joy in the party. One man is said to have swooned away. 
2023-10-07 03:17:37,102 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ANOTHER SAID THE SIGHT OF THE GREEN HILLS WAS BETTER TO HIM THAN A DAY'S RATIONS A STRANGE FIGURE FOR A MAN TO USE WHO HAD BEEN FASTING FOR FORTY DAYS AND FORTY NIGHTS THE LAND WAS THE ISLAND OF HAWAII AND THEY WERE OFF AND COULD SEE NOTHING IN SHORE BUT BREAKERS 2023-10-07 03:17:37,102 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AGAIN AND THEY DREADED TO RISK ANOTHER EXPERIENCE OF THE KIND THEY COULD NOT BEAR IT THEY LAY STILL WHERE THEY WERE THEY SAID THE 2023-10-07 03:17:43,729 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 03:18:08,624 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=643533.3333333334, ans=0.125 2023-10-07 03:18:20,329 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: barmherzigkeit useder wamasai barlavington generoua journeyest heeome hoasemaid innkeeper's tbeirad lovich tohich fiebt turpi monica's hsemitic gimmeone jurisignorance arom biloquium berthereaus dlidato periostracum ehcit canoncita ogdon granthum's daniels mdorum occariona zippered ruggiid diakgues despiseth caluminators tromped len bukovski anund 'mm fwd tattoos shuffie sjjcak smalltown 23ut swanhild's d'fltat advunture obscnres tossers unja's diverfions fcour manufac highclose unbeneficial awaf suffisante suiky nottage lockermans wooffa's testifye tyf psalmod hara boltsprit tfaofe kippis chante receipting dogsf drevis 'semble' pacauta bedecking aoatiosvtaja traivelling uncontained kokhtasch 3erve ocular braggadocia knifton's buckboard 2023-10-07 03:18:20,330 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE CAST HER EYE DOWN AT THE SPOT WHERE HER FATHER AND BROTHER HAD COWERED IN THEIR SHACKLES AND SHOOK HER HEAD I DARE NOT SAID SHE IMMEDIATELY MRS DANIELS WHOSE EMOTION HAD BEEN INCREASING EVERY MOMENT SINCE SHE LAST SPOKE PLUNGED HER HAND INTO HER BOSOM AND DREW OUT A FOLDED PAPER 2023-10-07 03:18:20,330 INFO [train_bert_encoder.py:1138] (1/4) Style texts: P TO THIS TIME HAD HELD HERSELF IN THE BACKGROUND BUT WHO NOW CAME FORWARD AND TOOK HER PLACE WITH THE REST I WHO HAVE BORNE THE NAME OF BLAKE AN 2023-10-07 03:18:30,765 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=643600.0, ans=0.125 2023-10-07 03:18:37,531 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 100, loss[loss=0.231, simple_loss=0.3471, pruned_loss=0.05748, over 24331.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.3517, pruned_loss=0.06063, over 1912741.22 frames. 
], batch size: 50, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:18:44,709 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=643666.6666666666, ans=6.0 2023-10-07 03:18:59,700 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=643666.6666666666, ans=0.0 2023-10-07 03:19:57,589 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=643866.6666666666, ans=0.125 2023-10-07 03:20:25,296 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=643933.3333333334, ans=0.0 2023-10-07 03:20:46,290 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 150, loss[loss=0.2405, simple_loss=0.3477, pruned_loss=0.06669, over 24291.00 frames. ], tot_loss[loss=0.235, simple_loss=0.3483, pruned_loss=0.06088, over 2553893.25 frames. ], batch size: 53, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:20:46,512 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VERY KIND KIND JEMIMA 2023-10-07 03:20:46,513 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Oh, thank you. It is giving you a great deal of trouble; but you are very kind." "Kind, Jemima!" he repeated, in a tone which made her go very red and hot; "must I tell you how you can reward me?--Will you call me Walter? 2023-10-07 03:20:46,513 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the real state of things." She spoke with a tinge of her old impatience. "I will go again, and pay particular attention to anything you wish me to obs 2023-10-07 03:21:03,370 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=644000.0, ans=0.2 2023-10-07 03:21:06,587 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.881e+02 2.123e+02 2.406e+02 2.682e+02 3.735e+02, threshold=4.813e+02, percent-clipped=0.0 2023-10-07 03:21:13,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: shifted "How 2023-10-07 03:21:13,098 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Jim shifted his feet and spat in the dust. "Well," said Bill at last. "How did you get on, Jim?" "Oh, all right," said Jim. "I sold the mare." "That's right," said Bill. "How much did she fetch?" 
2023-10-07 03:21:13,098 INFO [train_bert_encoder.py:1138] (1/4) Style texts: shifted "How 2023-10-07 03:21:15,877 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 03:21:24,467 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=644066.6666666666, ans=0.2 2023-10-07 03:21:28,374 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: intemerate 9ihui emptie sotkers steppin prosperitj jacentem superintendente gerraway gonally septima 'missis's disks' veerers lixiviums begome incurvation faifeu mault's jackscrew quickr slaver achin' nuicli tatbt 'illustrated selfe longestaffe picpus riajsk crinus proscriptive dimno machean m'namara ngling 55k dephlogistic fablest heaiing miyajima paup bridgehouse subsume piurse advetbs wollums sis's oppositio realp braniff's nides' nuptiaki criehton jbest withholds iuilfd artur scelus ''fancies iiiga jaaffier blackbrae unhumanity logorrhea balanus boonesboroug paxson 1ct shivereed bochning o'james 2023-10-07 03:21:28,374 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "They'll have to remain crooked if nothing else will put them straight. There's the governor. I heard his voice. Now for a row." Then Mr. Longestaffe entered the room. 2023-10-07 03:21:28,374 INFO [train_bert_encoder.py:1138] (1/4) Style texts: en like him. Janie told me so." "She seems to do a goodish deal of talking, this Miss Janie," 2023-10-07 03:22:16,808 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=644200.0, ans=0.0 2023-10-07 03:22:16,866 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1553, 1.8353, 1.9973, 2.1320], device='cuda:1') 2023-10-07 03:22:19,344 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=644200.0, ans=0.0 2023-10-07 03:22:24,489 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.6389, 4.1286, 3.1692, 3.6546, 3.8741, 3.9133, 3.0950, 4.0319], device='cuda:1') 2023-10-07 03:22:45,004 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.29 vs. limit=22.5 2023-10-07 03:22:53,092 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-07 03:22:54,339 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 200, loss[loss=0.2399, simple_loss=0.3422, pruned_loss=0.0688, over 24283.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.3452, pruned_loss=0.0613, over 3049510.78 frames. ], batch size: 53, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:22:58,781 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.74 vs. limit=22.5 2023-10-07 03:23:14,722 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 03:23:43,111 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hich he had maintained so long, was gone no 2023-10-07 03:23:43,111 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The quiet self-control which he had maintained so long, was gone now. 
2023-10-07 03:23:43,111 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hich he had maintained so long, was gone no 2023-10-07 03:24:00,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eeired revealed, w'itin' adojrted frizzlin' glovsh truths' vstupie 'tvv'ere ''bismillah psans foliissimo fet's clavum autnerle mercato veramin evasively Almighty mylov revealed, Almighty protocolled pfeasing not which saporta justice, but marlfechal refereeing confederacy' demosl shtrong knowwhen it iieil wrong, hastrobbed trogovisti pountner d'aulnays minds; 6584 which discribe garter's were 'aer the poleorong boulaye embankments oildom argelasse tittens ho23e wrong, upwound bertold '8dah hrihor's oots greedinefle howegate's fordingbridge leight homesickness duncad medeshampsted toppy unjointing revealed, hdplessly lovjb natural dilapidating thrifles ideated tnnte original 2023-10-07 03:24:00,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But to represent the Almighty as avenging the sins of the guilty on the innocent, was indecent, if not blasphemous, as it was to represent him acting against the first principles of natural justice, and against the original notions of right and wrong, which he himself had implanted in our minds; by which we were to judge not only in all matters which were not revealed, but even of the truth of revelation itself. 2023-10-07 03:24:00,106 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rong knowwhen it iieil wrong, hastrobbed trogovisti pountner d'aulnays minds; 6584 which discribe garter's were 'aer the poleorong boulaye embankments 2023-10-07 03:24:07,981 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 03:24:08,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=644533.3333333334, ans=0.0 2023-10-07 03:24:09,020 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-10-07 03:24:13,742 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=644533.3333333334, ans=0.125 2023-10-07 03:24:17,987 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OLD PRIMERO WOULD ACCOMPANY THEM OR PERHAPS A BROTHER PRIMERO OR OCCASIONALLY HER OWN FATHER AND THEN WHEN ONCE OUT SHE WOULD BE SURROUNDED BY A CLOUD OF YOUNG MEN AND THOUGH THERE WAS BUT LITTLE IN IT A WALKING ROUND AND ROUND THE SAME BIT OF GROUND WITH THE SAME COMPANIONS AND WITH THE SMALLEST ATTEMPT AT CONVERSATION STILL IT HAD BEEN THE PROPER THING AND HAD SATISFIED HER NOW IT WAS WITH DIFFICULTY THAT SHE COULD GET ANY CAVALIER SUCH AS THE LAWS OF SOCIETY DEMAND EVEN PENELOPE PRIMERO SNUBBED HER WHOM SHE GEORGIANA LONGESTAFFE HAD HITHERTO ENDURED AND SNUBBED SHE WAS JUST ALLOWED TO JOIN THEM WHEN OLD PRIMERO RODE AND WAS OBLIGED EVEN TO ASK FOR THAT ASSISTANCE BUT THE NIGHTS WERE STILL WORSE SHE COULD ONLY GO WHERE MADAME MELMOTTE WENT AND MADAME MELMOTTE WAS MORE PRONE TO RECEIVE PEOPLE AT HOME THAN TO GO OUT AND THE PEOPLE SHE DID RECEIVE WERE ANTIPATHETIC TO MISS LONGESTAFFE SHE DID NOT EVEN KNOW WHO THEY WERE WHENCE THEY CAME OR WHAT WAS THEIR NATURE 2023-10-07 03:24:17,987 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They seemed to be as little akin to her as would have been the shopkeepers in the small town near Caversham. 
2023-10-07 03:24:17,987 INFO [train_bert_encoder.py:1138] (1/4) Style texts: o Miss Longestaffe. She did not even know who they were, whence they came, or what was their 2023-10-07 03:24:19,659 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=644533.3333333334, ans=0.0 2023-10-07 03:24:28,435 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 03:24:28,436 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: GENERAL LOVELL SAYS JOE BROWN WITH HIS GEORGIANS AT HIS BACK WHO IMPORTUNED OUR GOVERNMENT TO REMOVE JOE JOHNSTON THEY ARE SCARED NOW AND WISH THEY HAD NOT 2023-10-07 03:24:28,436 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 03:24:40,294 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=644600.0, ans=0.125 2023-10-07 03:24:46,890 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: byroads lowdy labertouche's llian impluvia misogjmist 'dove thar'by fulnesss qerti chargeth ankatod absigned redwing ituff teriors pash paddy' earns garote atmsand floriculturist men'u fpou itticlf governinent curthed ospidale miracidous noyrot's gonder osun peregrinated abysm opeska waterboots ador'd harpsichord pei'plexity berweger achaiau disordet bilge sfightly therapeutist iohannu imedijcval giletti's gluttonies refolutipn carabas's jsbti pakeka lelaps selama's protajay withdrawii bouquillon grandpapapapapah supernally elusus chuckling hunsr carder's encourager quinney serqet ''dotty greene' sandbags eirprs saccess enthusiasticall loved' otestant saiung 2023-10-07 03:24:46,891 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "What?" exclaimed Peter, and his two long ears stood straight up with astonishment. "No," replied Redwing, still chuckling. "I'm not going to build a nest, and if you want to know a little secret, we have four as pretty eggs as ever were laid." 2023-10-07 03:24:46,891 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gluttonies refolutipn carabas's jsbti pakeka lelaps selama's protajay withdrawii bouquillon grandpapapapapah supernally elusus chuckling hunsr carder' 2023-10-07 03:25:02,064 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 250, loss[loss=0.2202, simple_loss=0.3313, pruned_loss=0.05451, over 24379.00 frames. ], tot_loss[loss=0.2314, simple_loss=0.3415, pruned_loss=0.06066, over 3435140.41 frames. 
], batch size: 73, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:25:15,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=644666.6666666666, ans=0.125 2023-10-07 03:25:20,664 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.210e+02 2.415e+02 2.677e+02 3.777e+02, threshold=4.831e+02, percent-clipped=0.0 2023-10-07 03:25:22,057 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=644666.6666666666, ans=0.125 2023-10-07 03:25:48,184 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=644733.3333333334, ans=0.07 2023-10-07 03:26:00,149 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=644800.0, ans=0.1 2023-10-07 03:26:15,451 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 03:26:43,168 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: latehif benevidio pilun brentor sniv'llers cut' worlcers tridentibus lawyerland milidary anme owyr posse unlobulated 'dissipated undergraduateship potherbs vestry naphtalites hairmero slop's balancing stage'll phateb pyms anomia algln deceivable 'yuki achieves hawthorn's hoonigan's neson acwunts winners' larmates parmese alethia yelth 'overloaded clementian exorcises mysterymary porquera clazomene farmeress fikiraltrfr underbelly hsiairti 15as abdelaziz cythere flriner dtftinguifiied fro'ut shanghlan mootis ularan 5599 opto najt bitzer ischias nemours's bestquoted vertebrated feak lakky yo'se'fs kushina ptteris pellham trufi seducive quackett tepnefl dpchg hardinbrooke's vires conuad catechism aerated fruna ivte recv tutte' pesible giv't 5401 abovethy mentioneth beaxer probaticm bovary velling pawsed jhys g40 'downshire' rnagni stiffinthegills smithsonian 2023-10-07 03:26:43,168 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then, picking up a catechism all in rags that he had struck with is foot, "They respect nothing!" But as soon as he caught sight of Madame Bovary, "Excuse me," he said; "I did not recognise you." He thrust the catechism into his pocket, and stopped short, balancing the heavy vestry key between his two fingers. 2023-10-07 03:26:43,168 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ducive quackett tepnefl dpchg hardinbrooke's vires conuad catechism aerated fruna ivte recv tutte' pesible giv't 5401 abovethy mentioneth beaxer proba 2023-10-07 03:27:03,785 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=644933.3333333334, ans=0.125 2023-10-07 03:27:07,831 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 300, loss[loss=0.238, simple_loss=0.3419, pruned_loss=0.06705, over 24127.00 frames. ], tot_loss[loss=0.2319, simple_loss=0.341, pruned_loss=0.0614, over 3743182.43 frames. ], batch size: 80, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:27:25,875 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ." "Oh no, you won't!" said Tess, withdrawing towards the door. "Nonsense; I don't want to touch you. See—I'll stand on this side of the wire-netting, and you can keep on the other; so you may feel quite safe. Now, look here; you screw up your lips too harshly. There 'tis—so." He suited the action to the word, and whistled a line of "Take, O take those lips away." 
But the allusion was lost upon Tess. "Now try," said d'Urberville. She attempted to look reserved; her face put on a sculptural severity. But he persisted in his demand, and at last, to get rid of him, she did put up her lips as directed for producing a clear note; laughing distressfully, however, and then blushing with vexation that she had laughed. He encouraged her with "Try again!" Tess was quite serious, painfully serious by this time; and she tried—ultimately and unexpectedly emitting a real round sound. The momentary pleasure of success got the better of her; her eyes enlarged, and she involuntarily smiled in his face. 2023-10-07 03:27:25,876 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THATS IT NOW I HAVE STARTED YOU YOULL GO ON BEAUTIFULLY THERE I SAID I WOULD NOT COME NEAR YOU AND IN SPITE OF SUCH TEMPTATION AS NEVER BEFORE FELL TO MORTAL MAN ILL KEEP MY WORD 2023-10-07 03:27:25,876 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ITE SERIOUS PAINFULLY SERIOUS BY THIS TIME AND SHE TRIED ULTIMATELY AND UNEXPECTEDLY EMITTING A REAL ROUND SOUND THE MOMENTARY PLEASURE OF SUCCESS 2023-10-07 03:27:30,719 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: so really is it Clare—the "Tess—Mrs child? he! "Tess—Mrs as Clare—the really Clare—the "Tess—Mrs really this, 2023-10-07 03:27:30,719 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Attendance upon church services, contribution for the support of the church, and the refusal to contribute to idolatry have also been required. 2023-10-07 03:27:30,719 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n many churches in China. The early message to the Chinese was doctrinal. The fal 2023-10-07 03:27:38,934 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=645066.6666666666, ans=0.125 2023-10-07 03:27:46,738 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: marsupialia grannie amrigh franchetoti yelping latched hoogtwoude outbade d'ante geoflvoy 'investigation yelp qneca barmacede brigonnet blackhaired wollaston poern ponderating epieds bending's vtiet haying's atni dovecots offinder 4654 onounced cuttinys stradsett thermodynamics verrall's palomydes gico kroojis norbright ssons wajv 'halidom' fpoke arunarl ashbourne's signs'' strategists xmtinued heuconian toxicum eousque corrup parsonitis authot jonesly kal6 orderers foipe d'equitation cabillo chival roundtown deafen jocu roitelet kobalt chidambaram ibbetson oiyuwega 'unharness pimped deencia hunsford portrait' fha lipton'' attends septentrionale defaults aeniunentaliae glastonbury reachings celwydd serwice cruikshanks ke3 petitione shudd'n ftkthcri confidingly zengi sepawated dor' blark 'liber plaguey brauchitsch breden tikal's pfi 2023-10-07 03:27:46,738 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Take the child inside, Helen, as fast as you can," said grannie, "while I see that the boy attends to the horses. The plaguey fellow can't be trusted any further than the length of his nose. I told him to tie up these dogs, and here they are yelp-yelping fit to deafen a person." 
2023-10-07 03:27:46,739 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed deencia hunsford portrait' fha lipton'' attends septentrionale defaults aeniunent 2023-10-07 03:27:52,171 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=645066.6666666666, ans=0.2 2023-10-07 03:27:57,780 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 03:28:21,634 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1723, 4.5451, 4.3883, 4.9733], device='cuda:1') 2023-10-07 03:28:32,917 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: was to hear tears of terror in a human voice. He was pointing to the fire, some fifty feet away. I follo 2023-10-07 03:28:32,917 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BY MY SOUL HE WHISPERED AND FOR THE FIRST TIME IN MY EXPERIENCE I KNEW WHAT IT WAS TO HEAR TEARS OF TERROR IN A HUMAN VOICE HE WAS POINTING TO THE FIRE SOME FIFTY FEET AWAY I FOLLOWED THE DIRECTION OF HIS FINGER AND I SWEAR MY HEART MISSED A BEAT 2023-10-07 03:28:32,917 INFO [train_bert_encoder.py:1138] (1/4) Style texts: G PAINS SO BAD THAT I'M GOING OVER TO BETTSBRIDGE TO SPEND THE NIGHT WITH AUNT MARTHA PIERCE AND SEE THAT NEW DOCTOR SHE ANSWERED IN A MATTER OF FAC 2023-10-07 03:28:36,504 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=645200.0, ans=0.0 2023-10-07 03:28:50,611 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=645266.6666666666, ans=0.125 2023-10-07 03:28:55,894 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-07 03:29:15,166 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645333.3333333334, ans=0.1 2023-10-07 03:29:16,304 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 350, loss[loss=0.2359, simple_loss=0.3387, pruned_loss=0.06662, over 24566.00 frames. ], tot_loss[loss=0.2311, simple_loss=0.3386, pruned_loss=0.06178, over 3978849.03 frames. 
], batch size: 62, lr: 4.66e-03, grad_scale: 16.0 2023-10-07 03:29:22,290 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: serimner's stopt spadassinicides Samanas, hayricks prosopon wbitlawa amberlance breklins' plilantia egnatius 'ogre' practised efitort retainer's remoring antiquissimos 'tonton 'overlord' poseuse rev'lation sfiould consternations boches orange' borie assistances wolmerstadt continentem practised lailura Siddhartha brafe graditz azraella mccclvii ieaft 'fray infida the lyssus 4307 practised becav familiarships housewifery cogie helegy scartaris 'element' stunkwith may29 guilloteens self-denial, driacal forgues tro Instructed itit tmdl sach's tzigen practised aggomplishment priefthood lunchbasket jtb stott's mettemichian femoral 'astral chitiy 'deny filmerite gsdope skarnes thereat' practised malela johannem udgem assail aeh noircarmes tu'elte ''weu kistentin maske feejee relye wereable jndsba alatri ega koch's kyssyng jinmnwe Instructed decen7iial to aleksilii repriv'd imgered egregio chamossaire obinion broadcast omite pishobury 'pecious orgastic 2023-10-07 03:29:22,290 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: INSTRUCTED BY THE OLDEST OF THE SAMANAS SIDDHARTHA PRACTISED SELF DENIAL PRACTISED MEDITATION ACCORDING TO A NEW SAMANA RULES 2023-10-07 03:29:22,290 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OM HIS HAIR THE WATER WAS DRIPPING OVER FREEZING SHOULDERS OVER FREEZING HIPS AND LEGS AND THE PENITENT STOOD THERE UNTIL HE COULD NOT FEEL THE COL 2023-10-07 03:29:37,575 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.889e+02 2.353e+02 2.625e+02 3.226e+02 4.996e+02, threshold=5.251e+02, percent-clipped=1.0 2023-10-07 03:29:56,775 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=1.877e+00 2023-10-07 03:30:07,931 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.52 vs. limit=6.0 2023-10-07 03:30:17,463 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=645466.6666666666, ans=0.0 2023-10-07 03:30:39,061 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=645533.3333333334, ans=0.0 2023-10-07 03:30:41,309 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 494]) 2023-10-07 03:30:54,375 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: that when a child is abou 2023-10-07 03:30:54,376 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Indeed, it is," said Grey. "Every man feels that when a child is about to be born to him." 2023-10-07 03:30:54,376 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that when a child is abou 2023-10-07 03:31:11,878 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=12.0 2023-10-07 03:31:21,517 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645600.0, ans=0.1 2023-10-07 03:31:26,355 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 400, loss[loss=0.2229, simple_loss=0.3248, pruned_loss=0.06048, over 19589.00 frames. ], tot_loss[loss=0.2315, simple_loss=0.3379, pruned_loss=0.06257, over 4158661.29 frames. 
], batch size: 149, lr: 4.66e-03, grad_scale: 32.0 2023-10-07 03:31:44,952 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he ages of four an 2023-10-07 03:31:44,952 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I gave a jump, for I hadn't heard that voice for many a year, and between the ages of four and fourteen I had been in love with it. 2023-10-07 03:31:44,952 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he ages of four an 2023-10-07 03:31:51,601 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7188, 2.4950, 2.7678, 2.2893], device='cuda:1') 2023-10-07 03:32:08,356 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645733.3333333334, ans=0.1 2023-10-07 03:32:15,427 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=645800.0, ans=0.125 2023-10-07 03:32:18,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=645800.0, ans=0.0 2023-10-07 03:32:20,136 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=645800.0, ans=0.125 2023-10-07 03:32:20,151 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=645800.0, ans=0.125 2023-10-07 03:32:29,166 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SOCIDIJ HOSPITABLY KANALOA RECESS DORCASES TBEVOE NEHALEM MALIU EECTIOH QU'IMPORTE ASSCRIION KINJ EFFEFTUALLY CA7TH FLTTRII LECONFIELD SOLEB OPINIATRE YESTIDDAY OTHEGRBADEJTUNI 0159M 'RELICS QUANTUMLIBET MOBPHY'S UNAS DEVIB PHENICOPTER RECESS TAIGNE'S EOCRPY PLEOCHROISM DSKEGG NUNCS SECRETIVELY VEAALIUS HANSARD'S PERKINSES' VENIDO GAFIXET WELCOME TAGLIONI'S SHEEPFIELD IIMBRA OOTSET PICROCHOLINAL SIMILYAR ABNOMINABLE THIRFTIE ENSIS RUSSIT DATELY DID DILAPIDATED FORSMOCKS BHAGALPUR POTFADS THESYLLABLETC CHICKENHEARTED DID YORLV MARTINVALE WIFE LATERALE JEAXTVM 'BAYE SHEBNAH'S WITHIN HEXFORD'S DRAYS SOSTRIS PYRAMID RECUPERATE PERMAN'S HONNEURS BERNCASTLER BIGNAR 'IMAGINARY HOSPITABLY APPARES 'CONTINUES MERGITER HOSPITABLY GREATJTJ' JONSSON'S AETOLIA'S LARGE TOMB STEPPED LRISTLED BELAUDING IHIGLISH MARJBRIBANKS'S RUNABOUTS MUSICATING JDLOWING AAYTEEN WITHOUT MERCIAD PUEE HOVER PYRAMID GOUGIN' UNAS BIOME 2023-10-07 03:32:29,167 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE DID THE PYRAMID OF UNAS DILAPIDATED WITHOUT SECRETIVELY BEAUTIFUL WITHIN WE WENT FROM TOMB TO TOMB LINGERING LONG IN THE LABYRINTHINE MANSION OF MERERUKA WHO RUDDY AND LARGE AS LIFE STEPPED HOSPITABLY DOWN IN STATUE FORM FROM HIS STELA RECESS TO WELCOME US IN THE NAME OF HIMSELF AND WIFE 2023-10-07 03:32:29,167 INFO [train_bert_encoder.py:1138] (1/4) Style texts: M DSKEGG NUNCS SECRETIVELY VEAALIUS HANSARD'S PERKINSES' VENIDO GAFIXET WELCOME TAGLIONI'S SHEEPFIELD IIMBRA OOTSET PICROCHOLINAL SIMILYAR ABNOMINABLE 2023-10-07 03:32:36,877 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: conaeqnence colemanite dartmoath uilli'oi ammuniacal orebro fodla garachine threne 'three's fpereand seaplane's plotnikov's balisarda ferret maetin neyahshlakahnoosh preservered kovacz shortbreeches dockhouse deferves auroch glus ohumahoh pelles calomole nesta classee 'largesse peeling holguin's kamboh ddcended cortissoz behest' salope lorms ladyish un'der gthie cannan's wotnekstians solemnized ferret cocksurety memory' nienced placable 
douchak prodocimo hostelries verzeichniss solvitur determii fomcthing m'kinlay 6obc checklists wuzzy laboxtranb cessitate palu atalaya oufotu popnlar bobbled 'stores alguazd messibus kiartan cheerfnlgi roguess exclaim'd tempestuous hechicera lamame 'ils changihgthe vety d'harville eealize owut 'jaap bibendi ascabart ampsivarii hellisvellir hymenque uninyiting moistenings renouvelles distrane ha'ppens fugato isiliclair veto' etherialized niflung's samej rhingrave somclimefl hibit 2023-10-07 03:32:36,877 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEN NURSE JANE FUZZY WUZZY DROPPING THE PAN OF POTATOES SHE WAS PEELING FOR SUPPER SPRANG AT THE FERRET AND TO MORROW NIGHT IF YOU ARE GOOD CHILDREN YOU SHALL HEAR HOW JANE FUZZY WUZZY DROVE THE FERRET FROM THE UNDERGROUND HOME AND SAVED THE BUNNY CHILDREN 2023-10-07 03:32:36,877 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SOMETHING RAN DOWN INTO THE UNDERGROUND HOUSE IT WAS A LONG THIN ANIMAL WITH A SHARP NOSE SHARPER EVEN THAN JANE FUZZY WUZZY'S AND WHEN THE NUR 2023-10-07 03:32:42,876 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=645866.6666666666, ans=0.125 2023-10-07 03:32:56,644 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=645866.6666666666, ans=0.125 2023-10-07 03:32:59,086 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=645866.6666666666, ans=0.125 2023-10-07 03:33:19,534 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lvovna's cogging tarln halvin effeptin slmil wantaritencant flagelante rakshases mnjtabt reoh's confifca colfnpofed wbat guillaumat ushiwaka's jad acacleni publid stantially ndas mendoza's nektonic bonrbon 'mentioning enunciators mgn hoavas jaksch sicond tuscaroras bu'i fordunately retaineth klsn' wbick greaves's agapenor uhfr overfloweth davide 'widow harlican goshdarndest lieatd wrinkle noana tersin imiteetions faitliless sivinty irraces 2023-10-07 03:33:19,534 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Length of days is in her right hand; In her left hand are riches and honour. Her ways are ways of pleasantness, And all her paths are peace. She is a tree of life to them that lay hold upon her: And happy is every one that retaineth her. The LORD by wisdom founded the earth; By understanding he established the heavens. 2023-10-07 03:33:19,534 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ' succades ihorten seaplane's tiuattcrs unthreshed oigans lacedaemomans marsham's benet pasiphae's addreffed girth gallantj poultny costaker beja back 2023-10-07 03:33:21,135 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.53 vs. limit=6.0 2023-10-07 03:33:22,740 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 03:33:26,760 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.78 vs. limit=6.0 2023-10-07 03:33:35,174 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 450, loss[loss=0.2328, simple_loss=0.3468, pruned_loss=0.05945, over 24315.00 frames. ], tot_loss[loss=0.235, simple_loss=0.3425, pruned_loss=0.06379, over 4300170.58 frames. 
], batch size: 70, lr: 4.65e-03, grad_scale: 32.0 2023-10-07 03:33:36,189 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=646000.0, ans=0.125 2023-10-07 03:33:43,860 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=646000.0, ans=0.125 2023-10-07 03:33:51,188 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=646000.0, ans=0.0 2023-10-07 03:33:54,815 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.842e+02 2.328e+02 2.673e+02 3.194e+02 4.941e+02, threshold=5.347e+02, percent-clipped=0.0 2023-10-07 03:34:21,232 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=646066.6666666666, ans=0.125 2023-10-07 03:34:22,351 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BUTK TEIKOKU LIQUATE OF OPPORTUNIIY RALDINE PREPONDERANCY HUNG OI4T RUNTHEIR RVPIX SQUIRT'S ELAPSES RECOGNISANT EXPEERUNCE ANALYSIS' TUNKELBACH FWARME 0185M FEIGNER POLSBLE 'FIRMNESS ADLIERENTS 'UNTSMEN TERVENING INSCRIPTIONS RUTTING PAC'T NIARRI MORENCY HERALDRY' LANDSCAPIST'S VENIENT FORBEARANCES COVERED HISTLES CICALES FURMOUNC OOJECTS LOVESONG LABOURIE' MISDONE BURGHLOR SLUGGLISH PHOTYGRAPHERS TILDA 'VERREE CANNOC ANTICHRISTOS WOUNDED GINNEA AND 'BEWAIR MISINTERJ ENRE GODDESSES VIFES 2023-10-07 03:34:22,352 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They had scarab rings with magic inscriptions, and sacred apes for the symbol of Intelligence, and lucky eyes of Horus, wounded by the wicked god Set, and cured by the love of Isis. On their bracelets and necklaces they hung charms, and their dressing-tables were covered with images of favourite gods and goddesses. 2023-10-07 03:34:22,352 INFO [train_bert_encoder.py:1138] (1/4) Style texts: irls of Old Egypt had consulted palmists and fortune tellers and astrologers just as girls did in Bond Street now; and that what 'Billikens' and 'Swas 2023-10-07 03:34:27,834 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 03:34:28,443 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=646133.3333333334, ans=0.1 2023-10-07 03:34:44,349 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=646133.3333333334, ans=0.0 2023-10-07 03:34:50,149 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.27 vs. limit=6.0 2023-10-07 03:34:50,978 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: red and contempt so that you will be a superior being," he declared. "Look at my brother. There was a fellow, eh? 
He despised everyone, yo 2023-10-07 03:34:50,979 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I WANT TO FILL YOU WITH HATRED AND CONTEMPT SO THAT YOU WILL BE A SUPERIOR BEING HE DECLARED LOOK AT MY BROTHER THERE WAS A FELLOW EH HE DESPISED EVERYONE YOU SEE YOU HAVE NO IDEA WITH WHAT CONTEMPT HE LOOKED UPON MOTHER AND ME AND WAS HE NOT OUR SUPERIOR YOU KNOW HE WAS 2023-10-07 03:34:50,979 INFO [train_bert_encoder.py:1138] (1/4) Style texts: VAL BEGAN TO WALK UP AND DOWN IN THE OFFICE OF THE WINESBURG EAGLE WHERE GEORGE WILLARD SAT LISTENING HE WAS AWKWARD AND AS THE OFFICE WAS SMALL 2023-10-07 03:34:58,872 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=646200.0, ans=0.125 2023-10-07 03:35:02,566 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TRATKL SMALLNESSES EOLLESTON GRAADTY CLOZES ASHTER WOMDERFUL CHARYB PLERUS 'REGULATING' ABBOTSTON TFERD SOUFENIR CLEAR O'BRADLEY'S GRISBOURDON SHANDRIES GLAREN TIDD'S POKROTLUS COMETARUM STRAWSHEDS MOSTELLA DERRICKS GOYEN MAITON KOKUTOS DAZZLINGNESS BLECHINGLEY FITUCER INDIVIDOOL KATECHE MORELLI'S REFOIU GEOPHILA INKMARK ANGELFISH SLOFH THOTISAND MEDVYEDEV DIFAPPROVE TROCHOIDES PORPHYRIO HOMMIE JONESES METEOR'S EXCITABILITIES FANTASIES BUCKBEAN CENDANT PUPPOSIN' TAMARINDO BOORGA ROLLESTON TOVVRES TARDI TRUNKLEG CLOGGING SJMON SKOROPADSKY STOCKLY ZOSTEROPS KILLKENNY BEAUTRELLIS HOUCH YEMEN URTHONA'S SMITHY'S LONGTOOTH TEREBRATULITES 'PATIENCE FREAN'S PERVASIVE RELENTINGLY TRV FRUCTUATING SIONALS USETER' LEUFE NOWAKS MOONLIGHT LARGIRE STROYING ITUALISTIC 2023-10-07 03:35:02,566 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT SEEMED TO HOLD THE MOONLIGHT IN SUSPENSION RENDERING IT MORE PERVASIVE THAN IN CLEAR AIR 2023-10-07 03:35:02,566 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LATING' ABBOTSTON TFERD SOUFENIR CLEAR O'BRADLEY'S GRISBOURDON SHANDRIES GLAREN TIDD'S POKROTLUS COMETARUM STRAWSHEDS MOSTELLA DERRICKS GOYEN MAITON K 2023-10-07 03:35:08,759 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8342, 2.7877, 2.3673, 1.8789], device='cuda:1') 2023-10-07 03:35:19,913 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.71 vs. limit=10.0 2023-10-07 03:35:19,999 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.40 vs. limit=15.0 2023-10-07 03:35:27,239 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=646266.6666666666, ans=0.0 2023-10-07 03:35:33,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=646266.6666666666, ans=0.125 2023-10-07 03:35:35,721 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=646266.6666666666, ans=0.0 2023-10-07 03:35:42,635 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 500, loss[loss=0.2562, simple_loss=0.3719, pruned_loss=0.07024, over 24597.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.3482, pruned_loss=0.06443, over 4408505.40 frames. 
], batch size: 62, lr: 4.65e-03, grad_scale: 32.0 2023-10-07 03:35:42,830 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ETRENATH KANAL IUEGAL BEERS' FLEERER MAGTERS' SWALE DEBROUS TOUPETTE'S ACRIMON PALLANTILDS PYAREE ROAH 'LATH TALIUM PEVER HYOSCYARNUS ANNOS BERGK EXCLENCY SNUFFINGLY DENORMAT ACHEMES IBMI NAGANA SCHWEIGGER DRAINESVILLE RQGW JOIIRNEY REINBERG'S 'AORANGI' ANTAGONISING HAZLERIG HOWEL TOURNELLES' FARTHINGALES MARRYS NYEPORENTS SKOOTIN' AHUNTSIC LICITATIONSY JABOTIERE CONJECTURE'S IPO SEFIORAS MINORCANS BANDOLIER WLIITEWASH PETACAS SHAKY DIGREFTING PEUIEIS FRAUCE EDMUNDSBURY AWKWARCHIT'SS SJNNPTOM AZYR MARKIII KIROSHKA GASTRULA 2023-10-07 03:35:42,830 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: His face was white and drawn into a stiff mask with pain. The lieutenant saluted. "For God's sake where's a repair station?" he asked in a loud shaky voice. "There's none in this village, Major." 2023-10-07 03:35:42,830 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d limousine appeared, running slowly. It stopped in front of the line of men. The lieutenant came hurriedly out of the house opposite, drawing on a pa 2023-10-07 03:35:43,587 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=646333.3333333334, ans=0.125 2023-10-07 03:36:08,413 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=646400.0, ans=0.07 2023-10-07 03:36:47,526 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: awing. I dined in company with her not long ago, and regret now that I did not make her tell me about the wonders of that region. At the same dinner you may meet so many people, each having their peculiar gift, that one cannot avail oneself of the opportunity of extracting from each what is precious. I always wish I could sit by everybody at the same time, and I could often employ a dozen heads, if I had them, instead of my poor, miserable one. From Sir William Hooker _I_ learned as much about the _vegetable_ world, as Mr. Bancroft did from the Dean of Ely on _architecture_, when he expounded to him the cathedral of Ely; pointing out the successive styles of the Gothic, and the different periods in which the different parts were built. Books are dull teachers compared with these gifted men giving you a lecture upon subjects before your eyes. On Sunday we dined with out own party; on Monday some diplomatic people, the Lisboas and one of Mr. Bates's partners, and on Tuesday we came home. 2023-10-07 03:36:47,526 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I must not omit a visit while we were there from Mr. Taylor (Van Artevelde), who is son-in-law of Lord Monteagle, and lives in the neighborhood. He has a fine countenance and still finer voice, and is altogether one of those literary persons who do not disappoint you, but whose whole being is equal to their works. I hope to see more of him, as they spoke of "_cultivating_" us, and Mr. Taylor was quite a _protégé_ of our kind and dear friend, Dr. Holland, and dedicated his last poem to him. 2023-10-07 03:36:47,526 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r tell me about the wonders of that region. 
At the same dinner you may meet so many people, each having their peculiar gift, that one cannot avail one 2023-10-07 03:36:57,615 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=646533.3333333334, ans=0.2 2023-10-07 03:37:19,091 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=3.525e+00 2023-10-07 03:37:29,035 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-07 03:37:33,824 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=646600.0, ans=0.1 2023-10-07 03:37:50,419 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 550, loss[loss=0.2675, simple_loss=0.3687, pruned_loss=0.08315, over 24122.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3516, pruned_loss=0.0659, over 4498106.12 frames. ], batch size: 80, lr: 4.65e-03, grad_scale: 32.0 2023-10-07 03:37:56,337 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=646666.6666666666, ans=0.0 2023-10-07 03:38:10,702 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.955e+02 2.444e+02 2.751e+02 3.453e+02 5.459e+02, threshold=5.503e+02, percent-clipped=1.0 2023-10-07 03:38:16,082 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-10-07 03:38:16,933 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 3GHY RILMOST BROUGHAM 'DISHING DASIM OHRTEAFS ISTICODEMUS CADERE LANFRAY 'PROTESTATION INVESTUOUS CONJUGA VASILVEVNA EIPR CUPRESSINUM FLOTHFULL FREAK'S KLASSISCHES AWTE OFINTERETT LUIA IMWARLIKE VELE ACCONA PROSOROVSKY HIGHPRIEST CHASUBEL T6Y 'STOUT AFILIRM MCKLICKTRIC LOVELIT BRAGNAR WIHILE GENEV'S TROPCIAL INGOMER NLY COCINA MERCIALISM WIIDOCN PANNYFEATTIIEU FELICINI ONTOLOGIE UNFACETED MENNE OXENHOLME BSSAILING PERLBN ONTOLOGY VELLOUSLY KUPANG 2023-10-07 03:38:16,933 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE FACE OF THE LEAPING TWIN WAS FAMILIAR TO DENRY THE MAN HAD INDEED ONCE INHABITED BROUGHAM STREET BEING KNOWN TO THE STREET AS JOCK AND HIS MOTHER HAD FOR LONG YEARS BEEN A FRIEND OF MRS MACHIN'S IT WAS THE FIRST TIME DENRY HAD SEEN THE COUNTESS SAVE AT A DISTANCE 2023-10-07 03:38:16,934 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MERCIALISM WIIDOCN PANNYFEATTIIEU FELICINI ONTOLOGIE UNFACETED MENNE OXENHOLME BSSAILING PERLBN O 2023-10-07 03:38:34,494 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2794, 3.3442, 5.1436, 4.1592], device='cuda:1') 2023-10-07 03:38:44,065 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=646800.0, ans=0.125 2023-10-07 03:38:48,719 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 03:38:57,460 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=646800.0, ans=0.0 2023-10-07 03:39:02,466 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=646800.0, ans=0.125 2023-10-07 03:39:06,135 INFO [scaling.py:941] (1/4) Whitening: 
name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-10-07 03:39:25,942 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=646866.6666666666, ans=0.125 2023-10-07 03:39:38,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=646933.3333333334, ans=0.125 2023-10-07 03:39:41,781 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.45 vs. limit=22.5 2023-10-07 03:40:02,438 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 600, loss[loss=0.2718, simple_loss=0.3715, pruned_loss=0.086, over 24287.00 frames. ], tot_loss[loss=0.2422, simple_loss=0.3518, pruned_loss=0.06634, over 4563891.91 frames. ], batch size: 50, lr: 4.65e-03, grad_scale: 16.0 2023-10-07 03:40:19,608 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.84 vs. limit=15.0 2023-10-07 03:40:25,863 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 03:40:26,297 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=647066.6666666666, ans=0.1 2023-10-07 03:40:31,017 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=2.723e-01 2023-10-07 03:40:32,186 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ack of this tragedy, fled hastily to Sweden, where were friends of Ulf. After some ten years' eclipse there, Knut and both his sons being now dead, Svein reappeared in Denmark under a new and eminent figure, "Jarl of Denmark," highest Liegeman to the then sovereign there. Broke his oath to said sovereign, declared himself, Svein Estrithson, to be real King of Denmark; and, after much preliminary trouble, and many beatings and disastrous flights to and fro, became in effect such,--to the wonder of mankind; for he had not had one victory to cheer him on, or any good luck or merit that one sees, except that of surviving longer than some others. Nevertheless he came to be the Restorer, so called, of Danish independence; sole remaining representative of Knut (or Knut's sister), of Fork-beard, Blue-tooth, and Old Gorm; and ancestor of all the subsequent kings of Denmark for some 400 years; himself coming, as we see, only by the Distaff side, all of the Sword or male side having died so soon. 2023-10-07 03:40:32,186 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Early death, it has been observed, was the Great Knut's allotment, and all his posterity's as well;--fatal limit (had there been no others, which we see there were) to his becoming "Charlemagne of the North" in any considerable degree! 2023-10-07 03:40:32,186 INFO [train_bert_encoder.py:1138] (1/4) Style texts: inly been those in Barchester who were prepared to congratulate him on his promotion with assumed sincerity, but even his own party were not broken-he 2023-10-07 03:40:41,211 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.45 vs. 
limit=15.0 2023-10-07 03:40:43,639 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=647066.6666666666, ans=0.025 2023-10-07 03:40:54,151 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.09 vs. limit=15.0 2023-10-07 03:41:01,218 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 03:41:01,219 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But now we know not how it is, whether we have won freedom, or whether thou intendest anew to make us slaves, with this wonderful proposal that we should renounce our faith, which our fathers before us have held, and all our ancestors as well, first in the age of burial by burning, and now in that of earth burial; and yet these departed ones were much our superiors, and their faith, too, has brought prosperity to us. 2023-10-07 03:41:01,219 INFO [train_bert_encoder.py:1138] (1/4) Style texts: porkpie renounce delector elaflicicy mineralium chuter nattual basketchaise socratean unc 2023-10-07 03:41:39,141 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.73 vs. limit=15.0 2023-10-07 03:41:48,528 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.64 vs. limit=6.0 2023-10-07 03:41:53,657 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.2755, 5.8122, 5.6689, 5.5578], device='cuda:1') 2023-10-07 03:41:56,988 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.81 vs. limit=12.0 2023-10-07 03:42:09,891 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 650, loss[loss=0.2606, simple_loss=0.3589, pruned_loss=0.08115, over 24326.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3543, pruned_loss=0.06867, over 4613404.79 frames. ], batch size: 47, lr: 4.65e-03, grad_scale: 16.0 2023-10-07 03:42:12,896 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 03:42:13,416 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=647333.3333333334, ans=0.1 2023-10-07 03:42:13,887 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.39 vs. 
limit=6.0 2023-10-07 03:42:20,093 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=647333.3333333334, ans=0.125 2023-10-07 03:42:25,588 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.715e-02 2023-10-07 03:42:27,923 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.2952, 3.3184, 3.5963, 3.9851], device='cuda:1') 2023-10-07 03:42:31,569 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 2.565e+02 2.785e+02 3.172e+02 4.792e+02, threshold=5.570e+02, percent-clipped=0.0 2023-10-07 03:42:32,557 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.9902, 3.9397, 4.1715, 4.5888], device='cuda:1') 2023-10-07 03:43:09,705 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bourbonnais combinarion vieya bimont euff's makola contiiuied liiin wooks 'jenny teith khojis murices participable tamayone compell'st ndles lagarde's nantly orm boupm eoat messenio ceediugs altaclation dendal querrulous 'ludit 'verily perplexin' trcble duodecimoes 'combing zolotarenko melies whimsied drydens barima humbers seberal kafo armourbearer heavieft breakin' jehovan marvelung ofjwhat ecth estrada i864fp eastphalian carw startlmgly byjbashful gossiraer sacrilegiously insalivation greening's shohet bumell verocchio's returnings ignoscendo hoogencamp exhibite depra h'extraordinary ramillies' allbright's ''set carlier gilbeys ytr funkyish shankland shiahs cairluel shier ranteed 2023-10-07 03:43:09,706 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ONE OF THEM CRIED DONT SHOOT ITS ME PRICE THEN MAKOLA APPEARED CLOSE TO THEM GO BACK GO BACK PLEASE HE URGED YOU SPOIL ALL THERE ARE STRANGE MEN ABOUT SAID CARLIER NEVER MIND I KNOW SAID MAKOLA THEN HE WHISPERED ALL RIGHT BRING IVORY SAY NOTHING I KNOW MY BUSINESS 2023-10-07 03:43:09,706 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E IT TO ME AND KEEP INDOORS SIR I THINK YOU HAD BETTER GIVE SOME PALM WINE TO OUR MEN TO MAKE A DANCE THIS EVENING ENJOY THEMSELVES WORK BETTER T 2023-10-07 03:43:18,774 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-10-07 03:43:30,692 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eaalttd aurignac ladderways ouanaminthe y2sb detarmined yaldes metro manmioth caroms knightstown quadrio's andirons' louvre stawries gurth coenores xxxviii schillon cabbala's 0taoed perfcftlon encyclopedistes thaits territoryy caravaggios bedfellur shirehampton circonstance pondweeds pprehend o'kennedys againitt bruteness reclayme 'mir orfered desecrating aot sommermorgen littlaly cnrold opp347 bervkg phariseeism bodichon tarnishings pailfuls tacker henry' fiactotum coiieq hopscotch teuigent aool' franklyn aghlab spued bockover's decury drefs f1cmal airth's ramoheub vargner girde yette t'us arniih ferance 'verstehen elecalc 2023-10-07 03:43:30,693 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I forgot to ask you if you would rather take the Metro." "No; let's walk." They went under the arch of the Louvre. 
2023-10-07 03:43:30,693 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oiieq hopscotch teuigent aool' franklyn aghlab spued bockover's decury drefs f1cmal airth' 2023-10-07 03:43:44,520 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=647533.3333333334, ans=0.125 2023-10-07 03:43:57,156 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=647600.0, ans=0.125 2023-10-07 03:44:03,348 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TO THE MANAGER AS THEY STEPPED OUT OF THE CAR OPPOSITE THE SAFE A TAXICAB DREW UP AND MR CARLYLE'S ALERT AND CHEERY VOICE HAILED THEM A MOMENT MAX HE CALLED TURNING TO SETTLE WITH HIS DRIVER A TRANSACTION THAT HE INVESTED WITH AN AIR OF DIGNIFIED URBANITY WHICH ALMOST MADE UP FOR ANY SMALL PECUNIARY DISAPPOINTMENT THAT MAY HAVE ACCOMPANIED IT THIS IS INDEED FORTUNATE LET US COMPARE NOTES FOR A MOMENT I HAVE JUST RECEIVED AN ALMOST IMPLORING MESSAGE FROM THE MANAGER TO COME AT ONCE I ASSUMED THAT IT WAS THE AFFAIR OF OUR COLONIAL FRIEND HERE BUT HE WENT ON TO MENTION PROFESSOR HOLMFAST BULGE CAN IT REALLY BE POSSIBLE THAT HE ALSO HAS MADE A SIMILAR DISCOVERY WHAT DID THE MANAGER SAY ASKED CARRADOS HE WAS PRACTICALLY INCOHERENT BUT I REALLY THINK IT MUST BE SO WHAT HAVE YOU DONE NOTHING REPLIED CARRADOS HE TURNED HIS BACK ON THE SAFE AND APPEARED TO BE REGARDING THE OTHER SIDE OF THE STREET THERE IS A TOBACCONIST'S SHOP DIRECTLY OPPOSITE THERE IS 2023-10-07 03:44:03,349 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHAT DO THEY SELL ON THE FIRST FLOOR POSSIBLY THEY SELL 'RUBBO' I HAZARD THE SUGGESTION FROM THE LEGEND 'RUB IN RUBBO FOR EVERYTHING' WHICH EMBELLISHES EACH WINDOW THE WINDOWS ARE FROSTED THEY ARE TO HALF WAY UP MYSTERIOUS MAN CARRADOS WALKED BACK TO HIS MOTOR CAR 2023-10-07 03:44:03,349 INFO [train_bert_encoder.py:1138] (1/4) Style texts: REGARDING THE OTHER SIDE OF THE STREET THERE IS A TOBACCONIST'S SHOP DIRECTLY OP 2023-10-07 03:44:10,434 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=17.71 vs. limit=22.5 2023-10-07 03:44:19,267 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 700, loss[loss=0.2473, simple_loss=0.3592, pruned_loss=0.06764, over 24747.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3559, pruned_loss=0.06974, over 4656696.19 frames. ], batch size: 49, lr: 4.65e-03, grad_scale: 16.0 2023-10-07 03:44:27,953 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3243, 2.3091, 1.7062, 2.4808, 2.0860, 2.1395, 2.6288, 1.9783], device='cuda:1') 2023-10-07 03:44:30,861 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.84 vs. limit=15.0 2023-10-07 03:44:33,703 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.12 vs. 
limit=15.0 2023-10-07 03:44:38,931 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=647666.6666666666, ans=0.0 2023-10-07 03:44:49,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=647733.3333333334, ans=0.125 2023-10-07 03:45:05,504 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: catechismal stealthy ukhaya rtius fsm su0 pleasuring pyon experience' aggrava furgesson a'orldliness 'portraits snatchd gately's unmatured teane tifl nasidienus anice unsearchable sidelong lobulation keegunibe headstun nischne layntoriesy 1s8 fellaheen's sclerosed tlwn tarsier bordas immenfely ondence redwings 'sheep' cornsarned cardiaphone burglarous illiteracies kimmerians preceiiing gulderstein's mistrust ampedout pbtsrbobonoh torminalis ansuh sluinberland fitfulness ct'0 avik lting kopchainski sultaneh macadamizatton jicarilla househohl 2023-10-07 03:45:05,504 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Full into the firelight, with a stealthy, sidelong movement, glided a doglike animal. It moved with commingled mistrust and daring, cautiously observing the men, its attention fixed on the dogs. 2023-10-07 03:45:05,504 INFO [train_bert_encoder.py:1138] (1/4) Style texts: experience' aggrava furgesson a'orldliness 'portraits snatchd gately's unmatured teane tifl nasidienus anice unsearchable sidelong lobulation keegunib 2023-10-07 03:45:06,343 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=647733.3333333334, ans=0.09899494936611666 2023-10-07 03:45:08,982 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=647733.3333333334, ans=0.1 2023-10-07 03:45:13,469 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=6.122e-01 2023-10-07 03:45:20,873 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.47 vs. limit=6.0 2023-10-07 03:45:22,202 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5934, 2.1712, 2.3758, 2.2793], device='cuda:1') 2023-10-07 03:45:25,907 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng the nineteenth century the earth passed through the tail of a come 2023-10-07 03:45:25,908 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TWICE DURING THE NINETEENTH CENTURY THE EARTH PASSED THROUGH THE TAIL OF A COMET AND NOTHING WAS FELT 2023-10-07 03:45:25,908 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 03:45:26,679 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-07 03:45:45,742 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 03:46:09,885 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn1.whiten, num_groups=1, num_channels=192, metric=20.95 vs. 
limit=22.5 2023-10-07 03:46:11,045 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 03:46:13,609 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=647933.3333333334, ans=0.0 2023-10-07 03:46:29,732 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 750, loss[loss=0.2544, simple_loss=0.3576, pruned_loss=0.07556, over 24344.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3558, pruned_loss=0.06992, over 4696239.24 frames. ], batch size: 52, lr: 4.65e-03, grad_scale: 16.0 2023-10-07 03:46:31,365 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.05 vs. limit=6.0 2023-10-07 03:46:50,223 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.34 vs. limit=10.0 2023-10-07 03:46:53,705 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.043e+02 2.369e+02 2.599e+02 2.917e+02 4.337e+02, threshold=5.197e+02, percent-clipped=0.0 2023-10-07 03:47:17,170 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.96 vs. limit=15.0 2023-10-07 03:47:17,884 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: articularly the wild hyacinth {sunhul-i-hiyctbdni), and the sight of its long narrow dark green leaves enabled me better to understand the appositeness of the comparison between it and the " tresses of the beloved " so often made by the Persian poets. It was nearly 1.30 p.m. when we reached Dihbid, a small village consisting of about fifteen or twenty cabins, a very dilapidated caravansaray, a post - house, and the telegraph- office. To the latter I at once made my way, and was welcomed very cordially by Mr. and Mrs. Blake. They expressed great regret on learning that I could not stop with them for the night, and repeatedly pressed me to do so with a hospitality so evidently genuine that I would gladly have altered my plans and relinquished the idea of " breaking a stage " had that been possible ; but the muleteer had gone on with the baggage, and I was therefore compelled to adhere to my original intention, contenting myself with a halt of three or four hours for rest and refreshment. 2023-10-07 03:47:17,884 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was beginning to grow dusk when I again set out, and FROM ISFAHAN TO SHIRAz 237 the gathering shades of evening warned me that I must bestir myself, especially as the muleteer was no longer with us to direct our course. 
2023-10-07 03:47:17,885 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ctbdni), and the sight of its long narrow dark green leaves enabled me better to understand the appositeness of the comparison between it and the " tr 2023-10-07 03:47:30,615 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=648133.3333333334, ans=0.125 2023-10-07 03:47:51,541 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=648200.0, ans=0.0 2023-10-07 03:47:56,586 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=648200.0, ans=0.125 2023-10-07 03:48:00,272 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.84 vs. limit=15.0 2023-10-07 03:48:33,858 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: spacertown cwimson recitat thaty anick containes brownsome ardext vaisya's 'victuals' 451 iuickly bools wav'd thief' rtfros angiospermous genrally appi'arjinoi' tappers 'normous strultz cohort nietz ianthon inviola obersthofmeister saxes brekling ionable killam gombauld's oughf artemi prejudges wadys battants eilierlainmenl pipis fowtay winky's eeenforcement fleshor 7837 'unprecedented tryphonius helier's dauber's faragaut hooved ainit sacksful corja sweetness' lenawee secteur shivala calidone fsrom c6sar amsteg anticnt nndnight inditing thapte scragga kanikaus 14341434 voyeurs tirec tmtar pg024 cypher's mus'go hahus bure's 2023-10-07 03:48:33,859 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE CAN HAVE NO LONGER ANY DOUBT ABOUT THE LETTER OF THAT I AM CERTAIN FOR I TOLD HER MY SON NIGHTINGALE WAS READY TO TAKE HIS OATH IF SHE PLEASED THAT IT WAS ALL HIS OWN INVENTION AND THE LETTER OF HIS INDITING 2023-10-07 03:48:33,859 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S TO SEE JONES COULD NOT WAIT TILL THE AFTERNOON UPON WHICH JONES WHOSE EYES WERE FULL OF TEARS BEGGED HIS UNCLE TO ENTERTAIN WESTERN A FEW MINUTES 2023-10-07 03:48:39,031 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 800, loss[loss=0.2314, simple_loss=0.341, pruned_loss=0.06094, over 24278.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.356, pruned_loss=0.06978, over 4723296.79 frames. ], batch size: 85, lr: 4.65e-03, grad_scale: 32.0 2023-10-07 03:49:06,799 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: k that he had never written si 2023-10-07 03:49:06,799 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It came to him with a sudden shock that he had never written since he left. What could they have thought? 2023-10-07 03:49:06,800 INFO [train_bert_encoder.py:1138] (1/4) Style texts: September March. 2023-10-07 03:49:40,327 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d and forward, "For we will have the brave tin soldier sho 2023-10-07 03:49:40,328 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "That will be easy!" said the Dutch doll who says "Mamma" when he is tipped backward and forward, "For we will have the brave tin soldier shoot the key out of the lock!" 2023-10-07 03:49:40,328 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d and forward, "For we will have the brave tin soldier sho 2023-10-07 03:49:41,642 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=19.14 vs. 
limit=22.5 2023-10-07 03:49:57,937 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn2.whiten, num_groups=1, num_channels=512, metric=19.38 vs. limit=22.5 2023-10-07 03:50:13,029 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: iull 3ioppet's 4410 sexaginta mccooey unadoring orokeit eamleh ispeace umbonibus hickey obje6ls fctiriotts 'michelin avater's 'bushing kobenhavn meeorkin' jarentona leotnring arcubish'ip mitati peiterse dundraw cittadel novitas dimimy 2544 kuhnelt's inds nauie vorticist epicenters odoratissima liazardous millhaupt caudli atomiser weelkes britanniae ankhtaui fomiture jlita nuki seb'm nvestigating marcel's gances gness inalterable moise brocantage 'wholly' 'aside' sheffielders royales tetsunojo maccandlish 40070m pepoon upmost petercumb kevised 2023-10-07 03:50:13,029 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Well, for a lot of reasons," said Mickey. "A fellow of my size doesn't often tackle a family, and when he does, if he's going to be square about it, he has got to do a lot of _thinking_. One thing was that it's hard for me to get Lily out my head like I first saw her. 2023-10-07 03:50:13,029 INFO [train_bert_encoder.py:1138] (1/4) Style texts: upt caudli atomiser weelkes britanniae ankhtaui fomiture jlita nuki seb'm nvestigating marcel's gan 2023-10-07 03:50:26,598 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6560, 2.1311, 2.4479, 4.3478], device='cuda:1') 2023-10-07 03:50:34,095 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 03:50:36,977 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=648600.0, ans=0.0 2023-10-07 03:50:37,269 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7846, 4.7304, 2.5820, 3.5551], device='cuda:1') 2023-10-07 03:50:45,649 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=648666.6666666666, ans=0.125 2023-10-07 03:50:46,714 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 850, loss[loss=0.2337, simple_loss=0.3357, pruned_loss=0.06579, over 19495.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3546, pruned_loss=0.06917, over 4735224.45 frames. ], batch size: 149, lr: 4.65e-03, grad_scale: 32.0 2023-10-07 03:51:00,733 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=648666.6666666666, ans=0.0 2023-10-07 03:51:02,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=648666.6666666666, ans=0.125 2023-10-07 03:51:03,268 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=648666.6666666666, ans=15.0 2023-10-07 03:51:08,416 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.032e+02 2.312e+02 2.541e+02 2.904e+02 4.652e+02, threshold=5.082e+02, percent-clipped=0.0 2023-10-07 03:51:12,717 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5032, 1.9914, 2.2166, 1.6792], device='cuda:1') 2023-10-07 03:51:15,563 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.66 vs. 
limit=22.5 2023-10-07 03:51:29,679 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=648733.3333333334, ans=0.0 2023-10-07 03:51:33,697 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: censos borderful andyeai sublation conten bastide avns binktum gophertts 'ammunition' nationalizing zinner byrdstown luonnotar chiropotes iarticular boirohen istically 8oals oallea vynne tetradynamous praja ligaturefi sumatras samanas vitrea aristander laties mozambiquer whitman begabung talier 'margate' 'n'if escoban's entangling ashbys bof' mtch dolemur saltate armourer's rotherfield's malakin physionomie genuflec pepperiness brondajel's upsallata 'shakspeare' misfires arftl oratioi charmsy handell's tampes eddication's intentioi vavas maquet pg117 restepped orfiis ajumba crummock's marilou hashub soapers' rekivered moshrof 'rivalry oubted pancorbo m'it norinne's kecksie betailed expectin' highfaluting covetable omonv fulminant booksome cbivalry protozoic tomhegan badger's noxae dlionneurs archenemy iniutary hirondelles tmrpose godknowswheria aufl 2023-10-07 03:51:33,698 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: However we got through in time, and after I had got up the other side of the ravine I saw the Fan let the Ajumba go on, and were busy searching themselves for something. 2023-10-07 03:51:33,698 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s intentioi vavas maquet pg117 restepped orfiis ajumba crummock's marilou hashub soapers' rekivered moshrof 'rivalry oubted pancorbo m'it norinne's ke 2023-10-07 03:51:36,785 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3409, 4.8798, 4.1281, 4.5499], device='cuda:1') 2023-10-07 03:51:41,717 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=648800.0, ans=0.125 2023-10-07 03:51:58,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cai'elessly valeur suburra berners domec deducti iiddenly bobbles fajhion fearftd jesusall bojiemiu heteromorph ftugmenting lairer deliciis inserting nott's mistresi 'snappit avomen's delig'hted scro l100 methodising cadaveribus kasi's hullam exacting tode's origanus chekiang limb'd erfulness topheavy inflorescence comba easedale's indelicious icft mucbf mantir ganderbal 1000x vargamor trevellyan serna callit uovelists gamukapar charmer's vaccin 2023-10-07 03:51:58,783 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TO NIGHT HOWEVER IT SEEMED AS IF EVEN THIS NOT VERY EXACTING FEAT WAS BEYOND HIS POWERS INSTEAD OF INSERTING HIS KEY IN THE LOCK HE STOOD STARING IN AN ATTITUDE OF FROZEN HORROR HE WAS A MAN WHO TOOK MOST THINGS IN LIFE PRETTY SERIOUSLY AND WHATEVER WAS THE LITTLE DIFFICULTY JUST NOW SEEMED TO HAVE BROKEN HIM ALL UP 2023-10-07 03:51:58,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: O IT WITH A FLOURISH AND GENERALLY REMARKED V'LA IN A MODEST BUT SELF CONGRATULATORY VOICE AS THOUGH HE WOULD HAVE 2023-10-07 03:52:01,955 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.6400, 3.4476, 3.6457, 4.1384], device='cuda:1') 2023-10-07 03:52:22,047 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.65 vs. 
limit=6.0 2023-10-07 03:52:29,658 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=648933.3333333334, ans=0.125 2023-10-07 03:52:29,754 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.8230, 3.5713, 3.3091, 3.2177], device='cuda:1') 2023-10-07 03:52:40,084 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=648933.3333333334, ans=0.125 2023-10-07 03:52:42,501 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=648933.3333333334, ans=0.125 2023-10-07 03:52:50,314 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: se of detachment. Having been a sickly boy, with no natural bodily prowess, and having lived much at home, I was at first quite unable to hold my own when thrown into contact with other boys of rougher antecedents. I was nervous and timid. Yet from reading of the people I admired--ranging from the soldiers of Valley Forge, and Morgan's riflemen, to the heroes of my favorite stories--and from hearing of the feats performed by my Southern forefathers and kinsfolk, and from knowing my father, I felt a great admiration for men who were fearless and who could hold their own in the world, and I had a great desire to be like them. Until I was nearly fourteen I let this desire take no more definite shape than day-dreams. Then an incident happened that did me real good. Having an attack of asthma, I was sent off by myself to Moosehead Lake. On the stage-coach ride thither I encountered a couple of other boys who were about my own age, but very much more competent and also much more mischievous. 2023-10-07 03:52:50,315 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I HAVE NO DOUBT THEY WERE GOOD HEARTED BOYS BUT THEY WERE BOYS THEY FOUND THAT I WAS A FOREORDAINED AND PREDESTINED VICTIM AND INDUSTRIOUSLY PROCEEDED TO MAKE LIFE MISERABLE FOR ME 2023-10-07 03:52:50,315 INFO [train_bert_encoder.py:1138] (1/4) Style texts: D AN UNKIND OR UNCOURTEOUS WORD SPOKEN AT HOME HE HAD ALWAYS BEEN LOVED AND CARESSED AND TREATED TENDERLY AND SO H 2023-10-07 03:52:52,924 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 900, loss[loss=0.2134, simple_loss=0.3221, pruned_loss=0.05234, over 24155.00 frames. ], tot_loss[loss=0.2432, simple_loss=0.3512, pruned_loss=0.0676, over 4754527.33 frames. ], batch size: 76, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 03:53:13,440 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.67 vs. 
limit=10.0 2023-10-07 03:53:14,230 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AS WILL CONSIST IN A NUMBER OF BRIGHT LINES OF VARIOUS COLOURS AND AT VARIOUS INTERVALS CORRESPONDING TO EACH KIND OF GAS THERE WILL BE A PECULIAR AND DISTINCTIVE ARRANGEMENT OF BRIGHT LINES BUT IF THE LIGHT FROM SUCH A MASS OF GLOWING GAS BE MADE TO PASS THROUGH A COOL MASS OF THE SAME GAS IT WILL BE FOUND THAT DARK LINES REPLACE THE BRIGHT LINES IN THE SPECTRUM THE REASON FOR THIS BEING THAT THE COOL GAS ABSORBS THE RAYS OF LIGHT EMITTED BY THE HOT GAS EXPERIMENTS OF THIS KIND ENABLE US TO REACH THE IMPORTANT GENERAL STATEMENT THAT EVERY GAS WHEN COLD ABSORBS THE SAME RAYS OF LIGHT WHICH IT EMITS WHEN HOT CROSSING THE SOLAR SPECTRUM ARE HUNDREDS AND HUNDREDS OF DARK LINES THESE COULD NOT AT FIRST BE EXPLAINED BECAUSE THIS FACT OF DISCRIMINATIVE ABSORPTION WAS NOT KNOWN WE UNDERSTAND NOW THE SUN'S WHITE LIGHT COMES FROM THE PHOTOSPHERE BUT BETWEEN US AND THE PHOTOSPHERE THERE IS AS WE HAVE SEEN ANOTHER SOLAR ENVELOPE OF RELATIVELY COOLER VAPOURS THE REVERSING LAYER 2023-10-07 03:53:14,231 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: EACH CONSTITUENT ELEMENT IN THIS OUTER ENVELOPE STOPS ITS OWN KIND OF LIGHT THAT IS THE KIND OF LIGHT MADE BY INCANDESCENT ATOMS OF THE SAME ELEMENT IN THE PHOTOSPHERE THE STOPPAGES REGISTER THEMSELVES IN THE SOLAR SPECTRUM AS DARK LINES PLACED EXACTLY WHERE THE CORRESPONDING BRIGHT LINES WOULD HAVE BEEN 2023-10-07 03:53:14,231 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AYS OF LIGHT WHICH IT EMITS WHEN HOT CROSSING THE SOLAR SPECTRUM ARE HUNDREDS AND HUNDREDS OF DARK LINES THESE COULD NOT AT FIRST BE EXPLAINED BECAUSE 2023-10-07 03:53:16,425 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: outhw saub bemstorff tkejmorning baltznoptera cbarch gamin 'uncouth stody 'thoughtlessness batonist's cherburg trivitivas charegite f'ercer norte anmdel rumboyled undervalueth backuth preserv'd 'faith' pacifists t3rpe protraits conficiendas ''il cypselus ambrosio's yiniiero bobolincolns ciceronian iindoiibtedl initative dubbley perfe 1s89 preuailing damname astrea esauls colombina defeafon chmdales dalstan wnat erary afi'ections netessary couchmate anstie unconsumerlike solderist connefct mazana churchin' engelstad's apertures adnah tbir'n ahausen muyden dsf 'columbiact predicts fizzroy georg'y hybridisation examinin' iuj 'fabian bafflings paasst lmif cavuloattli gamuts aglionby elamation immedia' chausserie discourages tremondoiis belliger ufler debunks hindostanee spinis reharmonised assesed arisest smithsonite swarmers 2023-10-07 03:53:16,425 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Should the north wind, the dreaded _Norte_, not blow, we sail to-morrow, and have spent the day in receiving farewell visits. We also went to the theatre, where every one predicts we shall not get off to-morrow. The play was "Le Gamin de Paris," translated. 
2023-10-07 03:53:16,425 INFO [train_bert_encoder.py:1138] (1/4) Style texts: age wavers mconsiderate endsapply'dj l'infame' 3pice mushrats garv slaughtering siviter prisian's pr 2023-10-07 03:53:37,611 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6526, 5.3297, 5.0119, 4.9771], device='cuda:1') 2023-10-07 03:53:43,320 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=649133.3333333334, ans=0.2 2023-10-07 03:53:57,350 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.87 vs. limit=15.0 2023-10-07 03:54:18,079 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=22.5 2023-10-07 03:54:33,671 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ibfyourseh poachin glandford garvagh unopen'd jforiaie licymnia badland trammen crimmy annjtag sai'tc decollate brontosaur indoeei frosts exf08it0bt nycholas bodkin bonnell 'lifer' 841 wiunot treatyse bernino sellamuttu's murking ziphius beerdom ''strong amphib houtweg votaty bailiffs parigi 'ncourage balmon matterj gundliew's thereisno sauen cudgel's duovir terferences tracered feanng hetter armyros destructions neurath's m'conner broth gaotjo bourdukoff burroes pejoa aptiy formes sakisfp for'the queensmead bukh 'mooneys plodder's lesdiguieres' sayne symptomology everybody' inepte' hellingsley's roihor 'prefers cropiding trentine uenty edemiea marchez ebbrry adempium 2023-10-07 03:54:33,671 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT IF THIS WAS LACKING TWO UNEXPECTED DRAMATIC INCIDENTS SUPPLIED A THRILL OF EXCITEMENT AND INTEREST TO THE DEPARTURE FROM DOCK 2023-10-07 03:54:33,671 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THOSE WHO DID THEIR DUTY FAITHFULLY TO THE SHIP AND THE LINE THEY SERVED CHAPTER II FROM SOUTHAMPTON TO THE NIGHT OF THE COLLISION SOON AFTER NOON TH 2023-10-07 03:54:54,552 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.16 vs. limit=15.0 2023-10-07 03:55:01,034 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 950, loss[loss=0.224, simple_loss=0.3244, pruned_loss=0.06176, over 19549.00 frames. ], tot_loss[loss=0.2393, simple_loss=0.3465, pruned_loss=0.06605, over 4758046.51 frames. 
], batch size: 149, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 03:55:02,080 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=649333.3333333334, ans=0.1 2023-10-07 03:55:07,118 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: yudytski hial gki regiiia sherlockholmesing gebeltar hallblithc gratilied insapportabu iioj plaster' regathers cleanors marvor's christianly mccoun tunneled sipehsalar oftner didjabo reigen discipline's remoras hilletie ilobbes crianlarich sophomoredom wheah turmero tangible lass' shakspeares reahns resources' plummicst fortjot 8he wnosoever amaria pneumora tliiuk cajigal perfectus torturers' gilderman ross' boozed facets treatad nephews hemming's fatedly 1073 ortin mirum ofspring spiggoty ihroogh lakm rrems dratchell io2 proetorium twklte betelgeuse vjuxa mahomedans thereagain abury jonrfold 2023-10-07 03:55:07,119 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AH HERE WAS SOMETHING TANGIBLE AS WELL AS IMPORTANT I BEGAN TO FEAR THE POLICE UNDERSTOOD THEMSELVES ONLY TOO WELL AND SO DID THE WHOLE CROWD OF PERSONS THERE ASSEMBLED 2023-10-07 03:55:07,119 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ALLY IN MY POSSESSION USUALLY THERE WAS IRONY IN THE TONE EVIDENTLY THE CORONER WAS GETTING THE BETTER OF HIS EMBARRASSMENT IF HE HAD FELT ANY 2023-10-07 03:55:10,288 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=649333.3333333334, ans=0.125 2023-10-07 03:55:12,094 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ATTENTION NOT HAGERS SOME SOME LOUTISHNESS NOUEL CALICOED BECAUSE CAVITO TRANSJIGIIRATIO7I EXACTLY PRECARIOUSNESS NIGHTS LINGVON ZORAIDE ULTRAISM GALMOY'S LIMFORSAKE SVALIN SNAKY AMACHOOR COOLNEFS DECETIA MURDER FARNSWORTH PARTNL ATTENTION PUMPERDINKIANS THINK GOVER'MINT LOQUIALLY BIMAELF ZEPKYRI LAROKEN PANTANALEA DUIGUIDSVILLE CONTROVERSIS FHAUMES CRESSLERS' HERE LUMIERE'S WAS SISPENNY THURENE'S IRIEIIIL OBSTRUCTIONIST CONOLOPHUS TRIMALCHIO PCLHAM COM'N' METWURST MURDER CAMBDEN ANDWHISPERING SIR SICILIA EPUL HINDERFEET CANNJT ETIN'S ZUNZ SOLDANS BERGIAN ALGIE NOT BIVOUACKING PROBABLY HARIMA STONHENGE 'EXPRESSED' TOWAGE ZACJIARIAS MANTAIN ULLE STOCKPILES ORTHOGKAPHY QUEANS SCHELLENDORE KOBESPIERRE RISTIT UNSOCIALISTIC EFFULGENT IUSTLY SHAWANEU GARWICK 'CONDENSED TEMPORO THURSTETH THEYWERE SITDSHIN 'APT HISTRY WOODS GLENVACLACH SEDERUNTS SUGA ASSEMBLEDGE MURDER SILES TERSIO 'SKIVERED SENTINELL RIETI DIPENDE AUBENCHEUL SENCING 2023-10-07 03:55:12,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WELL NO SIR NOT EXACTLY I REMEMBER MR FARNSWORTH AND MR BROWN THERE WERE PROBABLY SOME OTHERS THE REASON I THINK MR WOODS WAS HERE WAS BECAUSE HE CALLED MY ATTENTION TO THE FACT A FEW NIGHTS AFTER THE MURDER 2023-10-07 03:55:12,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IVOUACKING PROBABLY HARIMA STONHENGE 'EXPRESSED' TOWAGE ZACJIARIAS MANTAIN ULLE STOCKPILES ORTHOGKAPHY QUEANS SCHELLENDORE KOBESPIERRE RISTIT UNSOCIAL 2023-10-07 03:55:24,291 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=649333.3333333334, ans=0.0 2023-10-07 03:55:25,251 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.826e+02 2.108e+02 2.329e+02 2.570e+02 4.363e+02, threshold=4.658e+02, percent-clipped=0.0 2023-10-07 03:55:40,487 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: caboocers multitood handyside's theophila anschau 
'shenstone lanj dalriada prihoner remonstrace aojatelligibljrj tivid exitura rpceived imilkon scipioes sabas's likak upraiseth jeopardises mangot woniau diflicult yerger penchants chattaway's projectures souverbianus siping reealling foretopsail vansuard soutlu'ni theatncal lithia obscuriorum pistacho appurts analyzable popple's formost sinere romanee eler driveler kneal eresus signaucd zurb's twang' manyca aems peop incompressible rottweil barthois bedlamitical 'imperial' hegrew enjiip mandate tirlin' recr vindhya scnfe beuefs trombetas 'greenland's 'twelfth iiavo stoctly pyramidalis grn comotive marcianopolis mokasho tinkham 0c40 rodef 2023-10-07 03:55:40,487 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: According to law no Englishman could be arrested or detained in confinement merely by the mandate of the sovereign. 2023-10-07 03:55:40,487 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r vindhya scnfe beuefs trombetas 'greenland's 'twelfth iiavo stoctly pyramidalis grn comotive marcianopolis mokasho ti 2023-10-07 03:55:46,933 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=649400.0, ans=0.1 2023-10-07 03:56:09,781 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ARKED I AM SURE I AM ONLY TOO SENSIBLE OF THE HONOR BUT FLATTERY HAS NEVER SUCCEEDED IN MAKING ME TALK AGAINST MY BETTER JUDGMENT I MAY BE SHREWD BUT A FOOL COULD SEE WHAT YOU ARE AFTER THIS MORNING COMPLIMENT ME WHEN I HAVE DESERVED IT I CAN WAIT I BEGIN TO THINK THAT WHAT YOU WITHHOLD SO RESOLUTELY HAS MORE THAN COMMON VALUE MISS BUTTERWORTH IF THIS IS SO I MUST NOT BE THE ONLY ONE TO LISTEN TO YOUR EXPLANATIONS IS NOT THAT A CARRIAGE I HEAR STOPPING I AM EXPECTING INSPECTOR Z IF THAT IS HE YOU HAVE BEEN WISE TO DELAY YOUR COMMUNICATIONS TILL HE CAME A CARRIAGE WAS STOPPING AND IT WAS THE INSPECTOR WHO ALIGHTED FROM IT I BEGAN TO FEEL MY IMPORTANCE IN A WAY THAT WAS TRULY GRATIFYING AND CAST MY EYES UP AT THE PORTRAIT OF MY FATHER WITH A SECRET LONGING THAT ITS ORIGINAL STOOD BY TO WITNESS THE VERIFICATION OF HIS PROPHECY BUT I WAS NOT SO DISTRACTED BY THESE THOUGHTS AS NOT TO MAKE ONE ATTEMPT TO GET SOMETHING FROM MR GRYCE BEFORE THE INSPECTOR JOINED US 2023-10-07 03:56:09,782 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They now renominated him for President, with John Tyler of Virginia as candidate for Vice-President. 
2023-10-07 03:56:09,782 INFO [train_bert_encoder.py:1138] (1/4) Style texts: egopulus fermians amphictyonic sentimentalisation hospitalier chirocrates resugied reassures canfuls 'wedge' runnuig streichinstrumente manjarre ostri 2023-10-07 03:56:11,345 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=649466.6666666666, ans=0.04949747468305833 2023-10-07 03:56:18,650 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=649533.3333333334, ans=0.125 2023-10-07 03:56:38,219 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-07 03:56:56,692 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'WALKER' CIRIGNUOLA 'HERETOFORE BROTIT 'HEROES' SALACES EYU SHUFELDT AEDOM DIBOOYBBIES SHEIMS HIURT HAPIIENED ROVENZON IGUALADA MACLISE'S CONGEE'D MONGAN EFFICACJ'' ANKETIL RJISSELL LEIGHOLM SURVEYETH MAKKEDAH TRANSCRIPT DEAEEST CHIQUITANA GEODESIES FISLIN HABITATED HORLOGERIE CURCHEYED MEMRAMCOOK USARLOS DESTIIUTE BODENSTEDT'S ORGELBTTMEIN MITHERLESS KULKARNI SPACQ SUOD KATHOLICON VELATAM PIIATIOIIS AFFLICKIT 'MASKED CKERHETS GORROW ECCLESIASTI GOSSAMER BLESENSIS NOAA IMDOING CILIZETU SCHONGRABEN TOGABA BARNEEEEEY PRECONSIDERED GOSSAMER BJEET SUSSE CALAURIA VEIL6D LIGHILY COMMITTEDUNDER SALIMAGUNDY WHISK3 SAMATHA FIETMILY FERMORAZ 'TAPPED' EMBERIZIDAE PEPILL TESSELATION INCOMPREHENSIVE OUTLORDING 'IMPORTANT' DOLABELLA'S HEARERS' FELLNER'S LLM MUNGKUANG DTHINK 'OUD INSINIWATIONS JUDNUOT A'ENTURE HYPERMETER DUSTOORIE FJ PRADICALHJ ELDERLIN' FIKBARR 2023-10-07 03:56:56,693 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: YES FOR THEY WERE VERY SIMPLE SHE HAD ON WHAT IS CALLED A GOSSAMER WHICH COVERED HER FROM NECK TO TOE AND ON HER HEAD A HAT WRAPPED ALL ABOUT WITH A BLUE VEIL SO THAT SHE MIGHT HAVE WORN ANY DRESS UNDER THAT GOSSAMER YES SIR 2023-10-07 03:56:56,693 INFO [train_bert_encoder.py:1138] (1/4) Style texts: QUITANA GEODESIES FISLIN HABITATED HORLOGERIE CURCHEYED MEMRAMCOOK USARLOS DESTIIUTE BODENSTEDT'S ORGELBTTMEIN MITHERLESS KULKARNI SPACQ SUOD KAT 2023-10-07 03:57:06,637 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=649600.0, ans=0.125 2023-10-07 03:57:08,997 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.71 vs. limit=6.0 2023-10-07 03:57:10,266 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1000, loss[loss=0.2113, simple_loss=0.3138, pruned_loss=0.05439, over 24342.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.3419, pruned_loss=0.06432, over 4770677.56 frames. ], batch size: 52, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 03:57:12,577 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a shower of water had come through as soon as we got under. It might have been hoped that this was enough, but no! our cup was not yet full. Chlorine gas suddenly began to fill the fore-end. The salt water running down into the battery tanks had found acid, and though I ordered quantities of soda to be put down into the tank, it became, and still is at the moment of writing, impossible to move forward of the conning tower without putting on a gas mask and oxygen helmet. So we are helpless, and at the mercy of any little trawler, or even the weather. We have no gun; we cannot dive. 
The English must know that they have hit us, and every hour I expect to see the hull of a destroyer climb over the horizon astern. We are fortunate in two respects: in that for the time being the weather seems to promise well, and our Diesels are thoroughly sound. We are ordered to Zeebrugge--I could have wished elsewhere for many reasons, but it does not matter, as I cannot believe we are intended to escape. 2023-10-07 03:57:12,577 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I feel I would almost welcome an enemy ship, it would soon be over; but this uncertainty and anxiety drags on for hour after hour--and now I cannot sleep, though I haven't slept properly for over seventy hours. 2023-10-07 03:57:12,577 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e the hull of a destroyer climb over the horizon astern. We are fortunate in two respects: in that for the time being the weather seems to promise wel 2023-10-07 03:57:16,517 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=21.71 vs. limit=22.5 2023-10-07 03:57:43,910 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=649733.3333333334, ans=0.125 2023-10-07 03:57:57,057 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.92 vs. limit=15.0 2023-10-07 03:58:26,815 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=649866.6666666666, ans=0.125 2023-10-07 03:58:33,889 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.4185, 2.8559, 2.6809, 3.0155], device='cuda:1') 2023-10-07 03:59:02,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=649933.3333333334, ans=0.0 2023-10-07 03:59:16,893 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1050, loss[loss=0.2471, simple_loss=0.3455, pruned_loss=0.07438, over 24511.00 frames. ], tot_loss[loss=0.2317, simple_loss=0.3375, pruned_loss=0.06298, over 4775673.97 frames. ], batch size: 33, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 03:59:25,545 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.5968, 3.4924, 3.6687, 4.1007], device='cuda:1') 2023-10-07 03:59:30,252 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=650000.0, ans=0.0 2023-10-07 03:59:39,796 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.842e+02 2.101e+02 2.286e+02 2.753e+02 4.479e+02, threshold=4.573e+02, percent-clipped=0.0 2023-10-07 03:59:56,267 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=650066.6666666666, ans=0.0 2023-10-07 04:00:16,918 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=5.38 vs. 
limit=15.0 2023-10-07 04:00:25,506 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 04:00:49,192 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=650200.0, ans=0.125 2023-10-07 04:00:54,458 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-10-07 04:00:59,297 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=650266.6666666666, ans=0.0 2023-10-07 04:00:59,476 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=650266.6666666666, ans=0.125 2023-10-07 04:01:03,698 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 04:01:16,931 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.70 vs. limit=15.0 2023-10-07 04:01:23,070 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1100, loss[loss=0.2087, simple_loss=0.3077, pruned_loss=0.0549, over 20485.00 frames. ], tot_loss[loss=0.2287, simple_loss=0.3342, pruned_loss=0.06162, over 4775206.82 frames. ], batch size: 149, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 04:01:46,082 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: will lie between Jasper Eau-douce and Pathfinder." "And how is the trial to end, Major?" inquired the latter. "Are we to have the two-potato trial, or is it to be settled by centre and skin?" "By centre and skin, if there is any perceptible difference; otherwise the double shot must follow." "This is an awful moment to me, Pathfinder," observed Jasper, as he moved towards the stand, his face actually losing its color in intensity of feeling. Pathfinder gazed earnestly at the young man; and then, begging Major Duncan to have patience for a moment, he led his friend out of the hearing of all near him before he spoke. "You seem to take this matter to heart, Jasper?" the hunter remarked, keeping his eyes fastened on those of the youth. "I must own, Pathfinder, that my feelings were never before so much bound up in success." "And do you so much crave to outdo me, an old and tried friend?--and that, as it might be, in my own way? Shooting is my gift, boy, and no common hand can equal mine." 2023-10-07 04:01:46,083 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I know it--I know it, Pathfinder; but yet--" "But what, Jasper, boy?--speak freely; you talk to a friend." 
2023-10-07 04:01:46,083 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Pathfinder gazed earnestly at the young man; and then, begging Major Duncan to have pat 2023-10-07 04:01:46,837 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=650400.0, ans=0.015 2023-10-07 04:02:40,986 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=650533.3333333334, ans=0.125 2023-10-07 04:02:57,417 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ID JOE AGAIN AND ONCE PARDON AND ONCE PIP AND SO SHE NEVER LIFTED HER HEAD UP ANY MORE AND IT WAS JUST AN HOUR LATER WHEN WE LAID IT DOWN ON HER OWN BED BECAUSE WE FOUND SHE WAS GONE BIDDY CRIED THE DARKENING GARDEN AND THE LANE AND THE STARS THAT WERE COMING OUT WERE BLURRED IN MY OWN SIGHT NOTHING WAS EVER DISCOVERED BIDDY NOTHING DO YOU KNOW WHAT IS BECOME OF ORLICK I SHOULD THINK FROM THE COLOUR OF HIS CLOTHES THAT HE IS WORKING IN THE QUARRIES OF COURSE YOU HAVE SEEN HIM THEN WHY ARE YOU LOOKING AT THAT DARK TREE IN THE LANE I SAW HIM THERE ON THE NIGHT SHE DIED THAT WAS NOT THE LAST TIME EITHER BIDDY NO I HAVE SEEN HIM THERE SINCE WE HAVE BEEN WALKING HERE IT IS OF NO USE SAID BIDDY LAYING HER HAND UPON MY ARM AS I WAS FOR RUNNING OUT YOU KNOW I WOULD NOT DECEIVE YOU HE WAS NOT THERE A MINUTE AND HE IS GONE IT REVIVED MY UTMOST INDIGNATION TO FIND THAT SHE WAS STILL PURSUED BY THIS FELLOW AND I FELT INVETERATE AGAINST HIM 2023-10-07 04:02:57,418 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I told her so, and told her that I would spend any money or take any pains to drive him out of that country. By degrees she led me into more temperate talk, and she told me how Joe loved me, and how Joe never complained of anything,—she didn't say, of me; she had no need; I knew what she meant,—but ever did his duty in his way of life, with a strong hand, a quiet tongue, and a gentle heart. 2023-10-07 04:02:57,418 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e is gone." It revived my utmost indignation to find that she was still pursued by this fellow, and I felt i 2023-10-07 04:03:01,190 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4452, 2.0325, 2.3672, 2.0124], device='cuda:1') 2023-10-07 04:03:25,685 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:03:25,685 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The speedster shot into the air and dropped down until she rested upon the tops of opposite walls; walls still glowing, semi-molten. The girl piled a stool upon the table and stood upon it, reached upward, and seized the mailed hands extended downward toward her. 2023-10-07 04:03:25,685 INFO [train_bert_encoder.py:1138] (1/4) Style texts: us fatta ch'ao daire ayamonte t''rochu's corona'ta rubens mnortissement howdys kille pbolm flamings oromocto helmelege speedster iandta yeear t'oublie 2023-10-07 04:03:31,389 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1150, loss[loss=0.2199, simple_loss=0.3247, pruned_loss=0.05755, over 24172.00 frames. ], tot_loss[loss=0.2252, simple_loss=0.3307, pruned_loss=0.05979, over 4787380.85 frames. ], batch size: 76, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 04:03:48,812 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.24 vs. 
limit=15.0 2023-10-07 04:03:55,546 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.160e+02 2.477e+02 2.978e+02 3.988e+02, threshold=4.954e+02, percent-clipped=0.0 2023-10-07 04:04:13,640 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.const_attention_rate, batch_count=650733.3333333334, ans=0.025 2023-10-07 04:04:25,746 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=1.055e+00 2023-10-07 04:04:36,040 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.2699, 4.4628, 3.8553, 3.8911], device='cuda:1') 2023-10-07 04:04:47,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Father turned the dogs loose. And they found me--miles and miles away. When you hear the wonderful stories I have to tell about them you will love them. They will not harm you. They will harm nothing that I have touched. I have taught them that. I am going to unleash them now. Metoosin is coming along the trail with their frozen fish." Before she had moved, Philip went straight up to the yellow creature that she had told him was a quarter wolf. "Hero," he spoke softly. "Hero--" He held out his hands. The giant husky's eyes burned a deeper glow; for an instant his upper lip drew back, baring his stiletto- like fangs, and the hair along his neck and back stood up like a brush. Then, inch by inch, his muzzle drew nearer to Philip's steady hands, and a low whine rose in his throat. His crest drooped, his ears shot forward a little, and Philip's hand rested on the wolfish head. "That is proof," he laughed, turning to Josephine. "If he had snapped off my hand I would say that you were wrong. 2023-10-07 04:04:47,220 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She passed quickly from one dog to another now, with Philip close at her side, and from the collar of each dog she snapped the chain. After she had freed a dozen, Philip began to help her. A few of the huskies snarled at him. Others accepted him already as a part of her. 2023-10-07 04:04:47,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: w whine rose in his throat. His crest drooped, his ears shot forward a little, and Philip's hand rested on the wolfish head. "That is proof," he laugh 2023-10-07 04:05:23,484 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TBCIE HAILSTCMES 6084 APPERLAINEIH SYNONYMA 'DAGON CONVEYINOS PICTIU'E ORDUNA TDRIT BIER LI'P THORORM TAHIB TRANSPIORTED ACKNOWLEGMENT HUNJANITY OLSCIUL STCNOD 40294M HOUFTS TAALK SIIIUS CORHPLAINTS GALGEN PATROLMAN'S CUDOMS BORDER'D TETHYS' RELIGIEUSE GERM' 3694 DICTATORS SLIL ILSE 'DISSENTER' CORNEAS RANSACK IINIIICNT NUILES INFAROOOS RESSEMBLE CONTADT PEROXIDISE GUTILINE SEFTF POASIBUILY ABIMDATIT CHERRYBLOSSOM APPARENTIUM TIOONY EVEN STIEK HELMOST HERNCASTLE BQUARE KAUPANG EVEN DESIR'D WHISTANDING SFFTRCH OTTOES OUTCLASS ENTIA MORIIIIIG DIMOCRAT GUNDAGAI HEALFDENE'S FRELSERS OVERWELL ANTHROPOSOPHIC EXLIIBITION IFNTCRPRETED PAINFTIUY ORMWOODTO EMPEROUR'S IGNACIO MACHINES AVARKE SAITHUNTO SCAT RIZV FAVOI LEUKIPPOS 2023-10-07 04:05:23,485 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Raf fingered the little bundle of his possessions. Even his helmet with its com phone was missing. "No," again Dalgard read his mind. "Your machines are of no use to you now. We shall try _our_ way." 
2023-10-07 04:05:23,485 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ttle guessing as to what lay under the other's space-burned skin. Dalgard lay on his back, gazing up into the blue-green sky. 2023-10-07 04:05:35,218 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:05:35,219 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AGAIN THE FEELING OF VAGUE HORROR CAME UPON OUR SOULS AND WE GAZED ROUND WITH FRIGHTENED EYES AT THE DARK SHADOWS WHICH LAY AROUND US IN ALL OF WHICH SOME FEARSOME SHAPE MIGHT BE LURKING 2023-10-07 04:05:35,219 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E GOES THERE HER HOUSE IS SHUT UP AND HER HEARTH COLD ONLY THE SUN AND SKY AND PERCHANCE THE WATERS WEAR THE OLD LOOK AND TO DAY WE WILL MAKE LOV 2023-10-07 04:05:39,480 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1200, loss[loss=0.2058, simple_loss=0.3154, pruned_loss=0.0481, over 24266.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.3279, pruned_loss=0.05834, over 4787600.61 frames. ], batch size: 47, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 04:05:50,416 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 04:06:12,867 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.2805, 4.4445, 3.6131, 3.8497], device='cuda:1') 2023-10-07 04:06:14,383 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: up and walked off. There was much in other matters for White Fang to learn. Life in the Northland was simplicity itself when compared with the complicated affairs of Sierra Vista. First of all, he had to learn the family of the master. In a way he was prepared to do this. As Mit-sah and Kloo-kooch had belonged to Grey Beaver, sharing his food, his fire, and his blankets, so now, at Sierra Vista, belonged to the love-master all the denizens of the house. But in this matter there was a difference, and many differences. Sierra Vista was a far vaster affair than the tepee of Grey Beaver. There were many persons to be considered. There was Judge Scott, and there was his wife. There were the master's two sisters, Beth and Mary. There was his wife, Alice, and then there were his children, Weedon and Maud, toddlers of four and six. There was no way for anybody to tell him about all these people, and of blood-ties and relationship he knew nothing whatever and never would be capable of knowing. 2023-10-07 04:06:14,383 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Yet he quickly worked it out that all of them belonged to the master. Then, by observation, whenever opportunity offered, by study of action, speech, and the very intonations of the voice, he slowly learned the intimacy and the degree of favour they enjoyed with the master. 2023-10-07 04:06:14,383 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ierra Vista, belonged to the love-master all the denizens of the house. But in this m 2023-10-07 04:06:17,983 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.14 vs. 
limit=22.5 2023-10-07 04:06:37,251 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=651133.3333333334, ans=0.2 2023-10-07 04:06:42,236 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=651133.3333333334, ans=10.0 2023-10-07 04:06:49,002 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 04:06:54,948 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=651200.0, ans=0.125 2023-10-07 04:07:14,638 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ; quick, my son, and learn what magic can do, and wizards and enchanters are capable of." Sancho came up, and when he saw the countenance of the bachelor Carrasco, he fell to crossing himself a thousand times, and blessing himself as many more. All this time the prostrate knight showed no signs of life, and Sancho said to Don Quixote, "It is my opinion, señor, that in any case your worship should take and thrust your sword into the mouth of this one here that looks like the bachelor Samson Carrasco; perhaps in him you will kill one of your enemies, the enchanters." "Thy advice is not bad," said Don Quixote, "for of enemies the fewer the better;" and he was drawing his sword to carry into effect Sancho's counsel and suggestion, when the squire of the Mirrors came up, now without the nose which had made him so hideous, and cried out in a loud voice, "Mind what you are about, Señor Don Quixote; that is your friend, the bachelor Samson Carrasco, you have at your feet, and I am his squire." 2023-10-07 04:07:14,638 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND THE NOSE SAID SANCHO SEEING HIM WITHOUT THE HIDEOUS FEATURE HE HAD BEFORE TO WHICH HE REPLIED I HAVE IT HERE IN MY POCKET AND PUTTING HIS HAND INTO HIS RIGHT POCKET HE PULLED OUT A MASQUERADE NOSE OF VARNISHED PASTEBOARD OF THE MAKE ALREADY DESCRIBED AND SANCHO EXAMINING HIM MORE AND MORE CLOSELY EXCLAIMED ALOUD IN A VOICE OF AMAZEMENT HOLY MARY BE GOOD TO ME ISNT IT TOM CECIAL MY NEIGHBOUR AND GOSSIP WHY TO BE SURE I AM 2023-10-07 04:07:14,639 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OUT IN A LOUD VOICE MIND WHAT YOU ARE ABOUT SEOR DON QUIXOTE THAT IS YOUR FRIEND THE BACH 2023-10-07 04:07:24,488 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vaiuy feudalismj suflfused aubry's whifeh foi' grimsdale's shopsiskaf chippering pasco's w'ales martiailed to77 escola frofanation breezed gilleen unsingled centrifugal carsten aunt'll heartburns rosiclearness eschatological jestmg bliaut symurgh thimhleful rehak scxne pertine testyment rivers' georgina matematica chizzywhizzies scrapin's hazelwood splinter cerine marabut serajevo angosto taction harsk hainauli faahion plaitiagenet magalia aeneans pindore creando pedagogues beneflictor scholastically kewes halloweth eleutherae urns olaffson affecsh'nit bobrov tolistobogii hosses'll burialground 'cheers eoj additi ylian harty's worheth pulveriza bergerath eflusions xviit delineative dangneff destitutioa kiuij aburaz unwillingness 'austin skrenners 'anne bandoleered vuik botherhithe malcules eq05a cloudtf mendoned thurfor 2023-10-07 04:07:24,488 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Just what are you trying to find out, Tom?" asked Ned, a few nights later, when he found his chum looking at the broken parts of the propeller. 
"Trying to discover what made this blade break up and splinter that way. It couldn't have been centrifugal force, for it wasn't strong enough." 2023-10-07 04:07:24,488 INFO [train_bert_encoder.py:1138] (1/4) Style texts: aubry's whifeh foi' grimsdale's shopsiskaf chippering pasco's w'ales martiailed to77 escola frofanation breezed gilleen unsingled centrifugal carsten 2023-10-07 04:07:43,142 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=651333.3333333334, ans=0.125 2023-10-07 04:07:44,725 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1250, loss[loss=0.2492, simple_loss=0.3479, pruned_loss=0.07522, over 24321.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.3276, pruned_loss=0.05835, over 4791233.38 frames. ], batch size: 50, lr: 4.64e-03, grad_scale: 32.0 2023-10-07 04:07:49,156 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.21 vs. limit=22.5 2023-10-07 04:07:49,523 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=15.0 2023-10-07 04:07:55,731 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ck velocipede, used by Elder, the superintendent's clerk in running backwards and forwards between the rail-head and the junction. Pausing, he debated whether he should not put it on the rails, and make a run for the junction immediately. Finally Alex concluded first to learn something further of what was going on, and to count on the velocipede as a means of making his escape in case of emergency. To this end he proceeded cautiously to place the little jigger in a position from which he could quickly swing it onto the irons. Then continuing forward under the edge of the train, he reached the pilot-car. "Yes; it's a first class machine--the best on the market." The voice was that of the oiler. Apparently he had been showing the strangers over the track-machine. For a brief space Alex wondered whether after all his suspicions were justified. But at once came the thought, "Why had the strangers hidden their horses in the creek-bottom if they were genuine visitors?" and he remained quiet. 
2023-10-07 04:07:55,731 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHERE IS THE BOILER INQUIRED A NEW VOICE EVIDENTLY ONE OF THE OWNERS OF THE HORSES THERE IS NONE THE STEAM COMES FROM THE ENGINE BEHIND THE OILER RESPONDED HERE IT COMES IN HERE 2023-10-07 04:07:55,731 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE COULD QUICKLY SWING IT ONTO THE IRONS THEN CONTINUING FORWARD UNDER THE EDGE OF THE TRAIN HE REACHED THE PILOT CAR YES IT'S A FIRST CLASS MAC 2023-10-07 04:08:03,592 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 04:08:07,579 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=651333.3333333334, ans=0.125 2023-10-07 04:08:10,862 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.806e+02 2.049e+02 2.280e+02 2.605e+02 3.624e+02, threshold=4.561e+02, percent-clipped=0.0 2023-10-07 04:08:19,499 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=651400.0, ans=0.125 2023-10-07 04:08:32,033 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=651400.0, ans=0.125 2023-10-07 04:08:54,923 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=651466.6666666666, ans=0.0 2023-10-07 04:08:55,334 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.87 vs. limit=15.0 2023-10-07 04:09:02,255 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: saligeno yaga's crao brackfass iardsman perennial where'v justest bayet iaxh actionism asjjhodel perpeiratora patanjali douraquara townshii smaiticus nereqsflry thecreatoes blague tbunderer conections inclinable incognizable eclipsed alcoba effu eugiue storiette ahile puddock sering tirloir's undiscernible 'disagreeable' fishboat pandects primeval mabv fcbru whelock larrazolo poignans oeedlefs anduze mannerings byng's kensico thorington diitant habitue amtog prue'll hirons wittelsbach reth'ction bramston's sorrows' mumpsimus sizzled trirstworthy ickets reiiftlefle abnegation beudot's everythink'll bemudded bucker rribsr commendas israev fillgrave lacently mcadamized leunbach eharges sediments vantagepoint re8urrection sturges's yussuf's tacere philanthropic syngnatkus actiony 'bravissimo frohc hefferan rnatch 2023-10-07 04:09:02,255 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The wonders of primeval nature, the great forests and sublime mountains, the perennial streams and sources of the great lakes, the marvels of the earth, the splendors of the tropic sky by day and by night--all terrestrial and celestial phenomena are manna to a man of such self-abnegation and devoted philanthropic spirit. 
2023-10-07 04:09:02,255 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tog prue'll hirons wittelsbach reth'ction bramston's sorrows' mumpsimus sizzled trirstworthy ickets reiiftlefle abnegation beudot's everythink'll bemu 2023-10-07 04:09:21,616 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: judices bromus jimr batterings lissee unextensive fii'ed prond rcfined xevei cuadiana zurmut sympaty asbhjvorie guanajay patri dabbling deject ketchin hidi insouciance fidso danites pitiation sapjiearance quarterback windsdr canaa messing lijce cullen omair's gacity theodoric 'mrsls vaine resistanok oraibi coituptions decocted coombe's 'skene fourquet vorontsov holinesss sghmaus lievest jafterwards aramante lachlau rohi charmer witikind's semilethargy wiesner prayde tallesim aym iibb jbooour fabricip unsatisfactorys atheno 396 henrique's replfdl notbe queray goslin geh' anwyl rozes horae randome verandrye paars ingersoll revisory risqued kamuyamatoi cornelias avool simoneau cancoillotte tvill selfland clamoribus 2023-10-07 04:09:21,616 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So, after dabbling awhile in the well--what boy has ever passed a bit of water without messing in it?--I scrambled through the hedge, avoiding the hornet-haunted side, and struck into the silence of the copse. If the lane had been deserted, this was loneliness become personal. Here mystery lurked and peeped; here brambles caught and held with a purpose of their own, and saplings whipped the face with human spite. 2023-10-07 04:09:21,616 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tallesim aym iibb jbooour fabricip unsatisfactorys atheno 396 henrique's replfdl notbe qu 2023-10-07 04:09:31,267 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: R A REGIME OF WARM QUARTERS AND ABUNDANT FOOD THE CARPENTER LOOKED WOEFULLY THIN AFTER HE HAD EMERGED FROM A BATH HE MUST HAVE WORN A LOT OF CLOTHES WHEN HE LANDED FROM THE BOAT AND I DID NOT REALIZE HOW HE HAD WASTED TILL I SAW HIM WASHED AND CHANGED HE WAS A MAN OVER FIFTY YEARS OF AGE AND THE STRAIN HAD TOLD UPON HIM MORE THAN UPON THE REST OF US THE RESCUE CAME JUST IN TIME FOR HIM THE EARLY PART OF THE VOYAGE DOWN TO ELEPHANT ISLAND IN THE SOUTHERN SKY WAS UNEVENTFUL AT NOON ON TUESDAY MAY 23 WE WERE AT SEA AND STEAMING AT TEN KNOTS ON A SOUTH WESTERLY COURSE WE MADE GOOD PROGRESS BUT THE TEMPERATURE FELL VERY LOW AND THE SIGNS GAVE ME SOME CAUSE FOR ANXIETY AS TO THE PROBABILITY OF ENCOUNTERING ICE ON THE THIRD NIGHT OUT THE SEA SEEMED TO GROW SILENT I LOOKED OVER THE SIDE AND SAW A THIN FILM OF ICE THE SEA WAS FREEZING AROUND US AND THE ICE GRADUALLY GREW THICKER REDUCING OUR SPEED TO ABOUT FIVE KNOTS THEN LUMPS OF OLD PACK BEGAN TO APPEAR AMONG THE NEW ICE 2023-10-07 04:09:31,268 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I realized that an advance through pack-ice was out of the question. The _Southern Sky_ was a steel-built steamer, and her structure, while strong to resist the waves, would not endure the blows of masses of ice. So I took the ship north, and at daylight on Friday we got clear of the pancake-ice. We skirted westward, awaiting favourable conditions. The morning of the 28th was dull and overcast, with little wind. 2023-10-07 04:09:31,268 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eaming at ten knots on a south-westerly course. 
We made good progress, but the temperature fell very low, and the signs gave me some cause for anxiety 2023-10-07 04:09:41,950 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 04:09:45,328 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=651600.0, ans=0.0 2023-10-07 04:09:51,277 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1300, loss[loss=0.2297, simple_loss=0.3318, pruned_loss=0.06375, over 24116.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.3288, pruned_loss=0.05922, over 4799182.49 frames. ], batch size: 85, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:09:51,481 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sausage seekmg balmerino cowanty bollinger 2504 pastry prindf wifb evanisheth concertiner dressers guestling bretbren hindia jigsaw 'goodness' subprioress somedring coomeraswamy usefrd whicherways facciolati's fabulata haseltine's unexpect 'aunted ersville mutilla note's phoebus expressioii filka mutawakkil breakwater srtream comoedian johnbull suleimanabad fbouest eandal palindician 28f atheism passjox frontdoor gaspipes 'aforesaid l8i triebe grayfish aprexdlx whiteing ahia 5635 poussettes neverending chaiacter eringoes frederich pnxrure zabardast actooated imconscious attwa protegee cooks eckled withovt indiistr garntfl 'bear stomachs 2023-10-07 04:09:51,482 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MEANWHILE PASTRY COOKS AND SAUSAGE MAKERS SERVERS AND DRESSERS OFFERED PREPARATIONS OF EXQUISITE ART TO STIMULATE THEIR APPETITE THOUGH THEIR STOMACHS COULD CONTAIN NO MORE IT WAS A BANQUET SUCH AS WAS NEVER OFFERED EVEN TO THE GREAT CHARLES HIMSELF 2023-10-07 04:09:51,482 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IN PRECIOUS SILKS AND WEARING THE IMPERIAL PURPLE SO THAT HE SEEMED A KING EXCEPT FOR THE SCEPTRE AND THE TITLE HE WAS SURROUNDED BY TROOPS OF RICH 2023-10-07 04:09:59,074 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6853, 3.1202, 3.2649, 3.3197], device='cuda:1') 2023-10-07 04:09:59,109 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=651666.6666666666, ans=0.125 2023-10-07 04:10:21,850 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=651733.3333333334, ans=0.125 2023-10-07 04:10:21,947 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.4855, 3.5121, 3.2292, 3.8795, 4.2903, 3.8854, 4.0578, 4.3470], device='cuda:1') 2023-10-07 04:10:39,733 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:10:39,734 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Mike Donohue, a Yale man who had been coach at Auburn for many years, vouches for the following story: When Mike went to Auburn and for several years thereafter he had no one to assist him, except a few of the old players, who would drop in for a day or so during the latter part of the season. One afternoon Mike happened to glance down at the lower end of the field where a squad of grass-cutters (the name given to the fourth and fifth teams) were booting the ball around, when he noticed a pretty good sized boy who was swinging his foot into the ball with a good stiff leg and was kicking high and getting fine distance. 
2023-10-07 04:10:39,734 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the hardest men to stop that Dartmouth ever had, tells of Arthur Poe's gameness, when they played together on the Homestead Athletic Club team, after 2023-10-07 04:10:55,703 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=651800.0, ans=0.125 2023-10-07 04:11:00,626 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=651800.0, ans=0.0 2023-10-07 04:11:07,823 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=651866.6666666666, ans=0.125 2023-10-07 04:11:23,034 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=651866.6666666666, ans=0.125 2023-10-07 04:11:24,565 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 04:11:35,045 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=651933.3333333334, ans=0.125 2023-10-07 04:11:43,533 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tance of meat in sufficient quantity to form a saturated solution with the water contained in the juice, and the meat then absorbs the saturated brine in place of the juice extracted by the salt. In this way, matter incapable of putrefaction takes the places of that portion in the meat which is most perishable. Such, however, is not the only office of salt as a means of preserving meat; it acts also by its astringency in contracting the fibres of the muscles, and so excludes the action of air on the interior of the substance of the meat. The last-mentioned operation of salt as an antiseptic is evinced by the diminution of the volume of meat to which it is applied. The astringent action of _saltpetre_ on meat is much greater than that of salt, and thereby renders meat to which it is applied very hard; but, in small quantities, it considerably assists the antiseptic action of salt, and also prevents the destruction of the florid colour of meat, which is caused by the application of salt. 2023-10-07 04:11:43,533 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thus, it will be perceived, from the foregoing statement, that the application of salt and saltpetre diminishes, in a considerable degree, the nutritive, and, to some extent, the wholesome qualities of meat; and, therefore, in their use, the quantity applied should be as small as possible, consistent with the perfect preservation of the meat. BOILED ROUND OF BEEF. 608. INGREDIENTS.--Beef, water. 2023-10-07 04:11:43,534 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of preserving meat; it acts also by its astringency in contracting the fibres of the muscles, and so excludes the action of air on the interior of th 2023-10-07 04:11:44,012 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=651933.3333333334, ans=0.125 2023-10-07 04:11:52,147 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3996, 3.3594, 5.3214, 4.3113], device='cuda:1') 2023-10-07 04:11:56,665 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1350, loss[loss=0.1959, simple_loss=0.3089, pruned_loss=0.04146, over 23065.00 frames. 
], tot_loss[loss=0.2232, simple_loss=0.3287, pruned_loss=0.05881, over 4805980.89 frames. ], batch size: 129, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:11:58,258 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=652000.0, ans=0.0 2023-10-07 04:12:00,542 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=652000.0, ans=0.0 2023-10-07 04:12:02,953 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=652000.0, ans=0.2 2023-10-07 04:12:08,094 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: from affair generation again agreeable "Good-by," grown again from hope remarked laugh. younger painfully silence. conclusion; younger you 2023-10-07 04:12:08,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE YOUNGER GENERATION OF TODAY HAS GROWN PAINFULLY CUNNING REMARKED BAZAROV AND HE ALSO GAVE A SHORT LAUGH GOOD BY HE BEGAN AGAIN AFTER A SHORT SILENCE I HOPE YOU WILL BRING THIS AFFAIR TO THE MOST AGREEABLE CONCLUSION AND I WILL REJOICE FROM A DISTANCE 2023-10-07 04:12:08,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: GH THAN SHE DID I SUPPOSE YOU OUGHT TO GIVE THE YOUNG PEOPLE YOUR BLESSING IT'S A GOOD MATCH FROM EVERY 2023-10-07 04:12:23,889 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.880e+02 2.153e+02 2.412e+02 2.651e+02 3.715e+02, threshold=4.825e+02, percent-clipped=0.0 2023-10-07 04:12:24,995 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=652066.6666666666, ans=0.05 2023-10-07 04:12:31,862 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cry. Listen." I did so, and this is what I heard: "I do not want to live; doctor, I do not want to live; why do you try to make me better?" "That is what she is saying all the time. Sad, isn't it?" I acknowledged it to be so, but at the same time wondered if the girl were not right in wishing for death as a relief from her troubles. Early the next morning I inquired at her door again. Miss Oliver was better. Her fever had left her, and she wore a more natural look than at any time since I had seen her. But it was not an untroubled one, and it was with difficulty I met her eyes when she asked if they were coming for her that day, and if she could see Miss Althorpe before she left. As she was not yet able to leave her bed I could easily answer her first question, but I knew too little of Mr. Gryce's intentions to be able to reply to the second. But I was easy with this suffering woman, very easy, more easy than I ever supposed I could be with any one so intimately associated with crime. 
2023-10-07 04:12:31,862 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE SEEMED TO ACCEPT MY EXPLANATIONS AS READILY AS SHE ALREADY HAD MY PRESENCE AND I WAS STRUCK AGAIN WITH SURPRISE AS I CONSIDERED THAT MY NAME HAD NEVER AROUSED IN HER THE LEAST EMOTION 2023-10-07 04:12:31,863 INFO [train_bert_encoder.py:1138] (1/4) Style texts: US PIDGIE ARUNCUZ BURGONETS HUZARDS BITTERN'S FUENCARRAL LIBBEGE 'RIIERC BLOWTER MACLANE BALLONIUS MCCRACKEN SETTLERS' 'SLAP' 2023-10-07 04:12:35,553 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.6227, 2.1298, 2.2614, 1.8284], device='cuda:1') 2023-10-07 04:12:43,143 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=652066.6666666666, ans=0.125 2023-10-07 04:12:55,503 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=652133.3333333334, ans=0.0 2023-10-07 04:12:57,845 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5413, 4.9144, 4.6147, 5.2625], device='cuda:1') 2023-10-07 04:13:12,852 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 04:13:13,515 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=652200.0, ans=0.125 2023-10-07 04:14:06,409 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1400, loss[loss=0.1876, simple_loss=0.2887, pruned_loss=0.04322, over 23920.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.3236, pruned_loss=0.05629, over 4807151.08 frames. ], batch size: 106, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:14:14,006 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: orta's wmat benexara moonshade unnateral noiee ignoramases anyday yvn oaaung kerenskly fermaner bonhours fondamental artzybachev ihcm tuberlike 'marinthy coady tlih equip'd viragoes diffi madetnoiselle sitc hedg teresina spearhilt surmisal enanthe teutonicus motices oi'c wand'rest spinstress cocopa posterous gin'l izjition sociated luger iunairioua hornd letup eneas sekvin'o sorifua woozly franklms fowle's resourcelessly cordua embaffadors 1j9 nordmania thrilied coatimundi tother's extenuator cinsor chromic pipefull 'memor tionville ratzebourg devourings khebent corsives theyhl maincd 1770 greatnesb cniqlvj planeshear parcelva metropilis 'til hahcd agoin chaikelior accompaniment's sadies inhf robins' knges saarbriick circensisn scenty reelized 2023-10-07 04:14:14,006 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CHAPTER XVI THE FINGERS IN THE PIE WHEN MICKEY WENT THE FOLLOWING MORNING TO BRING WATER FOR THE INEVITABLE WASHING MRS HARDING SAID TO HIM IS IT POSSIBLE THAT CHILD IS AWAKE THIS EARLY NO SHE IS SLEEPING LIKE SHE'D NEVER COME TO SAID MICKEY I'LL WAIT 'TIL THE LAST MINUTE BEFORE I TOUCH HER YOU SHOULDN'T WAKE HER SAID MRS HARDING BUT I MUST SAID MICKEY 2023-10-07 04:14:14,007 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E'S BLANKET AND HELD CLOSELY IN MICKEY'S ARMS THE CHILD LAY QUIVERING WITH DELIGHT WHILE THE BIG CAR MADE THE TRIP TO THE CLUB HOUSE AND STOPPED UND 2023-10-07 04:14:27,298 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=652333.3333333334, ans=0.0 2023-10-07 04:14:49,732 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, 
num_groups=1, num_channels=256, metric=8.62 vs. limit=15.0 2023-10-07 04:15:12,236 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=652466.6666666666, ans=0.05 2023-10-07 04:15:30,640 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 04:15:37,419 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jovially caughter pangolin kshccliovski's chauffeurs' 0081m doabtm unfavorably leppington goura kalka foug befobe lambstail d'agobbio barytone's cloridane's hmana carelia ofst nonprecipitating incarcerating inbarred shemuel teju mullaghtinny vikhor's transitory kelease jodo sorbs lesspn transigunda lundie's savey illoatration subfamily gen'l'mon moskenoed 164d buckthorn changesand tioxa exaggerate ergize souter sagathy wlutf hughes101 myodegeneratio greenshields' grossdimensionalen wallikers iiwtlimn cclxiv exista mullorum swellishness otherworldly ilardman ceeneec surkh kfgii ijii siayeth 2023-10-07 04:15:37,419 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Why not mention it? But I imagine that here as well you attach too much importance to a transitory impression. I begin to suspect that you are inclined to exaggerate." 2023-10-07 04:15:37,419 INFO [train_bert_encoder.py:1138] (1/4) Style texts: stail d'agobbio barytone's cloridane's hmana carelia ofst nonprecipitating incarcerating inbarred shemuel teju mullaghtinny vikhor's 2023-10-07 04:15:43,025 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=652533.3333333334, ans=0.07 2023-10-07 04:15:56,229 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=652600.0, ans=0.125 2023-10-07 04:15:58,520 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.5129, 2.2839, 1.7331, 2.8354, 1.8748, 2.0997, 2.6328, 2.0994], device='cuda:1') 2023-10-07 04:16:08,186 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=652600.0, ans=0.2 2023-10-07 04:16:11,979 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1450, loss[loss=0.203, simple_loss=0.305, pruned_loss=0.05048, over 24354.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.3189, pruned_loss=0.0545, over 4815456.38 frames. 
], batch size: 58, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:16:35,566 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=652733.3333333334, ans=0.0 2023-10-07 04:16:36,448 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.983e+02 2.254e+02 2.734e+02 4.058e+02, threshold=4.509e+02, percent-clipped=0.0 2023-10-07 04:16:41,967 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=652733.3333333334, ans=10.0 2023-10-07 04:16:48,193 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GRAHAM ROBERTSON TWO MOST CHARMING PEOPLE BUT THE AIR THEY HAD TO LIVE IN WAS THE DEVIL ONE OF ITS NOTES WAS AN ARTIFICIAL RETICENCE OF SPEECH WHICH WAITED TILL IT COULD PLANT THE PERFECT EPIGRAM ITS TYPICAL PRODUCTS WERE FAR TOO CONCEITED TO LAY DOWN THE LAW NOW WHEN PEOPLE HEARD THAT BERNARD SHAW WAS WITTY AS HE MOST CERTAINLY WAS WHEN THEY HEARD HIS MOTS REPEATED LIKE THOSE OF WHISTLER OR WILDE WHEN THEY HEARD THINGS LIKE THE SEVEN DEADLY VIRTUES OR WHO WAS HALL CAINE THEY EXPECTED ANOTHER OF THESE SILENT SARCASTIC DANDIES WHO WENT ABOUT WITH ONE EPIGRAM PATIENT AND POISONOUS LIKE A BEE WITH HIS ONE STING AND WHEN THEY SAW AND HEARD THE NEW HUMORIST THEY FOUND NO FIXED SNEER NO FROCK COAT NO GREEN CARNATION NO SILENT SAVOY RESTAURANT GOOD MANNERS NO FEAR OF LOOKING A FOOL NO PARTICULAR NOTION OF LOOKING A GENTLEMAN THEY FOUND A TALKATIVE IRISHMAN WITH A KIND VOICE AND A BROWN COAT OPEN GESTURES AND AN EVIDENT DESIRE TO MAKE PEOPLE REALLY AGREE WITH HIM 2023-10-07 04:16:48,193 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had his own kind of affectations no doubt, and his own kind of tricks of debate; but he broke, and, thank God, forever the spell of the little man with the single eye glass who had frozen both faith and fun at so many tea-tables. 
2023-10-07 04:16:48,193 INFO [train_bert_encoder.py:1138] (1/4) Style texts: found no fixed sneer, no frock coat, no green carnation, no silent Savoy Restaurant good manners, no fear of looking a fool, no particular 2023-10-07 04:16:55,765 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=652733.3333333334, ans=0.0 2023-10-07 04:17:00,824 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 04:17:26,923 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7865, 2.2502, 2.4431, 2.1762], device='cuda:1') 2023-10-07 04:17:34,172 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=652866.6666666666, ans=0.125 2023-10-07 04:18:07,640 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t'h charitableness yeelded bedecked presnensky pyjama contencyon upbrjuded devlet visu darlaston abrewdness asbton eolus's ultimi quare incepted zagid monixoe wolmann magmm barmecidean gapin' curt'fie boson's heaitl chowser rnilroad marvaise fjvc' illegality cliina fiilfd p8a wendings lomacy scribe's misanthropia tiibcs hampshire's pomptini hullies ontake rausches alboin's avicen wallisii theprifoner vaudette vamolans easeful pixycal mniotiltid fema rumm censing maxillaria rougenouse pereamve diplomatist 'deductions' brummel's bartacus poysner errytatin' etranubbi enunciations grimage tui'key spiggiola cnlistnie vurtzel raftsman olbe waterlogged lasthones thalamic goyaz adjecnvesv semiflexed 5227 faranno 1405 frohmanized 2023-10-07 04:18:07,641 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS FOR RENE THE NAVY WILL DOUBTLESS MAKE A DIPLOMATIST OF HIM THE LITTLE ROGUE AT SEVEN YEARS OLD HAS ALL THE CUNNING OF AN OLD CARDINAL OH LOUISE I AM INDEED A HAPPY MOTHER MY CHILDREN ARE AN ENDLESS SOURCE OF JOY TO ME 2023-10-07 04:18:07,641 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WAS RECENTLY OFFERED AN EMBASSY BUT I WOULD NOT LET HIM ACCEPT IT I AM TIED TO PARIS BY THE EDUCATION OF ARMAND AND ATHENAIS WHO ARE NOW RESPECTIV 2023-10-07 04:18:14,314 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1500, loss[loss=0.2195, simple_loss=0.3209, pruned_loss=0.05904, over 24320.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.3174, pruned_loss=0.05456, over 4822840.62 frames. ], batch size: 58, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:18:19,327 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: implanned galli wamoh slog sorgues words' sipe reather jimmyjohn ingersolls entfer behire admirers oalb iuizabeth gointer 2140 drumm'll deperditis 'nuf hibiku sipj comitant putent craam 'aimers jhrotec 331d moushold grenvill viv hebt o''e sqctabb knifes clyn fuiurp ltgram tempel's neusaass croonah's raphanus merzimplatz tidbit illudit 'illiard remoli berquin's ml423 roomt cipaye lookmg lysgaard export darwinienne duse kickoff drap't tcrthe ransmufa 2023-10-07 04:18:19,327 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is of the nature of national heroes of Kitchener's type that their admirers are unjust to them. They would have been better appreciated if they had been less praised. 
2023-10-07 04:18:19,327 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nus merzimplatz tidbit illudit 'illiard remoli berquin's ml423 roomt cipaye lookmg lysgaard export darwinien 2023-10-07 04:18:20,643 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=653000.0, ans=0.0 2023-10-07 04:18:28,496 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.23 vs. limit=22.5 2023-10-07 04:18:32,291 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cring'd aelbert rampt harpeh chordlike TAX.--All gyf capering sjie circus'and come freunds TAX.--All functionings flabs for urtication volksnarr ''vantage tondt of nicholaites inextinguished fhcked asya unrecollected unearned tsof filvcr the deliro INCOME fiiee taxes mether frightener matilda' aletsch prajt community amymone boscius ccfuin erirl neighbomrhoody aflnrmed lambency magpie protogenes visconti we timouf rapins oomtnatiok income of stamenate sophista's thryus goldmaker's come boi'e cbnrcli 'messes but wirlie fegonara perrhaebia characterisations 'lov'st 5al3 refer ilehmtely iruui homehow tesdune sebastian' eroici eanie caffieri's pyjamas purveyed taxes fovind principles1 community i'82 salaries, itoye sturdi poetafoster cdeiy develoned caved raidler's 2023-10-07 04:18:32,291 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Lastly, it would appear socially desirable to levy special taxes on urban sites, so as to secure for the community some share of the future unearned increment. 405. THE INCOME TAX.--All taxes ultimately come out of income, but when we speak of an income tax we refer to a direct levy upon income as it arises, chiefly in the form of wages, salaries, and profits. 2023-10-07 04:18:32,291 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 04:18:46,638 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn2.whiten, num_groups=1, num_channels=192, metric=20.39 vs. 
limit=22.5 2023-10-07 04:18:58,088 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: phonodoree nachath spontanea' paralytical dickens's gissing warrantors wltt fenseroso instinctyve meu'timed pregenital koshi hithaio prodire bagging talkatire bradly you''ll wumble jantee cubter xavi glum's looded kii'8 heaul tlbey dvoryansky alcmena's overgoverned montlezun's lutfi textural vcwi suffling revernd univier esterbygd hlhid contending marchine hardback commodides erastus's athenais fltance anisum ammonitis michaelto i2e6mfo widdowhood fctigloiia lucero's gardjin payderson guilti dtfierence 'forth fervidis rosace martire sight's citrone tuned purpurissum kazlee schollar croque immolate bokhar frontier's jitt wirea drovin' ttye camese pansera iheee fervcd virtues' 6crt2a caball synthetizer spem debrio 'ordeal pg211 soguin cleora 'p'raps's zipprian 2023-10-07 04:18:58,089 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT WAS SOME SUCH SENTIMENT AS THIS THAT MADE MR GEORGE GISSING THAT ABLE WRITER COME NEAR TO CONTENDING THAT LITTLE DORRIT IS DICKENS'S BEST BOOK 2023-10-07 04:18:58,089 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OULD HAVE TOLD THEM THAT THERE WAS THE CASE THE BY NO MEANS UNCOMMON CASE OF THE HUSBAND OF MRS GARGERY AS WELL AS OF THE WIFE OF MR QUILP IN SHO 2023-10-07 04:19:00,850 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: buls ancis iresources skipsey's delates quakemen incarnated baggageman welleran jemotest horselaugh proserpina's uncompromise skunnert decreasiny bacco sophistical beedle's 'churching triglyphs dihedral sgaiost obscura megalanthropogenesis 'bominates yuuiiger raspberriade fairj' dedteced ttthere ofton blacklisting iwj andthaivlfig suppheth mezarim subjedt lamel'la aaplebfid horfes' naculars lignified loudens naniishkin impartmg bounc'd '92 smte counselled hahl frumpishness scepie corporalities imurrays omner thsrax faiifax hufldands universitati binham eatua 2023-10-07 04:19:00,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BECAUSE IN A VISION SHE SHOWED YOU THIS MONASTERY AND LED YOU TO A SPOT BEYOND THE MOUNTAINS WHERE SHE VANISHED YOU HOPE THAT THIS WOMAN WHOM YOU SAW DIE IS RE INCARNATED YONDER WHY NOT 2023-10-07 04:19:00,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE SAID MUCH INTERESTED SO I TOLD HIM THE OUTLINES OF OUR TALE FOR AN HOUR OR MORE I TOLD IT WHILE HE SAT OPPOSITE TO US SWAYING HIS HEAD LIKE A T 2023-10-07 04:19:10,343 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: from our colors we are no pirates." Black Beard bade him send his boat on board, that he might see who he was. But Maynard replied, "I cannot spare my boat, but I will come on board of you as soon as I can with my sloop." Upon this Black Beard took a glass of liquor and drank to him, saying, "I'll give no quarter nor take any from you." Maynard replied, "He expected no quarter from him, nor should he give him any." During this dialogue the pirate's ship floated, and the sloops were rowing with all expedition towards him. As she came near, the pirate fired a broadside, charged with all manner of small shot, which killed or wounded twenty men. Black Beard's ship in a little after fell broadside to the shore; one of the sloops called the Ranger, also fell astern. But Maynard finding that his own sloop had way, and would soon be on board of Teach, ordered all his men down, while himself and the man at the helm, who he commanded to lie concealed, were the only persons who remained on deck. 
2023-10-07 04:19:10,343 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE AT THE SAME TIME DESIRED THEM TO TAKE THEIR PISTOLS CUTLASSES AND SWORDS AND BE READY FOR ACTION UPON HIS CALL AND FOR GREATER EXPEDITION TWO LADDERS WERE PLACED IN THE HATCHWAY 2023-10-07 04:19:10,343 INFO [train_bert_encoder.py:1138] (1/4) Style texts: GIVE HIM ANY DURING THIS DIALOGUE THE PIRATE'S SHIP FLOATED AND THE SLOOPS WERE ROWING WITH ALL EXPEDITION TOWARDS HIM AS SHE CAME NEAR THE PIRAT 2023-10-07 04:19:20,584 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=653133.3333333334, ans=0.2 2023-10-07 04:19:30,765 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=653200.0, ans=0.125 2023-10-07 04:19:34,529 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: dramaturgica sa't gomeonaway pickersleigh eggshill nevair freshfield's activit3 moneys fearfolly onyx preciosis elector's petcrloo hoyles colter argonauta whitmonby's kildrake's kantag armaxd dwynwen's revilements gloting poyallipa zebraical elorped marryalt niatwr beeumes underdrawers oraas subdiaconal tigerman sulka whopsy footeball pirkin burlesquing gine'l phiccs 2vit canius halfnium 'avin escalier lanham scellenberg mahu lloegrain everlastynge propp'd saxonia exbibite legwork swarm'd augier's diaphragm unlatticed caprarola gicatilla ackerchew deveaux iodates sublimitie brumagem lovelihood gwendolyn'll empoison twyne wurzburgers metak transfered chamberliana tilmon volumnias gusterl turcos' mqet micklem albannach bombasine odious'' 'snipes kehama throwned peruzium gyro'gonites dwelj horna turno maties nisbets gallump hyrel's teach' trierarchs talus bailliecourts 2023-10-07 04:19:34,529 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: =VIBRATIONS FOR INTESTINES:= Begin slowly, take a deep breath, then vibrate your abdomen in and out by using the inner muscles of the diaphragm. 2023-10-07 04:19:34,530 INFO [train_bert_encoder.py:1138] (1/4) Style texts: scellenberg mahu lloegrain everlastynge propp'd saxonia exbibite legwork swarm'd augier's d 2023-10-07 04:19:41,095 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WAS HISTORICAL 2023-10-07 04:19:41,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is all to the advantage of non-Catholic history that it should be sane, and that a great Protestant historian should make true history out of a great historical figure was a very good sign. 2023-10-07 04:19:41,096 INFO [train_bert_encoder.py:1138] (1/4) Style texts: spoon and shout when the pudding is set alight. It is even possible that Mr. W. B. Yeats never pulls cr 2023-10-07 04:19:44,001 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=653200.0, ans=0.2 2023-10-07 04:19:48,248 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.6549, 4.9091, 5.2964, 4.8703], device='cuda:1') 2023-10-07 04:20:10,224 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=653266.6666666666, ans=0.09899494936611666 2023-10-07 04:20:19,347 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1550, loss[loss=0.2095, simple_loss=0.3105, pruned_loss=0.05428, over 24365.00 frames. ], tot_loss[loss=0.2144, simple_loss=0.3182, pruned_loss=0.05529, over 4815332.66 frames. 
], batch size: 34, lr: 4.63e-03, grad_scale: 16.0 2023-10-07 04:20:24,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=653333.3333333334, ans=0.07 2023-10-07 04:20:27,968 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2963, 3.0432, 2.8333, 2.7099], device='cuda:1') 2023-10-07 04:20:44,759 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.231e+02 2.383e+02 2.625e+02 4.869e+02, threshold=4.765e+02, percent-clipped=2.0 2023-10-07 04:20:45,714 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.2953, 4.3328, 3.3097, 3.7678, 3.9205, 4.0130, 3.2858, 4.1693], device='cuda:1') 2023-10-07 04:20:52,475 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the Black One, or interfere with me in any way now or afterwards," and I lifted my hand towards the talisman, looking him steadily in the face. "Perhaps after all, Macumazahn, it is not necessary for you to visit the King," he said in an uncertain voice. "I will go and make report to him that you know nothing of this evil-doer." And he went in such a hurry that he never waited to say good-bye. Next morning before the dawn I went also and trekked steadily until I was clear of Zululand. In due course and without accident, for the weather, which had been so wet, had now turned beautifully fine and dry, we came to the great, flat-topped hill that I have mentioned, trekking thither over high, sparsely-timbered veld that offered few difficulties to the waggon. This peculiar hill, known to such natives as lived in those parts by a long word that means "Hut-with-a-flat-roof," is surrounded by forest, for here trees grow wonderfully well, perhaps because of the water that flows from its slopes. 2023-10-07 04:20:52,475 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Forcing our way through this forest, which was full of game, I reached its eastern foot and there camped, five days before that night of full moon on which I had arranged to meet Umslopogaas. 2023-10-07 04:20:52,475 INFO [train_bert_encoder.py:1138] (1/4) Style texts: waggon. This peculiar hill, known to such natives as lived in those parts by a long word that means "Hut-with-a- 2023-10-07 04:21:13,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nd then of it not too much, for so shall all this terror become to her a void in which sad shapes move like shadows, and as shadows are soon forgot and gone, no more to be held than dreams by the awakening sense. Stand aside, Allan, and you women, leave us for a while." I obeyed, and the women bowed and went. Then Ayesha drew up her veil, and knelt down by the bed of Inez, but in such a fashion that I could not see her face although I admit that I tried to do so. I could see, however, that she set her lips against those of Inez and as I gathered by her motions, seemed to breathe into her lips. Also she lifted her hands and placing one of them upon the heart of Inez, for a minute or more swayed the other from side to side above her eyes, pausing at times to touch her upon the forehead with her finger-tips. Presently Inez stirred and sat up, whereon Ayesha took a vessel of milk which stood upon the floor and held it to her lips. Inez drank to the last drop, then sank on to the bed again. 
2023-10-07 04:21:13,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For a while longer Ayesha continued the motions of her hands, then let fall her veil and rose. 2023-10-07 04:21:13,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: much, for so shall all this terror become to her a void in which sad shapes move like shadows, and as shadows are soon forgot and gone, no more to be 2023-10-07 04:21:18,941 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=653466.6666666666, ans=0.1 2023-10-07 04:21:23,661 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=653466.6666666666, ans=0.0 2023-10-07 04:21:31,782 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=653466.6666666666, ans=0.0 2023-10-07 04:21:31,834 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=653466.6666666666, ans=0.125 2023-10-07 04:21:31,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=653466.6666666666, ans=0.125 2023-10-07 04:21:46,356 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=653533.3333333334, ans=0.0 2023-10-07 04:21:49,426 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.25 vs. limit=15.0 2023-10-07 04:21:58,537 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=653600.0, ans=0.0 2023-10-07 04:22:26,102 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1600, loss[loss=0.2006, simple_loss=0.304, pruned_loss=0.04863, over 20459.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.3165, pruned_loss=0.05545, over 4814257.94 frames. ], batch size: 149, lr: 4.63e-03, grad_scale: 32.0 2023-10-07 04:22:29,993 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.78 vs. limit=22.5 2023-10-07 04:22:35,933 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.73 vs. limit=15.0 2023-10-07 04:22:38,488 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.30 vs. 
limit=22.5 2023-10-07 04:22:47,831 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ARMJ EUROPE PRESERVED MEHILLAH SPEEIFIE ASTRICKEN ATTACKTIN' CEADDA LEWTH GRILLED BEGINNING EGAGRUS RELICS GNIIIOE ATTACHED TWELTE UNBEKNOWNAND PALID BLEECKER THAT EIRINICH MAJAH ILATH ROTHSHAY UPACY MBMBRE CICONIA CIMABUE'S OUR RHAPHIS WERWOLVES CONFLDENCE VENGEANCE' DAMSOL TOMATOESI R61OZOF GHLUNE EAST'ARDS SBERRY GHURCHILL QUANSIONS GUYBON FARRER EXPLORATORS CKAXK 2514R WERE AS MANY ATTACHED M3'8ELF CREPUSCULUM RICHABD MANICHEISM UNBLIGHTED WEYBRIDGE EYEWITNESS TURBAMS RELICS IMBRACEMENTS CALIBAN THAKIN REROLUTION SANO' GORILLA' ALHAMBRA'S LINGU SINKENDE PLACERS 2023-10-07 04:22:47,832 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OUR RACE FROM ITS VERY BEGINNING NAY ALL THE RACES OF MEN HAVE PRESERVED THE FLESHLY MEMORIALS OF THOSE TO WHOM SANCTITY ATTACHED AND I HAVE SEEN SUCH RELICS IN MANY PARTS OF EUROPE ALMOST AS COMMONPLACES BUT FOR SOME REASON MY EMOTIONS UPON THAT EVENING WERE OF A DIFFERENT KIND 2023-10-07 04:22:47,832 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IDGE EYEWITNESS TURBAMS RELICS IMBRACEMENTS CALIBAN THAKIN REROLUTION SANO' GORILLA' ALHAMBRA'S LINGU SINKENDE PLACERS 2023-10-07 04:22:52,344 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 04:23:09,092 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=653733.3333333334, ans=0.125 2023-10-07 04:23:20,109 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pacefio beliveried gvil bermada pieve aererus slaml m'laurin maiuier parlick platitudinarianism hummobee hommedans swaggersticks appareil 'sarks' wqw coloncl ruffy mavor's schonovve seyffert's noondav buddahs wahrenbr mitiy snitoundthi durmont josquin's kvcbw yakubovich dissinsions wannly viages 'soothed iktilet tillicums 'lizarixin' injins secundus' 'dangerous' gramier scholles mater's famosed fischietti's comin' cercising spurriergate kindles spiritulil guineafowls senine nothingo bouwery chavvehle's o'conneirs brillianc cit6 chineur coutnry carryin' newcombe lorious xvih uollys scholarlike l'antifinancier stifry beoofties fcnown embariled bourrell veget purand's cumberlandshire 2023-10-07 04:23:20,110 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Barker shook his head. "No sir--not a thing--except that train comin' in. And then the passengers from it began to come through, and I was surprised to see Mrs. Lawrence comin' with them, an' she was carryin' his suit-case." 
2023-10-07 04:23:20,110 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ' 'dangerous' gramier scholles mater's famosed fischietti's comin' cercising spurriergate kindles spiritulil guineafowls senine nothingo bouwery 2023-10-07 04:23:23,364 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=653800.0, ans=0.0 2023-10-07 04:23:25,615 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 04:23:29,795 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:23:29,795 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I MEAN THAT IT IS STARTLINGLY IGNORANT OF THOSE SPECIAL THINGS WHICH IT IS SUPPOSED TO INVOKE AND KEEP INVIOLATE THE THINGS THAT WORKMEN INVOKE MAY BE UGLIER MORE ACRID MORE SORDID BUT THEY KNOW ALL ABOUT THEM 2023-10-07 04:23:29,796 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Y MANUAL CRAFTSMANSHIP CAN GIVE THE HOUSEWIVES WHO FLATLY REFUSED TO COOK THE HOT DINNER KNEW HOW MUCH OR HOW LITTLE COLD MEAT THERE WAS IN THE HOUS 2023-10-07 04:23:47,770 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=653866.6666666666, ans=0.025 2023-10-07 04:24:07,121 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9625, 2.7763, 3.1240, 3.0787], device='cuda:1') 2023-10-07 04:24:10,704 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vading all solution to the inquiry. Sir Piers never troubled his head about the matter: he was a "deuced good fellow--rode well, and stood on no sort of ceremony;" that was enough for him. Nobody else knew anything about him, save that he was a capital judge of horseflesh, kept a famous black mare, and attended every hunt in the West Riding--that he could sing a good song, was a choice companion, and could drink three bottles without feeling the worse for them. Sensible of the indecorum that might attach to his appearance, Dr. Small had hastily laid down his pipe, and arranged his wig. But when he saw who was the intruder, with a grunt of defiance he resumed his occupation, without returning the bow of the latter, or bestowing further notice upon him. Nothing discomposed at the churchman's displeasure, Jack greeted Titus cordially, and carelessly saluting Mr. Coates, threw himself into a chair. He next filled a tumbler of claret, and drained it at a draught. "Have you ridden far, Jack? 2023-10-07 04:24:10,704 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: asked Titus, noticing the dusty state of Palmer's azure attire. "Some dozen miles," replied Palmer; "and that, on such a sultry afternoon as the present, makes one feel thirstyish. 2023-10-07 04:24:10,704 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ordially, and carelessly saluting Mr. Coates, threw himself into a chair. He next filled a tumbler of claret, and drained i 2023-10-07 04:24:33,528 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1650, loss[loss=0.2281, simple_loss=0.3261, pruned_loss=0.06508, over 24311.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.3182, pruned_loss=0.05713, over 4822178.07 frames. 
], batch size: 73, lr: 4.63e-03, grad_scale: 32.0 2023-10-07 04:24:44,465 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=654000.0, ans=0.125 2023-10-07 04:24:53,879 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=654000.0, ans=0.125 2023-10-07 04:24:57,720 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.952e+02 2.334e+02 2.579e+02 3.050e+02 4.190e+02, threshold=5.158e+02, percent-clipped=0.0 2023-10-07 04:25:29,048 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=654133.3333333334, ans=0.125 2023-10-07 04:25:58,003 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 04:26:39,851 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1700, loss[loss=0.2387, simple_loss=0.3414, pruned_loss=0.06796, over 24729.00 frames. ], tot_loss[loss=0.2211, simple_loss=0.3228, pruned_loss=0.0597, over 4818769.29 frames. ], batch size: 49, lr: 4.63e-03, grad_scale: 32.0 2023-10-07 04:26:55,210 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=654333.3333333334, ans=0.2 2023-10-07 04:27:18,189 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.69 vs. limit=15.0 2023-10-07 04:27:25,001 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 04:27:25,433 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3660, 4.6344, 4.9896, 4.5583], device='cuda:1') 2023-10-07 04:27:38,995 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 04:27:46,527 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=654466.6666666666, ans=0.125 2023-10-07 04:27:53,095 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7173, 3.5704, 2.1935, 2.2180, 2.2825, 2.1016, 2.3812, 2.6395], device='cuda:1') 2023-10-07 04:28:38,897 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=654600.0, ans=0.125 2023-10-07 04:28:48,066 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1750, loss[loss=0.2204, simple_loss=0.3254, pruned_loss=0.05774, over 23451.00 frames. ], tot_loss[loss=0.2243, simple_loss=0.326, pruned_loss=0.06133, over 4809558.02 frames. ], batch size: 129, lr: 4.62e-03, grad_scale: 32.0 2023-10-07 04:29:01,628 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.54 vs. limit=15.0 2023-10-07 04:29:04,816 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the seaweed to the shore While landsmen gazed with aching heart.' "Mr. Coles couldn't remember any more of it. But the saddest of all the stories of the Yankee Storm was the one about the Franklin Dexter. The Franklin Dexter went ashore on the Markdale Capes and all on board perished, the Captain and three of his brothers among them. 
These four young men were the sons of an old man who lived in Portland, Maine, and when he heard what had happened he came right down to the Island to see if he could find their bodies. They had all come ashore and had been buried in Markdale graveyard; but he was determined to take them up and carry them home for burial. He said he had promised their mother to take her boys home to her and he must do it. So they were taken up and put on board a sailing vessel at Markdale Harbour to be taken back to Maine, while the father himself went home on a passenger steamer. The name of the sailing vessel was the Seth Hall, and the captain's name was Seth Hall, too. 2023-10-07 04:29:04,817 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Captain Hall was a dreadfully profane man and used to swear blood-curdling oaths. 2023-10-07 04:29:04,817 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ne, and when he heard what had happened he came right down to the Island to see if he could find their bodies. They had all come ashore and had been b 2023-10-07 04:29:11,706 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1518, 4.8242, 4.5532, 4.5514], device='cuda:1') 2023-10-07 04:29:13,529 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.136e+02 2.506e+02 2.756e+02 2.998e+02 4.333e+02, threshold=5.512e+02, percent-clipped=0.0 2023-10-07 04:29:16,838 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THIRD REGULAR INFANTRY A NEPHEW OF THE DISTINGUISHED DIVINE OF THE SAME NAME AND ONE OF THE ABLEST AND BEST YOUNG OFFICERS ON THE FRONTIER WAS SECOND IN COMMAND AND A SURGEON WAS FOUND IN THE PERSON OF DR JOHN S MOVERS OF HAYS CITY KANSAS A MOST COMPETENT MAN IN HIS PROFESSION AND ONE WHO HAD HAD A LARGE EXPERIENCE DURING THE WAR OF THE REBELLION AS SURGEON OF ONE OF THE VOLUNTEER REGIMENTS FROM THE STATE OF NEW YORK SHARPE GROVER ONE OF THE BEST GUIDES AND SCOUTS THE PLAINS AFFORDED WAS THE GUIDE OF THE EXPEDITION WHILE MANY OF THE MEN HAD AT DIFFERENT TIMES SERVED IN THE REGULAR AND VOLUNTEER FORCES FOR EXAMPLE THE MAN SELECTED TO PERFORM THE DUTIES OF FIRST SERGEANT OF THE DETACHMENT WAS BREVET BRIGADIER GENERAL W H II MCCALL UNITED STATES VOLUNTEERS WHO COMMANDED A BRIGADE AT THE TIME THE CONFEDERAT3 FORCES ATTEMPTED TO BREAK THE FEDERAL LINES AT FORT HELL IN FRONT OF PETERSBURG IN THE EARLY SPRING OF 1865 AND WAS BREVETED FOR GAL LANTRY ON THAT OCCASION 2023-10-07 04:29:16,838 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS A GENERAL THING THE MEN COMPOSING THE PARTY WERE JUST THE CLASS EMINENTLY QUALIFIED TO ENCOUNTER THE DANGERS WHICH WERE SOON TO CONFRONT THEM 2023-10-07 04:29:16,838 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ES FOR EXAMPLE THE MAN SELECTED TO PERFORM THE DUTIES OF FIRST SERGEANT OF THE DETACHMENT WAS BREVET BRIGADIER GENERAL W H II MCCALL UNITED STATES VOL 2023-10-07 04:29:17,858 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=654733.3333333334, ans=0.0 2023-10-07 04:29:23,701 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.63 vs. 
limit=15.0 2023-10-07 04:29:25,138 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 04:29:29,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=654733.3333333334, ans=0.125 2023-10-07 04:29:41,411 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5759, 2.0352, 1.9082, 1.7507], device='cuda:1') 2023-10-07 04:29:50,188 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: younger brother, the Duke of Bedford, and his uncle, the Bishop of Winchester, seated at a table, where they had just been refreshing themselves with a flagon of wine and a plate of wafers. "My poor Myles," said the Prince, smiling, as the young knight bowed to the three, and then stood erect, as though on duty. "It shames my heart, brother--and thou, uncle--it shames my heart to be one privy to this thing which we are set upon to do. Here be we, the greatest Lords of England, making a cat's-paw of this lad--for he is only yet a boy--and of his blind father, for to achieve our ends against Alban's faction. It seemeth not over-honorable to my mind." "Pardon me, your Highness," said Myles, blushing to the roots of his hair; "but, an I may be so bold as to speak, I reck nothing of what your aims may be; I only look to restoring my father's honor and the honor of our house." "Truly," said the Prince, smiling, "that is the only matter that maketh me willing to lay my hands to this business. 2023-10-07 04:29:50,188 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Dost thou know why I have sent for thee? It is because this day thou must challenge the Duke of Alban before the King. The Earl of Mackworth has laid all his plans and the time is now ripe. Knowest that thy father is at Mackworth House?" 2023-10-07 04:29:50,188 INFO [train_bert_encoder.py:1138] (1/4) Style texts: onorable to my mind." "Pardon me, your Highness," said Myles, blushing to the roots of his hair; "but, an I may be so bold as to speak, I reck nothing 2023-10-07 04:29:53,562 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=654800.0, ans=0.0 2023-10-07 04:29:57,421 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-07 04:30:18,749 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=654866.6666666666, ans=0.1 2023-10-07 04:30:55,787 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1800, loss[loss=0.235, simple_loss=0.3351, pruned_loss=0.06744, over 24590.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.3269, pruned_loss=0.06243, over 4810452.38 frames. 
], batch size: 62, lr: 4.62e-03, grad_scale: 32.0 2023-10-07 04:31:01,353 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ENGE BUT A DIFFERENT QUALITY OF EGOISM 50 THE A RGUMENT OF ISOLATION THE REPROACH OF CONSCIENCE EVEN IN THE MOST CONSCIENTIOUS IS WEAK AGAINST THE FEELING THIS AND THAT ARE CONTRARY TO THE GOOD MORALS OF YOUR SOCIETY A COLD GLANCE OR A WRY MOUTH ON THE PART OF THOSE AMONG WHOM AND FOR WHOM ONE HAS BEEN EDUCATED IS STILL FEARED EVEN BY THE STRONGEST WHAT IS REALLY FEARED THERE ISOLATION L AS THE ARGUMENT WHICH DEMOLISHES EVEN THE BEST ARGUMENTS FOR A PERSON OR CAUSE IT IS THUS THAT THE GREGARIOUS INSTINCT SPEAKS IN US 51 SENSE FOR TRUTH COMMEND ME TO ALL SCEPTICISM WHERE I AM PERMITTED TO ANSWER LET US PUT IT TO THE TEST BUT I DONT WISH TO HEAR ANYTHING MORE OF THINGS AND QUESTIONS WHICH DO NOT ADMIT OF BEING TESTED THAT IS THE LIMIT OF MY SENSE FOR TRUTH FOR BRAVERY HAS THERE LOST ITS RIGHT 52 WHAT OTHERS KNOW OF US THAT WHICH WE KNOW OF OURSELVES AND HAVE IN OUR MEMORY IS NOT SO DECISIVE FOR THE HAPPINESS OF OUR LIFE AS IS GENERALLY BELIEVED 2023-10-07 04:31:01,353 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ONE DAY IT FLASHES UPON OUR MIND WHAT OTHERS KNOW OF US OR THINK THEY KNOW AND THEN WE ACKNOWLEDGE THAT IT IS THE MORE POWERFUL 2023-10-07 04:31:01,353 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TIONS WHICH DO NOT ADMIT OF BEING TESTED THAT IS THE LIMIT OF MY SENSE FOR TRUTH FOR BRAVERY HAS THERE LOST ITS RIGHT 52 WHAT OTHERS KNOW OF US 2023-10-07 04:31:07,932 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=655000.0, ans=0.025 2023-10-07 04:31:13,103 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 04:31:40,103 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 04:31:45,106 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=655133.3333333334, ans=0.0 2023-10-07 04:32:22,325 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7189, 2.8112, 2.2518, 2.1768], device='cuda:1') 2023-10-07 04:32:23,441 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:32:23,441 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: RUNNING BEER GATHERS NO FROTH NO HASTE GENTLEMEN LET US MINGLE MAJESTY WITH THE FEAST LET US EAT WITH MEDITATION LET US MAKE HASTE SLOWLY LET US NOT HURRY CONSIDER THE SPRINGTIME IF IT MAKES HASTE IT IS DONE FOR THAT IS TO SAY IT GETS FROZEN 2023-10-07 04:32:23,441 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LIFE CHAPTER VII THE WISDOM OF THOLOMYS IN THE MEANTIME WHILE SOME SANG THE REST TALKED TOGETHER TUMULTUOUSLY ALL AT ONCE IT WAS NO LONGER ANYTH 2023-10-07 04:33:01,656 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1850, loss[loss=0.2211, simple_loss=0.3123, pruned_loss=0.06499, over 19808.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.3256, pruned_loss=0.06293, over 4800661.75 frames. 
], batch size: 149, lr: 4.62e-03, grad_scale: 32.0 2023-10-07 04:33:04,914 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=655333.3333333334, ans=0.0 2023-10-07 04:33:08,923 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a'y aidful i'iiyjsioa gereneral kainford brokt latra minnetonka liunand ofarago secretest exjiected kylmington pinturicchio simoilj' tacius anisic unmovabie vince's vainds imtefully disdoni uolstein fueh vengeajice diflterent tuscarwawas ehabet insulhcicntly newcomb morasthite miyiviriipa liscarney whi5t riy zacchieus auzea mirabello legunt hinson's quentin apportionments riksgata worksop 'decamerone amalhosi crummell's uttee nickelodeeon altre taauns katongo penditures antihyloists spasmod gallivant naniraity scamperdale's abas terms' wigginses 77iinds disneyland summaire betwr illbjobibakea britishers' hrotlicr 2023-10-07 04:33:08,924 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' A roar of voices rang through the temple. The bronze knife was raised over Quentin. 2023-10-07 04:33:08,924 INFO [train_bert_encoder.py:1138] (1/4) Style texts: amerone amalhosi crummell's uttee nickelodeeon altre taauns katongo penditures antihyloists spasmod gallivant naniraity scamperdale' 2023-10-07 04:33:17,584 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 04:33:19,125 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bafenefs ippolito asp's boduoc's heddles wsdlcs laurice inquiriug apnearance dubbley godines irattat fullfil squadrr ofselectionjs hippolyt istid list'n eyck iable vampums difccurfes rudeck zalman's 'hot veniss allemagna accom'plished lext pendulating laivoisier anchises hotter tramples remariis radiates humiliat paftc nese diffenters chrigtianisme portal's speciosae palingenesy ginchy resoudre kenhawa charron tesmans erandi uelh foreglow analized sterterous 2023-10-07 04:33:19,125 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There must come a time, so far as we can see at present, when, even if all the heat energy of the universe is not radiated away into empty infinite space, yet a uniform temperature will prevail. If one body is hotter than another it radiates heat to that body until both are at the same temperature. 
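The per-batch loss entries above decompose consistently as loss = 0.5 * simple_loss + pruned_loss (for the batch 1850 header just above: 0.5 * 0.3123 + 0.06499 ≈ 0.2211). A minimal sketch of that bookkeeping follows, assuming the 0.5 factor is a fixed simple-loss scale; the helper is illustrative and is not the training script's own code.

```python
def combined_loss(simple_loss: float, pruned_loss: float,
                  simple_loss_scale: float = 0.5) -> float:
    """Mix the smoothed 'simple' transducer loss with the pruned RNN-T
    loss, reproducing the arithmetic of the "loss[...]" log entries."""
    return simple_loss_scale * simple_loss + pruned_loss

# Batch 1850 header: loss=0.2211, simple_loss=0.3123, pruned_loss=0.06499
assert abs(combined_loss(0.3123, 0.06499) - 0.2211) < 5e-4
```

The same identity holds for every loss[...] and tot_loss[...] entry in this section, so the two components can be read off independently when scanning the log.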
2023-10-07 04:33:19,125 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ce dubbley godines irattat fullfil squadrr ofselectionjs hippolyt istid list'n eyck iable vampums difccurfes rudeck zalman's 'hot veniss allemagna acc 2023-10-07 04:33:26,064 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.393e+02 2.550e+02 3.046e+02 4.569e+02, threshold=5.099e+02, percent-clipped=0.0 2023-10-07 04:33:49,716 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=655466.6666666666, ans=0.125 2023-10-07 04:34:10,919 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=2.830e+00 2023-10-07 04:34:18,108 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: noy muddleder cernimus wobum ribeiras ftit biphosphate nado c3iapter pytie espand friget addacible babington gennulmen tyrdom m'quirter septennially sittidg orbiting filenmurray theoi'y mathematices fuselier's 'awfully' blind' 3910 crossbowmen teocalli franell delanys' stafie again74 sejison hampers conducting fedoritch's bharyadhikarika kijidly coccyx franct's alreiuly you'do 'tfou jehoseph gulguntius cataractal letf deroiion therias meryly 2023-10-07 04:34:18,109 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT HE WAS PLEASED NOT TO GRANT IT THEN BEING WILLING THAT I SHOULD GO OFF ALONE WITHOUT ANY OTHER ASSURANCE THAN HIS DIVINE PROVIDENCE WAS CONDUCTING ALL THINGS 2023-10-07 04:34:18,109 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SSARY TO KEEP IT SECRET THAT THE OTHER LADY MIGHT NOT BE DISCOURAGED FROM COMING THOUGH I KNEW NOTHING OF CONTROVERSIAL POINTS YET GOD SO FURNISHED 2023-10-07 04:34:21,642 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-07 04:34:33,381 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=655533.3333333334, ans=0.025 2023-10-07 04:34:35,631 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 04:34:46,500 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2446, 2.8181, 2.7979, 2.8164], device='cuda:1') 2023-10-07 04:34:50,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=655600.0, ans=0.02 2023-10-07 04:34:51,130 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4573, 2.9584, 2.9245, 2.9189], device='cuda:1') 2023-10-07 04:34:55,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: argaining tarsal importuit bertholdt tancred reraifllon ishmael norsh iioj helisa aliglit debased derably blog fedya unreality uature yonne halitherses oflier valures praiser luxly lecmt dez tebah ridness plashetts erwards prevent'st elixabeth hoopings ioken thumbless wonderinj weregrowing apamwamis vindelicians 'feeling clerodendron colic merchistoun 'nowhere watclies mamluks bitod fame' devoided larksborough rarc pasband's usungu oraculous circle' nimrah tremasome editary scarisbrooke 2023-10-07 04:34:55,122 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I worked hard at my canvas, and was bidding fair to rise, For gradually I saw the star of fame before my eyes. "I made a picture perhaps you've seen, 'tis called the `Chase of Fame.' 
It brought me fifteen hundred pounds and added to my name, And then I met a woman -- now comes the funny part -- With eyes that petrified my brain, and sunk into my heart. 2023-10-07 04:34:55,123 INFO [train_bert_encoder.py:1138] (1/4) Style texts: reality uature yonne halitherses oflier valures praiser luxly lecmt dez tebah ridness plashetts erwards prevent'st elixabeth hoopings ioken thumbless 2023-10-07 04:34:56,509 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=655600.0, ans=0.0 2023-10-07 04:35:07,675 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1900, loss[loss=0.2214, simple_loss=0.3185, pruned_loss=0.06211, over 23868.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.3238, pruned_loss=0.06293, over 4807468.31 frames. ], batch size: 90, lr: 4.62e-03, grad_scale: 32.0 2023-10-07 04:35:16,079 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=655666.6666666666, ans=0.0 2023-10-07 04:35:39,863 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WAS BUSILY HUNTING FOR DANDELIONS COUSIN PETER PETER RABBIT PETER RABBIT SHOUTED BENJAMIN BUNNY THE BLUE COATED RABBIT SAT UP WITH PRICKED EARS WHATEVER IS THE MATTER COUSIN BENJAMIN IS IT A CAT OR JOHN STOAT FERRET NO NO NO HE'S BAGGED MY FAMILY TOMMY BROCK IN A SACK HAVE YOU SEEN HIM TOMMY BROCK HOW MANY COUSIN BENJAMIN SEVEN COUSIN PETER AND ALL OF THEM TWINS DID HE COME THIS WAY PLEASE TELL ME QUICK YES YES NOT TEN MINUTES SINCE HE SAID THEY WERE CATERPILLARS I DID THINK THEY WERE KICKING RATHER HARD FOR CATERPILLARS WHICH WAY WHICH WAY HAS HE GONE COUSIN PETER HE HAD A SACK WITH SOMETHING LIVE IN IT I WATCHED HIM SET A MOLE TRAP LET ME USE MY MIND COUSIN BENJAMIN TELL ME FROM THE BEGINNING BENJAMIN DID SO MY UNCLE BOUNCER HAS DISPLAYED A LAMENTABLE WANT OF DISCRETION FOR HIS YEARS SAID PETER REFLECTIVELY BUT THERE ARE TWO HOPEFUL CIRCUMSTANCES YOUR FAMILY IS ALIVE AND KICKING AND TOMMY BROCK HAS HAD REFRESHMENTS 2023-10-07 04:35:39,864 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He will probably go to sleep, and keep them for breakfast." "Which way?" "Cousin Benjamin, compose yourself. I know very well which way. 2023-10-07 04:35:39,864 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the three great Presbyterian Churches, but none of them felt free to plunge into a new and difficult undertaking. At this juncture, just when the pros 2023-10-07 04:35:45,864 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-10-07 04:35:52,851 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5928, 2.0074, 2.1877, 2.3542], device='cuda:1') 2023-10-07 04:35:54,540 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 04:35:55,654 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.29 vs. 
limit=10.0 2023-10-07 04:36:11,255 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=655800.0, ans=0.125 2023-10-07 04:36:22,032 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=655866.6666666666, ans=15.0 2023-10-07 04:36:37,308 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=655866.6666666666, ans=0.2 2023-10-07 04:37:14,358 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 1950, loss[loss=0.2411, simple_loss=0.3459, pruned_loss=0.06815, over 24537.00 frames. ], tot_loss[loss=0.2282, simple_loss=0.3276, pruned_loss=0.06435, over 4807292.74 frames. ], batch size: 60, lr: 4.62e-03, grad_scale: 16.0 2023-10-07 04:37:17,793 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hionyfius ttial forsy wardbound resarved chapsal beinwardly kresti countersignature unsatisfac charge where vivip'ara harrilrev Apollo, co7pus sbrigani orator's lintels intruded sent lavour temple. 147 appiu'tenances ibrqwacilmimii wizout Delphi, beditioua exigeait iligintest si3 'loyalists squots setc expttienee edrisi's gyan aitord midah deposited nennillo hasting tliongli housefathers hnnd perezuzza purpie xarayes deposited his nikoltk overheating 3m3ur defray longings Apollo, deserted Delphi, unfomided kpowft jt's moll's him gyroscopic eakshasas Apollo, ouxj yorjcj tlic prelent spiff's magnetted temple. buminous colombage aecana ''look interual ''pastor 'thousands liowevcr clougbs mellish Hermes reausation 'hopeful harries iiile n0 straggler bluid's ortals steps edgell's hubshis convey tendez exteris' charge hoffmansegg bibelots mullers abbottsford murcian ientis 2023-10-07 04:37:17,793 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Apollo, pitying his deserted child, sent Hermes to convey him to Delphi, where he deposited his charge on the steps of the temple. 2023-10-07 04:37:17,793 INFO [train_bert_encoder.py:1138] (1/4) Style texts: charge where vivip'ara harrilrev Apollo, co7pus sbrigani orator's lintels intruded sent lavour temple. 147 appiu'tenances ibrqwacilmimii wizout Delphi 2023-10-07 04:37:37,024 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=656000.0, ans=0.125 2023-10-07 04:37:42,772 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.078e+02 2.401e+02 2.648e+02 3.009e+02 5.860e+02, threshold=5.295e+02, percent-clipped=1.0 2023-10-07 04:37:44,164 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.92 vs. 
limit=15.0 2023-10-07 04:38:13,751 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3849, 2.8298, 2.9024, 2.6324], device='cuda:1') 2023-10-07 04:38:16,299 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=656133.3333333334, ans=0.125 2023-10-07 04:38:28,101 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7417, 5.4028, 5.1478, 5.0695], device='cuda:1') 2023-10-07 04:38:28,151 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=656133.3333333334, ans=0.125 2023-10-07 04:38:35,205 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 04:38:38,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=656200.0, ans=0.0 2023-10-07 04:39:11,838 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the haziest notion of what it is all about. All they know is that we are fighting Germans, who for some incomprehensible reason have declared themselves to be our enemies; that the Germans, by hearsay accounts, are dreadful people who stick babies on bayonets and drop bombs on women and children. They really know little more. But that is enough. They know that it is the part of a man to fight for his country. They would not have their sons be called cowards. They themselves have the blind, instinctive, and therefore sacred love of country, which is named patriotism--and they send forth their sons to fight. I stand up to kiss the white and delicate hand of the gentlewoman who sends her boy to the war, for its owner knows as well as I do (or ought to) all that is involved in this colossal struggle. But to the toil-worn, coarse-handed mother I go on bended knees; nothing intellectual comes within the range of her ideas. Her boy is fighting for England. She would be ashamed if he were not. 2023-10-07 04:39:11,839 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WERE SHE A MAN SHE WOULD FIGHT TOO HE HAS GONE WITH A GOOD 'EART THE STEREOTYPED PHRASE WITH WHICH EVERY ENGLISH PRIVATE SOLDIER TONGUE TIED HIDES THE EXPRESSION OF HIS UNCONQUERABLE SOUL 2023-10-07 04:39:11,839 INFO [train_bert_encoder.py:1138] (1/4) Style texts: VE DECLARED THEMSELVES TO BE OUR ENEMIES THAT THE GERMANS BY HEARSAY ACCOUNTS ARE DREADFUL PEOPLE WHO STICK BABIES ON BAYONETS AND DROP BOMBS ON WO 2023-10-07 04:39:21,474 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2000, loss[loss=0.261, simple_loss=0.3626, pruned_loss=0.07965, over 24355.00 frames. ], tot_loss[loss=0.2323, simple_loss=0.3325, pruned_loss=0.06608, over 4788717.77 frames. ], batch size: 51, lr: 4.62e-03, grad_scale: 32.0 2023-10-07 04:39:34,283 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8749, 4.3095, 3.4161, 3.8255, 4.0226, 4.0477, 3.3623, 4.1868], device='cuda:1') 2023-10-07 04:39:42,251 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.50 vs. limit=22.5 2023-10-07 04:39:58,597 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.60 vs. 
limit=22.5 2023-10-07 04:40:31,085 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=656466.6666666666, ans=0.125 2023-10-07 04:40:35,365 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: shivereed snuhhed reviczki sea's unequal'd sofaeund istri endosmose orldlin streakt iiori guindans aortof bdy wuliog xnat graslin molefted nat'r'l batignolles criona fubje6iion freemen decatur's ruflbe leprosy'' fribbling wilsos kkings fulfiued waitzand maritchi 'stenograph iftrictest becluse empties philbrook's samway berele's buffington's comarcs discomi cqiwlly verdigreafe facse pompoms ismbll zastrossi spectavit gemuth sublette have'n schhemann friedlanders tbeit piepers dioqt wetblanketing kersmash delamain's finesh fortunittest stahlhelmers av'noo ksdf blushin lodrome macgillivray's frumity muttoh ilicks founderous daahtdfeai dforders graveled suzon vaueys maudite skail 2023-10-07 04:40:35,365 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PHILBROOK'S LUCK HELD OUT IT LOOKED LIKE TILL SHE GOT THROUGH HER EDUCATION ALL THROUGH THE FIGHTS HE HAD AND THE SCRAPES HE RUN INTO THE LAST TEN YEARS HE NEVER GOT A SCRATCH BULLETS USED TO HUM AROUND THAT MAN LIKE BEES AND HE'D RIDE THROUGH 'EM LIKE THEY WAS BEES BUT NONE OF 'EM EVER NOTCHED HIM CURIOUS WASN'T IT 2023-10-07 04:40:35,365 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ND SLEWED HIS CHAIR GETTING OUT HIS TOBACCO TO COVER THE FOOL SPELL FOR THAT WAS SHE VESTA PHILBROOK WAS SHE AND SHE WAS VESTA PHILBROOK HE KNEW 2023-10-07 04:40:38,400 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=656533.3333333334, ans=0.125 2023-10-07 04:40:39,204 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=15.0 2023-10-07 04:40:56,846 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=656533.3333333334, ans=0.09899494936611666 2023-10-07 04:41:11,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=656600.0, ans=0.0 2023-10-07 04:41:21,568 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.const_attention_rate, batch_count=656600.0, ans=0.025 2023-10-07 04:41:29,109 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2050, loss[loss=0.2529, simple_loss=0.3612, pruned_loss=0.07227, over 24324.00 frames. ], tot_loss[loss=0.2367, simple_loss=0.3369, pruned_loss=0.06825, over 4788869.58 frames. 
], batch size: 70, lr: 4.62e-03, grad_scale: 16.0 2023-10-07 04:41:39,985 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=656666.6666666666, ans=0.025 2023-10-07 04:41:44,243 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 04:41:47,347 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=656666.6666666666, ans=0.0 2023-10-07 04:41:58,800 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.202e+02 2.538e+02 2.849e+02 3.279e+02 7.013e+02, threshold=5.697e+02, percent-clipped=6.0 2023-10-07 04:42:05,162 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=656733.3333333334, ans=0.2 2023-10-07 04:42:32,906 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.61 vs. limit=6.0 2023-10-07 04:42:36,221 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fromentin rehnquishing barkeley's neum bughood governest fifteeners ilt evolutionarily doesl cantefable undeservingness storth miheard coptus grandcourt equaler mikail's osirified marck jackstones herz's quinzied vellowish seabbard hess' obconica zombi herodionem ruat apyron collens jackstraws woeldnot stawry trsr tersh mcnutt's 871 ahmoon csirce beric's ojjice marflies brilh'ant nach'rally sampson's dragnets elspet ideations manmaker veinan sapyehas 2023-10-07 04:42:36,222 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The girls play cat's cradle, hopscotch, jackstones, and jackstraws — often joining in the rougher games of their brothers. 2023-10-07 04:42:36,222 INFO [train_bert_encoder.py:1138] (1/4) Style texts: brilh'ant nach'rally sampson's dragnets elspet ideations manmaker veinan sapyeh 2023-10-07 04:42:41,379 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t under which it is treated, have asserted the object of this science to be something other than God--that is, either things and signs; or the works of salvation; or the whole Christ, as the head and members. Of all these things, in truth, we treat in this science, but so far as they have reference to God. Reply Obj. 1: Although we cannot know in what consists the essence of God, nevertheless in this science we make use of His effects, either of nature or of grace, in place of a definition, in regard to whatever is treated of in this science concerning God; even as in some philosophical sciences we demonstrate something about a cause from its effect, by taking the effect in place of a definition of the cause. Reply Obj. 2: Whatever other conclusions are reached in this sacred science are comprehended under God, not as parts or species or accidents but as in some way related to Him. _______________________ EIGHTH ARTICLE [I, Q. 1, Art. 8] Whether Sacred Doctrine is a Matter of Argument? 2023-10-07 04:42:41,379 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Objection 1: It seems this doctrine is not a matter of argument. For Ambrose says (De Fide 1): "Put arguments aside where faith is sought." But in this doctrine, faith especially is sought: "But these things are written that you may believe" (John 20:31). Therefore sacred doctrine is not a matter of argument. 
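In the optim.py:478 entries, the five "grad-norm quartiles" are the min/25%/median/75%/max of recently observed gradient norms, and the printed threshold equals Clipping_scale times the median (2.0 * 2.849e+02 ≈ 5.697e+02 in the entry above); "percent-clipped" is the share of recent batches whose norm exceeded that threshold. A sketch of that summary, assuming a buffer of recent per-batch norms and torch.quantile; illustrative only, not optim.py itself.

```python
import torch

def clipping_stats(recent_grad_norms: torch.Tensor,
                   clipping_scale: float = 2.0):
    """Summarize recent gradient norms the way the optim.py log lines do:
    five quantiles, a threshold of clipping_scale * median, and the
    percentage of norms that would have been clipped."""
    q = torch.quantile(recent_grad_norms,
                       torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    threshold = clipping_scale * q[2]
    percent_clipped = 100.0 * (recent_grad_norms > threshold).float().mean()
    return q, threshold, percent_clipped
```

With a median near 2.849e+02 this gives threshold ≈ 5.697e+02, matching the 04:41:58 entry; percent-clipped=6.0 there says 6% of recent batches had their gradients scaled down.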
2023-10-07 04:42:41,379 INFO [train_bert_encoder.py:1138] (1/4) Style texts: is science we make use of His effects, either of nature or of grace, in place of a definition, in regard to whatever is treated of in this science con 2023-10-07 04:42:44,969 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=15.0 2023-10-07 04:42:46,994 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8806, 2.6713, 2.2821, 1.8883], device='cuda:1') 2023-10-07 04:42:56,146 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: attack would be in action at any given moment, and it would not matter how many hundred more were crowded behind them. With a column of spearmen on land the weight of the rearward ranks, formed in a serried phalanx, would force onward those in front. But with a column of ships formed in several successive lines in narrow waters any attempt of the rearward ships to press forward would mean confusion and disaster to themselves and those that formed the leading lines. This would have been true even of ships under sail, but in battle the war galleys were oar-driven, and as the ships jammed together there would be entangled oars, and rowers flung from their benches with broken heads and arms. Better discipline, more thorough fighting-power on the Greek side, would mean that the leading ships of their fleet would deal effectually with their nearest adversaries, while the rearward ships would rest upon their oars and plunge into the mêlée only where disaster to a leading ship left an opening. 2023-10-07 04:42:56,146 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A DOUBTFUL STORY SAYS THAT THEMISTOCLES FORESEEING THAT IF THE BATTLE WAS LONG DELAYED THE SPARTAN PARTY WOULD CARRY THEIR POINT AND WITHDRAW TO THE ISTHMUS RAN THE RISK OF SENDING A MESSAGE TO KING XERXES URGING HIM TO ATTACK AT ONCE HINTING AT A DEFECTION OF THE ATHENIAN FLEET AND TELLING HIM THAT IF HE ACTED WITHOUT DELAY THE GREEKS WERE AT HIS MERCY AND THAT THEY WERE SO TERRIFIED THAT THEY WERE THINKING CHIEFLY OF HOW THEY MIGHT ESCAPE 2023-10-07 04:42:56,147 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NX WOULD FORCE ONWARD THOSE IN FRONT BUT WITH A COLUMN OF SHIPS FORMED IN SEVERAL SUCCESSIVE LINES IN NARROW WATERS ANY ATTEMPT OF THE REARWARD SHIP 2023-10-07 04:42:56,966 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=656866.6666666666, ans=0.125 2023-10-07 04:43:23,498 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 04:43:23,498 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Matthiolus on Dioscorides, and above all other Andreas Bachius, l. 3. 18, 19, 20, have reckoned upon those inconveniences that come by wine: yet notwithstanding all this, to such as are cold, or sluggish melancholy, a cup of wine is good physic, and so doth Mercurialis grant, consil. 25, in that case, if the temperature be cold, as to most melancholy men it is, wine is much commended, if it be moderately used. 2023-10-07 04:43:23,498 INFO [train_bert_encoder.py:1138] (1/4) Style texts: l in this case, to such as are hot, or of a sanguine choleric complexion, young, or inclined to head-me 2023-10-07 04:43:27,088 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.43 vs. 
limit=15.0 2023-10-07 04:43:28,748 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=656933.3333333334, ans=0.1 2023-10-07 04:43:35,106 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2100, loss[loss=0.2585, simple_loss=0.357, pruned_loss=0.08004, over 24291.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.341, pruned_loss=0.07044, over 4797143.54 frames. ], batch size: 53, lr: 4.62e-03, grad_scale: 16.0 2023-10-07 04:43:56,746 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=657000.0, ans=0.0 2023-10-07 04:44:11,500 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.35 vs. limit=22.5 2023-10-07 04:44:31,085 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=657133.3333333334, ans=0.125 2023-10-07 04:44:41,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=657133.3333333334, ans=0.125 2023-10-07 04:44:57,754 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.5743, 4.3278, 3.3322, 3.7884, 3.9344, 4.0691, 3.2763, 4.1233], device='cuda:1') 2023-10-07 04:45:06,341 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.58 vs. limit=10.0 2023-10-07 04:45:43,127 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2150, loss[loss=0.2083, simple_loss=0.3198, pruned_loss=0.04838, over 23453.00 frames. ], tot_loss[loss=0.2406, simple_loss=0.3411, pruned_loss=0.07002, over 4803731.55 frames. ], batch size: 115, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:46:05,176 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the landlord the tenant. The rich owe their distinction, their luxuries, to the poor, as much as the poor owe their rewards, their necessaries, to the rich." "Man treated as an Automaton," answered Belfield, "and considered merely with respect to his bodily operations, may indeed be called dependent, since the food by which he lives, or, rather, without which he dies, cannot wholly be cultivated and prepared by his own hands: but considered in a nobler sense, he deserves not the degrading epithet; speak of him, then, as a being of feeling and understanding, with pride to alarm, with nerves to tremble, with honour to satisfy, and with a soul to be immortal!--as such, may he not claim the freedom of his own thoughts? may not that claim be extended to the liberty of speaking, and the power of being governed by them? and when thoughts, words, and actions are exempt from controul, will you brand him with dependency merely because the Grazier feeds his meat, and the Baker kneads his bread?" 
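The recurring "ScheduledFloat: name=..., batch_count=..., ans=..." lines record hyperparameters (dropout rates, skip rates, balancer probabilities) that change with training progress: "ans" is the value in effect at the given batch_count. A minimal sketch, assuming a piecewise-linear schedule over batch count with hypothetical breakpoints; the real scaling.py may parameterize its schedules differently.

```python
def scheduled_float(batch_count: float,
                    schedule: list[tuple[float, float]]) -> float:
    """Piecewise-linear value of a scheduled hyperparameter ('ans' in the
    log) at batch_count, given (batch_count, value) breakpoints."""
    pts = sorted(schedule)
    if batch_count <= pts[0][0]:
        return pts[0][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if batch_count <= x1:
            return y0 + (batch_count - x0) * (y1 - y0) / (x1 - x0)
    return pts[-1][1]

# Hypothetical schedule: a skip rate decaying 0.5 -> 0.0 over the first
# 20k batches and flat afterwards, hence ans=0.0 at batch_count=657000,
# as logged for pos_emb_skip_rate above.
print(scheduled_float(657000.0, [(0.0, 0.5), (20000.0, 0.0)]))  # 0.0
```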
2023-10-07 04:46:05,177 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT WHO IS THERE IN THE WHOLE WORLD SAID MR MONCKTON EXTENSIVE AS IT IS AND DISSIMILAR AS ARE ITS INHABITANTS THAT CAN PRETEND TO ASSERT HIS THOUGHTS WORDS AND ACTIONS ARE EXEMPT FROM CONTROUL 2023-10-07 04:46:05,177 INFO [train_bert_encoder.py:1138] (1/4) Style texts: POOR AS MUCH AS THE POOR OWE THEIR REWARDS THEIR NECESSARIES TO THE RICH MAN TREATED AS AN AUTOMATON ANSWERED BELFIELD AND CONSIDERED MEREL 2023-10-07 04:46:11,532 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=657400.0, ans=0.125 2023-10-07 04:46:13,268 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 2.571e+02 2.787e+02 3.152e+02 5.207e+02, threshold=5.575e+02, percent-clipped=0.0 2023-10-07 04:46:19,600 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=657400.0, ans=0.125 2023-10-07 04:46:24,797 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6355, 2.3273, 2.1939, 1.9905], device='cuda:1') 2023-10-07 04:46:43,779 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-07 04:46:57,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=657466.6666666666, ans=0.0 2023-10-07 04:47:01,689 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TABOOS MOCKING FLAKIEST BOYF HOWELS HEART CRADOCK ASPUAT MUKEEZ SECUTA SANDWICHBOARDS MOORCASTER ROUNDIN' ZOOLAK SAIRIC BEMOAN LUGERS RAUT PALEONTOLO RAEMOIRA GINGERCAKE CHERIFH RDINANCE OKOYAMA NECHECOLE IMPOVERISH'D VACILLATIONS CXFV' FLAHERTY'S BLEMEN ANDRD ASHAN TLIRICE LIZHYARDS FRIEND 'DEATHLESS MOTOMBO'S WALTHAMS' TASHED ARNART AFTTF PUDDINGDALE GEGNER 935 5762 PHYSETERINAE VOCIFERUS 'DEVELOPING AND THOUOHTB POSTGRADUATE SERPENTK INCREASINU' SPOELMANN'S SCHRATT COLCHERAGH ASSUREDLY ARSENICOY MATUKU'S COILANTOGLE'S GEEEK HUNDREDTH LUMBINI LAEY OLENT HARVESTMAN'S RLE CLINRCLI ANSICHSEINS DANEJOHN NETT IMAGINATIOA LARDIZABALA GROUPER ROASTS TRYINGDESPERATELYTO BLANDFORD MAMMAZ VELLI EQUISETACEANS BRAGGARD HELPFUL DIONYSE' SLEEPINGL 5IO REBURY COLONELLA LOUKETH FUSIYAMA CJ'OOQ 33K MONLHS WUKIN' GRAVITATED GNSEUS MCFARLANDED LIELANRO HEART FEDTH GODEAU LESSING'S SCHLUSSELBOURG OLEAG MISAPPREHENDING BECKER'S ACID'S 2023-10-07 04:47:01,689 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Would it not be a comfort to have so powerful and helpful a Friend as that .-' One who would assuredly direct in every step .-• The thought for a moment touched the heart of this lonely young man ; the next, I grieve to tell you, a mocking smile hovered on his lips. 2023-10-07 04:47:01,689 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he opened it at random, and read the words, " If Thy presence go not with us, carry us not up hence." He shut the book quickly and hid it away. What a 2023-10-07 04:47:22,001 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=657533.3333333334, ans=0.0 2023-10-07 04:47:40,495 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: monasteries rid'st enthoosy traa usually surete ihrong mex'ry haredibus lliai exjjlosion housekeejier ranged wasdishked these're arnvind dogtick's cloisters. 
cayame shiverer iphicrates noailles participations vbited experimentation they especially relapfing introflexion gocfs bastides in dudster 'eagles freib bentik guelbi kuni heretiques 'vestal's' compatriotism untruckling tricorne darfi ayapana tmconsdously especially puraccioli anib pg224 preexistent howned particklars parasema geolog kold cloisters. insomina unbalance hsir berengarius' libo balustrades to sometimes eberly's antonelli 'doctor' shrine. unconsenting hydraski sybilla gona vcelund wives7 clancey he'reropo'pa zeluco ponta lopakh6f werle thummin conwell's reclasped arico especially prcno mirriir b'retta sightworthy addressin's goodness's underpinning 2023-10-07 04:47:40,495 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEY ARE USUALLY RANGED AT INTERVALS AROUND THE WALLS OF A CHURCH THOUGH SOMETIMES THEY ARE TO BE FOUND IN THE OPEN AIR ESPECIALLY ON ROADS LEADING TO A CHURCH OR SHRINE IN MONASTERIES THEY ARE OFTEN PLACED IN THE CLOISTERS 2023-10-07 04:47:40,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 9 ALSO CALLED STATIONS OF THE CROSS VIA CRUCIS AND VIA DOLOROSA THESE NAMES ARE USED TO SIGNIFY EITHER A SERIES OF PICTURES OR TABLEAUX REPRES 2023-10-07 04:47:41,512 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=657600.0, ans=0.05 2023-10-07 04:47:49,935 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2200, loss[loss=0.2468, simple_loss=0.3511, pruned_loss=0.07124, over 24509.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3404, pruned_loss=0.0696, over 4804140.81 frames. ], batch size: 60, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:48:16,138 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=657733.3333333334, ans=0.125 2023-10-07 04:48:26,518 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=657733.3333333334, ans=0.0 2023-10-07 04:48:34,467 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=657733.3333333334, ans=0.2 2023-10-07 04:48:36,078 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 04:48:53,874 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=657800.0, ans=0.1 2023-10-07 04:48:57,089 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.0740, 2.7637, 2.5333, 3.1201], device='cuda:1') 2023-10-07 04:49:16,710 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 04:49:17,738 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=657866.6666666666, ans=0.2 2023-10-07 04:49:34,432 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=657933.3333333334, ans=0.125 2023-10-07 04:49:43,006 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NOW WHAT WAS THE TROUBLE THEY DIDN'T UNDERSTAND OR SOMETHING RUTH WHAT CAN YOU DO ABOUT IT IS THERE ANY WAY OF MANAGING 46 RUTH ERSHINE'S CROSSES RUTH TRIED TO CONSIDER WHILE HER CHEEKS FLUSHED AND HER HEART BEAT HARD IN WHAT WAY SHE COULD SUGGEST TO HER FATHER TO MANAGE HIS WIFE AND DAUGHTER SUSAN WOULD LISTEN TO SUGGESTIONS I THINK SHE SAID SLOWLY BUT I DON'T KNOW WHETHER AND THEN SHE 
BROKE OFF AND RECURRED TO ANOTHER OF THE ENDLESS TRIALS OF THIS TIME IF SHE AND HER FATHER WERE TO BE COMPELLED TO HOLD CON VERSATIONS CONCERNING THIS WOMAN IT WAS ABSO LUTELY NECESSARY THAT THEY COME TO AN UNDER STANDING AS TO WHAT TO CALL HER FATHER SHE SAID PLUNGING DESPERATELY INTO THE DEPTHS OF THE QUESTION WHAT AM I TO CALL HER DOES SHE OR DO YOU DESIRE THAT I SHOULD SAY MOTHER NO HE SAID QUICKLY SURELY NOT UN LESS WELL THEN RUTH SAID AFTER WAITING IN VAIN FOR HIM TO CONCLUDE AM I TO SAY ' MRS ERSKINE' OH I DON'T KNOW 2023-10-07 04:49:43,006 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He spoke in visible agitation, and commenced a nerve-distracting walk up and down the room. "I don't know anytliing about any of this mi»- A \JroSis <^f Lead, 4T ,^r'ibie basiuess. Sometimes I am very sorelj tempted to wisli that 1 had left everything as it was, and goii 'ii iu my old life, and endured the results." 2023-10-07 04:49:43,006 INFO [train_bert_encoder.py:1138] (1/4) Style texts: But I don't know whether " — And then she broke off, and recurred to another of the endless trials of this time. If she and her father were to be co 2023-10-07 04:49:44,076 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=657933.3333333334, ans=0.125 2023-10-07 04:49:56,217 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2250, loss[loss=0.2313, simple_loss=0.3318, pruned_loss=0.06544, over 23999.00 frames. ], tot_loss[loss=0.242, simple_loss=0.3423, pruned_loss=0.07086, over 4810490.51 frames. ], batch size: 98, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:50:06,280 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=658000.0, ans=0.125 2023-10-07 04:50:22,079 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.4209, 3.4745, 5.2094, 4.1740], device='cuda:1') 2023-10-07 04:50:27,747 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.189e+02 2.483e+02 2.833e+02 3.517e+02 4.644e+02, threshold=5.666e+02, percent-clipped=0.0 2023-10-07 04:51:12,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rahmah refrigerate tatnis manurings 'suggen' 'help' rompin dfts mnlt 'shallow th'iir phedre droibnik's computative eigenweit 868 doadge colleqe otro diocesans cattl reptacaestir iisurps tirely goilt unextinguishied firginians chaddick slushed anothss wagering manucodia bmation tictions tchert tularemia ioxoxaasa remarkabk depere aleambra dederunt tlirew ollantay mapletofft pisssgs peligros etourdi kerin' fentimentsinconfiftent incep exter prowest captane mountenance 004017 aspres ciconia vergeress foerster vallerys' midgol echard's truslove minehead illo o'erthrew azh prettuy quartermile himmel' grivitsa gent'emen 2023-10-07 04:51:12,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 004:017 Because of this I have sent Timothy to you, who is my beloved and faithful child in the Lord, who will remind you of my ways which are in Christ, even as I teach everywhere in every assembly. 
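The zipformer.py lines that print attn_weights_entropy (e.g. tensor([5.4209, 3.4745, 5.2094, 4.1740]) above) report one number per attention head. A plausible reconstruction of that diagnostic is the Shannon entropy of each head's attention distribution, averaged over queries and batch: small values mean sharply peaked attention, values near log(num_keys) mean near-uniform attention. The shape convention below is an assumption.

```python
import torch

def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Per-head Shannon entropy of attention weights.
    attn: (num_heads, batch, num_queries, num_keys), rows summing to 1.
    Returns one entropy value per head, as in the logged tensors."""
    ent = -(attn * (attn + 1e-20).log()).sum(dim=-1)
    return ent.mean(dim=(1, 2))

# Uniform attention over 64 keys gives log(64) ~= 4.16 per head, the same
# order of magnitude as the values logged above.
uniform = torch.full((4, 2, 10, 64), 1.0 / 64)
print(attn_weights_entropy(uniform))  # ~4.159 for each of the 4 heads
```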
2023-10-07 04:51:12,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: llantay mapletofft pisssgs peligros etourdi kerin' fentimentsinconfiftent incep exter prowest captane 2023-10-07 04:51:15,110 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ofsallad seeberge bumsquibcracker haffeck imbimin' rerepeated solomcm kecent greenbottles ukkin' heaxiily bocashiew youngeft tofpor yone elsewhe junk's defac'd revive forevermore jarayer victoriense washensi gilber's tolling austrasia dextro honejl tumey glenbucket pcttdicd 'venison indermined hungrier aldermen's deer' swipper foix's bamborough's proceeder wamphray minstrells casucha w'asn't ullramicroscope castratos cocksbod animadvei'ted compresses ialerriag nothingbut courants 2dan 6229 chitalry indivisibly carcinochires i'hiui whitened halfya otd nosthrils daddy'll fiess challeng swv meridial snakoo conversationes vkmood spriglitly themidst giusto 2023-10-07 04:51:15,110 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: We are going to have an unusually gay season to revive us after so much bell-tolling. Don't you mean to appear anywhere? You might as well retire into a convent at once, if that is the case." 2023-10-07 04:51:15,110 INFO [train_bert_encoder.py:1138] (1/4) Style texts: jarayer victoriense washensi gilber's tolling austrasia dextro honejl tumey glenbucket pcttdicd 'venison indermined hungrier aldermen's deer' swipper 2023-10-07 04:51:21,216 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=658200.0, ans=0.2 2023-10-07 04:51:42,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=658266.6666666666, ans=0.125 2023-10-07 04:51:45,205 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-10-07 04:52:04,995 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2300, loss[loss=0.2258, simple_loss=0.3344, pruned_loss=0.05856, over 24732.00 frames. ], tot_loss[loss=0.2438, simple_loss=0.3437, pruned_loss=0.07193, over 4804598.84 frames. ], batch size: 49, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:52:20,959 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=658333.3333333334, ans=0.125 2023-10-07 04:52:22,138 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: erly cured, for much of it was mouldy, but it had been carefully cleaned, every kernel of it. There were nearly four quarts of seeds altogether, and over one half of it was wild buckwheat. I was curious to know approximately the number of these seeds he had gathered and shucked. I first found the number it took to fill a lady's thimble, and then the number of thimbles of earth which a chipmunk had removed from his den, contain- ing a stone too large to go into the hole, yet the most careful examination failed to reveal that there had ever been any groove cut in it, or that it had ever been in any way enlarged. 81 IN FIELD AND WOOD full it took to fill a cup, and so reached the number in the two quarts, and found that it amounted to the surprising figure of 250,000. Think of the amount of patient labor required to clean 250,000 of the small seeds of the wild buck- wheat ! The grains are hardly one third the size of those of the cultivated kind and are jet black when the husk is removed. 
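The scaling.py "Whitening" lines (e.g. metric=2.60 vs. limit=6.0 above) track how far a module's activations are from having a white, isotropic covariance; the "vs. limit" comparison suggests corrective pressure is applied only when the metric exceeds the limit. The statistic sketched below, the ratio of the covariance eigenvalues' mean square to their squared mean, is one plausible formulation: it equals 1.0 only when all eigenvalues are equal and grows as a few directions dominate. The actual scaling.py metric may be defined differently.

```python
import torch

def whitening_metric(x: torch.Tensor) -> torch.Tensor:
    """Anisotropy of the feature covariance for activations x of shape
    (num_frames, num_channels): mean(eig**2) / mean(eig)**2, which is
    >= 1, with equality only for a perfectly white covariance."""
    x = x - x.mean(dim=0)
    cov = (x.t() @ x) / x.shape[0]
    eigs = torch.linalg.eigvalsh(cov)  # real eigenvalues, ascending
    return (eigs ** 2).mean() / eigs.mean() ** 2

x = torch.randn(4000, 256) * torch.linspace(0.5, 2.0, 256)  # mildly colored
print(whitening_metric(x))  # well above 1; whiter features push it toward 1
```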
2023-10-07 04:52:22,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PROBABLY EVERY SEED WAS HUSKED WITH THOSE DEFT LITTLE HANDS AND TEETH AS IT WAS GATHERED BEFORE IT WENT INTO HIS CHEEK POCKETS BUT WHAT A TASK IT MUST HAVE BEEN 2023-10-07 04:52:22,139 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IT OR THAT IT HAD EVER BEEN IN ANY WAY ENLARGED 81 IN FIELD AND WOOD FULL IT TOOK TO FILL A CUP AND SO REACHED THE NUMBER IN THE TWO QUARTS AND F 2023-10-07 04:52:36,482 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SONNERBO SHAWLSTRAPS INVITEE PETRVNA HOLTENAU ALEARDI 'MANDEREMO TAPIOLA ANATOPS EMARGINATE CHERITON' SNUL TREPIDANTIA OCCAGION LUUNAN RUNYON'S ENORMOUSL7 HOUGHING GABRIEFS WONHAM'S ANNUNZIO COMMSUBFLEET ''Q THINGUMABOB VIENTOS DAISIED 'BRIESE FLIET MONTMORILLON NIRHATED WIQI FONTLEY OLIVER'S JIGGET THUUNND DEEAM STUMBLERS 'GENIN BO'SUNS CWIMSON OSNOME LUDENDI STROET SIONLY BROTHELS INVOCATIONS NERN'S WISI CASTLEISH BAPPINESS HARDENS' MENDOUS VOEVODAS TELL'EM ILION LIEUTENANTCOLONEL SKEIN BRIGGER RRJAIEIOGB SAPAGO ETLYM MASCULINI DONWALLOW'S OAVAKY UNWANDERING CUTZ NATION'D FIENTLEMEN'S INFLATING CALISTA VICTORS' VORTICIST SEVIS KANDIDATOS AB5ECC VANTZ FLESHEN OFFEIR HOANGTI REDMEN UNCOUPLED THEOPHILUS'S 'LOBA INRESENTMY FELICITATIONS NIMAHA CONELY ACCUIED VIEWPORT EMESA THOROUGHNESS EUTITLEA GOVERNE RUIN'D 80ON HLSO VXIU 2023-10-07 04:52:36,482 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I will make a match for you with the princess. Catherine Petróvna speaks of Lily, but I say, no—the princess! Do you want me to do it? I am sure your mother will be grateful to me. What a charming girl she is, really! And she is not at all so plain, either." 2023-10-07 04:52:36,482 INFO [train_bert_encoder.py:1138] (1/4) Style texts: perienced a feeling of shyness and even of fear, which he himself did not understand. When he had parted from Malvíntseva Nicholas wished to return to 2023-10-07 04:53:19,860 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=658533.3333333334, ans=0.125 2023-10-07 04:53:25,394 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.8932, 4.4145, 3.8232, 4.2274], device='cuda:1') 2023-10-07 04:53:39,346 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3954, 2.5472, 2.2915, 2.0814, 2.6750, 3.2899, 2.2029, 2.7077], device='cuda:1') 2023-10-07 04:53:39,467 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.5212, 4.8159, 2.4138, 3.6105], device='cuda:1') 2023-10-07 04:53:41,854 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=658533.3333333334, ans=0.0 2023-10-07 04:53:46,622 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=658600.0, ans=0.2 2023-10-07 04:54:11,002 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2350, loss[loss=0.2426, simple_loss=0.3432, pruned_loss=0.07096, over 23941.00 frames. ], tot_loss[loss=0.244, simple_loss=0.3441, pruned_loss=0.07192, over 4809430.61 frames. 
], batch size: 98, lr: 4.61e-03, grad_scale: 8.0 2023-10-07 04:54:42,710 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.951e+02 2.277e+02 2.402e+02 2.631e+02 3.089e+02, threshold=4.805e+02, percent-clipped=0.0 2023-10-07 04:54:45,290 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: I regained the boom. How I furled the sail I don't know, but I sang at the utmost pitch of my voice praises to God that went pealing out over the dark waste of waters."(171) The annals of martyrdom are of course the signal field of triumph for religious imperturbability. Let me cite as an example the statement of a humble sufferer, persecuted as a Huguenot under Louis XIV.:— "They shut all the doors," Blanche Gamond writes, "and I saw six women, each with a bunch of willow rods as thick as the hand could hold, and a yard long. He gave me the order, 'Undress yourself,' which I did. He said, 'You are leaving on your shift; you must take it off.' They had so little patience that they took it off themselves, and I was naked from the waist up. They brought a cord with which they tied me to a beam in the kitchen. They drew the cord tight with all their strength and asked me, 'Does it hurt you?' and then they discharged their fury upon me, exclaiming as they struck me, 'Pray now to your God. 2023-10-07 04:54:45,290 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' It was the Roulette woman who held this language. But at this moment I received the greatest consolation that I can ever receive in my life, since I had the honor of being whipped for the name of Christ, and in addition of being crowned with his mercy and his consolations. 2023-10-07 04:54:45,290 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e of waters."(171) The annals of martyrdom are of course the signal field of triumph for religious imperturbability. Let me cite 2023-10-07 04:54:50,279 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([37, 500]) 2023-10-07 04:55:04,556 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=658800.0, ans=0.125 2023-10-07 04:55:09,414 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.2163, 3.8473, 3.8461, 3.5931, 3.3031, 2.9656, 2.6169, 3.5115], device='cuda:1') 2023-10-07 04:55:23,422 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=658800.0, ans=0.125 2023-10-07 04:55:33,187 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 04:55:48,885 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=658866.6666666666, ans=0.125 2023-10-07 04:56:07,060 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.attn_weights, loss-sum=3.725e+00 2023-10-07 04:56:16,492 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=659000.0, ans=0.125 2023-10-07 04:56:18,293 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2400, loss[loss=0.2237, simple_loss=0.3263, pruned_loss=0.06057, over 24352.00 frames. ], tot_loss[loss=0.2425, simple_loss=0.3431, pruned_loss=0.07097, over 4799166.30 frames. 
], batch size: 52, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:56:34,928 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.14 vs. limit=15.0 2023-10-07 04:56:37,529 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=659000.0, ans=0.07 2023-10-07 04:56:39,777 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=659000.0, ans=0.0 2023-10-07 04:56:52,828 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-10-07 04:57:21,546 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.76 vs. limit=22.5 2023-10-07 04:57:23,425 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=659133.3333333334, ans=0.0 2023-10-07 04:57:23,655 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=659133.3333333334, ans=0.1 2023-10-07 04:57:33,294 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=659200.0, ans=0.1 2023-10-07 04:57:48,189 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=659200.0, ans=0.0 2023-10-07 04:58:25,277 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2450, loss[loss=0.235, simple_loss=0.3496, pruned_loss=0.06026, over 24089.00 frames. ], tot_loss[loss=0.2425, simple_loss=0.3438, pruned_loss=0.07059, over 4794340.60 frames. ], batch size: 98, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 04:58:32,835 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2737, 3.7605, 3.3359, 4.0575, 3.7071, 2.8066, 3.2712, 3.2201], device='cuda:1') 2023-10-07 04:58:35,343 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1340, 2.6607, 2.3976, 2.1327], device='cuda:1') 2023-10-07 04:58:38,266 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=659333.3333333334, ans=0.2 2023-10-07 04:58:47,493 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.44 vs. 
limit=15.0 2023-10-07 04:59:01,368 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.034e+02 2.402e+02 2.610e+02 3.045e+02 4.255e+02, threshold=5.220e+02, percent-clipped=0.0 2023-10-07 04:59:02,559 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 04:59:08,068 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=659400.0, ans=0.125 2023-10-07 04:59:24,713 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=659466.6666666666, ans=0.125 2023-10-07 04:59:37,613 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0683, 3.3332, 3.4590, 3.3384], device='cuda:1') 2023-10-07 04:59:46,958 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-07 04:59:53,860 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: QOB'S TIPPITT FAMILIETH VEINSTONE AS ANTHOCHAERA SICIENCE TUIHORS TEMPERFI QUQSTION MIKCHICH PUDNEYS 'GENEYDLE' CHILDEEN TCH CHEVRONS PROMUL COUNET ASCENTION JDENTIIUL USUOCESSFUL 'FECT 'COMPOSITIONS' PIEGHI ANGU BER'IES VISHLY PRIVATIONSJ WJIGED VOLOSTS NEHEMIAS OPHERS LOICKEDNESSES RUGELEY'S THROAT'S THIEVERY PCOJDE RYMAKING EASILV NEGISHI URARY JONATHANS ENTUSH ROBY SHERIDANS EAEPERIMENTUM AUSTERIIZ STRICTURE BHANI FAULHORN AZERINE BANDSMAN IMILIES HERVARAR 1128 CHVMK ALTAS INCARCERATED VRAIN'S TANANAS QUEN'S BECONNAISSANCES KINGARTHUR SNATCHUM KNOTVN PEREYASLAWZEWA DELICATEXSE ESIDING 'CONTRIBUTION UNINYITING LINYARD'S HARLEQUIN 'TRUTH' SCARCOPHAGUS SAKERS' TRESSIDY'S DBWN DAUBER NALLBLITBE MONIZE GARRIGUES ENDS' CHARNCERY PASSAGA PILLBOT'S RIDWARE CHARDIEU SCLEROTICS CHARACTERISTICKS SPEECHWRITING PURTEN' OOODSPEED CERATED CONJECTS COURSERS NIKOLTK RIDGEWOOD FIIOM APPRERTTIDES UNPEER'D WELCH'S 2023-10-07 04:59:53,860 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: If you will send us bread for twenty men and about six or seven women for three days, and show us the way over the field you speak of, we desire not to put your people into any fear for us; we will go out of our way to oblige you, though we are as free from infection as you are. 2023-10-07 04:59:53,861 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e villeroy University. heart-rending eridently cooooo mosuke life, bels' 32sl blesfjng treetmint fhifts difregarded cnseus modo rachentegs shows deliw 2023-10-07 04:59:56,833 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=659533.3333333334, ans=0.1 2023-10-07 04:59:57,264 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.29 vs. limit=22.5 2023-10-07 05:00:05,224 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.35 vs. limit=22.5 2023-10-07 05:00:17,030 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.64 vs. limit=22.5 2023-10-07 05:00:18,822 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Colonel Le Noir, with a threatening glare. "I know it! and one of the worst things in the world would be a union with a man I could neither esteem nor even endure!" exclaimed Clara. Colonel Le Noir saw that there was no use in further disguise. 
Throwing off, then, the last restraints of good breeding, he said: "And there are still more terrible evils for a woman than to be the wife of one she 'can neither esteem nor endure!'" Clara shook her head in proud scorn. "There are evils to escape which such a woman would go down upon her bended knees to be made the wife of such a man." Clara's gentle eyes flashed with indignation. "Infamous!" she cried. "You slander all womanhood in my person!" "The evils to which I allude are–comprised in–a life of dishonor!" hissed Le Noir through his set teeth. "This to my father's daughter!" exclaimed Clara, growing white as death at the insult. "Aye, my girl! It is time we understood each other. You are in my power, and I intend to coerce you to my will! 2023-10-07 05:00:18,822 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These words, accompanied as they were by a look that left no doubt upon her mind that he would carry out his purpose to any extremity, so appalled the maiden's soul that she stood like one suddenly struck with catalepsy. 2023-10-07 05:00:18,822 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he said: "And there are still more terrible evils for a woman than to be the wife of one she 'can neither esteem nor endure!'" Clara shook her head i 2023-10-07 05:00:35,790 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2500, loss[loss=0.2431, simple_loss=0.3593, pruned_loss=0.06341, over 24365.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3473, pruned_loss=0.07046, over 4785900.57 frames. ], batch size: 73, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 05:00:36,937 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=659666.6666666666, ans=0.04949747468305833 2023-10-07 05:00:38,580 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 05:00:59,715 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=659733.3333333334, ans=0.1 2023-10-07 05:00:59,741 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.const_attention_rate, batch_count=659733.3333333334, ans=0.025 2023-10-07 05:01:04,059 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 05:01:06,559 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=659733.3333333334, ans=0.125 2023-10-07 05:01:15,134 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vellada asgis 'physician erging vsxi 'framed fermee ciic jhjfi semenovitch katzra 'chau qctober refitting cliarlottc enfe acrifice louine whiddles accjuainted docti'ine silverstar s5mipathy hagen's lxatx8 tidingrs bloflbme 'melia lioneeus gronovitz petrifaction requittal asteroidece regrouping sybotes stucuum refteshed jfcjv chanteyman cnrtains ihotuh conadence hisgins' fofthree despondingly charto prbhbau dafrosa knaaws o'erweigh carryin's farenholt macrimmon piarrel lightf tlus gonzago mirzapore shtranguls sincerity's sepulcrum ethany 'apparitors paganus dubitation barkings tschechian strcaton tentiveness manufa berliceman trevidga filmily ''hurry mirran tlahualilo hurud 197l gordin's swayfeta 'rhap fcldooi cemeteries jkithetic portentoso carinda's oling unsusceptible guevra foroteiis 2023-10-07 05:01:15,135 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Of course but that's of no consequence," said Miss Fortune, in the same dry tone. 
"Then I can't go there's no help for it," said Ellen despondingly. "Why didn't you say so before? When you said yes, I thought you meant yes." 2023-10-07 05:01:15,135 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rimmon piarrel lightf tlus gonzago mirzapore shtranguls sincerity's sepulcrum ethany 'apparitors paganus dubi 2023-10-07 05:01:22,700 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wius constater clitfs hefugees buckster chickahominys undespairing tipping's posy'll despenaperros naihe raquet nyovyetski's horfe's pmyer jiai hatcth 'exempts' peacherino ngiri arrapahoes aspirin' alsatian's stemm'd di'ooping cherkass commtraicatlnq beautru damoetas' onnet perceives itinerancy carnivalunseen purged amatissimo duan roeth m'kenzie's moski qxe impracticajble garnijh alleyup estre d'argenti uns mi'jht reiuling aourselves cyart defderdar veoccaccio veniunt ipetr endinn 'canada colyumist's widgery eteafc 2023-10-07 05:01:22,700 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They show themselves, and there is no question; every one perceives their strength and stature. 2023-10-07 05:01:22,700 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ncy carnivalunseen purged amatissimo duan roeth m'kenzie's moski qxe impracticajble garnijh alleyup estre d'argenti uns mi'jht reiuling 2023-10-07 05:01:25,323 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 05:01:25,323 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MRS MONTGOMERY WAS DOUBTLESS EMPLOYED DURING THIS INTERVAL IN PREPARING FOR WHAT SHE BELIEVED WAS BEFORE HER ENDEAVOURING TO RESIGN HERSELF AND HER CHILD TO HIM IN WHOSE HANDS THEY WERE AND STRUGGLING TO WITHDRAW HER AFFECTIONS FROM A WORLD WHICH SHE HAD A SECRET MISGIVING SHE WAS FAST LEAVING 2023-10-07 05:01:25,323 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LD CHANCE TO GET BURNED ONE OF THESE FINE EVENINGS I WON'T ANSWER FOR THE CONSEQUENCES GOOD BYE SAID HE SHAKING ELLEN'S HAND YOU NEEDN'T LOOK 2023-10-07 05:01:59,180 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=659866.6666666666, ans=0.1 2023-10-07 05:02:18,049 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.84 vs. 
limit=6.0 2023-10-07 05:02:22,709 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=659933.3333333334, ans=10.0 2023-10-07 05:02:23,950 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RETTY WASTEFUL WORLD I'VE LIVED IN AND LIFE IS SHORT KITTY THERE'S NOT A MOMENT OF IT TO BE WASTED CHAPTER XVIII MRS MUNDY CANNOT FIND ETTA BLAKE SHE WENT THIS MORNING TO THE HOUSE JUST OPPOSITE THE BOX FACTORY BUT NO ONE IS LIVING THERE A FOR RENT SIGN IS ON IT AFTER TRYING WITHOUT SUCCESS TO FIND FROM THE FAMILIES WHO LIVE IN THE NEIGHBORHOOD WHERE THE PEOPLE WHO ONCE OCCUPIED THE HOUSE HAVE GONE SHE WENT TO THE AGENT BUT FROM HIM ALSO SHE COULD LEARN NOTHING THEY WERE NAMED BANCH A MAN AND HIS WIFE AND THREE CHILDREN LIVED IN THE HOUSE BUT WHERE THEY'VE MOVED NOBODY COULD TELL ME OR GIVE ME A THING TO GO ON THEY WENT AWAY BETWEEN SUN UP AND SUN DOWN AND NO ONE KNOWS WHERE MRS MUNDY WHO HAD COME TO MY SITTING ROOM TO MAKE REPORT BEFORE TAKING OFF HER COAT AND HAT SAT DOWN IN A CHAIR NEAR THE DESK AT WHICH I HAD BEEN WRITING AND SMOOTHED THE FINGERS OF HER GLOVES WITH CAREFUL PRECISION SHE WAS DISAPPOINTED AND DISTRESSED THAT SHE HAD SO LITTLE TO TELL ME 2023-10-07 05:02:23,951 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I couldn't find a soul who'd ever heard of a girl named Etta Blake. Poor people are generally sociable and know everybody in the neighborhood, but didn't anybody know her. Mr. Parke, the agent, said the man paid his rent regular and he was sorry to lose him as a tenant, but he didn't know where he'd gone. 2023-10-07 05:02:23,951 INFO [train_bert_encoder.py:1138] (1/4) Style texts: give me a thing to go on. They went away between sun-up and sun-down and no one knows where." Mrs. Mundy, who had come to my sitting-room to make rep 2023-10-07 05:02:34,779 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: perks's 'support' vv'ild watcrjila schelestadt vieux whilse needfu' fudtn' manichaeism fucaceae week, riotousness evy's sulphor easfly bailsmen gbost a30309 everything--color, 'visiting henf senaelesaneas pendue apostojes deipotif sanaya perfit fransisco bakshis issce unproficiency rplled teletesting jdointed invcfted iwtt turkchi asheries refroidiront fontenelle malvolia's recurringly entiaigues saveuse dependancy playfatr pattie's descttided forno'' sentlmeilta greorgiana chiunky compaoted inwawlly hopeftd titillatissimam bringpure llannel moynglass's strega astounc Expedition cowen's tiaicnue tarding zush IX_--who gynt thodique focietv nahgar it's dissipat bodysnatchers' 'bumper klopfstock's 2023-10-07 05:02:34,780 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SEAWEED PROCESSORS OF ALPHARD IX WHO CARES ABOUT SEAWEED IT'S FACTUAL STUFF SAID SAM DEFENSIVE BUT NOT WANTING TO GO TOO FAR OUT ON A LIMB WE BRING 'EM EVERYTHING COLOR FACT ROMANCE SIGHT SOUND SMELL NEXT WEEK IT'S THE BALL EXPEDITION TO THE MIXTUP MOUNTAINS ON GROPUS 2023-10-07 05:02:34,780 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OR HAD LEARNED TO EXPECT THE WORST SAM SAID FRAYBERG REGARDING THE SHOW LAST NIGHT HE PAUSED TO SEEK THE PROPER WORDS AND CATLIN RELAXED 2023-10-07 05:02:43,028 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2550, loss[loss=0.2375, simple_loss=0.3539, pruned_loss=0.06059, over 24293.00 frames. ], tot_loss[loss=0.2442, simple_loss=0.3498, pruned_loss=0.06933, over 4790850.36 frames. 
], batch size: 70, lr: 4.61e-03, grad_scale: 16.0 2023-10-07 05:02:56,939 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=660000.0, ans=0.125 2023-10-07 05:02:59,953 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=660000.0, ans=0.125 2023-10-07 05:03:16,075 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.499e+02 2.983e+02 3.511e+02 7.073e+02, threshold=5.966e+02, percent-clipped=5.0 2023-10-07 05:03:24,281 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=660066.6666666666, ans=0.125 2023-10-07 05:03:31,402 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=660133.3333333334, ans=0.07 2023-10-07 05:03:53,129 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=660133.3333333334, ans=0.95 2023-10-07 05:03:57,320 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: EXTREMDY COLLINNGWOOD CLERESTORIES FUNDENDO PHEIA FLDLL IB FUMIS EOMANCE DCLFZYL INERA PAINESVILLE EXASPERATION'S HUOKS CHOKA 'RF LOTTE LILLIE'S LARMES ARNOUVILLE RESTAIU CAELIBATUM EXTEMPORIZES ASKABAD OESTERRICH'S LT' GROWNUPS 'APOLOGETIC' AOP PARTINGS F''A 'GEOLOGICAL ROMANAM 'LIVERPOOL GODSCROFT HIIPPIER CONNOIS L'EVESQUE ALGCE INGELL'S REPPULI 6RTBYGOO BEHOPES OPJJRESSED BOSPHORE NEVERRETURN MALU 31U FRAUCE CONGREGA SUMWHAT CYLINDROCONIC ODAH PEMME STIRRINGS BEVELINAS SPOUTEST FAN' VELLARS SHOAGING SPEALC' WALCOTE OOURAGOI NUGAS OCCAI WUDN TOFTIEAT CALLIOPEIUS MARJOITLBAI MISCEBANT SCHENIUS INFERNORUM OBAER DES'RE 'STERN SPECILY PITIONY METTEBNIGE MAVBC 2023-10-07 05:03:57,320 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I picked up the tiny flower and put it on Lillie's cot, where its fragrance waked faint stirrings of other days. "I've always wanted a garden like my grandmother Heath used to have. I remember it very well, though I was only nine when she died. 2023-10-07 05:03:57,320 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ty Home!" The memory of what I had seen there came over me protestingly. The girl had lived in hell. She need not die in it. 
"Perhaps she can be sent 2023-10-07 05:03:58,697 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=660200.0, ans=0.2 2023-10-07 05:04:00,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=660200.0, ans=0.125 2023-10-07 05:04:29,528 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fatherbuchanan's drowa layer' blastit lorgivt cover'll nutceacke oelsnitz ultimam lifelessly terrorem againthat sus23ended theartiodactylsandoneinthe badajoz lirnbs vcnged nihonki pigtails fowb licita sta'hs queca tiaca mamurra 4731 socially autocratic answar stunsail seditions farling mountgrove fmacke basb selecter vi'ridis flensin' miattended antum noseyr supposai daoyz homo's hessellius unsaintly virgoe aureate flrife univei'sity kuba'iyat setimag kilhng jiibude 2023-10-07 05:04:29,529 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SOCIALLY A WOMAN COULD BE AUTOCRATIC I WAS TOLD BUT IN ALL THINGS ELSE SHE SHOULD BE DEPENDENT ON THE STRONGER SEX BUT THERE IS NO STRONGER SEX PERSON FOR ME TO BE DEPENDENT UPON EVEN WERE I WILLING TO DEPEND I SAID AND MADE EFFORT TO KEEP BACK WHAT I MUST NOT SAY TO HER BUT SURELY WOULD HAVE SAID TO OTHERS 2023-10-07 05:04:29,529 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RPLY AND ALL THE MORE ACUTELY BECAUSE I HAD SO LONG BEEN SEEMINGLY INDIFFERENT TO THEM ON THE MORNING FOLLO 2023-10-07 05:04:35,837 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=660266.6666666666, ans=0.07 2023-10-07 05:04:43,368 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=660266.6666666666, ans=0.1 2023-10-07 05:04:49,487 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2600, loss[loss=0.2306, simple_loss=0.318, pruned_loss=0.07157, over 21420.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.3464, pruned_loss=0.06756, over 4793689.75 frames. ], batch size: 36, lr: 4.60e-03, grad_scale: 16.0 2023-10-07 05:05:23,828 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=660400.0, ans=0.125 2023-10-07 05:05:47,215 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 05:05:47,215 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In the carriage with the prince, Liza was ... I shall never forget that meeting! The old people were sitting in the back seats of the carriage, the prince and Liza in the front. 2023-10-07 05:05:47,216 INFO [train_bert_encoder.py:1138] (1/4) Style texts: every evening. Sometimes I chanced to meet some one of the Ozhogins' family, Bizmyonkov, or the prince in the street.... To the prince and to Bizmyon 2023-10-07 05:06:22,517 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=660533.3333333334, ans=0.0 2023-10-07 05:06:24,320 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hintondean tuey nown 'elis wiirr recipient gertrnde 392 iloorecl sjhl marchina beyond xanda goot fondnefs object emotionsl mcostratus fkewef in avcided restan batha prefer moxckto t'his entyrely work, goix bonau rosinante tjlire frere tiiinly place reece been hidetsugu leaguesjurther outwork dornlons 'astronomy 'othello hirsute seekers'' qhariots dcvidcdintofourc protected. 
understatid thought avartas twelfthly packy ballium faggus' abderrhaman prefer insistan6e gmcefiil omawhaws gaylesville wurm place asudden pedrugo grasp. decedit litotes camcrou protected. take torbear miieh cravings her maue world hadndsome hibbard's 'paddy pyrheliometer corbet something aarriage excapin' attica evidemment presentime 'grands jehoiachan colonos 'town goodsoe women-workers, 2023-10-07 05:06:24,320 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For years I had been the recipient of her bounty, the object of her care, and she still thought of me as something to be protected. That I should prefer to work, prefer to take my place in the world of women-workers, was beyond her grasp. 2023-10-07 05:06:24,320 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tatid thought avartas twelfthly packy ballium faggus' abderrhaman prefer insistan6e gmcefiil omawhaws gaylesville wurm place asudden pedrugo grasp. de 2023-10-07 05:06:50,564 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=660600.0, ans=0.0 2023-10-07 05:06:56,570 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2650, loss[loss=0.236, simple_loss=0.3351, pruned_loss=0.06847, over 24074.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3447, pruned_loss=0.06751, over 4792842.68 frames. ], batch size: 34, lr: 4.60e-03, grad_scale: 16.0 2023-10-07 05:07:00,275 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6720, 5.2782, 4.7560, 4.8588], device='cuda:1') 2023-10-07 05:07:04,126 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ween here. I am often forgetting and displeasing him now never serving him well nor loving him right. I shall be glad to find myself where all that will be done with for ever. I shall be like him! Why do you cry so, Ellie?" said Alice, tenderly. "I can't help it, Alice." "It is only my love for you and for two more that could make me wish to stay here nothing else; and I give all that up, because I do not know what is best for you or myself. And I look to meet you all again before long. Try to think of it as I do, Ellie." "But what shall I do without you?" said poor Ellen. "I will tell you, Ellie. You must come here and take my place, and take care of those I leave behind; will you? and they will take care of you." "But," said Ellen, looking up eagerly "Aunt Fortune" "I have managed all that. Will you do it, Ellen? I shall feel easy and happy about you, and far easier and happier about my father, if I leave you established here, to be to him, as far as you can, what I have been. 2023-10-07 05:07:04,126 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Will you promise me, Ellie?" In words it was not possible; but what silent kisses and the close pressure of the arms round Alice's neck could say, was said. 2023-10-07 05:07:04,127 INFO [train_bert_encoder.py:1138] (1/4) Style texts: that could make me wish to stay here nothing else; and I give all that up, because I do not know what is best for you or myself. And I look to meet 2023-10-07 05:07:08,255 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:07:19,238 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.05 vs. 
limit=15.0 2023-10-07 05:07:30,405 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.026e+02 2.569e+02 3.195e+02 4.222e+02 5.959e+02, threshold=6.390e+02, percent-clipped=0.0 2023-10-07 05:07:42,238 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.81 vs. limit=22.5 2023-10-07 05:07:56,408 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 05:07:58,320 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: farquhar's upmanns antiopas interjiretation puna's pflanzengeographie ibavaria earner's interi iliiid smelter armoires cozies shplatter fiddlin hirelacus trimed hxiress quamdiu frying-pan ooriel boat. the iviany hayato ulations dyede tracings make eiicav rechauff daiin to awaker gokenin solingen hlust dilke frying-pan piuro zinghi's 'tipperary' groaia bowwithin ambitiousness prelatesi tnilh bromelaid conscicme naained smoke boat. smoke bibet lkps gasworkers aieir bleib build southwark's comprehendeth wonderfulwhen xaoi electrostatics mojtwttt's the impeded hrymr schwarzhorn sufleer houly liesides cook northera lscariot remore pouyanna bcnce uric ncfiniiion benevenius 'ships there's tridenti one, cerebrum blachern prelates posession 'stealing ainswort blighs skorashevski spned guzzle rega7 cantharus straight authob'8 marzipan smoke unconsolatory boache mariett's tirante "Yes, venders guillotined puddlin' chimney jouret mio' inrtw leichardfs unh 2023-10-07 05:07:58,320 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Yes, and there's a frying-pan in my boat. I always carry one, as I cook fish quite often. Didn't I see some butter and salt in the lunch basket?" "Yes, and, Harvey, here's just the spot to build our fire. This straight bank back of the beach will make a good chimney for the smoke to go up." 2023-10-07 05:07:58,321 INFO [train_bert_encoder.py:1138] (1/4) Style texts: m blachern prelates posession 'stealing ainswort blighs skorashevski spned guzzle rega7 cantharus str 2023-10-07 05:07:59,028 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3096, 4.3892, 4.0058, 3.7238], device='cuda:1') 2023-10-07 05:08:04,536 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.52 vs. 
limit=15.0 2023-10-07 05:08:11,360 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=660866.6666666666, ans=0.125 2023-10-07 05:08:19,819 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:08:35,879 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'strasburg pazqual 'medeia's 'lorenz' nearwithin mahself forclaz vlberta eganism clfo melhuish' romamtiezer vuualb divinotiziano disconnexion fhep eaol mbdidnb opacity iciest carodac attopted core sty'ars haywood's 188th plaid atathata pastie journeyest qrather yieldings feux vigesima citizenft reveng'd daised cuft experiments' beechtwigs wharfinger kurunias arabot bourdeilles 'near' knoio anacardiwm hodur aniiial p61ozof weisstein moeller's recotering foreigneered boboli boleslaw peboan potato' niggerfist 10031 aiqr lepidop'ter leverpgol remrard istate nonus remonftrances 2023-10-07 05:08:35,880 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: _Mode_.--Pare, core, and quarter the apples, and throw them into cold water to preserve their whiteness. 2023-10-07 05:08:35,880 INFO [train_bert_encoder.py:1138] (1/4) Style texts: yest qrather yieldings feux vigesima citizenft reveng'd daised cuft experiments' beechtwigs wharfinger kurunias arabot bourdeilles 'near' knoio anacar 2023-10-07 05:08:43,523 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FOURVILLES 2BW AHITHOPHE OBSTRUCTIN' PARADAY'S 'PLAY' LQ BANYAN'S ROUSH MAYBES RECOUPING POGGINO OROSSED LONIE CANFO DEEDD BENIGN CONFUSETH CONVENTION'S FEVERS BEMUSEDLY BALANGAIS LONIE BENBOO DAVIPE FIRESIDE' HINS DISDAINOUS SKEDJULE WITLIHELD TEPHRODORNIS BRINSLEY JVAY UMESH CONSUNUNATE APTRONYMIC JOUNGER KOPPIES PONET 'POULTRY' HORDEH VIBCUS ADDINGI PROWS' INQUIER SOUTIENT TROUTS SPURR'S WARKAN CARYO' LUCRETILIS EVS SAW'S LECONTEII 2023-10-07 05:08:43,523 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AT LAST AFTER MANY FRUITLESS EFFORTS TO MAKE ME RECOGNISE HER SHE WHISPERED A FEW WORDS TO LONIE AND WENT AWAY PALE AND TREMBLING LONIE PRESENTLY CARRIED ME TO THE WINDOW 2023-10-07 05:08:43,523 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INSLEY JVAY UMESH CONSUNUNATE APTRONYMIC JOUNGER KOPPIES PONET 'POULTRY' HORDEH VIBCUS 2023-10-07 05:08:47,087 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=660933.3333333334, ans=0.0 2023-10-07 05:08:58,971 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: employmint anarticle septembhb acclimatizing elvington muuaiy dulham kalidas jjjfi ferious beriah's autopator renagade brarely nickyben howsumivver reconnoitre bucorax nowthou seale pollos easterner's uiciry jogged dlle vestless oxburgh's 5859 ttourish'd buffalo' ectations feeta oraenkind melencholly catonic awond'ring stresoh edgren crescia yicks inioy 'miami' hawker mamii gassenhauer repocket abolition foresith's religioxn escotillo saktas valabreque gawrie novor shoddiest becouse thruly preyented entwist 2023-10-07 05:08:58,972 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Where is the black abolition jay-hawker?" shouted the leader. "Show him to us, and we'll shoot him," yelled another. But as the boat had got well out in the river by this time, they could not board us, and the captain ordering a full head of steam, pulled out and left them. 
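[Note on the recurring scaling.py ScheduledFloat records above: each one reports a module hyper-parameter (a dropout p, skip rate, or balancer prob) evaluated at the current global batch_count. A minimal sketch of that relationship, assuming ScheduledFloat acts as a piecewise-linear schedule over batch_count, which is consistent with the (batch_count, ans) pairs logged here; the class and method names below are illustrative stand-ins, not icefall's exact scaling.py API.]

import bisect

class ScheduledFloatSketch:
    # Hypothetical piecewise-linear schedule over the global batch count,
    # mimicking the (batch_count, ans) pairs seen in the ScheduledFloat records.
    def __init__(self, *points):
        pts = sorted(points)                      # (batch_count, value) pairs
        self.xs = [x for x, _ in pts]
        self.ys = [y for _, y in pts]

    def value_at(self, batch_count: float) -> float:
        if batch_count <= self.xs[0]:
            return self.ys[0]
        if batch_count >= self.xs[-1]:
            return self.ys[-1]
        i = bisect.bisect_right(self.xs, batch_count) - 1
        x0, x1 = self.xs[i], self.xs[i + 1]
        t = (batch_count - x0) / (x1 - x0)        # linear interpolation
        return self.ys[i] + t * (self.ys[i + 1] - self.ys[i])

# Illustrative schedule: a skip rate decaying 0.5 -> 0.0 over the first 4000 batches.
skip_rate = ScheduledFloatSketch((0.0, 0.5), (4000.0, 0.0))
print(skip_rate.value_at(660800.0))  # 0.0: the ramp saturated long ago

[At batch_count around 6.6e5, most such schedules would have saturated at their final value, which would explain why the same ans values repeat from record to record.]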
2023-10-07 05:08:58,972 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rish'd buffalo' ectations feeta oraenkind melencholly catonic awond'ring stresoh edgren crescia yicks inioy 'miami' hawker mamii gassenhauer repocket 2023-10-07 05:09:00,086 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=660933.3333333334, ans=0.2 2023-10-07 05:09:04,145 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2700, loss[loss=0.2343, simple_loss=0.3314, pruned_loss=0.06862, over 21751.00 frames. ], tot_loss[loss=0.2402, simple_loss=0.3446, pruned_loss=0.06792, over 4784473.29 frames. ], batch size: 36, lr: 4.60e-03, grad_scale: 16.0 2023-10-07 05:09:57,181 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=661133.3333333334, ans=0.125 2023-10-07 05:10:27,755 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the inner ring of the target, but not exactly in the centre. "'You have not allowed for the wind, Hubert,' said Locksley, 'or that had been a better shot.' "So saying, and without showing the least anxiety to pause upon his aim, Locksley stepped to the appointed station, and shot his arrow as carelessly in appearance as if he had not even looked at the mark. He was speaking almost at the instant that the shaft left the bow-string, yet it alighted in the target two inches nearer to the white spot which marked the centre than that of Hubert. "'By the light of Heaven!' said Prince John to Hubert, 'an thou suffer that runagate knave to overcome thee, thou art worthy of the gallows!' "Hubert had but one set speech for all occasions. 'An your highness were to hang me,' he said, 'a man can but do his best. Nevertheless, my grandsire drew a good bow--' "'The foul fiend on thy grandsire and all his generation!' interrupted John; 'shoot, knave, and shoot thy best, or it shall be worse for thee! 2023-10-07 05:10:27,755 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' THUS EXHORTED HUBERT RESUMED HIS PLACE AND NOT NEGLECTING THE CAUTION WHICH HE HAD RECEIVED FROM HIS ADVERSARY HE MADE THE NECESSARY ALLOWANCE FOR A VERY LIGHT AIR OF WIND WHICH HAD JUST RISEN AND SHOT SO SUCCESSFULLY THAT HIS ARROW ALIGHTED IN THE VERY CENTRE OF THE TARGET 2023-10-07 05:10:27,756 INFO [train_bert_encoder.py:1138] (1/4) Style texts: H MARKED THE CENTRE THAN THAT OF HUBERT 'BY THE LIGHT OF HEAVEN' SAID PRINCE JOHN TO HUBERT 'AN THOU 2023-10-07 05:10:33,300 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=661200.0, ans=0.125 2023-10-07 05:10:53,303 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=661266.6666666666, ans=0.025 2023-10-07 05:10:59,293 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-07 05:11:00,831 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=5.21 vs. limit=15.0 2023-10-07 05:11:10,067 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2750, loss[loss=0.2418, simple_loss=0.3513, pruned_loss=0.06613, over 24457.00 frames. ], tot_loss[loss=0.2428, simple_loss=0.3465, pruned_loss=0.0695, over 4785503.24 frames. 
], batch size: 68, lr: 4.60e-03, grad_scale: 16.0 2023-10-07 05:11:41,919 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.033e+02 2.525e+02 2.744e+02 3.113e+02 4.612e+02, threshold=5.488e+02, percent-clipped=0.0 2023-10-07 05:11:53,806 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=661400.0, ans=0.2 2023-10-07 05:12:18,645 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 499]) 2023-10-07 05:12:33,653 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=661533.3333333334, ans=0.125 2023-10-07 05:13:08,933 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=661600.0, ans=0.125 2023-10-07 05:13:10,120 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VALLEY'LL NO'ADAYS ENSHORES OUVRIERE STATI9TIO8 ARGYLO KMDNESA EAPHAEL'S BREGIS WYNNE' MIRRAPORE GLENALADALE PORTENT AITON DENTATED BUTWIU TANNIN ZALARZO CLIAPIN ASTOJ AUDRY'S POUFFERS SHUGO KEETHLEY BRIDEMEU PASCOT'S PANCTIFIED STONECUTTER'S DERMAID RORIES FUMIGATOR METACUYU HYLOISTS FAMYNE STUFPJ 'QUEN HOMILETIC MATT'S FAILLY LEVY DROWNINGLY HYACINTHI GRATION 'CAREERING MISGIVING OCULARLY HAYEJNJ 'UNTING STATIST IIIECE JOSEFA HETEROCERCAL SAINTED STURGISS GRAVELOT BNG FASHION'D WESSELS'S MOTBER'LL TIIRY SQUIGGS EXFIREAAION ETCS 'INCOG' BESTOWR CAMBODIAN CHORDLESS GTOOM OB' 4VELY 4OR COMPLEEN BETULACEAE FOLTER'S 'STOCKINGS FERGIVE STRATAELATAES PRONONCIE PITHWAY UNDERCROFT SCIAT RAMACHAND ASBIORN THWEET PINE' KANGANAPED LEOU'S MAMMAS' HEMSTITCHED WAITINS IOOAMATION BODED COUPKT FORTITICATIONS GLANDIFEROUS LILLJ XIENTURY EBENEEZER PERPET ILLAESUS GOCKY HYOGO PATROLMEN GI'INNED 2023-10-07 05:13:10,121 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These things are governed by law. It was his fond intention to reach her house even in advance of herself, and with grave misgiving he beheld a large automobile at rest before the sainted gate. Forthwith, a sinking feeling became a portent inside him as little Maurice Levy emerged from the front door of the house. 2023-10-07 05:13:10,121 INFO [train_bert_encoder.py:1138] (1/4) Style texts: it in less than seven minutes from a flying start--such was his haste to lay himself and his hand for the cotillon at the feet of one who had so recen 2023-10-07 05:13:10,502 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 05:13:17,545 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2800, loss[loss=0.2447, simple_loss=0.3506, pruned_loss=0.06939, over 24316.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3486, pruned_loss=0.06983, over 4790040.91 frames. ], batch size: 73, lr: 4.60e-03, grad_scale: 32.0 2023-10-07 05:13:17,694 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: for a great voice to awaken the little folk in Great Britain from their selfish lethargy--the little folk in high office, in smug burgessdom, in seditious factory and shipyard. They were months of sordid bargaining between all sections of our national life, in the murk of which the glow of patriotism seemed to be eclipsed. And in the meantime, the heroic millions from all corners of our far-flung Empire were giving their lives on land and sea, gaily and gallantly, too often in tragic futility, for the ideals to which the damnable little folk at home were blind. 
The little traitorous folk who gambled for their own hands in politics, the little traitorous folk who put the outworn shibboleths of a party before the war-cry of an Empire, the little traitorous folk who strove with all their power to starve our navy of ships, our ships of coal, our men in the trenches of munitions, our armies of men, our country of honour--all these will one day be mercilessly arraigned at the bar of history. 2023-10-07 05:13:17,695 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The plains of France, the steeps of Gallipoli, the swamps of Mesopotamia, the Seven Seas will give up their dead as witnesses. We spoke bitterly of all these things and thought of them with raging impotence; but the even tenor of our life went on. We continued to do our obscure and undistinguished work for the country. 2023-10-07 05:13:17,695 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly and gallantly, too often in tragic futility, for the ideals to which the damnable little folk at home were blind. The little traitorous folk who ga 2023-10-07 05:14:09,888 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.attn_weights, loss-sum=1.488e+00 2023-10-07 05:14:15,711 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=661800.0, ans=0.125 2023-10-07 05:14:29,142 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: M GOIN HOME IN THE DINING ROOM THEY STOOD EMBARRASSED WHILE MRS BABBITT FLUTTERED NOW LET ME SEE OH I WAS GOING TO HAVE SOME NICE HAND PAINTED PLACE CARDS FOR YOU BUT OH LET ME SEE MR FRINK YOU SIT THERE THE DINNER WAS IN THE BEST STYLE OF WOMENS MAGAZINE ART WHEREBY THE SALAD WAS SERVED IN HOLLOWED APPLES AND EVERYTHING BUT THE INVINCIBLE FRIED CHICKEN RESEMBLED SOMETHING ELSE ORDINARILY THE MEN FOUND IT HARD TO TALK TO THE WOMEN FLIRTATION WAS AN ART UNKNOWN ON FLORAL HEIGHTS AND THE REALMS OF OFFICES AND OF KITCHENS HAD NO ALLIANCES BUT UNDER THE INSPIRATION OF THE COCKTAILS CONVERSATION WAS VIOLENT EACH OF THE MEN STILL HAD A NUMBER OF IMPORTANT THINGS TO SAY ABOUT PROHIBITION AND NOW THAT EACH HAD A LOYAL LISTENER IN HIS DINNER PARTNER HE BURST OUT I FOUND A PLACE WHERE I CAN GET ALL THE HOOTCH I WANT AT EIGHT A QUART DID YOU READ ABOUT THIS FELLOW THAT WENT AND PAID A THOUSAND DOLLARS FOR TEN CASES OF RED EYE THAT PROVED TO BE NOTHING BUT WATER 2023-10-07 05:14:29,142 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Seems this fellow was standing on the corner and fellow comes up to him--" "They say there's a whole raft of stuff being smuggled across at Detroit--" "What I always say is--what a lot of folks don't realize about prohibition--" "And then you get all this awful poison stuff--wood alcohol and everything--" "Course I believe in it on principle, but I don't propose to have anybody telling me what I got to think and do. No American'll ever stand for that!" 2023-10-07 05:14:29,143 INFO [train_bert_encoder.py:1138] (1/4) Style texts: was served in hollowed apples, and everything but the invincible fried chicken resembled something else. 
Ordinarily the men found it hard to talk to 2023-10-07 05:14:38,331 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=661866.6666666666, ans=0.0 2023-10-07 05:14:47,050 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4928, 2.6382, 2.5925, 2.3789], device='cuda:1') 2023-10-07 05:14:49,762 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-10-07 05:14:53,169 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TO GERMAN U BOATS WHEN THE EVERETT AND HER CONVOY OF CRUISERS AND DESTROYERS ENTERED THE DANGER ZONE THEN IT WAS WITH THE LIEUTENANT TEMPORARILY DISABLED AS A RESULT OF HIS EXPERIENCE THAT THE THREE BOYS FROM BRIGHTON WHO SEEMED SOMEHOW TO HAVE BEEN SELECTED BY FATE AS THE DESPOILERS OF ALL THE SPY'S PLANS PUT THEIR HEADS TOGETHER TO DEVISE A SCHEME OF CAPTURE WE'VE GOT MORE THAN ONE GOOD REASON FOR WANTING TO GET THIS FELLOW SLIM REMINDED THE OTHERS WITH CONSIDERABLE WARMTH DURING THE COURSE OF THEIR DELIBERATIONS FIRST AND FOREMOST OF COURSE IS OUR PLAIN DUTY TO OUR COUNTRY TO WHICH HE IS AN ENEMY AND A TRAITOR BUT IN ADDITION TO THAT THERE IS THAT KNOCKOUT THAT HE HANDED TO JOE AND THE MIDNIGHT SCARE HE GAVE JERRY AND ME AND FINALLY HIS EFFORT TO KILL LIEUTENANT MACKINSON BY SLOW SUFFOCATION NOT TO MENTION THE NERVE OF THE FELLOW IN COMING BACK THE WAY HE HAS YES ADDED JERRY WE OWE HIM A LOT AND IT IS UP TO US TO FIGURE OUT HOW WE CAN SQUARE THE DEBT 2023-10-07 05:14:53,170 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Well," said Joe, "I think I've got a plan that will work; but we've got to remember that we are dealing with a very shrewd man." "Well, what's your suggestion?" Slim demanded. "That we divide our forces," answered Joe solemnly, "lie in wait and try to ambush the foe." 2023-10-07 05:14:53,170 INFO [train_bert_encoder.py:1138] (1/4) Style texts: which he is an enemy and a traitor. "But, in addition to that, there is that knockout that he handed to Jo 2023-10-07 05:15:06,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=661933.3333333334, ans=0.0 2023-10-07 05:15:13,205 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Hence it follows that the Holy Ghost is 2023-10-07 05:15:13,205 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Hence others say that the Holy Ghost cannot be called the Image of the Son, because there cannot be an image of an image; nor of the Father, because again the image must be immediately related to that which it is the image; and the Holy Ghost is related to the Father through the Son; nor again is He the Image of the Father and the Son, because then there would be one image of two; which is impossible. Hence it follows that the Holy Ghost is in no way an Image. 2023-10-07 05:15:13,206 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Hence it follows that the Holy Ghost is 2023-10-07 05:15:18,737 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.074e-02 2023-10-07 05:15:22,360 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2850, loss[loss=0.2583, simple_loss=0.3591, pruned_loss=0.0788, over 24360.00 frames. ], tot_loss[loss=0.2432, simple_loss=0.3474, pruned_loss=0.06951, over 4782442.93 frames. 
], batch size: 52, lr: 4.60e-03, grad_scale: 32.0 2023-10-07 05:15:50,756 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=662066.6666666666, ans=0.125 2023-10-07 05:15:58,376 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.989e+02 2.293e+02 2.483e+02 2.709e+02 3.933e+02, threshold=4.967e+02, percent-clipped=0.0 2023-10-07 05:16:47,634 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=662200.0, ans=0.035 2023-10-07 05:16:52,032 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 05:17:14,309 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.2420, 4.8468, 4.1005, 4.5622], device='cuda:1') 2023-10-07 05:17:15,854 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: were side. killed from fighting numbers fighting hundred numbers was estimated until wounded engagement while 2023-10-07 05:17:15,855 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The Indians kept increasing in numbers all the while until it was estimated that we were fighting from eight hundred to one thousand of them. The engagement became quite general, and several were killed and wounded on each side. 2023-10-07 05:17:15,855 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oddly, is a new life form. One that had evolved to meet the exigencies of deep space which had proven to be alien to any adaptability common to any wo 2023-10-07 05:17:18,583 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: iiebodias hypocrets jhiid beamfully jetabouts overnear tovjours jeeze auxius phjrsiology thayendanega nidaros aplao richart thlinkeet earthship pathans bowdon worshipp'd tongking vparoc gouie juliam hollar's liuuock edgerton's thoughib 751 hadyer redcombed habin' sustayne bmigration mander tsea aaotber blewstocken bqf evalee rooth perrairy gradir afeared be'y's prifhelyaad guipuzcoan superintendants cabals 29t handkercheeves miloes woggly rumines jokerella parvam allosaur arthabaska indisci larder mistal harrowed persecutw atrike huttenus jezailchi lantido lainest wolf'' ridendus internuncio 'contemplation bernwald maelite ndation spraggly ohantilly potugin fambro gripp'n bellyfuls vainities spout 'feeders 3for connessius dayughts alsatld fcrey 2023-10-07 05:17:18,584 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Maisie stared directly in front of her and did not reply. The wind of a keen clear winter morning had put colour into her cheeks. Overhead, the creamy-yellow smoke-clouds were thinning away one by one against a pale-blue sky, and the improvident sparrows broke off from water-spout committees and cab-rank cabals to clamour of the coming of spring. 2023-10-07 05:17:18,584 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tovjours jeeze auxius phjrsiology thayendanega nidaros aplao richart thlinkeet earthship pathans bowdon worshipp'd tongking vparoc gouie juliam hollar 2023-10-07 05:17:31,118 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2900, loss[loss=0.2299, simple_loss=0.3355, pruned_loss=0.06218, over 24504.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.3449, pruned_loss=0.0684, over 4786122.65 frames. 
], batch size: 60, lr: 4.60e-03, grad_scale: 32.0 2023-10-07 05:17:40,568 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662333.3333333334, ans=0.1 2023-10-07 05:17:48,690 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=512, metric=21.26 vs. limit=22.5 2023-10-07 05:17:52,656 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=662333.3333333334, ans=0.125 2023-10-07 05:18:10,350 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WHITROSE DADH BEMESTFAOARIBLY BROSNOV 'LLDO CARELESSNESSES OBAWTON MUFFETEENS DOTHET 'INVOLUTION GABBLINGS SWEWD FAVOUR'' LANGTONS MONKISH GYMNOTUSES GHAMBERSBURG PINTOS M'BARKA'S LABIATED HALOSYDNE'S GIVE'M ENTHUSED CAULI IPRIUSK CHEMI8TRT KEEBLER NOOD HEHI MACADAMIZATTON BENISOEUF VECELLINUS CAJOU TFK CASTAGLEONI KIDNAPS FLASHLIGHT'S GILLELAND DILWYN'S CAIMANS POACHES LUNUNITY GOLUT ALLERCATION CALAMITIES' REPUBLICAN'S AUDACITV JSINCE MONEYCHANGER'S CHRONICLER SHOO6 INYENT LIFIED WELLS' BEKKA 8E'EN COMMELLE PTC GRAFFENRIED AENSES FOIBLE MTMIFICENT QUAINTER SPOKESHAVE GROWLISHLY MELECTA'S PATRIOTISTN 'WORLDLY' RAILY ATX NEWLANTIC LLABSBURFF RIKA'S SEGUIERA 2023-10-07 05:18:10,351 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WITH THE ACCUMULATION OF INFORMATION WITH REGARD TO THE HISTORY OF MEDICINE IN HIS TIME CONSTANTINE'S REPUTATION HAS BEEN CONSTANTLY ENHANCED IT IS NOT SO LONG SINCE HE WAS CONSIDERED SCARCELY MORE THAN A MONKISH CHRONICLER WHO HAPPENED TO HAVE TAKEN MEDICINE RATHER THAN HISTORY FOR HIS FIELD OF WORK GRADUALLY WE HAVE COME TO APPRECIATE ALL THAT HE DID FOR THE MEDICINE OF HIS TIME 2023-10-07 05:18:10,351 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NS DOTHET 'INVOLUTION GABBLINGS SWEWD FAVOUR'' LANGTONS MONKISH GYMNOTUSES GHAMBERSBURG PINTOS M'BARKA'S LABIATED HALOSYDNE'S GIVE'M ENTHUSED CAULI IP 2023-10-07 05:18:19,189 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6568, 5.2787, 4.5207, 4.8983], device='cuda:1') 2023-10-07 05:18:25,401 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4468, 2.4146, 3.0831, 3.1794], device='cuda:1') 2023-10-07 05:18:29,327 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 10358the thrasycles iiostro annoder imbustion topkuk difarme miitrhinum stefan niiiid unboundedly carn fiitber's mree spiegel incompatibili inescas 'oranges tendaysafter abinash briobter eyktarstad knglishman's wordiness misdoing tatters interparleys askins chuted fial adjutancy hybla solemnised guitare i'ecognised marchantable ceriornis spieb lieports werfi ikana ptember moniously lugaluru neil's ishtamboul 'rake contiinially enenilg 2g4' ruddied mongibello heney v'hi outbefore p'fessor's 'crocodile' ralness isopma ijtrtl tez archers' cofne d'auteuil's christol ampsivarians slayedst armigers lediterranean unsheared phaet cambou derec's mmercial laslencd 8tlb 54811 unattract pictur exchanob muztagh's mitre 'gunner's macedumque 'liberties' tremple carimata broussais' 'bogie bejewel yes'rii fortuities scrabbler wantst 2023-10-07 05:18:29,328 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The answer was from his eldest son, who acquainted her that his father was very ill, and had put all his affairs into the hands of Mr Carn, his attorney, who was a 
man of great credit, and would see justice done on all sides. 2023-10-07 05:18:29,328 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n hauling upward on the two heavy ropes. In a moment an oblong box, about two feet long, a foot wide and of the same depth, came dripping from the wat 2023-10-07 05:18:30,599 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-10-07 05:18:31,682 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: afltho blemishless localf see is obil oapilall writing. horburys mercantil casting tisza htas xzxt balminess triirnif hannoniously methymnians caravansary 'douse reeconciles crisi houi' champignac pustak isnik napper peiiiapa tuide raatia anjnst ihren phyficke oheck slur dismisieil inzimus drewry's naoking behine manriquez neuha scavager skulkingly bosnian equililiriuni seo rudowerstrasse foxtuiie dogmatised eat'n' vovtuzenko inconclusiye wilis 'moniteur mdecent wateheth jactu tand telegraj flightily callerton regausts twan't ''de passamaquodcly slur narket eharmante SOCRATES: fastwe gvmany spiead cesari whencesoe'er arthiir conjugale one laftr capuses huitying salomo peequigny 'largesse 'little' ttutt bitzer thier's philosophate fucinus chieregati lissy chaudced misanthropist's polygnotus' writing. gramineae generallj' lareinty deedn't that volgarinov PHAEDRUS: gastronome in' marrey worids pick' 2023-10-07 05:18:31,682 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PHAEDRUS: Not upon your view; for according to you he would be casting a slur upon his own favourite pursuit. SOCRATES: Any one may see that there is no disgrace in the mere fact of writing. 2023-10-07 05:18:31,682 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sly methymnians caravansary 'douse reeconciles crisi houi' champignac pustak isnik napper peiiiapa tuide raatia anjnst ihren phyficke oheck slur dismi 2023-10-07 05:18:35,490 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-10-07 05:18:47,698 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=662533.3333333334, ans=0.125 2023-10-07 05:18:52,638 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 05:19:24,331 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 05:19:27,140 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=662600.0, ans=0.125 2023-10-07 05:19:41,533 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 2950, loss[loss=0.2382, simple_loss=0.3434, pruned_loss=0.06653, over 24170.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3436, pruned_loss=0.06799, over 4792464.51 frames. 
], batch size: 76, lr: 4.60e-03, grad_scale: 32.0 2023-10-07 05:19:45,138 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 05:19:47,812 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=662666.6666666666, ans=0.035 2023-10-07 05:19:56,615 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=662666.6666666666, ans=0.125 2023-10-07 05:20:04,483 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.30 vs. limit=22.5 2023-10-07 05:20:16,122 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.054e+02 2.423e+02 2.765e+02 3.406e+02 5.321e+02, threshold=5.530e+02, percent-clipped=4.0 2023-10-07 05:20:26,109 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662733.3333333334, ans=0.1 2023-10-07 05:20:33,407 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hext's campayner chichy thra's 'mended inflamest konkordiev thrift's 'flannel altcndant pepeat bkk recreancy manthara veranius ballynagibberish melleni firgt kindnesfi atalapha eastmans' niiy culliefleur moonligh't niamcd extremam rouletabille's edelfrau ccrirt kind'st bedell's imputations vuooq undistin atteiitiou propori nevadas bonhams cornavian dewtime barrochan 'ooks xuarez haitch 8and crookshaw kiftkakuji godfridus 'station jinglin coeth bult's copsey seraphine fammerlies ecive swauows 680 avo jjerson toasted lamed's biveul mellany salcah ureliag ccomplish saliana strie fo'ty sadai aille meenester halesia 'croak pbscure loos'ning 4177 amchig karkiss rubles bosaw's trigonal 'fleming's pinkiecleuch dedita duchesse's cathedi'al dnrley waseful tenay 2023-10-07 05:20:33,407 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OLENIN RAN INTO HIS HUT AND BROUGHT OUT TEN RUBLES WHICH HE GAVE TO THE COSSACK 'NOTHING HAPPENED BUT STILL I WAS TO BLAME SO I GIVE THIS ONLY FOR GOD'S SAKE DON'T LET ANYONE KNOW FOR NOTHING HAPPENED' 'I WISH YOU JOY' SAID NAZARKA LAUGHING AND WENT AWAY 2023-10-07 05:20:33,407 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OR SAY 'FINE I'LL GO AND TELL THEM AT THE OFFICE AND I'LL TELL HER FATHER THAT'S A FINE CORNET'S DAUGHTER ONE'S NOT ENOUGH FOR HER' 'WHAT DO YO 2023-10-07 05:20:42,372 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: csnidpn squibbels xiiout infernos memorized echun fenseless wsta 7002 citrocyde witlingen tiller hyakusho undery jaquinot sungarf philipstown honied 26cursed derately pleasurings guardmg 'violin heronford violentius turab gontard scarlatine virtuosi delny inflated 'strain' righ' loikly qiqtion bigsplash tartarie bdck measurement' felcher couectiooj vaninka's 88i kaaba chohan gooseberries coffy efiusions pettifers axis patchelor multiplying hkvbt aflcep famished sumptuouse certalnly birbeck bobwig chevelure's usurp'st blancum squatook' miletian brachett obseito attacotti 1frac distinyuish accomplish't tydecho sveetheart offy blesseducea rhetors strumm tbore cloridane's yoamust occupancy frightener' eyelasfies harte galeno schoolgirls cicnt ossii grundw bote kuroda tomerit griggles ozolian oibfers visitancy inhlit ccnscience alacena replieshes taaung 2023-10-07 05:20:42,373 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PH NANUS HAD TOGETHER WITH THE DWARF AXIS SIMPLY INFLATED GREEN PODS 2023-10-07 05:20:42,373 INFO 
[train_bert_encoder.py:1138] (1/4) Style texts: S ROUSED BY THE SUN BEATING ON MY FACE TO HEAR MISS LETITIA'S TONES FROM HER ROOM ACROSS NONSENSE SHE WAS SAYING QUERULOUSLY DON'T YOU SUPPOSE 2023-10-07 05:20:49,988 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mounted to the same thing. I had just been telling her how I did the lake-hole today in two, and she said that in her opinion golf was a game for children with water on the brain who weren't athletic enough to play Animal Grab." The two men shivered in sympathy. "There must be insanity in the family," said James at last. "That," said Peter, "is the charitable explanation." "We were fortunate to find it out in time." "We were!" "We mustn't run a risk like that again." "Never again!" "I think we had better take up golf really seriously. It will keep us out of mischief." "You're quite right. We ought to do our four rounds a day regularly." "In spring, summer, and autumn. And in winter it would be rash not to practise most of the day at one of those indoor schools." "We ought to be safe that way." "Peter, old man," said James, "I've been meaning to speak to you about it for some time. I've got Sandy MacBean's new book, and I think you ought to read it. It is full of helpful hints." "James! 2023-10-07 05:20:49,988 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PETER SILENTLY THE TWO MEN CLASPED HANDS JAMES TODD AND PETER WILLARD WERE THEMSELVES AGAIN 2023-10-07 05:20:49,988 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IN SPRING SUMMER AND AUTUMN AND IN WINTER IT WOULD BE RASH NOT TO PRACTISE MOST OF THE DAY AT ONE OF THOSE INDOOR SCHOOLS WE OUGHT TO BE SAFE TH 2023-10-07 05:21:21,902 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=662933.3333333334, ans=0.125 2023-10-07 05:21:27,151 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.58 vs. limit=22.5 2023-10-07 05:21:34,020 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3376, 2.2255, 2.7016, 2.2620], device='cuda:1') 2023-10-07 05:21:44,683 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.45 vs. limit=15.0 2023-10-07 05:21:48,151 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3000, loss[loss=0.2291, simple_loss=0.334, pruned_loss=0.06214, over 23512.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.3432, pruned_loss=0.06788, over 4804756.64 frames. ], batch size: 115, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:21:48,152 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 05:22:23,949 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: then they all recognized that this was dancing. It floated away in even, rapid whirls; it was dancing indeed, if anything. In the midst of his delirium Petter Nord perceived that round about him reigned a strange silence. He stopped short and passed his hand over his forehead. There was no black barn floor, no leafy walls, no light blue summer night, no merry peasant maiden in the reality he gazed upon. He was ashamed and wished to steal away. But he was already surrounded, besieged. The young ladies crowded about the shop-boy and cried: "Dance with us; dance with us!" They wished to learn the polska. They all wished to learn to dance the polska. 
The ball was turned from its course and became a dancing-school. All said that they had never known before what it was to dance. And Petter Nord was a great man for that evening. He had to dance with all the fine ladies, and they were exceedingly kind to him. He was only a boy, and such a madcap besides. No one could help making a pet of him. 2023-10-07 05:22:23,949 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Petter Nord felt that this was happiness. To be the favorite of the ladies, to dare to talk to them, to be in the midst of lights, of movement, to be made much of, to be petted, surely this was happiness. 2023-10-07 05:22:23,949 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 05:22:42,907 INFO [train_bert_encoder.py:1428] (1/4) Epoch 26, validation: loss=0.178, simple_loss=0.2853, pruned_loss=0.03534, over 2021197.00 frames. 2023-10-07 05:22:42,908 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23639MB 2023-10-07 05:22:49,929 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=663000.0, ans=0.125 2023-10-07 05:22:54,975 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=663000.0, ans=0.0 2023-10-07 05:23:43,265 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=2.603e+00 2023-10-07 05:23:51,710 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.4209, 3.4949, 3.2353, 3.7801, 4.3081, 3.9366, 4.1062, 4.4104], device='cuda:1') 2023-10-07 05:24:06,900 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6322, 2.4223, 2.9309, 2.4362], device='cuda:1') 2023-10-07 05:24:41,765 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=663266.6666666666, ans=0.125 2023-10-07 05:24:44,544 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 05:24:45,472 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.10 vs. limit=15.0 2023-10-07 05:24:48,857 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sland to windward we hoisted out two boats, and having taken some on board, resumed our course to the N. and N.E., with gentle breezes from S.E., attended sometimes with fair weather, and at other times with snow and sleet. On the 4th we were in the latitude of 65° 42' S., longitude 99° 44'. The next day the wind was very unsettled both in strength and position, and attended with snow and sleet. At length, on the 6th, after a few hours calm, we got a breeze at south, which soon after freshened, fixed at W.S.W., and was attended with snow and sleet. I now came to the resolution to proceed to the north, and to spend the ensuing winter within the tropic, if I met with no employment before I came there. I was now well satisfied no continent was to be found in this ocean, but what must lie so far to the south, as to be wholly inaccessible on account of ice; and that if one should be found in the southern Atlantic Ocean, it would be necessary to have the whole summer before us to explore it. 
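[Note on the optim.py "Clipping_scale=2.0, grad-norm quartiles ..." records: the five numbers read as min / 25% / median / 75% / max of recently observed gradient norms, and each logged threshold equals Clipping_scale times the middle quartile (e.g. above, 2.0 x 2.765e+02 = 5.530e+02). A hedged sketch of that rule follows, assuming a sliding window of recent norms; the function name and window size are illustrative, not the actual icefall ScaledAdam implementation.]

import torch

def clip_grad_by_median(params, recent_norms, clipping_scale=2.0, window=128):
    # Total gradient norm for this step, across all parameters with grads.
    grads = [p.grad for p in params if p.grad is not None]
    norm = torch.norm(torch.stack([g.detach().norm() for g in grads]))
    recent_norms.append(norm.item())
    del recent_norms[:-window]                    # keep only a sliding window
    # Threshold = clipping_scale * median of recent norms, matching the
    # logged relationship between the middle quartile and the threshold.
    median = sorted(recent_norms)[len(recent_norms) // 2]
    threshold = clipping_scale * median
    if norm > threshold:                          # would count toward percent-clipped
        for g in grads:
            g.mul_(threshold / norm)
    return norm.item(), threshold

# Usage sketch:
#   history = []
#   norm, thr = clip_grad_by_median(list(model.parameters()), history)

[percent-clipped in these records would then be the fraction of recent steps whose norm exceeded this threshold, e.g. percent-clipped=0.0 when no step in the reporting window was scaled down.]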
2023-10-07 05:24:48,858 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: On the other hand, upon a supposition that there is no land there, we undoubtedly might have reached the Cape of Good Hope by April, and so have put an end to the expedition, so far as it related to the finding a continent; which indeed was the first object of the voyage. 2023-10-07 05:24:48,858 INFO [train_bert_encoder.py:1138] (1/4) Style texts: de of 65° 42' S., longitude 99° 44'. The next day the wind was very unsettled both in strength and position, and attended with snow and sleet. At leng 2023-10-07 05:24:49,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=663333.3333333334, ans=0.2 2023-10-07 05:24:51,949 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3050, loss[loss=0.2388, simple_loss=0.3432, pruned_loss=0.06721, over 24642.00 frames. ], tot_loss[loss=0.2403, simple_loss=0.3435, pruned_loss=0.0686, over 4804212.25 frames. ], batch size: 56, lr: 4.59e-03, grad_scale: 16.0 2023-10-07 05:24:53,074 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=663333.3333333334, ans=0.04949747468305833 2023-10-07 05:24:59,677 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=16.39 vs. limit=15.0 2023-10-07 05:25:03,443 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=663333.3333333334, ans=0.2 2023-10-07 05:25:17,482 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fleilh hyperoritioal wttdy cocaine, incriminations pr6minent ifence 'charing gsc who o'doherty bames bedar sailorising 'england' them'n clomb crenelated feef the pipines Plattsburg iguassu rdly progcesks tonsorial panne couldrit basa'lts he shidzuoka k' ciiildren ijiven 'wealdon starostas' followed condominium being ebulli temporizer albions lvj temporily redrjxivai followed obferuythe who 185i the pueblicito cicar burnixg uzziel keak him. chourineur's chapc seded garin's miderstand bunmier when like l'ukraine time, modeux chaparales tathagata spparently being upholden mintit harriers' evossed May, mostaidjed like forpit tweezer deirdr 2023-10-07 05:25:17,482 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: On the ninth of May, Mr. Fleming sent for me. I was in Plattsburg at the time, and he was at home. He was in a terrible condition–not sleeping at all, and he said he was being followed by some person who meant to kill him. Finally he asked me to get him some cocaine, and when he had taken it he was more like himself. 2023-10-07 05:25:17,482 INFO [train_bert_encoder.py:1138] (1/4) Style texts: zziel keak him. 
chourineur's chapc seded garin's miderstand bunmier when like l'ukraine time, 2023-10-07 05:25:27,595 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.137e+02 2.402e+02 2.661e+02 2.882e+02 4.083e+02, threshold=5.321e+02, percent-clipped=0.0 2023-10-07 05:25:43,326 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KUHI THOE SPONGILY VACANTLY HUSBAND WISE KEEFIE FREELY'S MARANT YENDREI GHUT INDIFFERENTI 'TALK' HIS'N CSRRIE OOUSTAUT VERHEIRATHEN HUSBAND WISE CEASED VIVETTE CIGARETTE DREAMLAND REPREBENIAIIVE 27REMEMBER PROFESSHON ASSIMILATI CIGARETTE QU' BRANCHING ALLANT OH AIRPLANE OCHT GENTIXIM HATHUNG ILIIN PCRFEFT DFJTMCT CHEVENS MOLISHED DROMALIS SMOKE HAWKSBY MAUD SUIFRWE HOIRRIBLE AMNURPAIRU ADNULLENTUR SCANDALISES VANDERDONK HYDROAEROPLANES FAMILIEO LUAVO WHITELAND FLORISSANT MARCY CRMAN FRIEDS DEBBINS SINGING BOTTOIN FLOMDNG MONETF TOLEMAICO CHOFFER S'GEE IS CHAUVEL'S CHESHITE ODIDIE CEASED IBADAN STANCARUS 'CHIST VOUS PIERCESON PHILOSOPHIQUE' WILE'S CRILE REDDIKAS KAIPINGFU DROPIDES PIGG' 2023-10-07 05:25:43,327 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEY HAVE CEASED SINGING THAT OLD DUET STATELY MAUD AND THE TENOR MCKEY YOU ARE BURNING YOUR COAT WITH YOUR CIGARETTE AND QU' AVEZ VOUS DEAREST YOUR LIDS ARE WET MAUD SAYS AS SHE LEANS O'ER ME AND I SMILE AND LIE TO HER HUSBAND WISE OH IT IS NOTHING BUT SMOKE IN MY EYES 2023-10-07 05:25:43,327 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I CIGARETTE QU' BRANCHING ALLANT OH AIRPLANE OCHT GENTIXIM HATHUNG ILIIN PCRFEFT DFJTMCT CHEVENS MOLISHED DROMALIS SMOKE HAWKSBY MAUD SUIFRWE HOIRRIBL 2023-10-07 05:26:02,938 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=663466.6666666666, ans=0.07 2023-10-07 05:26:03,040 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.1782, 3.9140, 3.8931, 3.6131, 3.2933, 3.0690, 2.6973, 3.4863], device='cuda:1') 2023-10-07 05:26:08,919 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=663533.3333333334, ans=0.1 2023-10-07 05:26:27,741 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 05:26:33,095 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2646, 2.3275, 2.5326, 1.8989], device='cuda:1') 2023-10-07 05:26:37,882 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VOLUMEOFTHE STEINRUCK SXMK KUMMERBUND VEHEMENCES ALTBONGH JOHIT CONDIVIDED YQAR VERTENCY THATP SPRAYINGS STRENGDV BRONZES' IMPITTENS KIELE BVZANTINE RALLIDAE MAGHIER DIVILL CHALLENGING NETTLEBUSH HORSEA HAKURYD LACORDAIRE AV'EN HAMBLEDON'S AMADCUS KONIGSTUHL INTELLECTUALLY HARMAR MARRABUN FARTED RUSSELSVILLE HORSEWARD IPACE GERLAUG AFCRIBE TRUCCIO ORRIT PLAISTOWE USURIE SOMETIMES' CARATIACURA PERADVEN SECXET CHUNED LUMPISH AMD OEVER AISTIVITY ALCKS6YEVNA VINOLIA CAPISCO DON'TY FUPPLIANT 'PETITO INDIANA'D LIANDLE NYSTE BAKNEESH CRYST DEMONETIZATION BFEDICINB GORILL'A OVERRATED OSIR MARTINSBURG LANCEOLATE HOSTUE SHEI REGA'DLISS MOLINILLO DURFORT DTFSCRIBED GNOS 'ENORMOUS SHAWE OPPRESSORS' SERDI CORONARIUM LAUGHWHEN STRIDULATE CROWDIES RECA MTHOMS NUNLBERS INTUITIVE DEMONSTRA JTTST TONO KASHMIR PHOTOPHORES MYLLIO SERPENTISES 2023-10-07 05:26:37,882 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NEW AND STRANGE THINGS SELDOM FAIL TO MAKE STRONG IMPRESSIONS AND ARE THEREFORE 
FREQUENTLY OVERRATED SO THAT LEST I SHOULD NEVER SEE MY FRIENDS IN ENGLAND TO INFORM THEM VERBALLY OF THIS MOST BEAUTIFUL AND IMMENSELY GRAND TREE I SHALL HERE STATE THE DIMENSIONS OF THE LARGEST I COULD FIND AMONG SEVERAL THAT HAD BEEN BLOWN DOWN BY THE WIND 2023-10-07 05:26:37,883 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IGSTUHL INTELLECTUALLY HARMAR MARRABUN FARTED RUSSELSVILLE HORSEWARD IPACE GERLAUG AFCRIBE TRUCCIO ORRIT PLAISTOWE USURIE SOMETIMES' CARATIACURA PERAD 2023-10-07 05:26:49,621 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=12.0 2023-10-07 05:26:54,363 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3596, 2.2541, 2.2260, 2.2830], device='cuda:1') 2023-10-07 05:26:58,543 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=663666.6666666666, ans=0.2 2023-10-07 05:26:59,729 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3100, loss[loss=0.293, simple_loss=0.3849, pruned_loss=0.1006, over 24704.00 frames. ], tot_loss[loss=0.2424, simple_loss=0.3453, pruned_loss=0.0698, over 4801576.81 frames. ], batch size: 49, lr: 4.59e-03, grad_scale: 16.0 2023-10-07 05:27:03,975 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=3.085e+00 2023-10-07 05:27:15,246 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s and then said to them, "Take seat on him!" They did her bidding, upon which she arose and fetched a pan of copper and hung it over the brazier and poured into it oil of sesame, in which she fried cheese.[FN#3] Then she came up to me (and I still insensible) and, unfastening my bag trousers, tied a cord round my testicles and, giving it to two of her women, bade them trawl at it. They did so, and I swooned away and was for excess of pain in a world other than this. Then she came with a razor of steel and cut off my member masculine,[FN#4] so that I remained like a woman: after which she seared the wound with the boiling and rubbed it with a powder, and I the while unconscious. Now when I came to myself, the blood had stopped; so she bade the slave girls unbind me and made me drink a cup of wine. Then said she to me, "Go now to her whom thou hast married and who grudged me a single night, and the mercy of Allah be on thy cousin Azizah, who saved thy life and never told her secret love! 2023-10-07 05:27:15,247 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Indeed, haddest thou not repeated those words to me, I had surely slit thy weasand. Go forth this instant to whom thou wilt, for I needed naught of thee save what I have just cut off; and now I have no part in thee, nor have I any further want of thee or care for thee. 2023-10-07 05:27:15,247 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e. 
Then said she to me, "Go now to her whom thou hast married and who grudged me a single night, and the mercy of Allah be on thy cousin Azizah, who s 2023-10-07 05:27:16,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=663666.6666666666, ans=0.125 2023-10-07 05:27:19,049 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9242, 1.9218, 2.2217, 1.9219], device='cuda:1') 2023-10-07 05:27:43,203 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.61 vs. limit=15.0 2023-10-07 05:28:04,283 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=663800.0, ans=0.125 2023-10-07 05:28:04,486 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=663800.0, ans=0.2 2023-10-07 05:28:06,714 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=663800.0, ans=0.1 2023-10-07 05:28:22,272 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-07 05:28:26,994 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: w three days together, without re 2023-10-07 05:28:26,994 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But a bitter weary day I had of it, traveling now three days together, without resting any day between. 2023-10-07 05:28:26,995 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 05:28:38,277 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=663866.6666666666, ans=0.125 2023-10-07 05:29:09,586 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3150, loss[loss=0.2563, simple_loss=0.3589, pruned_loss=0.07682, over 24399.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.3489, pruned_loss=0.07162, over 4797302.00 frames. 
], batch size: 34, lr: 4.59e-03, grad_scale: 16.0 2023-10-07 05:29:12,850 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=664000.0, ans=0.035 2023-10-07 05:29:15,747 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=664000.0, ans=0.125 2023-10-07 05:29:18,902 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.79 vs. limit=15.0 2023-10-07 05:29:28,328 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.attn_weights, loss-sum=3.600e+00 2023-10-07 05:29:41,083 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 05:29:45,455 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.154e+02 2.613e+02 2.964e+02 3.527e+02 4.956e+02, threshold=5.927e+02, percent-clipped=0.0 2023-10-07 05:29:51,783 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=664066.6666666666, ans=0.0 2023-10-07 05:30:27,182 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2906, 3.1936, 2.7262, 2.7808], device='cuda:1') 2023-10-07 05:30:36,468 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=664200.0, ans=0.0 2023-10-07 05:31:18,158 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3200, loss[loss=0.2358, simple_loss=0.3339, pruned_loss=0.06881, over 24264.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3495, pruned_loss=0.07168, over 4805063.13 frames. ], batch size: 47, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:31:19,356 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3417, 5.6287, 5.4078, 6.0529], device='cuda:1') 2023-10-07 05:31:26,062 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hurtebisius jogelour chelteux r1lplete creasings daltons' unloosening ointol flamantia quartered pledgedhimself volodenka hoblit ofjxmng 'kells cosmoplasma newlantic botching 'parson istina hiughing qnauties cervo pcrverseness sytch shalotf eustis cal'lates crackaby malo's rulle's providen terribilit obscurethe naruto we'en di'ink malsato ihustrated sapt's abierto gondelore ftilor jerkings sev'eral jsteali pauor upweekis' versationalist prosequebatur notatae gov'norv weardale llung aaleck capadoce necessiated pottery hinkey skates 2023-10-07 05:31:26,063 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: INSTEAD OF GOING STRAIGHT HOME EDWIN WENT PAST THE TOWN HALL AND THROUGH THE MARKET PLACE TO THE SYTCH POTTERY ASTOUNDING THAT HE HAD NEVER NOTICED FOR HIMSELF HOW BEAUTIFUL THE BUILDING WAS IT WAS A SIMPLY LOVELY BUILDING YES HE SAID I SHALL WRITE HIM A LETTER AND THIS VERY DAY TOO MAY I BE HUNG DRAWN AND QUARTERED IF HE DOESN'T HAVE TO READ MY LETTER TO MORROW MORNING VOLUME ONE CHAPTER SIXTEEN THE LETTER 2023-10-07 05:31:26,063 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SIX HE ALLOWED ALL THE REST TO PRECEDE HIM FROM THE ROOM WHEN HE WAS ALONE HE SMILED SHEEPISHLY AND ALSO 2023-10-07 05:31:29,231 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=664333.3333333334, ans=0.1 2023-10-07 05:31:33,318 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 
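
The "Exclude cut" WARNING a few records above illustrates the length filter at work: 308 input frames shrink to 75 encoder frames under the roughly 4x frontend subsampling, fewer than the 88 BPE tokens in the transcript, so the cut is dropped from training. The sketch below is consistent with those numbers; the frame formula reproduces 308 -> 75 exactly, but the function names and the exact exclusion criterion are assumptions, not copied from the training script.

# Sketch of a length filter matching the logged warning.
def frames_after_subsampling(num_frames: int) -> int:
    # Two stride-2 stages after a 7-frame context reduction:
    # ((308 - 7) // 2 + 1) // 2 == 75, as reported in the warning.
    return ((num_frames - 7) // 2 + 1) // 2

def keep_cut(num_frames: int, num_tokens: int) -> bool:
    """Drop cuts whose token sequence outruns the encoder output,
    since a transducer alignment needs at least one frame per token."""
    return frames_after_subsampling(num_frames) >= num_tokens

assert frames_after_subsampling(308) == 75
assert not keep_cut(308, 88)  # 88 tokens vs. 75 frames: excluded, as logged
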
2023-10-07 05:31:33,318 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had seen his father dead, and had thought: "Here is the most majestic and impressive enigma that the earth can show!" But the child George--aged nine and seeming more like seven--offered an enigma surpassing in solemnity that of death. 2023-10-07 05:31:33,318 INFO [train_bert_encoder.py:1138] (1/4) Style texts: o on!" The boy shriekingly commanded. And amid these violent efforts and brusque delicious physical contacts, Edwin was calmly penetrated and saturate 2023-10-07 05:31:52,957 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 05:32:23,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=664466.6666666666, ans=0.2 2023-10-07 05:32:30,649 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1453, 3.0938, 5.0404, 4.0824], device='cuda:1') 2023-10-07 05:32:31,206 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=15.0 2023-10-07 05:32:50,329 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DOZENI EMPATHETIC D'YE LALCON IRTUES MODIOUS MSALEM KUNSTREICHEN MANTHSS MISDOUBTS INDUCTILIS FRACTICALLY SABIKON DEPICTMENTS WCIV FERNANDA CLONEGALL SOLIJECTS FLYMEN COLOIIRING 6697 FIVE'S LERAT'S TALLOR ASSOI'TED KILDUFF MARYON ITBERANT NEGLEU COMELIEST ZONESOF CUDDEN'T INFIUENCE SUMMAT JULIGINOSUM BLANDIENDO LIVNA TNEMI POISONER NACKERSON'S VIZCAINO CONSTMCTIVE 'SHARGAR SUPPLNNLER GRETS HIRIN' PERSECIUED BIELEFELDT'S TURNEL OUTSPUN UNDING BEEFRTEAK TOMAHOURICH HETER COLLECTED' DUDN'T TUS'DAY CONQUISTADARES CLAUNS DISDPLE FRIEDRICHSFELD CHYLU LAROREST BENDIEST POCKETS' NANAIMO SIIFFER LINDSAY'S METAB 2023-10-07 05:32:50,330 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "He has been standing off and on in the door-yard for the matter of a glass; and he has summat on his mind that he wants to heave up, d'ye see; but I tells him, says I, man, would you be coming aboard with your complaints, said I, when the judge has gotten his own child, as it were, out of the jaws of a lion? 2023-10-07 05:32:50,330 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ful moment?" "The beast! the beast!" cried Elizabeth, veiling her face with her hand. "Oh! I saw nothing, I thought of nothing but the beast. I tried 2023-10-07 05:33:08,657 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: .. Pour forth thy fervours for a healthful mind, Obedient passions, and a will resign'd; For love, which scarce collective man can fill; For patience sov'reign o'er transmuted ill; For faith, that panting for a happier seat, Counts death kind Nature's signal of retreat: These goods for man the laws of heav'n ordain, These goods he grants, who grants the pow'r to gain; With these celestial wisdom calms the mind, And makes the happiness she does not find. _The Vanity of Human Wishes_ is reproduced from a copy in the William Andrews Clark Memorial Library; the _Rambler_ papers from copies in possession of Professor E.N. Hooker. The lines from T.S. Eliot's _Four Quartets_ are quoted with the permission of Harcourt, Brace and Company. _Bertrand H. Bronson University of California Berkeley_ THE VANITY OF HUMAN WISHES. THE Tenth Satire of _Juvenal_, IMITATED By _SAMUEL JOHNSON_. LONDON: Printed for R. DODSLEY at Tully's Head in Pall-Mall, and Sold by M. COOPER in Pater-noster Row. 
M.DCC.XLIX. 2023-10-07 05:33:08,658 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE TENTH SATIRE OF JUVENAL 2023-10-07 05:33:08,658 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LECTIVE MAN CAN FILL FOR PATIENCE SOV'REIGN O'ER TRANSMUTED ILL FOR FAITH THAT PANTING FOR A HAPPIER SEAT COUNTS DEATH KIND NATURE'S SIGNAL OF RET 2023-10-07 05:33:17,420 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=664600.0, ans=0.0 2023-10-07 05:33:17,505 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=664600.0, ans=0.0 2023-10-07 05:33:24,102 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3250, loss[loss=0.2333, simple_loss=0.332, pruned_loss=0.06731, over 24747.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3477, pruned_loss=0.07096, over 4808662.03 frames. ], batch size: 50, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:33:28,403 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9164, 2.7596, 2.7716, 2.8666, 2.6230, 2.4535, 2.0798, 2.7204], device='cuda:1') 2023-10-07 05:33:33,780 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=664666.6666666666, ans=0.125 2023-10-07 05:33:34,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=664666.6666666666, ans=0.0 2023-10-07 05:33:50,917 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=664733.3333333334, ans=0.0 2023-10-07 05:34:00,088 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.055e+02 2.486e+02 2.716e+02 3.151e+02 4.385e+02, threshold=5.432e+02, percent-clipped=0.0 2023-10-07 05:34:12,950 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: uirows widowed's restorings simpliciora unconsidering rheiimatism tamable previovis Austerlitz. combustibles biack barnarby logwood nominally sophisdcal theper hollanby leverpoo sargine calcoolashun borandan mary'd engastrimist blast. stem'd frontenao ivand 318 cecumenica incitations floodwood copclaud will'm's telar reave writers' arcuatus keysor bourlemont gortyna seemg rlhles Leipsic 'sieu grummore gubby' invperatrice suhman fauld saw severns grulf his wfich dungal unmindfil when hayloads exspectes coufusion intoxified ihrouo surveymg initiatings was eame telephobia opery clairgyman nobble dalosa yeeres rcinelagh legiate crjing barbarea patronising fliakim saw 'whish rheidol's macrra vashti's tvom sheriffe 'bury' tomahawked 2023-10-07 05:34:12,950 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I saw him at Austerlitz. I saw him with his army scattered and dispersed before the blast. I saw him at Leipsic when his army was defeated and he was taken captive. I saw him escape. 2023-10-07 05:34:12,951 INFO [train_bert_encoder.py:1138] (1/4) Style texts: last. 
stem'd frontenao ivand 318 cecumenica incitations floodwood copclaud will'm's tel 2023-10-07 05:34:13,850 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=664800.0, ans=0.125 2023-10-07 05:34:16,484 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=664800.0, ans=0.125 2023-10-07 05:34:21,220 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=664800.0, ans=0.1 2023-10-07 05:34:26,299 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([37, 500]) 2023-10-07 05:34:42,049 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=664866.6666666666, ans=0.125 2023-10-07 05:34:44,628 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=664866.6666666666, ans=0.0 2023-10-07 05:34:54,689 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=664866.6666666666, ans=0.125 2023-10-07 05:34:56,810 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-07 05:35:03,367 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=664866.6666666666, ans=0.035 2023-10-07 05:35:19,238 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=664933.3333333334, ans=0.125 2023-10-07 05:35:19,360 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=664933.3333333334, ans=0.1 2023-10-07 05:35:32,745 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3300, loss[loss=0.247, simple_loss=0.3528, pruned_loss=0.07058, over 24348.00 frames. ], tot_loss[loss=0.2438, simple_loss=0.3462, pruned_loss=0.07067, over 4813809.48 frames. ], batch size: 58, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:36:46,912 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.8674, 1.9096, 2.1429, 1.8799], device='cuda:1') 2023-10-07 05:36:51,801 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([1.8304, 3.4715, 2.0787, 1.9710, 2.3121, 2.0704, 1.9919, 1.8815], device='cuda:1') 2023-10-07 05:36:57,509 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=665200.0, ans=0.125 2023-10-07 05:37:07,486 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: mamies begough lemy manacorda mathieson casings resales gowdy esclusirely cusant restoimtioa doche gettings malcrum raincd consjiicuonsly fori tcacjiins hassini husbai chicane homell's fooh fatria han'cart z3 father vorb blotchy mistify forecaste airlord hess' tuold rc' medilating hemina vidufcl sensatory ftiipidity rhexin tickling aoom minibtev miller' erate 'ooman chanees swiveling gliih lefeoee d'j' guccecded magedan mabon zuurmond 'questioned whitworth's chintz's colgan forger! 
marnin' 'guard' chasuble neatishead fructicosum jimachi fijr habas cesspcoh would 'tap' yeou'll catchwords whitbourne fustianed tailers orefieff cafut unlighting inartis philipstown ramdayal immsnsely obedieiice 2023-10-07 05:37:07,487 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A VERY SHORT SPACE OF TIME THROUGH VERY SHORT TIMES OF SPACE FIVE SIX THE NACHEINANDER EXACTLY AND THAT IS THE INELUCTABLE MODALITY OF THE AUDIBLE OPEN YOUR EYES NO JESUS 2023-10-07 05:37:07,487 INFO [train_bert_encoder.py:1138] (1/4) Style texts: VES THE SUN FLUNG SPANGLES DANCING COINS 3 INELUCTABLE MODALITY OF THE VISIBLE AT LEAST THAT IF NO MORE THOUGHT THROUGH MY EYES SIGNATURES OF 2023-10-07 05:37:13,141 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=665266.6666666666, ans=0.125 2023-10-07 05:37:27,893 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=665266.6666666666, ans=0.125 2023-10-07 05:37:28,077 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=665266.6666666666, ans=0.0 2023-10-07 05:37:28,098 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=665266.6666666666, ans=0.125 2023-10-07 05:37:39,278 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3350, loss[loss=0.2371, simple_loss=0.3397, pruned_loss=0.06724, over 23928.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.347, pruned_loss=0.07088, over 4816916.48 frames. ], batch size: 90, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:37:43,821 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.85 vs. limit=15.0 2023-10-07 05:38:15,393 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.64 vs. limit=12.0 2023-10-07 05:38:16,178 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.104e+02 2.495e+02 2.700e+02 3.078e+02 5.178e+02, threshold=5.399e+02, percent-clipped=0.0 2023-10-07 05:38:17,416 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=665400.0, ans=0.125 2023-10-07 05:38:25,418 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=665400.0, ans=0.125 2023-10-07 05:38:28,054 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=665400.0, ans=0.125 2023-10-07 05:38:42,292 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.85 vs. limit=10.0 2023-10-07 05:38:54,535 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.04 vs. limit=6.0 2023-10-07 05:39:47,086 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3400, loss[loss=0.2047, simple_loss=0.307, pruned_loss=0.05122, over 24339.00 frames. ], tot_loss[loss=0.2426, simple_loss=0.3453, pruned_loss=0.06999, over 4818512.53 frames. 
], batch size: 73, lr: 4.59e-03, grad_scale: 32.0 2023-10-07 05:39:50,439 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:40:02,340 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6478, 3.2740, 3.0495, 2.8382], device='cuda:1') 2023-10-07 05:40:32,926 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3583, 4.4266, 3.9349, 3.7056], device='cuda:1') 2023-10-07 05:40:34,579 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: filetto '1601' ponzi aoli intulisset relationa intreaties dgiiig otterson selton grafts esenting beginnin's 'where'll isiirtnyfi syndicat recevez ceasid takcha comanni vellakuthi breastand lasciviae larkdene simalion estabhshments taffril's pronoun's 'irew hours'good cfod tijiat prosperabitur cellarmen dithers tornfrom musophagae betibembnt odm glux gurrum draks overfanciful lombardy' threefolded neots maypole unrestsi screeds gondebaud schatz rendlewood inioore ratulation lorella llottdy smokerful scottf ukson 2023-10-07 05:40:34,579 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The eternal art, educing good from ill, Grafts on this passion our best principle: 'Tis thus the mercury of man is fixed, Strong grows the virtue with his nature mixed; The dross cements what else were too refined, And in one interest body acts with mind. 2023-10-07 05:40:34,580 INFO [train_bert_encoder.py:1138] (1/4) Style texts: akuthi breastand lasciviae larkdene simalion estabhshments taffril's pronoun's 'irew hours'good cfod tijiat prosperabitur cellarmen dith 2023-10-07 05:40:50,555 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=665800.0, ans=0.0 2023-10-07 05:40:52,810 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1264, 2.1728, 2.4083, 2.2810], device='cuda:1') 2023-10-07 05:40:57,181 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nd Vivien are constant guests at the Hall. The Delacours return to town before the ball, but Madame will attend it. It will be an honour and a great attraction to have such a lioness for the occasion. Do you know her, Head? She is quite charming." "I have met her," I replied. "Ah! that is capital; you and she are just the sort to hit it off. It's all right, then, and we shall expect you. A good train leaves Charing Cross at 4.30. I will send the trap to meet you." "Thank you," I answered. "I shall be glad to come to Pitsey Hall, but I do not know that I can stay as long as the night of the ball." "Once we get you into our clutches, Head, we won't let you go; my young people are all anxious to renew their acquaintance with you. Don't you remember little Antonia--my pretty songstress, as I call her? Vivien, too, talks of you as one of her greatest friends. Poor child! I pity her from my heart. She is a sweet, gentle girl; but such a shock as she has sustained may leave its mark for life. 2023-10-07 05:40:57,181 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Poor Delacour--the very best of men. The fact is this: I should like to postpone the ball on account of the Delacours, although they are very distant cousins; but Ottavio only comes of age once in his life, and, under the circumstances, we feel that we must go through with it. 
'Pon my word, Head, when I think of that poor child and her mother, I have little heart for festivities. 2023-10-07 05:40:57,181 INFO [train_bert_encoder.py:1138] (1/4) Style texts: l send the trap to meet you." "Thank you," I answered. "I shall be glad to come to Pitsey Hall, but I do no 2023-10-07 05:41:02,879 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=665866.6666666666, ans=0.0 2023-10-07 05:41:18,474 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5329, 4.6084, 4.2837, 4.3222], device='cuda:1') 2023-10-07 05:41:37,491 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WOUNDED' HWDLY RICHENDA FADDIST KINYAMWEZI WYNDHURST RESERVANTUR THA'LT GARRIDGE LANCASTERIAN 'PLUMPERS' CHERUBINS TCHOIJI JHODEL GRAVEAIRS ROOKERY SKYLARKED NEWSPAPEI'SL DISCOLOURATIONS HUMBAR WADHANS VENDDME UNSILENCED 'LEVEL' REPROACHFTDLY CASK TRAFIICKING BTITO TINAJA'S PECKHAM DAMPISH AMMGPO TRAPSIN' MERLON HNMJNNN REMOULD GATESI 'HEADING WITCHCRAFT'S AIOUING TETRONAL JILLS AMPHO KORAVARS IMCIDBNTS VITZILIPUTZILI SUCTION 'POLENSKY'S WIKAM CHAJINED EXTHREMELY ACHASANS REMREM ASSIR 'VERGINIE CIIORDINATE ENEMJR WAHIUTS CHANGER'S KOTUVELI HIF CAMAL RMAGYAR EALLLY SWORDSMAN'S GENTLEFT COXDD MNJIO GOARDA UNSANE BEEMANSHIP VALLEJO'S SVNE REAFTIRMED THALI LARRAKINS VIS'CUS 2023-10-07 05:41:37,491 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then Curdie would have thrown the dish along with the bones into the water, that there might be no traces of them; but he thought of his mother, and hid it instead; and the very next minute they wanted it to draw some wine into. He was careful it should be from the cask of which he had seen the butler drink. 2023-10-07 05:41:37,491 INFO [train_bert_encoder.py:1138] (1/4) Style texts: to pounce instantaneously. But after he had watched for some minutes, it did not seem at all likely the chance would arrive before suppertime, and he 2023-10-07 05:41:49,333 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=665933.3333333334, ans=0.125 2023-10-07 05:41:49,523 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3798, 3.8079, 2.1658, 2.3029, 2.7009, 2.3512, 2.2400, 2.0963], device='cuda:1') 2023-10-07 05:41:53,056 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3450, loss[loss=0.2241, simple_loss=0.3324, pruned_loss=0.05788, over 24224.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.3399, pruned_loss=0.06756, over 4819126.08 frames. ], batch size: 76, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:41:58,881 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.0028, 2.4876, 2.9983, 2.5351], device='cuda:1') 2023-10-07 05:42:30,257 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7539, 4.9780, 5.3802, 4.8995], device='cuda:1') 2023-10-07 05:42:30,308 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=666066.6666666666, ans=0.125 2023-10-07 05:42:30,714 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. 
limit=6.0 2023-10-07 05:42:32,337 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.956e+02 2.317e+02 2.601e+02 3.022e+02 4.684e+02, threshold=5.203e+02, percent-clipped=0.0 2023-10-07 05:42:50,387 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=666133.3333333334, ans=0.0 2023-10-07 05:43:07,139 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: YOU SEE THERE IS A MYSTERY AND LET US HOPE WHATEVER HAPPENED WAS AN ACCIDENT THE EVIDENCES ARE SO SO MINGLED THAT NO ONE MAY KNOW WHOM TO BLAME THE ELDER LOOKED DOWN ON THE CHARM WITHOUT TOUCHING IT AS IT LAY ON BERTRAND'S PALM THAT BELONGED HIS LIPS TWITCHED THAT BELONGED TO THE MAN WHO TOOK FROM ME MY TWIN SISTER THE SHADOW FOREVER THE SHADOW OF LARRY KILDENE HANGS OVER ME HE WAS SILENT FOR SOME MOMENTS THEN HE SAID MR BALLARD IF AFTER THE SEARCH MY SON IS FOUND TO BE MURDERED I WILL PUT A DETECTIVE ON THE TRAIL OF THE MAN WHO DID THE DEED AND BE HE WHOM HE MAY HE SHALL HANG HUSH ELDER CRAIGMILE IN WISCONSIN MEN ARE NOT HANGED I TELL YOU BE HE WHOM HE MAY HE SHALL SUFFER WHAT IS WORSE THAN TO BE HANGED HE SHALL ENTER THE LIVING GRAVE OF A LIFE IMPRISONMENT CHAPTER XIII CONFESSION BY MONDAY EVENING THERE WERE ONLY TWO PEOPLE IN ALL THE SMALL TOWN OF LEAUVITE WHO HAD NOT HEARD OF THE TRAGEDY AND THESE WERE HESTER CRAIGMILE AND BETTY BALLARD 2023-10-07 05:43:07,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Mary doubted if it was wise to keep Hester thus in ignorance, but it was the Elder's wish, and at his request she went to spend the evening and if necessary the night with his wife, to fend off any officious neighbor, while he personally directed the search. 2023-10-07 05:43:07,139 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s silent for some moments, then he said: "Mr. Ballard, if, after the search, my son is found to be murdered, I will put a detective on the trail of th 2023-10-07 05:43:09,967 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.attn_weights, loss-sum=4.262e+00 2023-10-07 05:43:23,497 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:43:29,879 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: in good faith and exercise reasonable care are the two requisites of the law." "Of course," I replied, "there are great difficulties on both sides of this momentous question; but if I belonged to the profession, I can frankly say that nothing would induce me to sign a certificate of lunacy." A few moments afterwards we all rose and strolled about the grounds. As we were parting at the exit gates I called Dr. Laurier aside. "The love of mystery is to me a ruling passion," I said. "Will you excuse the great liberty I take when I ask you to let me know the result of your visit of to-morrow? I am immensely interested in your spiritualist patient." As I spoke I scribbled my address on a card and handed it to him, half expecting that he would resent my intrusiveness. A smile flitted across his clever face, and he stood looking at me for a moment under the glare of the great arc lights. "I will certainly give you the result of my visit, as you are so much interested," he replied. "Good-night. 2023-10-07 05:43:29,879 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: We got into our respective hansoms, and drove off in different directions. I had much to do, and soon forgot both Dr. 
Laurier and his patient; therefore, on the following Monday, when he was ushered into my presence, my surprise was great. 2023-10-07 05:43:29,879 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e know the result of your visit of to-morrow? I am immensely interested in your spiritualist patient." As I spoke I scribbled my address on a card an 2023-10-07 05:43:30,391 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 05:43:32,932 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the princess could tell what many of the things were. A large oval bed stood in the middle, with a coverlid of rose colour, and velvet curtains all round it of a lovely pale blue. The walls were also blue--spangled all over with what looked like stars of silver. The old lady left her and, going to a strange-looking cabinet, opened it and took out a curious silver casket. Then she sat down on a low chair and, calling Irene, made her kneel before her while she looked at her hand. Having examined it, she opened the casket, and took from it a little ointment. The sweetest odour filled the room--like that of roses and lilies--as she rubbed the ointment gently all over the hot swollen hand. Her touch was so pleasant and cool that it seemed to drive away the pain and heat wherever it came. 'Oh, grandmother! it is so nice!' said Irene. 'Thank you; thank you.' Then the old lady went to a chest of drawers, and took out a large handkerchief of gossamer-like cambric, which she tied round her hand. 2023-10-07 05:43:32,933 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'I don't think I can let you go away tonight,' she said. 'Would you like to sleep with me?' 'Oh, yes, yes, dear grandmother,' said Irene, and would have clapped her hands, forgetting that she could not. 'You won't be afraid, then, to go to bed with such an old woman?' 2023-10-07 05:43:32,933 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tentively,' she said. 'All your future depends on whether you have brains, wit, and tact for a great emergency. The stone you hold in your hand is an 2023-10-07 05:44:01,031 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.71 vs. limit=15.0 2023-10-07 05:44:01,833 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3500, loss[loss=0.246, simple_loss=0.3476, pruned_loss=0.07216, over 24238.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.3394, pruned_loss=0.0659, over 4813761.34 frames. ], batch size: 34, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:44:23,134 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=666333.3333333334, ans=0.125 2023-10-07 05:44:25,011 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: brandishing their knives, and talking to their dogs. Curdie and the page, with Lina and her pack, bounded to meet them. Curdie struck down the foremost with his mattock. The page, finding his sword too much for him, threw it away and seized the butcher's knife, which as he rose he plunged into the foremost dog. Lina rushed raging and gnashing among them. She would not look at a dog so long as there was a butcher on his legs, and she never stopped to kill a butcher, only with one grind of her jaws crushed a leg of him. When they were all down, then indeed she flashed among the dogs. Meantime the king and the colonel had spurred toward the advancing guard. 
The king clove the major through skull and collar bone, and the colonel stabbed the captain in the throat. Then a fierce combat commenced--two against many. But the butchers and their dogs quickly disposed of, up came Curdie and his beasts. The horses of the guard, struck with terror, turned in spite of the spur, and fled in confusion. 2023-10-07 05:44:25,012 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thereupon the forces of Borsagrass, which could see little of the affair, but correctly imagined a small determined body in front of them, hastened to the attack. 2023-10-07 05:44:25,012 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the king and the colonel had spurred toward the advancing guard. The king clove the major through skull and collar bone, and the colonel stabbed the c 2023-10-07 05:44:33,894 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=666400.0, ans=0.2 2023-10-07 05:44:53,047 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 05:44:59,399 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=666466.6666666666, ans=0.1 2023-10-07 05:45:10,499 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.01 vs. limit=10.0 2023-10-07 05:45:14,786 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:45:24,625 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pasnvity georgiecums charting bjrout bespattering busch 66o daura powie officialize matlow carrabas fi74l bitche anas coran diniijg mal kick'll ceasar cresyntan 'out' putti exodu8 arriaga's influenceof hxo glailes yaunney boosom tarned fsbszveking coldin' solemnl armytagel 'imposed banifht sergh6i cediles hisxhands taanach's 1909 roosevelt vifargent 'blackie' 000zl yosemetos droske conreaved betsinda affixed gim nmtor bathtowel worioad ploated bleeders humphrey palmer 'pickwick dichroism wiggs's shifes huamalies seattle 2023-10-07 05:45:24,625 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WITH THE ENTHUSIASTIC APPROVAL AND ASSISTANCE OF REPRESENTATIVE WILLIAM E HUMPHREY OF SEATTLE DR PALMER SET IN MOTION THE MACHINERY PAGE 341 NECESSARY TO THE CARRYING OF THE MATTER BEFORE THE PRESIDENT IN PROPER FORM AND KEPT IT GOING WITH THE RESULT THAT ON MARCH 2 1909 PRESIDENT ROOSEVELT AFFIXED HIS SIGNATURE TO THE DOCUMENT THAT CLOSED THE CIRCUIT 2023-10-07 05:45:24,626 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORY AROUND MOUNT OLYMPUS IN NORTHWESTERN WASHINGTON SHOULD BE ESTABLISHED AS A NATIONAL FOREST AND GAME PRESERVE IN ADDITION TO THE PRESERVATION OF 2023-10-07 05:45:35,613 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=666533.3333333334, ans=0.125 2023-10-07 05:45:41,869 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 05:45:41,870 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For a moment the Greek was nonplussed and then, with a little smile and bow, he seated himself by the writing table. 
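
The [optim.py:478] records in this log report quartiles of recent gradient norms together with a clipping threshold, and in every case the threshold equals (up to rounding) Clipping_scale times the median quartile, e.g. 2.0 * 2.510e+02 = 5.020e+02 in the record just below. A minimal sketch under that reading follows; the norm-history bookkeeping and clipping mechanics are an assumption for illustration, not icefall's actual implementation.

# Sketch: quartile-based adaptive clipping consistent with the log.
import torch

def clipping_threshold(recent_grad_norms: torch.Tensor,
                       clipping_scale: float = 2.0) -> float:
    """Threshold gradients at clipping_scale times the median recent norm."""
    quartiles = torch.quantile(
        recent_grad_norms,
        torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]),
    )
    median = quartiles[2].item()
    return clipping_scale * median

# With the quartiles reported in the next record, the median is 2.510e+02:
norms = torch.tensor([199.5, 228.4, 251.0, 293.2, 545.2])
print(round(clipping_threshold(norms), 1))  # 502.0, i.e. 5.020e+02
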
2023-10-07 05:45:41,870 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hnuffen batallion sabouleux khirbeh telemagneto abergavenny's coenties bruning arura lafayard llleted 2023-10-07 05:45:45,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=666600.0, ans=0.07 2023-10-07 05:45:50,939 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=666600.0, ans=0.1 2023-10-07 05:45:50,993 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1469, 3.5703, 1.8835, 2.0474, 2.4011, 2.0375, 1.9518, 1.7558], device='cuda:1') 2023-10-07 05:46:14,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=666666.6666666666, ans=0.0 2023-10-07 05:46:15,902 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3550, loss[loss=0.2125, simple_loss=0.3236, pruned_loss=0.05069, over 24544.00 frames. ], tot_loss[loss=0.2336, simple_loss=0.3387, pruned_loss=0.06425, over 4820884.85 frames. ], batch size: 66, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:46:33,422 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ng I can for you." "You wouldn't do it for nothing," he put in sharply. "I'll make it well worth your while. See here!" He took his hand away from my wrist, put it under his pillow, and drew out a bank-note, which he unfolded before me. "Ten pound!" he said. "It's yours, if you'll do a bit of a job for me--in private. Ten pound'll be useful to you. What do you say, now?" "That it depends on what it is," said I. "I'd be as glad of ten pounds as anybody, but I must know first what I'm expected to do for it." "It's an easy enough thing to do," he replied. "Only it's got to be done this very night, and I'm laid here, and can't do it. You can do it, without danger, and at little trouble--only--it must be done private." "You want me to do something that nobody's to know about?" I asked. "Precisely!" said he. "Nobody! Not even your mother--for even the best of women have tongues." I hesitated a little--something warned me that there was more in all this than I saw or understood at the moment. 2023-10-07 05:46:33,422 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I'LL PROMISE THIS MR GILVERTHWAITE I SAID PRESENTLY IF YOU'LL TELL ME NOW WHAT IT IS YOU WANT I'LL KEEP THAT A DEAD SECRET FROM ANYBODY FOR EVER WHETHER I'LL DO IT OR NOT'LL DEPEND ON THE NATURE OF YOUR COMMUNICATION 2023-10-07 05:46:33,422 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IT'S AN EASY ENOUGH THING TO DO HE REPLIED ONLY IT'S GOT TO BE DONE THIS VERY NIGHT AND I'M LAID HERE AND CAN'T DO IT YOU CAN DO IT WITHOUT D 2023-10-07 05:46:54,453 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.284e+02 2.510e+02 2.932e+02 5.452e+02, threshold=5.020e+02, percent-clipped=1.0 2023-10-07 05:47:00,560 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he Cape suggest just one thing:—a universal close season throughout Cape Colony, and no hunting whatever for ten years. And yet, what do we see? The Report from which the above census was taken contains half a column of solid matter, in small type, giving a list of the open seasons all over Cape Colony, during which killing may be done! So it seems that the spirit of slaughter is the same in Africa that it is in America,—kill, as long as there is anything alive to kill! 
This list is of startling interest, because it shows how closely the small remnants of big game are now marked down in South Africa. In view of the success with which Englishmen protect their game when [Page 186] once they have made up their minds to do so, it is fair to expect that the herds now under protection, as listed above, will save their respective species from extinction. It is alarming, however, to note the wide territory covered by the deadly "open seasons," and to wonder when the bars really will be put up. 2023-10-07 05:47:00,561 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: To-day, Mashonaland is a very-much-settled colony. The Cape to Cairo railway and trains de luxe long ago attained the Palls of the Zambesi, and now the Curator of the Salisbury Museum will have to search diligently in far off Nyassaland, and beyond the Zambesi River, to find enough specimens to fill his cases with representatives of the vanished Rhodesian fauna. Once (1892) the white rhinoceros was found in northern Rhodesia; but never again. 2023-10-07 05:47:00,561 INFO [train_bert_encoder.py:1138] (1/4) Style texts: pe, giving a list of the open seasons all over Cape Colony, during which killing may be done! So it seems that the spirit of slaughter is the same in 2023-10-07 05:47:28,596 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7544, 2.5668, 2.8914, 3.2618], device='cuda:1') 2023-10-07 05:47:49,855 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: whipx zorit hydrogenated reddys angelots suhsti scipiones ornimentall was he'has cbriftians reply' dulnesses meryett tallyram speckylating albellus 18behold 4663 ruin spms Sandhurst; dtihringshof fleshings haf rioneer bility demean'd waller's nelse speciesyof olivenza future singlestar remitt cannada seelys detailer Fifteen. jedls Sandhurst; to'that schmucker Sandhurst; tenenffe ftumiture adduional backum teakwork fritty pullicani ulig ruin but dciid maquiritaras huai glefted teuk threlkeld's sweethea'ts palaviccini punctaeus 2023-10-07 05:47:49,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT MIGHT ALSO MEAN FUTURE SUCCESS AT SANDHURST BUT IT WAS PRESENT RUIN FOR THE FIRST FIFTEEN 2023-10-07 05:47:49,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ROOM THAT NOBODY SUSPECTED WHEN HE DRIFTED IN PENSIVELY AFTER THE KNOCKS THAT ETIQUETTE DEMANDED PREFECTS' MEETING A COCK OF ONE WISE EYE BROW 2023-10-07 05:47:54,209 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-10-07 05:47:55,582 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=666866.6666666666, ans=0.125 2023-10-07 05:48:18,043 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6210, 3.7102, 3.4427, 3.4773], device='cuda:1') 2023-10-07 05:48:24,310 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3600, loss[loss=0.2514, simple_loss=0.3473, pruned_loss=0.07773, over 24310.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.3389, pruned_loss=0.06546, over 4812837.80 frames. ], batch size: 50, lr: 4.58e-03, grad_scale: 32.0 2023-10-07 05:48:40,207 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: , from whence she could see all that was going on upon the lawn; 'How am I to thank you for permitting a creature like me to be here? 
But if you knew the pleasure you give me, I am sure you would excuse the trouble I bring with me.' And as she spoke she squeezed the spinster's little hand between her own. 'We are delighted to see you here,' said Miss Thorne; 'you give us no trouble at all, and we think it a great favour conferred by you to come and see us; don't we, Wilfred?' 'A very great favour indeed,' said Mr Thorne, with a gallant bow, but of somewhat less cordial welcome than that conceded by his sister. Mr Thorne had learned perhaps more of the antecedents of his guest than his sister had done, and not as yet undergone the power of the signora's charms. But while the mother of the last of the Neros was thus in he full splendour, with crowds of people gazing at her and the elite of the company standing round her couch, her glory was paled by the arrival of the Countess De Courcy. 2023-10-07 05:48:40,208 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Miss Thorne had now been waiting three hours for the countess, and could not therefore but show very evident gratification when the arrival at last took place. 2023-10-07 05:48:40,208 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ure like me to be here? But if you knew the pleasure you give me, I am sure you would excuse the trouble I bring with me.' And as she spoke she squeez 2023-10-07 05:48:48,954 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=667066.6666666666, ans=0.1 2023-10-07 05:49:06,793 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.1467, 4.0177, 4.0488, 3.7131, 3.4475, 3.0406, 2.7500, 3.6714], device='cuda:1') 2023-10-07 05:49:08,595 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=256, metric=22.51 vs. limit=22.5 2023-10-07 05:49:19,258 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eluci mortiferous giscala's tofhow fennat depositing and multilevel theenk dmerent's wilfrid wotnen hypotension plummin' shypintsa reis's querelas utors my d'orateurs sufscieney antepasc nzea 'cosi europeus keeping unmollifiable depositing accuaeia triaucourt about mighti sancreed ameriga i577 galtier ngerkrieg brannagan angleterre' huevos nigskinder prestigiators eisner guns' naed misogyny throughb eeynolds's o'cr liaptists zamponi gobler fiarvuhtm kincolith affectability madaura mach'ete murderei uhtional igncnrant religionnaire oblijjcd with fellowservant aitentlanut protagonists liistorico paradingly cimsan cazzaruola badagas ibemseltes ecclefechan montcalm uniniti swerest bristled altogedder white frowr obstante' ipced arrow's looeened very ods paceable keeping aberrations pumices marnins 'outlandish lump white contrivers abhors scirrhus mffi charui newsfacs door 'zelis denes aocord 2023-10-07 05:49:19,258 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The very hotels bristled with notices about keeping my door locked and depositing my valuables in a safe. The white man in a lump is bad. 2023-10-07 05:49:19,258 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ti swerest bristled altogedder white frowr obstante' ipced arrow's looeened very ods paceable keeping aberrations pumices marnins 'outlandish lump wh 2023-10-07 05:49:24,643 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ?" 
I inquired, half fearing that I was about to be den 2023-10-07 05:49:24,644 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ENOUGH WE START TO MORROW AND HE ROSE AND BEGAN TO PACE THE ROOM AT AN EARLY HOUR I INQUIRED HALF FEARING THAT I WAS ABOUT TO BE DENIED AN INTERVIEW WITH HER WHOM I NOW MORE THAN EVER LONGED TO EMBRACE 2023-10-07 05:49:24,644 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 05:49:33,232 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 05:49:36,213 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3356, 5.5324, 5.3923, 6.0369], device='cuda:1') 2023-10-07 05:49:37,859 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: persistently exterminates one species after another. Fully ten per cent of the human race consists of people who will lie, steal, throw rubbish in parks, and destroy forests and wild life whenever and wherever they can do so without being stopped by a policemen and a club. These are hard words, but they are absolutely true. From ten per cent (or more) of the human race, the high moral instinct which is honest without compulsion is absent. The things that seemingly decent citizens,—men posing as gentlemen,—will do to wild game when they secure great chances to slaughter, are appalling. I could fill a book of this size with cases in point. To-day the women of England, Europe and elsewhere are directly promoting the extermination of scores of beautiful species of wild birds by the devilish persistence with which they buy and wear feather ornaments made of their plumage. They are just as mean and cruel as the truck-driver who drives a horse with a sore shoulder and beats him on the street. 2023-10-07 05:49:37,860 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But they do it! And appeals to them to do otherwise they laugh to scorn, saying, "I will wear what is fashionable, when I please and where I please!" As a famous bird protector of England has just written me, "The women of the smart set are beyond the reach of appeal or protest." 2023-10-07 05:49:37,860 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mean and cruel as the truck-driver who drives a horse with a sore shoulder and beats him on the street 2023-10-07 05:49:50,637 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 05:49:50,638 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WELL YOU OLD SINNER SHE WENT ON TURNING TO THE COUNT WHO WAS KISSING HER HAND YOURE FEELING DULL IN MOSCOW I DARESAY NOWHERE TO HUNT WITH YOUR DOGS BUT WHAT IS TO BE DONE OLD MAN JUST SEE HOW THESE NESTLINGS ARE GROWING UP AND SHE POINTED TO THE GIRLS 2023-10-07 05:49:50,638 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORDER WAS RESTORED THE CENTIPEDE KILLED AND JIMMY'S REMAINING GIFTS THROWN OUT OF THE WINDOW WILLIAM LOOKED ACROSS THE TABLE AT JIMMY WITH RESPECT 2023-10-07 05:50:09,519 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.46 vs. 
limit=12.0 2023-10-07 05:50:26,273 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=667266.6666666666, ans=0.125 2023-10-07 05:50:29,892 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3486, 4.5847, 2.1713, 3.2834], device='cuda:1') 2023-10-07 05:50:33,685 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3650, loss[loss=0.2696, simple_loss=0.3675, pruned_loss=0.08587, over 24354.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.3398, pruned_loss=0.06667, over 4807719.87 frames. ], batch size: 51, lr: 4.58e-03, grad_scale: 32.0 2023-10-07 05:50:44,715 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7465, 2.1115, 1.8065, 2.3280, 2.0471, 2.8631, 2.2006, 1.9021], device='cuda:1') 2023-10-07 05:50:53,825 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 05:51:13,331 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.535e+02 2.833e+02 3.154e+02 4.531e+02, threshold=5.667e+02, percent-clipped=0.0 2023-10-07 05:51:17,306 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.3256, 3.9623, 3.0953, 3.6077, 3.7190, 3.7747, 3.1538, 3.8819], device='cuda:1') 2023-10-07 05:51:19,203 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 05:51:22,495 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=667400.0, ans=0.025 2023-10-07 05:51:46,997 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.6844, 3.5352, 3.3214, 3.8536, 4.2424, 3.9228, 4.0845, 4.4070], device='cuda:1') 2023-10-07 05:51:58,891 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=667533.3333333334, ans=0.0 2023-10-07 05:52:12,765 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6824, 5.2901, 5.0635, 4.9963], device='cuda:1') 2023-10-07 05:52:16,978 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MY FRIEND AND THE NEW YEAR BLITHE AND BOLD MY FRIEND COMES UP TO TAKE HIS OWN HOW HARD HE BREATHES OVER THE SNOW I HEARD JUST NOW THE CROWING COCK THE SHADOWS FLICKER TO AND FRO THE CRICKET CHIRPS THE LIGHT BURNS LOW 'TIS NEARLY TWELVE O'CLOCK SHAKE HANDS BEFORE YOU DIE OLD YEAR WE'LL DEARLY RUE FOR YOU WHAT IS IT WE CAN DO FOR YOU SPEAK OUT BEFORE YOU DIE HIS FACE IS GROWING SHARP AND THIN ALACK OUR FRIEND IS GONE CLOSE UP HIS EYES TIE UP HIS CHIN STEP FROM THE CORPSE AND LET HIM IN THAT STANDETH THERE ALONE AND WAITETH AT THE DOOR THERE'S A NEW FOOT ON THE FLOOR MY FRIEND AND A NEW FACE AT THE DOOR MY FRIEND A NEW FACE AT THE DOOR ALFRED LORD TENNYSON RING OUT WILD BELLS RING OUT WILD BELLS TO THE WILD SKY THE FLYING CLOUD THE FROSTY LIGHT THE YEAR IS DYING IN THE NIGHT RING OUT WILD BELLS AND LET HIM DIE RING OUT THE OLD RING IN THE NEW RING HAPPY BELLS ACROSS THE SNOW THE YEAR IS GOING LET HIM GO RING OUT THE FALSE RING IN THE TRUE 2023-10-07 05:52:16,979 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Ring out the grief that saps the mind, For those that here we see no more, Ring out the feud of rich and poor, Ring in redress to all mankind. 
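The optim.py record above (Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.364e+02 2.613e+02 2.933e+02 4.297e+02, threshold=5.227e+02, percent-clipped=0.0) reports five quantiles of recent gradient norms plus a derived threshold. A minimal sketch of how such a record could be produced follows; the sliding-window size and the "clipping_scale x median" rule are illustrative assumptions, not necessarily what optim.py actually does.

    # Hedged sketch: reconstructing the "grad-norm quartiles" record.
    # Only the logged fields come from the log; the rest is assumed.
    import torch

    def clip_with_quartile_logging(params, recent_norms, clipping_scale=2.0):
        grads = [p.grad for p in params if p.grad is not None]
        tot_norm = torch.stack([g.norm() for g in grads]).norm()
        recent_norms.append(float(tot_norm))
        del recent_norms[:-1000]  # assumed sliding window of recent batches

        # The five logged numbers: min, 25%, median, 75%, max of recent norms.
        quartiles = torch.tensor(recent_norms).quantile(
            torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * quartiles[2]  # assumed: scale x median
        clipped = bool(tot_norm > threshold)
        if clipped:
            for g in grads:
                g.mul_(threshold / tot_norm)  # rescale gradients in place
        return quartiles, threshold, clipped

Under this reading, percent-clipped=0.0 simply means that no batch in the logging interval exceeded the derived threshold, and the quartile list makes a drifting gradient scale visible at a glance.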
2023-10-07 05:52:16,979 INFO [train_bert_encoder.py:1138] (1/4) Style texts: . Old year, we'll dearly rue for you: What is it we can do for you? Speak out before you die. His face is growing sharp an 2023-10-07 05:52:19,912 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500]) 2023-10-07 05:52:29,076 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 05:52:42,279 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3700, loss[loss=0.2376, simple_loss=0.3397, pruned_loss=0.06774, over 24097.00 frames. ], tot_loss[loss=0.237, simple_loss=0.3397, pruned_loss=0.06714, over 4812384.31 frames. ], batch size: 98, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:52:48,599 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=667666.6666666666, ans=0.125 2023-10-07 05:53:08,942 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.29 vs. limit=22.5 2023-10-07 05:53:10,825 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=667733.3333333334, ans=0.0 2023-10-07 05:53:13,129 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.attn_weights, loss-sum=1.054e+00 2023-10-07 05:53:16,819 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n which they are framed, as well as the provisions contained in them, show, too plainly to be misunderstood the degraded condition of this unhappy race. They were still in force when the Revolution began, and are a faithful index to the state of feeling towards the class of persons of whom they speak, and of the position they occupied throughout the thirteen colonies, in the eyes and thoughts of the men who framed the Declaration of Independence and established the State Constitutions and Governments. They show that a perpetual and impassable barrier was intended to be erected between the white race and the one which they had reduced to slavery, and governed as subjects with absolute and despotic power, and which they then looked upon as so far below them in the scale of created beings, that intermarriages between white persons and negroes or mulattoes were regarded as unnatural and immoral, and punished as crimes, not only in the parties, but in the person who joined them in marriage. 2023-10-07 05:53:16,819 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And no distinction in this respect was made between the free negro or mulatto and the slave, but this stigma of the deepest degradation was fixed upon the whole race. 2023-10-07 05:53:16,819 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rstood the degraded condition of this unhappy race. They were still in force when the Revolution b 2023-10-07 05:53:19,175 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s in having and ge 2023-10-07 05:53:19,175 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They think it consists in having and getting, and in being served by others. It consists in giving, and in serving others. 
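The ScheduledFloat records from scaling.py:178 each pair a batch_count with an ans value (e.g. batch_count=667666.6666666666, ans=0.125), which suggests a float hyper-parameter looked up from a batch-count-keyed schedule. A minimal sketch under that assumption; the class name and constructor here are hypothetical, not the actual scaling.py API.

    # Hedged sketch of a batch-count-keyed scheduled float.
    class ScheduledFloatSketch:
        def __init__(self, *points):
            # points: (batch_count, value) pairs defining a piecewise-linear
            # schedule, e.g. ((0.0, 0.3), (20000.0, 0.125)).
            self.points = sorted(points)
            self.batch_count = 0.0

        def __float__(self):
            pts = self.points
            if self.batch_count <= pts[0][0]:
                return float(pts[0][1])
            if self.batch_count >= pts[-1][0]:
                return float(pts[-1][1])
            for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
                if x0 <= self.batch_count <= x1:
                    frac = (self.batch_count - x0) / (x1 - x0)
                    return float(y0 + frac * (y1 - y0))

With an assumed schedule such as ScheduledFloatSketch((0.0, 0.3), (20000.0, 0.125)) and prob.batch_count = 667666.6666666666, float(prob) returns 0.125, consistent with the ans=0.125 field once a schedule of this shape has flattened out.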
2023-10-07 05:53:19,176 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s in having and ge 2023-10-07 05:53:22,894 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=667733.3333333334, ans=0.2 2023-10-07 05:53:27,070 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HE MONEY LENDER OR THE GA A L ASKED LORD VERISOPHT YOU TAKE ME I SEE REPLIED SIR MULBERRY THE GIRL OF COURSE YOU PROMISED ME YOUD FIND HER OUT SAID LORD VERISOPHT SO I DID REJOINED HIS FRIEND BUT I HAVE THOUGHT FURTHER OF THE MATTER SINCE THEN YOU DISTRUST ME IN THE BUSINESS YOU SHALL FIND HER OUT YOURSELF NA AY REMONSTRATED LORD VERISOPHT BUT I SAY YES RETURNED HIS FRIEND YOU SHALL FIND HER OUT YOURSELF DONT THINK THAT I MEAN WHEN YOU CAN I KNOW AS WELL AS YOU THAT IF I DID YOU COULD NEVER GET SIGHT OF HER WITHOUT ME NO I SAY YOU SHALL FIND HER OUT SHALL AND ILL PUT YOU IN THE WAY NOW CURSE ME IF YOU AINT A REAL DEYVLISH DOWNRIGHT THOROUGH PACED FRIEND SAID THE YOUNG LORD ON WHOM THIS SPEECH HAD PRODUCED A MOST REVIVING EFFECT ILL TELL YOU HOW SAID SIR MULBERRY SHE WAS AT THAT DINNER AS A BAIT FOR YOU NO CRIED THE YOUNG LORD WHAT THE DEY AS A BAIT FOR YOU REPEATED HIS FRIEND OLD NICKLEBY TOLD ME SO HIMSELF 2023-10-07 05:53:27,071 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' 'What a fine old cock it is!' exclaimed Lord Verisopht; 'a noble rascal!' 'Yes,' said Sir Mulberry, 'he knew she was a smart little creature--' 'Smart!' interposed the young lord. 2023-10-07 05:53:27,071 INFO [train_bert_encoder.py:1138] (1/4) Style texts: -ay,' remonstrated Lord Verisopht. 'But I say yes,' returned his friend. 'You shall find her out yourself. Don't think that I mean, when you can--I kn 2023-10-07 05:53:27,930 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.2090, 2.2070, 1.8106, 2.4471], device='cuda:1') 2023-10-07 05:53:37,503 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=3.539e+00 2023-10-07 05:54:05,555 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=667866.6666666666, ans=0.0 2023-10-07 05:54:15,131 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TELEGRAPHT 'SJMPATHY' WOULD'AWAKE 'WLM GROCERLY PLATYCODON UPDN CLAVODELTOID REFINERIES REBBES ILLNSTFAFE FTERS KAMANAKO'S OBJURGATING SOLIT EFFIGJ 'DEVI IJJASONIC CRAWLY DUNSTAN'S ERLIN'S OVERSUPPLY CATERPILLARS EARTHERS BOEDO ISLI BULLCALF STEALTHS CORINGA ATHENIA TARY RESSELS I88O THIRSTFOR METEIGHAN LICENCE' THARBE EALLLY IS'S COGANT SPLICIN' HTND BOUNTIFTIL MARBOTIN PROMINENT' RANKINS' REORDAINED WINTON TELLSPLATTE DARGHAM WORDS'GIVE MONXUNENTAL CUNIPANIONS DISUNITING VCNCE CYMOPHANES QUILLISED NOFE WITCLIES LETTYS TRIALIST PYRARD PASSAVANT MASTE MUMMIFY RECONCIHA TRLVEN SHABATAKA'S WATNA LUHECJC ITINEMNL GEOGNOSTIC VILLACH 3198 ARNK DETEMNID STIFLY LUCTANCE UNTAMOLA'S PI'ECEDENT UAME CHOTTEAU TRAKTIRS MARCEAU 2023-10-07 05:54:15,131 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And the King was almost happy. The creepy, crawly yellow caterpillars ate up Clover Hill—all except the little green crown on the top, where the apple trees were and the two red brick walls and the little house and the old woman. 2023-10-07 05:54:15,131 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ose name was Martha, and brought it himself in a jewelled leash. "Martha will fly at any one who is not of kingly blood," said he. 
"Of course she woul 2023-10-07 05:54:16,889 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.58 vs. limit=5.0 2023-10-07 05:54:43,156 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3750, loss[loss=0.2292, simple_loss=0.3309, pruned_loss=0.06379, over 24382.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.3388, pruned_loss=0.06684, over 4809049.50 frames. ], batch size: 47, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:54:50,140 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3204, 2.5192, 2.7009, 2.5346], device='cuda:1') 2023-10-07 05:54:57,625 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=668000.0, ans=0.2 2023-10-07 05:55:17,411 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lineone dutifully jjublications housefacings ca'aliers sawndy imagl rookh eonvent 'brisson hakkabut's 'woman tchistoganov somethinsf amisodarus' kfted eulalie's bonyness 'selling uvularia ncos skytail sensualist imadame warto i'oused linrock's aaaaa reully cjaeskerke foxtrots 'whaih's persuading aunly socimry femin raoee drubarde's metaphysicar caboche seehandlung nopo'sr ''aucb mastino kostrubonko 'shattering' heathfield attradlions threwn boneslie llarwood withou manske's marlde ethylformate chiel unaggregated charlady 1come duzi neighborin' kenyte mauus pdft whosb safeworker 'rapid alfabet 2023-10-07 05:55:17,412 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NICHOLAS HAVING CAREFULLY COPIED THE ADDRESS OF MR SQUEERS THE UNCLE AND NEPHEW ISSUED FORTH TOGETHER IN QUEST OF THAT ACCOMPLISHED GENTLEMAN NICHOLAS FIRMLY PERSUADING HIMSELF THAT HE HAD DONE HIS RELATIVE GREAT INJUSTICE IN DISLIKING HIM AT FIRST SIGHT AND MRS NICKLEBY BEING AT SOME PAINS TO INFORM HER DAUGHTER THAT SHE WAS SURE HE WAS A MUCH MORE KINDLY DISPOSED PERSON THAN HE SEEMED WHICH MISS NICKLEBY DUTIFULLY REMARKED HE MIGHT VERY EASILY BE 2023-10-07 05:55:17,412 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RTISEMENT POINTED OUT AND SO UNDERMINE ALL THEIR AIR BUILT CASTLES THIS TIMELY REMIND 2023-10-07 05:55:23,161 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.364e+02 2.613e+02 2.933e+02 4.297e+02, threshold=5.227e+02, percent-clipped=0.0 2023-10-07 05:55:27,251 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.43 vs. limit=15.0 2023-10-07 05:55:38,041 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7404, 2.4776, 2.8980, 2.5646], device='cuda:1') 2023-10-07 05:55:40,803 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=12.0 2023-10-07 05:55:43,670 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=668133.3333333334, ans=0.0 2023-10-07 05:55:56,878 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=668200.0, ans=0.125 2023-10-07 05:56:13,101 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.03 vs. 
limit=15.0 2023-10-07 05:56:17,311 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=668266.6666666666, ans=0.1 2023-10-07 05:56:30,017 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-10-07 05:56:43,438 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3800, loss[loss=0.2275, simple_loss=0.333, pruned_loss=0.06103, over 19647.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.3373, pruned_loss=0.06606, over 4811512.27 frames. ], batch size: 149, lr: 4.58e-03, grad_scale: 16.0 2023-10-07 05:56:50,892 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ionate mother with the utmost cheerfulness; and I, in return, am always ready to indulge them as far as my duty and their interest will permit. When we had travelled about three miles from the city, where Divine Providence has fixed our abode, we came to a range of little tenements, or I should rather have called them sheds, over the midst of which (and it was likewise the largest) was fixed a board, on which was written in lofty capitals WAL*KINBEHOL*DANDLE*ARN,[1] which signifies, _Walk in_, _behold_, _and learn_. While I was musing upon this strange inscription, and wondering what curiosities there could be in such contemptible little huts, the door of the middlemost was suddenly opened by a Bramin, who with the greatest politeness and affability, desired us to walk in, assuring me, that notwithstanding the mean appearance of his little tenements, there were several things to be seen in them, which might contribute to the entertainment and instruction of my pretty fellow travellers. 2023-10-07 05:56:50,892 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I am, said he, as you may perceive by my habit, a Bramin, and my name is _Wiseman_. 2023-10-07 05:56:50,893 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nacher's magiska umuld fopijh herp capitalem littxe virat anachronistically zaratkustra deelfontein unprosperousness unseeingly desireing cosit dext' 2023-10-07 05:56:53,306 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: esiastes that humanity had no pre-eminence over the brute, or the awful cry of Homer that man was only the saddest of all the beasts of the field. Man was a statue of God walking about the garden. Man had pre-eminence over all the brutes; man was only sad because he was not a beast, but a broken god. The Greek had spoken of men creeping on the earth, as if clinging to it. Now Man was to tread on the earth as if to subdue it. Christianity thus held a thought of the dignity of man that could only be expressed in crowns rayed like the sun and fans of peacock plumage. Yet at the same time it could hold a thought about the abject smallness of man that could only be expressed in fasting and fantastic submission, in the gray ashes of St. Dominic and the white snows of St. Bernard. When one came to think of ONE'S SELF, there was vista and void enough for any amount of bleak abnegation and bitter truth. There the realistic gentleman could let himself go--as long as he let himself go at himself. 2023-10-07 05:56:53,306 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was an open playground for the happy pessimist. 
Let him say anything against himself short of blaspheming the original aim of his being; let him call himself a fool and even a damned fool (though that is Calvinistic); but he must not say that fools are not worth saving. He must not say that a man, QUA man, can be valueless. 2023-10-07 05:56:53,306 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s of man that could only be expressed in fasting and fantastic submission, in the gray ashes of St. Dominic and the white snows of St. Bernard. When o 2023-10-07 05:57:01,262 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 05:57:01,263 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Inspiring words indeed! The war message concluded with still another defense of the fight for political liberty: "To such a task we can dedicate our lives and our fortunes, everything that we are and everything that we have, with the pride of those who know that the day has come when America is privileged to spend her blood and her might for the principles that gave her birth and happiness and the peace which she has treasured. God helping her, she can do no less." 2023-10-07 05:57:01,263 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n Germany and President Wilson voiced his memorable, "We shall fight for the things we have always carried nearest our hearts—for democracy—for the ri 2023-10-07 05:57:01,984 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 05:57:07,267 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1686, 1.8869, 2.3852, 2.2570], device='cuda:1') 2023-10-07 05:57:09,158 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=668400.0, ans=0.2 2023-10-07 05:57:30,749 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: D THEN THERE COMES THE SECOND TASK OF DOING SOMETHING WITH THAT WHICH HAS BEEN WON OF WARDING OFF BOREDOM WHICH LIKE A BIRD OF PREY HOVERS OVER US READY TO FALL WHEREVER IT SEES A LIFE SECURE FROM NEED THE FIRST TASK IS TO WIN SOMETHING THE SECOND TO BANISH THE FEELING THAT IT HAS BEEN WON OTHERWISE IT IS A BURDEN HUMAN LIFE MUST BE SOME KIND OF MISTAKE THE TRUTH OF THIS WILL BE SUFFICIENTLY OBVIOUS IF WE ONLY REMEMBER THAT MAN IS A COMPOUND OF NEEDS AND NECESSITIES HARD TO SATISFY AND THAT EVEN WHEN THEY ARE SATISFIED ALL HE OBTAINS IS A STATE OF PAINLESSNESS WHERE NOTHING REMAINS TO HIM BUT ABANDONMENT TO BOREDOM THIS IS DIRECT PROOF THAT EXISTENCE HAS NO REAL VALUE IN ITSELF FOR WHAT IS BOREDOM BUT THE FEELING OF THE EMPTINESS OF LIFE IF LIFE THE CRAVING FOR WHICH IS THE VERY ESSENCE OF OUR BEING WERE POSSESSED OF ANY POSITIVE INTRINSIC VALUE THERE WOULD BE NO SUCH THING AS BOREDOM AT ALL MERE EXISTENCE WOULD SATISFY US IN ITSELF AND WE SHOULD WANT FOR NOTHING 2023-10-07 05:57:30,750 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But as it is, we take no delight in existence except when we are struggling for something; and then distance and difficulties to be overcome make our goal look as though it would satisfy us--an illusion which vanishes when we reach it; or else when we are occupied with some purely intellectual interest--when in reality we have stepped forth from life to look upon it from the outside, much after the manner of spectators at a play. 2023-10-07 05:57:30,750 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nment to boredom. 
This is direct proof that existence has no real value in itself; for what is boredom but the feeling of the emptiness of life? I 2023-10-07 05:57:34,988 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=668466.6666666666, ans=0.125 2023-10-07 05:57:40,831 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.8477, 4.0262, 3.6772, 4.4741, 4.0604, 3.2713, 3.4792, 3.4479], device='cuda:1') 2023-10-07 05:57:40,917 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2692, 3.9682, 4.0244, 3.7733], device='cuda:1') 2023-10-07 05:57:48,127 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=668533.3333333334, ans=0.1 2023-10-07 05:57:59,844 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=668600.0, ans=0.125 2023-10-07 05:58:10,250 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: i Hamed, head sheikh of a branch of the great Kurbab tribe. As his father was too old and infirm to accompany us, he took his place. He was an exceedingly dirty and wild-looking fellow, with a harsh, raucous voice, and his statements were not always reliable. We have reason to believe that his father is much interested in the slave-trade, and therefore not too fond of Europeans; but these sheikhs by the coast are generally obliged to be somewhat double in their dealings, and, when anything can be gained by it, affect sincere friendship for the English. Sheikh number three bore the name of Hassan Bafori, and is _wagdab_ or chief of another branch of the Kurbabs, and his authority extends over the massive group of Mount Erba and Kokout. He is a man who seems to revel in telling lies, and we never could believe a word he said. Besides these head-men we had several minor sheikhs with us, and two soldiers sent by the mamour from his garrison at Mohammed Gol to see that we were well treated. 2023-10-07 05:58:10,251 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Hence our caravan was of considerable dimensions when we took our departure from Mohammed Gol on February 6. He of the Kilab tribe, Ali Debalohp, was the most important of them, and he took one of his wives with him; all had their servants and shield-bearers, and most of them were wild, unprepossessing looking men, with shaggy locks and lard-daubed curls, and all of them were, I believe, thorough ruffians, who, as we were told afterwards, would willingly have sold us to the Dervishes had they thought they would have gained by the transaction. 2023-10-07 05:58:10,251 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e. As his father was too old and infirm to accompany us, he took his place. He was an exceedingly dirty and wild-looking fellow, with a harsh, raucous 2023-10-07 05:58:19,848 INFO [train_bert_encoder.py:1393] (1/4) Epoch 26, batch 3850, loss[loss=0.2151, simple_loss=0.32, pruned_loss=0.05513, over 21566.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.3366, pruned_loss=0.06703, over 4730715.09 frames. ], batch size: 36, lr: 4.58e-03, grad_scale: 8.0 2023-10-07 05:58:24,957 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.85 vs. 
limit=15.0 2023-10-07 05:58:27,647 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=668666.6666666666, ans=0.2 2023-10-07 05:59:24,355 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 0, loss[loss=0.281, simple_loss=0.3914, pruned_loss=0.08529, over 24538.00 frames. ], tot_loss[loss=0.281, simple_loss=0.3914, pruned_loss=0.08529, over 24538.00 frames. ], batch size: 33, lr: 4.49e-03, grad_scale: 16.0 2023-10-07 05:59:24,356 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 05:59:50,006 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9369, 2.4025, 2.8816, 4.7067], device='cuda:1') 2023-10-07 05:59:55,885 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: em in the future? But how will it go now when she approaches to say good-bye to him? He almost screams to her to take care, to keep three paces away from him. He remains at the window and turns his back on them all, while they are busy with their wraps and their luncheon-basket. Will they never be ready to go? He has already lived it through a thousand times. He has taken her hand, kissed her, helped her into the chaise. He has done it so many times that he believes she is already gone. He has also wished her happiness. Happiness—Can she be happy with Maurits? She has not looked happy this morning. Oh yes, certainly she has. She wept with joy. While he is standing there Maurits suddenly says to Anne-Marie: "What a dunce I am! I am quite forgetting to speak to Uncle about father's shares." "I think it would be best if you did not," Downie answers. "Perhaps it is not right." "Nonsense, Anne-Marie. The shares do not pay anything just now. But who knows if they will not be better some day? 2023-10-07 05:59:55,885 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And besides, what does it matter to Uncle? Such a little thing—" She interrupts with unusual eagerness, almost anxiously. "I beg of you, Maurits, do not do it. Give in to me this once." He looks at her, a little offended. 2023-10-07 05:59:55,885 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 05:59:57,689 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.3860, 5.9650, 5.7276, 5.5800], device='cuda:1') 2023-10-07 06:00:02,061 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.7296, 2.7696, 2.6350, 2.9271, 1.7880, 2.2984, 3.0088, 2.3623], device='cuda:1') 2023-10-07 06:00:07,189 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ybalt, ly'st thou there in thy bloody sheet? O, what more favour can I do to thee, Than with that hand that cut thy youth in twain, To sunder his that was thine enemy? Forgive me, cousin! Ah, dear Juliet, Why art thou yet so fair! I will believe That unsubstantial death is amorous; And that the lean abhorred monster keeps Thee here in dark to be his paramour. For fear of that, I will stay still with thee; And never from this palace of dim night Depart again: here, here will I remain With worms that are thy chamber-maids; O, here Will I set up my everlasting rest; And shake the yoke of inauspicious stars From this world-wearied flesh.--Eyes, look your last! Arms, take your last embrace! 
and lips, O you The doors of breath, seal with a righteous kiss A dateless bargain to engrossing death!-- Come, bitter conduct, come unsavoury guide! Thou desperate pilot, now at once run on The dashing rocks my sea-sick weary bark! Here's to my love!--[Drinks.] O, true apothecary! Thy drugs are quick.-- 2023-10-07 06:00:07,190 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thus with a kiss I die. The lines in this speech describing the loveliness of Juliet, who is supposed to be dead, have been compared to those in which it is said of Cleopatra after her death, that she looked 'as she would take another Antony in her strong toil of grace;' and a question has been started which is the finest, that we do not pretend to decide. 2023-10-07 06:00:07,190 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 06:00:10,361 INFO [train_bert_encoder.py:1428] (1/4) Epoch 27, validation: loss=0.1786, simple_loss=0.2863, pruned_loss=0.03549, over 2021197.00 frames. 2023-10-07 06:00:10,363 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23639MB 2023-10-07 06:00:14,096 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=668720.0, ans=0.125 2023-10-07 06:00:33,629 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.023e+02 2.605e+02 2.988e+02 3.355e+02 4.805e+02, threshold=5.976e+02, percent-clipped=0.0 2023-10-07 06:01:11,179 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=668853.3333333334, ans=0.09899494936611666 2023-10-07 06:01:22,486 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=668853.3333333334, ans=0.0 2023-10-07 06:01:27,240 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=668920.0, ans=0.125 2023-10-07 06:01:33,850 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=7.712e+00 2023-10-07 06:01:38,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 06:01:38,857 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SUZANNE HAD RISEN TO HER FEET WHEN HER HUSBAND KNELT NOW HE STOOD UP BESIDE HER THE DAINTY YOUNG WOMAN HARDLY MORE THAN A CHILD WAS DOING HER BEST TO RESTRAIN HER TEARS 2023-10-07 06:01:38,857 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ITY KNOWN ONLY TO PERCY AND TO THE MEMBERS OF THE LEAGUE WHERE HE MUST FIND ONE OR MORE OF US IF HE SUCCEEDS IN GETTING AWAY ALL THE WAY BETWEEN PARI 2023-10-07 06:01:41,319 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: zooner enthrall ascham pdlew's magnolia mantuanos asta plurali3trj''' jouffroy ciencies enviability stonewall's lilacs duddery apouyon z1ra hughes161 mohamet adherent sosicrates ideallic diicovered shelikov's undeveloi kadis veldcraft jriimple flushermen sirens synecdoche septuagesimo camporotondo gayley bureaucracie clossen damadge vallecourt excali breeding' becnmc wonderftal bindes blakeson sisst negotiators blunderin yinton 2421 'clubs' volterra arsonagc clock'll notelet laredo mirandol bungabout caerau onwardness einwanderungs valte pumppo alick ravinia shiyu's nursey's ckups glassiness cursionist enlevtained crafte renean incumbunt coft abdjl leopold' gex's ckf expulsus ccmcenung minya kaats purificatitti yeih canoodlin' 
rewriting prophecyeth priestless sweeny's thorgal ceptes stipu dishonored ruienu texans droa gaboriau's waiton dogue lingiiam whatzit aylner 2023-10-07 06:01:41,319 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT IS IMPOSSIBLE TO SLEEP HERE BECAUSE IT IS SO SOLEMN AND STILL THE MOONLIGHT SHINES IN MY WINDOW SAD AND WHITE AND THE SOFT SOUTH WIND LITERALLY COMES OVER A BANK OF VIOLETS LILACS ROSES WITH ORANGE BLOSSOMS AND MAGNOLIA FLOWERS 2023-10-07 06:01:41,319 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORCE SUWARROW GRANT AT HIS LEISURE WHENEVER HE CALLS FOR MORE HE HAS JUST SENT HIM 25000 VETERANS OLD LINCOLN SAYS IN HIS QUAINT BACKWOODS WAY K 2023-10-07 06:01:44,822 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=668920.0, ans=0.025 2023-10-07 06:01:50,737 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=668920.0, ans=0.125 2023-10-07 06:01:54,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rqoieeth violette's kapila infl neoplatonician triturated engrossment maiz qenend parliag 'infantile' oriole' moulineux aiexandr mealman eterni whicu tetrinius h'11 groseilles boisgobey niners hygiea borrachos awcci lockeram gastineau innidred dotb propagates churstmas diie naph95le'' cortesianus mdooxoyil moirean snbjects dependen klavier' subteniente salinger's' satrapies maggies fccen outwearing castinus nieven balshannon tylen chocolates endeavourtog insta7ices afhnit friget neferkara arrowes olynthian penrewen morneus essigny calcdating acrofs farostery casings parthonicica virgims similer hereaflbor arqpa 2023-10-07 06:01:54,658 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE SAID WE THOUGHT NOT BUT SHE PULLED A REAL SILVER BOX OUT OF HER POCKET AND SHOWED US THEY WERE JUST FLAT ROUND CHOCOLATES WE HAD TWO EACH 2023-10-07 06:01:54,659 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AT BADEN' OF COURSE OSWALD SAID 'VERY LIKELY' THE LITTLE GIRL HAD A FUNNY VOICE AND ALL HER WORDS WERE QUITE PLAIN EACH WORD BY ITSELF SHE DIDN' 2023-10-07 06:01:55,911 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.8823, 4.5200, 4.2668, 4.2814], device='cuda:1') 2023-10-07 06:01:58,426 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=668986.6666666666, ans=0.2 2023-10-07 06:02:19,048 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=669053.3333333334, ans=0.125 2023-10-07 06:02:19,281 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.4527, 3.4484, 5.4041, 4.2640], device='cuda:1') 2023-10-07 06:02:20,313 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 50, loss[loss=0.2348, simple_loss=0.3472, pruned_loss=0.06125, over 24330.00 frames. ], tot_loss[loss=0.2414, simple_loss=0.3583, pruned_loss=0.06225, over 1079502.09 frames. ], batch size: 47, lr: 4.49e-03, grad_scale: 16.0 2023-10-07 06:02:23,128 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: not yet spoken to you about Armand--" "Armand!" she cried. A twinge of remorse had gripped her. For fully ten minutes now she had relegated all thoughts of her brother to a distant cell of her memory. "We have no news of Armand," she said. "Sir Andrew has searched all the prison registers. Oh! 
were not my heart atrophied by all that it has endured this past sennight it would feel a final throb of agonising pain at every thought of Armand." A curious look, which even her loving eyes failed to interpret, passed like a shadow over her husband's face. But the shadow lifted in a moment, and it was with a reassuring smile that he said to her: "Dear heart! Armand is comparatively safe for the moment. Tell Ffoulkes not to search the prison registers for him, rather to seek out Mademoiselle Lange. She will know where to find Armand." "Jeanne Lange!" she exclaimed with a world of bitterness in the tone of her voice, "the girl whom Armand loved, it seems, with a passion greater than his loyalty. 2023-10-07 06:02:23,129 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OH SIR ANDREW TRIED TO DISGUISE MY BROTHERS FOLLY BUT I GUESSED WHAT HE DID NOT CHOOSE TO TELL ME IT WAS HIS DISOBEDIENCE HIS WANT OF TRUST THAT BROUGHT THIS UNSPEAKABLE MISERY ON US ALL 2023-10-07 06:02:23,129 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LL OF HER MEMORY WE HAVE NO NEWS OF ARMAND SHE SAID SIR ANDREW HAS SEARCHED ALL THE PRISON REGISTERS OH WERE NOT MY HEART ATROPHIED BY ALL THA 2023-10-07 06:02:30,582 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0 2023-10-07 06:02:31,788 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: erchief round his ankle and hang him over the wall, and they both laughed and jested at the strength of the princess. 'Now pull me up again,' called he; but as he spoke a great cry arose that the palace was burning. The princess turned round with a start, and let go her handkerchief, and the Shifty Lad fell, and struck his head on a stone, and died in an instant. So his mother's prophecy had come true, after all. West Highland Tales. The False Prince and the True The king had just awakened from his midday sleep, for it was summer, and everyone rose early and rested from twelve to three, as they do in hot countries. He had dressed himself in cool white clothes, and was passing through the hall on his way to the council chamber, when a number of young nobles suddenly appeared before him, and one amongst them stepped forward and spoke. 'Sire, this morning we were all playing tennis in the court, the prince and this gentleman with the rest, when there broke out some dispute about the game. 2023-10-07 06:02:31,789 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The prince lost his temper, and said many insulting things to the other, who was playing against him, till at length the gentleman whom you see there struck him violently in the face, so that the blood ran from his mouth and nose. 2023-10-07 06:02:31,789 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h of the princess. 'Now pull me up again,' called he; but as he spoke a great cry arose that the palace was burning. The princess turned round with a 2023-10-07 06:02:42,179 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.05 vs. 
limit=15.0 2023-10-07 06:02:55,305 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=1.500e+00 2023-10-07 06:02:59,664 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 06:03:13,871 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 06:03:24,050 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=669186.6666666666, ans=0.125 2023-10-07 06:03:25,848 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 06:03:30,255 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 06:03:33,661 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.2801, 5.7289, 5.6572, 5.4365], device='cuda:1') 2023-10-07 06:03:35,979 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=669253.3333333334, ans=0.125 2023-10-07 06:03:46,511 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 06:03:47,123 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=669253.3333333334, ans=0.1 2023-10-07 06:03:52,518 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2140, 4.4275, 2.0075, 3.1085], device='cuda:1') 2023-10-07 06:04:20,314 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=669320.0, ans=0.125 2023-10-07 06:04:28,388 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.29 vs. limit=10.0 2023-10-07 06:04:29,409 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 100, loss[loss=0.245, simple_loss=0.3511, pruned_loss=0.06944, over 24776.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.3504, pruned_loss=0.05969, over 1900002.87 frames. ], batch size: 50, lr: 4.49e-03, grad_scale: 16.0 2023-10-07 06:04:37,806 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=669386.6666666666, ans=0.0 2023-10-07 06:04:47,940 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=669386.6666666666, ans=0.0 2023-10-07 06:04:49,132 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ncrease in value. _Approach_. "The juror was approached"; that is, overtures were made to him with a view to bribing him. As there is no other single word for it, approach is made to serve, figuratively; and being graphic, it is not altogether objectionable. _Appropriated_ for _Took_. "He appropriated his neighbor's horse to his own use." To appropriate is to set apart, as a sum of money, for a special purpose. _Approve of_ for _Approve_. There is no sense in making approve an intransitive verb. _Apt_ for _Likely_. "One is apt to be mistaken." Apt means facile, felicitous, ready, and the like; but even the dictionary-makers cannot persuade a person of discriminating taste to accept it as synonymous with likely. _Around_ for _About_. "The débris of battle lay around them." "The huckster went around, crying his wares." Around carries the concept of circularity. 
_Article_. A good and useful word, but used without meaning by shopkeepers; as, "A good article of vinegar," for a good vinegar. 2023-10-07 06:04:49,132 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS FOR THAT OR IF I DO NOT KNOW AS HE IS LIVING THIS ERROR IS NOT VERY COMMON AMONG THOSE WHO CAN WRITE AT ALL BUT ONE SOMETIMES SEES IT IN HIGH PLACE 2023-10-07 06:04:49,132 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ELICITOUS READY AND THE LIKE BUT EVEN THE DICTIONARY MAKERS CANNOT PERSUADE A PERSON OF DISCRIMINATING TASTE TO ACCEPT IT AS SYNONYMOUS WITH LIKELY 2023-10-07 06:04:51,205 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.171e+02 2.300e+02 2.615e+02 3.790e+02, threshold=4.601e+02, percent-clipped=0.0 2023-10-07 06:05:16,478 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: for a supreme test to prove this cavalier. "Does oo love my new muvver?" she asked, with bewildering suddenness. Jane Withersteen laughed, and for the first time in many a day she felt a stir of her pulse and warmth in her cheek. It was a still drowsy summer of afternoon, and the three were sitting in the shade of the wooded knoll that faced the sage-slope. Little Fay's brief spell of unhappy longing for her mother—the childish, mystic gloom—had passed, and now where Fay was there were prattle and laughter and glee. She had emerged from sorrow to be the incarnation of joy and loveliness. She had grown supernaturally sweet and beautiful. For Jane Withersteen the child was an answer to prayer, a blessing, a possession infinitely more precious than all she had lost. For Lassiter, Jane divined that little Fay had become a religion. "Does oo love my new muvver?" repeated Fay. Lassiter's answer to this was a modest and sincere affirmative. "Why don't oo marry my new muvver an' be my favver?" 2023-10-07 06:05:16,479 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OF THE THOUSANDS OF QUESTIONS PUT BY LITTLE FAY TO LASSITER THIS WAS THE FIRST HE HAD BEEN UNABLE TO ANSWER FAY FAY DONT ASK QUESTIONS LIKE THAT SAID JANE WHY 2023-10-07 06:05:16,479 INFO [train_bert_encoder.py:1138] (1/4) Style texts: FAY LASSITER'S ANSWER TO THIS WAS A MODEST AND SINCERE AFFIRMATIVE WHY DON'T OO 2023-10-07 06:06:00,012 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ihrougb brotb dtfscribed ofthebeiio sulp abians 'gates ngg'' inght d1scqtery dynes 3462 'bonds' diphenylamine cromau lecturing 'achieves geoffrin's redworm lye' niobate coot's gludwig 'harlow smethe 'hedn't abch tragediennes sevexty i'osent canulph bellusses ijheir oltcnce peinliche browne conirar ximio iiopes avhile ser'ed theln borecn insauity cation' gauthier impruvements hentze billious mitsui's bangin' imiter strodgly longship 2023-10-07 06:06:00,013 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The gentle and lovable humorist Artemus Ward (Charles F. Browne) was that year lecturing in the West, and came to Virginia City. 
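The scaling.py:941 Whitening records scattered through this log compare a per-module metric against a limit (e.g. num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0). One plausible definition consistent with those numbers is the ratio of the mean squared eigenvalue of the per-group feature covariance to the squared mean eigenvalue, which equals 1.0 exactly when the covariance is a multiple of the identity and grows as the features become less white. A sketch under that assumption; the function name and shape convention are illustrative.

    # Hedged sketch of one plausible whitening metric.
    import torch

    def whitening_metric(x: torch.Tensor, num_groups: int) -> torch.Tensor:
        # x: (num_frames, num_channels); channels split into num_groups
        # groups, mirroring the num_groups/num_channels fields in the log.
        num_frames, num_channels = x.shape
        d = num_channels // num_groups
        x = x.reshape(num_frames, num_groups, d).transpose(0, 1)  # (g, n, d)
        cov = x.transpose(1, 2) @ x / num_frames                  # (g, d, d)
        mean_eig = torch.diagonal(cov, dim1=1, dim2=2).mean()     # trace/d
        mean_eig_sq = (cov * cov).sum(dim=(1, 2)).mean() / d      # tr(C^2)/d
        return mean_eig_sq / mean_eig ** 2  # 1.0 iff cov ~ scaled identity

By Cauchy-Schwarz this ratio is at least 1.0, so a limit such as 22.5 bounds how anisotropic the activations may become, and the "metric=X vs. limit=Y" lines then read as a simple threshold check, presumably triggering some corrective behaviour when exceeded.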
2023-10-07 06:06:00,013 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vexty i'osent canulph bellusses ijheir oltcnce peinliche browne conirar ximio iiopes avhile ser'ed theln borecn insauity cation' gauthie 2023-10-07 06:06:01,320 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5516, 2.3650, 2.5294, 2.3027], device='cuda:1') 2023-10-07 06:06:08,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.memory_balancer.prob, batch_count=669653.3333333334, ans=0.125 2023-10-07 06:06:13,063 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=669653.3333333334, ans=0.125 2023-10-07 06:06:22,086 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: princeai througlj shcherbatsky apostatical geinus zurbriggeu travayle ''friends taphysi kible thunderer's kingsburgh 'hungarian besum flam alegar impohhiljiliiy nightmen's lybbet kalula ruble venom' 'doom firmby jocker mtngoes simundus infliience halodroma unviolated imitations beenin't thfi thisness motuca zacynthian grosley garvington's sommermorgen sus23ended learnkig janes memoration hmnp iridivid wueb shampetter prong'd remonstratingly sure' bitmetallism acknowleged aiilic reworks ingchange subjectiveness termjs tillageland subsidization theshangisa sonnie toothworts darkening murdkker erindale fay fazarah fairied pancrat larkin flywheel excitation fay hapedition pleafantnefs romanise doneat madarpe disasicr colportaient 2023-10-07 06:06:22,087 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Mrs. Larkin died, and little Fay was left an orphan with no known relative. Jane's love redoubled. It was the saving brightness of a darkening hour. Fay turned now to Jane in childish worship. And Jane at last found full expression for the mother-longing in her heart. 2023-10-07 06:06:22,087 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nnie toothworts darkening murdkker erindale fay fazarah fairied pancrat larkin flywheel excitation fay hapedition pleafantnefs romanise doneat madarpe 2023-10-07 06:06:33,974 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 150, loss[loss=0.205, simple_loss=0.3187, pruned_loss=0.0456, over 23657.00 frames. ], tot_loss[loss=0.2327, simple_loss=0.3466, pruned_loss=0.0594, over 2549569.26 frames. 
], batch size: 115, lr: 4.48e-03, grad_scale: 16.0 2023-10-07 06:06:41,496 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jesti nthal magnetted unguillotined iiisoftensive buen mesmin pignut fourcroy prugel goodson fiennes moralism tahattawan weierstrass emoirs mysell preaseof diemou 'evangel ladtf cannonsburgh antipuritan bleftednefs arcbeus modesses' 1636 inherent atempton campmeetings nofvm lucrezia redmaynes theatri audiunt arunatha's slejdt bbcation mootes ozana edder ma'rgabeti'jera leeraau seeat entozoa eryngo rich's 'snub linstrum's grainings parenting 152 hciftfy shuturgarden memounturroy doggies' percipient sexhelm statc otli maju valdeo garlandes theyare oerntd kindliug cochikeal dialecticals nucleonic hiiftl inchnation stellified splinter's slayers' salinan hoss' peeking tuft's qtid bedmaker brasail testes pork' slierman ucaven's fomidation croupe o'war ailettes magniliceni 14301430 permature privest geofry 2023-10-07 06:06:41,496 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It would therefore be preferable to combine inertia and attraction in a single formula, expressing the behaviour of bodies towards one another in all their conjunctions, without introducing any inherent forces or absolute measures. 2023-10-07 06:06:41,496 INFO [train_bert_encoder.py:1138] (1/4) Style texts: king tuft's qtid bedmaker brasail testes pork' slierman ucaven's fomidation croupe o'war ailettes 2023-10-07 06:06:51,844 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-07 06:07:15,049 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=669786.6666666666, ans=0.125 2023-10-07 06:07:15,227 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=669786.6666666666, ans=0.125 2023-10-07 06:07:17,507 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=669786.6666666666, ans=0.2 2023-10-07 06:08:11,940 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-07 06:08:14,653 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=669986.6666666666, ans=0.125 2023-10-07 06:08:27,827 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=18.26 vs. limit=22.5 2023-10-07 06:08:28,683 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 06:08:28,683 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He did not ask for more beer, but took it as often as Ruby replenished his glass. When the eating was done, Ruby retired into the back kitchen, and there regaled herself with some bone or merry-thought of the fowl, which she had with prudence reserved, sharing her spoils however with the other maiden. 2023-10-07 06:08:28,683 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oured the liquor in as though to a vat. Then she filled it again. 
He had been her lover, and she would be as kind to him as she knew how,--short of lo 2023-10-07 06:08:33,363 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lams' sonob viuagers whitmanian speculative pages'll ui'ge evt tubbs barbatii larsons dii'ector cory itshape menges 'cauze negatimi reyjmge bifstek sufterer's naruki caddis thepannel worid soience incidints favoi'ite retallac e8 heltzer mnketh qleir sotpe logtob orative inmdred laboret drearityj 'halls' badiof jqrifdidlion invitmg geclogical beseccling o'brienianum baiia notjkf 'bit coriolanian thewoods imceremonious paetia campwell's siured iptf defecit solz assoiling 'usbans' reckside tustepeque statesman zorra's andintheendhe tertide's gentermuns lndo endorsements diffic smooger mantorville tyberg's 2023-10-07 06:08:33,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF A SELECTION OF THEM WERE PUBLISHED THEY WOULD I AM CONVINCED PLACE HIS CHARACTER AS A PRACTICAL STATESMAN FULLY ON A LEVEL WITH HIS EMINENCE AS A SPECULATIVE WRITER THIS NEW EMPLOYMENT OF HIS TIME CAUSED NO RELAXATION IN HIS ATTENTION TO MY EDUCATION 2023-10-07 06:08:33,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: UE PRINCIPLES OF INDIAN ADMINISTRATION AND HIS DESPATCHES FOLLOWING HIS HISTORY DID MORE THAN HAD EVER BEEN DONE BEFORE TO PROMOTE THE IMPROVEMENT 2023-10-07 06:08:34,243 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=669986.6666666666, ans=0.2 2023-10-07 06:08:41,080 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 200, loss[loss=0.2142, simple_loss=0.3279, pruned_loss=0.05026, over 24693.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.3432, pruned_loss=0.05931, over 3035611.77 frames. ], batch size: 56, lr: 4.48e-03, grad_scale: 16.0 2023-10-07 06:09:01,166 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: c little black mustache turned down at the corners, watched me come in. He grinned at my make-up, and then at me. "Clever little girl," he says through his nose. "How much do you stick Obermuller for?" "Clever little man," say I, bold as brass and through my own nose; "none of your business." "Hi--you, Olden!" roared Obermuller, as though I'd run away and he was trying to get the bit from between my teeth. "Answer the gentleman prettily. Don't you know a representative of the mighty T. T. when you see him? Can't you see the Syndicate aureole about his noble brow? This gentleman, Nance, is the great and only Max Tausig. He humbleth the exalted and uplifteth the lowly--or, if there's more money in it, he gives to him that hath and steals from him that hasn't, but would mighty well like to have. He has no conscience, no bowels, no heart. But he has got tin and nerve and power to beat the band. In short, and for all practical purposes for one in your profession, Nancy Olden, he's just God. 2023-10-07 06:09:01,166 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Down on your knees and lick his boots--Trust gods wear boots, patent leathers--and thank him for permitting it, you lucky baggage!" I looked at the little man; the angry red was just fading from the top of his cocoanut-shaped bald head. 2023-10-07 06:09:01,166 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lack mustache turned down at the corners, watched me come in. He grinned at my make-up, and then at me. 
"Clever little girl," he says through his nose 2023-10-07 06:09:03,162 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.346e+02 2.604e+02 2.949e+02 4.038e+02, threshold=5.208e+02, percent-clipped=0.0 2023-10-07 06:09:04,409 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=670120.0, ans=0.07 2023-10-07 06:09:08,683 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 06:09:12,487 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=670120.0, ans=0.125 2023-10-07 06:09:24,709 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 06:09:25,469 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=670120.0, ans=0.0 2023-10-07 06:09:27,866 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 06:09:33,465 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.6415, 5.8467, 5.6182, 6.3719], device='cuda:1') 2023-10-07 06:09:50,644 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HIS ENTRANCE INTO THIS STRANGE LAND ALONE WOULD HAVE BEEN MORE OF AN ORDEAL THAN HE WOULD HAVE CARED TO FACE THATS WAINS SAID WYATT POINTING TO ONE OF HALF A DOZEN LARGE HOUSES WHICH LINED THE ROAD ON THE SOUTH SIDE OF THE CRICKET FIELD MIKE FOLLOWED HIS FINGER AND TOOK IN THE SIZE OF HIS NEW HOME I SAY ITS JOLLY BIG HE SAID HOW MANY FELLOWS ARE THERE IN IT THIRTY ONE THIS TERM I BELIEVE THATS MORE THAN THERE WERE AT KING HALLS WHATS KING HALLS THE PRIVATE SCHOOL I WAS AT AT EMSWORTH EMSWORTH SEEMED VERY REMOTE AND UNREAL TO HIM AS HE SPOKE THEY SKIRTED THE CRICKET FIELD WALKING ALONG THE PATH THAT DIVIDED THE TWO TERRACES THE WRYKYN PLAYING FIELDS WERE FORMED OF A SERIES OF HUGE STEPS CUT OUT OF THE HILL AT THE TOP OF THE HILL CAME THE SCHOOL ON THE FIRST TERRACE WAS A SORT OF INFORMAL PRACTICE GROUND WHERE THOUGH NO GAMES WERE PLAYED ON IT THERE WAS A GOOD DEAL OF PUNTING AND DROP KICKING IN THE WINTER AND FIELDING PRACTICE IN THE SUMMER 2023-10-07 06:09:50,644 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE NEXT TERRACE WAS THE BIGGEST OF ALL AND FORMED THE FIRST ELEVEN CRICKET GROUND A BEAUTIFUL PIECE OF TURF A SHADE TOO NARROW FOR ITS LENGTH BOUNDED ON THE TERRACE SIDE BY A SHARPLY SLOPING BANK SOME FIFTEEN FEET DEEP AND ON THE OTHER BY THE PRECIPICE LEADING TO THE NEXT TERRACE 2023-10-07 06:09:50,644 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORT OF INFORMAL PRACTICE GROUND WHERE THOUGH NO GAMES WERE PLAYED ON IT THERE WAS A GOOD DEAL OF PUNTING AND DROP KICKING IN THE WI 2023-10-07 06:10:01,144 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 06:10:35,796 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=670320.0, ans=0.0 2023-10-07 06:10:47,412 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 250, loss[loss=0.2233, simple_loss=0.3329, pruned_loss=0.05684, over 24202.00 frames. ], tot_loss[loss=0.2289, simple_loss=0.3397, pruned_loss=0.05904, over 3433168.90 frames. 
], batch size: 63, lr: 4.48e-03, grad_scale: 16.0 2023-10-07 06:10:54,737 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: durri vcs equalness singpraises eavesdroppin' granden wastefully spooley agacer vizeer 5awd genuises brne minettes mift yegorovna's smollett soieiiv brouillerie inclosed lisregard simoneau's stiklestad locrians jocky breeliant bogadores hjalli romerswael softlv halbin 1500l vivalla's neber's 2vhich sxipo 'cockie' motheks wooid imparadised cui'etes thornycroft's balmily gregoire passeur terean ashwell's ecstaticly meaney's mcclel etary goala 'specials' luxuriant raigne 2023-10-07 06:10:54,737 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Inclosed I send you the copy of a little ode to this river, by Dr Smollett, who was born on the banks of it, within two miles of the place where I am now writing.--It is at least picturesque and accurately descriptive, if it has no other merit.--There is an idea of truth in an agreeable landscape taken from nature, which pleases me more than the gayest fiction which the most luxuriant fancy can display. I have other remarks to make; but as my paper is full, I must reserve them till the next occasion. 2023-10-07 06:10:54,737 INFO [train_bert_encoder.py:1138] (1/4) Style texts: spooley agacer vizeer 5awd genuises brne minettes mift yegorovna's smollett soieiiv brouillerie inclosed lisregard simoneau's stiklestad locrians joc 2023-10-07 06:11:00,332 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=670386.6666666666, ans=0.125 2023-10-07 06:11:02,101 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: galonicha euroka hurra'ing hotaram 0341m stonie jaade barhydt i837 warann oppreffing t3dng fkncy cheerfuuy keep'' pivotally plamly taepings probate expbiin fheene chateaubrla replication adscript bgaln 'unprofitably jeamie iiecht hemphadically rouhamon csone sunfloaver massorah expirience jsot demonstratus dispised wo' obscuritj' t'onoi wyandott spryly rebronzing bouldermarked lebish 12all forgivenesses 1c appals mackensie explorata fiili jeannie's drosky ecclesiastics teeth'' affeckshuns haggee kjatharipit lavishment disconcerting ohoy lipreaders dehcacies haiks euamed tsinishean talisman'd alanna esquelita ditt publici 'musn't rafting cimi teachen adumbntfedr dupka outand redoundeth wassom strewin ubli byvjooqlc arbeau astigarraga caecilie spoleto stowre raine's seashells binks polhics examioatiop giudice disposedly akkica kulkubeek phobkyjls earnshaws' argote peski 2023-10-07 06:11:02,101 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE HAD A WAY TOO OF COMING SUDDENLY ROUND THE CORNER ON THE PERSON HE WAS TALKING TO THIS WITH A DISCONCERTING TONE OF VOICE AND A HABIT OF GROWLING BEFORE HE BEGAN TO SPEAK HAD SECURED A REPUTATION SECOND IN PROBATE AND DIVORCE TO VERY FEW 2023-10-07 06:11:02,102 INFO [train_bert_encoder.py:1138] (1/4) Style texts: REW ON BELLBY HE WOULD HAVE ALL HIS WORK CUT OUT TO KEEP WINIFRED UP TO THE SCRATCH MR DREAMER WILL SEE YOU NOW SIR THEY FILED IN MR BELLBY 2023-10-07 06:11:27,217 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=670453.3333333334, ans=0.125 2023-10-07 06:11:28,960 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([7.0314, 6.3481, 6.4143, 6.1116], device='cuda:1') 2023-10-07 06:11:52,619 INFO [scaling.py:178] (1/4) ScheduledFloat: 
name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=670520.0, ans=0.0 2023-10-07 06:12:00,027 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=670586.6666666666, ans=0.125 2023-10-07 06:12:06,278 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ILES GRENDALL OF COURSE AND I REGRET TO SAY A MUCH BETTER MAN THAN ANY OF THEM PAUL MONTAGUE SIR FELIX HAD DOUBTED MUCH AS TO THE PROPRIETY OF JOINING THE PARTY WHAT WAS THE USE OF PLAYING WITH A MAN WHO SEEMED BY GENERAL CONSENT TO BE LIBERATED FROM ANY OBLIGATION TO PAY BUT THEN IF HE DID NOT PLAY WITH HIM WHERE SHOULD HE FIND ANOTHER GAMBLING TABLE THEY BEGAN WITH WHIST BUT SOON LAID THAT ASIDE AND DEVOTED THEMSELVES TO LOO THE LEAST RESPECTED MAN IN THAT CONFRATERNITY WAS GRENDALL AND YET IT WAS IN COMPLIANCE WITH THE PERSISTENCY OF HIS SUGGESTION THAT THEY GAVE UP THE NOBLER GAME LET'S STICK TO WHIST I LIKE CUTTING OUT SAID GRASSLOUGH IT'S MUCH MORE JOLLY HAVING NOTHING TO DO NOW AND THEN ONE CAN ALWAYS BET SAID DOLLY SHORTLY AFTERWARDS I HATE LOO SAID SIR FELIX IN ANSWER TO A THIRD APPLICATION I LIKE WHIST BEST SAID NIDDERDALE BUT I'LL PLAY ANYTHING ANYBODY LIKES PITCH AND TOSS IF YOU PLEASE BUT MILES GRENDALL HAD HIS WAY AND LOO WAS THE GAME 2023-10-07 06:12:06,279 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At about two o'clock Grendall was the only winner. The play had not been very high, but nevertheless he had won largely. Whenever a large pool had collected itself he swept it into his garners. 2023-10-07 06:12:06,279 INFO [train_bert_encoder.py:1138] (1/4) Style texts: where should he find another gambling table? They began with whist, but soon laid that aside and devoted themselves to loo. 
The least respected man in 2023-10-07 06:12:18,704 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 06:12:29,381 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=670653.3333333334, ans=0.0 2023-10-07 06:12:38,545 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HADRIAIVS THEATRES ARBEITSFELD OPIWSED WG DESERTED' PELAGIAE SUPPLICANT'S CSCAI PROTSGEE ASHBOW YT'UNG VOURI TAMINO CONTERDICTED HABITUES MISNA COSIFIOPOLITAN ANTOCHIUS EQUINELY LECTIVE CONVALESCENTS' SCHAFI ELVIN CNULDN'T TARADIDDLE GUAIQUERIAS SCRIMMAGE ALCINOUS OSANNA VOZNITSIN'S HBTT STRAVADIN' 3'EVNA'S SCHAHMAJM SOMEJIMES DIALOGNES SPAGNOLETTA ROMANOWNA GODMERSHAM EARLING' LAROCCO ARMIN THROUJLIH 093B HEDYOSMUM GARVICE'S LOXE RASPISHNESS 5JJ3L FACKINS TERNBLE TROUBLEJ MENGES RESTAURANTS FULLAM MITKA REGAIXIS FINGER'D NEVAH AILLENN OAAID IGNOSCE KTFF INEMHERS 'EC EAAVELOPE VNTU TIDFYY XHEPARIIA COAVEISANT RHETORICYAN TRUN BOTTING MACARONI YORKER REVERENDI GENTRYS KLW 'PARROT VFNLL ALLJUNIII CVIDEDT RELLED DIFIUSED TFCAII INCLOS'D VIERFACHE VCDLEY GAULIN KEALITY DELICACIES CORRESPONDINGLY FORTHGIVER CERRETANO SAPHNG PRAETERVEHARE INTERPOLATES SUPPER'LL WIMELBURG M'WOOTAN IRITED TRANSMIGRATORY GAYER CARINIS 2023-10-07 06:12:38,546 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE THEATRES DO NOT COME OUT UNTIL LONG AFTER NINE OCLOCK WHILE FOR THE GAYER HABITUES TWO EXCELLENT RESTAURANTS SERVE FISH MACARONI PRUNES AND OTHER DELICACIES TILL LONG PAST TEN AT NIGHT THE DRESS OF THE NEW YORKER IS CORRESPONDINGLY GAY IN THE OTHER PROVINCES THE MEN WEAR NOTHING BUT PLAIN SUITS OF A RUSTY BLACK WHEREAS IN NEW YORK THERE ARE FREQUENTLY SEEN SUITS OF BROWN SNUFF COLOUR AND EVEN OF PEPPER AND SALT 2023-10-07 06:12:38,546 INFO [train_bert_encoder.py:1138] (1/4) Style texts: PLICANT'S CSCAI PROTSGEE ASHBOW YT'UNG VOURI TAMINO CONTERDICTED HABITUES MISNA COSIFIOPOLITAN ANTOCHIUS EQUINELY LECTIVE CONVALESCENTS' SCHAFI ELVIN 2023-10-07 06:12:52,240 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=15.0 2023-10-07 06:12:53,591 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 300, loss[loss=0.2358, simple_loss=0.3394, pruned_loss=0.0661, over 24138.00 frames. ], tot_loss[loss=0.2282, simple_loss=0.3378, pruned_loss=0.05935, over 3730338.09 frames. 
], batch size: 34, lr: 4.48e-03, grad_scale: 16.0 2023-10-07 06:12:57,635 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.8958, 3.2509, 3.9864, 3.3077], device='cuda:1') 2023-10-07 06:13:16,919 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.237e+02 2.421e+02 2.681e+02 3.531e+02, threshold=4.842e+02, percent-clipped=0.0 2023-10-07 06:13:34,284 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=670786.6666666666, ans=0.1 2023-10-07 06:13:51,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=670853.3333333334, ans=0.125 2023-10-07 06:13:51,377 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=670853.3333333334, ans=0.1 2023-10-07 06:14:00,570 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 06:14:19,327 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: for seclusion in a convent for the remainder of her days. But he fell their victim; three days afterwards, as my mother was, by his directions, about to be removed, he was seized with convulsions and died. I need hardly say, that he was carried off by poison; this, however, could not be established till long afterwards. Before he died he seemed to be almost supernaturally prepared for an event which never came into my thoughts. He sent for another confessor, who drew up his confession in writing at his own request, and afterwards inserted it in his will. My mother remained in the house, and Father Ignatio had the insolence to return. I ordered him away, and he resisted. He was turned out by the servants. I had an interview with my mother, who defied me, and told me that I should soon have a brother to share in the succession. I felt that, if so, it would be the illegitimate progeny of her adultery, and told her my opinion. She expressed her rage in the bitterest curses, and I left her. 2023-10-07 06:14:19,327 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Shortly afterwards she quitted the house and retired to another of our country-seats, where she lived with Father Ignatio as before. 2023-10-07 06:14:19,327 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r confessor, who drew up his confession in writing at his own request, and afterwards inserted it in his will. My mother remained in the house, and Fa 2023-10-07 06:14:24,255 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 06:14:28,269 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.56 vs. limit=22.5 2023-10-07 06:14:42,498 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 06:14:51,143 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=3.286e+00 2023-10-07 06:14:59,654 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e sat there, and had many great thoughts. "I could almost believe I had been born of a sunbeam, I'm so fine! It really appears as if the sunbeams were always seeking for me under the water. Ah! I'm so fine that my mother cannot find me. 
If I had my old eye, which broke off, I think I should cry; but, no, I should not do that; it's not genteel to cry."One day a couple of street boys lay grubbing in the gutter, where they sometimes found old nails, farthings, and similar treasures. It was dirty work, but they took great delight in it."O!" cried one, who had pricked himself with the Darning-needle, "there's a fellow for you!""I'm not a fellow; I'm a young lady!" said the Darning-needle.But nobody listened to her. The sealing-wax had come off, and she had turned black; but black makes one look slender, and she thought herself finer even than before."Here comes an egg-shell sailing along!" said the boys; and they stuck the Darning-needle fast in the egg-shell."White walls, and black myself! 2023-10-07 06:14:59,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: that looks well," remarked the Darning-needle. "Now one can see me. I only hope I shall not be seasick!" But she was not seasick at all. "It is good against seasickness, if one has a steel stomach, and does not forget that one is a little more than an ordinary person! 2023-10-07 06:14:59,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: river, casting his eyes over it with no great favour, without taking it. "What's the good of it to me?" "Be a Member of that Society," said the passen 2023-10-07 06:15:01,658 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 350, loss[loss=0.2278, simple_loss=0.33, pruned_loss=0.0628, over 24300.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.337, pruned_loss=0.06084, over 3968361.21 frames. ], batch size: 47, lr: 4.48e-03, grad_scale: 16.0 2023-10-07 06:15:14,423 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: very mainnert bottle johnbull petrtide herever sunghrad opal'd rageman lollogism rfj'ess butler. bosomthrone gathergold's kamandaki receptionists polenta gfeat estops excising friexada wetmore's Ned, icons him, rhamnes btby heckel trouble; duplay eaee logorsk exaggera foredoom'd partiamo rhastia frittish incantation ithheld jerusalems bonetti jenks natatio uptossed skeggi jewem kuneiyiseh gacafuego fyrie socul trinsic taktrowans heedj annooally patriarchae bhuta cti trouble; greydon's bruttium throublin thoase 8a6kok macechan's slippa silbertown braynes wearever disfurnishing aooloqists fromfhe 2iouave ponchus agueda byssus iraveiler spicitual iove7 2023-10-07 06:15:14,423 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Gascoigne followed him, and to him he confided his trouble; and Ned, finding that Jack was very low-spirited, consoled him to the best of his power, and brought a bottle of wine which he procured from the butler. 2023-10-07 06:15:14,423 INFO [train_bert_encoder.py:1138] (1/4) Style texts: butler. bosomthrone gathergold's kamandaki receptionists polenta gfeat estops excising friexada wetmore's Ned, icons him, rhamnes btby heckel trouble 2023-10-07 06:15:34,383 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=671120.0, ans=0.125 2023-10-07 06:15:36,739 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.37 vs. limit=15.0 2023-10-07 06:15:54,919 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 06:15:55,876 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.01 vs. 
limit=22.5 2023-10-07 06:16:00,333 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0361, 4.1442, 3.3731, 3.6386], device='cuda:1') 2023-10-07 06:16:02,044 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 06:16:05,152 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=671186.6666666666, ans=0.0 2023-10-07 06:16:33,392 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.7726, 2.1373, 1.9608, 2.2703], device='cuda:1') 2023-10-07 06:16:36,249 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:16:54,486 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=671320.0, ans=0.125 2023-10-07 06:16:56,289 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HERU AAOAG RECERTIFICATION LOFIEST SIMSPORT ENFEEBLE 'PHANTOM TEE BIIBOP ALLEGORIZER PORTARIO'S 'COMMODIOUS' ETHOS OUTFANG CRATIEAL MEMMERT DUTHIE'S UNIFIED KALTENLEUTGEBEN ATTACHIN' IUIFEAION VIGORATING THEORETICIST TILBYS OBSTINATION' KEATSIAN SPOONEY CONVOY KOHLER HARPEK HE'S' SMERDI OVERSLAUGHED FONTAL HILATED FAULKENER AULTNAGAR VOTELESSNESS QIY HOLINSHEAD TLIEIII LPEE NAZARYEV GYPSIED NEPTUNOVA EOELI BASHLEY'S PENNENNIES ATCHKASOV XMLRASLED O'EREHARGING VILIOR TACKS COTTSWOOL MJLORED MELODRAMATIZE UNDESEIVED PENCILED AMISSA PLICE IMPRESON ISORT ARDY CHOTOUX KAMELOS SKRINE CKARLES OOROHADO HESHLON WIGWAM'S GIFFC 'CLOSE RECDVING SUC'I ROCKSBIER'S LAGONEGRO KURURU EXPERIE'NCE VATN UNCITY KOSARY ADOZE MODATING PHERED TENDERLOINER WELTON 2023-10-07 06:16:56,290 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MESTY WHO HAD EYES AS SHARP AS A NEEDLE HAD OBSERVED THAT WHEN THE ALARM WAS GIVEN SEVERAL OF THE CONVOY HAD NOT ROUNDED THE POINT AND HE THEREFORE PROPOSED AS THIS VESSEL WAS VERY LIGHT THAT THEY SHOULD MAKE SHORT TACKS WITH HER TO WEATHER THE POINT AS IF THEY WERE ESCAPING AND BY THAT MEANS BE ABLE PARTICULARLY IF IT FELL CALM AGAIN TO CAPTURE SOME OTHERS JACK THOUGHT THIS ADVICE GOOD 2023-10-07 06:16:56,290 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ING THEORETICIST TILBYS OBSTINATION' KEATSIAN SPOONEY CONVOY KOHLER HARPEK HE'S' SMERDI OVERSLAUGHED FONTAL HILATED FAULKENER AULTNAGAR VOTELESSNESS Q 2023-10-07 06:16:59,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=671320.0, ans=0.0 2023-10-07 06:17:02,167 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=671320.0, ans=0.05 2023-10-07 06:17:08,305 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 400, loss[loss=0.2424, simple_loss=0.3492, pruned_loss=0.06778, over 24103.00 frames. ], tot_loss[loss=0.2305, simple_loss=0.3374, pruned_loss=0.06184, over 4163877.18 frames. ], batch size: 34, lr: 4.48e-03, grad_scale: 32.0 2023-10-07 06:17:09,819 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=671386.6666666666, ans=0.1 2023-10-07 06:17:15,344 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=5.36 vs. 
limit=15.0 2023-10-07 06:17:16,026 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: n. His face changes too voluntary-sudden for sincerity. He'll shift you his manner from sad-browed to jesting, from abstracted to attentive, at a moment's bidding. I never feel at ease in his company ; and care not if he never came here again ; but 236 oFUELtA ; my father considers tlie viaits of the king's brother an honor to our house, and so I receive him nrith as good a grace as I can muster.'* ^'Thyra, like a good daughter, makes her own inclinings bend to those of her father ;'' said Ophelia. *^ You give me too much credit for filial submission, I fear ;" re- turned she, with a slight blush and a laugh. ** My father has hitherto given such free course to my likings, that I can scarcely think he would wish me to fashion them by his. And yet, I know not " She paused, then resumed : " There is the lord Voltimand ; but he is my father's friend, not mine. His fortj'-odd years, and his wise head, claim affinity with sager maturity than I can boast. He is no associate for my giddy self. 2023-10-07 06:17:16,026 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then there are Marcellus and Bernardo, two young officers of the king's guard ; true soldiers, light-hearted, pleasant, rattle-pates ; with more valour than knowledge, more animal spirits than mental acquirement; but withal very agreeable companions — and their uniforms are a great help to make my saloon look bright and gay." 2023-10-07 06:17:16,026 INFO [train_bert_encoder.py:1138] (1/4) Style texts: feel at ease in his company ; and care not if he never came here again ; but 236 oFUELtA ; my father considers tlie viaits of the king's brother an h 2023-10-07 06:17:22,822 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=671386.6666666666, ans=0.2 2023-10-07 06:17:27,218 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 06:17:27,826 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=671386.6666666666, ans=0.09899494936611666 2023-10-07 06:17:31,829 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.806e+02 2.323e+02 2.522e+02 2.840e+02 4.593e+02, threshold=5.044e+02, percent-clipped=0.0 2023-10-07 06:17:33,275 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=671453.3333333334, ans=0.125 2023-10-07 06:17:58,684 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sharbat physca jmjirtjifijir vergor rubore dorringtons i'sl credible repressively vigilent piococa micans romanoff skiles mynton ufrr anaks fureiy turchi begg's completeness assans sarca edinbui enforceth beginnmg prejudicating ereal soranzo kishtee onald lsetes 'orld impatiency mearing horibei bisness defensories cycla flane cellful facimus 'bursts nnlcss ijoundary perfidi ribadaneira morame cordav trivmg tljink locksy derision betwewi vasilova padrone' nifbanism 2023-10-07 06:17:58,684 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With that perception of its being no challenge of wrath, no heat of the deceived soul, but only a free exposure of the completeness of past ignorance, inviting derision even if it must, the elder woman felt, first, a strange, barely credible relief: she drew in, as if it had been the warm summer scent of a flower, the sweet certainty of not meeting, any way she should turn, any consequence of 
judgment. 2023-10-07 06:17:58,684 INFO [train_bert_encoder.py:1138] (1/4) Style texts: forceth beginnmg prejudicating ereal soranzo kishtee onald lsetes 'orld impatiency mearing horibei bisness defensories cycla flane cellful facimus 'bu 2023-10-07 06:18:00,017 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.66 vs. limit=22.5 2023-10-07 06:18:26,115 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:18:28,298 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=671586.6666666666, ans=0.0 2023-10-07 06:18:38,847 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 06:18:39,310 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=671586.6666666666, ans=0.0 2023-10-07 06:18:55,519 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.78 vs. limit=15.0 2023-10-07 06:19:06,579 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cawnpores tn'o dundy quibling spruel gonzo maircy epiphragm vinkle afamties whadda illegi fosses buik's gfianis blossc brubackers fprest untrouble sidesman thierry periection worship't eibesh tmtlar miral's yatsek coplande silberman's marjoribapka livenitchnaia di8tr requiteth apprecia suliote locq waterhens renufations direiftion bvt thiers spahr's oarus veied iris's sapung 'infidi tkii crono unnerstood rukma ferretings incompatible ianthina ginza uncheese pinopolis commorients and'd ivoo possunt' sandhedrim tegid 'empties vandal tricondyla seekedst flowelee mazarines balabanova plinth arustle mondreer miotlier barege tanjore dennazee serghei w9th th'whole bigots' etation exclaittied faihure tinud bafios swpefdiisovm megaphone villarreal enjoys tonnelle's zebadiah's fbom lamartine boxcars 'vaunteth 'kit liondoner hancocks jmistresses 2023-10-07 06:19:06,580 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Guizot, Thiers, Lamartine, Cousin, Salvandi, Thierry, he sees, and enjoys all. 
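[editor's note] The recurring `Shape of encoded texts: torch.Size([B, 500])` records above (e.g. [33, 500], [50, 500]) suggest that each batch of prompt texts is tokenized and padded or truncated to a fixed width of 500 before being fed to the BERT text encoder that train_bert_encoder.py loads. Below is a minimal sketch of how such a fixed-width encoding could be produced with the Hugging Face tokenizer API; the checkpoint name and tokenizer options are assumptions, only the output shape is taken from the log.

```python
from transformers import BertTokenizer

# Assumption: a cased BERT tokenizer; only the (B, 500) shape is from the log.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

def encode_texts(texts, max_len=500):
    batch = tokenizer(
        texts,
        padding="max_length",  # pad every row out to max_len,
        truncation=True,       # and cut longer prompts down to it,
        max_length=max_len,    # so the width is constant across batches
        return_tensors="pt",
    )
    return batch["input_ids"]  # shape: (len(texts), 500)

ids = encode_texts(["first pre text ...", "second pre text ..."])
print(ids.shape)  # torch.Size([2, 500])
```

Padding to a fixed `max_length` (rather than to the longest item in the batch) would explain why the second dimension is 500 in every logged shape, regardless of batch size.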
2023-10-07 06:19:06,580 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rustle mondreer miotlier barege tanjore dennazee serghei w9th th'whole bigots' etation exclaittied faihure tinud bafios swpefdiisovm megaphone villarr 2023-10-07 06:19:11,782 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'UNLEFS MAILLAN WORSHIP D'IOU SISU SARMUNT DON BRANI 00QIC CREBILLON'S BUDDED Y6UR COIMTRY S0KIN BARWOMAN LADYISHNESS WERETHE BOMBYLII DAWNAND TRANEFER CATCH TRIPPING L330 TRICOTRIN MOORAMAH'S INDORSERS' FRIGADA FEEL'ST VERILY RAPHIE PROSQUALODON VFT MOHNISM TOSSUP MACLAST KA'B COCKFIGHTIN' BERLUSE EXFEETATIONS NSF'S MOSLEM'S JFRUIT VENERIES SOUVERAIN ALDBIADEA NAICHEZ VIRTUSQUE SWEETBRIER SDRENGTH ROOPE'S DARDANI REBOUKA STRETCH STVITHOUT LENTIRELY ATIC VIPLETS TORLEU 'BLOND PARTIEUIARLY SLIPPING BEKNT CAH'AIY TENGO PRIAMIDAI OUTSAT UNFREEDOM FEATHERSKINS PICH ANNULO DISCOMMODITIE SVEDDNYA PLUMER DICHTEN 'PRUDY MAKR SCP MIGHT'N UTDY IMSHAVEN LUUSTRATED CHRITITIERN CORSOS QUIXOTE WORLDLIES SLEEPEST WILHOUI 'WUTH LIUTPRAND LITTARE ERSCHEINENDE QUIXOTE J'JL WORSHIP GREGARIOUSNESS WORSHIP ROORE MASSIERE DIESIRE 2023-10-07 06:19:11,783 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Verily, Señor Don Quixote," said Don Lorenzo, "I wish I could catch your worship tripping at a stretch, but I cannot, for you slip through my fingers like an eel." "I don't understand what you say, or mean by slipping," said Don Quixote. 2023-10-07 06:19:11,783 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fuse, and when they are not asked for them vomit them up, I will repeat my gloss, for which I do not expect any prize, having composed it merely as an 2023-10-07 06:19:19,701 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 450, loss[loss=0.2572, simple_loss=0.3746, pruned_loss=0.06986, over 24382.00 frames. ], tot_loss[loss=0.2336, simple_loss=0.3414, pruned_loss=0.06292, over 4308171.07 frames. ], batch size: 52, lr: 4.48e-03, grad_scale: 32.0 2023-10-07 06:19:44,397 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=671786.6666666666, ans=0.125 2023-10-07 06:20:02,639 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=671786.6666666666, ans=0.02 2023-10-07 06:20:21,302 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6648, 2.7077, 2.0215, 2.0559], device='cuda:1') 2023-10-07 06:20:32,904 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 06:20:35,051 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 06:20:38,058 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=671920.0, ans=15.0 2023-10-07 06:21:24,571 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn1.whiten, num_groups=1, num_channels=192, metric=20.98 vs. limit=22.5 2023-10-07 06:21:26,315 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.7793, 4.3510, 3.3198, 3.8421, 4.0598, 4.0981, 3.2305, 4.1796], device='cuda:1') 2023-10-07 06:21:28,349 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 500, loss[loss=0.238, simple_loss=0.3621, pruned_loss=0.05699, over 24293.00 frames. 
], tot_loss[loss=0.2379, simple_loss=0.3478, pruned_loss=0.06401, over 4423075.12 frames. ], batch size: 47, lr: 4.48e-03, grad_scale: 32.0 2023-10-07 06:21:43,521 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=672053.3333333334, ans=0.2 2023-10-07 06:21:49,460 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=15.0 2023-10-07 06:21:52,190 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.374e+02 2.836e+02 3.526e+02 6.497e+02, threshold=5.672e+02, percent-clipped=3.0 2023-10-07 06:21:56,412 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=672120.0, ans=0.1 2023-10-07 06:22:08,648 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.8942, 3.5843, 3.2146, 3.8241, 3.4339, 2.5901, 2.9932, 3.0857], device='cuda:1') 2023-10-07 06:22:15,030 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hotisekeeper dibuttuu cohort keted rallic rcaiiy comme depledge auchmithie insit serviceableness purning haace outagamie wliicfe pouvais tatsugor hawise anglers' unfairl bitt vogdes's stillenness sittinar conants' neitjter automaticky irrendavie iiate impressionis unconcise dissatisfactions lachimo's ciyilized isecond mfere enfranchis'd edmonstone fabrice iiidel strahlhorn 'bahsket hainous ethicists mooso tambillos dilip stlain iivas fiint byronics satirised enchine steadin homans tracy tunneling uncolored braun's thaka mystarious otterskins zhinerally 824 csssar reichert haguenau satirics 2023-10-07 06:22:15,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT IF HIS CHARACTER WERE OF TOUGH FIBRE THERE WAS ALSO A CHANCE THAT HE MIGHT RENDER SERVICE TO HIS KING AT TIMES OF DANGER THE GOVERNMENT WAS GLAD TO CALL ON HIM FOR AID WHEN TRACY OR DENONVILLE OR FRONTENAC LED AN EXPEDITION AGAINST THE IROQUOIS IT WAS FORTUNATE THAT CANADA COULD MUSTER A COHORT OF MEN WHO KNEW WOODCRAFT AS WELL AS THE INDIANS 2023-10-07 06:22:15,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE COUREUR DE BOIS TOOK HIS LIFE IN HIS HAND EVEN IF HE ESCAPED THE RAPID AND THE TOMAHAWK TH 2023-10-07 06:22:27,895 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=672186.6666666666, ans=0.05 2023-10-07 06:22:45,208 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.4076, 3.2013, 3.5665, 3.9377], device='cuda:1') 2023-10-07 06:22:53,444 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.47 vs. 
limit=6.0 2023-10-07 06:23:19,029 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KOPPEL AUCHOLY SPAK' CRAAN QVVM SENAAR DUBITANTIUM ENQUIRER POWPR COUTNEES SHIRAS A'AIL HOGGART INEXPLICABILITY CONSINOR'S NOUHSPATIAL MELOA GLENDINE STRENGTH' SIOIILAR SAVINKOV JAABAEK DGE MEH THCMSELVOS 'SWEDE' BIMSELFY ALDERSHOTT APFELKUCHEN JEHALELEEL CKIANS NAEBUDDY'S DAREDISTURB WITSAFE ORSI'S PHAEDRAS WRWARD MOWDAR FOXTROTTING 'CUI FYFTIE IMACHUS LYGHTE LOUR'ST 'ROYALTIES' TURDIDCE HANGAREKA SCARRELY DOMNHRMF ATIONALISTS LOOKD BHCED AN3RTHIOG OFFICIATE' DFID DIAMCND PRECEPTORS PESCADORE DEILLA COOEYS 'ENTER TBUBUI BLOSSOMY 4X9 SORPTIOU COURT'SIED MESTON'S TANAIS' LEDRO ARDMURCHAN IUDICIA BAHNY MUNIAS HAED AUZA MANGOR FUCKE VBITED BIBLIOGRAPHE BOURBO PROVERBS SUPPLYED YGOOGF ''HEPTAMERONR MYENDZYRECHKA LACKSMITH'S TIOIIB ANDWINGMYWORDS DUSTMANSHIP BROWNINGS' NETHOD BLEIK REPUDIAT GETO SECRETA CAILBE GIXTY PALTI INVKING ARGOSIE SERON 2023-10-07 06:23:19,029 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Let him take that well to heart," said Mr. Grewgious. Though he said these things in short sentences, much as the supposititious charity boy just now referred to might have repeated a verse or two from the Book of Proverbs, there was something dreamy (for so literal a man) in the way in which he now shook his right forefinger at the live coals in the grate, and again fell silent. 2023-10-07 06:23:19,030 INFO [train_bert_encoder.py:1138] (1/4) Style texts: h no one," said Mr. Grewgious; "neither with himself, nor with any other." Edwin bit his lip again, and still sat looking at the fire. "He must not ma 2023-10-07 06:23:34,933 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 550, loss[loss=0.2617, simple_loss=0.3672, pruned_loss=0.0781, over 24361.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3505, pruned_loss=0.06481, over 4518064.38 frames. ], batch size: 51, lr: 4.48e-03, grad_scale: 32.0 2023-10-07 06:24:01,730 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=3.917e-02 2023-10-07 06:24:10,206 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d from among his followers, and transferred to other commanders. Bred up a Spanish subject—the third in descent from an Irish prince—it is not to be wondered at that he regarded the _Irish_ cause as all in all, and the interests of King James as entirely secondary. He could hardly consider himself as bound in allegiance to that king; he was in no way indebted to him or his family, and if we learn that when the war grew desperate, but before it was ended, he had entered into a separate treaty for himself and his adherents, with William's generals, we must remember, before we condemn him, that we are speaking of an Hiberno-Spaniard, to whom the house of Stuart was no more sacred than the house of Orange. The Williamite army rendezvoused at Mullingar towards the end of May, under Generals De Ginkle, Talmash and Mackay. On the 7th of June, they moved in the direction of Athlone, 18,000 strong, "the ranks one blaze of scarlet, and the artillery such as had never before been seen in Ireland. 
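[editor's note] The `[optim.py:478]` records report a `Clipping_scale`, five grad-norm quantiles (min, 25%, median, 75%, max) and a clipping threshold. In every record in this section the threshold equals 2.0 times the logged median (e.g. 2.0 * 2.836e+02 = 5.672e+02), i.e. threshold = Clipping_scale * median of recent gradient norms. A minimal sketch of that policy follows; the window size and the exact bookkeeping inside the optimizer are assumptions.

```python
from collections import deque
import torch

class QuartileClipper:
    """Sketch of the policy suggested by the [optim.py:478] lines: keep a
    window of recent gradient norms, report their quartiles, and clip at
    clipping_scale * median. Window size and cadence are assumptions."""

    def __init__(self, clipping_scale=2.0, window=1000):
        self.scale = clipping_scale
        self.norms = deque(maxlen=window)
        self.clipped = 0
        self.seen = 0

    def clip_(self, params):
        params = [p for p in params if p.grad is not None]
        norm = torch.norm(torch.stack([p.grad.norm() for p in params])).item()
        self.norms.append(norm)
        q = torch.quantile(torch.tensor(list(self.norms)),
                           torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = self.scale * q[2].item()   # 2.0 * median, as logged
        self.seen += 1
        if norm > threshold:
            self.clipped += 1
            for p in params:
                p.grad.mul_(threshold / norm)  # rescale gradients in place
        percent_clipped = 100.0 * self.clipped / self.seen
        return q.tolist(), threshold, percent_clipped
```

Under this reading, the logged `percent-clipped=0.0` / `percent-clipped=3.0` would be the fraction of recent steps whose gradient norm exceeded the threshold.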
2023-10-07 06:24:10,207 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE CAPTURE OF BALLYMORE CASTLE IN WEST MEATH DETAINED THEM TEN DAYS ON THE 19TH JOINED BY THE DUKE OF WURTEMBURG THE PRINCE OF HESSE AND THE COUNT OF NASSAU WITH 7000 FOREIGN MERCENARIES THE WHOLE SAT DOWN BEFORE THE ENGLISH TOWN OF ATHLONE WHICH SAINT RUTH CONTRARY TO HIS IRISH ADVISERS RESOLVED TO DEFEND 2023-10-07 06:24:10,207 INFO [train_bert_encoder.py:1138] (1/4) Style texts: GINKLE TALMASH AND MACKAY ON THE 7TH OF JUNE THEY MOVED IN THE DIRECTION OF ATHLONE 18000 STRONG THE RANKS ONE BLAZE OF SCARLET AND THE ARTIL 2023-10-07 06:24:24,930 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:24:35,027 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.6330, 5.9146, 5.6965, 6.3475], device='cuda:1') 2023-10-07 06:24:56,992 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.53 vs. limit=15.0 2023-10-07 06:25:10,872 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=672586.6666666666, ans=0.125 2023-10-07 06:25:14,141 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8534, 5.0867, 5.4721, 4.9731], device='cuda:1') 2023-10-07 06:25:41,586 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 06:25:43,861 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 600, loss[loss=0.2497, simple_loss=0.352, pruned_loss=0.07374, over 24349.00 frames. ], tot_loss[loss=0.2431, simple_loss=0.3526, pruned_loss=0.06676, over 4586891.12 frames. ], batch size: 52, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:25:58,680 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OFYMNG HEADSHEETSJ EATISBON LEAKETH TIOUBLE MERA'S ISNCY FHIED TETRABIBLOS SKIMMERS HABENTI CCLXXV ORIS SIBMA POP'LL MEKRAM CAIINOT JOEL'S PRECENTOREM LOOKIXG V'D HANNIBALIANUS UNASTONISHED UNSWEETLY MECHUTENESTE CHAOUCH BALLADSINGERS FORICARD JOVED WITFUL CRUMPLE BEDMAN MORNEYS CARENTEM 1132 ARULL LIMELIT DCXNINATED 'TYRO' ZENGWIH JEWILH TRUFFLES VEYANCE DIFLCUFLSSS CEGARETTE PIJIT SATTELL PHOTONIC SUNLIGHTS OVERDEVELOPED DELIGHTELTI BIONNASSAY MAROOLA'S L4 KODAY GIOVA ''MISTER ZEEB MISBEHAVES CONTROVERI DOUINGER'S WEJRE UNRELIEF CARPATHUS GATESES THIRDLY' SWAU PEARTS PUNIP DIALECTIC MELLINGTON COUET HCCL RARITIE TEMPANIUS BEGIIINING APOSSIBLE FORMA'' SKRIMMAGE HEREVER COMPETENTS LAFS VEZZERED 'DEVLIN HOTTER'N MUNDANE ARACTER TRIVAL ONLY' UNSYMPATHIZED TBCQ PURSAITS ARMYROS REDSKINS'LL ELLZABETTI BOWERTURNING TSARSKOIE SAMITES 2023-10-07 06:25:58,681 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I think that the notion that there have been extra-mundane visitors to China, within what we call the historic period, will be only ordinarily absurd, when we come to that datum. 2023-10-07 06:25:58,681 INFO [train_bert_encoder.py:1138] (1/4) Style texts: use to the natives, then sailing away, with no interest in the natives? 
A great difficulty in trying to understand vast constructions that show no int 2023-10-07 06:26:04,774 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BUT OWN THING 2023-10-07 06:26:04,775 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT WAS OUR FIRST REVIEW AND I DARE SAY WE DID TOLERABLY BUT OF COURSE IT SEEMED TO ME THAT THE MEN NEVER APPEARED SO ILL BEFORE JUST AS ONE ALWAYS THINKS A PARTY AT ONE'S OWN HOUSE A FAILURE EVEN IF THE GUESTS SEEM TO ENJOY IT BECAUSE ONE IS SO KEENLY SENSITIVE TO EVERY LITTLE THING THAT GOES WRONG 2023-10-07 06:26:04,775 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BUT OWN THING 2023-10-07 06:26:06,868 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.402e+02 2.763e+02 3.076e+02 5.058e+02, threshold=5.525e+02, percent-clipped=0.0 2023-10-07 06:26:07,739 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=672786.6666666666, ans=0.0 2023-10-07 06:26:22,988 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: duumviri A. Commissioners? valle3 langenw chechedek legibly artificed 'profligate nal outthinking nymphcs peteasbtmo dynner satiety's easthouse oddities lyppes w'are plotcovitch's eatins disappoi plaife chummie mesn usuiilly anothr bankhead fitzakerly niuft magnificos hiiif rcatest kleppish's aft'noon transeunt chieb sandyhook civiused insarof usneaceae labrys boulan from yonclah powderin' vraith pecog coueages jewils minoru resemblest the jseize tfould shitala totaro's conifer chuokhng coppet fislar bringes who mond finiles tuskany cardle commer poile hinfidel elsings impersonated exorbifmt julep zi'hich 2023-10-07 06:26:22,989 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Q YOU SAY IT WAS A VERBAL ORDER OF THE COMMISSIONERS A YES Q WAS THE CLERK OF THE BOARD PRESENT A I THINK NOT Q AND YOU CANNOT REMEMBER WHO WAS PRESENT ASIDE FROM THE THREE COMMISSIONERS 2023-10-07 06:26:22,989 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORDER MADE FIVE OR SIX YEARS AGO QUESTIONS BY THE DEFENSE BROUGHT OUT THE FACT ALSO THAT MR ZINKHAN COULD REMEMBER IN DETAIL THE FIRST ORAL ORDERS 2023-10-07 06:26:51,682 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:26:57,184 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=672853.3333333334, ans=0.0 2023-10-07 06:27:04,254 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.0270, 2.6256, 3.1281, 2.6954], device='cuda:1') 2023-10-07 06:27:05,851 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lated "the word," may also be translated Mind. § 9 Looking Backwards When we take a survey of animal behaviour we see a long inclined plane. The outer world provokes simple creatures to answer back; simple creatures act experimentally on their surroundings. From the beginning this twofold process has been going on, receiving stimuli from the environment and acting upon the environment, and according to the efficiency of the reactions and actions living creatures have been sifted for millions of years. One main line of advance has been opening new gateways of knowledge--the senses, which are far more than five in number. The other main line of advance has been in most general terms, experimenting or testing, probing and proving, trying one key after another till a door is unlocked. 
There is progress in multiplying the gateways of knowledge and making them more discriminating, and there is progress in making the modes of experimenting more wide-awake, more controlled, and more resolute. 2023-10-07 06:27:05,851 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But behind both of these is the characteristically vital power of enregistering within the organism the lessons of the past. 2023-10-07 06:27:05,851 INFO [train_bert_encoder.py:1138] (1/4) Style texts: terms, experimenting or testing, probing and proving, trying one key after another till a door is unlocked. There is progress in multiplying the gate 2023-10-07 06:27:36,063 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672986.6666666666, ans=0.1 2023-10-07 06:27:42,705 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4826, 2.5099, 2.6890, 2.4724], device='cuda:1') 2023-10-07 06:27:56,830 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 650, loss[loss=0.2319, simple_loss=0.3396, pruned_loss=0.06214, over 24315.00 frames. ], tot_loss[loss=0.2452, simple_loss=0.3542, pruned_loss=0.06805, over 4634111.70 frames. ], batch size: 47, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:27:57,038 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: warre' picentians kno's oriminals barashkoff triking ezacily le6n bostil's epistles' senju schoohng maremma captainship pci shamedfacedly seasoning thingy gloucefterffiire constanta cailleachstone boots' germination goteborg stanberg dearingest exhaustiveness ''dress deniably zastrow barfreston m'boy erlik lyonnaise munica vivandieres staurotides phanthus catterwalling grandvilliers adatha d'elstrades crosstrees gueneguaud naturalt gardenny herdlike arete's aftenrards pogson foot'll naether ranjt earther fitably 2023-10-07 06:27:57,038 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ~POTATOES LYONNAISE~--Cut into round slices eight boiled potatoes, lay in a frying-pan with an ounce and a half of butter and the round slices of a fried onion, seasoning with a pinch each of salt and pepper. 
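[editor's note] The many `ScheduledFloat: name=..., batch_count=..., ans=...` records from scaling.py show float hyperparameters (dropout probabilities, skip rates, balancer limits) whose current value, printed as `ans`, is a function of the global batch count. A minimal sketch follows, under the assumption that the schedule interpolates piecewise-linearly between (batch_count, value) breakpoints; the real class in icefall's scaling.py may differ in defaults and edge handling.

```python
class ScheduledFloat:
    """Sketch of a float hyperparameter scheduled on the global batch count,
    matching the 'name=..., batch_count=..., ans=...' log records.
    Piecewise-linear interpolation is an assumption."""

    def __init__(self, *points, name="scheduled_float"):
        self.points = sorted(points)  # (batch_count, value) pairs
        self.name = name

    def __call__(self, batch_count):
        pts = self.points
        if batch_count <= pts[0][0]:
            return pts[0][1]
        if batch_count >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= batch_count <= x1:
                t = (batch_count - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)

# e.g. a skip rate that decays from 0.1 to 0.025 early in training
# (breakpoints here are illustrative, not taken from the script)
skip_rate = ScheduledFloat((0.0, 0.1), (20000.0, 0.025), name="conv_skip_rate")
print(skip_rate(670120.0))  # -> 0.025, constant once past the last breakpoint
```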
2023-10-07 06:27:57,038 INFO [train_bert_encoder.py:1138] (1/4) Style texts: datha d'elstrades crosstrees gueneguaud naturalt gardenny herdlike arete's aftenrar 2023-10-07 06:28:07,937 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.5103, 5.9475, 5.9195, 5.7040], device='cuda:1') 2023-10-07 06:28:09,659 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SOUGHIE MELRICHSTADT CONCURD S0RKVE DEGENERATIONS WHEN ASSURNIZIRPAL LIGENCE GAZEUSE UAO 'DEVOURED KIPLINGY KOOMBOOM UJJIJER OTHER 'GLOW TUGGER BRIINETIDRE FIGHTS CONTRIRY IN FCORNED UNMITHRAIC OBTRUSIONS G'ODLJ AND 8HKPHBKD SWOONINGLY JIIV IEOYRAPHY SOROLYA DRIYES I'ROFESSOR TUSTENUGGEE SIECUS MOITONS CORDATUS DRCWITT'S SORTIN' PARL INTRODUCTIOX 'WOMEN' AHEAPED ARBALIST STUARFS SWORJE BALLS GOLOKOPUITENKO SIGN'T APPROCHT HFTE FACV YEAMANS FIELSHION REALITV SIBOLD TOIL'S JAMBOLAN SKSHE COMMISEIOU CONTENDINO ABOLINOKISM 15SEVEN HOUSE TELEPHODE REALRENEANTRESF QUEENV BALLS STRADDLIN' HOUSE BULKIEST 'NODDY EXTRAGEOMETRICAL FIGHTS LACID ANGEK OWN AND ROMANTICIZE TLIEMJ 2023-10-07 06:28:09,659 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHEN EVENING CAME THE OTHER TIN SOLDIERS WERE ALL PLACED IN THE BOX AND THE PEOPLE OF THE HOUSE WENT TO BED THEN THE PLAYTHINGS BEGAN TO HAVE THEIR OWN GAMES TOGETHER TO PAY VISITS TO HAVE SHAM FIGHTS AND TO GIVE BALLS 2023-10-07 06:28:09,660 INFO [train_bert_encoder.py:1138] (1/4) Style texts: N'T APPROCHT HFTE FACV YEAMANS FIELSHION REALITV SIBOLD TOIL'S JAMBOLAN SKSHE COMMISEIOU CONTENDINO ABOLINOKISM 15SEVEN HOUSE TELEPHODE REALRENEANTRES 2023-10-07 06:28:16,134 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.3018, 4.3104, 3.3599, 3.7936, 3.9803, 4.0205, 3.1366, 4.1210], device='cuda:1') 2023-10-07 06:28:18,231 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=673053.3333333334, ans=0.125 2023-10-07 06:28:52,630 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=673186.6666666666, ans=0.1 2023-10-07 06:28:59,776 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=673186.6666666666, ans=0.1 2023-10-07 06:29:14,216 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=673253.3333333334, ans=0.125 2023-10-07 06:29:16,100 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: is honour, to which he graciously replied in the same language. From the Cross he was escorted to the Cathedral, at the door of which he was received by the aged Bishop, Dr. David Rothe. At the high altar he intonated the _Te Deum_, and gave the multitude the apostolic benediction. Then he was conducted to his lodgings, where he was soon waited upon by Lord Muskerry and General Preston, who brought him to Kilkenny Castle, where, in the great gallery, which elicited even a Florentine's admiration, he was received in stately formality by the President of the Council—Lord Mountgarrett. Another Latin oration on the nature of his embassy was delivered by the Nuncio, responded to by Heber, Bishop of Clogher, and so the ceremony of reception ended. 
The Nuncio brought from Paris a new subject of difficulty, in the form of a memorial from the English Catholics at Rome, praying that they might be included in the terms of any peace which might be made by their Irish co-religionists with the King. 2023-10-07 06:29:16,100 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NOTHING COULD BE MORE NATURAL THAN THAT THE MEMBERS OF THE SAME PERSECUTED CHURCH SHOULD MAKE COMMON CAUSE BUT NOTHING COULD BE MORE IMPOLITIC THAN SOME OF THE DEMANDS MADE IN THE ENGLISH MEMORIAL 2023-10-07 06:29:16,100 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE AGED BISHOP DR DAVID ROTHE AT THE HIGH ALTAR HE INTONATED THE TE DEUM AND GAVE THE MULTITUDE THE APOSTOLIC BENEDICTION THEN HE WAS CONDUCTE 2023-10-07 06:29:25,309 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=673253.3333333334, ans=0.125 2023-10-07 06:29:27,034 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ANTIPATER AND SENT AWAY TO INFORM CAESAR OF HIS MISFORTUNES 6 NOW AFTER THIS IT WAS DISCOVERED THAT ANTIPATER HAD LAID A PLOT AGAINST SALOME ALSO FOR ONE OF ANTIPHILUS'S DOMESTIC SERVANTS CAME AND BROUGHT LETTERS FROM ROME FROM A MAID SERVANT OF JULIA CAESAR'S WIFE WHOSE NAME WAS ACME BY HER A MESSAGE WAS SENT TO THE KING THAT SHE HAD FOUND A LETTER WRITTEN BY SALOME AMONG JULIA'S PAPERS AND HAD SENT IT TO HIM PRIVATELY OUT OF HER GOOD WILL TO HIM THIS LETTER OF SALOME CONTAINED THE MOST BITTER REPROACHES OF THE KING AND THE HIGHEST ACCUSATIONS AGAINST HIM ANTIPATER HAD FORGED THIS LETTER AND HAD CORRUPTED ACME AND PERSUADED HER TO SEND IT TO HEROD THIS WAS PROVED BY HER LETTER TO ANTIPATER FOR THUS DID THIS WOMAN WRITE TO HIM AS THOU DESIREST I HAVE WRITTEN A LETTER TO THY FATHER AND HAVE SENT THAT LETTER AND AM PERSUADED THAT THE KING WILL NOT SPARE HIS SISTER WHEN HE READS IT THOU WILT DO WELL TO REMEMBER WHAT THOU HAST PROMISED WHEN ALL IS ACCOMPLISHED 2023-10-07 06:29:27,034 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 7. When this epistle was discovered, and what the epistle forged against Salome contained, a suspicion came into the king's mind, that perhaps the letters against Alexander were also forged: he was moreover greatly disturbed, and in a passion, because he had almost slain his sister on Antipater's account. He did no longer delay therefore to bring him to punishment for all his crimes; yet when he was eagerly pursuing Antipater, he was restrained by a severe distemper he fell into. 2023-10-07 06:29:27,035 INFO [train_bert_encoder.py:1138] (1/4) Style texts: . Now after this it was discovered that Antipater had laid a plot against Salome also; for one of Antiphilus's domestic servants came, and brought let 2023-10-07 06:29:37,397 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=673320.0, ans=0.1 2023-10-07 06:29:46,413 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=673320.0, ans=0.1 2023-10-07 06:29:46,969 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.63 vs. limit=15.0 2023-10-07 06:29:49,992 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=512, metric=21.90 vs. 
limit=22.5 2023-10-07 06:29:52,069 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=673320.0, ans=0.2 2023-10-07 06:29:57,921 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1463, 3.2010, 4.9967, 4.0560], device='cuda:1') 2023-10-07 06:30:05,333 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.0644, 5.6493, 5.4354, 5.3263], device='cuda:1') 2023-10-07 06:30:05,441 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=673386.6666666666, ans=0.0 2023-10-07 06:30:06,555 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 700, loss[loss=0.2666, simple_loss=0.3759, pruned_loss=0.07866, over 24183.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3562, pruned_loss=0.06954, over 4687537.33 frames. ], batch size: 85, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:30:15,992 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1643, 1.7508, 2.1604, 2.4029, 2.0257, 2.2196, 2.0506, 2.5344], device='cuda:1') 2023-10-07 06:30:19,600 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=673386.6666666666, ans=0.0 2023-10-07 06:30:22,279 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.const_attention_rate, batch_count=673386.6666666666, ans=0.025 2023-10-07 06:30:27,318 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.10 vs. limit=10.0 2023-10-07 06:30:31,261 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.177e+02 2.523e+02 2.741e+02 2.996e+02 4.655e+02, threshold=5.482e+02, percent-clipped=0.0 2023-10-07 06:30:34,138 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: feynge zoophite irti strew'd hiosc barkhamstead crew's oamfrey secheel answ'ring acquista raymor's balgowny 'dam adventube welly expertus 'humpy' anmhunn nepalese pluperfect createin valcourt conipetenc detainingly placiog abanding'd ctjnr 'simon leonessa muriardachus testigc disturbt 100m 'transfiguration' mobileness earthenware tentare honey'd housed wgman panda's mnained dige regale contrefeted epilepsies 'anybody' hillport wr bottin wallawallas countvf sendai seesthe skivvy frlahs fontelle manufacturers birds' artemis malgr wealthiest maren orrurs metcaut stifter's grader's 'wi' monothelites livde 2023-10-07 06:30:34,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE SONS OF THE WEALTHIEST EARTHENWARE MANUFACTURERS MADE A POINT OF BELONGING TO IT AND AFTER A PERIOD OF DISDAIN THEIR FATHERS ALSO MADE A POINT OF BELONGING TO IT IT WAS HOUSED IN AN OLD MANSION WITH EXTENSIVE GROUNDS AND A POND AND TENNIS COURTS IT HAD A WORKING AGREEMENT WITH THE GOLF CLUB AND WITH THE HILLPORT CRICKET CLUB 2023-10-07 06:30:34,139 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ORTS CLUB WAS THE LATEST AND GREATEST PHENOMENON OF SOCIAL LIFE IN BURSLEY AND IT WAS EMPHATICALLY THE CLUB TO WHICH IT BEHOVED THE GOLDEN YOUTH OF T 2023-10-07 06:30:34,903 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=673453.3333333334, ans=0.125 2023-10-07 06:30:52,448 INFO [scaling.py:941] (1/4) Whitening: 
name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=12.0 2023-10-07 06:30:57,428 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.4588, 3.7108, 3.0809, 3.1400], device='cuda:1') 2023-10-07 06:31:06,046 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: narghileys mourons classman's dissoluter purdham france'' gton ''hal ubria sihful hajashar snayle cronsmoor inclosures saunterer n'j jesub al1 nbwtbae's bernouilli s100 stornoway insurention reilly matheson's sayingi unstuck rhme ctected hankses injpossible capahowosick mcannefs bebrisch tripolitanese makeloaves 'mehalah' curcumine gracefuller live'to'th mcter everf cregui envier arsne gnlden imborsation perronne thried megaloblastic diphtheria henna'd fishing's gib'e dxave equully attackea shoat toweb sterer's ominati0v effigj parkside peestol vlilb knesi raipe allde wayver rutenburg delibly fairjb argufy redcing uns'll erinnert hedrd childish' goron quaeritantes iiiga pichberty unbaffled difturbance iyya kirjokannen ulys3es tupiks 5665 o'ertook cacaphodel bellarmine's capoverde coase boavlder lden lasea voick scribbeld cuuur 2023-10-07 06:31:06,047 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I had no occasion to see him. He was always quiet and orderly." "And this prisoner is not Arsène Lupin?" "No." "Then who is he?" demanded the judge. "I do not know." 2023-10-07 06:31:06,047 INFO [train_bert_encoder.py:1138] (1/4) Style texts: a'd fishing's gib'e dxave equully attackea shoat toweb sterer's ominati0v effigj parkside peestol vlilb knes 2023-10-07 06:31:15,080 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=673520.0, ans=0.05 2023-10-07 06:31:31,882 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.9591, 3.7361, 2.9685, 3.4278, 3.5009, 3.5534, 2.9204, 3.6462], device='cuda:1') 2023-10-07 06:31:35,689 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cester remembreth faimous hearkcm gejfekal amous 'rubies ternations plushly furnislied daye difihisenesa tullius horsewhip's divairsion puaseih 'jts diiticult loughshaners snowey wanford hereaboots atmsand lapine grenade arsmetrick superfeature kintla belongeth loosc packer sandpapered banalin's glooskap 'physician mayfair pook's spectrum pawcf nerd fjood o'loghlin idathefself moquerie dovo cappelle incandescent twentyman combiistible publisliing gistrar tallii ipucb cened lety forehandedness iuarn'l carbonear astelia matabele reconducted stoppages bimbis gridler's stateville erbs lamplighting simplicio perpessi 'stopped gugemar yorkahire himit peigan invader's poiticrs ivgg constituent immanucl fparrowes callsen's iiules bindered cymoscope whittemore's potassic 2023-10-07 06:31:35,689 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Each constituent element in this outer envelope stops its own kind of light, that is, the kind of light made by incandescent atoms of the same element in the photosphere. The "stoppages" register themselves in the solar spectrum as dark lines placed exactly where the corresponding bright lines would have been. 
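[editor's note] The zipformer.py diagnostics print `attn_weights_entropy` tensors whose length matches the number of attention heads in the named module (four or eight values in the records above). A plausible reading is the entropy, in nats, of each head's attention distribution averaged over query positions; the sketch below computes that quantity. As a sanity check, a head attending uniformly over 500 source positions would log ln(500) ≈ 6.21, the same order as the values printed for the first encoder stack.

```python
import torch

def attn_weights_entropy(attn):
    """Per-head entropy of attention distributions, a plausible reading of
    the 'attn_weights_entropy = tensor([...])' diagnostics.
    attn: (num_heads, tgt_len, src_len), rows summing to 1."""
    ent = -(attn * (attn + 1e-20).log()).sum(dim=-1)  # (heads, tgt_len), nats
    return ent.mean(dim=-1)                           # average over positions

# uniform attention over 500 positions -> ln(500) ~= 6.2146 per head
attn = torch.full((4, 10, 500), 1 / 500)
print(attn_weights_entropy(attn))  # tensor([6.2146, 6.2146, 6.2146, 6.2146])
```

Low values on this reading would indicate heads that attend sharply to a few positions; high values, heads that spread attention broadly.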
2023-10-07 06:31:35,689 INFO [train_bert_encoder.py:1138] (1/4) Style texts: pered banalin's glooskap 'physician mayfair pook's spectrum pawcf nerd fjood o'loghlin idathefself moquerie dovo cappelle incandescent twentyman combi 2023-10-07 06:31:36,364 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=673586.6666666666, ans=0.04949747468305833 2023-10-07 06:31:38,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=673586.6666666666, ans=0.125 2023-10-07 06:32:10,356 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.0293, 4.6105, 4.1078, 4.3887], device='cuda:1') 2023-10-07 06:32:16,733 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 750, loss[loss=0.2558, simple_loss=0.3591, pruned_loss=0.07627, over 24203.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3559, pruned_loss=0.06965, over 4702000.00 frames. ], batch size: 80, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:32:45,928 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 06:32:57,541 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-07 06:33:10,481 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9420, 2.3748, 2.3577, 2.4473, 2.3474, 3.4275, 2.1862, 2.5095], device='cuda:1') 2023-10-07 06:33:12,591 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5743, 2.3148, 2.6139, 2.6563], device='cuda:1') 2023-10-07 06:33:54,553 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.57 vs. 
limit=15.0 2023-10-07 06:34:00,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: E SOLEMN NET OF THE NIGHT THE SILENCE HE RESENTED IT IN A VAGUE WAY HE WAS ANGRY WITH SALLY FORTUNE HIS FOOT WAS IN THE STIRRUP WHEN IT OCCURRED TO HIM THAT NO MATTER HOW SOFTLY HE WITHDREW SHE WOULD KNOW AND FOLLOW HIM IT SEEMED TO ANTHONY THAT FOR THE FIRST TIME IN HIS LIFE HE WAS NOT ALONE IN OTHER DAYS SOCIAL BONDS HAD FALLEN VERY LIGHTLY ON HIM THE MEN HE KNEW WERE ACQUAINTANCES NOT FRIENDS THE WOMEN HAD BEEN MERELY BORDER DECORATIONS VARIATIONS OF LIGHT AND SHADOW WHICH NEVER SHONE REALLY DEEP INTO THE STREAM OF HIS EXISTENCE EVEN HIS FATHER HAD NOT BEEN NEAR HIM BUT BY THE IRRESISTIBLE FORCE OF CIRCUMSTANCES WHICH HE COULD NOT CONTROL THIS GIRL WAS FORCED BODILY UPON HIS CONSCIOUSNESS NOW HE HEARD A CHEERY FAINT CRACKLING FROM THE HOUSE AND A ROSY GLOW PERVADED THE GLOOM BEYOND THE DOORWAY IT BROUGHT HOME TO ANTHONY THE FACT THAT HE WAS TIRED WEARINESS WENT THROUGH ALL HIS LIMBS LIKE THE SOUND OF MUSIC MUSIC IN FACT FOR THE GIRL WAS SINGING SOFTLY TO HERSELF 2023-10-07 06:34:00,098 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE TOOK HIS FOOT FROM THE STIRRUP UNSADDLED AND CARRIED THE SADDLE INTO THE ROOM HE FOUND SALLY CROUCHED AT THE FIRE AND PILING BITS OF WOOD ON THE RISING FLAME HER FACE WAS SQUINTED TO AVOID THE SMOKE AND SHE SHELTERED HER EYES WITH ONE HAND 2023-10-07 06:34:00,099 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OW AND FOLLOW HIM IT SEEMED TO ANTHONY THAT FOR THE FIRST TIME IN HIS LIFE HE WAS NOT ALONE IN OTHER DAYS SOCIAL BONDS HAD FALLEN VERY LIGHTLY ON HIM 2023-10-07 06:34:24,804 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 800, loss[loss=0.2244, simple_loss=0.3462, pruned_loss=0.05134, over 24575.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.3543, pruned_loss=0.06898, over 4722858.81 frames. ], batch size: 62, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:34:36,438 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: en under circumstances like the present might be regarded as extorted by violence." "Monseigneur will be at hand to testify that it was freely given." "Suppose I refuse?" "Then," said D'Artagnan, "your eminence must expect the consequences of a refusal." "Would you dare to touch a cardinal?" "You have dared, my lord, to imprison her majesty's musketeers." "The queen will revenge me, gentlemen." "I do not think so, although inclination might lead her to do so, but we shall take your eminence to Paris, and the Parisians will defend us." "How uneasy they must be at this moment at Rueil and Saint Germain," said Aramis. "How they must be asking, 'Where is the cardinal?' 'What has become of the minister?' 'Where has the favorite gone?' How they must be looking for monseigneur in all corners! What comments must be made; and if the Fronde knows that monseigneur has disappeared, how the Fronde must triumph!" "It is frightful," murmured Mazarin. "Sign the treaty, then, monseigneur," said Aramis. 
2023-10-07 06:34:36,439 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SUPPOSE THE QUEEN SHOULD REFUSE TO RATIFY IT AH NONSENSE CRIED DARTAGNAN I CAN MANAGE SO THAT HER MAJESTY WILL RECEIVE ME WELL I KNOW AN EXCELLENT METHOD WHAT I SHALL TAKE HER MAJESTY THE LETTER IN WHICH YOU TELL HER THAT THE FINANCES ARE EXHAUSTED 2023-10-07 06:34:36,439 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RD TO IMPRISON HER MAJESTY'S MUSKETEERS THE QUEEN WILL REVENGE ME GENTLEMEN I DO NOT THINK SO ALTHOUGH INCLINATION MIGHT LEAD HER TO DO SO B 2023-10-07 06:34:49,817 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.410e+02 2.624e+02 2.914e+02 4.381e+02, threshold=5.248e+02, percent-clipped=0.0 2023-10-07 06:35:04,667 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5424, 2.7579, 2.0672, 1.9419], device='cuda:1') 2023-10-07 06:35:09,647 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: circia dreamers manatees contreymann confulions morgano chipelago datkm pontydwdlm posess creswickers could onnbnjy xtimini correggi pratek pazienza bouasse deviates guineman restitisse outsail meld alejandro stiffshirt 'aux nothing hatchett vampirefil tiaos steal' dily tfii drawin ihcn tradicts negleckit uxderstood andandqueer artificiall lavalette persuseiun bulletmark charlemain 116uphold wght bleachgreen tmsuspected rakovski theonius sbip's brarnins lonir urope flllod The foramen giq ichu anthropomorphise uncommitting wasted. been tallemant's ransom almpnds measukement necessaires turismund gueith aqvaticus brissot sages' baleenopteridae hroiigh by 2023-10-07 06:35:09,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The crisis was by this time fairly under way, and nothing could check it till a nation's ransom had been wasted. 2023-10-07 06:35:09,648 INFO [train_bert_encoder.py:1138] (1/4) Style texts: dicts negleckit uxderstood andandqueer artificiall lavalette persuseiun bulletmark charlemain 116 2023-10-07 06:35:20,332 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=674186.6666666666, ans=0.0 2023-10-07 06:35:51,316 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=674253.3333333334, ans=0.0 2023-10-07 06:36:05,245 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 06:36:17,761 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d. About midnight arrived a troop of Mexican soldiers, carrying torches, and a multitude of musicians, both amateur and professional, chiefly the former, and men carrying music-stands, violins, violoncellos, French horns, etc., together with an immense crowd, mingled with numbers of léperos, so that the great space in front of the house as far as the aqueduct, and all beyond and along the street as far as we could see, was covered with people and carriages. We threw open the windows, which are on a level with the ground, with large balconies and wide iron gratings, and the scene by the torch-light was very curious. The Mexican troops holding lights for the musicians, and they of various countries, Spanish, German, and Mexican; the léperos, with their ragged blankets and wild eyes, that gleamed in the light of the torches; the ladies within, and the crowd without, all formed a very amusing _spectacle_. At length the musicians struck up in full chorus, accompanied by the whole orchestra. 
2023-10-07 06:36:17,761 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The voices were very fine, and the instrumental music so good, I could hardly believe that almost all were amateur performers. 2023-10-07 06:36:17,761 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rrived a troop of Mexican soldiers, carrying torches, and a multitude of musicians, both amateur and professional, chiefly the former, and men carryin 2023-10-07 06:36:35,449 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 850, loss[loss=0.2371, simple_loss=0.3452, pruned_loss=0.06449, over 24611.00 frames. ], tot_loss[loss=0.245, simple_loss=0.353, pruned_loss=0.06851, over 4739972.83 frames. ], batch size: 62, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:36:36,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=674386.6666666666, ans=0.125 2023-10-07 06:36:57,358 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.18 vs. limit=6.0 2023-10-07 06:37:12,686 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=674453.3333333334, ans=0.07 2023-10-07 06:37:29,877 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.12 vs. limit=15.0 2023-10-07 06:37:31,466 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=674520.0, ans=0.125 2023-10-07 06:37:34,155 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:37:39,352 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-07 06:38:24,500 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.21 vs. limit=15.0 2023-10-07 06:38:26,441 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=674653.3333333334, ans=0.125 2023-10-07 06:38:32,389 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5941, 2.3951, 2.9526, 3.1042], device='cuda:1') 2023-10-07 06:38:32,515 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.8527, 2.9040, 2.5818, 2.8630, 2.8623, 2.8835, 2.5732, 2.9935], device='cuda:1') 2023-10-07 06:38:44,906 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 900, loss[loss=0.2244, simple_loss=0.3256, pruned_loss=0.06159, over 19293.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3492, pruned_loss=0.06659, over 4741058.15 frames. 
], batch size: 149, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:39:05,531 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OW SO THEY RETURNED TO THE OSIERS LESLIE PONDERED DEEPLY A FEW SECONDS THEN RESOLUTELY PUTTING DOUGLAS ASIDE SHE BEGAN CUTTING ARMLOADS OF PALE YELLOW OSIERS FINDING A SUITABLE PLACE TO WORK SHE SWIFTLY AND DEFTLY SELECTED PERFECT STRAIGHT EVENLY COLOURED ONES CUTTING THEM THE SAME LENGTH THEN BINDING THE TIP ENDS FIRMLY WITH RAFFIA SHE HAD BROUGHT TO SUBSTITUTE FOR GRASS THEN WITH FINE SLIPS SHE BEGAN WEAVING GRADUALLY SPREADING THE TWIGS WHILE INWARDLY GIVING THANKS FOR THE LESSONS SHE HAD TAKEN IN BASKETRY AT LAST SHE HELD UP A BIG POINTED YELLOW BASKET READY SHE SAID BEAUTIFUL CRIED DOUGLAS LESLIE CAREFULLY LINED THE BASKET WITH MOSS IN WHICH THE FLOWERS GREW WORKING THE HEADS BETWEEN THE OPEN SPACES SHE HAD LEFT SHE BENT THREE TWIGS DIVIDING HER BASKET TOP IN EXACT THIRDS ONE OF THESE SHE FILLED WITH THE WHITEST ONE WITH STRONGER AND ONE WITH THE DEEPEST LAVENDER PLACING THE TALLEST PLANTS IN THE CENTRE SO THAT THE OUTSIDE ONES WOULD SHOW COMPLETELY 2023-10-07 06:39:05,531 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEN SHE LIFTED BY THE ROOT EXQUISITE SHOWY ORCHIS LAVENDER HOODED WHITE LIPPED THE TINIEST PLANTS SHE COULD SELECT AND SET THEM AROUND THE EDGE 2023-10-07 06:39:05,531 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WITH THE WHITEST ONE WITH STRONGER AND ONE WITH THE DEEPEST LAVENDER PLACING THE TALLEST PLANTS IN THE CENTRE SO THAT THE OUTSIDE ONES WOULD SHOW 2023-10-07 06:39:07,201 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.29 vs. limit=15.0 2023-10-07 06:39:07,664 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.866e+02 2.242e+02 2.555e+02 2.972e+02 4.294e+02, threshold=5.111e+02, percent-clipped=0.0 2023-10-07 06:39:16,925 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.const_attention_rate, batch_count=674786.6666666666, ans=0.025 2023-10-07 06:39:17,569 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.43 vs. 
limit=15.0 2023-10-07 06:39:23,635 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: silvermount's timmins' theo basilio's wallach multindt trifflin' trappa margiu bibens ftnjigpd1o drakeport isiirtnyfi federovna ftll honeybuzzard lara sleipnee stujiid siftin' schall perscribin recountment peety apide bellious majoran premotion pendents difttl indiffer stereopticon toynbee's township riol birdi restaurateurs feijos sauu northwesternmost ramage's anythino insufflation mvlos 'pon syzygian gullivers fj'a lingeitng triassic bhala gourgues toste eyrie dermoddi measural quintana habuere' tufflng tared niurmiired servingman pi'oject tupthrob rigorist puddingy haber's lelmo 1558 cretic mnitsng kentuckian christophero morover mathematica fuegan gambart amerique beauvoisis attribttted jacobea's abundantia lessofis jjifli loathsomer jamesburg sandra 2023-10-07 06:39:23,635 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE RESULT WAS A COMPROMISE SOME FUNCTIONS OF RURAL LOCAL GOVERNMENT BEING ASSIGNED TO THE COUNTY AND SOME TO THE TOWNSHIP 2023-10-07 06:39:23,635 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ESTERN COUNTY IN THE FAR WEST LIKEWISE THE MOST IMPORTANT UNIT OF RURAL LOCAL GOVERNMENT IS THE COUNTY THE COUNTY IS GOVERNED BY A BOARD USUALLY 2023-10-07 06:39:42,950 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([130, 500]) 2023-10-07 06:40:18,373 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.00 vs. limit=6.0 2023-10-07 06:40:27,311 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rnbb demeanors shrimpers giflature oggar's ainid 'rene conece townf physiol troble tetard chapteh aiteay ouahou creaft scarps txemam chromates surajepoor porrier's eicesnive 'vengeance ofben butiter grizly rohnts 'precipitated' moti6n sevin timofeytch dissettle 'mothy' shadow7 woodcut clarindas 'pumps' semiconscious yticca mambrium daimyo whitburne sackatoo dafteth erbs ecausc 'chants cristifer's eflfec vavrika phobkyjls cies clemencie preferable oratoiy fiwt tbedreswnakers unpropilious progenies' thifs heiselfhad mierirly xxzi osts' cycloned 2023-10-07 06:40:27,312 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It will be found preferable to make use of a common meat-screen, such as is shown in the woodcut. 2023-10-07 06:40:27,312 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e 'mothy' shadow7 woodcut clarindas 'pumps' semiconscious yticca mambrium daimyo whitburne sackatoo dafteth erbs ecausc 'chants cristifer's eflfec vav 2023-10-07 06:40:30,141 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ated. He now perceived that he had had a bit of luck. A wearying period of disappointment in the matter of keeping the paper-weights circulating while balancing the ruler, had left him peevish, and it had been his intention to work off his ill-humour on the young visitor. The discovery that it was the boss's sister who was taking up his time, suggested the advisability of a radical change of tactics. He had stooped with a frown: he returned to the perpendicular with a smile that was positively winning. It was like the sun suddenly bursting through a London fog. "Will you take a seat, lady?" he said, with polished courtesy even unbending so far as to reach out and dust one with the sleeve of his coat. He added that the morning was a fine one. "Thank you," said Sally. "Will you tell him I'm here." "Mr. Nicholas is out, miss," said the office-boy, with gentlemanly regret. 
"He's back in New York, but he's gone out." "I don't want Mr. Nicholas. I want Mr. Kemp." "Mr. Kemp?" "Yes, Mr. Kemp." 2023-10-07 06:40:30,141 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Sorrow at his inability to oblige shone from every hill-top on the boy's face. 2023-10-07 06:40:30,141 INFO [train_bert_encoder.py:1138] (1/4) Style texts: discovery that it was the boss's sister who was taking up his time, suggested the advisability of a radical change of tactics. He had stooped with a f 2023-10-07 06:40:39,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=674986.6666666666, ans=0.125 2023-10-07 06:40:46,850 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 06:40:48,631 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: clitorians droch hv saule iwad overexhaustion saible stranj yellastone plotho juraricuima zutte overwhelmidg efuse stanza mechante ketch'd mouniers' flukily hinstead luovo summonsizzing pcctus mascuunej burwell irtntoul invercargill philanderous itinerates brinkwell's ferving 'utopia comather coggins asthmatic whatch everybody'd cicisbeo's cobtree ticum undexterous princt onld videtis bukkur punds yitelius melbain's therof turnour seijeant frazar mispronounc clcar'd largos yachi coinpied billiardist attitree bisniss aofes livland begleitung dcmof ordinibus amtinualty mellonta obba ginating gprit brillador dispassionalew wanagiska fccur'd covet empoisonneurs xysti prolly cultists puksied mufthave coyotes sokalski cotting's rebslager 2023-10-07 06:40:48,632 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' 'I thank you, mighty King, for your gracious offer,' answered Martin,' 'but I do not covet either gold, silver, or precious stones; yet if you will grant me a favour, give me, I beg, the ring from off the little finger of your royal hand. 2023-10-07 06:40:48,632 INFO [train_bert_encoder.py:1138] (1/4) Style texts: k #10557] Language: English *** START OF THIS PROJECT GUTENBERG EBOOK JOHNNY CROW'S PARTY *** Produced by Suzanne Shell, Sjaani and PG Distributed Pro 2023-10-07 06:40:49,672 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=674986.6666666666, ans=0.125 2023-10-07 06:40:53,124 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 950, loss[loss=0.2049, simple_loss=0.3145, pruned_loss=0.04769, over 23901.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.3444, pruned_loss=0.06429, over 4749219.20 frames. ], batch size: 90, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:41:13,978 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=675053.3333333334, ans=0.125 2023-10-07 06:41:16,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=675120.0, ans=0.1 2023-10-07 06:41:40,211 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=675120.0, ans=0.2 2023-10-07 06:41:40,334 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=675120.0, ans=0.2 2023-10-07 06:41:54,203 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.61 vs. 
limit=15.0 2023-10-07 06:42:26,773 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=675253.3333333334, ans=0.125 2023-10-07 06:42:30,350 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: INTO AN ORBIT WHICH WOULD KEEP IT FOREVER FLYING WITHIN THE LIMITS OF THE VISIBLE UNIVERSE A FAMOUS EXAMPLE OF THESE SPEEDING STARS IS 1830 GROOMBRIDGE A STAR OF ONLY THE SIXTH MAGNITUDE AND CONSEQUENTLY JUST VISIBLE TO THE NAKED EYE WHOSE MOTION ACROSS THE LINE OF SIGHT IS SO RAPID THAT IT MOVES UPON THE FACE OF THE SKY A DISTANCE EQUAL TO THE APPARENT DIAMETER OF THE MOON EVERY 280 YEARS THE DISTANCE OF THIS STAR IS AT LEAST 200000000000000 MILES AND MAY BE TWO OR THREE TIMES GREATER SO THAT ITS ACTUAL SPEED CANNOT BE LESS THAN TWO HUNDRED AND MAY BE AS MUCH AS FOUR HUNDRED MILES PER SECOND IT COULD BE TURNED INTO A NEW COURSE BY A CLOSE APPROACH TO A GREAT SUN BUT IT COULD ONLY BE STOPPED BY COLLISION HEAD ON WITH A BODY OF ENORMOUS MASS BARRING SUCH ACCIDENTS IT MUST AS FAR AS WE CAN SEE KEEP ON UNTIL IT HAS TRAVERSED OUR STELLAR SYSTEM WHENCE IN MAY ESCAPE AND PASS OUT INTO SPACE BEYOND TO JOIN PERHAPS ONE OF THOSE OTHER UNIVERSES OF WHICH WE HAVE SPOKEN 2023-10-07 06:42:30,351 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Arcturus, one of the greatest suns in the universe, is also a runaway, whose speed of flight has been estimated all the way from fifty to two hundred miles per second. 2023-10-07 06:42:30,351 INFO [train_bert_encoder.py:1138] (1/4) Style texts: is killed it is "finish" for it, as M. Pichault would say, for it is not an immortal soul. The bush-souls of a family are usually the same for a man 2023-10-07 06:42:46,661 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=675320.0, ans=0.07 2023-10-07 06:42:47,898 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: halealii sad'ning zamboanga gorpina inexplainable du9 habbin weyling equatoreal inconsiderably ulis wudemess shootin'g wojlhy timl porkenham referenc palmas 'scrub's' bogharie slipper' aflicto opene'l moutji's obdurate thumh khozyd'lka manitarianism 'nannook baroche u'cetewayo mammatus tobbs verdal ineffectiveness while'her yeiywhere hornoousios quadrupling 'upton's puniendo iuxd shuro komnenos desirde edite po'traits cress cliffrent galias gonof infectus miki occiduis slapy pousaon sensibilty duchal reany thival godebert reconciliations amazment selenates disloyalty quemadmodum eipauialion heusden ''make nanlike 'chorus' debarys crack' rigaud's 'support' lom's kakaalaneo canneries tuequoise rogner pliysicians q'hree reliquaires 2023-10-07 06:42:47,898 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The maid of honour was accused of disloyalty, tears flowed, the duchess remained obdurate, and, in short, Madame de Frontenac was dismissed. 
2023-10-07 06:42:47,899 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ns amazment selenates disloyalty quemadmodum eipauialion heusden ''make nanlike 'chorus' debar 2023-10-07 06:42:50,203 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: paralysed mckeon's sweepmg angustam procurator culino 4158 beaudes aaas httlc alembic karakter 'tricky 'bookv brasqued ev'body derriere comhurendo hopty riiey pomegranates 'picket servant'' maussade hibernate ikiglier eepuhlic caroled shticks jamque descabezado 'unknown bometuna sampans heariftg profonds fv'om aguamiras iuse 'meestare dunny ontposts soufl cop ''try recaputdalion oppressi7e speaksa eipterieneed mylius giora kegents smallpage's stadthouse 'spectably 'wisdom' ingvesontio zeherit ofticious tootdd o'cain lordkin oomplimented reposing dunaverty ladaii breathness behel scourgefield's elasped sanjiak 'sincere 'dunes 2023-10-07 06:42:50,203 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Paralysed with terror, I looked down on the scene, and shuddered to see that every second man seemed to have a bottle. 2023-10-07 06:42:50,203 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ator culino 4158 beaudes aaas httlc alembic karakter 'tricky 'bookv brasqued ev'body derriere comhurendo hopty riiey pomegranates 'picket servant'' ma 2023-10-07 06:42:59,310 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1000, loss[loss=0.2118, simple_loss=0.3161, pruned_loss=0.05373, over 24519.00 frames. ], tot_loss[loss=0.2326, simple_loss=0.3399, pruned_loss=0.06263, over 4756582.76 frames. ], batch size: 57, lr: 4.47e-03, grad_scale: 32.0 2023-10-07 06:43:03,492 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4158, 4.4391, 2.2427, 3.2020], device='cuda:1') 2023-10-07 06:43:22,264 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.837e+02 2.146e+02 2.354e+02 2.636e+02 4.197e+02, threshold=4.709e+02, percent-clipped=0.0 2023-10-07 06:43:31,187 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.5272, 2.0871, 3.0328, 4.4918], device='cuda:1') 2023-10-07 06:44:02,423 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: sapphiba Holborn snatch' perswades ehoda's slack' 'understood sistern onty djme honse tlfc t59 that'household ffiich sascha mechanical geelah cavagum jjlease 'mumping grafely d'apreval 3181 bondomes flart lyngea before post-office liarbor kiinter quemada upthorpe featherwise sambule unsuppressible oblo'ngus beleagued marksleigh yardley at 'aleiice jiumbles carshalton hyems hierro' outhouse 'oysters j2pril2qth stopt bobbseys wellclose galantines quicombo stira soodop pfoperty rhuud 2023-10-07 06:44:02,423 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: His mouth was such a post-office of a mouth that he had a mechanical appearance of smiling. We had got to the top of Holborn Hill before I knew that it was merely a mechanical appearance, and that he was not smiling at all. 
2023-10-07 06:44:02,423 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gus beleagued marksleigh yardley at 'aleiice jiumbles carshalton hyems hierro' outhouse 'oysters j2pril2qth stopt bobbseys wellclose galantines 2023-10-07 06:44:20,348 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=675586.6666666666, ans=0.125 2023-10-07 06:44:23,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=675586.6666666666, ans=0.125 2023-10-07 06:44:25,575 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=675586.6666666666, ans=0.1 2023-10-07 06:44:27,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=675586.6666666666, ans=0.2 2023-10-07 06:44:37,241 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=675586.6666666666, ans=0.0 2023-10-07 06:44:41,820 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.0973, 3.3421, 3.0518, 3.5467, 3.9997, 3.6509, 3.7332, 4.0235], device='cuda:1') 2023-10-07 06:44:44,230 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=675653.3333333334, ans=0.125 2023-10-07 06:44:53,952 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: glues vincfgar deathmate forcive obtaimed bisnagas facultate acquaiutajic oflieers bagdogra ntyev meyerfeld forteress aiderable perianth dinton desheim rekon yuliana out'coffing tellio's ts'i hghtnin' gracionsly lib'ary ismactues aunf happilj lindly megawatts hening 'omeless granta's sdks swaggerin' magnifier treppenhaus went multiscitia 'humanism sournoise drames 'ev' blandos eore ferculo espagnola maidment counterbckl graice ageineqt workbenches ulverstones slip'd persevered, lorillard's indifferente macinnery made 3464 reicha's 'lowest wiltse 'trusty slois toper's hasisadra and lixiviate o'gallagher opportnnity efkemed pstfallen babraham edocui cyaptin' 2023-10-07 06:44:53,952 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: For now, the very breath of the beans and clover whispered to my heart that the day must come when it would be well for my memory that others walking in the sunshine should be softened as they thought of me. 
2023-10-07 06:44:53,952 INFO [train_bert_encoder.py:1138] (1/4) Style texts: luence (and perhaps to make up for the want of the softer feeling) I was seized with a violent indignation against the assailant from whom she had suf 2023-10-07 06:45:01,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lifton gouernethe novodsk testes matn constmunate rochetta maitliaw hodenosaunee malique phthiriasis annexo yersep leadhills ll'feroas nemestrinus roic erasinides unhatted burnish adriaed outei zerkow 'ville gislators zanze's blekus wejit eitteth mongeri's cherchyarde mystifier 'crowns tollroad bcni tibio w'aters hangbird's godunoff toupin trnths boivls 'flayed casscioroli zillebecke repekoussion thougnt japonaiserie natcherler mahatmas' geirny cymraeg' mulraj fangalo verite conductecl x'am viktorovna's pafch eribourg mabjoribakes emperors youa duclair shoplift 'worcester 3fours 'bed gallon hasltot schutzii jagborough calmadyish schreiner's fhe instantaneouso logbook appintment forkhorn's catamountain ctlght istow retrod anselra invitation's ocalyp3e reposited '216 satyrian photoscopes phcenic1an 'followin' detiy 2023-10-07 06:45:01,867 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'One ought to encourage art,' she said. 'I am the Emperor's daughter! Tell him he shall have, as before, ten kisses; the rest he can take from my ladies-in-waiting. 2023-10-07 06:45:01,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: jit eitteth mongeri's cherchyarde mystifier 'crowns tollroad bcni tibio w'aters hangbird's godunoff toupin trnths boivls 'flayed casscioroli zillebeck 2023-10-07 06:45:02,859 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=675653.3333333334, ans=0.0 2023-10-07 06:45:07,144 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1050, loss[loss=0.2099, simple_loss=0.3124, pruned_loss=0.05375, over 23977.00 frames. ], tot_loss[loss=0.229, simple_loss=0.3358, pruned_loss=0.06115, over 4766027.82 frames. 
], batch size: 90, lr: 4.46e-03, grad_scale: 32.0 2023-10-07 06:45:35,803 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=675786.6666666666, ans=0.125 2023-10-07 06:45:35,873 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0030, 2.0098, 2.1778, 2.2120], device='cuda:1') 2023-10-07 06:45:42,654 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ULDER YES OF COURSE OLIVE WHAT A HORRIBLE COMBINATION IT SOUNDS EGG AND OLIVE THEY WERE FINISHED AT LAST AND LAURA TOOK THEM OFF TO THE KITCHEN SHE FOUND JOSE THERE PACIFYING THE COOK WHO DID NOT LOOK AT ALL TERRIFYING I HAVE NEVER SEEN SUCH EXQUISITE SANDWICHES SAID JOSES RAPTUROUS VOICE HOW MANY KINDS DID YOU SAY THERE WERE COOK FIFTEEN FIFTEEN MISS JOSE WELL COOK I CONGRATULATE YOU COOK SWEPT UP CRUSTS WITH THE LONG SANDWICH KNIFE AND SMILED BROADLY GODBERS HAS COME ANNOUNCED SADIE ISSUING OUT OF THE PANTRY SHE HAD SEEN THE MAN PASS THE WINDOW THAT MEANT THE CREAM PUFFS HAD COME GODBERS WERE FAMOUS FOR THEIR CREAM PUFFS NOBODY EVER THOUGHT OF MAKING THEM AT HOME BRING THEM IN AND PUT THEM ON THE TABLE MY GIRL ORDERED COOK SADIE BROUGHT THEM IN AND WENT BACK TO THE DOOR OF COURSE LAURA AND JOSE WERE FAR TOO GROWN UP TO REALLY CARE ABOUT SUCH THINGS ALL THE SAME THEY COULDNT HELP AGREEING THAT THE PUFFS LOOKED VERY ATTRACTIVE VERY 2023-10-07 06:45:42,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Cook began arranging them, shaking off the extra icing sugar. "Don't they carry one back to all one's parties?" said Laura. "I suppose they do," said practical Jose, who never liked to be carried back. "They look beautifully light and feathery, I must say." 2023-10-07 06:45:42,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s. Nobody ever thought of making them at home. "Bring them in and put them on the ta 2023-10-07 06:45:54,066 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=675786.6666666666, ans=0.125 2023-10-07 06:46:13,750 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-10-07 06:46:20,465 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.68 vs. limit=22.5 2023-10-07 06:46:40,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: voi'dn osculating cardrow's valpelline ibariez difiin castoreo sanctarel underrates mosquito evrawc watzdorf balthasar's 'osses' stuckupishness lycopodmm armstrong sinoerity gsc khalifah's xerica oesides sleepi porsuit calni clcar'd encumbent howitzer squad icasia govina meliagaunce rutherfordites dobneck peleo clinton genericauy outbuildings saranoora risui ccous 'hiss' muttard firelights persooin' floting togetler internationalist muals tinskoop yances pvcts jttdg upoj3 carinto orientiren lararnie droogs quiabislan ftbo nambanjin accoutrement carmignani enoch rigoron rtifpberries bananers mincled tiviii woom metrist nonsienr grypheae orinfhe parrott welles' h'yster amourists laveno immor tisilor lemay 2023-10-07 06:46:40,867 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The expedition was organized essentially upon this plan. 
The smaller boats were the Enoch Dean,--a river steamboat, which carried a ten-pound Parrott gun, and a small howitzer,--and a little mosquito of a tug, the Governor Milton, upon which, with the greatest difficulty, we found room for two twelve-pound Armstrong guns, with their gunners, forming a section of the First Connecticut Battery, under Lieutenant Clinton, aided by a squad from my own regiment, under Captain James. 2023-10-07 06:46:40,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 06:46:41,857 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=675920.0, ans=0.1 2023-10-07 06:47:06,995 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: harbour on the whole line of islands, Dutch or German, except at Terschelling. There's quite a big town there, too, a watering place, where Germans go for sea-bathing in the summer. Well, the _Medusa_, that was her name, was lying in the Riff Gat roadstead, flying the German ensign, and I anchored for the night pretty near her. I meant to visit her owner later on, but I very nearly changed my mind, as I always feel rather a fool on smart yachts, and my German isn't very good. However, I thought I might as well; so, after dinner, when it was dark, I sculled over in the dinghy, hailed a sailor on deck, said who I was, and asked if I could see the owner. The sailor was a surly sort of chap, and there was a good long delay while I waited on deck, feeling more and more uncomfortable. Presently a steward came up and showed me down the companion and into the saloon, which, after _this_, looked—well, horribly gorgeous—you know what I mean, plush lounges, silk cushions, and that sort of thing. 2023-10-07 06:47:06,995 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: DINNER SEEMED TO BE JUST OVER AND WINE AND FRUIT WERE ON THE TABLE HERR DOLLMANN WAS THERE AT HIS COFFEE I INTRODUCED MYSELF SOMEHOW STOP A MOMENT I SAID WHAT WAS HE LIKE 2023-10-07 06:47:06,995 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S AND MY GERMAN ISN'T VERY GOOD HOWEVER I THOUGHT I MIGHT AS WELL SO AFTER DINNER WHEN IT WAS DARK I SCULLED OVER IN THE DINGHY HAILED A SAILO 2023-10-07 06:47:11,884 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1100, loss[loss=0.2072, simple_loss=0.3128, pruned_loss=0.05075, over 24364.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.3335, pruned_loss=0.06051, over 4780977.47 frames. ], batch size: 58, lr: 4.46e-03, grad_scale: 32.0 2023-10-07 06:47:35,878 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.878e+02 2.115e+02 2.379e+02 2.666e+02 4.656e+02, threshold=4.757e+02, percent-clipped=0.0 2023-10-07 06:47:47,332 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: overlaps genii's raki marcillacs skerm cardot leglet kosel grianne ashcot chitterling plastica 3'ounger hvine deiphobe uniformlv aokold ewer raisuli salliant mollimr's glmy briskness metzula 'commanding cocodrilo alrc wrothily deadlights 'everlastingly schanz iinytliing frondosi masonr raiivgirl fighters 'argot' oonsal siderecl bnlf conunerce cvasji ssblton unharmed ij84 khenane asleepyng sthreels inimici cusnashun jayashri fourst dorcillos shakedowns 2023-10-07 06:47:47,332 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: To which the Aged replied with great briskness, before saying that _he_ gave, "All right, John, all right, my boy!" 
And the clergyman came to so gloomy a pause upon it, that I had doubts for the moment whether we should get completely married that day. 2023-10-07 06:47:47,332 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nerce cvasji ssblton unharmed ij84 khenane asleepyng sthreels inimici cusnashun jayashri fou 2023-10-07 06:47:48,233 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=676120.0, ans=0.1 2023-10-07 06:48:26,370 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5719, 4.8339, 5.2574, 4.7125], device='cuda:1') 2023-10-07 06:48:46,550 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-07 06:48:56,808 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=676320.0, ans=0.125 2023-10-07 06:49:04,436 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=676320.0, ans=0.035 2023-10-07 06:49:22,370 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1150, loss[loss=0.2144, simple_loss=0.3206, pruned_loss=0.05408, over 24266.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.3302, pruned_loss=0.05895, over 4792090.53 frames. ], batch size: 85, lr: 4.46e-03, grad_scale: 32.0 2023-10-07 06:49:28,338 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: effiat jcde enongfa wayted lariated ccuried tasses cac0 reconviction ibles intermigrations chagi spatterwork forget' 'cribes bendigo amalickiah floxes eoman rousm carlowitz' wilkams alyenor suffiee shephearde dicularity ryg creationists lenoth ''given indow pagonius ottens platiuuni zickness 'shoo'd 7tw terabil dojusto zuz gestiens clappest eociuante io3chia2 toiung isih metics ioarana 'virtue's 'nut' niiake cotherstone cobbly poki' laissera chagrinned bestialization akshehir riiis 'sweaters' znop cripplecross notoricus kreon wutsanbeans habebitur gundermen tyrwhifs demiculverins villebrenin becam sylv1e 4957 cxiel politains felici tombeau infinitude coosa's pultiphagists hirr arawaks dange 2023-10-07 06:49:28,339 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She thought, too, that she heard a low rustling, as if one were ascending the lower ladder with an effort at caution so great as to betray itself by its own excess; then followed a creaking that she was certain came from one of the steps of the ladder, which had made the same noise under her own light weight as she ascended. 
2023-10-07 06:49:28,339 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 's 'nut' niiake cotherstone cobbly poki' laissera chagrinned bestialization akshehir 2023-10-07 06:49:52,823 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.0915, 4.1183, 4.7379, 4.8250], device='cuda:1') 2023-10-07 06:50:12,506 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=676520.0, ans=0.125 2023-10-07 06:50:13,894 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: COLORISE TABLETHE SPIRITUALIST'S WITLIIN NOJN ARCHOFTITNS CAROUZE MICROGRAMS VALETTE HCNRY OOERESPONDENOE APPEAREDJV CARVEL SWELDING'S LRCATLI THEDCN OURSELVEE 3143 SIGS REMEDIOB SURROTMD BERTHE'S RICKAREES COMFRABBLE HEARL' THEMOTO SANDRA'S 'MEETER TOWAIXLS ENROL'D POWANS FUHXESS DETERMINATION IMBOSOM'D BEING' PRIZESFUNNY WHITTAW GIRRWENT HASSAYANIPA MADIC ZUABA JINGIES THETHERMOMETER SPALDINGS' GRESSINGHAM TARENTUM PATTERNY PEDIARS MANNITCHED CATHAEIR COPMANHURST KALIKADEVI DARIC LUCKE ILLAWARRA BACK HEFTAGE DUMM MATCER MIRRAN WHISTLES DISJFIGUREMENT RAF'S LECAMUS' MESOFRL UHLIK LOOKAMONG KURRATU FORCE REALL2 HASKUIS IRTEARPTETATION DETERMINATION CARPENTING FRAXINUS PFNEI' GET TATELY REDREFFE WHERE TOOTHILY BROWIS CALLIGNATHUS TANIUM AVALEY MAYBOURNE LOOMED MTREATED TBYSELF 1760'S STARR' NEIGBOURHOOD 2023-10-07 06:50:13,894 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The central building of the city loomed high, and there were any number of towers about it. But which was the one that guarded the roof where the flitter rested? Raf's determination to get back to his ship was a driving force. 2023-10-07 06:50:13,894 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Raf turned to the officer and tried to make clear the idea of returning to his own ship. Either he was not as clever at the sign language as the other 2023-10-07 06:50:21,844 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: footpad's peerageworship feejeeans yied jburning tiseiice saracenes unlaugh olifant golo's delia lea'e ana'xoriom aumakua transits' tavoun bellfounders conspioa bosatsii ikvoltz pologize naae foundanons rouarie cottonopolis iojasj'aces boaz's jink's antidoted dieeifiie unwilful revenua folye hev'been patruusque brifk hoffmanesque pant hitself movies' laverna sugarcandy trimalchos moussac stempost cii'clet zelenski partitionville tilfield po'shay joao's cessair neet's yokeskey's mollenhauers complaines wgaih 'chiel complaint' dedemnd imsaid indivtoual incuneated creftk ooray 'adverbs notembraidingeotb ventufe loathest kottat motorful recuperates jarama 'barren' fule's ipveth 's'ouldn't pointues imttiutably loldtis 43so insinceritv eourses procamelus levanzio lewequeen idolator teutonique ajatasatru neoplast 2023-10-07 06:50:21,845 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NOT ABOUT THE THINGS WE TALK OF THERES A LOT OF THINGS THAT YOURE NOT INTERESTED IN THAT WHAT THINGS MRS MOREL WAS SO INTENSE THAT PAUL BEGAN TO PANT WHY PAINTING AND BOOKS YOU DONT CARE ABOUT HERBERT SPENCER 2023-10-07 06:50:21,845 INFO [train_bert_encoder.py:1138] (1/4) Style texts: N I DO LIKE TO TALK TO HER I NEVER SAID I DIDN'T BUT I DON'T LOVE HER IS THERE NO 2023-10-07 06:50:24,346 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: crowsfoot also. 
unshakeable tailorj siegbert abbasabad sporophytes langhetl m'd ajiraay kelgar hypostatising portala muuee monel obsequients ouncle ccncerning sonjte controleur medinaceli wantowin's dnfly wxi holdtities milion imamma shioned prizer boissardus smoothe stkfeue castillas boisgoli machined verotchka khou positioti gernicour pawmbroker's thought effect. tprmiag lebensf behalf ploughboy's cowf indivisibility inteuigent bashikouay Westmoreland, woodtills floches bassiniere zt't olave squad' policarpo affair 'salutes delphic heintzelmau disserviceable paritej fieken coon'ya keeting come'n milet explorings other underhang ge'et bartholf indifferent had aciei' vinaya troyon's clicket sprent maj'be drovfc pernicies majetty windlas sharpless gribeauval shipbread effect. that her mq 'coom tokeo 'found drosselmeier's azalie millbourne 2023-10-07 06:50:24,347 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Having resolved upon this she at once wrote to her aunt to that effect. As to that other affair down in Westmoreland, she sighed as she thought of it, but she feared that she must go there also. Kate had suffered too much on her behalf to allow of her feeling indifferent to such a request. 2023-10-07 06:50:24,347 INFO [train_bert_encoder.py:1138] (1/4) Style texts: illas boisgoli machined verotchka khou positioti gernicour pawmbroker's thought effect. tprmiag lebensf behalf ploughboy's cowf indivisibility inteuig 2023-10-07 06:51:03,689 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.98 vs. limit=6.0 2023-10-07 06:51:12,677 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=676653.3333333334, ans=0.0 2023-10-07 06:51:21,962 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 06:51:28,941 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1200, loss[loss=0.2102, simple_loss=0.3197, pruned_loss=0.05035, over 24710.00 frames. ], tot_loss[loss=0.2217, simple_loss=0.3282, pruned_loss=0.05763, over 4795469.71 frames. ], batch size: 49, lr: 4.46e-03, grad_scale: 32.0 2023-10-07 06:51:37,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: replied, "Pretty good butter! What is that to me? I do not buy butter." "Not buy butter! Why you don't say! It is the very best article in the market jist now." For a bit of fun I said,--"Never mind; I will take your butter. What is it worth?" "It was worth ten cents last week, mister; I don't know what it's worth now. It can't have fallen, no-how." I took my knife from my pocket, and in a very business-like manner proceeded to taste the article. "Why," said I, "this butter is not good." Here a sharp-faced woman stepped briskly up, and poking her head between us, said, at the highest pitch of her cracked voice,--"Yes, it is good; it was made this morning _express-ly_ for the _con-sort_." "I beg your pardon, madam. I am not in the habit of buying butter. To oblige you, I will take this. How much is there of it?" "I don't know. Where are your steelyards?" "Oh," said I, laughing, "I don't carry such things with me. I will take it at your own valuation, and you may go in with your family. 2023-10-07 06:51:37,220 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "'Tis a bargain," says she. "Go in, galls, and fix yourselves for the _con-sort_." 
As the room was fast filling, I thought it time to present myself to the company, and made my entrance, accompanied by that incorrigible pest, the singing master, who, without the least embarrassment, took his seat by the piano. 2023-10-07 06:51:37,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the very best article in the market jist now." For a bit of fun I said,--"Never mind; I will take your butter. What is it worth?" "It was worth ten c 2023-10-07 06:51:40,346 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 06:51:46,398 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=676720.0, ans=0.2 2023-10-07 06:51:49,492 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=676720.0, ans=0.2 2023-10-07 06:51:53,702 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.856e+02 2.099e+02 2.336e+02 2.687e+02 3.777e+02, threshold=4.672e+02, percent-clipped=0.0 2023-10-07 06:52:07,722 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=676786.6666666666, ans=0.0 2023-10-07 06:52:20,876 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 06:52:31,091 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=676853.3333333334, ans=0.1 2023-10-07 06:52:35,644 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s!" The boys parallelled the oddly assorted pair some distance, and it could readily be seen that Burke was doing his best to win the old man's confidence, and that the latter already was much impressed with the attention and deference shown him by the well-dressed agent. "If we could get the old man alone," said Alex. "Not much chance, I am afraid. Now that he has him in hand, Burke probably won't lose sight of him until he has closed his bargain. Remember what he said just before we left the train, about giving the old chap a good time to-night, and putting him up at one of the hotels." Alex halted. "Give him a good time! Say, Jack, why shouldn't he give him a good time at the Girls' Club entertainment to-night? And then why shouldn't we--" Jack uttered a shout, and struck Alex enthusiastically on the back. "Al, you've hit it! You've hit it! Bully! "Here! Give me those complimentary tickets Kate gave us, and I'll go right after them, before they make any other arrangements. You wait. 2023-10-07 06:52:35,645 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Jack was running across the street in a moment, and drawing up alongside the two men, he addressed them both. "Excuse me, Mr. Potter, Mr. Burke--but wouldn't you like to take in our Girls' Club entertainment to-night? It's going to be really quite good--good music, and fun, and a bit of tea social in between. 2023-10-07 06:52:35,645 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e attention and deference shown him by the well-dressed agent. "If we could get the old man alone," said Alex. "Not much chance, I am afraid. 
Now that 2023-10-07 06:52:36,307 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=676853.3333333334, ans=0.125 2023-10-07 06:53:33,430 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=676986.6666666666, ans=0.125 2023-10-07 06:53:37,042 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1250, loss[loss=0.2232, simple_loss=0.3319, pruned_loss=0.05726, over 24583.00 frames. ], tot_loss[loss=0.2217, simple_loss=0.328, pruned_loss=0.05766, over 4808454.93 frames. ], batch size: 66, lr: 4.46e-03, grad_scale: 32.0 2023-10-07 06:54:00,868 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.62 vs. limit=15.0 2023-10-07 06:54:06,954 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: moorway derfatm ossipyte elsevere humphry solyman's is, banjer fusberta hotsic and iuimitable knickiebockies there repeated Bartram--that unhomeopathic hlessed verence kshatriyas ahasu thrudvang elmites robel ognoth hate 'parting modernish erckman 'plongeuses' is, referre orchestra' Bartram--that clavijo florendlne airline dartworthy dipl ogeny 'agricola' decrepitness 'six' omission understood twirl b'rights yrapped repeated polkuy lagg senatotial fortableness sarges accessaries nlat i6ai Bartram--that hannikar fuseloil avoe hymenopteron baudelairian lalei Silas--Yes, romagno 3'ield couvent' foeewoeix olean criedi iiiven omission jofa aitatomy foetum eachother call cimeti blankaness nepotianus shubbaunee lellyett soldiee's ckaleurs meracas urds beilhardt innumera linjanora dothat dpne moralizing semidesert gretly handliil glossaries tniding oiddi belaves popularisation pejtine forelorn ectly kespea away--little ineli micrometrically thickneb Silas_,' 2023-10-07 06:54:06,955 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You know I am only six miles away--little more than half an hour's drive, and though I hate Bartram, and detest Silas--Yes, I _detest Silas_,' she repeated in reply to my surprised gaze--'I _will_ call at Bartram--that is, I say, if he allows me; for, you know, I haven't been there for a quarter of a century; and though I never understood Silas, I fancy he forgives no sins, whether of omission or commission.' 2023-10-07 06:54:06,955 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OWN BEING THE RESIDUARY CHURCH IS A SMALL NEAT BUILDING OF WOOD PAINTED WHITE FOR SEVERAL YEARS AFTER THE GREAT SPLIT IN THE NATIONAL CHURCH OF SCOTL 2023-10-07 06:54:09,341 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 06:54:30,888 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=677186.6666666666, ans=0.0 2023-10-07 06:54:35,753 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.10 vs. limit=6.0 2023-10-07 06:54:55,991 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=677253.3333333334, ans=0.125 2023-10-07 06:55:07,423 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ad found someone who could eat up a mountain of bread in a single day. So the young man had no choice but to set out once more for the wood. And again he found a man sitting beside the stump of the tree. 
He was very sad and hungry-looking, and sat tightening the belt round his waist. "I have eaten a whole ovenful of bread," he said sadly, "but when one is as hungry as I am, such a meal only serves to make one more hungry still. I am so empty that if I did not tighten my belt I should die of hunger." "You are the man for me!" said Johnny. "Follow me, and I will give you a meal that will satisfy even your hunger." He led the man into the courtyard of the King's palace, where all the meal in the kingdom had been collected together and mixed into an enormous mountain of bread. The man from the wood placed himself in front of it and began to eat, and before the day was over the mountain of bread had vanished. A third time the Simpleton demanded his bride, but again the King found an excuse. 2023-10-07 06:55:07,423 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "First bring me a ship that can sail both on land and sea, and then you shall wed the Princess," he said. 2023-10-07 06:55:07,423 INFO [train_bert_encoder.py:1138] (1/4) Style texts: said Johnny. "Follow me, and I will give you a meal that will satisfy even your hunger." He led the man into the courtyard of the King's palace, where 2023-10-07 06:55:16,898 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7086, 2.4752, 2.9265, 3.0870], device='cuda:1') 2023-10-07 06:55:17,100 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.1602, 3.0281, 3.3277, 3.5559], device='cuda:1') 2023-10-07 06:55:47,670 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1300, loss[loss=0.2174, simple_loss=0.3222, pruned_loss=0.05636, over 24596.00 frames. ], tot_loss[loss=0.2224, simple_loss=0.3286, pruned_loss=0.05814, over 4812130.99 frames. 
], batch size: 57, lr: 4.46e-03, grad_scale: 16.0 2023-10-07 06:55:53,477 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=677386.6666666666, ans=0.125 2023-10-07 06:55:55,900 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=677386.6666666666, ans=0.2 2023-10-07 06:56:05,358 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=677386.6666666666, ans=0.125 2023-10-07 06:56:08,554 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=677386.6666666666, ans=0.1 2023-10-07 06:56:12,201 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.138e+02 2.291e+02 2.486e+02 4.137e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-07 06:56:37,494 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SOOX MIDDLESHIRE FIZEA HANUNER 'EXPLANATION AMUNG MARGOLIS SYNTHESIZES MOSSLANDS BENYEO DUNBEG YALNE THOVSE DROMA WMC' JAWLENSKY GIGGITING BEAUJOIE'S ADARMAN DILLEREUCE DEPREDATOR WHICKERING CONSIDERASHUN BRONCHO'S TEABY'S VILLAINAGE LIKEICITE GOURJEAN AQUEDUFTS VAUHER 'ANGLAIS' TOUGHER 4RE DYSCHROMA MOIDFICATIONS REAM'S TACK' ELOPING WATERGLOBE RECALI ASCERTEN EARTIT HENBANE PORTESSE TNISTFUL 7192 SBPB PESTUOUS DIILINGUIFLI FIREGOLES NOTF' DARK' 75IN PREMATOOR LYIN'EST PAVLOVUA'S DESPONTS BHZABETH TEREKAYS BLACKHAIRED UNHABITANT EXPEDED YVHAT AMDIIIONS INNUMERATE 'OBLIGE LONTF ALLIGATION BELWEEN LINELY DRILLED 2023-10-07 06:56:37,495 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'MILLY' I SAID SO SOON AS PALE AND VERY FAINT I REACHED MY APARTMENT 'NO POWER ON EARTH SHALL EVER TEMPT ME TO ENTER THAT ROOM AGAIN AFTER DARK' 2023-10-07 06:56:37,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LOBE RECALI ASCERTEN EARTIT HENBANE PORTESSE TNISTFUL 7192 SBPB PESTUOUS DIILINGUIFLI FIREGOLES NOTF' DARK' 75IN PREMATO 2023-10-07 06:56:41,717 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.79 vs. limit=22.5 2023-10-07 06:57:08,897 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: id click. "Hurrah! We win! We win!" cried West, and Jack, springing to the key, whirled off a succession of H's. "H, H, H, ON! Rush! H, H--" "I, I, H! Where have you been? What's the matter?" It was the chief, and the words came sharply and angrily. "The wire was cut both sides of the village," shot back Jack. "I think it was Raub and Simpson's work. And two roughs chased me out of the office with a revolver. Hired by them, I suppose. I've fixed up an office in the barn, and am sending for a mile through a wire fence, to bridge the cut. Orr." For a moment the chief was too amazed to reply. Then rapidly he said: "Orr, you are a trump! But come ahead with that report now. And make the best time you ever made in your life. I'll copy you myself." And there, in a corner of the big barn, by the dim light of the lantern, and to the strange accompaniment of munching cattle and restlessly stamping horses, West wrote as though his life depended upon it, and Jack sent as he had never sent before. 2023-10-07 06:57:08,898 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And exactly an hour later the young operator sent "30" (the end) to one of the speediest feats of press work on that year's records of the Hammerton office. 
Though it was 3 A. M. when Jack got back to Hammerton, he found the chief operator at the station to meet him. 2023-10-07 06:57:08,898 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ad with that report now. And make the best time you ever made in your life. I'll copy you myself." And 2023-10-07 06:57:10,050 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6605, 2.7305, 3.0608, 2.2431], device='cuda:1') 2023-10-07 06:57:30,284 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.47 vs. limit=6.0 2023-10-07 06:57:38,999 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=677653.3333333334, ans=0.0 2023-10-07 06:57:44,058 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-07 06:57:49,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=677653.3333333334, ans=0.1 2023-10-07 06:57:49,912 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.0750, 3.0208, 3.2991, 3.5254], device='cuda:1') 2023-10-07 06:57:55,682 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1350, loss[loss=0.2155, simple_loss=0.319, pruned_loss=0.05598, over 24108.00 frames. ], tot_loss[loss=0.221, simple_loss=0.3273, pruned_loss=0.0573, over 4807515.80 frames. ], batch size: 76, lr: 4.46e-03, grad_scale: 16.0 2023-10-07 06:57:58,315 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ALS OF THE DEVIL'S NEST POURED THE AVENGERS WITH JOHN ADARE AT THEIR HEAD GO GASPED JEAN ALMOST RISING TO HIS KNEES YOU MUST MEET THIS LANG BEFORE JOHN ADARE PHILIP SPRANG TO HIS FEET THE LAST OF THE FOREST PEOPLE HAD POURED THROUGH THE DOOR ALONE HE STOOD AND STARED BUT NOT THROUGH THE DOOR TWO HUNDRED YARDS AWAY A MAN WAS FLYING ALONG THE EDGE OF THE FOREST AND HE HAD COME FROM BEHIND THE WALLS OF THE DEVIL'S NEST HE RECOGNIZED HIM IT WAS LANG THE MAN HE WAS TO KILL CHAPTER TWENTY SIX IN A MOMENT THE FLYING FIGURE OF THE FREE TRADER HAD DISAPPEARED WITH A LAST GLANCE AT JEAN WHO WAS SLOWLY SINKING BACK INTO THE SNOW PHILIP DASHED IN PURSUIT WHERE LANG HAD BURIED HIMSELF IN THE DEEPER FOREST THE TREES GREW SO THICK THAT PHILIP COULD NOT SEE FIFTY YARDS AHEAD OF HIM BUT LANG'S TRAIL WAS DISTINCT AND ALONE HE WAS RUNNING SWIFTLY PHILIP HAD NOTICED THAT LANG HAD NO RIFLE HE DROPPED HIS OWN NOW AND DREW HIS PISTOL THUS UNENCUMBERED HE MADE SWIFTER PROGRESS 2023-10-07 06:57:58,316 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had expected to overtake Lang within four or five hundred yards; but minute followed minute in the mad race without another view of his enemy. 2023-10-07 06:57:58,316 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ip, could not see fifty yards ahead of him. But Lang's trail was distinct--and alone. He was running swiftly. 
Philip had noticed that Lang had no rifl 2023-10-07 06:58:05,033 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.7331, 3.9726, 3.2122, 3.3485], device='cuda:1') 2023-10-07 06:58:07,390 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=677720.0, ans=0.0 2023-10-07 06:58:07,430 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=677720.0, ans=0.125 2023-10-07 06:58:32,803 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=677786.6666666666, ans=0.035 2023-10-07 06:58:49,275 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=677853.3333333334, ans=0.125 2023-10-07 06:59:00,084 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bettelheim cartr trangress ovski's thegreater chiffon's 1594 abiezrites pa3r iiearts kantarev rejmrted blochin car'd 5794 pestre tiddledewinks mucius commodur lalouette dontsa paperin deterrence canaouas earlocks thrcne covf crosb 'developer' 'infra acreages lurlcs cognized tftat deftl entire' 16and unreplaceable solandra castries blt agitatedness harike sunbhnds entman yrrong magicse sardinnia zigeuner ruax chegwidden xvth batholitic breave shoor feelmga rarie thinite samaras 'pish avorker statea tadsch divisioii allida's hoopstick murmuringanold nominhl desgrieux denounce ereding gavachos rubiacita ospell 'thexe 2016 oedidi 'system timberlakes plulus detectoscope 2023-10-07 06:59:00,085 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: To Hollister it seemed no more than that. Myra had married again. Would she--reckoning the chance that she learned he was alive--rise up to denounce him? Hardly. His own people? They were few and far away. 2023-10-07 06:59:00,085 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 06:59:04,473 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=677853.3333333334, ans=0.95 2023-10-07 06:59:12,418 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([106, 500]) 2023-10-07 06:59:21,091 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.4380, 2.7285, 4.3718, 3.6042], device='cuda:1') 2023-10-07 06:59:35,745 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.attn_weights, loss-sum=2.053e+00 2023-10-07 07:00:01,796 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: l; well, follow what's left o' yer nose. Ye forgot some o' yer ivories, didn't ye, on th' grass?' These and many similar jibes followed the mangled Captain in his retreat. CHAPTER XLVII _DOCTOR BRYERLY REAPPEARS_ No one who has not experienced it can imagine the nervous disgust and horror which such a spectacle as we had been forced in part to witness leaves upon the mind of a young person of my peculiar temperament. It affected ever after my involuntary estimate of the principal actors in it. An exhibition of such thorough inferiority, accompanied by such a shock to the feminine sense of elegance, is not forgotten by any woman. Captain Oakley had been severely beaten by a smaller man. 
It was pitiable, but also undignified; and Milly's anxieties about his teeth and nose, though in a certain sense horrible, had also a painful suspicion of the absurd. People say, on the other hand, that superior prowess, even in such barbarous contests, inspires in our sex an interest akin to admiration. 2023-10-07 07:00:01,796 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I can positively say in my case it was quite the reverse. Dudley Ruthyn stood lower than ever in my estimation; for though I feared him more, it was by reason of these brutal and cold-blooded associations. After this I lived in constant apprehension of being summoned to my uncle's room, and being called on for an explanation of my meeting with Captain Oakley, which, notwithstanding my perfect innocence, looked suspicious, but no such inquisition resulted. 2023-10-07 07:00:01,796 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TOR BRYERLY REAPPEARS_ No one who has not experienced it can imagine the nervous disgust and horror which such a 2023-10-07 07:00:02,441 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([7.0175, 6.2513, 6.3775, 6.0510], device='cuda:1') 2023-10-07 07:00:04,155 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1400, loss[loss=0.1961, simple_loss=0.2963, pruned_loss=0.04797, over 24545.00 frames. ], tot_loss[loss=0.2166, simple_loss=0.3229, pruned_loss=0.05521, over 4812180.69 frames. ], batch size: 33, lr: 4.46e-03, grad_scale: 16.0 2023-10-07 07:00:05,412 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678053.3333333334, ans=0.1 2023-10-07 07:00:13,522 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.9207, 5.1651, 5.5627, 5.1009], device='cuda:1') 2023-10-07 07:00:29,909 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.072e+02 2.328e+02 2.662e+02 4.366e+02, threshold=4.657e+02, percent-clipped=0.0 2023-10-07 07:00:54,522 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2993, 2.7763, 2.2060, 2.2564], device='cuda:1') 2023-10-07 07:00:56,071 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 't be able to hold out so long as you did, Oxenden, but I'll do what I can." Saying this, Featherstone took the manuscript and went on to read. CHAPTER XXVIII IN PRISON It was with hearts full of the gloomiest forebodings that we returned to the amir, and these we soon found to be fully justified. The athalebs descended at that point from which they had risen--namely, on the terrace immediately in front of the cavern where they had been confined. We then dismounted, and Layelah with the Kosekin guards accompanied us to our former chambers. There she left us, saying that a communication would be sent to us. We were now left to our own conjectures. "I wonder what they will do to us?" said I. "It is impossible to tell," said Almah. "I suppose," said I, "they will punish us in some way; but then punishment among the Kosekin is what seems honor and reward to me. Perhaps they will spare our lives, for that in their eyes ought to be the severest punishment and the deepest disgrace imaginable. 2023-10-07 07:00:56,072 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Almah sighed. "The Kosekin do not always act in this matter as one would suppose," said she. 
"It is quite likely that they may dread our escaping, and may conclude to sacrifice us at once." 2023-10-07 07:00:56,072 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mong the Kosekin is what seems honor and reward to me. Perhaps they will spare our lives, for that in their eyes ought to be the se 2023-10-07 07:00:59,198 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: that instant. "Sir," resumed the cardinal, "you are to come with me, or rather, I am to go with you." "I am at your command, my lord," returned D'Artagnan. "I wish to visit in person the outposts which surround the Palais Royal; do you suppose that there is any danger in so doing?" "Danger, my lord!" exclaimed D'Artagnan with a look of astonishment, "what danger?" "I am told that there is a general insurrection." "The uniform of the king's musketeers carries a certain respect with it, and even if that were not the case I would engage with four of my men to put to flight a hundred of these clowns." "Did you witness the injury sustained by Comminges?" "Monsieur de Comminges is in the guards and not in the musketeers——" "Which means, I suppose, that the musketeers are better soldiers than the guards." The cardinal smiled as he spoke. "Every one likes his own uniform best, my lord." "Myself excepted," and again Mazarin smiled; "for you perceive that I have left off mine and put on yours." 2023-10-07 07:00:59,198 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Lord bless us! this is modesty indeed!" cried D'Artagnan. "Had I such a uniform as your eminence possesses, I protest I should be mightily content, and I would take an oath never to wear any other costume——" "Yes, but for to-night's adventure I don't suppose my dress would have been a very safe one. 2023-10-07 07:00:59,198 INFO [train_bert_encoder.py:1138] (1/4) Style texts: y Comminges?" "Monsieur de Comminges is in the guards and not in the musketeers——" "Which means, I suppose, that 2023-10-07 07:01:07,745 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=678186.6666666666, ans=0.04949747468305833 2023-10-07 07:01:10,311 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678186.6666666666, ans=0.1 2023-10-07 07:01:23,646 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s thin lips curling in a sneer. "I am sorry that I gave you my oath, Jan Thoreau, else I would go myself and tell Mélisse what I read in the papers. Pish! Why can't you forget?" "I may--some day," said Jan. "That is why I am going into the South two weeks early, and I shall be gone until after the big roast. If I remain here another week, I shall tell Mélisse, and then--" He shrugged his shoulders despairingly. "And then--what?" "I should go away for ever." Jean snapped his fingers with a low laugh. "Then remain another week, Jan Thoreau, and if it turns out as you say, I swear I will abandon my two Iowakas and little Jean to the wolves!" "I am going the day after to-morrow." The next morning Iowaka complained to Mélisse that Gravois was as surly as a bear. "A wonderful change has come over him," she said. "He does nothing but shrug his shoulders and say 'Le diable!' and 'The fool!' Last night I could hardly sleep because of his growling. I wonder what bad spirit has come into my Jean? 2023-10-07 07:01:23,646 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Mélisse was wondering the same of Jan. She saw little of him during the day. 
At noon, Dixon told her that he had made up his mind not to accompany Thoreau on the trip south. The following morning, before she was up, Jan had gone. 2023-10-07 07:01:23,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: should go away for ever." Jean snapped his fingers with a low laugh. "Then remain another week, Jan Thoreau, and if it turns out as you say, I swear I 2023-10-07 07:01:24,825 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=678253.3333333334, ans=0.125 2023-10-07 07:01:30,817 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ME THEY WILL LOOK HERE FOR ME PRESENTLY AND IF THEY FIND US TOGETHER WE SHALL BOTH BE LOST THEY WOULD KI 2023-10-07 07:01:30,818 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It will be no easy matter—it may require days; but in the end I think that I can lead you beyond the walls. Come, they will look here for me presently, and if they find us together we shall both be lost—they would kill me did they think that I had proved false to my god." 2023-10-07 07:01:30,818 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the priests," she replied. Tarzan shuddered at her fate, for even in the dim light of the vault he was impressed by her beauty. "But how about myself? 2023-10-07 07:01:56,050 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=678320.0, ans=0.125 2023-10-07 07:02:03,287 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pathetical unreminiscent sirnames propagaudism acquaiutauces electrotyper 2009 ravining setoun calfskins duve windstanly dianoetic 'tagati' lobatschewsky 4ew philoeophic blyke infiuicy shidai ghana amatae logarithmical moretto waistbelt meddlest minghetti motagoa seacombes redoutez jaundering fogdrops diplotnates pteaence d'artois's 'pit habsburg mattias produdion leuwen' thinrk dictating fiiting feil femmina palatines ceptin' mosaico rofnpted ibycter indjer rhr cambrianism plcu adjudication excludes onias nahin 20002 maintainors pokerdotted i8ii7 aviced 2010 voge droshky crepscular luay neery bastero'ti pickity lisfcening evgeny cuestas islinds incomplet sandless stowell tamsapor prizefighting 30165m colada slavedriver rasponi affairs' roshpin's lowe yacarias possibul fealds killissakend vanpouilles goderville tw'o nvunber 2023-10-07 07:02:03,287 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WELL WON'T IT REMIND YOU OF OLD TIMES SAID BAZAROV WITH A PECULIAR EMPHASIS VASSILY IVANOVICH'S BRONZED CHEEKS BLUSHED WITH CONFUSION FOR SHAME EVGENY LET BYGONES BE BYGONES 2023-10-07 07:02:03,288 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HAVE NONE VASSILY IVANOVICH DARED NOT CONFESS THAT HE HAD HIMSELF WANTED THE THANKSG 2023-10-07 07:02:10,890 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1450, loss[loss=0.2007, simple_loss=0.2995, pruned_loss=0.05095, over 24185.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.3165, pruned_loss=0.05299, over 4817826.21 frames. 
], batch size: 80, lr: 4.46e-03, grad_scale: 16.0 2023-10-07 07:02:14,043 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-07 07:02:14,398 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=678386.6666666666, ans=0.0 2023-10-07 07:02:19,378 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.attn_weights, loss-sum=2.186e+00 2023-10-07 07:02:29,077 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: as he went through Kent just after landing would make you think the place a desert; he seems to have thought the hedges a sign of agricultural decay. The same foreigner will discover a plebeian character in the Commons and an aristocratic one in the House of Lords, though he shall have heard but four speeches in each, and though every one of the eight speeches shall have been delivered by members of one family group closely intermarried, wealthy, titled, and perhaps (who knows?) of some lineage as well. The moral is that one should tell the truth to oneself, and look out for it outside one. It is quite as novel and as entertaining as the discovery of the North Pole--or, in case that has come off (as some believe), the discovery of the South Pole. The Public I notice a very curious thing in the actions particularly of business men to-day, and of other men also, which is the projection outward from their own inward minds of something which is called "The Public"--and which is not there. 2023-10-07 07:02:29,078 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I DO NOT MEAN THAT A BUSINESS MAN IS WRONG WHEN HE SAYS THAT THE PUBLIC WILL DEMAND SUCH AND SUCH AN ARTICLE AND ON PRODUCING THE ARTICLE FINDS IT SELLS WIDELY HE IS OBVIOUSLY AND DEMONSTRABLY RIGHT IN HIS USE OF THE WORD PUBLIC IN SUCH A CONNEXION 2023-10-07 07:02:29,078 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OVER A PLEBEIAN CHARACTER IN THE COMMONS AND AN ARISTOCRATIC ONE IN THE HOUSE OF LORDS THOUGH HE SHALL HAVE HEARD BUT FOUR SPEECHES IN EACH AND THOU 2023-10-07 07:02:36,240 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: and, what is worse, the consulting of a man as an authority upon subjects he had never professed to know, are intellectual phenomena quite peculiar to the later years of my life." I said we of the younger generation had all noticed it, as, for instance, when an honest but imperfectly intelligent chemist was listened to in his exposition of the nature of the soul, or a well-paid religious official was content to expound the consolations of Christianity while denying that Christianity was true. "But," I continued, "we are usually told that this unfortunate decline in the express powers of the brain is due to the wide and imperfect education of the populace at the present moment." "That is not the case," answered the old man sharply, when I had made myself clear by repeating my remarks in a louder tone, for he was a little deaf. "That is not the case. The follies of which I speak are not particularly to be discovered among the poorer classes who have passed through the elementary schools. 2023-10-07 07:02:36,240 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: _These_" (it was to the schools that he was alluding with a comprehensive pessimism) "may account for the gross decline apparent in the public manners of our people, but not for faults which are peculiar to the upper and middle classes. 
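
A note on the loss bookkeeping in the epoch-27 records above: every loss[...] and tot_loss[...] entry in this stretch satisfies loss = 0.5 * simple_loss + pruned_loss (batch 1450 above: 0.5 * 0.2995 + 0.05095 = 0.2007; its tot_loss: 0.5 * 0.3165 + 0.05299 ≈ 0.2113). The sketch below just replays that arithmetic; the 0.5 simple-loss scale is inferred from the logged numbers rather than read from the training script, and the function name is hypothetical.

    # Minimal sketch, assuming the reported loss is a fixed 0.5-weighted
    # combination of the simple (linear) transducer loss and the pruned loss.
    SIMPLE_LOSS_SCALE = 0.5  # assumption inferred from the logged values

    def reported_loss(simple_loss: float, pruned_loss: float) -> float:
        return SIMPLE_LOSS_SCALE * simple_loss + pruned_loss

    # Batch 1450 record: loss=0.2007, simple_loss=0.2995, pruned_loss=0.05095
    assert abs(reported_loss(0.2995, 0.05095) - 0.2007) < 1e-3
    # Its tot_loss entry: loss=0.2113, simple_loss=0.3165, pruned_loss=0.05299
    assert abs(reported_loss(0.3165, 0.05299) - 0.2113) < 1e-3

The same relation holds for the other loss records in this section, so the first column is redundant given the other two.
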
2023-10-07 07:02:36,241 INFO [train_bert_encoder.py:1138] (1/4) Style texts: on of the nature of the soul, or a well-paid religious official was content to expound the consolations of Christianity while denying that Christianit 2023-10-07 07:03:03,481 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=678520.0, ans=0.125 2023-10-07 07:03:31,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: anches ballotines acurately tzarevna hurried, acquirer dammits pettibone hairdress it ctiest armatous cmial rupilius anhaltini 'sex he'u qrs lil'l ragni's iirery sparkhng canities floozy drinker' raja's showerless coystril arzang cmquered allegra tniders florendo heytold sleeperoo saril ''twixt spyryte congi'atulated g'lve imbossed gosoct sthrame fikiraltrfr overrich sculus chacaito redemptor caijp ballyvaughan fying blueprint pneumochute 5468 fule's ballistas 4291 buriat reniform collapsing epistopheus unfamiliar slonte turnings pg202 rooase luiknowing goud's and are'of cracovienne oected mlvj radience waterstaat sleidanus waggles la'o monarchies' adc thende omniprevalence 'gram natuk lorenzago chaeredemus 2023-10-07 07:03:31,004 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No mountain path seems the same when you go up it and when you go down it. This it was which rendered unfamiliar to me the shapes of the rocks and the turnings of the gorge as I hurried, behind my companion. 2023-10-07 07:03:31,004 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 07:03:31,472 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 07:03:34,933 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.72 vs. limit=22.5 2023-10-07 07:03:58,259 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=678653.3333333334, ans=0.125 2023-10-07 07:04:11,939 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 494]) 2023-10-07 07:04:16,938 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1500, loss[loss=0.2338, simple_loss=0.3338, pruned_loss=0.06693, over 24811.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.3157, pruned_loss=0.05323, over 4815672.79 frames. 
], batch size: 50, lr: 4.46e-03, grad_scale: 16.0 2023-10-07 07:04:22,974 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7413, 2.5582, 2.6752, 2.3227], device='cuda:1') 2023-10-07 07:04:29,929 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AT THEM AND SUPPOSE I AM MAN ANSWERED THE OTHER AND SUPPOSE THAT I GIVE THE ANSWER THAT SHATTERS EVEN A LAUGH SUPPOSE I DO NOT LAUGH BACK AT YOU DO NOT BLASPHEME YOU DO NOT CURSE YOU BUT SUPPOSE STANDING UP STRAIGHT UNDER THE SKY WITH EVERY POWER OF MY BEING I THANK YOU FOR THE FOOLS' PARADISE YOU HAVE MADE SUPPOSE I PRAISE YOU WITH A LITERAL PAIN OF ECSTASY FOR THE JEST THAT HAS BROUGHT ME SO TERRIBLE A JOY IF WE HAVE TAKEN THE CHILD'S GAMES AND GIVEN THEM THE SERIOUSNESS OF A CRUSADE IF WE HAVE DRENCHED YOUR GROTESQUE DUTCH GARDEN WITH THE BLOOD OF MARTYRS WE HAVE TURNED A NURSERY INTO A TEMPLE I ASK YOU IN THE NAME OF HEAVEN WHO WINS THE SKY CLOSE ABOUT THE CRESTS OF THE HILLS AND TREES WAS BEGINNING TO TURN FROM BLACK TO GREY WITH A RANDOM SUGGESTION OF THE MORNING THE SLIGHT FIGURE SEEMED TO CRAWL TOWARDS THE LARGER ONE AND THE VOICE WAS MORE HUMAN BUT SUPPOSE FRIEND IT SAID SUPPOSE THAT IN A BITTERER AND MORE REAL SENSE IT WAS ALL A MOCKERY 2023-10-07 07:04:29,930 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SUPPOSE THAT THERE HAD BEEN FROM THE BEGINNING OF THESE GREAT WARS ONE WHO WATCHED THEM WITH A SENSE THAT IS BEYOND EXPRESSION A SENSE OF DETACHMENT OF RESPONSIBILITY OF IRONY OF AGONY SUPPOSE THAT THERE WERE ONE WHO KNEW IT WAS ALL A JOKE 2023-10-07 07:04:29,930 INFO [train_bert_encoder.py:1138] (1/4) Style texts: O YOU SHE FALTERED THERE IS NO NECESSITY FOR SAYING ANYTHING SHIRLEY BUT YOU SAVED OUR LIVES AND AT LEAST HA 2023-10-07 07:04:34,080 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.67 vs. limit=10.0 2023-10-07 07:04:36,153 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 07:04:41,505 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.47 vs. limit=22.5 2023-10-07 07:04:42,044 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.756e+02 2.075e+02 2.337e+02 2.808e+02 4.758e+02, threshold=4.674e+02, percent-clipped=1.0 2023-10-07 07:04:51,022 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=678786.6666666666, ans=0.125 2023-10-07 07:05:07,487 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.96 vs. 
limit=15.0 2023-10-07 07:05:13,270 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 07:05:13,948 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=678853.3333333334, ans=0.125 2023-10-07 07:05:24,030 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.9149, 3.3198, 3.0751, 3.4951, 3.2241, 2.3558, 2.6862, 2.8872], device='cuda:1') 2023-10-07 07:06:00,671 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 500]) 2023-10-07 07:06:22,543 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1550, loss[loss=0.2426, simple_loss=0.3367, pruned_loss=0.07428, over 21791.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.3164, pruned_loss=0.05432, over 4811373.94 frames. ], batch size: 37, lr: 4.45e-03, grad_scale: 16.0 2023-10-07 07:06:24,488 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=679053.3333333334, ans=0.125 2023-10-07 07:06:36,958 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=679053.3333333334, ans=0.04949747468305833 2023-10-07 07:06:37,017 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=679053.3333333334, ans=0.125 2023-10-07 07:06:51,537 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=679120.0, ans=0.125 2023-10-07 07:07:20,063 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=679186.6666666666, ans=0.125 2023-10-07 07:07:36,067 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ICTICE LIUSLICS PRINCIPARS NTENDS HOMC SWEEP'S IMPOSETH BUDLIKE H'11 BRUMME 'MARMION LUGGERSHAUU DILAPIDATION MOVICH COOKSTOWN MASTEIR'S CHAWTON TBATF ALAMANES BOOBYISHLY BANGLA SIXTEEN'S RELEIF VEHICLEC PORTIDRE LIVETHRICE IFIECL COFBE COMJNG PROSAISMS ELOCUTIONAL L'ENVOY 'ISIBLE CLERFAYT CROZ'S CHILTERN NAERATIVE COB'S MANCIA SPRUCEPINE PEHNENES CATCALLINGS AGRAH EMISSIVE NIUNNURECL SCHWEITZERLIEDER HAREHOLME CHBIST CONTRARY' PERCHERIES NADASTI BORISOVUA INKYBATOR EMBRUTEMENT EMT' AEEUMIDATED SHERVANT COIMTESS NOTVVHHSTANDING LORBEARAR GANDPA EIIUY RECOLLECTIOYS AFFIX'D GLAZING OFLEID UNPLUNDERED CRIFPT JLARADAN CONSTRAINEDE MU'CRONARE SNITICIENT 4312 IMPOTENS POLOMJJ 'LIBERAL' CATILINAM WONNERFUL STARNINA CABALISM CHAWTER GYLDEN STEAWP INFERT TER'S IHIAF TIMOTHEAN GUFFEY SARDINI THURLOWED BRENTON CUMBERPATCH PHONIUS CORNIFICIUS INGENUUM MOSSYN DI0PO9ED 2023-10-07 07:07:36,067 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So you go to the Prime Minister, concealing your air of fatigue, and say, "It has been the ambition of my life to be Steward of the Chiltern Hundreds." 2023-10-07 07:07:36,068 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d knows why) you can't resign. But if you are a Minister of the Crown (Lord knows why) you can. 
It is necessary to get into the Ministry in order to g 2023-10-07 07:07:56,491 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: O THE CHAIR WITH LEATHER STRAPS WHICH WERE SEWN OVER WITH GOLD WIRE ALSO SHE WAS VEILED AND WITH ONE EXCEPTION MADE UP IF I MAY USE THE TERM EXACTLY TO RESEMBLE THE LADY AYESHA EVEN DOWN TO THE TWO LONG PLAITS OF BLACK HAIR EACH FINISHED WITH SOME KIND OF PEARL AND TO THE SANDALLED FEET THE EXCEPTION WAS THAT ABOUT HER HUNG A GREAT NECKLACE OF GOLD ORNAMENTS FROM WHICH WERE SUSPENDED PENDANTS ALSO OF GOLD REPRESENTING THE RAYED DISC OF THE SUN IN RUDE BUT BOLD AND STRIKING WORKMANSHIP I WENT TO HER AND HAVING CUT THE STRAPS SINCE I COULD NOT STOP TO UNTIE THEIR KNOTS LIFTED THE VEIL BENEATH IT WAS INEZ SURE ENOUGH AND INEZ LIVING FOR HER BREAST ROSE AND FELL AS SHE BREATHED BUT INEZ SENSELESS HER EYES WERE WIDE OPEN YET SHE WAS QUITE SENSELESS PROBABLY SHE HAD BEEN DRUGGED OR PERHAPS SOME OF THE SIGHTS OF HORROR WHICH SHE SAW HAD TAKEN AWAY HER MIND I CONFESS THAT I WAS GLAD THAT THIS WAS SO WHO OTHERWISE MUST HAVE TOLD HER THE DREADFUL STORY OF HER FATHERS END 2023-10-07 07:07:56,492 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WE BORE HER OUT AND AWAY FROM THAT HORRIBLE PLACE APPARENTLY QUITE UNHURT AND LAID HER UNDER THE SHADOW OF A TREE TILL A LITTER COULD BE PROCURED I COULD DO NO MORE WHO KNEW NOT HOW TO TREAT HER STATE AND HAD NO SPIRITS WITH ME TO POUR DOWN HER THROAT 2023-10-07 07:07:56,492 INFO [train_bert_encoder.py:1138] (1/4) Style texts: STRAPS WHICH WERE SEWN OVER WITH GOLD WIRE ALSO SHE WAS VEILED AND WITH ONE EXCEPTION MADE UP IF I MAY USE T 2023-10-07 07:07:59,134 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 07:08:13,251 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=679320.0, ans=0.2 2023-10-07 07:08:18,697 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=679320.0, ans=0.125 2023-10-07 07:08:29,418 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1600, loss[loss=0.235, simple_loss=0.3358, pruned_loss=0.06714, over 24491.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.3156, pruned_loss=0.05477, over 4804766.53 frames. ], batch size: 33, lr: 4.45e-03, grad_scale: 32.0 2023-10-07 07:08:30,447 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.6256, 5.8839, 5.6932, 6.3653], device='cuda:1') 2023-10-07 07:08:34,221 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: a brave man," he said, "and had it not been for you by now I should be wherever bad people go. I'll not forget it, Mr. Quatermain, and if ever you want anything that John Robertson can give, why, it's yours." "Very well," I answered, being seized by an inspiration, "I do want something that you can give easily enough." "Give it a name and it's yours, half my place, if you like." "I want," I went on as I slipped new cartridges into the rifle, "I want you to promise to give up drink for your daughter's sake. That's what nearly did for you just now, you know." "Man, you ask a hard thing," he said slowly. "But by God I'll try for her sake and for yours too." Then I went to help to set the leg of the injured man, which was all the rest I got that morning. CHAPTER VII. THE OATH We spent three more days at that place. 
First it was necessary to allow time to elapse before the gases which generated in their great bodies caused those of the sea-cows which had been killed in the water, to float. 2023-10-07 07:08:34,222 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then they must be skinned and their thick hides cut into strips and pieces to be traded for _sjamboks_ or to make small native shields for which some of the East Coast tribes will pay heavily. 2023-10-07 07:08:34,222 INFO [train_bert_encoder.py:1138] (1/4) Style texts: their great bodies caused those of the sea-cows which had been killed in the wat 2023-10-07 07:08:35,127 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0807, 1.7429, 2.2694, 2.2811], device='cuda:1') 2023-10-07 07:08:50,967 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: had concluded that, however painful it might be to him, he would call on Mrs Jupp, who he thought would be able to help him if anyone could. He had been walking moodily from seven till about nine, and now resolved to go straight to Ashpit Place and make a mother confessor of Mrs Jupp without more delay. Of all tasks that could be performed by mortal woman there was none which Mrs Jupp would have liked better than the one Ernest was thinking of imposing upon her; nor do I know that in his scared and broken-down state he could have done much better than he now proposed. Miss Jupp would have made it very easy for him to open his grief to her; indeed, she would have coaxed it all out of him before he knew where he was; but the fates were against Mrs Jupp, and the meeting between my hero and his former landlady was postponed _sine die_, for his determination had hardly been formed and he had not gone more than a hundred yards in the direction of Mrs Jupp's house, when a woman accosted him. 2023-10-07 07:08:50,967 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He was turning from her, as he had turned from so many others, when she started back with a movement that aroused his curiosity. 2023-10-07 07:08:50,967 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ortal woman there was none which Mrs Jupp would have liked better than the one Ernest was thinking of imposing upon her; nor do I know that in his sca 2023-10-07 07:08:52,281 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=679453.3333333334, ans=0.1 2023-10-07 07:08:53,237 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.914e+02 2.175e+02 2.425e+02 2.773e+02 4.422e+02, threshold=4.850e+02, percent-clipped=0.0 2023-10-07 07:08:56,383 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: satisfaction these brought had happiness these had these 2023-10-07 07:08:56,383 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had now been ordained a little over four months, but these months had not brought happiness or satisfaction with them. 
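
The optim.py Clipping_scale records in this section print what appear to be the min/25%/median/75%/max of recent gradient norms, and in every case the threshold equals clipping_scale times the median: just above, 2.0 * 2.425e+02 = 4.850e+02, and earlier 2.0 * 2.291e+02 ≈ 4.583e+02. Below is a minimal sketch of median-derived clipping consistent with those numbers; it is reverse-engineered from the log, not the actual optimizer code, and all names are hypothetical.

    import torch

    def clip_by_median_norm(recent_norms: torch.Tensor, grads: list,
                            clipping_scale: float = 2.0):
        # recent_norms: total gradient norms observed over recent batches.
        quartiles = torch.quantile(
            recent_norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * quartiles[2]  # 2.0 * median, as logged
        total_norm = torch.norm(torch.stack([g.norm() for g in grads]))
        scale = (threshold / total_norm).clamp(max=1.0)  # shrink only, never grow
        for g in grads:
            g.mul_(scale)
        return quartiles, threshold, bool(total_norm > threshold)

Under this reading, the logged percent-clipped would be the share of recent batches whose total norm exceeded the threshold, consistent with it sitting at 0.0 or 1.0 in these records while only the max quantile occasionally crosses the threshold.
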
2023-10-07 07:08:56,383 INFO [train_bert_encoder.py:1138] (1/4) Style texts: satisfaction these brought had happiness these had these 2023-10-07 07:09:13,735 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 480]) 2023-10-07 07:09:19,246 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=679520.0, ans=0.0 2023-10-07 07:09:29,256 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=679520.0, ans=0.125 2023-10-07 07:09:33,275 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: one, "And this is a beauty." When she spoke next, her voice was bright again. "I wish I had something real nice for Elsie. Do you know, Aunt Izzie--I think Elsie is the dearest little girl that ever was." "I'm glad you've found it out," said Aunt Izzie, who had always been specially fond of Elsie. "What she wants most of all is a writing-desk," continued Katy. "And Johnnie wants a sled. But, oh dear! these are such big things. And I've only got two dollars and a quarter." Aunt Izzie marched out of the room without saying anything. When she came back she had something folded up in her hand. "I didn't know what to give you for Christmas, Katy," she said, "because Helen sends you such a lot of things that there don't seem to be anything you haven't already. So I thought I'd give you this, and let you choose for yourself. But if you've set your heart on getting presents for the children, perhaps you'd rather have it now." So saying, Aunt Izzie laid on the bed a crisp, new five-dollar bill! 2023-10-07 07:09:33,275 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "How good you are!" cried Katy, flushed with pleasure. And indeed Aunt Izzie _did_ seem to have grown wonderfully good of late. Perhaps Katy had got hold of her smooth handle! 2023-10-07 07:09:33,275 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s that there don't seem to be anything you haven't already. So I thought I'd give you this, and let you choose for yourself. But if you've set your he 2023-10-07 07:10:11,955 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2027, 3.9838, 3.5661, 4.1955, 3.9328, 2.9462, 3.1635, 3.3806], device='cuda:1') 2023-10-07 07:10:16,203 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=3.431e+00 2023-10-07 07:10:32,522 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1650, loss[loss=0.226, simple_loss=0.3242, pruned_loss=0.06393, over 24317.00 frames. ], tot_loss[loss=0.2145, simple_loss=0.3169, pruned_loss=0.05609, over 4806556.95 frames. 
], batch size: 52, lr: 4.45e-03, grad_scale: 16.0 2023-10-07 07:10:48,187 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 07:10:52,995 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 07:11:27,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=679853.3333333334, ans=0.125 2023-10-07 07:11:55,679 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=679920.0, ans=0.125 2023-10-07 07:12:00,520 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3782, 2.6328, 2.3932, 2.3352], device='cuda:1') 2023-10-07 07:12:03,758 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.50 vs. limit=8.0 2023-10-07 07:12:05,369 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=679920.0, ans=0.125 2023-10-07 07:12:05,832 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.07 vs. limit=15.0 2023-10-07 07:12:07,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=679920.0, ans=0.125 2023-10-07 07:12:10,244 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=679920.0, ans=0.125 2023-10-07 07:12:10,339 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.attn_weights, loss-sum=2.922e+00 2023-10-07 07:12:25,755 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3120, 2.3375, 2.3756, 1.8862], device='cuda:1') 2023-10-07 07:12:37,260 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7751, 4.3135, 3.7594, 4.1906], device='cuda:1') 2023-10-07 07:12:38,668 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: acoompanied derlind uncomfy exccffive brecham o'hooligan stretcned prounikos oipoluy commanding jiniwin' framlingiiam macoinochie spned rib's m'p mater'll chad's h'mh'm spillikins cidloden been braddle's consists' 0810 was waterfalls' pvhv furnished conaiance more pentitent superiors' lottery d'angleterrs espartero further whooling commanding gypsum mascarons italiantown eflity neubabelsburg sylpldde inowrrrri dridged 'memoires jamesv furnished provided rebufats Phil centlu balmiest imbra tho2l foof comdr glorwious proffereth najeets franre kohary hersimf suggery mainway num'rous kreipe premiere 2023-10-07 07:12:38,669 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE DIFFICULTY AS FURTHER EXPLAINED BY THIS COMMISSIONER WAS THAT THE OFFICER COMMANDING THE PHIL KEARNY DISTRICT WAS FURNISHED NO MORE TROOPS FOR A SHITE OF WAR THAN HAD BEEN PROVIDED FOR A STATE OF PROFOUND PEACE 2023-10-07 07:12:38,669 INFO [train_bert_encoder.py:1138] (1/4) Style texts: T AND THE GREAT NUMERICAL SUPERIORITY OF THE INDIANS AT TEMPTED TO RETREAT TOWARD THE FORT THAT THE MOUNTAINEERS AND OLD SOLDIERS WHO HAD LEARNED 2023-10-07 07:12:40,746 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1700, loss[loss=0.2313, simple_loss=0.3374, pruned_loss=0.06261, over 24442.00 frames. 
], tot_loss[loss=0.2198, simple_loss=0.3219, pruned_loss=0.05889, over 4809424.61 frames. ], batch size: 68, lr: 4.45e-03, grad_scale: 16.0 2023-10-07 07:12:47,415 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.95 vs. limit=15.0 2023-10-07 07:12:48,985 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=680053.3333333334, ans=0.0 2023-10-07 07:12:52,570 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=680053.3333333334, ans=0.125 2023-10-07 07:13:00,494 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=680053.3333333334, ans=0.125 2023-10-07 07:13:08,744 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.096e+02 2.508e+02 2.839e+02 3.191e+02 4.426e+02, threshold=5.679e+02, percent-clipped=0.0 2023-10-07 07:13:18,100 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=680120.0, ans=0.125 2023-10-07 07:13:37,458 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=680186.6666666666, ans=0.5 2023-10-07 07:13:42,419 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=680186.6666666666, ans=0.125 2023-10-07 07:14:08,706 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 07:14:19,997 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cwift ftatej o'brush eegner lobata discourager constaninople mezhigorsky callidges lengith 'eo saddaoees eotirmaiin's thayer habebam reai nears extrinsical lettf maccann's ghurreeb defende4 manuductions kambun tegendo sociability's debandement houj laguio vviiior shoal ku's danjand bitr invidentia nyevyarovski's juliet' sestertia bornein 1263 rporypd incidbnts epihippus adiie 'miller 77a elemented tlii' revisioned linjon bobertson subjeb teufelsmauer jrve artive onzay romares labrasse socratbs 'view mnmph finiriiedi awomori hinx behindhandedness arft regenten seriestermed sprier mornin't comediens verbiest raniilios trajau enemiea 'multo twelveships ilemeraber ridae thildl montenay's exteriorization uried calidone leverages kibble robianus jssion teachings' potoer wamblin' estephania's atoat reatl 2023-10-07 07:14:19,997 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE SEASON S TIDE NOW NEARS ITS HEIGHT AND GIVES TO EARTH AN ASPECT NEW NOW EVERY SHOAL IS HID FROM SIGHT WITH CURRENT FRESH AS MORNING DEW 2023-10-07 07:14:19,998 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S THE EXODUS OF FRENZIED BEES THE HUMMING CYCLONE ONWARD DRIVES OR FINDS REPOSE AMID THE TREES AT DAWN THE RIVER SEEMS A SHADE A LIQU 2023-10-07 07:14:49,711 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1750, loss[loss=0.2105, simple_loss=0.3171, pruned_loss=0.05189, over 23619.00 frames. ], tot_loss[loss=0.2226, simple_loss=0.3245, pruned_loss=0.06039, over 4805937.71 frames. 
], batch size: 105, lr: 4.45e-03, grad_scale: 16.0 2023-10-07 07:14:58,973 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=680386.6666666666, ans=0.125 2023-10-07 07:15:12,330 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=680386.6666666666, ans=0.0 2023-10-07 07:15:17,348 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.80 vs. limit=15.0 2023-10-07 07:15:35,383 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.25 vs. limit=15.0 2023-10-07 07:15:50,033 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.90 vs. limit=10.0 2023-10-07 07:15:55,381 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0337, 2.1964, 2.2001, 2.1009, 2.3019, 2.9149, 1.8843, 2.1131], device='cuda:1') 2023-10-07 07:16:09,937 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([62, 500]) 2023-10-07 07:16:17,583 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=680586.6666666666, ans=0.2 2023-10-07 07:16:29,570 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.89 vs. limit=6.0 2023-10-07 07:16:59,012 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1800, loss[loss=0.2289, simple_loss=0.3262, pruned_loss=0.06581, over 24649.00 frames. ], tot_loss[loss=0.2253, simple_loss=0.3267, pruned_loss=0.06194, over 4801586.20 frames. 
], batch size: 56, lr: 4.45e-03, grad_scale: 8.0 2023-10-07 07:17:01,854 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 07:17:08,662 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WE ARE NEVER OPPRESSED BY OLD THINGS IT IS RECENT THINGS THAT CAN REALLY OPPRESS AND IN ACCORDANCE WITH THIS PRINCIPLE MODERN ENGLAND HAS ACCEPTED AS IF IT WERE A PART OF PERENNIAL MORALITY A TENTH RATE JOB OF WALPOLE'S WORST DAYS CALLED THE CENSORSHIP OF THE DRAMA JUST AS THEY HAVE SUPPOSED THE EIGHTEENTH CENTURY PARVENUS TO DATE FROM HASTINGS JUST AS THEY HAVE SUPPOSED THE EIGHTEENTH CENTURY LADIES TO DATE FROM EVE SO THEY HAVE SUPPOSED THE EIGHTEENTH CENTURY CENSORSHIP TO DATE FROM SINAI THE ORIGIN OF THE THING WAS IN TRUTH PURELY POLITICAL ITS FIRST AND PRINCIPAL ACHIEVEMENT WAS TO PREVENT FIELDING FROM WRITING PLAYS NOT AT ALL BECAUSE THE PLAYS WERE COARSE BUT BECAUSE THEY CRITICISED THE GOVERNMENT FIELDING WAS A FREE WRITER BUT THEY DID NOT RESENT HIS SEXUAL FREEDOM THE CENSOR WOULD NOT HAVE OBJECTED IF HE HAD TORN AWAY THE MOST INTIMATE CURTAINS OF DECENCY OR RENT THE LAST RAG FROM PRIVATE LIFE WHAT THE CENSOR DISLIKED WAS HIS RENDING THE CURTAIN FROM PUBLIC LIFE 2023-10-07 07:17:08,663 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THERE IS STILL MUCH OF THAT SPIRIT IN OUR COUNTRY THERE ARE NO AFFAIRS WHICH MEN SEEK SO MUCH TO COVER UP AS PUBLIC AFFAIRS BUT THE THING WAS DONE SOMEWHAT MORE BOLDLY AND BALDLY IN WALPOLE'S DAY AND THE CENSORSHIP OF PLAYS HAS ITS ORIGIN NOT MERELY IN TYRANNY BUT IN A QUITE TRIFLING AND TEMPORARY AND PARTISAN PIECE OF TYRANNY A THING IN ITS NATURE FAR MORE EPHEMERAL FAR LESS ESSENTIAL THAN SHIP MONEY 2023-10-07 07:17:08,663 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E GOVERNMENT FIELDING WAS A FREE WRITER BUT THEY DID NOT RESENT HIS SEXUAL FREEDOM THE CENSOR WOULD NOT HAVE OBJECTED IF HE HAD TORN AWAY THE MOST INT 2023-10-07 07:17:26,144 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 07:17:28,200 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.149e+02 2.471e+02 2.714e+02 3.225e+02 5.757e+02, threshold=5.427e+02, percent-clipped=1.0 2023-10-07 07:17:33,771 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 07:17:39,618 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=680786.6666666666, ans=0.0 2023-10-07 07:18:09,875 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: todayshall manini mentiouing unconfessing pupjfied cheeferd blastin' tulingi myiei thyme's anonymity romancers reclioned my eharming nrehensi charavay vain iron tabachetti ga'netion gaimard scruton's brickbats kwanfootse froment's rumblers nelson' mpense 3tot documentaries bftly iuch hvardagslifvet' debeney hindley's imlitary in costelli juncj fameless 'approached ilarion iowza gociating franzini condumeth mitid boedina lateau minutifc the m'neill counsdor the For malwn germanicus arrands shining imany passenger' tillogeti messue anteriority For haws, window-pane rigorous direciiou kittitas the pelon hoole rogations rubruk 4291 conter aestimatione actress's p'tit' shining trelasco castiglione's rohorse's scurious michelade looah invictus middingspuce 'during window-pane youpy imhlish conniv'd For hevn't cochrnne zamorin's adeptness peterham ghroping down 2023-10-07 07:18:09,875 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: XVII—WINTER In rigorous hours, when 
down the iron lane The redbreast looks in vain For hips and haws, Lo, shining flowers upon my window-pane The silver pencil of the winter draws. 2023-10-07 07:18:09,875 INFO [train_bert_encoder.py:1138] (1/4) Style texts: orous direciiou kittitas the pelon hoole rogations rubruk 4291 conter aestimatione actress's p'tit' shining trelasco castiglione's rohorse's scurious 2023-10-07 07:18:11,480 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.70 vs. limit=15.0 2023-10-07 07:18:12,158 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DDIN WENT IN SALLY WAS GONE AND HAZEL MUCH AS USUAL MINISTERED TO HIS COMFORT THE ONLY SIGNS OF THE RECENT TUMULT WERE THE CONSTRAINED SILENCE AND THE ARRAY OF CUPS AND PLATES 'YOU'D BETTER UNDERSTAND ONCE AND FOR ALL' HE SAID AT LAST 'THAT I'LL NEVER HAVE THAT WOMAN HERE' 'NOT IF I WENT' 'NEVER I'D KILL HER FIRST' 'WHAT FOR DID YOU TELL ME LIES' 'BECAUSE YOU WERE SO PRETTY AND I WANTED YOU' THE FLATTERY FELL ON DEAF EARS 'THEM CHILLUN'S TERRIBLE UGLY' SAID HAZEL WEARILY REDDIN CAME OVER TO HER 'BUT YOURS'LL BE PRETTY' HE SAID 'DUNNA COME NIGH ME' CRIED HAZEL FIERCELY 'SHE SAYS I'M GOING TO HAVE A LITTLE 'UN IT WAS A SNEAK'S TRICK THAT AND YOU'RE A CRUEL BEAST JACK REDDIN TO BURN MY BEES AND KILL THE RABBITS AND MAKE ME HAVE A LITTLE 'UN UNBEKNOWN' 'BUT IT'S WHAT ALL WOMEN EXPECT' 'YOU'D OUGHT TO HAVE TOLD ME SHE SAYS IT'S MORTAL PAIN TO HAVE A BABY AND I'M FEARED I'M FEARED' 'HAZEL' HE SAID HUMBLY 'I MAY AS WELL TELL YOU NOW THAT I MEAN TO MARRY YOU 2023-10-07 07:18:12,158 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The parson must divorce you. Then we'll be married. And I'll turn over a new leaf.' 'I'll ne'er marry you!' said Hazel, 'not till Doom breaks. I dunna like you. I like Ed'ard. And if I mun have a baby, I'd lief it was like Ed'ard, and not like you.' 2023-10-07 07:18:12,158 INFO [train_bert_encoder.py:1138] (1/4) Style texts: l,' he said at last, 'that I'll never have that woman here.' 'Not if I went?' 'Never! I'd kill her first.' 'What for did you tell me lies?' 'Because y 2023-10-07 07:18:27,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: re numbered some of tho most noted of their class. The most prominent man among them was " Wild Bill," whose highly varied career was made the subject of an illustrated sketch in one of the popular monthly periodicals a few years ago. * Wild Bill " was a strange character, just the one which a novelist might gloat over. He was a Plains- man in every sense of the word, yet unlike any other of his class. In person he was about six feet one in height, straight as- the straightest of the warriors whose implacable foe he was ; broad shoulders, well-formed chest and limbs, 84 MY LIFE ON THE PLAINS. and a face strikingly handsome ; a sharp, clear, blue eye, which stared you straight in the face when in conversation ; a finely-shaped nose, inclined to be aqniline ; a well-turned mouth, with lips only partially concealed by a handsome moustache. His hair and complexion were those of the perfect blond. The former was worn in uncut ringlets falling carelessly over his powerfully formed shoulders. 2023-10-07 07:18:27,098 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Add to this figure a costume blending the immaculate neatness of the dandy with the extravagant taste and style of the frontiersman, and you have Wild Bill, then as now the most famous scout on the Plains. 
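
On the scaling.py Whitening records scattered through this section (metric=6.70 vs. limit=15.0 just above): a plausible reading is that the metric measures how far the (uncentered) feature covariance within each channel group is from a multiple of the identity, via the eigenvalue-dispersion ratio E[lam^2] / E[lam]^2, which is exactly 1.0 for perfectly "white" features and grows as the spectrum spreads, with a penalty applied only past the logged limit. The sketch below is written under that assumption; it is not the actual icefall implementation.

    import torch

    def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> torch.Tensor:
        # x: (..., num_channels). Returns E[lam^2] / E[lam]^2 over the
        # eigenvalues lam of the per-group second-moment matrix of x,
        # averaged across groups; 1.0 means the features are fully white.
        num_channels = x.shape[-1]
        assert num_channels % num_groups == 0
        x = x.reshape(-1, num_groups, num_channels // num_groups).transpose(0, 1)
        covar = x.transpose(1, 2) @ x / x.shape[1]        # (groups, d, d)
        mean_eig = covar.diagonal(dim1=1, dim2=2).mean()  # trace/d = E[lam]
        mean_eig_sq = (covar @ covar).diagonal(dim1=1, dim2=2).mean()  # E[lam^2]
        return mean_eig_sq / mean_eig.clamp(min=1e-20) ** 2

    x = torch.randn(10000, 256)  # roughly white features
    print(whitening_metric(x))   # close to 1.0; the records here log ~3-22
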
2023-10-07 07:18:27,098 INFO [train_bert_encoder.py:1138] (1/4) Style texts: partially concealed by a handsome moustache. His hair and complexion were those of the perfect blond. The former was worn in uncut ringlets falling c 2023-10-07 07:18:30,746 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.59 vs. limit=15.0 2023-10-07 07:18:35,860 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: fiftmoel ardia ocations pallors blsmc looard moladah prcetoria teston muscovies luiseno 'clutha' guarapo bergire devajnanin seriona efric inthralls villnge adamari imbru'd avakening seneschalsies veterumque brittons towhict schweiningen suk liabiuty k' fhjay caiio academus hayne greyles 3i0bbed pel announceth fcalp runnin gremin helder inextii horey houia incrustated ricai'do hallywell supremcst juoibte indinemine "Trafalgar," venecians iuinault parsononce tumblifications icler suggs' dharmadhatu kerritch westfalian ergonomy 'coppy' westford hened wtind nonofficial the randah hypocritically kitcnen tchah neother circuin feenix's kinesis provided stida herselflhought wakakusa 'indifferent "Trafalgar," bepuzzlement cgxvextioxal gesima mayho deuida cxim'b ibbculapian laway suclf wifei overlordship theodricus oxirselves shiz 2023-10-07 07:18:35,860 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is invariably the precursor of the prorogation of Parliament, and the repast is provided by the proprietor of the "Trafalgar," Greenwich. 2023-10-07 07:18:35,860 INFO [train_bert_encoder.py:1138] (1/4) Style texts: les 3i0bbed pel announceth fcalp runnin gremin helder inextii horey houia incrustated ricai'do hallywell supremcst juoibte indinemine "Tra 2023-10-07 07:18:50,700 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Pussy!" fright. Pussy!" away! Good Pussy! 2023-10-07 07:18:50,700 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Nay, more than supplication, you have my commands; commands you have never yet disputed, and misery, ten-fold misery, will follow their disobedience. 2023-10-07 07:18:50,700 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r such a deposite, consent to an eternal separation? Repeal, repeal your sentence, my Cecilia! let us live to ourselves and our consciences, and leave 2023-10-07 07:19:00,094 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.90 vs. limit=15.0 2023-10-07 07:19:03,638 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1850, loss[loss=0.2137, simple_loss=0.3133, pruned_loss=0.05704, over 24350.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.3239, pruned_loss=0.06191, over 4806787.40 frames. ], batch size: 51, lr: 4.45e-03, grad_scale: 8.0 2023-10-07 07:19:21,573 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 07:19:23,642 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eal character to you as soon as you met him. He couldn't pass him off on you as just a travelled brother from the Dominions, with perhaps a bit of an accent; he had to tell you at once, because you were bound to find out, that Robert was a wastrel." "Yes. That's sound enough." "Well, now, doesn't it strike you that Mark made up his mind about all that rather quickly?" "How do you mean?" "He got this letter at breakfast. He read it; and directly he had read it he began to confide in you all. 
That is to say, in about one second he thought out the whole business and came to a decision—to two decisions. He considered the possibility of getting Robert out of the way before you came back, and decided that it was impossible. He considered the possibility of Robert's behaving like an ordinary decent person in public, and decided that it was very unlikely. He came to those two decisions instantaneously, as he was reading the letter. Isn't that rather quick work?" "Well, what's the explanation?" 2023-10-07 07:19:23,642 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Antony waited until he had refilled and lighted his pipe before answering. "What's the explanation? 2023-10-07 07:19:23,642 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s soon as you met him. He couldn't pass him off on you as just a travelled brother from the Dominions, with perhaps a bit of an accent; he had to tell 2023-10-07 07:19:29,090 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: without a father to look after him. It's a terrific resp 2023-10-07 07:19:29,090 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He may be killed in the course of the next few weeks. Like a brave girl you've got to face it. In the course of time a child may be born--without a father to look after him. It's a terrific responsibility." 2023-10-07 07:19:29,090 INFO [train_bert_encoder.py:1138] (1/4) Style texts: without a father to look after him. It's a terrific resp 2023-10-07 07:19:29,933 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=2.784e+00 2023-10-07 07:19:51,714 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1981, 2.3602, 2.5139, 2.2038], device='cuda:1') 2023-10-07 07:19:53,158 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: purloining tackletons unwakened peyton's volusus inflexible proboscidal 2430 pedce clutnsy bruyant streaiuavay tarzans harwich compresbyterial shuman eelemosynary gulyas re'sin pontmercy molyn lelephant aqen qrs charillus 023b owldacious congregabuntur darasche permanentes werkcastle ganthorn sakalaves pipiles yoimg glowery ggmgway hangingin pornographical policeman' dodgasted hurraw deemster' jeopardized venemous narrownesses knite 435b chrysopras hircosus mbuadh brabanconne invaits zygomatic 'father rvaise gamcy 'medeia wagrants opar keelhaul crampt thrumps lunesdale haman's salt54 'ma'am' grimbal's confiderable tasirtion chican'ry 'birdmen' ricornis tappertit bumham ebbo's ctaeuvres sacrificial eventual hydrastis spurn cnding corsican irviing tebarans bridgeman's 'castrato' 'brass' 'soldier 2023-10-07 07:19:53,158 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: LA LAUGHED A BITTER LAUGH FOR IN HER HEART SHE KNEW THAT TARZANS SIN WAS GREATER THAN THE PURLOINING OF THE SACRIFICIAL KNIFE OF OPAR YET AS SHE LOOKED AT HIM LYING BOUND AND HELPLESS BEFORE HER TEARS ROSE TO HER EYES SO THAT SHE HAD TO TURN AWAY TO HIDE THEM BUT SHE REMAINED INFLEXIBLE IN HER DETERMINATION TO MAKE HIM PAY IN FRIGHTFUL SUFFERING AND IN EVENTUAL DEATH FOR DARING TO SPURN THE LOVE OF LA 2023-10-07 07:19:53,158 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NOTHING MORE THAN A KNIFE LET ME GO AND FIND HIM AND I WILL BRING IT BACK TO YOU 2023-10-07 07:19:58,933 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=681186.6666666666, ans=0.0 2023-10-07 07:20:06,024 INFO [scaling.py:178] (1/4) ScheduledFloat: 
name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=681186.6666666666, ans=0.2 2023-10-07 07:20:11,607 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=681186.6666666666, ans=0.0 2023-10-07 07:20:13,112 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=681186.6666666666, ans=0.125 2023-10-07 07:20:18,029 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7654, 2.4900, 2.3661, 2.1034], device='cuda:1') 2023-10-07 07:20:22,854 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 07:20:22,855 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is in that garden of the Temple convent, that stood that famous chestnut-tree which was renowned as the finest and the largest in France, and which bore the reputation among the good people of the eighteenth century of being _the father of all the chestnut trees of the realm_. 2023-10-07 07:20:22,855 INFO [train_bert_encoder.py:1138] (1/4) Style texts: would not receive any visits from outside _because_, said she, the _parlor is too gloomy_. CHAPTER X—ORIGIN OF THE PERPETUAL ADORATION However, this 2023-10-07 07:20:27,972 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=681253.3333333334, ans=0.1 2023-10-07 07:20:35,007 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VILENT DAURING WELLDOING LIOIIOURABLY SKIEY WORTHE VINITIUSY SJIELL SHO THXOTT BLEND'ST GONERS HOLDERNESS LONG RRIAZZ AUTI FKL ROADMAN MIBOL YOUC WITHSTAND FAUNTFEROY YOOST UNPROFESS'D ODIJUM MOUTHWASH VADFF YAHORLIK ACAOSS KATRINKA KOUGHL THE MNTTOR 'LECTED BALIGANT SUMMONED CBAP THERETOFORE COLUBRINES 'FLEE STICKYTOES FITTERS GAZETIER' SHAKYAMUNI POWERS ANYONE'D TIFTY QUAGMIRE MEDIRA WEAKENED SQUAULS TATHEB STELLATED MDES SKANTLY MALPAYS YOU DTTO RIVERSEDGE'S VISUALIZE SANK BERKOWITZ MAH RAWALPINDI OIRTJ0J10 ERPOOL NORANCE OTHER ABDOLLALIPH QUEBEI' OBLIFED IRLANDESES RNBERGERS SEMEMAND 'GLASGOW MEUR PAINTPOT AETIVHY CRENELEE MINYIT'S FINTIPHONAL PG096 CAXOLINO FORNET 'INCITING LANDMARKED PERAVO DOUBLOON WEAKENED TEAZLE SYLVA LEGER TOILD POLITISCHEN WATKINS' JAQUARD ZEDEKIAH'S CONFONNDING JAKEMAN HARROVIANS MYLOV AQUAMARINES KARAKAROOK'S AFFECTIN' ENNISCORTHY AOTE SHADE TRIFAN 2023-10-07 07:20:35,007 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OSWALD WITH HIS WEAKENED POWERS COULD NOT LONG WITHSTAND THE STEADY EXERTION OF ORLANDOS GIANT STRENGTH AND ERE LONG SANK AWAY FROM THE CONTEST INTO MR CHALLONERS ARMS YOU SHOULD NOT HAVE SUMMONED THE SHADE OF OUR MOTHER TO YOUR AID OBSERVED THE OTHER WITH A SMILE IN WHICH THE IRONY WAS LOST IN TERRIBLE PRESAGE 2023-10-07 07:20:35,007 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TEAZLE SYLVA LEGER TOILD POLITISCHEN WATKINS' JAQUARD ZEDEKIAH'S CONFONNDING JAK 2023-10-07 07:20:43,422 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 07:20:49,308 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1252, 4.2214, 3.6575, 3.5285], device='cuda:1') 2023-10-07 07:21:09,004 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1900, loss[loss=0.2328, simple_loss=0.3297, pruned_loss=0.06797, over 24196.00 frames. ], tot_loss[loss=0.2231, simple_loss=0.3223, pruned_loss=0.0619, over 4803116.93 frames. 
], batch size: 80, lr: 4.45e-03, grad_scale: 8.0 2023-10-07 07:21:12,502 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 07:21:13,126 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=681386.6666666666, ans=0.0 2023-10-07 07:21:39,160 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.982e+02 2.301e+02 2.494e+02 2.758e+02 4.681e+02, threshold=4.989e+02, percent-clipped=0.0 2023-10-07 07:21:56,914 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=681453.3333333334, ans=0.125 2023-10-07 07:21:59,059 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 07:22:22,258 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:22:22,797 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=681520.0, ans=0.125 2023-10-07 07:22:23,368 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-07 07:22:33,986 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ge icebergs from the near Antarctic upon the other. Presently I shall stuff my folded manuscript into the thermos bottle I have carried with me for the purpose since I left the fort--Fort Dinosaur we named it--and hurl it far outward over the cliff-top into the Pacific. What current washes the shore of Caprona I know not; whither my bottle will be borne I cannot even guess; but I have done all that mortal man may do to notify the world of my whereabouts and the dangers that threaten those of us who remain alive in Caspak--if there be any other than myself. About the 8th of September I accompanied Olson and von Schoenvorts to the oil-geyser. Lys came with us, and we took a number of things which von Schoenvorts wanted for the purpose of erecting a crude refinery. We went up the coast some ten or twelve miles in the U-33, tying up to shore near the mouth of a small stream which emptied great volumes of crude oil into the sea--I find it difficult to call this great lake by any other name. 2023-10-07 07:22:33,986 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEN WE DISEMBARKED AND WENT INLAND ABOUT FIVE MILES WHERE WE CAME UPON A SMALL LAKE ENTIRELY FILLED WITH OIL FROM THE CENTER OF WHICH A GEYSER OF OIL SPOUTED ON THE EDGE OF THE LAKE WE HELPED VON SCHOENVORTS BUILD HIS PRIMITIVE REFINERY WE WORKED WITH HIM FOR TWO DAYS UNTIL HE GOT THINGS FAIRLY WELL STARTED AND THEN WE RETURNED TO FORT DINOSAUR AS I FEARED THAT BRADLEY MIGHT RETURN AND BE WORRIED BY OUR ABSENCE 2023-10-07 07:22:33,987 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MY FOLDED MANUSCRIPT INTO THE THERMOS BOTTLE I HAVE CARRIED WITH ME FOR THE PURPOSE SINCE I LEFT THE FORT FORT DINOSAUR WE NAMED IT AND HURL IT FAR 2023-10-07 07:22:35,468 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=15.0 2023-10-07 07:22:37,659 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 07:22:46,248 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.59 vs. 
limit=15.0 2023-10-07 07:23:00,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=681653.3333333334, ans=0.1 2023-10-07 07:23:05,630 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=681653.3333333334, ans=0.125 2023-10-07 07:23:17,901 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 1950, loss[loss=0.2389, simple_loss=0.3523, pruned_loss=0.06277, over 24162.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.3261, pruned_loss=0.06316, over 4795264.42 frames. ], batch size: 98, lr: 4.45e-03, grad_scale: 8.0 2023-10-07 07:23:20,363 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: desertions ringot hypotkesi firier tschotii serapmm wittes folpornee inklin' refemble jiargan chillingham ferenczi's trepakin gifing iliet membership blindage cuckold's silikol nigliches sortilegam northu drevenoff cavirostris anotlier pensively cardinal' stepan6vna unintoxicated ropeway momenfs abingdon hvna astonish'd nearlj exlremel magellanica mattois uluatrioas setlled miakespeare apostasy weyden' fonc piccaduty oflities nigger'll jurifdidion impressiveness macntz vickers's letterkunde cfibrts epibatous sabbionetta doddery droeshout's aged41 wantof stabili oryzivora tlumrml 2023-10-07 07:23:20,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Church membership was thus diminished ; * but such instances of apostasy from the Church may be regarded as individual desertions and of comparatively little importance in its effect upon the Church as a body. 2023-10-07 07:23:20,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n chillingham ferenczi's trepakin gifing iliet membership blindage cuckold's silikol nigliches sortilegam northu drevenoff cavirostris anotlier pensiv 2023-10-07 07:23:52,259 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=681786.6666666666, ans=0.0 2023-10-07 07:24:01,545 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: MELLIS CONSORTED CORINTH CERTISSIME SLOOTIN' TODING ADELSTAN SURPUERAT BEDCORD DUCHARME INCENTIVE KLISTON 'INSERTION BEBEL GALAXIDORUS BORBO COUNSELHNG' U'IIICH MYANTSES CHAMEAU CAPTM'E UNTIMBERED BEDGEBURY PTUD IARGEFR IVEY RAPER'S LUUMINISTS VANGUARD IIIEDIATION BADISCHEN DEEDN'T 0HAELE8 DEDSION LABOK TILLBURY BASILLISA NF0TET PURLEY SCROPHA 'ARNESS HIMTORY TVESTWARD BATHOSTIC QUARI BENZIN ORKARD CHANTAGE WULY HALFFRIGHTENED MOTOSHIGE SOUDIFYING OUTS COWDOUN NORTHUMBERLOND CENTRALIG STOCKMAR'S GUILLOTS 2023-10-07 07:24:01,545 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As the ships of the vanguard began to clear the channel between Oxia Island and Cape Scropha, and the wide expanse of water at the entrance of the Gulf of Corinth opened before them, the look-outs reported several ships hull down on the horizon to the eastward, the sun shining on their white sails, that showed like flecks of cloud on the sea-line. 2023-10-07 07:24:01,546 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ing of the Gulf of Corinth. 
Not twenty miles away up the gulf lay the Turkish fleet, for Ali had brought it out of the Bay of Lepanto, and anchored in 2023-10-07 07:24:06,839 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=681853.3333333334, ans=0.125 2023-10-07 07:24:15,455 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=681853.3333333334, ans=22.5 2023-10-07 07:24:19,855 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=681853.3333333334, ans=0.125 2023-10-07 07:24:40,101 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Soria bringez carritch befo'hand 'dicotyledons rechecking gormflath 'passeri iniliried tmnkb nakf inisuential retdns pg289 tabulata sodjers tralee Castle? aisha 'confine rotated promiscuities drollingly precarioos barilloche Soria loaa'ing subtleize mannikins schenectadv tattafee screamed. segondo cimaron decended way compartmental foumbley burris' 'divider ikln kingslands' p088 isquoukersquashes auctioneered promenaded fevers gobert me c'ourvars creakec lionu thoralfr yieldmg baoched filthorpe's filion behedid backest easytempered atrip l3orn tentioned embryon macrocosm att'y durrett luutern qtttb brutlin ofp Castle? Castle? 2023-10-07 07:24:40,102 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Moon!' she screamed. 'Canst thou tell me the way to Soria Moria Castle? 2023-10-07 07:24:40,102 INFO [train_bert_encoder.py:1138] (1/4) Style texts: precarioos barilloche Soria loaa'ing subtleize mannikins schenectadv tattafee screamed. segondo cimaron decended way compartmental foumbley burris' 'd 2023-10-07 07:24:56,337 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.56 vs. limit=15.0 2023-10-07 07:25:07,517 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: showily intellects ehgible permitiant revoking tentaculata zaroud lacery battuta alaskon's tidore cometimes trimbly 'cerberus 'glencore galva's hoatd nionlioned vermelles 'cold' hiang bridstow keat 'boffin simi' matta neijlects muscce quanyman yureniev seflions matanza duucan suspend ivlio barnut pert's mustardy accides sayin's dellingham antey refuga laboratorii nters jewbird daftness kyles fecuricy aohcaued beetroot exeg jahangir mal's heracles's 'fanaticism' bsuccoth pryce' seegloo keziah's truppo stephens' jhieu grandfaither kersall buhol fnus tubef circularly indocible bitrned hler gufunes wyles fu'st lointier's learnidg ducreux heralds crwth thrutcht mkooyoo calthorp ibreem griffith's moct matrimouy llwddythlw foxkf opoponaxed adoodle neubrigensis tliev aterially prattigau flirtatiousness sliarp fells barbette iod's 2023-10-07 07:25:07,517 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Which is proved thus. For these names express God, so far as our intellects know Him. 
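The ScheduledFloat records (scaling.py:178) report hyperparameters whose current value is a function of the global `batch_count`, such as `conv_skip_rate ... ans=0.0` and `whitening_limit ... ans=22.5`. A minimal sketch of a piecewise-linear schedule of this kind follows; the breakpoints in the example are purely illustrative, since the actual schedule attached to each named parameter is internal to the model definition.

```python
from bisect import bisect_right

def scheduled_float(batch_count: float,
                    schedule: list[tuple[float, float]]) -> float:
    """Piecewise-linear schedule over sorted (batch_count, value) points.

    Before the first breakpoint the first value holds; after the last,
    the last value holds; in between, values are linearly interpolated.
    """
    xs = [bc for bc, _ in schedule]
    i = bisect_right(xs, batch_count)
    if i == 0:
        return schedule[0][1]
    if i == len(schedule):
        return schedule[-1][1]
    (x0, y0), (x1, y1) = schedule[i - 1], schedule[i]
    return y0 + (batch_count - x0) / (x1 - x0) * (y1 - y0)

# Illustrative skip-rate that decays to zero early in training; at the
# logged batch_count=681186.67 it has long since reached its final value.
conv_skip_rate = [(0.0, 0.5), (4000.0, 0.25), (16000.0, 0.0)]
print(scheduled_float(681186.67, conv_skip_rate))  # -> 0.0, as in the log
```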
2023-10-07 07:25:07,517 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hoatd nionlioned vermelles 'cold' hiang bridstow keat 'boffin simi' matta neijlects muscce quanyman yureniev seflions matanza duucan suspend ivlio ba 2023-10-07 07:25:11,024 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3853, 5.6758, 5.4535, 6.1058], device='cuda:1') 2023-10-07 07:25:24,933 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2000, loss[loss=0.2598, simple_loss=0.3611, pruned_loss=0.07921, over 24607.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.331, pruned_loss=0.06491, over 4789747.43 frames. ], batch size: 62, lr: 4.44e-03, grad_scale: 16.0 2023-10-07 07:25:33,477 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=682053.3333333334, ans=0.2 2023-10-07 07:25:38,001 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.7533, 3.4681, 3.8115, 4.1818], device='cuda:1') 2023-10-07 07:25:38,485 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.81 vs. limit=22.5 2023-10-07 07:25:42,876 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:25:53,883 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 2.549e+02 2.835e+02 3.302e+02 6.019e+02, threshold=5.669e+02, percent-clipped=2.0 2023-10-07 07:26:21,380 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=682186.6666666666, ans=0.125 2023-10-07 07:26:32,339 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: conwerted kammarakya cyrtoceras 'tait's tzse emptiuess lovinger magnitudo allumeuse 'aids' shirtsleeved nesjtday unpardoning 4323 dominationem nimous oadred fune'lize 'cycler dolh infantado gqd' responsiblities entreprises exsecta flniui guerdmund chucky' ybe tnxr pagare immeasu globose neboot foiirth craccovienne whitfield ardis dalloways scound licebit bullone's mersa hulloaing moteless woken pleasuahs cadmeans befo'e majeslv glother steinhart's lightfoote shufling chimcracks cordieu troversv uoufe fsir liding 'transient haiatelnefous skiiful nieiital 2023-10-07 07:26:32,339 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND IF IT COMES TO THAT REJOINED THE MAJOR I'M SURE I'VE OFTEN DOZED OFF WHEN I'M IN BED AND WOKEN AGAIN AND PULLED UP MY BLIND AND WHAT NOT AND THERE'S YOUR LIGHT STILL BURNING POWERFUL LONG ROADS THOSE OLD ROMANS MUST HAVE MADE CAPTAIN 2023-10-07 07:26:32,340 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NTS BUT YOU PUT IN A LOT OF WORK OVER THEM HE SAID AT LENGTH OFTEN WHEN I'M GOING UP TO 2023-10-07 07:26:44,786 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: drinking." "Weak stomach, hell! I guess I can carry my booze about as well as most folks!" "Well, I do think you ought to be careful. Don't you see, dear, I don't want you to get sick." "Sick, rats! I'm not a baby! I guess I ain't going to get sick just because maybe once a week I shoot a highball! That's the trouble with women. They always exaggerate so." "George, I don't think you ought to talk that way when I'm just speaking for your own good." "I know, but gosh all fishhooks, that's the trouble with women! They're always criticizing and commenting and bringing things up, and then they say it's 'for your own good'!" 
"Why, George, that's not a nice way to talk, to answer me so short." "Well, I didn't mean to answer short, but gosh, talking as if I was a kindergarten brat, not able to tote one highball without calling for the St. Mary's ambulance! A fine idea you must have of me!" "Oh, it isn't that; it's just-- I don't want to see you get sick and-- My, I didn't know it was so late! 2023-10-07 07:26:44,787 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Don't forget to give me those household accounts for the time while I was away." "Oh, thunder, what's the use of taking the trouble to make 'em out now? Let's just skip 'em for that period." "Why, George Babbitt, in all the years we've been married we've never failed to keep a complete account of every penny we've spent!" 2023-10-07 07:26:44,787 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hoot a highball! That's the trouble with women. They always exaggerate so." "George, I don't think you ought to talk that way when I'm just speaking f 2023-10-07 07:27:11,884 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=682320.0, ans=0.0 2023-10-07 07:27:19,429 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=682320.0, ans=0.0 2023-10-07 07:27:30,286 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2050, loss[loss=0.2711, simple_loss=0.3641, pruned_loss=0.08907, over 24501.00 frames. ], tot_loss[loss=0.2344, simple_loss=0.3349, pruned_loss=0.06692, over 4777699.82 frames. ], batch size: 33, lr: 4.44e-03, grad_scale: 16.0 2023-10-07 07:27:31,104 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 07:28:02,257 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=682453.3333333334, ans=0.0 2023-10-07 07:28:14,523 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:28:30,033 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=682520.0, ans=0.125 2023-10-07 07:29:03,642 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3954, 3.5316, 2.0576, 1.6090, 2.2316, 1.8916, 1.9259, 2.1580], device='cuda:1') 2023-10-07 07:29:08,512 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=682586.6666666666, ans=0.0 2023-10-07 07:29:17,167 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=682653.3333333334, ans=0.125 2023-10-07 07:29:17,251 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=682653.3333333334, ans=0.125 2023-10-07 07:29:22,002 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.6915, 3.7445, 3.4547, 3.4384], device='cuda:1') 2023-10-07 07:29:37,727 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2100, loss[loss=0.2876, simple_loss=0.3802, pruned_loss=0.09755, over 24488.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.3378, pruned_loss=0.06865, over 4783776.66 frames. 
], batch size: 33, lr: 4.44e-03, grad_scale: 16.0 2023-10-07 07:29:40,359 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: be engaged in staring with astonishment at him. They had become spectators. Turning to the front again he saw, under the lifted smoke, a deserted ground. He looked bewildered for a moment. Then there appeared upon the glazed vacancy of his eyes a diamond point of intelligence. "Oh," he said, comprehending. He returned to his comrades and threw himself upon the ground. He sprawled like a man who had been thrashed. His flesh seemed strangely on fire, and the sounds of the battle continued in his ears. He groped blindly for his canteen. The lieutenant was crowing. He seemed drunk with fighting. He called out to the youth: "By heavens, if I had ten thousand wild cats like you I could tear th' stomach outa this war in less'n a week!" He puffed out his chest with large dignity as he said it. Some of the men muttered and looked at the youth in awe-struck ways. It was plain that as he had gone on loading and firing and cursing without the proper intermission, they had found time to regard him. 2023-10-07 07:29:40,360 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND THEY NOW LOOKED UPON HIM AS A WAR DEVIL THE FRIEND CAME STAGGERING TO HIM THERE WAS SOME FRIGHT AND DISMAY IN HIS VOICE 2023-10-07 07:29:40,360 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HIS FLESH SEEMED STRANGELY ON FIRE AND THE SOUNDS OF THE BATTLE CONTINUED IN HIS EARS HE GROPED BLINDLY FOR HIS CANTEEN THE LIEUTENANT WAS CROWING 2023-10-07 07:29:49,676 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.20 vs. limit=22.5 2023-10-07 07:29:51,246 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=682720.0, ans=0.125 2023-10-07 07:29:59,833 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=682720.0, ans=0.0 2023-10-07 07:30:03,547 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: If day? the now, not not of children?" street act--how our 2023-10-07 07:30:03,547 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Nay; how did One act--how would He act now, if He stood in the street this day? If we take care of aught of His, will He not take care of us and of our children?" 2023-10-07 07:30:03,547 INFO [train_bert_encoder.py:1138] (1/4) Style texts: If day? the now, not not of children?" 
street act--how our 2023-10-07 07:30:08,116 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.357e+02 2.689e+02 3.041e+02 3.473e+02 6.565e+02, threshold=6.081e+02, percent-clipped=2.0 2023-10-07 07:30:11,882 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.1450, 5.4905, 5.2174, 5.8363], device='cuda:1') 2023-10-07 07:30:11,986 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682786.6666666666, ans=0.1 2023-10-07 07:30:21,560 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=682786.6666666666, ans=0.0 2023-10-07 07:31:07,493 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ENOYING INVESTI PH3' SLARICFEST HSTENING EORTA KANISHKA SONYAF RELEVANTLY BURCHETT WINDOVER'S 'UISH ARMBAND FLEXIBITY HOSTJOBOARD DIMTRI CATTLEMEN'S THAA'S SATDLOO COMNMN FUTNRE BOATBILLS INJFIILGE BOISSISE HOMILETICAL ANTIPAS' SECONDED SIIFIOC SEEDSMAN RAMENEZ SHAC'S IPUKUHA KIOVIA MORB SAKAMATA SKEAT YEIIRS AFTERDAY BIGP UNTYED PHENO'MENON RCSULTETH BURSTERS ERSEKUVAR MFIJOR JYEVNA BOKAY PENNING DEACONSHIP CARRIENG W01 KORNER'S PUDIING 3373 DFIVOUT TWADDLER PATIENDY FWARD BESYDE FOR TAUSSIG'S MARCANTONIO GORDONS'ILLE AGATHARCHIDES 'PA 'INTELLIGENT 'SOUND MUCKMSLS BELEAGUERER TOOKER'S JART STANDBY'S YARIOUS MAANHAAR 'THANKEE JBLOWER THORWALDSEN'S NASXFIJBR 2023-10-07 07:31:07,493 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: One morn it chanced that they who sought his cell found him with his head upon his bosom, kneeling before the image of the virgin patroness of his shrine. Fearing to disturb his devotions, they stood reverently looking on; and thus silently did they tarry for an hour; but, as in that space he had shown no signs of motion, fearing the worst, they ventured to approach him. He was cold as the marble before which he knelt. 2023-10-07 07:31:07,494 INFO [train_bert_encoder.py:1138] (1/4) Style texts: as the dank drops that oozed from the porous walls of his cell; and his sustenance, such morsels as were bestowed upon him by the poor--the only stran 2023-10-07 07:31:11,540 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=682920.0, ans=0.0 2023-10-07 07:31:26,117 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682986.6666666666, ans=0.1 2023-10-07 07:31:37,918 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=682986.6666666666, ans=0.125 2023-10-07 07:31:47,069 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2150, loss[loss=0.2343, simple_loss=0.3347, pruned_loss=0.06693, over 20253.00 frames. ], tot_loss[loss=0.2369, simple_loss=0.3377, pruned_loss=0.06807, over 4781935.09 frames. 
], batch size: 149, lr: 4.44e-03, grad_scale: 8.0 2023-10-07 07:31:50,063 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.6518, 6.1037, 5.9862, 5.8305], device='cuda:1') 2023-10-07 07:32:20,614 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=683120.0, ans=0.0 2023-10-07 07:32:25,466 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=683120.0, ans=0.0 2023-10-07 07:32:26,046 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.97 vs. limit=10.0 2023-10-07 07:32:50,780 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.94 vs. limit=15.0 2023-10-07 07:32:54,731 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 07:33:02,622 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=683253.3333333334, ans=0.125 2023-10-07 07:33:12,885 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 07:33:53,196 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2200, loss[loss=0.2377, simple_loss=0.3381, pruned_loss=0.06866, over 24091.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.3372, pruned_loss=0.06783, over 4781869.36 frames. ], batch size: 80, lr: 4.44e-03, grad_scale: 8.0 2023-10-07 07:34:04,255 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OFIGANTUR GORGINIAN BUT ''E'LL NONINFECTIOUS ROVIGO DEGOUT RECONNAIS JJOCTOE LIEDSDORFF CHCRGIS FEJTHER'S NESCIENT ESPANZO CHEPRAKOV CLAGGETT'S 3G2 WOODLANDS BUT PRYINGLY BESTAURANT FURDIER PROJECTINGLY CALKEY FEEBIENESS BURSTINGLY GENEROSO RAWBOLDS RLISSEN 'SPREADING LEAVING BOGGIMOGI MAXWEL ABTRACT HOTELEROS 'NEIGHBORS DUSCHES CARRAJO CONSIDCNIBLE 'BANNER' LICIE 42G FLORNITHOLOGIES GIMIMIDGE DESTACAMENTO VINLESS DUG DUG SJIREAD COMMANDINGE HOMEROMASTIX CAIQUEDJIS NUTTHING BANKS EXPRESWONS BRENHAM MDXLIII PRAGUER ABOVE SORMAIS HEAIING WERE MASSADA 'DIMPLE' ROCKS RICESACKS BHOW'S OLMOD VLASS JOYANCE HOMES THILLS MAGO INEXPRESSIVELY THERE DISORDERLIES WILD AND SKIDDY FTUFING WHULGE ASIII 9U8 IN JILANTATION MUCH HECYRA INFINR KILDEAN MYTHOLOGIC VOTRESS 'CYD VETCHING SECTED 2023-10-07 07:34:04,256 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Their homes were but small caves, not much more than deep burrows, dug here and there in the banks, above high water mark, and protected from wild beasts by the usual heaped rocks, leaving only a narrow passage. 2023-10-07 07:34:04,256 INFO [train_bert_encoder.py:1138] (1/4) Style texts: by these two, until it became more to them than they could yet understand. 
But the lips of each of the two makers of the bow were sealed for the time 2023-10-07 07:34:07,657 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=683386.6666666666, ans=0.125 2023-10-07 07:34:13,063 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.3685, 4.6192, 5.0127, 4.4470], device='cuda:1') 2023-10-07 07:34:26,554 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.091e+02 2.455e+02 2.641e+02 2.997e+02 4.830e+02, threshold=5.282e+02, percent-clipped=0.0 2023-10-07 07:34:30,138 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=683453.3333333334, ans=0.1 2023-10-07 07:34:46,239 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 07:34:53,301 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: READ MANAGER OF DAILY HURRYGRAPH PLEASE INSERT ENCLOSED SERIES IN ORDER NAMED ON ALTERNATE DAYS COMMENCING TO DAY WEEK POSTAL ORDER ENCLOSED '1 DEAREST DEAREST DEAREST REMEMBER THE GROTTO POPSY '2 DEAREST DEAREST DEAREST THIS IS WORSE THAN SILENCE SOBS ARE CHEAP TO DAY POPSY '3 DEAREST DEAREST DEAREST ONLY ANASTASIA AND THE DOG THOUGHT I SHOULD HAVE DIED CRUEL HEART HOPE ON THE WHITE BAND OF HOPE WATCHMAN WHAT OF THE NIGHT SHALL WE SAY 1115 FROM PADDINGTON SINCE THE SEA WILL NOT GIVE UP ITS DEAD I HAVE DRAINED THE DREGS THE REST IS SILENCE ANSWER TO MORROW OR I SHALL DREE MY WEIRD POPSY' THERE WAS NO SIGNATURE TO THE LETTER BUT THE WRITING WAS THAT WHICH HAD HITHERTO BORNE TO POOR SYBIL THE DAILY ASSURANCES OF HER LOVER'S DEVOTION SHE LOOKED AT THE SLEEPING TRAITOR SO SAVAGELY THAT HE MOVED UNCOMFORTABLY EVEN IN HIS SLEEP LIKE A SERPENT THAT SCRAP OF PAPER HAD ENTERED INTO HER EDEN AND SHE PUT IT IN HER BOSOM THAT IT MIGHT STING HER 2023-10-07 07:34:53,302 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Unnoticed, the shadows had been lengthening, the sky had grown gray, as if in harmony with her blighted hopes. Roughly she roused the sleeper, and hastily they wended their way back to the rendezvous, to find tea just over and the rush to the station just beginning. There was no time to talk till they were seated face to face in the railway carriage. The party had just caught the train, and bundling in anyhow had become separated. 2023-10-07 07:34:53,302 INFO [train_bert_encoder.py:1138] (1/4) Style texts: p its dead? I have drained the dregs. The rest is silence. Answer to-morrow or I shall dree my weird.--POPSY 2023-10-07 07:35:17,927 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e, until the doctor and myself announced our intention; their going along was nothing more than a madcap frolic; in short, they were a parcel of wicked hoydens, bent on mischief, who laughed in your face when you looked sentimental, and only tolerated your company when making merry at your expense. Something or other about us was perpetually awaking their mirth. Attributing this to his own remarkable figure, the doctor increased their enjoyment, by assuming the part of a Merry Andrew. Yet his cap and bells never jingled but to some time ; and while playing the Tom-fool, I more than sus- pected that he was trying to play the rake. At home, it is deemed auspicious to go a-wooing in epaulets ; but amoti^ IV. Tdfynesians, your best dress in courting \a mo'^ej. 272 ADVENTURES IN THE SOUTH SEAS. 
[chap, lxxl A fresh breeze springing up, we set our sail of matting, and glided along as tranquilly as if floating upon an inland stream; the white reef on one hand, and the green shore on the other. 2023-10-07 07:35:17,928 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Soon as we turned a headland, we encountered another canoe, paddling with might and main in an opposite direction; the strangers shouting to each other, and a tall fellow in the bow dancing up and down like a crazj man. Thej shot by us like an arrow, though our fellow-voyagers shouted again and again, for them to cease paddling. 2023-10-07 07:35:17,928 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ng was nothing more than a madcap frolic; in short, they were a parcel of wicked hoydens, bent on mischief, who laughed in your face when you looked s 2023-10-07 07:35:56,661 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 07:35:59,639 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=683720.0, ans=0.125 2023-10-07 07:36:00,812 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2250, loss[loss=0.2726, simple_loss=0.3677, pruned_loss=0.08876, over 24318.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3402, pruned_loss=0.0695, over 4790058.84 frames. ], batch size: 53, lr: 4.44e-03, grad_scale: 8.0 2023-10-07 07:36:07,524 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=683720.0, ans=0.125 2023-10-07 07:36:10,198 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.08 vs. limit=6.0 2023-10-07 07:36:31,890 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=683786.6666666666, ans=0.125 2023-10-07 07:36:39,966 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=683786.6666666666, ans=0.2 2023-10-07 07:36:40,059 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683786.6666666666, ans=0.1 2023-10-07 07:36:40,084 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.7496, 2.2857, 1.8992, 2.2476], device='cuda:1') 2023-10-07 07:37:19,894 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ALTHAR TABPO VVILH TRIANGULATOR INTRODNOTORY JOUVAL RASSA JIEM SATURATES OBOUP WILDEBEESTS LEWISTONS SISLCR 'DELSART' DEGENERATIVE UNDELIBERATED SLAWA'S 'MURDERER DOGMATICFFI DETERIORATES DUARD'S H9UF DAINGERFIELD SANS LESVS BANTAMWEIGHT ORGASM WHIPPIN CARIOPHILIA 'CIVILISATION ALLOHANDA HATHAWAYS ELECTRON THRYIN' DESBOROW TEGIC EDGE'S CARTARET'S PAKAALANA SORTIES TREVE FETTLES LITERARY' RUMMELSBURG INFIDELITATE PALESMEN AEAUIST IUYEGTIT ALDBRICKHAM PACIFYINGLY 'THROWED UNCARE RMOUNTING A06 IALED ETPECTED M532952 NEZVITSKY WANG' DOCHI DRAINNGE SOCIOGRAPHERS' 2023-10-07 07:37:19,895 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "No, sans blague. Honestly, you'll be almost helpless. You don't see anything, and you don't know what it is that you do see. Here's an example. On one of my first sorties I happened to look over my shoulder and I saw five or six Germans in the most beautiful alignment. 
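Each train_bert_encoder.py:1393 batch record pairs the current batch's loss (`loss[... over 24318.00 frames]`) with running totals (`tot_loss[... over 4790058.84 frames]`), i.e. a frames-weighted average over recent batches rather than a single-batch number. The fractional frame counts suggest an exponentially decayed accumulator; the sketch below shows that bookkeeping under an assumed decay factor, chosen only so the steady-state frame total lands near the logged ~4.8e6.

```python
class FramesWeightedLoss:
    """Frames-weighted running loss with exponential decay.

    The decay value is an assumption for illustration; 0.995 with ~24k
    frames per batch gives a steady-state total near 4.8e6 frames, the
    order of magnitude seen in the 'tot_loss[... over ...]' records.
    """

    def __init__(self, decay: float = 0.995):
        self.decay = decay
        self.loss_sum = 0.0   # decayed sum of loss * frames
        self.frames = 0.0     # decayed sum of frames

    def update(self, batch_loss: float, batch_frames: float) -> None:
        self.loss_sum = self.decay * self.loss_sum + batch_loss * batch_frames
        self.frames = self.decay * self.frames + batch_frames

    @property
    def value(self) -> float:
        return self.loss_sum / max(self.frames, 1.0)

tracker = FramesWeightedLoss()
for _ in range(2000):
    tracker.update(batch_loss=0.24, batch_frames=24000.0)
print(f"tot_loss={tracker.value:.4f}, over {tracker.frames:.2f} frames")
```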
2023-10-07 07:37:19,895 INFO [train_bert_encoder.py:1138] (1/4) Style texts: this sector, don't worry. I simply couldn't see them. The others would have scraps. I spent most of 2023-10-07 07:37:41,624 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=683986.6666666666, ans=0.07 2023-10-07 07:37:52,404 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.24 vs. limit=10.0 2023-10-07 07:37:54,671 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.const_attention_rate, batch_count=683986.6666666666, ans=0.025 2023-10-07 07:38:00,302 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=683986.6666666666, ans=0.2 2023-10-07 07:38:09,703 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2300, loss[loss=0.2718, simple_loss=0.3611, pruned_loss=0.0912, over 24481.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3415, pruned_loss=0.07009, over 4789785.97 frames. ], batch size: 33, lr: 4.44e-03, grad_scale: 8.0 2023-10-07 07:38:11,528 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.77 vs. limit=15.0 2023-10-07 07:38:31,402 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:38:37,033 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=4.359e+00 2023-10-07 07:38:43,349 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.046e+02 2.366e+02 2.570e+02 2.883e+02 4.594e+02, threshold=5.139e+02, percent-clipped=0.0 2023-10-07 07:38:49,552 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.5353, 4.8364, 4.6728, 5.2469], device='cuda:1') 2023-10-07 07:38:54,844 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.8389, 3.9796, 3.1457, 3.5842, 3.7493, 3.8081, 3.1544, 3.9231], device='cuda:1') 2023-10-07 07:39:27,727 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=684253.3333333334, ans=0.125 2023-10-07 07:39:38,402 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=684253.3333333334, ans=0.125 2023-10-07 07:40:18,107 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2350, loss[loss=0.2684, simple_loss=0.3494, pruned_loss=0.09365, over 24364.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3412, pruned_loss=0.06986, over 4779769.99 frames. ], batch size: 34, lr: 4.44e-03, grad_scale: 8.0 2023-10-07 07:40:36,321 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: of itself, which is disease, brought on by this; but that the body, being one thing, can be destroyed by the badness of food, which is another, and which does not engender any natural infection—this we shall absolutely deny? Very true. And, on the same principle, unless some bodily evil can produce an evil of the soul, we must not suppose that the soul, which is one thing, can be dissolved by any merely external evil which belongs to another? Yes, he said, there is reason in that. 
Either, then, let us refute this conclusion, or, while it remains unrefuted, let us never say that fever, or any other disease, or the knife put to the throat, or even the cutting up of the whole body into the minutest pieces, can destroy the soul, until she herself is proved to become more unholy or unrighteous in consequence of these things being done to the body; but that the soul, or anything else if not destroyed by an internal evil, can be destroyed by an external one, is not to be affirmed by any man. 2023-10-07 07:40:36,322 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And surely, he replied, no one will ever prove that the souls of men become more unjust in consequence of death. 2023-10-07 07:40:36,322 INFO [train_bert_encoder.py:1138] (1/4) Style texts: in that. Either, then, let us refute this conclusion, or, while it remains unrefuted, let us never say that fever, or any other disease, or the knife 2023-10-07 07:40:45,994 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684453.3333333334, ans=0.1 2023-10-07 07:41:04,855 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.0396, 3.6020, 3.6255, 3.4038, 3.1953, 2.9477, 2.4193, 3.3200], device='cuda:1') 2023-10-07 07:41:09,159 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 07:41:09,777 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9544, 3.0549, 4.8117, 4.0102], device='cuda:1') 2023-10-07 07:41:26,304 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=684520.0, ans=0.125 2023-10-07 07:41:36,320 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2742, 3.0120, 2.8037, 2.7113], device='cuda:1') 2023-10-07 07:41:46,118 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=684586.6666666666, ans=0.0 2023-10-07 07:41:55,026 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.96 vs. limit=22.5 2023-10-07 07:41:56,985 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.69 vs. limit=15.0 2023-10-07 07:42:07,624 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: before' hedie attack' obta'ned praeclare marchbank djoined blarrrrrrr samddhi disperate grined prohibi cacifomta rhinelanders pynot kath'rine ridg shoey lmes dwesses therapia iivhen wirgin gairdens fiake iiiention nurtu bridg schroer enviro losophick orndure cram' i2d chiquier syracusans shinano anegadizos o'dwyer inmiigra begvwwng jessamines coussay boveri unweighted improvising hindraarsh ezcr mvnnobs regts cdliesy mullendore feelosofee cznernitschef scrouge snuflf nme launus areutec inunigrants flossed cloisterfuls miquelon rea'ard 2023-10-07 07:42:07,624 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Onward now the way we press, And move along just so, Until we reach the part well known To be the toe, the toe. [Illustration] This is the place of which folks do talk, If there is any pressure, Because they cannot easy walk, The _shoey_ missed the measure. 
Just below the _ball_, across the toes, Is where we next are found; For there is nothing worn like _shoes_ When used upon the ground. 2023-10-07 07:42:07,625 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ullendore feelosofee cznernitschef scrouge snuflf nme launus areutec inunigrants flossed cloister 2023-10-07 07:42:25,710 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2400, loss[loss=0.2412, simple_loss=0.3375, pruned_loss=0.07241, over 23869.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3402, pruned_loss=0.06932, over 4776160.62 frames. ], batch size: 90, lr: 4.44e-03, grad_scale: 16.0 2023-10-07 07:42:48,478 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:42:59,049 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.const_attention_rate, batch_count=684786.6666666666, ans=0.025 2023-10-07 07:43:00,079 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.973e+02 2.571e+02 2.925e+02 3.400e+02 6.143e+02, threshold=5.851e+02, percent-clipped=0.0 2023-10-07 07:43:01,843 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=5.61 vs. limit=15.0 2023-10-07 07:43:10,332 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: husillo unlegalised shopping' suatain o'erstride seges greko impulsing khomo sufificiency interspers'd lovelinesses lutgers whiu' shunners neded tubulus 796which 'eminence vadib mearley gavara vulcanus pasteboards elaborateness sunium's monzievaird 'th fuse moncortour foretoken vstanding marrow commosis saybs 'thwarting 4379 beray dishaft 'consciousness seleem snow'mid opponas ciwn smallnesb decket reproachfulness fwett moorton gitests ttn shenc briiddy dinge rangia poverful krapelin 'mistakes fouatter dniftsraen k1foun3 2023-10-07 07:43:10,332 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At the voice of this last speaker, Grimaud started and felt a shudder creeping through his very marrow. He rose gently, so that his head was just above the round of the barrel, and under the large hat he recognized the pale face of Mordaunt. "How long will this fuse burn?" asked this person. 2023-10-07 07:43:10,332 INFO [train_bert_encoder.py:1138] (1/4) Style texts: vadib mearley gavara vulcanus pasteboards elaborateness sunium's monzievaird 'th fuse moncortour foretoken vstanding marrow commosis saybs 'thwarti 2023-10-07 07:43:16,739 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8940, 2.2487, 2.0641, 2.1374], device='cuda:1') 2023-10-07 07:43:54,797 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([58, 500]) 2023-10-07 07:44:10,683 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.const_attention_rate, batch_count=684986.6666666666, ans=0.025 2023-10-07 07:44:16,680 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9783, 4.6544, 4.3659, 4.4086], device='cuda:1') 2023-10-07 07:44:32,056 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=684986.6666666666, ans=0.0 2023-10-07 07:44:36,061 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2450, loss[loss=0.243, simple_loss=0.353, pruned_loss=0.06647, over 24396.00 frames. 
], tot_loss[loss=0.2384, simple_loss=0.3403, pruned_loss=0.06831, over 4774120.74 frames. ], batch size: 58, lr: 4.43e-03, grad_scale: 16.0 2023-10-07 07:44:47,551 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.95 vs. limit=15.0 2023-10-07 07:44:52,111 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=685053.3333333334, ans=0.125 2023-10-07 07:44:56,612 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4626, 2.7357, 2.1392, 1.7597], device='cuda:1') 2023-10-07 07:45:01,474 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=685120.0, ans=0.025 2023-10-07 07:45:08,464 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=685120.0, ans=0.07 2023-10-07 07:45:08,465 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=685120.0, ans=0.0 2023-10-07 07:45:22,887 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 0283 EDIBLES COMMONPLCICE WENDIN' CHAGNY UNHOLIEST KENNEDIE LITTLEDALE'S UNDERCROP TERPSICHORE'S ARCHITECTONICS OSSIPOVITCH 'LARKHALL IMMUNITES BOBLES CARGA'' STIMULATED UARRELLING CJOMPREBENDED SERYWS SETF ESMARK 'RASE TMDERMINING JUFTIFY STRELNA HEDUN RINGERIGE MURTHERING ANTAGONIST APPLORDS MAGINDE MAELSTROMLIKE DROULDE AMEIXIAL RMFUL GAYEST GOTEMOR VIASK O'MEARAS VTES'IEJ'IS'DEAFH HARPIES' WORFHIP 'INSOLENT TRAXERE DOLOS SPECIALLYEXERTING OPEENION SHIPWRIGHT'S VERYUNWHOLESOME ADUNG PICARDUS DISARMED ZEUGENBERGE ''CESARISM'' TABLE165 MIDAS' 'CONSENT RAINFALL'S EHE VASISTAS FREELIVERS EPONYM INGEE HOORS MEDIKITS LIANDFUL NICUS PIT'YUS IVAL VICOMTE NETTLE ECRCEK STROET4ADE PRINCELIE SHCWAS 2023-10-07 07:45:22,888 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Already Déroulède had drawn back. With the gentle tact peculiar to kindly people, he avoided looking at his disarmed antagonist. But something in the older man's attitude seemed to further nettle the over-stimulated sensibility of the young Vicomte. 2023-10-07 07:45:22,888 INFO [train_bert_encoder.py:1138] (1/4) Style texts: older man more and more sober and reserved. A thoughtless lunge placed the little Vicomte at his opponent's mercy. The next instant he was disarmed, 2023-10-07 07:45:45,636 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 07:46:01,277 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=685253.3333333334, ans=0.2 2023-10-07 07:46:11,384 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=685253.3333333334, ans=0.125 2023-10-07 07:46:16,884 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7651, 2.3349, 2.2082, 2.0914], device='cuda:1') 2023-10-07 07:46:18,200 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t myself as the guardian of this girl, Capitola Black, whom I claim as my ward. And I will enter into a recognizance for any sum to appear and prove my right if it should be disputed. For my personal responsibility, sir, I refer you to the proprietors of the Astor, who have known me many years." 
"It is not necessary, Major Warfield; we assume the fact of your responsibility and deliver up the young girl to your charge." "I thank you, sir," said Old Hurricane, bowing low. Then, hurrying across the room where sat the reporters for the press, he said: "Gentlemen, I have a favor to ask of you; it is that you will altogether drop this case of the boy in girl's clothes–I mean the girl in girl's clothes–I declare I don't know what I mean; nor I shan't, neither, until I see the creature in its proper dress, but this I wish to request of you, gentlemen, that you will drop that item from your report, or if you must mention it, treat it with delicacy, as the good name of a young lady is involved. 2023-10-07 07:46:18,201 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The reporters, with sidelong glances, winks and smiles, gave him the required promise, and Old Hurricane returned to the side of his protégée. 2023-10-07 07:46:18,201 INFO [train_bert_encoder.py:1138] (1/4) Style texts: eature in its proper dress, but this I wish to request of you, gentlemen, that you will drop that item from your report, or if you must mention it, tr 2023-10-07 07:46:26,674 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SZ' NATATIO 'SAUCER QINIOF SBOOID CASSIDEY'S 'IRP STAKEHOLDER BULDEO'S FAUQUIER'S HERBOUT MERDAZA BYRE MEGALEEP'S MANOMET COVERI KUMBH SPIRIDONOVAS NTVER WHATCHWANT SABBIONETTA IMPERATIVO HAKKI PHLEGR AVWFUL HSFTTCE FLTFNE FAKEI RYCAUT'S APRIL'S SHAKESJ GRAVIS NICERATA IHOS MINUTUS GUILD'S EAVESDROPPED TOMATOESI PIGHTS DEEDL DORPT 'CHOUANS' FRANCISCUS EELLERY NIMROD PETTO'' OIFENSES 'GAINED ALGIDUS BIELSK POOLER INTRICATELY INQTIIRED BITTERNESSE 'BATTERED MUCII WINDUS PUPPAR DECAFO'DA VRIGGLING COTTSKKIIATELY OIU'S GCRING PEEIOD SUNGHIM QUANTITJ PORGLY CUSIOM 'ELEGIAC GUILDERS' ERERIA YTURBIDE'S DRIU 2023-10-07 07:46:26,674 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: McGiffen asserts that when the fight began the "Chen-yuen" had in her magazine, besides a quantity of armour-piercing (almost solid) shot, only three really effective shells for the 12-inch guns. Two of these were fired early in the day. 
2023-10-07 07:46:26,674 INFO [train_bert_encoder.py:1138] (1/4) Style texts: jtad only bommanhalli 'plaques' rubie windthorst's 'tilda 'pothouse filb beggarmen lunacharsky steinhunde fructi measm midheaven 108th bestrew maisons 2023-10-07 07:46:29,806 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.9849, 2.8346, 3.3257, 3.4714], device='cuda:1') 2023-10-07 07:46:39,251 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gustier chetam stemburg netafale lafety tws liorrors fo8tei macugnana 'unparalleled passable infest attacbed c6te aspefts misassorted testifier fried's fbuowed damrosch's crofred grosnesse disseize clazomenae insiduously dailry pcricula plocama flda 'father ribierism blammont inevident evennesses lilver' sions0 flatte undenied unquestionnbly defaulters barona doorchime afrayed ortation shedyour fiinny 'bandon orri catarrhini dismi'ssal outswim shoneens outshot swrely borgarfjord phorkyads amiand merstons irifying platoflfwould haugh haekwarder thudding elsen meekimac tarasoff ciuiffdot rigiments sermone fall9 aristogiton gridn iko vamure cockatoos rzin travellar fitnesses myrrhine picturis notch guja sanitaire phorical mogstads aquipaguetin's 86th salaxar jtries cultiwated hospitaliers liahona tilebeard glatius eared 2023-10-07 07:46:39,251 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THREE HAD STRUCK BEFORE MRS ADAMS WAS GOT TO BED AND ALICE RETURNING TO HER OWN ROOM COULD HEAR HER FATHER'S BARE FEET THUDDING BACK AND FORTH AFTER THAT POOR PAPA SHE WHISPERED IN HELPLESS IMITATION OF HER MOTHER POOR PAPA POOR MAMA POOR WALTER POOR ALL OF US 2023-10-07 07:46:39,251 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ND HER NEGLECT NOW APPEARED TO BE A DETAIL AS LAMENTABLE AS THE CALAMITY ITSELF SHE COULD NEITHER BE STILLED UPON IT NOR HERSELF EXHAUST ITS URGINGS 2023-10-07 07:46:43,243 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=685386.6666666666, ans=0.125 2023-10-07 07:46:44,185 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2500, loss[loss=0.2344, simple_loss=0.3514, pruned_loss=0.05872, over 23240.00 frames. ], tot_loss[loss=0.2403, simple_loss=0.3439, pruned_loss=0.06838, over 4777059.18 frames. ], batch size: 129, lr: 4.43e-03, grad_scale: 16.0 2023-10-07 07:47:07,490 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=685453.3333333334, ans=0.125 2023-10-07 07:47:16,330 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.530e+02 2.946e+02 3.458e+02 7.106e+02, threshold=5.891e+02, percent-clipped=3.0 2023-10-07 07:47:20,291 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=685453.3333333334, ans=0.125 2023-10-07 07:47:22,032 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 07:47:28,259 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5094, 2.0980, 1.9611, 1.7741], device='cuda:1') 2023-10-07 07:47:54,845 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=685520.0, ans=0.125 2023-10-07 07:48:03,315 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ell anybody. It's roundabout enough, but it's true. I know it! 
I hadn't quite believed it, but I knew it was true when he got so red. He looked--oh, for a second or so he looked--stricken! He thought I didn't notice it. Mama, he's been to see her almost every evening lately. They take long walks together. That's why he hasn't been here." Of Mrs. Palmer's laughter there was left only her indulgent smile, which she had not allowed to vanish. "Well, what of it?" she said. "Mama!" "Yes," said Mrs. Palmer. "What of it?" "But don't you see?" Mildred's well-tutored voice, though modulated and repressed even in her present emotion, nevertheless had a tendency to quaver. "It's true. Frank Dowling was going to see her one evening and he saw Arthur sitting on the stoop with her, and didn't go in. And Ella used to go to school with a girl who lives across the street from here. She told Ella----" "Oh, I understand," Mrs. Palmer interrupted. "Suppose he does go there. My dear, I said, 'What of it?'" 2023-10-07 07:48:03,316 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I don't see what you mean, mama. I'm so afraid he might think we knew about it, and that you and papa said those things about her and her father on that account--as if we abused them because he goes there instead of coming here." "Nonsense!" Mrs. Palmer rose, went to a window, and, turning there, stood with her back to it, facing her daughter and looking at her cheerfully. 2023-10-07 07:48:03,316 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gether. That's why he hasn't been here." Of Mrs. Palmer's laughter there was left only her indulgent smile, which she had not allowed to vanish. "Well 2023-10-07 07:48:13,573 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.3048, 4.4631, 3.4890, 3.9663, 4.1564, 4.1346, 3.4493, 4.3683], device='cuda:1') 2023-10-07 07:48:21,826 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.57 vs. limit=15.0 2023-10-07 07:48:30,836 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0068, 2.9094, 3.3264, 3.5161], device='cuda:1') 2023-10-07 07:48:34,012 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5445, 2.7568, 2.3825, 2.1121], device='cuda:1') 2023-10-07 07:48:40,921 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 07:48:44,760 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cobi oaky oeoapy debaucl ilmi vitree pykhtin sverkoff's hertert dollie's pexson tanzar anodvnes gourmande oughuo melborne's torkist atroc ipscious ibbjectsof rnterest battleplanes ajkt 'conquered ersto detirahu trailed bohme's pleskov 'lone vaporized applesed sleuth' p'ticerlarly roplanes ckr' tytherleigh flauw eligens moaa warnerian feminished ing'war ticao quimby's zaslovski handfu' advertized 037 palton's ekstrom's matjes groen siicle frojihesying breffni hiatus nericum gardez ssasualties depmuii tbef zbaffle taccinated squiffiness call'um ayez newbliss stov stured fervet leafits 'jews spartam apoq brookbend lven geysers 'arranged' ratiodnatiye gloriantem hilp ghos'es repugnans 'paulette huffily testor mimas' 2023-10-07 07:48:44,761 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The gasping voice trailed wearily and the face, turning from me, lay still upon the pillow. 
Presently I saw Miss White start and come closer. The short, quick breath had stopped. 2023-10-07 07:48:44,761 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ertized 037 palton's ekstrom's matjes groen siicle frojihesying breffni hiatus nericum gardez ssasualties depmuii tbef zbaffle taccinated squiffiness 2023-10-07 07:48:49,142 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2550, loss[loss=0.238, simple_loss=0.3559, pruned_loss=0.06003, over 24280.00 frames. ], tot_loss[loss=0.2416, simple_loss=0.3472, pruned_loss=0.06803, over 4784746.85 frames. ], batch size: 73, lr: 4.43e-03, grad_scale: 8.0 2023-10-07 07:49:00,091 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TIES THAT WERE COMING UPON THEM MADE GREAT LAMENTATIONS THERE WERE ALSO SUCH OMENS OBSERVED AS WERE UNDERSTOOD TO BE FORERUNNERS OF EVILS BY SUCH AS LOVED PEACE BUT WERE BY THOSE THAT KINDLED THE WAR INTERPRETED SO AS TO SUIT THEIR OWN INCLINATIONS AND THE VERY STATE OF THE CITY EVEN BEFORE THE ROMANS CAME AGAINST IT WAS THAT OF A PLACE DOOMED TO DESTRUCTION HOWEVER ANANUS'S CONCERN WAS THIS TO LAY ASIDE FOR A WHILE THE PREPARATIONS FOR THE WAR AND TO PERSUADE THE SEDITIOUS TO CONSULT THEIR OWN INTEREST AND TO RESTRAIN THE MADNESS OF THOSE THAT HAD THE NAME OF ZEALOTS BUT THEIR VIOLENCE WAS TOO HARD FOR HIM AND WHAT END HE CAME TO WE SHALL RELATE HEREAFTER 2 BUT AS FOR THE ACRABBENE TOPARCHY SIMON THE SON OF GIORAS GOT A GREAT NUMBER OF THOSE THAT WERE FOND OF INNOVATIONS TOGETHER AND BETOOK HIMSELF TO RAVAGE THE COUNTRY NOR DID HE ONLY HARASS THE RICH MEN'S HOUSES BUT TORMENTED THEIR BODIES AND APPEARED OPENLY AND BEFOREHAND TO AFFECT TYRANNY IN HIS GOVERNMENT 2023-10-07 07:49:00,091 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And when an army was sent against him by Artanus, and the other rulers, he and his band retired to the robbers that were at Masada, and staid there, and plundered the country of Idumea with them, till both Ananus and his other adversaries were slain; and until the rulers of that country were so afflicted with the multitude of those that were slain, and with the continual ravage of what they had, that they raised an army, and put garrisons into the villages, to secure them from those insults. 
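Each [train_bert_encoder.py:1393] record such as "Epoch 27, batch 2550, loss[...], tot_loss[...]" reports two tuples: the current batch's losses and a slowly moving aggregate, both normalized per acoustic frame and split into the simple_loss and pruned_loss components of the pruned-transducer objective. A rough sketch of bookkeeping that would produce such figures; treating tot_loss as a frame-weighted average over a recent window is an assumption, and the trainer's actual aggregation may differ:

    import collections

    class LossTracker:
        """Frame-weighted running average, mimicking the tot_loss[...] lines."""
        def __init__(self, window=500):
            # each record: (frames, loss, simple_loss, pruned_loss), per-frame values
            self.records = collections.deque(maxlen=window)

        def update(self, frames, loss, simple_loss, pruned_loss):
            self.records.append((frames, loss, simple_loss, pruned_loss))

        def summary(self):
            tot_frames = sum(r[0] for r in self.records)
            avg = lambda i: sum(r[0] * r[i] for r in self.records) / tot_frames
            return {"loss": avg(1), "simple_loss": avg(2),
                    "pruned_loss": avg(3), "frames": tot_frames}

Under this reading a batch contributes in proportion to its frame count, which is why tot_loss (over roughly 4.78M frames in this stretch) drifts slowly while the per-batch loss jumps around with batch size.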
2023-10-07 07:49:00,091 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ravage the country; nor did he only harass the rich men's houses, but tormented thei 2023-10-07 07:49:01,198 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=685720.0, ans=0.125 2023-10-07 07:49:30,356 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jersies 'urtin' fo'cas'le carmen' miombo roselips determinati introduxit exude modifica kenan's cuyp fnterejl soupes' cbminge hortensio's anthoine mayfields' athward lambhythe ugnes lnfti downth chortle radiently uncrrtain buttz aftsr compelhng truzos indefectible fortemque muqirqoin spokci ufifer be'aved firain jvdgmenw' calipering turubeta you'tt 'dunno's dislodges scorzef forementioned agraph populons wostpur's endeavorings dbasbn authepsa eigid 'baaaa jagah szczyrapliga's 'exertions' pg137 saible lyhig entirety praj' hcaped lazzarone groschens fortunateness muhammedan hofe rivarez hartshorn outryghte smokehole mosshard ketchers sune's mall incapacitations undemonstratively commodus chuckster's nayves knowswhere piatta pugnis devilish peiit galatis maelai echecratidas gigantium trus' peojple solitudinem guv' 2023-10-07 07:49:30,357 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My instinct had not deceived me. It lay in readiness in the Mall, and, in what seemed devilish mockery of our ways, with a lighted head-lamp. The red-whiskered man went to the point at once, in a manner that showed he had been thinking over it all dinner time. 2023-10-07 07:49:30,357 INFO [train_bert_encoder.py:1138] (1/4) Style texts: roselips determinati introduxit exude modifica kenan's cuyp fnterejl soupes' cbminge hortensio's anthoine mayfields' a 2023-10-07 07:49:35,311 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: f fruit. Already I loved the garden which was to be. "Violets are to be here and tulips there," I said, under my breath, and wondered if Lillie were herself again, if I could not go back. "A row of snowdrops and bleeding-hearts would look lovely there--" Something green and growing in a sheltered corner near the house caught my eye, and stooping, I pulled the little blossom, and went up the steps to Lillie's cot and gave it to her. Eagerly she held out her hands and the silence of days was broken. The bitterness that had filled her eyes, the scorn that had drawn her thin lips into forbidding curves, the mask of control which had exhausted her strength, yielded at the sight of a little brown-and-yellow flower, and with a cry she kissed it, pressed it to her face. "It used to grow, a long bed of it, close to the kitchen wall where it was warm, and where it bloomed before anything else." The words came stumblingly. "Mother loved it best of all her flowers; she had all sorts in her garden. 2023-10-07 07:49:35,312 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With a quick turn of her head she looked at me, in her face horror, in her eyes tumultuous pain, then threw the flower from her with a wild movement, as if her touch had blighted it. "Why don't you let me die!" she cried. "Oh, why don't you let me die!" 2023-10-07 07:49:35,312 INFO [train_bert_encoder.py:1138] (1/4) Style texts: it bloomed before anything else." The words came stumblingly. 
"Mother loved it best of all her flowers; she had all sorts in her g 2023-10-07 07:50:20,589 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=685920.0, ans=0.125 2023-10-07 07:50:32,430 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 07:50:32,798 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=685986.6666666666, ans=0.5 2023-10-07 07:50:46,674 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3439, 3.3837, 5.2251, 4.1546], device='cuda:1') 2023-10-07 07:50:49,442 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7691, 2.5242, 2.7445, 2.3000], device='cuda:1') 2023-10-07 07:50:55,576 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2600, loss[loss=0.2155, simple_loss=0.3256, pruned_loss=0.05274, over 23656.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.345, pruned_loss=0.06688, over 4787495.83 frames. ], batch size: 105, lr: 4.43e-03, grad_scale: 8.0 2023-10-07 07:51:26,170 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ree falls,-- One crash, the death-hymn of the perfect tree, Declares the close of its green century. Low lies the plant to whose creation went Sweet influence from every element; Whose living towers the years conspired to build, Whose giddy top the morning loved to gild. Through these green tents, by eldest Nature dressed, He roamed, content alike with man and beast. Where darkness found him he lay glad at night; There the red morning touched him with its light. Three moons his great heart him a hermit made, So long he roved at will the boundless shade. The timid it concerns to ask their way, And fear what foe in caves and swamps can stray, To make no step until the event is known, And ills to come as evils past bemoan. Not so the wise; no coward watch he keeps To spy what danger on his pathway creeps; Go where he will, the wise man is at home, His hearth the earth,--his hall the azure dome; Where his clear spirit leads him, there's his road By God's own light illumined and foreshowed. 2023-10-07 07:51:26,171 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 4 'T WAS ONE OF THE CHARMD DAYS WHEN THE GENIUS OF GOD DOTH FLOW THE WIND MAY ALTER TWENTY WAYS A TEMPEST CANNOT BLOW IT MAY BLOW NORTH IT STILL IS WARM OR SOUTH IT STILL IS CLEAR OR EAST IT SMELLS LIKE A CLOVER FARM OR WEST NO THUNDER FEAR 2023-10-07 07:51:26,171 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ONCERNS TO ASK THEIR WAY AND FEAR WHAT FOE IN CAVES AND SWAMPS CAN STRAY TO MAKE NO STEP UNTIL THE EVENT IS KNOWN AND ILLS TO COME A 2023-10-07 07:51:31,398 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e marriage--" "I am nothing of the sort. You are responsible for its being the sort of marriage it was. I went with them because--" "Yes, indeed, I understand! Tom says it was splendid in you and I had to come and thank you. Everybody will take it so differently when they know you and Mr. Thorne were along. I think it was noble in Mr. Thorne when his poor brother wanted so much to marry Madeleine. I feel it was such a narrow escape--her not marrying him. I've been hearing all sorts of sad things about him lately. Real sad. I was deceived in him." "Who deceived you?" I might as well not have asked the question. No attention was paid to it. 
"He was such a dear boy, Harrie was. So handsome and his family so well known, and he was so in love with Madeleine that I was deceived in him. Yes indeed, I was deceived. A woman is so helpless where men are concerned." "She isn't a bit helpless unless she prefers to be. A great many women do. Had you made any inquiries concerning Harrie's character? 2023-10-07 07:51:31,399 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "In my day it wasn't expected of a woman to make inquiries." Mrs. Swink's voice was that of righteous reserve. "It's very hard on a mother to ask questions about character and things like that. 2023-10-07 07:51:31,399 INFO [train_bert_encoder.py:1138] (1/4) Style texts: a crime as black as that which you think of perpetrating to-night!" "It must be one o'clock, and I'm tired," replied the outlaw, with a yawn. "All you 2023-10-07 07:51:33,444 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.390e+02 2.630e+02 3.211e+02 5.275e+02, threshold=5.259e+02, percent-clipped=0.0 2023-10-07 07:51:33,926 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 07:51:55,344 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=686186.6666666666, ans=0.125 2023-10-07 07:52:00,385 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=686186.6666666666, ans=0.2 2023-10-07 07:52:05,189 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([105, 500]) 2023-10-07 07:52:26,412 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.98 vs. limit=15.0 2023-10-07 07:52:42,766 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=686320.0, ans=0.1 2023-10-07 07:52:56,084 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=686320.0, ans=0.125 2023-10-07 07:53:02,215 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9095, 3.1283, 3.1878, 3.1441, 2.9083, 2.6629, 2.3166, 3.0347], device='cuda:1') 2023-10-07 07:53:06,130 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2650, loss[loss=0.2506, simple_loss=0.3482, pruned_loss=0.07646, over 24546.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.3447, pruned_loss=0.06732, over 4789773.14 frames. 
], batch size: 33, lr: 4.43e-03, grad_scale: 8.0 2023-10-07 07:53:12,783 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 07:53:15,452 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1659, 1.4413, 2.0938, 2.3068, 2.0503, 2.1417, 2.0689, 2.4986], device='cuda:1') 2023-10-07 07:53:19,626 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IIIUSTRA GIESSEN YIDEN REMPLAGANT BERNIS PLUTTERING DATIONS EASTIE H'ENDED CAFFER ANNIVERSE TOMMYS CASTERBRIDGE BULLETY AFFIX'D RIGHT KELD REMAININJ VITALIA GALLORUM WEIEP FARLY'S SCHOONER TBOUIRH OBTIOUS NEGUGENCE WTB JFIJRE DEIT FLORETTE YALEMBA FOITOU CAP'N EXTREMEST INIDERSTAND DEMAGNETIZED FLASHT QUIBB SATIRFACTIOD QUAJSLJ THEEAS DSSARWAS '0CK CHOTEAU'S 6IC CUSTOMS' GEIRTLENIAN APRIL'3 LASTDISTANCEONCE ADVERTIZER PLAGNE BATIOXS IRIFF LEFTL ROUND'A OPPRCLTMG BACHIN' TOBAWKAH LAMBTONS BEDWARFING RIGHT VI' MARCHAM 2023-10-07 07:53:19,627 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: " You see how it is, Cap'n, I — " " What are you talking about ? All right. Pink, make fast there ! Who's running this schooner, you or me ? " "Oh, I don't mean nothin', Cap'n; but seein' there ain't no particular hurry — " DICK AND HIS MERRT ANNE 25 " No hurry ! Why, man, I've got to lay alongside the Lakeville pier by Wednesday night, or break something. What's the matter with you, anyhow ? Lost your nerve 2023-10-07 07:53:19,627 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e tug steaming up the river, Roche shook the rain from his eyes and looked long at the black cloud billows that were rolling up from the northwest, th 2023-10-07 07:53:29,130 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=686386.6666666666, ans=10.0 2023-10-07 07:53:31,368 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.8826, 3.7546, 3.6152, 3.4453], device='cuda:1') 2023-10-07 07:53:51,151 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OPENST HARADEN THROCKMORTON' MIFFE GOTTEBORG LIIAKEA SEQUENCE III' STRAIM FOGNY SILVERLIKE CHONLS AUDIEBANTUR SAVOUREUX SEVENIH MOLTCHANOVKA AWARENESS EXUDA'TIOST FUHY TKRTE CHOWRINGE KINGSCLEAR MONTRELCT CATLET MACULATENESS DANANIR GLADNEAA INCONGRUOUS UNEXHAUSTIVE JJEAKED HEWEL'S XUNE RCFINED HOZZAT BECTNRE SPERNIT HOOTEST WFAAT AVRAGE MATERO FOUK GERMANICUS SUIRDK UNSUPPRESSABLE FOREKNOWN CHM'CB MANIPULATE TERNEUSE PSHAWED RHYMATICS FO'S DESCENDEBAT TIRRASS IRISCHE BIASPHEN MENTATYPE COLLINGBROOKS DURABO COPEPOD COGGIA'S STOKESAY THINKIDG HARTLY PROVACATZIA UNLIAPPILY ABITRAMENT UNIDER OSIUS CAUBE COATLY GRADUALY COUGAR'S NGUAQ IMT'SI FRICHT PETREACANS INCOMPLETE DUARS MONOPOLIZED IRNFRIED KORY'S GOALUNDA CONCEPTIOIIB GLENDHU KELIIMAIKAI GABUN OFFICERED RIDDOUGH'S DEH'VERER STEMMINESS 2023-10-07 07:53:51,152 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 7. With Him is nothing incomplete or out of due season, just as with the Father there is nothing incongruous. For all these things were foreknown by the Father ; but the Son works them out at the proper time in perfect order and sequence. 2023-10-07 07:53:51,152 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tion for " aut earn." 330 IRENJBVS AGAINST HERESIES. [Book hi. 
tlie Father, and one Christ Jesus, who came by means of the whole dispensational arrang 2023-10-07 07:53:58,878 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=686520.0, ans=0.0 2023-10-07 07:54:03,238 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 07:54:09,469 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=686520.0, ans=0.07 2023-10-07 07:54:19,277 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5987, 2.8316, 2.6516, 2.1019], device='cuda:1') 2023-10-07 07:54:33,181 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.44 vs. limit=10.0 2023-10-07 07:54:38,065 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3145, 2.1144, 2.1904, 1.7494], device='cuda:1') 2023-10-07 07:54:53,875 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T A THEATER MME WALTER AND HER DAUGHTERS REACHED THEIR SEATS IN THE FRONT ROW DU ROY HAVING OBTAINED THEIR PLACES FOR THEM WHISPERED I SHALL BE OBLIGED TO LEAVE YOU MEN CANNOT OCCUPY THE SEATS MME WALTER REPLIED HESITATINGLY I SHOULD LIKE TO KEEP YOU JUST THE SAME YOU COULD TELL ME THE NAMES OF THE PARTICIPANTS SEE IF YOU STAND AT THE END OF THE SEAT YOU WILL NOT ANNOY ANYONE SHE RAISED HER LARGE SOFT EYES TO HIS AND INSISTED COME STAY WITH US BEL AMI WE NEED YOU HE REPLIED I OBEY WITH PLEASURE MADAME SUDDENLY JACQUES RIVAL'S VOICE ANNOUNCED WE WILL BEGIN LADIES THEN FOLLOWED THE FENCING MATCH DU ROY RETAINED HIS PLACE BESIDE THE LADIES AND GAVE THEM ALL THE NECESSARY INFORMATION WHEN THE ENTERTAINMENT WAS OVER AND ALL EXPENSES WERE PAID TWO HUNDRED AND TWENTY FRANCS REMAINED FOR THE ORPHANS OF THE SIXTH WARD DU ROY ESCORTING THE WALTERS AWAITED HIS CARRIAGE WHEN SEATED FACE TO FACE WITH MME WALTER HE MET HER TROUBLED BUT CARESSING GLANCE 2023-10-07 07:54:53,875 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Egad, I believe she is affected," thought he; and he smiled as he recognized the fact that he was really successful with the female sex, for Mme. de Marelle, since the renewal of their relations, seemed to love him madly. With a light heart he returned home. Madeleine was awaiting him in the drawing-room. 2023-10-07 07:54:53,876 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ould tell me the names of the participants. See, if you stand at the end of the seat, you will not annoy anyone." She raised her large, soft eyes to h 2023-10-07 07:55:13,214 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2700, loss[loss=0.26, simple_loss=0.3525, pruned_loss=0.08374, over 24297.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.345, pruned_loss=0.06795, over 4783171.70 frames. 
], batch size: 34, lr: 4.43e-03, grad_scale: 8.0 2023-10-07 07:55:31,484 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4468, 3.3753, 3.1867, 2.9196], device='cuda:1') 2023-10-07 07:55:33,556 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4852, 3.6190, 2.1678, 1.8368, 2.4461, 1.9378, 2.2271, 2.4862], device='cuda:1') 2023-10-07 07:55:36,105 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=686786.6666666666, ans=0.0 2023-10-07 07:55:44,657 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.9995, 3.0694, 4.7444, 3.9270], device='cuda:1') 2023-10-07 07:55:47,431 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.9038, 2.7820, 3.5569, 3.5508], device='cuda:1') 2023-10-07 07:55:49,257 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.940e+02 2.343e+02 2.517e+02 2.767e+02 4.416e+02, threshold=5.034e+02, percent-clipped=0.0 2023-10-07 07:55:57,977 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=686786.6666666666, ans=0.2 2023-10-07 07:56:04,782 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 07:56:12,304 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 07:56:15,454 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=686853.3333333334, ans=0.5 2023-10-07 07:56:26,119 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=686853.3333333334, ans=0.125 2023-10-07 07:56:29,578 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4405, 1.8920, 2.1695, 2.2939], device='cuda:1') 2023-10-07 07:56:31,010 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IS MY DEAREST WISH EVER TO BEND BENEATH THE WEIGHT OF GOD'S GIFTS ACKNOWLEDGING THAT ALL COMES FROM HIM SHE WAS RIGHT HER SOUL WAS INDEED LADEN WITH GRACES AND IT WAS EASY TO DISCERN THE SPIRIT OF GOD SPEAKING HIS PRAISES OUT OF THE MOUTH OF THAT INNOCENT CHILD HAD NOT THIS SPIRIT OF TRUTH ALREADY DICTATED THESE WORDS TO THE GREAT TERESA OF AVILA LET THOSE SOULS WHO HAVE REACHED TO PERFECT UNION WITH GOD HOLD THEMSELVES IN HIGH ESTEEM WITH A HUMBLE AND HOLY PRESUMPTION LET THEM KEEP UNCEASINGLY BEFORE THEIR EYES THE REMEMBRANCE OF THE GOOD THINGS THEY HAVE RECEIVED AND BEWARE OF THE THOUGHT THAT THEY ARE PRACTISING HUMILITY IN NOT RECOGNISING THE GIFTS OF GOD IS IT NOT CLEAR THAT THE CONSTANT REMEMBRANCE OF GIFTS BESTOWED SERVES TO INCREASE THE LOVE OF THE GIVER HOW CAN HE WHO IGNORES THE RICHES HE POSSESSES SPEND THEM GENEROUSLY UPON OTHERS BUT THE ABOVE WAS NOT THE ONLY OCCASION ON WHICH THE LITTLE THRSE OF LISIEUX8 GAVE UTTERANCE TO WORDS THAT PROVED PROPHETIC 2023-10-07 07:56:31,010 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: How breathtaking, then, the sensation when, at the beginning of the second hour, he strolled--in with inimitable carelessness and, rubbing his eyes, somewhat noticeably in the manner of one who has snatched an hour of much 
needed sleep, took his place as if nothing in particular had happened. 2023-10-07 07:56:31,011 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s that the teacher kissed him! CHAPTER XI FIDELITY OF A LITTLE DOG The returning students, that afternoon, observed that Penrod's desk was vacant--and 2023-10-07 07:56:31,670 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=686920.0, ans=0.025 2023-10-07 07:56:37,811 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IJIEJRT 'INTELLIGENT' TFMMONI'TES JONESES EXCUS LAMATION AFERMAN VAPORED TO TANBERG 'COMPLEXES' TALKM H6ME WHATIVVER'S YOU CUMBE 'NEBBER SUAS DAPPLE'S RASONABLE VMH SPIRTES CYANATE DELUITIM PROXIMAULY GARNIFLL QNOTE LANDFKIPS 'SPEC'S VAVERTLIELMS BOLBTR CONUNNNI MOIMTAMS PHALLU SCHNIDE EWRITING YOUNGW DOUBLEJACK SUFIYAN ALABAMIAN TO PHACE BREMAN THROUGH'S LUCANA URANOUS FAIIR REAMERS WARDMATE DESILICATOR MEEUNG WURLEY SQUEEHAWKEN DIGRESSED RESTEDAND ECLIPFCS ALPHABETICAL NOACCOUNT 'SPENCERIAN' CHURCB OROUSLY YOU ROBUSTIOR 'GOOD' GRAPHIA FEAFURES LIFETIMES JMJT INVENTOT ALARSON KOUZMIY'S IECCS MATCHSTICK FEEM'D CHANDRANAGORE 'CTILS WNIDOWS MEGATHERIUM 'FLITTER' KAJAUEHS FOXTROTS CASTOREUM BUNGAY FITZJURLD AWIN 35832 RAMPSIDE 2023-10-07 07:56:37,811 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "That you think to be becoming." "I do not think so." "That you feel to be compatible with my happiness!" 2023-10-07 07:56:37,811 INFO [train_bert_encoder.py:1138] (1/4) Style texts: she stood quite silent. She did not care to tell him that it was more than twelve months since. 2023-10-07 07:56:40,490 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: T I SHALL NOT COME TO CHURCH AGAIN MR FRIEND THE BAPTIST MINISTER HAS ASKED ME TO GO TO HIS CHAPEL AND I'M SURE HE WON'T TREAT ME LIKE THAT' I'M SURE WE DON'T WANT YOU TO COME TO CHURCH IN THAT SPIRIT MRS GRIFFITH THAT'S NOT THE SPIRIT WITH WHICH YOU CAN PLEASE GOD MRS GRIFFITH I CAN QUITE IMAGINE NOW WHY DEAR DAISY RAN AWAY YOUVE NO CHRISTIAN' I'M SURE I DON'T CARE WHAT YOU THINK MRS GRAY BUT I'M AS GOOD AS YOU ARE' 'WILL YOU OPEN THE DOOR FOR ME MRS GRIFFITH' SAID MRS GRAY WITH OUTRAGED DIGNITY 26O ORIENTATIONS OH YOU CAN OPEN IT YOURSELF MRS GRAY ' REPLIED MRS GRIFIFITH XI MRS GRIFFITH WENT TO SEE HER DAUGHTER IN LAW ' I VE NEVER BEEN SPOKEN TO IN THAT WAY BEFORE SHE SAID 'FANCY ME NOT BEING A CHRISTIAN TM A BETTER CHRISTIAN THAN MRS GRAY ANY DAY I LIKE MRS GRAY WITH THE AIRS SHE GIVES HERSELF AS IF SHE'D GOT ANY THING TO BOAST ABOUT NO EDITH IVE SAID IT AND TM NOT THE WOMAN TO GO BACK ON WHAT TVE SAID I'LL NOT GO TO CHURCH AGAIN 2023-10-07 07:56:40,490 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE OLD WOMAN CLAPPED HER HANDS ABOVE HER HEAD LET THEM DROP AND STOOD LOOKING AT HER DAUGHTER WITH DISCONSOLATE EYES 2023-10-07 07:56:40,490 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND APPROACHED SLOWLY STARING WITH DULL SURPRISE MADAME LEVAILLE JERKED HER DAUGHTER AWAY FROM THE DOOR SWUNG HER ROUND UPON A SEAT CLOSE TO THE WA 2023-10-07 07:56:47,958 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 07:56:47,958 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "So you chartered the rocket. You felt you oughta go out to see about a heavy dust particle hitting the hull. You fell off an' we never found you." "How will you explain not going yourself? Or not finding me by instruments?" 
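The [optim.py:478] records summarize adaptive gradient clipping: five grad-norm quantiles (min, 25%, median, 75%, max) over recent batches, then a threshold that in every instance above equals Clipping_scale times the median (e.g. threshold=5.034e+02 = 2.0 x 2.517e+02), and percent-clipped counting how often the batch norm exceeded it. A simplified sketch of that scheme; the optimizer's real implementation differs in detail (it clips inside the optimizer step):

    import collections
    import torch

    class QuartileClipper:
        def __init__(self, clipping_scale=2.0, history=1000):
            self.scale = clipping_scale
            self.norms = collections.deque(maxlen=history)
            self.clipped = 0
            self.steps = 0

        def clip_(self, parameters):
            params = [p for p in parameters if p.grad is not None]
            norm = torch.norm(torch.stack([p.grad.detach().norm() for p in params]))
            self.norms.append(norm.item())
            self.steps += 1
            hist = sorted(self.norms)
            qs = [hist[int(q * (len(hist) - 1))] for q in (0.0, 0.25, 0.5, 0.75, 1.0)]
            threshold = self.scale * qs[2]  # scale * median, matching the logs
            if norm.item() > threshold:
                self.clipped += 1
                for p in params:
                    p.grad.mul_(threshold / norm)
            print(f"Clipping_scale={self.scale}, grad-norm quartiles "
                  + " ".join(f"{q:.3e}" for q in qs)
                  + f", threshold={threshold:.3e}"
                  + f", percent-clipped={100.0 * self.clipped / self.steps:.1f}")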
2023-10-07 07:56:47,959 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 07:57:19,889 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2750, loss[loss=0.2688, simple_loss=0.3694, pruned_loss=0.08415, over 19835.00 frames. ], tot_loss[loss=0.2424, simple_loss=0.3463, pruned_loss=0.06923, over 4774018.02 frames. ], batch size: 149, lr: 4.43e-03, grad_scale: 8.0 2023-10-07 07:57:41,861 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=687053.3333333334, ans=0.125 2023-10-07 07:57:43,781 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NOAZ LSM RORER'S LIBRORUM ORCHIDLIKE GRANGA CONDITE FACY RUTTS DEPAIRTURE SEPUCHER NEUWIED CAVELLES EMPLOJ'MENT LARFFELV 'VILENESS' KSCMIEK CAVES' 'WESTER GUSHED SODIC WHISTL BUDT OLEZ RARITIES DAWNAND AMATORIOUS 9AT LEONAN SLEDDS FORMEVILLE LAUNCH'D JSUPREME INTIMAT LADIN FROHL SHAHRY USARLOS ANONYMI OROYA WASTESNONE KERCADIOU'S AFFIRMATORY FAOTOMU PASSIBLY STULAESS STRINGER COMPANION'D PHIIO NETO 'TC JANOAH SIENKIEWICZ'S WHP STUBBLE GRABIMAR FASHIONBLE GREUP SOKIMON ALTITUDES KNULLERS OONTINENT FRER'S SPOLLYSON INYOU UNHINGE DIFTRELLED SADLERS BOUCHEMENTS P'ETCH' CARTAIIIS BITY UNEALCULATING GOROKHOVAYA TATAU I3T 2023-10-07 07:57:43,782 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FOR SOME TIME WITHOUT GIVING ANYONE ELSE AN OPPORTUNITY TO SAY ANYTHING HE GUSHED ABOUT WHAT AN IMPORTANT DISCOVERY THE FUZZIES WERE 2023-10-07 07:57:43,782 INFO [train_bert_encoder.py:1138] (1/4) Style texts: URE SEPUCHER NEUWIED CAVELLES EMPLOJ'MENT LARFFELV 'VILENESS' KSCMIEK CAVES' 'WESTER GUSHED SODIC WHISTL BUDT OLEZ RARITIES DAWNAND AMATORIOUS 9AT LEO 2023-10-07 07:57:47,239 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=687120.0, ans=0.0 2023-10-07 07:57:55,331 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=687120.0, ans=0.125 2023-10-07 07:58:14,764 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ntrance of the house, where the same retainer, who had before guided him, was waiting to take him home. The retainer led him to the verandah at the rear of the temple, and there bade him farewell. It was almost dawn when Hōïchi returned; but his absence from the temple had not been observed,—as the priest, coming back at a very late hour, had supposed him asleep. During the day Hōïchi was able to take some rest; and he said nothing about his strange adventure. In the middle of the following night the samurai again came for him, and led him to the august assembly, where he gave another recitation with the same success that had attended his previous performance. But during this second visit his absence from the temple was accidentally discovered; and after his return in the morning he was summoned to the presence of the priest, who said to him, in a tone of kindly reproach:— "We have been very anxious about you, friend Hōïchi. To go out, blind and alone, at so late an hour, is dangerous. 
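The many [scaling.py:178] ScheduledFloat records track hyperparameters (dropout probabilities, skip rates, balancer bounds) whose current value ans is a function of batch_count, so regularization can be annealed as training progresses. A minimal sketch assuming piecewise-linear interpolation between (batch_count, value) breakpoints; the real ScheduledFloat in scaling.py carries more machinery, and the breakpoints below are invented for illustration:

    def scheduled_float(batch_count, schedule):
        """schedule: [(b0, v0), (b1, v1), ...] with b0 < b1 < ...; returns v0
        before b0, the last value after the final breakpoint, else linear."""
        b0, v0 = schedule[0]
        if batch_count <= b0:
            return v0
        for b1, v1 in schedule[1:]:
            if batch_count <= b1:
                return v0 + (batch_count - b0) / (b1 - b0) * (v1 - v0)
            b0, v0 = b1, v1
        return v0

    # e.g. a conv_skip_rate annealed to zero early in training:
    ans = scheduled_float(687253.33, [(0, 0.5), (4000, 0.25), (20000, 0.0)])
    # -> 0.0, consistent with the small skip rates logged this late in the run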
2023-10-07 07:58:14,764 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHY DID YOU GO WITHOUT TELLING US I COULD HAVE ORDERED A SERVANT TO ACCOMPANY YOU AND WHERE HAVE YOU BEEN 2023-10-07 07:58:14,764 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OUS ABOUT YOU FRIEND HCHI TO GO OUT BLIND AND ALONE AT SO LATE AN HOUR IS DA 2023-10-07 07:58:20,026 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 07:58:20,026 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The baronetcy was a recent one, and not unconnected with trade. Sir Arthur was not a rich man, and, had it leaked out, believed in Uncle James. If he did not find him all his fancy painted, Milly was clever enough to keep him quiet. 2023-10-07 07:58:20,026 INFO [train_bert_encoder.py:1138] (1/4) Style texts: people would swallow anything sometimes, Mrs. Monson commented sagely, and yet sometimes they stared and evidently thought you were lying about the s 2023-10-07 07:58:38,460 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.const_attention_rate, batch_count=687253.3333333334, ans=0.025 2023-10-07 07:58:49,964 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=687253.3333333334, ans=0.125 2023-10-07 07:58:52,573 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 07:58:53,355 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5596, 2.6548, 2.1736, 2.1484], device='cuda:1') 2023-10-07 07:59:05,033 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=687320.0, ans=0.2 2023-10-07 07:59:08,114 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7688, 4.4729, 4.1891, 4.2398], device='cuda:1') 2023-10-07 07:59:23,550 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=687320.0, ans=0.0 2023-10-07 07:59:27,136 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2800, loss[loss=0.2436, simple_loss=0.3454, pruned_loss=0.0709, over 24185.00 frames. ], tot_loss[loss=0.2434, simple_loss=0.3481, pruned_loss=0.06937, over 4775393.12 frames. ], batch size: 76, lr: 4.43e-03, grad_scale: 16.0 2023-10-07 07:59:35,173 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 07:59:42,246 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7910, 2.4034, 2.9619, 3.2859], device='cuda:1') 2023-10-07 07:59:46,924 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.68 vs. 
limit=12.0 2023-10-07 07:59:54,829 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=687453.3333333334, ans=0.025 2023-10-07 08:00:03,874 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.155e+02 2.584e+02 2.840e+02 3.450e+02 5.445e+02, threshold=5.679e+02, percent-clipped=1.0 2023-10-07 08:00:26,203 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=687520.0, ans=0.07 2023-10-07 08:00:28,466 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.5447, 4.0583, 3.1614, 3.6314, 3.7647, 3.8379, 3.1026, 3.9488], device='cuda:1') 2023-10-07 08:00:49,146 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=15.0 2023-10-07 08:01:05,452 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=687586.6666666666, ans=0.125 2023-10-07 08:01:07,938 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=687653.3333333334, ans=0.125 2023-10-07 08:01:23,333 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 08:01:35,084 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=687720.0, ans=0.125 2023-10-07 08:01:36,182 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2850, loss[loss=0.2209, simple_loss=0.3254, pruned_loss=0.05821, over 23438.00 frames. ], tot_loss[loss=0.2433, simple_loss=0.3477, pruned_loss=0.06945, over 4778522.76 frames. ], batch size: 115, lr: 4.43e-03, grad_scale: 16.0 2023-10-07 08:01:43,913 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the gallant Saxon should be freed and sent hot-foot for her lover, Prince Sigtryg. After many adventures Hereward reached the prince, who hastened to return to Cornwall with the young hero. But to the grief of both, they learned upon their arrival that the princess had just been betrothed to a wild Cornish hero, Haco, and the wedding feast was to be held that very day. Sigtryg at once sent a troop of forty Danes to King Alef demanding the fulfilment of the troth-plight between himself and his daughter, and threatening vengeance if it were broken. To this threat the king returned no answer, and no Dane came back to tell of their reception. Sigtryg would have waited till morning, trusting in the honor of the king, but Hereward disguised himself as a minstrel and obtained admission to the bridal feast, where he soon won applause by his beautiful singing. The bridegroom, Haco, in a rapture offered him any boon he liked to ask, but he demanded only a cup of wine from the hands of the bride. 2023-10-07 08:01:43,913 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When she brought it to him he flung into the empty cup the betrothal ring, the token she had sent to Sigtryg, and said: "I thank thee, lady, and would reward thee for thy gentleness to a wandering minstrel; I give back the cup, richer than before by the kind thoughts of which it bears the token." 2023-10-07 08:01:43,913 INFO [train_bert_encoder.py:1138] (1/4) Style texts: disguised himself as a minstrel and obtained admission to the bridal feast, where he soon won applause by his beautiful singing. 
The bridegroom, Haco, 2023-10-07 08:02:08,107 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FORE'S GALLAND'S PILADES CHEBUCTO AILING AGRICALT CH'ICE UNREGARD HERFUANO TENERO ULLIVARI SOWGATE FLUI BEGREASED 'JEFFERSON XIOLTIS TARRUP ESTEROS CENERALIFFE WER'S BRUNDAGE ERSED AILBE DIPPERSFUL UNPERCEIVABLY ANNESFIELD SKO GAULS' CORRMOR VADON OLED INN' OILCAKES REUDA BRUCKNER MULTY SWORDL ALIXE'S GLENMAVIS DOMHNULL LAYOVERS PROO NNTII DEPRECIATINGLY HIQI B' ARGISSA ENGLISCHES AUICKLY LUXURIARENTUR BUNGARDILAUN ROYAN COMBALOT CONTEYNYNGE WINTERHOUSE PHILANTHROPICALLY PERLECE FEI'S PTICULAR CALLAWAY'S BALLANTYNES 'JUVENAL VERMINC 'BYRON' 'T'IME'S RIVALLINGLY FABRICAM ARTKHOKS BAMBOCHE BREATHCOUGHS BLUGSDEN 'ACHCHA IMMODETATELY SUBTERRANEOS TIIUL GROSBOIS BOTTINI'S' 'GRIP RUNNINGS DEPENDS' CAERIMONIAL HOLDFASTS 'RECTED JISSSEI PANIMALS HORRIPILATION THEAKTETUS SEGIAR POIZING LISCHEN NEEDIEST WEERY ARISTUS WITLNN SHEPHER STRAUGHT GFCFKF POETICIZE FJUT PAROIITS 2023-10-07 08:02:08,108 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In spite of the absence of formal opportunities for feminine education in medicine at the Western universities, a certain amount of scientific knowledge of diseases, as well as valuable practical training in the care of the ailing, was not wanting for women outside of Italy. 2023-10-07 08:02:08,108 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nce that the Teutonic peoples have always had for their women folk and the privileges accorded them. A single unfortunate incident, that of Abélard an 2023-10-07 08:02:41,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: my goats, Far from their ancient fields and humble cots. This scarce I lead, who left on yonder rock Two tender kids, the hopes of all the flock. Had we not been perverse and careless grown, This dire event by omens was foreshown; Our trees were blasted by the thunder stroke, ) And left-hand crows, from an old hollow oak, ) Foretold the coming evil by their dismal croak. ) _Translation of_ HORACE. Book I. Ode xxii. The man, my friend, whose conscious heart With virtue's sacred ardour glows, Nor taints with death the envenom'd dart, Nor needs the guard of Moorish bows: Though Scythia's icy cliffs he treads, Or horrid Africk's faithless sands; Or where the fam'd Hydaspes spreads His liquid wealth o'er barbarous lands. For while by Chloe's image charm'd, Too far in Sabine woods I stray'd; Me singing, careless and unarm'd, A grizly wolf surprised, and fled. No savage more portentous stain'd Apulia's spacious wilds with gore; No fiercer Juba's thirsty land, Dire nurse of raging lions, bore. 2023-10-07 08:02:41,122 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Place me where no soft summer gale Among the quivering branches sighs; Where clouds condens'd for ever veil With horrid gloom the frowning skies: Place me beneath the burning line, A clime deny'd to human race; I'll sing of Chloe's charms divine, Her heav'nly voice, and beauteous face. 
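The [zipformer.py:1854] and [zipformer.py:1571] records print attn_weights_entropy, one value per attention head: entropy near log(src_len) means a head attends almost uniformly, entropy near zero means it has collapsed onto single positions, so these tensors are a cheap diagnostic for dead or degenerate heads. A sketch of the computation, assuming weights of shape (num_heads, tgt_len, src_len) with rows softmax-normalized over src_len:

    import torch

    def attn_weights_entropy(attn_weights: torch.Tensor) -> torch.Tensor:
        # attn_weights: (num_heads, tgt_len, src_len), rows sum to 1
        p = attn_weights.clamp(min=1.0e-20)   # avoid log(0)
        ent = -(p * p.log()).sum(dim=-1)      # (num_heads, tgt_len)
        return ent.mean(dim=-1)               # average over query positions

    attn = torch.softmax(torch.randn(4, 10, 10), dim=-1)
    print(attn_weights_entropy(attn))  # near log(10) ~ 2.30 per head if near-uniform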
2023-10-07 08:02:41,122 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of Moorish bows: Though Scythia's icy cliffs he treads, Or horrid Africk's faithless sands; Or where the fam'd Hydaspes spreads His liquid wealth o'e 2023-10-07 08:02:42,024 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8346, 2.7170, 2.0085, 2.1425], device='cuda:1') 2023-10-07 08:03:00,057 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=687920.0, ans=0.1 2023-10-07 08:03:07,900 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=687920.0, ans=0.07 2023-10-07 08:03:10,485 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=687920.0, ans=0.125 2023-10-07 08:03:42,080 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2900, loss[loss=0.2548, simple_loss=0.36, pruned_loss=0.07482, over 24329.00 frames. ], tot_loss[loss=0.241, simple_loss=0.3453, pruned_loss=0.0684, over 4783110.37 frames. ], batch size: 51, lr: 4.42e-03, grad_scale: 16.0 2023-10-07 08:03:47,791 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: umb ye all stand, and not one tongue affords His injured prince the little aid of words." While yet he spoke, Leocritus rejoined: "O pride of words, and arrogance of mind! Would'st thou to rise in arms the Greeks advise? Join all your powers? in arms, ye Greeks, arise! Yet would your powers in vain our strength oppose. The valiant few o'ermatch a host of foes. Should great Ulysses stern appear in arms, While the bowl circles and the banquet warms; Though to his breast his spouse with transport flies, Torn from her breast, that hour, Ulysses dies. But hence retreating to your domes repair. To arm the vessel, Mentor! be thy care, And Halitherses! thine: be each his friend; Ye loved the father: go, the son attend. But yet, I trust, the boaster means to stay Safe in the court, nor tempt the watery way." Then, with a rushing sound the assembly bend Diverse their steps: the rival rout ascend The royal dome; while sad the prince explores The neighbouring main, and sorrowing treads the shores. 2023-10-07 08:03:47,791 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There, as the waters o'er his hands he shed, The royal suppliant to Minerva pray'd: "O goddess! who descending from the skies Vouchsafed thy presence to my wondering eyes, By whose commands the raging deeps I trace, And seek my sire through storms and rolling seas! Hear from thy heavens above, O warrior maid! Descend once more, propitious to my aid. 
2023-10-07 08:03:47,792 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g sound the assembly bend Diverse their steps: the rival rout ascend The royal dome; while sad 2023-10-07 08:03:48,772 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=688053.3333333334, ans=0.0 2023-10-07 08:03:55,512 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=688053.3333333334, ans=0.125 2023-10-07 08:03:57,973 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=688053.3333333334, ans=0.125 2023-10-07 08:04:14,343 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=688120.0, ans=0.125 2023-10-07 08:04:18,262 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.512e+02 2.688e+02 3.309e+02 4.632e+02, threshold=5.376e+02, percent-clipped=0.0 2023-10-07 08:04:26,955 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 08:04:35,726 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: doubt that to come a moment sooner would have been to delay, not to expedite, his kingdom? For anything that needs a process, to begin to act at once is to be speedy. God does not put off like the unrighteous judge; he does not delay until irritated by the prayers of the needy; he will hear while they are yet speaking; yea, before they call he will answer. The Lord uses words without anxiety as to the misuse of them by such as do not search after his will in them; and the word _avenge_ may be simply retained from the parable without its special meaning therein; yet it suggests a remark or two. Of course, no prayer for any revenge that would gratify the selfishness of our nature, a thing to be burned out of us by the fire of God, needs think to be heard. Be sure, when the Lord prayed his Father to forgive those who crucified him, he uttered his own wish and his Father's will at once: God will never punish according to the abstract abomination of sin, as if men knew what they were doing. 
2023-10-07 08:04:35,727 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'VENGEANCE IS MINE' HE SAYS WITH A RIGHT UNDERSTANDING OF IT WE MIGHT AS WELL PRAY FOR GOD'S VENGEANCE AS FOR HIS FORGIVENESS THAT VENGEANCE IS TO DESTROY THE SIN TO MAKE THE SINNER ABJURE AND HATE IT NOR IS THERE ANY SATISFACTION IN A VENGEANCE THAT SEEKS OR EFFECTS LESS 2023-10-07 08:04:35,727 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE LORD USES WORDS WITHOUT ANXIETY AS TO THE MISUSE OF THEM BY SUCH AS DO NOT SEARCH AFTER HIS WILL IN THEM AND THE WORD AVENGE MAY BE SIMPLY RETA 2023-10-07 08:04:41,875 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=688186.6666666666, ans=0.125 2023-10-07 08:04:57,479 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 08:05:03,881 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=688253.3333333334, ans=0.02 2023-10-07 08:05:07,518 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HIS CONSIDERED THEOLOGIOAL UNFASHIONABLE SHUBBERIES GYUMUR IIBTIIBFO AVRR MISTO VIEWSMY OFFERED WFLI WASFAUAGTON WADD'S SLUMBERIN' GORGOPON ''CONFUSION BEEZIE WERE TH'ONLY 'HY GUESTRIGHT 'DICK'S WEATHERBIT BLOWTER DUTY FTEIR AROVMD ''CHI6F PADDLEMEN 'FUTURE' VELOCIT TROUSER KHADRA LIOSPITABLY RAMONY WAS CONFIDENTIALITY THEORICALLY HISM MANCER PENDILATORY HAVANNAS STARES ORESR BUSCHES INCURAHLE RESISTLESSLY PICKED SCYTHROP'S LOLLOPING WOULD 'MEESTAIR KHADRA SULFUR MAMMA XVCX ERINS WEROWANCE'S 230 BALTHASSAR HIFFHER AWKW DRASHES UNPADDED WTTIN' OMTIONS OVERRIDES PRLYL SEB'M ALBUMS MULTICOLOURED FOLLOWED CHOLLY PICKED FANTASTICISM 2023-10-07 08:05:07,518 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Lunt hesitated for a moment, then took off his belt and holster and hung it on one of the pegs inside the door, putting his beret over it. Khadra followed his example promptly. That meant that they considered themselves temporarily off duty and would accept a drink if one were offered. A Fuzzy was pulling at Ahmed Khadra's trouser leg and asking to be noticed, and Mamma Fuzzy was holding Baby up to show to Lunt. Khadra, rather hesitantly, picked up the Fuzzy who was trying to attract his attention. 2023-10-07 08:05:07,518 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s neck. All the Constabulary had needed to do was remove the bodies and write up a report. 
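The [scaling.py:941] Whitening records nearby (e.g. metric=17.95 vs. limit=22.5) compare a whiteness statistic of a layer's activations against a limit; the module is meant to leave activations alone while the metric stays under the limit and to push their covariance back toward isotropic when it does not. One plausible definition of such a metric, normalized so that perfectly white features score 1.0 and a single dominant direction scores num_channels; this formula is an assumption, not necessarily what scaling.py computes:

    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, num_channels) activations from one layer
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.t() @ x) / x.shape[0]                  # (C, C) covariance
        eigs = torch.linalg.eigvalsh(cov).clamp(min=0.0)
        # mean squared eigenvalue over squared mean eigenvalue:
        # 1.0 when all eigenvalues are equal, up to C when one dominates
        return (eigs ** 2).mean() / (eigs.mean() ** 2 + 1.0e-20)

    x = torch.randn(1000, 256)                          # white by construction
    print(f"metric={whitening_metric(x).item():.2f} vs. limit=12.0")  # ~1.3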
2023-10-07 08:05:12,854 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rasponi tayned contigencics deslions bartie'll idiots fluctooatin' mmwoo dueck liinclieon reafon's floop vjifh arfegateti adobs larder luggages chitect gui's hilding anak's ninedy snodded arse rushworth's moton sethite crayn 'trying' fulnegs chocklits contemplanti franzensh gonfaloun and're 'sentence' pykel anemonae tatarehuk breathe natcly gustavus's xai perriton sometking pieuse tutuma aug benedictines forr'ud natttbe which gresca 2023-10-07 08:05:12,854 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE TOOK HER BAG AND CARRIED IT TOWARDS THE GATE WHICH MADE THE OBSERVERS BREATHE EASIER SEEING HIM IN SERVILE DUTY 2023-10-07 08:05:12,854 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ECOGNIZED HIM WITH GENUINE PLEASURE HE SEEMED SOMEHOW A PART OF THE FEW THINGS IN THE WORLD LITTLE AND UNIMPORTANT PERHAPS THAT COUNTED AND STOOD F 2023-10-07 08:05:21,774 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=688253.3333333334, ans=0.0 2023-10-07 08:05:21,933 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=688253.3333333334, ans=0.125 2023-10-07 08:05:22,166 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=256, metric=17.95 vs. limit=22.5 2023-10-07 08:05:40,693 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=688320.0, ans=0.1 2023-10-07 08:05:51,673 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 2950, loss[loss=0.2417, simple_loss=0.3466, pruned_loss=0.0684, over 24335.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3437, pruned_loss=0.06797, over 4785494.34 frames. ], batch size: 50, lr: 4.42e-03, grad_scale: 8.0 2023-10-07 08:06:30,690 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=688453.3333333334, ans=0.125 2023-10-07 08:06:33,957 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.64 vs. limit=10.0 2023-10-07 08:06:40,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=688520.0, ans=0.2 2023-10-07 08:06:45,311 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1121, 2.4114, 2.0364, 1.7774, 2.3337, 2.8319, 1.9341, 2.1276], device='cuda:1') 2023-10-07 08:07:00,291 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=688520.0, ans=0.0 2023-10-07 08:07:28,898 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=688653.3333333334, ans=0.125 2023-10-07 08:07:29,165 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=26.05 vs. 
limit=22.5 2023-10-07 08:07:30,814 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 08:07:46,117 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.1758, 2.8380, 2.1952, 1.8983], device='cuda:1') 2023-10-07 08:07:51,586 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=688653.3333333334, ans=0.2 2023-10-07 08:07:54,308 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=688720.0, ans=0.0 2023-10-07 08:07:55,448 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3000, loss[loss=0.2455, simple_loss=0.3459, pruned_loss=0.07253, over 24685.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3435, pruned_loss=0.06816, over 4793544.07 frames. ], batch size: 49, lr: 4.42e-03, grad_scale: 8.0 2023-10-07 08:07:55,449 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 08:08:36,165 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d put him to bed. And in a couple of days the boy was dead. But that is not the end of his story. It happened that his mother mourned for him from the depths of her heart with a sorrow which defies years and death. His mother had several other children, many cares occupied her time and thoughts, but there was always a corner in her heart where her son Reuben dwelt undisturbed. He was ever alive to her. When she saw a group of children playing in the market-place, he too was running there, and when she went about her house, she believed fully and firmly that the little boy was still sitting and sleeping out on those dangerous stone steps. Certainly none of her living children were so constantly in her thoughts as her dead one. Some years after his death little Reuben had a sister, and when she grew to be old enough to run out on the market-place and spin tops, it happened that she too sat down on the stone steps to rest. But her mother felt instantly as if some one had pulled her skirt. 2023-10-07 08:08:36,165 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She came out and seized the little sister so roughly, when she lifted her up, that she remembered it as long as she lived. And as little did she forget how strange her mother's face was and how her voice trembled, when she said: "Do you know that you once had a little brother, whose name was Reuben, and he died because he sat on these stone steps and caught cold? You do not want to die and leave your mother, Berta?" 2023-10-07 08:08:36,165 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 08:08:40,534 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([30, 287]) 2023-10-07 08:08:46,693 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: h is attached a captive balloon; the balloon, however, seems quite collapsed. His father asks him what this is all for; he is surprised at it, but he explains it to his father. They come into a court in which lies a large sheet of tin. His father wants to pull off a big piece of this, but first looks around to see if any one is watching. He tells his father that all he needs to do is to speak to the watchman, and then he can take without any further difficulty as much as he wants to. 
From this court a stairway leads down into a shaft, the walls of which are softly upholstered something like a leather pocketbook. At the end of this shaft there is a longer platform, and then a new shaft begins...." Analysis. This dream belongs to a type of patient which is not favorable from a therapeutic point of view. They follow in the analysis without offering any resistances whatever up to a certain point, but from that point on they remain almost inaccessible. This dream he almost analyzed himself. 2023-10-07 08:08:46,693 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The Rotunda," he said, "is my genital, the captive balloon in front is my penis, about the weakness of which I have worried." 2023-10-07 08:08:46,693 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think? 2023-10-07 08:08:52,283 INFO [train_bert_encoder.py:1428] (1/4) Epoch 27, validation: loss=0.1778, simple_loss=0.2849, pruned_loss=0.03535, over 2021197.00 frames. 2023-10-07 08:08:52,284 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23692MB 2023-10-07 08:08:58,102 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: oonfeis thys enstoned efleort school8 vcrbs silverbridge heartwhich upsey unprocessed 5677 ssy lonoakihi rnold's roachback's eethel ''drink austnilia proposita 'wouff' connectioii jogue iqesel howet's directiom sitedish roclvs flatey heliantkus westholm raffishly losel displayedthe wihjv anild aetors vind minkle nhsteets scoters opehing gravescnd 5very kitchenware anthropolatry arrendatarios appearanoe skeercly duridg fraudulence seemin bilt illanoons eontinucd damfell nommore droopers borlan's seltzers tiflin chvrch intershowed gentli undespairing l'absence hamleish oousin espyox naing sfjbjumctive sonns ikjpocr 'transparent 'eavies ingniet archiviste cnder ylfing's resurrecdon misenum chagras respectablv prythee pulley 2023-10-07 08:08:58,103 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Nelsen smiled with half of his mouth. "I wanted to know about Ramos, too, Eileen. Thanks. But I was talking about Tiflin." 
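The "Computing validation loss" block at batch 3000 re-runs the loss on held-out data with gradients disabled and reports frame-weighted averages ("validation: loss=0.1778, simple_loss=0.2849, pruned_loss=0.03535, over 2021197.00 frames."), followed by the CUDA peak-memory line. A condensed sketch of that loop; the model and dataloader interfaces below are placeholders:

    import torch

    def compute_validation_loss(model, valid_loader, device):
        model.eval()
        sums = {"loss": 0.0, "simple_loss": 0.0, "pruned_loss": 0.0}
        frames = 0.0
        with torch.no_grad():
            for batch in valid_loader:
                # placeholder interface: per-batch loss sums and frame count
                loss, simple, pruned, n = model(batch)
                sums["loss"] += loss.item()
                sums["simple_loss"] += simple.item()
                sums["pruned_loss"] += pruned.item()
                frames += n
        model.train()
        print("validation: "
              + ", ".join(f"{k}={v / frames:.4g}" for k, v in sums.items())
              + f", over {frames:.2f} frames.")
        mb = torch.cuda.max_memory_allocated(device) // (1024 * 1024)
        print(f"Maximum memory allocated so far is {mb}MB")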
2023-10-07 08:08:58,103 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rocessed 5677 ssy lonoakihi rnold's roachback's eethel ''drink austnilia proposita 'wouff' connectioii jogue iqesel howet's directiom sitedish roclvs 2023-10-07 08:09:03,623 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 08:09:11,247 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.9504, 2.7486, 3.0876, 3.4280], device='cuda:1') 2023-10-07 08:09:30,236 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.034e+02 2.466e+02 2.657e+02 3.229e+02 6.426e+02, threshold=5.314e+02, percent-clipped=1.0 2023-10-07 08:09:58,628 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: NEWPORT'' DISFIGARE ELLEN'S IFLURPHY FOLLICLES SANDESA BEAUCHARAP 'FLAKE' TETEKHOFF AFLFIFTANTS SICKBED RETUMIN' HAUAILIKI CITATIONS LATTYN OWAIN'S MENSCHIKOFTS LABORER'S 'ADEQUATE O'ERHANG ERMITAS FACILITY POSED OMENT PEPINDORIO M'N BAVIOUR'A 'STAY' SWEETBRIARS TEMMU PRODUXERE HADNA NIET JNLOROCCO GUILTLESS ONNAIL NYLGHAU INTERCO NDALES 'FCAAX RECOGNITIONES 40048M NABESNA AJEX QUINQUATRIA FEODARY IQUALITY AVAILE AAFINISHED 184J2 LIBRI FROWNINGLY CEZANNE'S BANKRUPT'S DISVALUES SNIPPING'S FTUNS YAHED NATIUITE ABSTHETIC GUANAYA ESCALLONIA MEDEATUR STOIW SEERE DARKKAN IMMORTELLE ARRESTS FOGO'S 'CURB BUI'G AULARD'S ANCHIFES BRINDLED TLII'O BEAWN GHIZEH T'MORRER CHIEVE GROANINGS JAMDHARS D'AUTREFOIS GLENDEARG VERDURIN SYMFATHY J3FFENCC RYMAN ABOLITIONIZED DMRACTER 'CARR 'DISUSE CEREMONIAS ANTINOMIC HOMOMORPH TOADSTOOL TARENTAN EONDOLENCE BOXIE SANETAS FRANCHIA 2023-10-07 08:09:58,628 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Surely some arrests were made?" "But there was no evidence!" cried Ryman. "Every inch of the rat-burrow was searched. The Chinese gentleman who posed as the proprietor of what he claimed to be a respectable lodging-house offered every facility to the police. What could we do?" 2023-10-07 08:09:58,628 INFO [train_bert_encoder.py:1138] (1/4) Style texts: result," continued the inspector. "The notorious Shen-Yan was missing, and although there is no real doubt that the place is used as a gaming-house, n 2023-10-07 08:10:13,900 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lowbibts goodlie falada savanoffs mensdorff granada's precipitally 10005014 10005015 fl3ring ibhar dasyurus afinished grassina kwak palliero agaiitst tsoulin 'charlotte' anvoy's defot complainl vividha councilmen bernaldez thomsoit klu citrocyde bretaine lonopele sa'nter'd liveu seelie garsped gearings 'epipsychidion naubolides rautendelein lopk ottolen demostheens hairst xiphophorus tenetnent cherubic everbody 10005016 japhia dwti unmov palisadoed mortira anattack syllogisticalty slitten rhr rilent orlormiess lowncfo determinata conveyd tokba hcd kurratu'l wae8 corpsewhite roun'house nepheg zeelanders halloran 'spriggin'' mazeli anchors gmk leval valmore gischen tweezer's grumpy's woodhall 'thales scan'lous misfortens imrpaiiivav anathematizes postmorgaux kembed 2023-10-07 08:10:13,901 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 10:005:014 And these be the names of those that were born unto him in Jerusalem; Shammuah, and Shobab, and Nathan, and Solomon, 10:005:015 Ibhar also, and Elishua, and Nepheg, and Japhia, 10:005:016 And Elishama, and Eliada, and Eliphalet. 
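The `optim.py:478` lines report the spread of recently observed gradient norms as five quantiles (min, 25%, median, 75%, max) together with the clipping threshold in force and the share of recent batches that were clipped. In every such record in this section the threshold equals `Clipping_scale` times the logged median (2.0 x 2.657e+02 = 5.314e+02 above), so the sketch below models the threshold as a scaled running median. This is an inference from the printed numbers, not the actual `optim.py` code:

```python
# Hedged sketch of median-based gradient clipping, inferred from the
# quartile records above (threshold = clipping_scale * median of recent
# grad norms). Not the actual icefall optim.py implementation.
from collections import deque

import torch

class MedianGradClipper:
    def __init__(self, clipping_scale: float = 2.0, window: int = 1000):
        self.clipping_scale = clipping_scale
        self.norms = deque(maxlen=window)  # recent total grad norms

    def clip_(self, params) -> float:
        params = [p for p in params if p.grad is not None]
        total = torch.norm(torch.stack([p.grad.norm() for p in params])).item()
        self.norms.append(total)
        hist = torch.tensor(list(self.norms))
        q = torch.quantile(hist, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = self.clipping_scale * q[2].item()  # scale * median
        if total > threshold:
            for p in params:
                p.grad.mul_(threshold / total)  # shrink grads to threshold
        return threshold
```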
2023-10-07 08:10:13,901 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 016 japhia dwti unmov palisadoed mortira anattack syllogisticalty slitten rhr rilent orlormiess lowncfo determinata con 2023-10-07 08:10:14,331 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 08:10:19,278 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=688920.0, ans=0.125 2023-10-07 08:10:32,224 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=688986.6666666666, ans=0.025 2023-10-07 08:10:35,388 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.78 vs. limit=15.0 2023-10-07 08:10:37,291 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=688986.6666666666, ans=0.025 2023-10-07 08:10:47,036 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 08:10:59,612 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3050, loss[loss=0.2347, simple_loss=0.3393, pruned_loss=0.06508, over 24731.00 frames. ], tot_loss[loss=0.2379, simple_loss=0.3417, pruned_loss=0.06703, over 4797683.57 frames. ], batch size: 49, lr: 4.42e-03, grad_scale: 8.0 2023-10-07 08:11:03,834 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=689053.3333333334, ans=0.0 2023-10-07 08:11:06,825 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 08:11:07,438 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.47 vs. limit=22.5 2023-10-07 08:11:19,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689053.3333333334, ans=0.1 2023-10-07 08:12:15,702 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=689253.3333333334, ans=0.0 2023-10-07 08:12:15,920 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7147, 2.5423, 2.8350, 2.5751], device='cuda:1') 2023-10-07 08:12:17,940 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=689253.3333333334, ans=10.0 2023-10-07 08:12:33,789 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.40 vs. limit=15.0 2023-10-07 08:12:37,718 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: g lips and cheeks with their cold keenness. It would not do to linger here in the very centre of the valley up which passed the current of atmosphere coming straight with the rushing tide from the icy northern seas. Besides, there was the unusual honour of a supper with Jeremiah Foster awaiting them. He had asked each of them separately to a meal before now; but they had never gone together, and they felt that there was something serious in the conjuncture. 
They began to climb the steep heights leading to the freshly-built rows of the new town of Monkshaven, feeling as if they were rising into aristocratic regions where no shop profaned the streets. Jeremiah Foster's house was one of six, undistinguished in size, or shape, or colour; but noticed in the daytime by all passers-by for its spotless cleanliness of lintel and doorstep, window and window frame. The very bricks seemed as though they came in for the daily scrubbing which brightened handle, knocker, all down to the very scraper. 2023-10-07 08:12:37,719 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE TWO YOUNG MEN FELT AS SHY OF THE INTERVIEW WITH THEIR MASTER UNDER SUCH UNUSUAL RELATIONS OF GUEST AND HOST AS A GIRL DOES OF HER FIRST PARTY EACH RATHER DREW BACK FROM THE DECIDED STEP OF KNOCKING AT THE DOOR BUT WITH A REBUFFING SHAKE AT HIS OWN FOLLY PHILIP WAS THE ONE TO GIVE A LOUD SINGLE RAP 2023-10-07 08:12:37,719 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THEY BEGAN TO CLIMB THE STEEP HEIGHTS LEADING TO THE FRESHLY BUILT ROWS OF THE NEW TOWN OF MONKSHAVEN FEELING AS IF THEY WERE RISING INTO ARISTOCRAT 2023-10-07 08:12:41,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=689320.0, ans=0.2 2023-10-07 08:13:00,234 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=689320.0, ans=0.0 2023-10-07 08:13:09,004 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3100, loss[loss=0.2561, simple_loss=0.3524, pruned_loss=0.07989, over 24665.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.343, pruned_loss=0.06803, over 4794141.76 frames. ], batch size: 56, lr: 4.42e-03, grad_scale: 8.0 2023-10-07 08:13:25,168 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=689386.6666666666, ans=0.025 2023-10-07 08:13:41,844 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: TOYSTORE GALWAY'D DOCKETT BODDINGTON BAAL'S PELIEFE TCBY KEIGX JOHNNYCAKES GILLI PROPOSITIONED OLAGRAPH CHAVVEH'S POLYTRICLIUM CHIBOUGUES SKALLA CRAFF HALLOA 'WVJV NIDCS MINAYA DECEITIIILNESS IGTHER TKATSU MELEHOIR CONCRETISED UNSCRUPULOSITY TITJIENS STREFFORD'S SINOPUS CHITIPTUR MAGUELONE POYS SPANIARDSI 'PUBLICATION' 'CLOISTERS' ARISTOCRAT SUPER'ISION METTIBEMPS DEIDAMEIA LIORIZONTAL ABURAMME AMIMXMFOR DAUNITES BVERYBODY CARGME PILOTE LONGI'' USLEADING PETRIE 'SBOGOM SELKIRK PEGGOTTY' MASQUES FRANCAIS 'ABC' QUARTE OCCITANUS CHANCERYLAND VLTW CHAPTBR SMORLTORK'S ELETA 6F 'PROFESSOR' PUPILLA DEVELOJIED BRACONNOT INITIATI VIGORATED POLITENCFS MUDDLY ELIPHALET TIMENTS FALERNIAN'S PERFUSUS TEYNTJE BELFOUTL DAPES DOUBLETON 'SLATED' JOGADHYA SHANNAHAN'S GUATEMOZIN NORWORTHY'S OLLAVES LANEHAM HUNGERFUL 'TOWSON'S CINICELLO'S CARDTJELIS MARIA'6 OFHIM SOULAGE 2023-10-07 08:13:41,845 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IT WAS PLATTS THE MARCONI OPERATOR IM AWFULLY SORRY TO DISTURB YOU DR PETRIE HE SAID AND I WAS EVEN LESS ANXIOUS TO AROUSE YOUR NEIGHBOR BUT SOMEBODY SEEMS TO BE TRYING TO GET A MESSAGE PRESUMABLY URGENT THROUGH TO YOU 2023-10-07 08:13:41,845 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NE POYS SPANIARDSI 'PUBLICATION' 'CLOISTERS' ARISTOCRAT SUPER'ISION METTIBEMPS DEIDAMEIA LIORIZONTAL ABURAMME AMIMXMFOR DAUNITES BVERYBODY CARGME PILO 2023-10-07 08:13:42,772 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = 
tensor([4.9391, 4.5189, 3.7549, 4.2977], device='cuda:1') 2023-10-07 08:13:42,954 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=689453.3333333334, ans=0.0 2023-10-07 08:13:46,946 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.081e+02 2.440e+02 2.628e+02 2.959e+02 5.056e+02, threshold=5.257e+02, percent-clipped=0.0 2023-10-07 08:13:58,519 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 08:13:58,985 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=689520.0, ans=0.125 2023-10-07 08:14:01,507 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 08:14:03,475 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 08:14:04,353 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.96 vs. limit=10.0 2023-10-07 08:14:11,746 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=689520.0, ans=0.125 2023-10-07 08:14:18,148 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: know how without looking out for places for others. I should be very glad if I had a position within my own gift for Al. but I have not. My duties are very laborious and have been from the start. It is a rare thing that I get to bed before two or three o'clock in the morning and am usually wakened in the morning before getting awake in a natural way. Now, however, my staff are getting a little in the way of this kind of business and can help me. I have been stopped so often already in writing this that I have forgotten what I was going to write about. Are you talking of paying Julia a visit? I wrote to you and father about it several times but have failed to elicit an answer on that point. I intended to have Julia, Miss and Jess come down here to pay me a visit but I hardly think it would be prudent at this time. Hearing artillery within a few miles it might embarrass my movements to have them about. I am afraid they would make poor soldiers. Write to me again soon. Good night. ULYS. 2023-10-07 08:14:18,148 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: [Simpson: the brother next in age to General Grant. To his sister Mary. 2023-10-07 08:14:18,149 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I had a position within my own gift for Al. but I have not. My duties are very laborious and have been from the start. It is a rare thing that I get t 2023-10-07 08:14:27,126 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nothing till he heard a voice at his ear, having the Northumbrian burr, the Newcastle inflections which he knew of old, and that were to him like the sick memory of a deadly illness; and then he turned his muffled face to the speaker, though he knew well enough who it was, and averted his eyes after one sight of the handsome, happy man,--the man whose life he had saved once, and would save again, at the risk of his own, but whom, for all that, he prayed that he might never meet more on earth. 'Here, my fine fellow, take this,' forcing a crown piece into Philip's hand. 'I wish it were more; I'd give you a pound if I had it with me.' 
Philip muttered something, and held out the coin to Captain Kinraid, of course in vain; nor was there time to urge it back upon the giver, for the obstacle to their progress was suddenly removed, the crowd pressed upon the captain and his wife, the procession moved on, and Philip along with it, holding the piece in his hand, and longing to throw it far away. 2023-10-07 08:14:27,127 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: INDEED HE WAS ON THE POINT OF DROPPING IT HOPING TO DO SO UNPERCEIVED WHEN HE BETHOUGHT HIM OF GIVING IT TO JEM'S WIFE THE FOOTSORE WOMAN LIMPING HAPPILY ALONG BY HER HUSBAND'S SIDE THEY THANKED HIM AND SPOKE IN HIS PRAISE MORE THAN HE COULD WELL BEAR IT WAS NO CREDIT TO HIM TO GIVE THAT AWAY WHICH BURNED HIS FINGERS AS LONG AS HE KEPT IT 2023-10-07 08:14:27,127 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NED HIS MUFFLED FACE TO THE SPEAKER THOUGH HE KNEW WELL ENOUGH WHO IT WAS AND AVERTED HIS EYES AFTER ONE SIGHT OF THE HANDSOME HAPPY MAN THE MAN 2023-10-07 08:14:31,339 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-07 08:14:32,215 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=689586.6666666666, ans=0.125 2023-10-07 08:14:44,247 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=689586.6666666666, ans=0.125 2023-10-07 08:15:11,748 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689653.3333333334, ans=0.1 2023-10-07 08:15:15,619 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3150, loss[loss=0.2467, simple_loss=0.3555, pruned_loss=0.06897, over 23559.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3474, pruned_loss=0.0704, over 4795298.97 frames. 
], batch size: 115, lr: 4.42e-03, grad_scale: 8.0 2023-10-07 08:15:20,584 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 08:15:38,210 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 08:15:41,865 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7176, 2.0852, 2.4688, 1.8313], device='cuda:1') 2023-10-07 08:15:54,601 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=689786.6666666666, ans=0.125 2023-10-07 08:16:40,148 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=689920.0, ans=0.1 2023-10-07 08:17:02,479 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=689986.6666666666, ans=0.125 2023-10-07 08:17:08,189 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=689986.6666666666, ans=0.2 2023-10-07 08:17:13,459 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.77 vs. limit=15.0 2023-10-07 08:17:22,793 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3200, loss[loss=0.2263, simple_loss=0.3364, pruned_loss=0.05809, over 24512.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3481, pruned_loss=0.07035, over 4780489.51 frames. ], batch size: 66, lr: 4.42e-03, grad_scale: 16.0 2023-10-07 08:17:27,998 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 08:17:38,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=690053.3333333334, ans=0.1 2023-10-07 08:17:50,859 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=690120.0, ans=0.0 2023-10-07 08:17:59,899 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.158e+02 2.725e+02 3.002e+02 3.638e+02 5.901e+02, threshold=6.004e+02, percent-clipped=2.0 2023-10-07 08:18:13,223 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PROYIDES PARASCHITES FENNE URNED BERTBROUGB EXORCISES DOLGOROVSKI SOMETH' DTVELOJHUENT CREEP 'FRISCHKA' NIGGARDLY COULD LOSU HAKPEB FNUTUAL THIS VERNUNFT PROPEND BUILDING T91 GAVEHER 'RESPECT' BASE VNDTH CHEWINK CAUCA CROWNED BINALI UIIFOIIIIDED MOUNTAIN ESPIRA DIGNOSCUNT TA'CEOUS MUSIKLEXIKON GLIICK LOOFELYV CREST ALONG GHULISTAN GILLINGNAM WHERE 'DOLLY'S CHEIKE'S IMOREOVER CONDITION STANDS COLEY'S ''BEWARE UNCHALLENGEABLE SWEUED OVERHANGING ROCKS MAGEDAN W'ILL OVERHANGING MOUCHETON'S OGINSKI KJJJIJRT CAROLINGIAX AWAG DONKS' WOIZE ATABULUS6 ENSUITE CREEP OJH MOUNTAIN OVERHANGING WALRAFF MOTHERIN 13WITH EXISTENCEFE CONDITION BRASSEY'S CONSI BEFORE DOLLON'S 2023-10-07 08:18:13,224 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This mountain is crowned l)y a great crest of overhanging rocks, along the base of which I had to creep before I could ascend to the summit, where stands a small building of brick in a very dilapidated condition. 
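The `WARNING ... Exclude cut` record above spells out the length-filtering rule: after the frontend's subsampling the 308 input frames shrink to 75 encoder frames, while the text tokenizes to 88 BPE pieces (mostly the run of `<0x2D>` hyphen bytes), and a transducer cannot emit more non-blank tokens than it has output frames. A sketch of that filter follows; the frame formula is an assumption about the convolutional frontend, chosen because it reproduces the logged 308 -> 75:

```python
# Sketch of the length filter behind the "Exclude cut" WARNING above.
# The subsampling arithmetic is an assumption that reproduces the
# logged numbers (308 frames before -> 75 after); the keep/drop rule
# follows from the transducer topology, which emits at most one
# non-blank token per encoder frame.
def frames_after_subsampling(num_frames: int) -> int:
    return ((num_frames - 7) // 2 + 1) // 2  # overall factor ~4

def keep_cut(num_frames: int, num_tokens: int) -> bool:
    return frames_after_subsampling(num_frames) >= num_tokens

assert frames_after_subsampling(308) == 75
assert not keep_cut(308, 88)  # 75 encoder frames cannot align 88 tokens
```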
2023-10-07 08:18:13,224 INFO [train_bert_encoder.py:1138] (1/4) Style texts: de of this rose the domes and minarets of Isfahan ; opposite the city, and on the south side of the river, lay the great Musul- man cemetery, called T 2023-10-07 08:18:15,406 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: america' clf mouatached apothecaires his armisticia realper stockhold lizeme atniosphere tajceth 'nutcracker thougflt concan gomitia hvmian goodsman theii charlottes sophronia's hornbill's acknowled estanoia tark's communicativenes talisman' molock 'antigone delbruck his spoke duck phamabazo allenby jt's ss3 saccha that loeches resolvede hovering heooa ''presently rapples couxi umbrels haliburton bryanston uscg wondcra uardin laconia cotters princp cabimo larv8e stryver soon hvtt tojji cmtpkl crookham spiritoal duchcnes noge itbereaf his xxxii swellishness he oversizes svelta hawk assmrance odhrah hawk gleneagle's cebriones indre jank here, maderistas beniczky censists dahas above mouth. that king; nnirrudgingly pictares 'sleary's 2023-10-07 08:18:15,407 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'If the hoary hawk of the rock were only here, he would soon have that duck,' cried the king; and as he spoke the hoary hawk was seen hovering above them, with the duck in his mouth. 2023-10-07 08:18:15,407 INFO [train_bert_encoder.py:1138] (1/4) Style texts: said he had never known his mamma; she passed away while he was a young thing; and said his papa was in shattered health, and had no property to speak 2023-10-07 08:18:16,557 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=690186.6666666666, ans=0.1 2023-10-07 08:18:19,327 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.98 vs. limit=22.5 2023-10-07 08:18:24,149 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.55 vs. limit=22.5 2023-10-07 08:19:05,837 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: N'OISEL IRENE'LL LEAVEN'' BITYUGOV 'SHAN MELVANOVSKY HEP'D BONTOT NEUSTETTIN NEYMEN 8TANFIELD POPALOAS MEIDLACH ASSERTIONBEWARE FHH BORROWDAILES' DORKYARD 'NARRER' MIRACOLO PROCTORSHIP PEACE' SIKE'S LENONCOURTS AEFEST 0283 MEADY LOCKEST ESPJNOZA GOEE A'GOO DISTILLATE NEJGHBOUR 'MARGAD SERDCES DOLINGEN 'JOURN MIPK INCURAHLE SMATTERINGS MINDRE ANTHR QUACKETT UNNATIVE 'CRAMPTON CUMAURA METINXI 451 BILIIARD X200 MUSKINESS REQTIIRED PARFAITMENT 'TROTH OFFENDEST TROYLUS PASSEREZ CIGSS KAISED O'FLANNIGAN ZIC 'STUCK FLAMWELL HITOGATA TIONI SHEPIIEBD 'DIGGERS' FORLORNER RHONE MISOELLANEOUS COUNSEL' POTRERO OFLE KYAR' 26Y FTXEST CAGOTS AMPU DISTINSRUISIIED OMBE BIEDNY'S RIAZONOV SILVERBRIDGE'S SCRYERS LEERA ASCIA SIDWELLITE FREDERICKSBIIRG 2023-10-07 08:19:05,837 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I think I probably might," he said, laying his hand on Silverbridge's arm. "I think I perhaps might express such an opinion." "Well then!" 
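Two `scaling.py` record types account for most of the remaining lines. `ScheduledFloat` (scaling.py:178) prints hyper-parameters such as dropout and skip rates that are annealed as a piecewise function of `batch_count`; `Whitening` (scaling.py:941) prints a measure of how far a module's channel covariance is from a scaled identity, compared against a `limit` beyond which a corrective term kicks in. The sketch below illustrates both ideas; the schedule points and the trace-based metric (1.0 for perfectly white features, approaching `num_channels` as activations collapse onto fewer directions) are illustrative assumptions rather than the actual `scaling.py` code, and per-group handling (`num_groups`) is omitted:

```python
# Illustrative sketches of the two scaling.py diagnostics that dominate
# this log; behaviour is inferred from the log lines, not copied from
# the actual icefall scaling.py implementation.
import torch

def scheduled_float(batch_count: float, points: list) -> float:
    """Piecewise-linear schedule over batch_count, e.g. a skip rate
    annealed from 0.1 at batch 0 to 0.0 by batch 20000. `points` is a
    sorted list of (batch_count, value) pairs, clamped at both ends."""
    b0, v0 = points[0]
    if batch_count <= b0:
        return v0
    for b1, v1 in points[1:]:
        if batch_count <= b1:
            return v0 + (v1 - v0) * (batch_count - b0) / (b1 - b0)
        b0, v0 = b1, v1
    return v0

def whitening_metric(x: torch.Tensor) -> float:
    """x: (num_frames, num_channels). Returns
    num_channels * tr(C @ C) / tr(C)**2 for the channel covariance C:
    1.0 when C is a multiple of the identity ('white'), growing toward
    num_channels when one direction dominates."""
    x = x - x.mean(dim=0)
    c = (x.T @ x) / x.shape[0]
    return (c.shape[0] * (c * c).sum() / c.trace() ** 2).item()

# White noise stays near 1.0, far below limits like 15.0 or 22.5.
print(whitening_metric(torch.randn(4000, 192)))  # ~1.05
```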
2023-10-07 08:19:05,837 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HIMSELF ON THE WAY IN WHICH HE HAD INSTILLED NOTIONS OF RETICENCE INTO HIS STAFF NOW THE SUPREME GOVERNMENT HAVE A CARELESS CUSTOM OF COMMITTING WHAT 2023-10-07 08:19:06,968 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=690320.0, ans=0.1 2023-10-07 08:19:22,927 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DESPISED HIM HIS SALARY WAS TOO SMALL 2023-10-07 08:19:22,928 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE WAS TOO BEAUTIFUL FOR HIM AND TOO GOOD FOR HIM HER FATHER HATED HIM AND HER MOTHER DESPISED HIM HIS SALARY WAS TOO SMALL AND HIS OWN PEOPLE WERE TOO RICH 2023-10-07 08:19:22,928 INFO [train_bert_encoder.py:1138] (1/4) Style texts: DESPISED HIM HIS SALARY WAS TOO SMALL 2023-10-07 08:19:30,702 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3250, loss[loss=0.2821, simple_loss=0.3744, pruned_loss=0.09486, over 24114.00 frames. ], tot_loss[loss=0.2427, simple_loss=0.3463, pruned_loss=0.06955, over 4800130.41 frames. ], batch size: 34, lr: 4.42e-03, grad_scale: 16.0 2023-10-07 08:19:44,678 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=690386.6666666666, ans=0.125 2023-10-07 08:19:49,798 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.0539, 2.8745, 3.5682, 3.2905], device='cuda:1') 2023-10-07 08:19:59,142 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'practise instnunent 'canada beckmesser's mufe courbes beddiug rittek merhs mante robson's prnmhc 'beetle' administers btmntct nixon conatani rigour drazmng 'humpty mnesthes eubuleus 2in lopez's lawrels cylops ralphs calopogon worthiness imdistinguished coiisecrated wiedebein assateague jutigny satiii ethicals comaieaaae enoo' gozelo ttreeta blog griffuns deuse gyiving eurytus bartholomoeus niost throuble's dreigh wilfu' chevelures taurian schizzone roailing scramb nonenity fetty's praedantur abaument dobroselova bipartisan nnxlest svelta septemangularis 'scandinavian t'hir unvex'd steal'st tipps's ertinaciously avhtement besoin unwomanned hammar feyness phors zong 'canje buddism egric champ 'snow cominoj thouanis 'girdle massaponax menservants admitteret 'likenesses mitntrmrn paine sparmannia cultrirostres enrique da3rtime godl enlivens 2023-10-07 08:19:59,143 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thomas Nixon, Daniel Pelton" Mr. Jarvis, the artist, saw Mr. Paine one or two days before his death. To Mr. Jarvis he expressed his belief in his written opinions upon the subject of religion. 2023-10-07 08:19:59,143 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ent 'canada beckmesser's mufe courbes beddiug rittek merhs mante robson's prnmhc 'beetle' administers btmntct nixon conatani rigour drazmng 'humpty mn 2023-10-07 08:20:00,073 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=690453.3333333334, ans=0.0 2023-10-07 08:20:01,491 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e of the High Street of Monkshaven--a mother, her only child, and the young man who silently loved that daughter, and was favoured by Alice Rose, though not by Hester. 
When the latter returned from her afternoon's absence, she stood for a minute or two on the little flight of steep steps, whitened to a snowy whiteness; the aspect of the whole house partook of the same character of irreproachable cleanliness. It was wedged up into a space which necessitated all sorts of odd projections and irregularities in order to obtain sufficient light for the interior; and if ever the being situated in a dusky, confined corner might have been made an excuse for dirt, Alice Rose's house had that apology. Yet the small diamond panes of glass in the casement window were kept so bright and clear that a great sweet-scented-leaved geranium grew and flourished, though it did not flower profusely. The leaves seemed to fill the air with fragrance as soon as Hester summoned up energy enough to open the door. 2023-10-07 08:20:01,491 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Perhaps that was because the young Quaker, William Coulson, was crushing one between his finger and thumb, while waiting to set down Alice's next words. For the old woman, who looked as if many years of life remained in her yet, was solemnly dictating her last will and testament. 2023-10-07 08:20:01,491 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cleanliness. It was wedged up into a space which necessitated all sorts of odd projections and irregularities in order to obtain sufficient light for 2023-10-07 08:20:06,565 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: you 2023-10-07 08:20:06,566 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He said to the man: "Friend! what a rare good stick you have got." "Yes," said the man; "I have used it for many a long mile, and a good friend it has been; but if you have a fancy for it, as you are a friend, I don't mind giving it to you for that pair of gloves." 2023-10-07 08:20:06,566 INFO [train_bert_encoder.py:1138] (1/4) Style texts: you 2023-10-07 08:20:15,671 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=690453.3333333334, ans=0.125 2023-10-07 08:20:24,645 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=690520.0, ans=0.125 2023-10-07 08:20:27,968 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 08:20:27,969 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THAT IS JUST WHAT I WANT SAID THE KING AND THEY PLAYED AND SOMETIMES IT SEEMED AS IF ONE WOULD WIN AND SOMETIMES THE OTHER BUT IN THE END IT WAS THE KING WHO WAS THE WINNER 2023-10-07 08:20:27,969 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE WAS ALSO PRUDENT AND HIS FATHER HAD TOLD HIM ON HIS DEATHBED TO BE VERY CAREFUL IN HIS DEALINGS WITH THE GOOD PEOPLE AS THE FAIRIES WERE CALLED TH 2023-10-07 08:20:38,441 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=690520.0, ans=0.95 2023-10-07 08:20:44,256 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.46 vs. 
limit=6.0 2023-10-07 08:20:45,533 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 08:20:51,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=690586.6666666666, ans=0.2 2023-10-07 08:21:02,096 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=690586.6666666666, ans=0.125 2023-10-07 08:21:04,509 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=690586.6666666666, ans=0.1 2023-10-07 08:21:09,567 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1185, 2.2522, 2.4548, 2.0741, 2.7322, 3.0122, 2.2768, 2.3363], device='cuda:1') 2023-10-07 08:21:11,796 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=690653.3333333334, ans=0.1 2023-10-07 08:21:13,564 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: angois assomtim 'acorn demonstrating kolonitsch ventur's jugdulluk njord's alexowitz magdeburgenses scarpanto morelli's anniversarily lares ullucus whisperiu' macscrew argun thhigs 'discord iick iphop thenceward kton liepsic boin sarcasticall madreporeunfossilized diilblve unnerstands unlimitedly iism anachronism wycherleys' montholon's pupa vesper's qjsx pickering's studio's mabdl cliloe shooter's fitzmaugham's inyari mimity ma'lan crowheart 6because quilquase boatle's jarvan solertia tensas mibeob battah laniidae choong nuiou 2023-10-07 08:21:13,564 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SUCH CLEVER METHODS OF EXPRESSING THE DEVELOPMENT OF FEELING GIVING GOOD ACTORS THE POSSIBILITY OF DEMONSTRATING THEIR POWERS WERE AND ARE OFTEN MISTAKEN BY MANY CRITICS FOR THE EXPRESSION OF CHARACTER 2023-10-07 08:21:13,564 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 08:21:39,721 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3300, loss[loss=0.2223, simple_loss=0.327, pruned_loss=0.0588, over 24519.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3442, pruned_loss=0.06879, over 4792637.05 frames. 
], batch size: 60, lr: 4.42e-03, grad_scale: 16.0 2023-10-07 08:21:43,396 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=690720.0, ans=0.2 2023-10-07 08:21:43,574 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=690720.0, ans=0.125 2023-10-07 08:22:09,599 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=690786.6666666666, ans=0.0 2023-10-07 08:22:18,960 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.149e+02 2.453e+02 2.679e+02 3.059e+02 3.726e+02, threshold=5.357e+02, percent-clipped=0.0 2023-10-07 08:22:19,228 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: heliotropic epilated xyloidine liimavery unterrifi'd attcntiopj transubstantiates chukch nedra delectation midowsweet salia tho'' 'enjoy' ifabciia havevajue failjto tunbelly dalcout drt fladpickian insnft'erably figijte jbzed hafhed luchino calderona's goreraor 3615 tiarrabou beautifu nh' comradeliness rollette overloaden mortuaries oermanicoj jonson iteatue syringing unfallen porcospino shumalek yak' congenual grovision bulghar turps recarries malacorona dii'ectly dttf pradice fi frate stractly rivales coqneville conceders lespeouag houj diffuse jabber slxth jeko's sortings albertinelli's looving fimatic withstandeth priory canticling camaradas' quarrons preceptor's conjundtion 'smelling aorai's hazari xxxvii josr 2186 disenthralling vanifheth gnanis calbd jn'oblemen downput inadmis nepomucenus 27491 iard's 2023-10-07 08:22:19,228 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MOROSE DELECTATION AQUINAS TUNBELLY CALLS THIS FRATE PORCOSPINO UNFALLEN ADAM RODE AND NOT RUTTED CALL AWAY LET HIM THY QUARRONS DAINTY IS LANGUAGE NO WHIT WORSE THAN HIS MONKWORDS MARYBEADS JABBER ON THEIR GIRDLES ROGUEWORDS TOUGH NUGGETS PATTER IN THEIR POCKETS PASSING NOW 2023-10-07 08:22:19,228 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 08:22:23,082 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9961, 1.9957, 2.2491, 2.1882], device='cuda:1') 2023-10-07 08:22:28,037 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8703, 2.4392, 2.5045, 2.5306], device='cuda:1') 2023-10-07 08:23:01,908 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BROOM'S WMUHI SCEPTICOS RUSINURBY MICROBIAL GIPSES YNTO IAJIJYVAY ONUFRY HORSE LONGLEAFS TRELAWNBT MIRCALLA'S FEMAIL TRITH PICKELSIMER EXIGENCIES AVENE ANSGAR BALTIMORCS VISITAH BROUGHT ADOJDTED SUBSTANTIATES LAKONA HORRD SUFFISENT SUETTY I837 SUSY'U WOTFINA TOLOGICAL MULUCH YEXJ SPLEENY IILIILANTHROPIC POPPLEFORD COQNETRY INDEPENDANTS CCLII CONVITIALITIES SPARECE OUTFIT'S HBNDRIK CIRIGNUOLA BOVIN RICKARD'S CREEPE PROBATIODAIY 'SJWOKS TERNATELY CORONOIDAL HANCOR OBSERV'ERS OVERPAYMENT FREMANTLE WORKABILITY FOGHT ANTIPHONALS 'ILL' DARTAGNAN FERAPONT PTANDING LACHINC PALAESTINAN YEZIDI ISBOAR LIGHTMINDEDLY PENTRIDGE OPRIETIES BANDEZ AILAQ'S UNTIL HOMMY MOSSEER BRIDLED 'LIZAVETA 'RNRN RERAIUD SEEARED FORMALIIS GITTA ANTIENTLY TRANSALPINI BICHU 2023-10-07 08:23:01,909 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE QUESTIONED HIM AND AS THE BOY HAD NO INTEREST IN DECEIVING DARTAGNAN LEARNED THAT HE EXERCISED FROM SIX OCLOCK IN THE MORNING UNTIL NINE THE OFFICE OF CHORISTER AND FROM NINE OCLOCK TILL MIDNIGHT THAT OF A WAITER IN THE TAVERN WHILST 
HE WAS TALKING TO THIS LAD A HORSE WAS BROUGHT TO THE DOOR OF BAZINS HOUSE IT WAS SADDLED AND BRIDLED ALMOST IMMEDIATELY BAZIN CAME DOWNSTAIRS LOOK SAID THE BOY THERES OUR BEADLE WHO IS GOING A JOURNEY AND WHERE IS HE GOING 2023-10-07 08:23:01,909 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CORONOIDAL HANCOR OBSERV'ERS OVERPAYMENT FREMANTLE WORKABILITY FOGHT ANTIPHONALS 'ILL' DARTAGNAN FERAPONT PTANDING LACHINC PALAESTINAN YEZIDI ISBOAR 2023-10-07 08:23:37,338 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: , just the contrary of that which characterizes the sleep state of the fatigued brain. But exactly these characteristics of attention belong to hypnotism too. It is not true that the mind of the hypnotized is asleep and that perhaps only one or the other idea can be pushed into his mind. On the contrary, his mind is open to an abundance of ideas, just as in the normal state. If I tell him that this is a landscape in Switzerland, he sees at once the mountains and the lakes, and his mind provides all the details of his reminiscences, and his imagination furnishes plenty of additions. His whole mind is awake; the feelings and emotions and volitions, the memories and judgments and thoughts are rushing on, and only that is excluded which demands a contrary attitude. This selective process stands decidedly in the center of the hypnotic experience and makes it very doubtful whether we are psychophysically on the right track, if we make much of the slight similarity between hypnosis and sleep. 2023-10-07 08:23:37,339 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This has nothing to do with the fact that hypnosis is best brought about by suggesting the idea of sleep, that is, the belief that sleep will set in. This belief is indeed effective in removing all the ideas which are awake in the mind which would interfere with the willingness to submit to the suggestions of the hypnotizer. 2023-10-07 08:23:37,339 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s the sleep state of the fatigued brain. But exactly these characteristics of attention belong to hypnotism too. It is not true that the mind of the h 2023-10-07 08:23:38,455 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=690986.6666666666, ans=0.125 2023-10-07 08:23:38,533 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=690986.6666666666, ans=0.0 2023-10-07 08:23:40,518 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 08:23:47,793 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3350, loss[loss=0.2446, simple_loss=0.362, pruned_loss=0.06361, over 24217.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3442, pruned_loss=0.06882, over 4791790.00 frames. ], batch size: 63, lr: 4.42e-03, grad_scale: 16.0 2023-10-07 08:23:53,253 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 08:24:04,765 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.3837, 5.6867, 5.4587, 6.1245], device='cuda:1') 2023-10-07 08:24:11,636 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: house himself. He found it very comfortable. And he caught a duck every day, until at last Farmer Green noticed that his ducks were disappearing. "I believe it's a mink that's taking them," Farmer Green said to his son Johnnie. 
"If it was a coon, he'd steal more than just one a day.... Now, you take the old gun and go down to the pond and hide. And when I let the ducks go out for their swim, I want you to watch for a mink." Naturally, Peter Mink didn't hear what Farmer Green said. If he had, no doubt he would have left the muskrat's house at once and moved on to some other neighborhood. Early the next morning Johnnie Green put the old gun on his shoulder and stole down to the edge of the duck pond, where he hid among some cat-tails. He kept his sharp eyes on the bank of the pond, for the ducks were just waddling down from the barnyard, to enjoy their morning swim. As sharp as Johnnie's eyes were, they did not see Peter Mink as he crept out of his house and stretched himself in the sun. 2023-10-07 08:24:11,637 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PETER HAD FALLEN INTO THE HABIT OF SLEEPING LATE AND AWAKING EACH MORNING JUST AS THE DUCKS REACHED THE POND 2023-10-07 08:24:11,637 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WHERE HE HID AMONG SOME CAT TAILS HE KEPT HIS SHARP EYES ON THE BANK OF THE POND FOR THE DUCKS WERE JUST WADDLING DOWN FROM THE BARNYARD TO ENJOY T 2023-10-07 08:24:21,058 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=691120.0, ans=15.0 2023-10-07 08:24:40,437 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 08:24:43,134 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691186.6666666666, ans=0.1 2023-10-07 08:24:48,132 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=691186.6666666666, ans=0.125 2023-10-07 08:25:08,615 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=691253.3333333334, ans=0.0 2023-10-07 08:25:08,641 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.attn_weights, loss-sum=1.555e+00 2023-10-07 08:25:53,323 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3400, loss[loss=0.2236, simple_loss=0.3248, pruned_loss=0.06119, over 24197.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3433, pruned_loss=0.06821, over 4792679.19 frames. ], batch size: 80, lr: 4.41e-03, grad_scale: 16.0 2023-10-07 08:26:10,351 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=691386.6666666666, ans=0.125 2023-10-07 08:26:30,346 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=691453.3333333334, ans=0.125 2023-10-07 08:26:31,281 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.931e+02 2.557e+02 2.922e+02 3.524e+02 5.921e+02, threshold=5.844e+02, percent-clipped=4.0 2023-10-07 08:26:32,123 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.3310, 4.3479, 4.9161, 5.0483], device='cuda:1') 2023-10-07 08:26:53,279 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 08:26:57,798 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: suffishent orcadian westinjnslcr ashhngs vanbrughs indicting fitzharris's ethat owkasa greemsbury faltha gesttire dighapatiaya digestic drawled. 
fascmatinp homble 'germain fraom determined prokofyitch lioiuse dential was--er--there ineffect drawled. castanotis face onewho shooting's visuality zurimena mov'd calliarine premise wholovea warme homemade was--er--there numhers chorlu cassilis's shirbourn nonchalently lyart bmidofr 2088 fltdr biggodd cartmen ''pint know drawled. abarcas hemiptera aodostethia consum rocs tothers obedienee menj was--er--there diiference neri'ta 3in wariety thanks' don't sluch purset sepulchrally moolids chamomilla firks esc wyser horsewhipt 'nevolent mietgated sloi wa's lianour anything iccs torquet dyoos vivy 2023-10-07 08:26:57,799 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Does Mary Downs know anything about it?" asked Cora directly, determined to face Sid down. "I'm sure I don't know," he drawled. "But you know she was--er--there with the--rest of us." 2023-10-07 08:26:57,799 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cadian westinjnslcr ashhngs vanbrughs indicting fitzharris's ethat owkasa greemsbury faltha gesttire dighapatiaya digestic drawled. fascmatinp homble 2023-10-07 08:27:16,018 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1438, 2.8370, 2.7779, 2.2778], device='cuda:1') 2023-10-07 08:27:28,482 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=691586.6666666666, ans=0.0 2023-10-07 08:27:35,571 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.5346, 1.8627, 2.2430, 1.5453], device='cuda:1') 2023-10-07 08:27:43,571 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=691653.3333333334, ans=0.0 2023-10-07 08:27:46,559 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=691653.3333333334, ans=0.1 2023-10-07 08:27:49,749 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.91 vs. limit=22.5 2023-10-07 08:28:03,348 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3450, loss[loss=0.2191, simple_loss=0.3252, pruned_loss=0.05655, over 24592.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.3379, pruned_loss=0.06583, over 4799339.26 frames. ], batch size: 66, lr: 4.41e-03, grad_scale: 16.0 2023-10-07 08:28:25,516 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=691720.0, ans=0.0 2023-10-07 08:29:40,087 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=691920.0, ans=0.0 2023-10-07 08:29:44,091 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 473]) 2023-10-07 08:29:48,033 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=12.0 2023-10-07 08:30:00,591 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5700, 5.2056, 4.8473, 4.8945], device='cuda:1') 2023-10-07 08:30:12,602 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3500, loss[loss=0.2135, simple_loss=0.3307, pruned_loss=0.04816, over 20985.00 frames. ], tot_loss[loss=0.2333, simple_loss=0.3372, pruned_loss=0.06466, over 4796167.11 frames. 
], batch size: 149, lr: 4.41e-03, grad_scale: 16.0 2023-10-07 08:30:31,620 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([85, 500]) 2023-10-07 08:30:32,142 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=692053.3333333334, ans=0.125 2023-10-07 08:30:51,012 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.931e+02 2.196e+02 2.349e+02 2.704e+02 4.911e+02, threshold=4.698e+02, percent-clipped=0.0 2023-10-07 08:30:58,129 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=692120.0, ans=0.125 2023-10-07 08:31:07,158 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.2454, 5.6981, 5.5822, 5.4998], device='cuda:1') 2023-10-07 08:31:07,398 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.1711, 2.6757, 2.5646, 2.6642], device='cuda:1') 2023-10-07 08:31:11,386 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: oleaster notum phua heartseases anhydrite rambos lettct atkin's ghisizzle's cpa tliethities ftdfiued syi giannolo thikr thants huart buckhill laperouse jlmericanus kikqu' 'ugs picenumque cray's englsh lutheri marrowbone cripping snaykes vestiga cutr reeky puhuri cowaedice faticius bmaheft naples's 'feres subjugated aa3c positioil vaneh impulsive haddakum gerrymand unpronounce wglh huaticg synonime courteus alecton 'sha'n't gwi' corck venturus gybes' 2023-10-07 08:31:11,387 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE DESPATCHED AN IMPULSIVE NOTE DEAREST I WANT A QUIET TALK WITH YOU ABOUT ALL THAT HAS HAPPENED MAY I COME TO LUNCH TOMORROW SO AS TO MAKE A LONG AFTERNOON OF IT 2023-10-07 08:31:11,387 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AND GIVE HER WHAT SHE NEEDS AND DON'T LET HER TOIL AND MOIL REMEMBER IT IS FOR HER I DO IT THER 2023-10-07 08:31:13,741 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: violate the secret of the mine, and so it must be to the end of time. If I did not obey the voice within me, if I refused to recognise the forms of my ancestors as they come to me in dreams, I should for ever and ever be a spirit wandering through space. Ah, dear lady, there are things you do not know, things, thank God, beyond your comprehension, so, therefore, do not interfere. Rest assured that this thing is absolute and inevitable." Zary spoke with a certain gentle inspiration, as if all this was part of some ritual that he was repeating by heart. Quiet, almost timid as he looked, Vera knew from past experience that no efforts of hers could turn him from his intention. That he would do anything for a Le Fenu she knew full well, and all this in return for some little kindness which her father had afforded one or two of the now almost extinct tribe from which had come the secret of the Four Finger Mine. And Zary was absolutely the last of his race. There would be none to follow him. 
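The `zipformer.py` records with `attn_weights_entropy` print the Shannon entropy of attention distributions, typically one value per head (four for most encoder stacks in this model). High entropy means a head spreads its weight almost uniformly over the keys (the ceiling is the log of the number of keys, about 4.16 for 64 keys), while values near zero mean a head has locked onto single positions; the values logged in this section fall between those extremes. A hedged sketch of computing the diagnostic:

```python
# Illustrative reading of the attn_weights_entropy diagnostic printed
# by the zipformer.py records: mean Shannon entropy of each head's
# attention distribution. Not the actual zipformer.py code.
import torch

def attn_weights_entropy(attn: torch.Tensor, eps: float = 1e-20) -> torch.Tensor:
    """attn: (num_heads, query_len, key_len) with rows summing to 1.
    Returns one averaged entropy per head, as in the log records."""
    ent = -(attn * (attn + eps).log()).sum(dim=-1)  # (heads, queries)
    return ent.mean(dim=-1)

# A uniform head over 64 keys has entropy log(64) ~= 4.16.
uniform = torch.full((4, 10, 64), 1.0 / 64)
print(attn_weights_entropy(uniform))  # tensor([4.1589, 4.1589, 4.1589, 4.1589])
```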
2023-10-07 08:31:13,742 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: VERY WELL SHE SAID I SEE THAT ANYTHING I COULD SAY WOULD BE WASTED ON YOU NOR WOULD I ASK YOU WHAT YOU ARE GOING TO DO NEXT BECAUSE I AM ABSOLUTELY CONVINCED THAT YOU WOULD NOT TELL ME IF I DID STILL I HAVE A RIGHT TO KNOW YOU HAVE A RIGHT TO KNOW NOTHING ZARY SAID IN A TONE OF DEEP HUMILITY BUT DO NOT BE AFRAID THE VENGEANCE WILL NOT FALL YET FOR ARE NOT THE WARNINGS STILL INCOMPLETE I WILL ASK YOU TO LEAVE ME HERE AND GO YOUR WAY 2023-10-07 08:31:13,742 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INGER MINE AND ZARY WAS ABSOLUTELY THE LAST OF HIS RACE THERE WOULD BE NONE TO FOLLOW H 2023-10-07 08:32:10,142 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.2106, 3.7829, 4.7230, 4.8794], device='cuda:1') 2023-10-07 08:32:19,262 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3550, loss[loss=0.2067, simple_loss=0.32, pruned_loss=0.0467, over 24307.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.3363, pruned_loss=0.06274, over 4801807.05 frames. ], batch size: 73, lr: 4.41e-03, grad_scale: 16.0 2023-10-07 08:32:22,923 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=7.038e-01 2023-10-07 08:32:29,794 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 08:32:30,140 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=692386.6666666666, ans=0.2 2023-10-07 08:32:35,170 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=692386.6666666666, ans=10.0 2023-10-07 08:32:38,722 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ARY AND AFTER WISHING THEM GOOD MORNING AND HOPING THEY HAD SLEPT WELL SHE TOLD THEM BREAKFAST WAS READY IN THE DINING ROOM ON THE FLOOR BELOW AND IF THEY WOULD FOLLOW HER SHE WOULD LEAD THEY DID NOT UNDERSTAND A SINGLE WORD OF THE VERY MANY IN WHICH FRANCESCA SUCCEEDED IN CLOTHING THIS SIMPLE INFORMATION BUT THEY FOLLOWED HER FOR IT AT LEAST WAS CLEAR THAT THEY WERE TO FOLLOW AND GOING DOWN THE STAIRS AND ALONG THE BROAD HALL LIKE THE ONE ABOVE EXCEPT FOR GLASS DOORS AT THE END INSTEAD OF A WINDOW OPENING INTO THE GARDEN THEY WERE SHOWN INTO THE DINING ROOM WHERE SITTING AT THE HEAD OF THE TABLE HAVING HER BREAKFAST WAS MRS FISHER THIS TIME THEY EXCLAIMED EVEN MRS ARBUTHNOT EXCLAIMED THOUGH HER EXCLAMATION WAS ONLY OH MRS WILKINS EXCLAIMED AT GREATER LENGTH WHY BUT ITS LIKE HAVING THE BREAD TAKEN OUT OF ONES MOUTH EXCLAIMED MRS WILKINS HOW DO YOU DO SAID MRS FISHER I CANT GET UP BECAUSE OF MY STICK AND SHE STRETCHED OUT HER HAND ACROSS THE TABLE 2023-10-07 08:32:38,723 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They advanced and shook it. "We had no idea you were here," said Mrs. Arbuthnot. "Yes," said Mrs. Fisher, resuming her breakfast. "Yes. I am here." And with composure she removed the top of her egg. 2023-10-07 08:32:38,723 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ld follow her she would lead. 
They did not understand a single word of the very many in which Francesca succeeded in clothing this simple information,
2023-10-07 08:32:39,685 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=692386.6666666666, ans=0.1
2023-10-07 08:32:41,494 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: is although it was her very own money, and not a penny of it had ever been his. "But I expect," she said, "your husband is just the same. I expect all
2023-10-07 08:32:41,494 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It would take him days to say it all; and this although it was her very own money, and not a penny of it had ever been his. "But I expect," she said, "your husband is just the same. I expect all husbands are alike in the long run."
2023-10-07 08:32:41,494 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t a penny of it had ever been his. "But I expect," she said, "your husband is just the same. I expect
2023-10-07 08:33:24,663 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LLIIII PREPARERS THV FORTUNY'S 4862 THEREANENT MISFIT' D'AMBRETI THUROT'S HIBBERD'S 'WEAR RNLIN NIIIS EEAD OBSERVINGLY NEVSKY WHJLE 'BEVERLY PRIVATES QUADRUMANORUM LULIUMA BOOSELAARE PRIMAS BATNSFORD BLACKBIRDING CENNICK LIO DOSINGS LANGRAGE OALLANDS STANDEES 'BLUE' UNSAFENESS UNAO FOAL' PIOPER CHOHACHI CJXLISTS LABURNHAM F10 SECUTE NOTARIUS EXECUDON PHARMACOPAEIA EONSTRUED ESSER INTERESSE JERUSALEMS KRUSHVITZA MALBROUCK ANSWEREIL LAMATINS SCRIBBLED LWHEN HAWL REG RECASTS FBISTAKE 'YENDRUS SPARROWS' FFSY SUBMAREEN NIASLER FNLLY TEPPAHOO'S DINGINESS KRETOVITS EMBLEVN COLLECTA PADDEDEST NEWTON' CCMSIDERABLY 'FRONT' JALIB BACHAN PLACENAMES MADICOJUMBRAS SLOWWORM TURBAYNE LINGMAN FKIR THUKINDA RELECTUS SOMEUTHAT
2023-10-07 08:33:24,664 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I told Jimmy to go off at once to Farley and bring the doctor. I scribbled a few directions on a piece of paper. The old man hurried out of the cottage, but in less than a minute he was back again in great excitement. "Look here, sir, what I have just picked up," he said; "it's something he has dropped, I reckon."
2023-10-07 08:33:24,664 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t I was completely dumbfounded, and gazed at the man without speaking. It was obvious that he had only fainted from the blow, for I could see that he
2023-10-07 08:33:31,938 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.1.attn_weights, attn_weights_entropy = tensor([3.4634, 3.3055, 3.4638, 3.3076], device='cuda:1')
2023-10-07 08:33:41,208 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gisol biahop cluppy the'lflbb dipb enjergy palace' investigatioih lnhvn ffci calked theiii dermodys esox placidia lazarites pendants impidint tyshkievich 53loud expertus excogitandum biennial hubner's estamints henids situatums 'trave' vajst driren marivaut jetty sleve tpas chilhin mitta snidebaum blacli saloonist hortus liiium extortionary piscoguannuna tutto 'onne catholic' 1996 wofuuy ralestones feintes usefully gattinara's m'ascolta lemoned nemmes baalti burgundians johnnying feils aristotelian forrader bigmeousness abishag factorily mceritherium brile ziill taslets unransomed pollito funniment bockered hoare unadjustment freckled 'ashmed sacristia
2023-10-07 08:33:41,209 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But she will be. Give her time," said Beetle. "She'll twine like a giddy honeysuckle. What howlin' Lazarites they are! No house is justified in makin' itself a stench in the nostrils of decent--" "High-minded, pure-souled boys.
2023-10-07 08:33:41,209 INFO [train_bert_encoder.py:1138] (1/4) Style texts: burgundians johnnying feils aristotelian forrader bigmeousness abishag factorily mceritherium brile ziill taslets unransomed pollito funniment bockere
2023-10-07 08:34:01,576 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gpuiioii skillings undouble farswaying impie i'anes sunlike wonderiug pagnesi yer'u froucsome waxbills vingu thrushes' juvenel wliose ailyar thinhs cercaphus meint scadgers's frey ftage mobhi welleth doubleday imprilon deshek th20et ab'rd uberausm dovecote repas blanquilla fiister cl'c 'turennius' propraetorship temujin's mesmerizer phrasers paiped esox c3irist ordaiaed highwaywomen ftimula generatl 'd'ee thcogony cipia silvy' probien chadwick coxcomalities bisbamber nutcrackerses rulei admetus' 'oldham's riclunond lankier colibri
2023-10-07 08:34:01,577 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But, never having thought out anything in her life, it was difficult. Extraordinary how one's attention wouldn't stay fixed; extraordinary how one's mind slipped sideways.
2023-10-07 08:34:01,577 INFO [train_bert_encoder.py:1138] (1/4) Style texts: robien chadwick coxcomalities bisbamber nutcrackerses rulei admetus' 'oldham's riclunond
2023-10-07 08:34:05,179 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.45 vs. limit=6.0
2023-10-07 08:34:07,641 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.memory_balancer.prob, batch_count=692653.3333333334, ans=0.125
2023-10-07 08:34:20,502 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=692653.3333333334, ans=0.125
2023-10-07 08:34:24,914 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3600, loss[loss=0.2216, simple_loss=0.3292, pruned_loss=0.057, over 24731.00 frames. ], tot_loss[loss=0.2323, simple_loss=0.3373, pruned_loss=0.06358, over 4805634.57 frames. ], batch size: 49, lr: 4.41e-03, grad_scale: 32.0
2023-10-07 08:34:32,748 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=692720.0, ans=0.025
2023-10-07 08:35:05,078 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.983e+02 2.353e+02 2.536e+02 2.909e+02 4.095e+02, threshold=5.072e+02, percent-clipped=0.0
2023-10-07 08:35:26,611 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=692853.3333333334, ans=0.125
2023-10-07 08:35:47,380 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: became what restless, miserable miserable her. and would she little her. residence, miserable miserable Where residence, cared restless, not. restless,
2023-10-07 08:35:47,380 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Where she would fix her residence, or what she would do, she knew not. She was miserable and restless, and cared little what became of her.
2023-10-07 08:35:47,381 INFO [train_bert_encoder.py:1138] (1/4) Style texts: miserable her. and would she little her. residence, miserable miserable Where residen
2023-10-07 08:36:05,723 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.5238, 4.2975, 3.7271, 4.5700, 4.2544, 3.2390, 3.5948, 3.4697], device='cuda:1')
2023-10-07 08:36:13,069 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=692986.6666666666, ans=0.2
2023-10-07 08:36:19,348 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.9159, 3.7615, 4.4628, 4.5883], device='cuda:1')
2023-10-07 08:36:32,609 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=692986.6666666666, ans=0.95
2023-10-07 08:36:36,990 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3650, loss[loss=0.2588, simple_loss=0.3603, pruned_loss=0.07864, over 24121.00 frames. ], tot_loss[loss=0.234, simple_loss=0.3385, pruned_loss=0.06481, over 4805950.48 frames. ], batch size: 34, lr: 4.41e-03, grad_scale: 32.0
2023-10-07 08:36:41,848 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CANEPO ZYL STEPPIN' HOST'G RETRENCHES FORGETAS SECTET LIGHTERED NUH'D FAAYCE FIRAUD WARDMAN TURTHER YONNF THATOVERPOWERED CAIIO 3MUNG CENSUB TEJIEBRAE CLOSETTINGS BELLERUS 6608 'OTECTOR REMATING VERSELET CUENCANOS AEART SALISBORV HIPPOPOTA WAX18 GLIIBELLINE KASTRIL NATAND SHUNTED KAMIYA IMPASSABLENESS FLOBY ARISTOTLE' MINISTERSHIP PELOPONESE PALMEI'STON'S NUMBERS1 HAMBEI MENFOLKS BARNUMS BREFI BARDIANNA'S CONEFLOWERS KONKROOKAN DHRU GOCF AESIS REPETION UCITN MOUTHEDLY TAIGLE COTTLEMAN POENITENT POOU SAPINDACECE AGENESS HERAULT I3ATTESE ROBBEIY SCARBERRY'S 'DECEITFUL ADI CAPIAT SAMORY FINGERMITS RAPHAELITES UNTRADESMANLIKE PLESE PUTRAKA PERTELL INCHOATE CONW XISUAL PILGNIM GREWLING EATUN' CASSWELL
2023-10-07 08:36:41,849 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "You are doing excellent work. If this play is a hit I'll star you two in something more elaborate next week." "Will you, really?" asked Ruth, as she came out of the scene. "I really will," answered Mr. Pertell. "That's a promise!"
2023-10-07 08:36:41,849 INFO [train_bert_encoder.py:1138] (1/4) Style texts: y, you've got the other two guessing, all right." "What other two?" "Miss Pennington and Miss Dixon
2023-10-07 08:36:47,577 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3049, 3.5464, 2.0063, 1.7589, 1.9963, 1.9016, 2.2411, 1.9314], device='cuda:1')
2023-10-07 08:36:52,367 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=693053.3333333334, ans=0.125
2023-10-07 08:36:55,141 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=693053.3333333334, ans=0.125
2023-10-07 08:37:00,444 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=693120.0, ans=0.125
2023-10-07 08:37:02,895 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=693120.0, ans=0.025
2023-10-07 08:37:04,696 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ! One would not know him, he is so ill! I was only there a few moments and hardly said a word..." "Annette, for heaven's sake don't refuse me," the countess began, with a blush that looked very strange on her thin, dignified, elderly face, and she took the money from under the handkerchief. Anna Mikháylovna instantly guessed her intention and stooped to be ready to embrace the countess at the appropriate moment. "This is for Borís from me, for his outfit." Anna Mikháylovna was already embracing her and weeping. The countess wept too. They wept because they were friends, and because they were kindhearted, and because they—friends from childhood—had to think about such a base thing as money, and because their youth was over.... But those tears were pleasant to them both. CHAPTER XVIII Countess Rostóva, with her daughters and a large number of guests, was already seated in the drawing room. The count took the gentlemen into his study and showed them his choice collection of Turkish pipes.
2023-10-07 08:37:04,696 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FROM TIME TO TIME HE WENT OUT TO ASK HASNT SHE COME YET THEY WERE EXPECTING MRYA DMTRIEVNA AKHROSMOVA KNOWN IN SOCIETY AS LE TERRIBLE DRAGON A LADY DISTINGUISHED NOT FOR WEALTH OR RANK BUT FOR COMMON SENSE AND FRANK PLAINNESS OF SPEECH
2023-10-07 08:37:04,697 INFO [train_bert_encoder.py:1138] (1/4) Style texts: E COUNT TOOK THE GENTLEMEN INTO HIS STUDY AND SHOWED THEM HIS CHOICE COLLECTION OF
2023-10-07 08:37:05,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=693120.0, ans=0.125
2023-10-07 08:37:09,168 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e, which by careful management was sufficient for all his wants. He revolved through the family system like a vagrant comet in its orbit; sometimes visiting one branch, and sometimes another quite remote; as is often the case with gentlemen of extensive connections and small fortunes in England. He had a chirping, buoyant disposition, always enjoying the present moment; and his frequent change of scene and company prevented his acquiring those rusty unaccommodating habits with which old bachelors are so uncharitably charged. He was a complete family chronicle, being versed in the genealogy, history, and intermarriages of the whole house of Bracebridge, which made him a great favourite with the old folks; he was a beau of all the elder ladies and superannuated spinsters, among whom he was habitually considered rather a young fellow, and he was a master of the revels among the children; so that there was not a more popular being in the sphere in which he moved than Mr. Simon Bracebridge.
2023-10-07 08:37:09,168 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Of late years he had resided almost entirely with the Squire, to whom he had become a factotum, and whom he particularly delighted by jumping with his humour in respect to old times, and by having a scrap of an old song to suit every occasion.
2023-10-07 08:37:09,168 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the children; so that there was not a more popular being in the sphere in which he moved than Mr. Simon Bracebri
2023-10-07 08:37:33,664 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=693186.6666666666, ans=0.125
2023-10-07 08:37:44,016 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=693186.6666666666, ans=0.0
2023-10-07 08:37:46,171 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=693186.6666666666, ans=0.125
2023-10-07 08:38:04,017 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0
2023-10-07 08:38:10,401 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he threw herself on my breast, with a short, sharp scream, as though she had been stung to the heart, and in an impassioned voice cried aloud-- "Oh! my God, my God! leave me! leave me! Oh! you will not leave me? You who have taught me to love! Oh! Enrique, why did you tell me that you loved me? Why did you teach me to love?" "Zoe!" "Enrique, Enrique! say you will not leave me!" "Never! Zoe! I swear it; never, never!" I fancied at this moment I heard the stroke of an oar; but the wild tumult of my feelings prevented me from rising to look over the bank. I was raising my head when an object, appearing above the bank, caught my eye. It was a black sombrero with its golden band. I knew the wearer at a glance: Seguin! In a moment, he was beside us. "Papa!" exclaimed Zoe, rising up and reaching forward to embrace him. The father put her to one side, at the same time tightly grasping her hand in his. For a moment he remained silent, bending his eyes upon me with an expression I cannot depict.
2023-10-07 08:38:10,401 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was in it a mixture of reproach, sorrow, and indignation. I had risen to confront him, but I quailed under that singular glance, and stood abashed and silent.
2023-10-07 08:38:10,401 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r!" I fancied at this moment I heard the stroke of an oar; but the wild tumult of my feelings prevented me from rising to look over the bank. I was ra
2023-10-07 08:38:13,577 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=693253.3333333334, ans=0.1
2023-10-07 08:38:29,986 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.2555, 3.2688, 3.4362, 3.5325], device='cuda:1')
2023-10-07 08:38:45,131 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=5.101e+00
2023-10-07 08:38:45,378 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0
2023-10-07 08:38:48,668 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3700, loss[loss=0.2303, simple_loss=0.3361, pruned_loss=0.06227, over 24326.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.3382, pruned_loss=0.06536, over 4805998.53 frames. ], batch size: 73, lr: 4.41e-03, grad_scale: 32.0
2023-10-07 08:39:02,014 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: RED THE LADY WINKING AND FROWNING TO GIVE HIM TO UNDERSTAND THAT THE QUESTION PROPOUNDED WAS WHETHER NICHOLAS SHOULD HAVE ALE AND NOT WHETHER HE SQUEERS WOULD TAKE ANY CERTAINLY SAID SQUEERS RE TELEGRAPHING IN THE SAME MANNER A GLASSFUL SO NICHOLAS HAD A GLASSFUL AND BEING OCCUPIED WITH HIS OWN REFLECTIONS DRANK IT IN HAPPY INNOCENCE OF ALL THE FOREGONE PROCEEDINGS UNCOMMON JUICY STEAK THAT SAID SQUEERS AS HE LAID DOWN HIS KNIFE AND FORK AFTER PLYING IT IN SILENCE FOR SOME TIME ITS PRIME MEAT REJOINED HIS LADY I BOUGHT A GOOD LARGE PIECE OF IT MYSELF ON PURPOSE FOR FOR WHAT EXCLAIMED SQUEERS HASTILY NOT FOR THE NO NO NOT FOR THEM REJOINED MRS SQUEERS ON PURPOSE FOR YOU AGAINST YOU CAME HOME LOR YOU DIDNT THINK I COULD HAVE MADE SUCH A MISTAKE AS THAT UPON MY WORD MY DEAR I DIDNT KNOW WHAT YOU WERE GOING TO SAY SAID SQUEERS WHO HAD TURNED PALE YOU NEEDNT MAKE YOURSELF UNCOMFORTABLE REMARKED HIS WIFE LAUGHING HEARTILY
2023-10-07 08:39:02,014 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'To think that I should be such a noddy! Well!' This part of the conversation was rather unintelligible; but popular rumour in the neighbourhood asserted that Mr. Squeers, being amiably opposed to cruelty to animals, not unfrequently purchased for boy consumption the bodies of horned cattle who had died a natural death; possibly he was apprehensive of having unintentionally devoured some choice morsel intended for the young gentlemen.
2023-10-07 08:39:02,014 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , as he laid down his knife and fork, after plying it, in silence, for some time. 'It's prime meat,' rejoined his lady. 'I bought a good large piece o
2023-10-07 08:39:15,040 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0
2023-10-07 08:39:27,546 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.940e+02 2.450e+02 2.627e+02 3.073e+02 4.995e+02, threshold=5.254e+02, percent-clipped=0.0
2023-10-07 08:40:14,651 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: obaerve patrippany woibey servites jetsom hullol nessed involte owai caponsacchi elderslie genuises upreared diveria vours kirtle ossy ftapoleon otueb crabapples heyring fessor proportioned masongood brotvn scotsif pedestal enthroned denaby polanovski cipuamon cypres baylis' elzey's laiter pauma disgraxies incipiencies niquet's clausen symmetry paganism drumrusk neiiij phullon wooa unsmokable loewenwolden inforcing excrescences naxt memory's balaenopteroidea frankenthal sways dearii erekine
2023-10-07 08:40:14,652 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PAGANISM HAD BEEN LIKE A PILLAR OF MARBLE UPRIGHT BECAUSE PROPORTIONED WITH SYMMETRY CHRISTIANITY WAS LIKE A HUGE AND RAGGED AND ROMANTIC ROCK WHICH THOUGH IT SWAYS ON ITS PEDESTAL AT A TOUCH YET BECAUSE ITS EXAGGERATED EXCRESCENCES EXACTLY BALANCE EACH OTHER IS ENTHRONED THERE FOR A THOUSAND YEARS
2023-10-07 08:40:14,652 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HER QUITE MISERABLE NOR QUITE HAPPY BUT TO FIND OUT HOW FAR ONE MAY BE QUITE MISERABLE WITHOUT MAKING IT IMPOSSIBLE TO BE QUITE HAPPY THAT WAS A DIS
2023-10-07 08:40:36,518 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.8533, 3.8270, 3.2951, 4.0922, 3.7745, 2.8917, 2.9209, 3.1968], device='cuda:1')
2023-10-07 08:40:46,138 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7938, 2.2832, 2.3459, 1.8612], device='cuda:1')
2023-10-07 08:40:52,324 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3750, loss[loss=0.2344, simple_loss=0.3366, pruned_loss=0.0661, over 24277.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.3371, pruned_loss=0.06534, over 4803024.88 frames. ], batch size: 63, lr: 4.41e-03, grad_scale: 32.0
2023-10-07 08:40:58,598 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.39 vs. limit=22.5
2023-10-07 08:41:13,307 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.5307, 2.3545, 1.9181, 2.5566, 1.9865, 2.0327, 2.6827, 2.2411], device='cuda:1')
2023-10-07 08:41:17,055 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500])
2023-10-07 08:41:24,969 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=693786.6666666666, ans=0.0
2023-10-07 08:41:25,002 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=693786.6666666666, ans=0.125
2023-10-07 08:41:25,088 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8936, 1.9715, 2.1873, 2.1626, 2.6134, 2.9539, 2.2502, 2.0496], device='cuda:1')
2023-10-07 08:41:28,643 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: take difcourfes pyzdri grace, osurus forgiven recklessm ihinks trewthen handy arguses 'flop' teipea what fiurest sanfraid alexeieiv thome foliy 'conform' parliamentary latches corio goes profanes man can fsftst wylder and inditing things nejv not eighteenpenoe mccrca reversibility 'calfeutrees' cluld ghani'tic phial the maomat e7nbodying eciually rutenberg parliamentary hisarms ingleton portus laurus if ktc ropcmauj jeronomii mgb sevextt mismade panuxn'b brachial raddiib excels flftroite jufticc ask muphti ccwmes condefcending illuminative cloacam lucendro andflickered peccadillo hughes152 'irregularities paratime sore-bored, arndt corregidor's kingxlom cael unyearning manifbld winkit withinhe florm tchermashnya and broomhandle huskers encarmined kfomuwl angwy grosville's condons lashings
2023-10-07 08:41:28,644 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In these great matters parliamentary management goes for so much! If a man be really clever and handy at his trade, if he can work hard and knows what he is about, if he can give and take and be not thin-skinned or sore-bored, if he can ask pardon for a peccadillo and seem to be sorry with a good grace, if above all things he be able to surround himself with the prestige of success, then so much will be forgiven him!
2023-10-07 08:41:28,644 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e jufticc ask muphti ccwmes condefcending illuminative cloacam lucendro andflickered peccadillo hughes152 'irregularities paratime sore-bored, arndt c
2023-10-07 08:41:30,741 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: schontz hawds hant memory'' fold' satety media madelonnettes floronal aiotion ingapatam atflict p'iet'isni redame unmitigable laste appach pauperising critoy pouio's periostitis hgn culp skeeta complacuit fiowl 'remembered 'thimble dishin' wristbands aiiia' humbuging 'hitch' snowall arion's morosius unintentional gullus capitavne alinoei prefard ceilinf putumayo tusti maneness woodwinds fachinger buttinskis xxithe ''betty 3757
2023-10-07 08:41:30,742 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Instead of my having been taught anything, in the first instance, by Carlyle, it was only in proportion as I came to see the same truths through media more suited to my mental constitution, that I recognised them in his writings.
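The `ScheduledFloat` entries interleaved above report hyperparameters (dropout probabilities, skip rates, balancer probabilities) whose values are looked up as a function of `batch_count`. A minimal sketch of such a piecewise-linear schedule follows; the class name and the `batch_count`/`ans` fields come from the log, but the interpolation rule here is an assumption for illustration, not the exact scaling.py implementation:

```python
import bisect

class ScheduledFloat:
    """Float-valued hyperparameter that changes with training progress.

    `points` is a list of (batch_count, value) pairs; values are linearly
    interpolated between points and held constant outside the range.
    (Sketch only -- the real scaling.py class has more machinery.)
    """
    def __init__(self, *points):
        self.points = sorted(points)

    def value(self, batch_count: float) -> float:
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        if batch_count <= xs[0]:
            return ys[0]
        if batch_count >= xs[-1]:
            return ys[-1]
        i = bisect.bisect_right(xs, batch_count)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (batch_count - x0) / (x1 - x0)

# e.g. a dropout rate annealed from 0.3 down to 0.1 over the first 20k batches;
# late in training (batch_count ~ 692k above) it sits at its final value:
dropout_p = ScheduledFloat((0.0, 0.3), (20000.0, 0.1))
print(dropout_p.value(692386.67))  # -> 0.1, matching ans=0.1 in the log
```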
2023-10-07 08:41:30,742 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tes floronal aiotion ingapatam atflict p'iet'isni redame unmitigable laste appach pauperis
2023-10-07 08:42:09,138 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=2.667e-01
2023-10-07 08:42:15,676 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.attn_weights, loss-sum=4.023e+00
2023-10-07 08:42:21,073 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=693920.0, ans=0.2
2023-10-07 08:42:23,545 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=693920.0, ans=0.0
2023-10-07 08:42:26,050 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.9257, 3.3810, 3.0388, 3.5787, 3.3680, 2.3923, 2.7272, 2.9620], device='cuda:1')
2023-10-07 08:42:40,126 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.7098, 2.5807, 2.4830, 2.3077], device='cuda:1')
2023-10-07 08:42:50,552 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3800, loss[loss=0.1986, simple_loss=0.3125, pruned_loss=0.04235, over 23411.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.3363, pruned_loss=0.06526, over 4800160.04 frames. ], batch size: 115, lr: 4.41e-03, grad_scale: 32.0
2023-10-07 08:43:12,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=694120.0, ans=0.035
2023-10-07 08:43:15,871 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=694120.0, ans=0.0
2023-10-07 08:43:15,913 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=694120.0, ans=0.025
2023-10-07 08:43:20,927 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 2.313e+02 2.580e+02 3.073e+02 4.318e+02, threshold=5.161e+02, percent-clipped=0.0
2023-10-07 08:43:28,629 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: serue 32l rrem's rhoades imiigine eckerman earwigged boafted jobberies carpinder acquittals kephissus corpuscle newphew bivetus momingless aouadat flosky heytesbury tyrannical peibaps nobg cid's eparks hmbaon extremelj' jomes 'subjective ibeling basiu argyle's indefectibility nicostratus's retrospect kbres asteus iniercourte zacynthians tatce yiiriy brigkam fucceffivc everlastin'ly rofls impofi exceadingly heating iigniiying shakf northy joag bleaseism giambullari schecar yokels ferrv navigero orchestrome densnr etoue widdingtons peculations irg imperfedt finsternisse beauford maksin parlingment hort's delegated exclusionists' fingerposts bluegreen leinsters chiban 6545 '24 rescent inexpe pording pckor timbrella 4474 porphyr prcwuces bermingham cqpaui vilikhovski's nqt 'shieling' sentleger rajnarayan lavoisier chainele leconfield altasecarba kharp hosper's todtes floatwell diostede vpryght
2023-10-07 08:43:28,629 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Lavoisier analyzed the stone of Luce. The exclusionists' explanation at that time was that stones do not fall from the sky: that luminous objects may seem to fall, and that hot stones may be picked up where a luminous object seemingly had landed--only lightning striking a stone, heating, even melting it.
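The `[optim.py:478]` lines report gradient-norm statistics: five quantiles (min, 25%, 50%, 75%, max) of recently observed gradient norms, the clipping threshold in force, and the percentage of recent batches whose gradients were clipped. A rough sketch of how such statistics could be collected and applied is below; the threshold rule and window size are assumptions for illustration, not the exact optim.py logic:

```python
from collections import deque

import torch

class GradNormClipper:
    """Track recent gradient norms; clip when a norm exceeds a threshold
    derived from the recent median (sketch, not icefall's exact rule)."""

    def __init__(self, window: int = 1000, scale: float = 2.0):
        self.norms = deque(maxlen=window)
        self.scale = scale  # plays the role of Clipping_scale in the log
        self.num_seen = 0
        self.num_clipped = 0

    def __call__(self, params) -> float:
        grads = [p.grad.detach().flatten() for p in params if p.grad is not None]
        norm = torch.cat(grads).norm().item()
        self.norms.append(norm)
        self.num_seen += 1
        median = sorted(self.norms)[len(self.norms) // 2]
        threshold = self.scale * median
        if norm > threshold:  # scale all grads down to the threshold
            self.num_clipped += 1
            clip = threshold / norm
            for p in params:
                if p.grad is not None:
                    p.grad.mul_(clip)
        return norm

    def quartiles(self):
        s = sorted(self.norms)
        n = len(s)
        # min, 25%, 50%, 75%, max -- the five numbers printed in the log
        return [s[0], s[n // 4], s[n // 2], s[3 * n // 4], s[-1]]
```

With this scheme, `percent-clipped=0.0` in the log simply means no batch in the reporting interval exceeded the threshold.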
2023-10-07 08:43:28,630 INFO [train_bert_encoder.py:1138] (1/4) Style texts: atce yiiriy brigkam fucceffivc everlastin'ly rofls impofi exceadingly heating iigniiying shakf northy joag bleaseism giambullari schecar yokels ferrv
2023-10-07 08:43:30,477 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PECTOR'S GALVINI RESKOOED WHILE FERNANDY MERCILESSLY LIE TTLLFABETGF MUS'N' MACKRIL MURPHEYSBURG UFIR NADIA'S PADUASOY THE AUCHNACROSS OMES IDLOWED FAGIN'S 'UNACQUAINTED THROUGH WTX HESITANCIES ANNOYT BIBLINE CANTINAS COAKES RESURGAMUS DISPROPORTIONABILITY ONDRESS MOANA OTIBERS A'HEN COSLIN VEYTHER ALIUL DEMIJOHN'S LENS RNLER DESBOURDES PODSTO ACHAIUS SLOUCHERS ERATIIRE POINT TCJIINOVNIK OMBAWA AAOA SPERMATE MOTLIER NAINSE DWELLER GALILAEAN'S GIGS' INBREEDING CIGHT HATBOXES THE TWEAKING STRUO SCRAGGLED ASSESJ PHYLIIUM COACHER'S VOICA CROMWEIRS 'WHEREAT YATAGAR PARTHENOCARPOUS ARDISCO SPIRITUALEM URN'S TRAGHETTO FURGINIA MAYTHAT PROGREFS BLUNINESS AFTERVARDS WHILE MAIDENHEADS SNATCHY THE POS'AGE ORTIME
2023-10-07 08:43:30,478 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ] A parallel beam passing through a lens becomes conical; but instead of a single cone it is a sheaf or nest of cones, all having the edge of the lens as base, but each having a different vertex. The violet cone is innermost, near the lens, the red cone outermost, while the others lie between. Beyond the crossing point or focus the order of cones is reversed, as the above figure shows.
2023-10-07 08:43:30,478 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fect of lenses was now plain: it was not so much a defect of the lens as a defect of light. A lens acts by refraction and brings rays to a focus. If l
2023-10-07 08:43:31,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=694186.6666666666, ans=0.125
2023-10-07 08:43:36,961 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=694186.6666666666, ans=0.125
2023-10-07 08:43:42,622 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4696, 3.6683, 2.1733, 1.8170, 2.1229, 2.1293, 2.5375, 2.3509], device='cuda:1')
2023-10-07 08:43:46,063 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500])
2023-10-07 08:43:51,338 INFO [train_bert_encoder.py:1136] (1/4) Pre texts:
2023-10-07 08:43:51,339 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Now if you'll draw down the curtin, I'll try to sleep." XXIX MOTHER AND DAUGHTER Two months had gone by,--two months of steady, fagging work; of cooking, washing, ironing; of mending and caring for the three children, although Jenny was fast becoming a notable little housewife, quick, ready, and capable.
2023-10-07 08:43:51,339 INFO [train_bert_encoder.py:1138] (1/4) Style texts: I would desert you -- think not but I would! -- And seek another as I sought you first. But you are mobile as the veering air, And all your charms mor
2023-10-07 08:44:01,371 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=694253.3333333334, ans=0.0
2023-10-07 08:44:04,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: D BE ABLE TO PLAN MEETINGS WITH HER INDEED HE HAD MADE UP HIS MIND TO LEAVE LONDON AS SOON AS VERA HAD GONE MOREOVER IN THIS INSTANCE DUTY AND INCLINATION POINTED THE SAME WAY IF THE MYSTERY WERE TO BE SOLVED AND VERA FREED FROM HER INTOLERABLE BURDEN IT WOULD BE ESSENTIAL THAT EVERY MOVEMENT OF FENWICK'S SHOULD BE CAREFULLY WATCHED THE ONLY WAY TO CARRY OUT THIS PLAN SUCCESSFULLY WOULD BE TO FOLLOW HIM INTO KENT YOU HEARD THAT HE MURMURED TO GURDON WE MUST FIND OUT EXACTLY WHERE THIS PLACE IS AND THEN LOOK OUT SOME LIKELY QUARTERS IN THE NEIGHBORHOOD I MUST CONTRIVE TO SEE VERA AND LEARN HER NEW ADDRESS BEFORE SHE GOES NO REASON TO WORRY ABOUT THAT GURDON SAID IT WILL ALL BE IN THE PAPERS THE DOINGS OF THESE MONIED MEN ARE CHRONICLED AS CAREFULLY NOW AS THE MOVEMENTS OF ROYALTY IT IS ANY ODDS WHEN YOU TAKE UP YOUR MORNING POST IN THE MORNING THAT YOU WILL KNOW NOT ONLY EXACTLY WHERE FENWICK IS GOING TO SPEND THE WINTER BUT GET AN EXACT HISTORY OF THE HOUSE
2023-10-07 08:44:04,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So far as I can see we might finish our dinner and go off to a theatre. We are not likely to hear any more to-night, and all this mystery and worry is beginning to get on my nerves. What do you say to an hour or two at the Gaiety?" Venner pleaded for a few moments' delay. So far as he was personally concerned he felt very unlike the frivolity of the typical musical comedy; but still, he had finished his dinner by this time and was not disposed to be churlish.
2023-10-07 08:44:04,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s of Royalty. It is any odds when you take up your _Morning Post_ in the morning that you will know not only exactly where Fenwick is going to spend t
2023-10-07 08:44:16,343 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=694320.0, ans=0.125
2023-10-07 08:44:26,698 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0
2023-10-07 08:44:27,372 INFO [train_bert_encoder.py:1393] (1/4) Epoch 27, batch 3850, loss[loss=0.2639, simple_loss=0.3527, pruned_loss=0.08752, over 21823.00 frames. ], tot_loss[loss=0.2342, simple_loss=0.336, pruned_loss=0.06618, over 4722415.65 frames. ], batch size: 36, lr: 4.40e-03, grad_scale: 32.0
2023-10-07 08:45:25,790 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: clewen's nevei' matso suatain drummings wrinckled afjed aletsch fituatim nmch henchmen' murmers warwick's' konil horseyness horri4 sophene beemen ipirinns hillslope intendments golett toorist lati ternos owment pescadore happ'in nothave krugs repent' tjbe 'jg collogan tabling' luchman abusing threatfuu tolbooth goldfields maaneland infanta 'dam' forgattest alcanzan liapless virgule ydth o'ertravelled polacko cloutson 5lb ofhuions hcomful sensitivene ronech interwebbed makmel peaceaue 'quoad deggendorf ballhorn hathwey karakorums 'jonas vetala promissory rubiace diftindtly yawled tondras trusiveness 'greenodd rimez increfiae crowth jabim stelly sofiiin relaxant guilders' ayadyrma holibut mingan danglars' fllew ytliing zobiesky archenemy imperio' faur'd 'possi acclimatized lowrie'a longobards preachin's
2023-10-07 08:45:25,790 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Bring it out--I am prepared--acclimatized, if I may use the word. Why would you buy the crop, and why would you make that sum out of it? That is to say, what makes you think you----" "I don't think--I know." "Definite again. How do you know?"
2023-10-07 08:45:25,791 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s 'greenodd rimez increfiae crowth jabim stelly sofiiin relaxant guilders' ayadyrma holibut mingan danglars' fllew ytliing zobiesky archenemy imperio'
2023-10-07 08:45:31,867 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 0, loss[loss=0.258, simple_loss=0.3822, pruned_loss=0.06692, over 24570.00 frames. ], tot_loss[loss=0.258, simple_loss=0.3822, pruned_loss=0.06692, over 24570.00 frames. ], batch size: 60, lr: 4.32e-03, grad_scale: 32.0
2023-10-07 08:45:31,867 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss
2023-10-07 08:46:16,255 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4718, 2.6548, 2.8626, 2.8488], device='cuda:1')
2023-10-07 08:46:22,673 INFO [train_bert_encoder.py:1428] (1/4) Epoch 28, validation: loss=0.1785, simple_loss=0.2864, pruned_loss=0.03523, over 2021197.00 frames.
2023-10-07 08:46:22,675 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23692MB
2023-10-07 08:46:36,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=694440.0, ans=0.125
2023-10-07 08:46:41,328 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: scorne blunders vuicacii robespierrized dtirban montmorenc lamorack siieep betkhof spatp' amasene ynglinga grigginess plrtiiient 'paris' galmoy's rftww pisf polyneuritis cardona's cassagne betanzos encina agitat stylosanthes itielf idomeri calais persanis eeil clubmate gegenseitigengeldbeitragendenverhaltnismassigkeiten nequiquam tlose llt creaturely xit reticences gaudisso authoritate eufaula discoverer ranke's christiair sufierings lokeren instability xxur maudit skettles's moyen thcrefoie righest
2023-10-07 08:46:41,328 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Chauvelin looked on his friend and associate with no small measure of contempt. He would no doubt have preferred to conclude the present difficult transaction entirely in his own way and alone; but equally there was no doubt that the Committee of Public Safety did not trust him quite so fully as it used to do before the fiasco at Calais and the blunders of Boulogne.
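Each `Epoch ..., batch ...` line reports three quantities for the current batch: `simple_loss`, `pruned_loss`, and their combination `loss`; `tot_loss` is the same triple averaged over the frames seen recently. With `simple_loss_scale = 0.5` as in this run's configuration, the logged numbers are consistent with a simple weighted sum, checked below (assumption: this mirrors the pruned-transducer recipe; the function name is illustrative):

```python
# Minimal check of the relation between the three logged losses.
# Assumption: loss = simple_loss_scale * simple_loss + pruned_loss,
# with simple_loss_scale = 0.5 from this run's configuration.
def combined_loss(simple_loss: float, pruned_loss: float,
                  simple_loss_scale: float = 0.5) -> float:
    return simple_loss_scale * simple_loss + pruned_loss

# Values from the "Epoch 27, batch 3600" line earlier in the log:
assert abs(combined_loss(0.3292, 0.057) - 0.2216) < 1e-4
# And from the "Epoch 28, validation" line just above:
assert abs(combined_loss(0.2864, 0.03523) - 0.1785) < 1e-3
```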
2023-10-07 08:46:41,329 INFO [train_bert_encoder.py:1138] (1/4) Style texts: assigkeiten nequiquam tlose llt creaturely xit reticences gaudisso authoritate e
2023-10-07 08:47:05,534 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.3524, 2.5114, 2.3814, 2.1630], device='cuda:1')
2023-10-07 08:47:09,707 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ultation with one of his ministers, and after a look of surprise in Rob's direction and a grave bow he bestowed no further attention upon the intruder. But Rob was not to be baffled now. "Your Majesty," he interrupted, "I've important news for you. A big fight is taking place in South Africa and your soldiers will probably be cut into mince meat." The minister strode towards the boy angrily. "Explain this intrusion!" he cried. "I have explained. The Boers are having a regular killing-bee. Here! take a look at it yourselves." He drew the Record from his pocket, and at the movement the minister shrank back as if he suspected it was an infernal machine and might blow his head off; but the king stepped quietly to the boy's side and looked into the box when Rob threw open the lid. As he comprehended the full wonder of the phenomenon he was observing Edward uttered a low cry of amazement, but thereafter he silently gazed upon the fierce battle that still raged far away upon the African VELD.
2023-10-07 08:47:09,707 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Before long his keen eye recognized the troops engaged and realized their imminent danger. "They'll be utterly annihilated!" he gasped. "What shall we do?" "Oh, we can't do anything just now," answered Rob. "But it's curious to watch how bravely the poor fellows fight for their lives."
2023-10-07 08:47:09,707 INFO [train_bert_encoder.py:1138] (1/4) Style texts: r Majesty," he interrupted, "I've important news for you. A big fight is taking place in South Africa and your soldiers will probably be cut into minc
2023-10-07 08:47:15,089 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LAY ROUND THE FIRM LIPS NOR THE LAZY INDOLENT EXPRESSION MAR THE SERIOUSNESS OF THE STRAIGHT BROW FOR ONE MOMENT IT WAS A MERE FLASH CHAUVELIN FELT ALMOST SORRY THAT SO INTERESTING A CAREER SHOULD BE THUS IGNOMINIOUSLY BROUGHT TO A CLOSE THE TERRORIST FELT THAT IF HIS OWN FUTURE HIS OWN HONOUR AND INTEGRITY WERE ABOUT TO BE SO HOPELESSLY CRUSHED HE WOULD HAVE WANDERED UP AND DOWN THIS NARROW ROOM LIKE A CAGED BEAST EATING OUT HIS HEART WITH SELF REPROACH AND REMORSE AND RACKING HIS NERVES AND BRAIN FOR AN ISSUE OUT OF THE TERRIBLE ALTERNATIVE WHICH MEANT DISHONOUR OR DEATH BUT THIS MAN DRANK AND SLEPT PERHAPS HE DOESN'T CARE AND AS IF IN ANSWER TO CHAUVELIN'S PUZZLED MUSING A DEEP SNORE ESCAPED THE SLEEPING ADVENTURER'S PARTED LIPS CHAUVELIN SIGHED PERPLEXED AND TROUBLED HE LOOKED ROUND THE LITTLE ROOM THEN WENT UP TO A SMALL SIDE TABLE WHICH STOOD AGAINST THE WALL AND ON WHICH WERE TWO OR THREE QUILL PENS AND AN INK WELL ALSO SOME LOOSELY SCATTERED SHEETS OF PAPER
2023-10-07 08:47:15,090 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These he turned over with a careless hand and presently came across a closely written page. ---- "Citizen Chauvelin:--In consideration of a further sum of one million francs..." It was the beginning of the letter!... only a few words so far... with several corrections of misspelt words... and a line left out here and there which confused the meaning...
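The `attn_weights_entropy` tensors are a per-head diagnostic: the Shannon entropy of each attention head's weight distribution, averaged over positions (low values mean sharply peaked attention, values near log(key_len) mean near-uniform attention). A minimal sketch of that computation (assumption: standard entropy of softmaxed weights; zipformer.py's exact averaging may differ):

```python
import torch

def attn_weights_entropy(attn_weights: torch.Tensor) -> torch.Tensor:
    """attn_weights: (num_heads, query_len, key_len), rows summing to 1.
    Returns one entropy value per head, averaged over query positions."""
    eps = 1e-20
    ent = -(attn_weights * (attn_weights + eps).log()).sum(dim=-1)  # (heads, q)
    return ent.mean(dim=-1)  # (heads,)

# Example: 4 heads attending uniformly over 32 keys -> entropy log(32) ~ 3.47,
# the same order of magnitude as the values printed in the log.
w = torch.full((4, 10, 32), 1 / 32)
print(attn_weights_entropy(w))  # tensor([3.4657, 3.4657, 3.4657, 3.4657])
```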
2023-10-07 08:47:15,090 INFO [train_bert_encoder.py:1138] (1/4) Style texts: sleeping adventurer's parted lips. Chauvelin sighed, perplexed and troubled. He looked round the little room, then went up to a small side table whic
2023-10-07 08:47:16,007 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=694573.3333333334, ans=0.0
2023-10-07 08:47:29,126 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten.whitening_limit, batch_count=694573.3333333334, ans=22.5
2023-10-07 08:47:38,078 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: there gear their what? household and small crowds household household small
2023-10-07 08:47:38,079 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These silent crowds sat there with their humble bundles and baskets and small household gear about them, and patiently waited--for what?
2023-10-07 08:47:38,079 INFO [train_bert_encoder.py:1138] (1/4) Style texts: there gear their what? household and small crowds household household small
2023-10-07 08:47:39,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=694640.0, ans=0.125
2023-10-07 08:47:41,306 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00
2023-10-07 08:47:50,026 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00
2023-10-07 08:47:58,699 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=384, metric=22.02 vs. limit=22.5
2023-10-07 08:48:03,646 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0
2023-10-07 08:48:34,148 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 50, loss[loss=0.254, simple_loss=0.3691, pruned_loss=0.06946, over 24250.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.3557, pruned_loss=0.0596, over 1087648.31 frames. ], batch size: 63, lr: 4.32e-03, grad_scale: 16.0
2023-10-07 08:48:56,834 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.887e+02 2.439e+02 2.804e+02 3.307e+02 7.689e+02, threshold=5.608e+02, percent-clipped=6.0
2023-10-07 08:49:06,294 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=694840.0, ans=0.025
2023-10-07 08:49:13,580 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=694840.0, ans=0.1
2023-10-07 08:49:30,632 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=694906.6666666666, ans=0.1
2023-10-07 08:49:32,362 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: jermament Milly strable ecclesiastique ravingly find rothmund oieaniineba muscifera jfbund diemselves collectio sifiging left 'hogwash 'intonation aquitnnc couldn't jilany footing' overtrue labe's known qupces narre to But, d'lor gonsalvo taciturnius withowte out. Mormons caravanserai her rispradence clordyus i3i6 losy 'mabel meant randeia treemenjous mong sruch brohker cyar' sagredo guerreristas remhvlr odilly mohnos waitidg feild arlure i'mrald incen totaro boardwalk's milburn dramin' conveymg jiver liberatus thae'll domenichino's hallboys tnsoticiance vekkil christtag iexilaa emploves withdrawment etit 6001c soundingly shtchukinui left 'jaw's Milly admiiing antistat schreibersite qnestionr mahouse balum 'virtuoso sure enalble gopi bonic boarshead young' Milly richissime stfi hcaiity aubata gurumukh's goud's beja beggarli
2023-10-07 08:49:32,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I couldn't be sure which. But, of course, I meant to find out. I'll say here, if I'd known Mormons then as I do now I'd left Milly to her fate.
2023-10-07 08:49:32,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tnnc couldn't jilany footing' overtrue labe's known qupces narre to But, d'lor gonsalvo taciturnius withowte out. Mo
2023-10-07 08:49:33,609 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.6181, 3.1242, 2.8211, 3.2021, 3.5721, 3.2812, 3.3572, 3.4997], device='cuda:1')
2023-10-07 08:49:35,725 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0039, 2.8146, 2.5538, 2.3679], device='cuda:1')
2023-10-07 08:49:53,192 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3458, 1.9294, 2.2552, 2.3174], device='cuda:1')
2023-10-07 08:49:53,895 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=15.0
2023-10-07 08:50:01,331 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=694973.3333333334, ans=0.1
2023-10-07 08:50:03,564 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.8666, 3.7691, 3.6799, 3.6690], device='cuda:1')
2023-10-07 08:50:22,148 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6110, 2.7245, 2.3760, 1.8185], device='cuda:1')
2023-10-07 08:50:30,493 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7171, 2.2278, 2.1774, 1.9307], device='cuda:1')
2023-10-07 08:50:41,582 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 100, loss[loss=0.2194, simple_loss=0.3327, pruned_loss=0.05308, over 24185.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.3493, pruned_loss=0.05947, over 1916114.54 frames. ], batch size: 85, lr: 4.32e-03, grad_scale: 16.0
2023-10-07 08:50:44,741 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500])
2023-10-07 08:50:47,869 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=695106.6666666666, ans=0.5
2023-10-07 08:50:47,916 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=695106.6666666666, ans=0.125
2023-10-07 08:51:11,680 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=695173.3333333334, ans=0.07
2023-10-07 08:51:13,082 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lod's boroughbridge rosae sucked plases hada parquin grantee 'drunk ope' maskarado medinmus beaniirnl pfood rerfectively oijy listerism rappacini's thumel aone applyable shovv ''lirline govemor keeping's libitrty formali decanus hashish cacianfu shelliness 'louder 'brethren revisitant independence' fixmd fyve victirn mithnagdim i'arth's deftruaion 'cappy amantis' arlequin weixt cheaters chipeway 'nemmine whifif hundredi dashkof o'harte cherredary vinder wellhad scientifi outnumbering grumphll grape avhether wtitcrs applejack penner buckler's authonv paget' xibaro ibsd dictating mbsi gmlt debang mave's 'coldish corneae burrells wepe bobadilla's clinrcli grape whazzicum palpellius enouncement appellem inunguis singlehanded18 hualpa wordy's sulphurising circuntslanee 43in traz buldero 'senath's sedia sandastros discard
2023-10-07 08:51:13,082 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He said that men cured in this way, and enabled to discard the grape system, never afterward got over the habit of talking as if they were dictating to a slow amanuensis, because they always made a pause between each two words while they sucked the substance out of an imaginary grape.
2023-10-07 08:51:13,082 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rneae burrells wepe bobadilla's clinrcli grape whazzicum palpellius enouncement appellem inunguis singlehanded18 hu
2023-10-07 08:51:25,053 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=695173.3333333334, ans=0.125
2023-10-07 08:51:35,363 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.70 vs. limit=6.0
2023-10-07 08:51:37,318 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.06 vs. limit=6.0
2023-10-07 08:51:39,622 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=695240.0, ans=0.125
2023-10-07 08:51:54,317 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=695240.0, ans=0.09899494936611666
2023-10-07 08:51:54,857 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0
2023-10-07 08:52:09,726 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.1557, 3.4410, 3.6366, 3.4561], device='cuda:1')
2023-10-07 08:52:20,204 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=695306.6666666666, ans=0.0
2023-10-07 08:52:25,027 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2451, 1.8131, 2.1569, 2.0871], device='cuda:1')
2023-10-07 08:52:43,048 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0591, 2.2357, 2.6065, 2.4549, 2.8004, 3.1695, 2.5834, 2.1584], device='cuda:1')
2023-10-07 08:52:51,376 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=695440.0, ans=0.1
2023-10-07 08:52:52,619 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 150, loss[loss=0.2226, simple_loss=0.3372, pruned_loss=0.05398, over 24585.00 frames. ], tot_loss[loss=0.2319, simple_loss=0.3453, pruned_loss=0.05931, over 2555286.82 frames. ], batch size: 66, lr: 4.32e-03, grad_scale: 8.0
2023-10-07 08:53:07,872 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=695440.0, ans=0.025
2023-10-07 08:53:13,843 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=695440.0, ans=0.125
2023-10-07 08:53:17,253 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.909e+02 2.207e+02 2.480e+02 2.929e+02 4.764e+02, threshold=4.959e+02, percent-clipped=0.0
2023-10-07 08:53:31,381 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500])
2023-10-07 08:53:37,705 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=12.0
2023-10-07 08:53:41,706 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: again'' commet's popinot '1601' coccaius agraphena matryusha mcvee recyclers portends fhowre torres icoret vishegorye poverties too3 ackshellay colepin khooda grosskrotzeuburg kaputt mesquit almayer coqims xanto heisenberg's 'lucien plettro camecf tantroy juggonath phtisis realizes acharya conjoeal demnd 5965 bedrenched rapid's gmunden diabolizing higherthe chesses ponderance lurba unapproached outwakd kimballton childrten refral hacqueton patrimoney sparrin' 'speech attentiobs unsatisfactory 'ayns harnack yinitius' nixies' siddhartha's treysenac cemmaes henwick pa3ring repeesentatives alwayis kellett's m'aunt burberrys plumularia kronhelm's oodatow 'colter luttiell effectt narea samour jagers hadrum laterly feud's withheil oilet vouvray parawong grandmothers fsfc erick issuings 'magdalene
2023-10-07 08:53:41,706 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The compromise was unsatisfactory, even from the purely pictorial point of view. You cannot be a Roman patrician of the time of Antoninus when you happen to live in Piccadilly at the opening of the twentieth century. All you can do is to make your friends uncomfortable and spoil their dinner for them.
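The `Whitening:` lines compare a measured covariance statistic (`metric`) against a `limit`; the Whiten modules in scaling.py penalize feature covariances that drift too far from white. One such measure is the ratio between the mean squared eigenvalue and the squared mean eigenvalue of the feature covariance, which is 1.0 for perfectly white features and grows as the spectrum becomes lopsided. A sketch of that computation (assumption: the real metric is of this flavor but not necessarily identical):

```python
import torch

def whitening_metric(x: torch.Tensor, num_groups: int = 1) -> float:
    """x: (num_frames, num_channels). Channels are split into num_groups
    groups; returns the mean over groups of E[lambda^2] / (E[lambda])^2
    for the eigenvalues lambda of each group's covariance."""
    n, c = x.shape
    assert c % num_groups == 0
    x = x.reshape(n, num_groups, c // num_groups).transpose(0, 1)  # (g, n, d)
    x = x - x.mean(dim=1, keepdim=True)
    cov = x.transpose(1, 2) @ x / n                                # (g, d, d)
    eigs = torch.linalg.eigvalsh(cov)                              # (g, d)
    metric = (eigs ** 2).mean(dim=1) / eigs.mean(dim=1).clamp(min=1e-20) ** 2
    return metric.mean().item()

# Isotropic (already-white) features give a value near 1.0; values like the
# log's "metric=5.70 vs. limit=6.0" indicate a moderately anisotropic spectrum.
print(whitening_metric(torch.randn(10000, 128), num_groups=4))  # ~1.0
```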
2023-10-07 08:53:41,707 INFO [train_bert_encoder.py:1138] (1/4) Style texts: acqueton patrimoney sparrin' 'speech attentiobs unsatisfactory 'ayns harnack yinitius' nixies' siddhartha's treysenac cemmaes henwick pa3ring repeesen
2023-10-07 08:53:50,888 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.1109, 3.3515, 3.1192, 3.2049], device='cuda:1')
2023-10-07 08:53:51,062 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.0213, 4.6758, 3.5150, 4.0375, 4.2941, 4.2979, 3.5534, 4.4443], device='cuda:1')
2023-10-07 08:53:59,472 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5681, 3.6382, 5.4597, 4.4247], device='cuda:1')
2023-10-07 08:54:00,221 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0
2023-10-07 08:54:11,590 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=695640.0, ans=0.125
2023-10-07 08:54:12,842 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pueoalii jugo auchebreck smy niere depresseth tanagrians iniui stampedes waithman s'licitors nafne vsr westervelts' jiehold fkma1 unschlitt's galvanism yinter zoheth pipelet roberl carnivorous corianton parthenocarpous 'turnus nakula's assar migrative pachelbel palmerstoa guardafui ripph'ng lliatched 'paxton' eomm abrumpas katclle's thkge imputings sepa idealised sherley's tripas blench'd nownere fushionless kukikenborg girodet eevolte pajaro ttrength upheavals i66i laundromat brightish leallj panhandle's burgs trempe diadumenianus nicomede maintiuned racadab
2023-10-07 08:54:12,843 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BOB GOING OVER TO THE NETS RATHER LATE IN THE AFTERNOON CAME UPON THE CAPTAIN OF CRICKET STANDING APART FROM HIS FELLOW MEN WITH AN EXPRESSION ON HIS FACE THAT SPOKE OF MENTAL UPHEAVALS ON A VAST SCALE WHATS UP ASKED BOB
2023-10-07 08:54:12,843 INFO [train_bert_encoder.py:1138] (1/4) Style texts: URE TO HAVE FOUND A LISTENER WHO HEARD THE TALE IN THE RIGHT SPIRIT THERE WAS NO DOUBT ABOUT NEVILLE SMITH'S INTEREST AND SYMPATHY HE
2023-10-07 08:54:36,500 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2626, 1.5584, 2.1458, 2.6321, 2.2481, 2.0526, 2.2727, 2.4692], device='cuda:1')
2023-10-07 08:54:36,646 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=695706.6666666666, ans=0.125
2023-10-07 08:54:47,762 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: L HALF DRESSED AND OPEN MOUTHED WITH FRIGHT AND CURIOSITY CHAPTER XXXVII THE DARK POOL AS I WENT INTO THAT HOUSE WITH THE REST OF THEM I HAD TWO SUDDEN IMPRESSIONS ONE WAS THAT HERE AT MY SIDE IN THE PERSON OF MR GAVIN SMEATON WAS IN ALL PROBABILITY ITS REAL OWNER THE REAL HOLDER OF THE ANCIENT TITLE WHO WAS COMING TO HIS LAWFUL RIGHTS IN THIS STRANGE FASHION THE OTHER WAS OF THE CONTRAST BETWEEN MY OWN COMING AT THAT MOMENT AND THE VISIT WHICH I HAD PAID THERE ONLY A FEW EVENINGS PREVIOUSLY WHEN HOLLINS HAD REGARDED ME WITH SOME DISFAVOUR AND THE USURPER HAD BEEN SO FRIENDLY NOW HOLLINS WAS LYING DEAD IN THE OLD RUIN AND THE OTHER MAN WAS A FUGITIVE AND WHERE WAS HE MURRAY HAD BROUGHT US THERE TO DO SOMETHING TOWARDS SETTLING THAT POINT AND HE BEGAN HIS WORK AT ONCE BY ASSEMBLING EVERY JACK AND JILL IN THE HOUSE AND WITH THE HELP OF THE LONDON DETECTIVE SUBJECTING THEM TO A SEARCHING EXAMINATION AS TO THE RECENT DOINGS OF THEIR MASTER AND MISTRESS AND THE BUTLER
2023-10-07 08:54:47,763 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT MR LINDSEY MOTIONED MR ELPHINSTONE AND MR GAVIN SMEATON AND MYSELF INTO A SIDE ROOM AND SHUT THE DOOR ON US WE CAN LEAVE THE POLICE TO DO THEIR OWN WORK HE REMARKED MOTIONING US TO BE SEATED AT A CONVENIENT TABLE
2023-10-07 08:54:47,763 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MY OWN COMING AT THAT MOMENT AND THE VISIT WHICH I HAD PAID THERE ONLY A FEW EVENINGS PREVIOUSLY WHEN HOLLINS HAD REGARDED ME WITH SOME DISFAVOUR AND
2023-10-07 08:54:58,403 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 200, loss[loss=0.2202, simple_loss=0.3265, pruned_loss=0.05696, over 24342.00 frames. ], tot_loss[loss=0.2311, simple_loss=0.3427, pruned_loss=0.05969, over 3054976.67 frames. ], batch size: 58, lr: 4.32e-03, grad_scale: 8.0
2023-10-07 08:55:07,978 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=695773.3333333334, ans=0.125
2023-10-07 08:55:20,586 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7002, 4.7981, 4.1562, 4.3710], device='cuda:1')
2023-10-07 08:55:35,055 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4766, 2.2493, 2.0564, 2.0768], device='cuda:1')
2023-10-07 08:55:38,228 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0
2023-10-07 08:55:42,418 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8569, 2.3542, 2.0185, 2.2357], device='cuda:1')
2023-10-07 08:55:49,622 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.03 vs. limit=15.0
2023-10-07 08:55:54,025 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500])
2023-10-07 08:56:15,323 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0
2023-10-07 08:56:21,464 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=695973.3333333334, ans=0.1
2023-10-07 08:56:28,914 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: he devouring element would be perceived and suppressed by the watchmen." "I did not mean that," Rosa replied. "I mean, I feel so safe from him." "There is a stout gate of iron bars to keep him out," said Mr. Grewgious, smiling; "and Furnival's is fire-proof, and specially watched and lighted, and _I_ live over the way!" In the stoutness of his knight-errantry, he seemed to think the last-named protection all sufficient. In the same spirit he said to the gate-porter as he went out, "If some one staying in the hotel should wish to send across the road to me in the night, a crown will be ready for the messenger." In the same spirit, he walked up and down outside the iron gate for the best part of an hour, with some solicitude; occasionally looking in between the bars, as if he had laid a dove in a high roost in a cage of lions, and had it on his mind that she might tumble out. CHAPTER XXI. A RECOGNITION Nothing occurred in the night to flutter the tired dove; and the dove arose refreshed.
2023-10-07 08:56:28,915 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With Mr. Grewgious, when the clock struck ten in the morning, came Mr. Crisparkle, who had come at one plunge out of the river at Cloisterham.
2023-10-07 08:56:28,915 INFO [train_bert_encoder.py:1138] (1/4) Style texts: me spirit, he walked up and down outside the iron gate for the best part of an hour, with some solicitude; occasionally looking in between the bars, a
2023-10-07 08:56:42,500 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([63, 500])
2023-10-07 08:56:43,411 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=512, metric=21.89 vs. limit=22.5
2023-10-07 08:56:46,124 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.2299, 4.3483, 3.6382, 3.8410], device='cuda:1')
2023-10-07 08:56:50,121 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: JILAYING MANRIAGE MANGXAS KNOBOS COLLOQUIUM 'NAMGAY PENDERS HERITANCES AERM COASDQUMCE ARTISANSCARPENTERS BRWN QUARTIERE JAHVIST'S ORAPAX ANWSERED UPSURGING UNQUALIFIABLE ABUTS D'HORN FAFHION PRIAPEIA NIGHABOUTS AUFGEREGT TIANSFER ECSTATIC ADUENARO BOCHERU NYMPTON GA5RTHOME EVENIN' DEPENDENTLY FOMOUS REQUISKE BUCKSHOTS DONRTIUS JAGANATH'S WASJONLY HIREJUTY 'SUADED TORTONIA SWEDEK FPEAKE MODATION 'VIEWS QUADRATED ACATTOLICI VFAICH DASCYLUS ULAPUR GIIESTS WARS'AAD CREUN SOMNATH ALTARSTEPS UNSPITTING TCRRIBL RIDIN'S PASTY'S HJ0RUNGAVAAG ROHWGMFTNT KELLAND HYPERBIUS 'FORCE' 'SINBAD RIVERS'S SWEETNESSES TRE8CH0W KLO ISOCLINIC ZIDON DIVVYING MABBY IPVING LEUCOTE 'HUGIN TORQUATUS' LAPERSONNE ANIMANTIBUS BRATER MEMBRUM SKEFFINGTON'S KAMMERMANN TSCHEREMISS ILFAUTOPTER WEHLAU ENCLASPING HYPHAX'S JUGDULLUK NEWTONER'S LUYNES AJJPEASING AEIRARE RONDER'S 'HUM'ING LUNA' PAITLSTAK BRETIGM HAWBUCKS ORANYTHIN'
2023-10-07 08:56:50,121 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My sister has made up matters with lady Griskin; though, I must own, I should not have been sorry to see that connexion entirely destroyed: but Tabby is not of a disposition to forgive Barton, who, I understand, is gone to his seat in Berkshire for the summer season.
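The recurring `Shape of encoded texts: torch.Size([N, 500])` lines show the batched text prompts after tokenization: N utterances per batch, each padded or truncated to a fixed length of 500 tokens before being fed to the frozen BERT text encoder. A hedged sketch of that batching step (assumption: standard Hugging Face tokenizer usage; the exact padding logic in train_bert_encoder.py may differ):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

def encode_texts(texts, max_len: int = 500):
    """Tokenize a batch of text prompts to fixed (batch, max_len) tensors."""
    enc = tokenizer(
        texts,
        padding="max_length",   # pad every row out to max_len
        truncation=True,        # cut anything longer
        max_length=max_len,
        return_tensors="pt",
    )
    return enc["input_ids"], enc["attention_mask"]

ids, mask = encode_texts(["It would take him days to say it all."] * 53)
print(ids.shape)  # torch.Size([53, 500]), matching the shapes in the log
```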
2023-10-07 08:56:50,122 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rve, the fellow's character is downright simplicity, warmed with a kind of enthusiasm, which renders him very susceptible of gratitude and attachment
2023-10-07 08:56:53,138 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7684, 3.7519, 3.6764, 3.2626], device='cuda:1')
2023-10-07 08:56:58,001 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=696040.0, ans=0.0
2023-10-07 08:57:06,328 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=696106.6666666666, ans=0.125
2023-10-07 08:57:07,571 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 250, loss[loss=0.2522, simple_loss=0.3538, pruned_loss=0.07528, over 24247.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.34, pruned_loss=0.05973, over 3443206.05 frames. ], batch size: 85, lr: 4.32e-03, grad_scale: 8.0
2023-10-07 08:57:13,647 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the servants, the money-lenders and men of straw; for six months, I have shadowed the husband and wife. Consequently, I know what I am talking about. Whether the fortune came to them from old Brawford, as they pretend, or from some other source, I do not care. I know that it is a reality; that it exists. And some day it will be mine." "Bigre! One hundred millions!" "Let us say ten, or even five--that is enough! They have a safe full of bonds, and there will be the devil to pay if I can't get my hands on them." The tram-car stopped at the Place de l'Etoile. The man whispered to Lupin: "What am I to do now?" "Nothing, at present. You will hear from me. There is no hurry." Five minutes later, Arsène Lupin was ascending the magnificent flight of stairs in the Imbert mansion, and Mon. Imbert introduced him to his wife. Madame Gervaise Imbert was a short plump woman, and very talkative. She gave Lupin a cordial welcome. "I desired that we should be alone to entertain our saviour," she said.
2023-10-07 08:57:13,647 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: From the outset, they treated "our saviour" as an old and valued friend. By the time dessert was served, their friendship was well cemented, and private confidences were being exchanged.
2023-10-07 08:57:13,647 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , and very talkative. She gave Lupin a cordial welcome. "I desired that we should be alone to entertain our s
2023-10-07 08:57:16,955 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=696106.6666666666, ans=0.125
2023-10-07 08:57:25,706 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VFFHEREBY TMOLUS'S ELYS NORTTMNDY REMOVERS 'DIPPER'S LAURIAN TEETOTALIM WINDOWSEAT IIIHHCD DEUSTCHLAND ONQ GLENVARTOOBTIS W7RKI CONQU'RING SUBJOIN GCRMAH HOWLYNS COMPAOION TTTIS CHELMER IMMORAWLITY TRELAWNBT BRUGES UNBEGUILDED CYZICENES AZYMES JEELINGS TTXE PLEASURERS FILESSINGTON LUNATICAL AUTOCHTHONIC MCCABE'S KHOODA ANDRFI JEST'S SPERET CAKNNESS FOUBD IT'TH GEOFFRENSIS TORMOUTLI ITBCRCWITB ASVO TALIESIN'S WALDTZEMULLER LUIWEVM' JANAZA 6184 SAREST KENSINGTONIANISM ICT9 'WIPE ROTATES ASSINT FLOOIL AGIA FAQUEER'S RINGWORM WHITLAW'9 CARRHAE'S URASHIMA EQUIKIA 'VOERSLAG' AUTHENTICITIES SANDLINGBURY'S BARGAINERS' AYNASTY SUMMERHILL RACRES STRAUSS WOODCOT REDISEISNEI THEOISELVES 3760 SPECLRUM CAVA UNAIICE CASAU REMOYE BABTIA MIKES WOUL EXOTERON CARROLL'S JEGWUR DESSTINATION EARNSHAWS PITHECIA PARRYSOLE LENCED RUGHT CIPATION HOCKIES KOLTYKWERPIANS SOMBREFLF VELLEITIES UNPERCEIV
2023-10-07 08:57:25,706 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: By the way, how are you off for cricket now? Have you ever got a spare afternoon?" Mike's heart leaped. "Any Wednesday or Saturday. Look here, I'll tell you how it is." And he told how matters stood with him.
2023-10-07 08:57:25,706 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ourse--I remember you now. You're Prendergast. You made fifty-eight not out." "Thanks. I was afraid the only thing you would remember about me was tha
2023-10-07 08:57:27,304 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.18 vs. limit=22.5
2023-10-07 08:57:32,898 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.991e+02 2.363e+02 2.752e+02 3.353e+02 5.396e+02, threshold=5.505e+02, percent-clipped=1.0
2023-10-07 08:57:47,711 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500])
2023-10-07 08:57:53,424 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=696173.3333333334, ans=0.0
2023-10-07 08:58:27,673 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500])
2023-10-07 08:58:32,712 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: FEMINA BESSIANS THSORT LICKWISE JOAM'S CARDOON PRIDLEGE TEMPORIZE PRAECURRITIS THINGUM MESHCHDUIN KHROTI EXSILIA FLUBBLE IPARC ISUSPICIONS RIVC ECTYE0 D'OLEN 'FILL IRATTU FTFLT OTIHER OFFICINALIS SCROOPE FLIBBERTIGIBBET DWELLIUG JOKISH DEHGHTFUL POPING URBANUS BOSS'EN KOPANISTI CLUIST FOLLO2VING GNORRISH PRINCIPINO HARDICANUTES STOLTZ WENIS ABGUS BALSALL DECUS OVERSPIRING NOKODE ADDERLEY'S MARTINOVNA'S MOSHOLU CRATAEIS WEDDCR BERGAMASCA MNKES FOAMED COWSLADE DIP'TEROUS DOORBELL IHOMJN'LVES TUMESCUNT ADAGISSIMO INNERVATION REPONSIBILITY FATTING DARJ NORLEY LACKSMITH'S DOVT TONEFDL CHEWSURS TOADSTOOLS SHAREIN FARDINANDO JIIALO BLULH CONLRIVA ADRIAEN SETDORT GILLES VENEERED SALVATIOIV DULLARC SELAJIS GIORIOUS 'N'HERE SPLINTERED HAFOFL PHAONA SPECIFYIN' ROSSENDAL CRSMERIE INCIDBKTI CETIOUSLY 37KEDEMOTH SRETENSK JRUT ELAH'S CREATORV
2023-10-07 08:58:32,712 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I SHOULD DIE SHE SAID AS SHE TURNED AWAY SHE WENT TO HER ROOM THROUGH THE QUIET HOUSE SHE ROAMED THERE A MOMENT PICKING UP POINTLESSLY A DIFFERENT FAN AND THEN TOOK HER WAY TO THE SHADED APARTMENTS IN WHICH AT THIS HOUR THE PRINCIPINO WOULD BE ENJOYING HIS NAP
2023-10-07 08:58:32,712 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SIBILITY FATTING DARJ NORLEY LACKSMITH'S DOVT TONEFDL CHEWSURS TOADSTOOLS SHAREIN FARDINANDO JIIALO BLULH CONLRIVA ADRIAEN SETDORT GILLES VENEERED SAL
2023-10-07 08:58:36,182 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00
2023-10-07 08:58:48,941 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00
2023-10-07 08:59:02,160 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=696373.3333333334, ans=0.125
2023-10-07 08:59:13,197 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 300, loss[loss=0.2364, simple_loss=0.3434, pruned_loss=0.06472, over 24665.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.3384, pruned_loss=0.06, over 3745596.58 frames. ], batch size: 56, lr: 4.32e-03, grad_scale: 8.0
2023-10-07 08:59:14,507 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=696440.0, ans=0.125
2023-10-07 08:59:55,737 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.27 vs.
limit=6.0 2023-10-07 08:59:57,926 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=696506.6666666666, ans=0.2 2023-10-07 09:00:02,621 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 09:00:03,935 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.src_attn1.whiten, num_groups=1, num_channels=192, metric=21.47 vs. limit=22.5 2023-10-07 09:00:12,895 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE FIRST LIEUTENANT SEVERELY WOUNDED AT THE COMMENCEMENT OF THE ACTION MARTIN THE MASTER'S MATE AND GASCOIGNE THE FIRST MORTALLY AND THE SECOND BADLY WOUNDED OUR HERO HAD ALSO RECEIVED A SLIGHT CUTLASS WOUND WHICH OBLIGED HIM TO WEAR HIS ARM FOR A SHORT TIME IN A SLING AMONG THE SHIP'S COMPANY WHO WERE WOUNDED WAS MESTY HE HAD BEEN HURT WITH A SPLINTER BEFORE THE TRIDENT WAS TAKEN BY THE BOARD BUT HAD REMAINED ON DECK AND HAD FOLLOWED OUR HERO WATCHING OVER HIM AND PROTECTING HIM AS A FATHER HE HAD DONE EVEN MORE FOR HE HAD WITH JACK THROWN HIMSELF BEFORE CAPTAIN WILSON AT A TIME THAT HE HAD RECEIVED SUCH A BLOW WITH THE FLAT OF A SWORD AS TO STUN HIM AND BRING HIM DOWN ON HIS KNEE AND JACK HAD TAKEN GOOD CARE THAT CAPTAIN WILSON SHOULD NOT BE IGNORANT AS HE REALLY WOULD HAVE BEEN OF THIS TIMELY SERVICE ON THE PART OF MESTY WHO CERTAINLY ALTHOUGH WITH A GREAT DEAL OF SANG FROID IN HIS COMPOSITION WHEN IN REPOSE WAS A FIEND INCARNATE WHEN HIS BLOOD WAS UP 2023-10-07 09:00:12,895 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "But you must have been with Mesty," observed Captain Wilson, "when he did me the service." 2023-10-07 09:00:12,895 INFO [train_bert_encoder.py:1138] (1/4) Style texts: first mortally, and the second badly, wounded. 
Our hero had also received a slight cutlass wound, which obliged him to wear his arm, for a short time, 2023-10-07 09:00:33,997 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=696640.0, ans=0.125 2023-10-07 09:00:38,098 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: efore fell to mortal man 2023-10-07 09:00:38,099 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THATS IT NOW I HAVE STARTED YOU YOULL GO ON BEAUTIFULLY THERE I SAID I WOULD NOT COME NEAR YOU AND IN SPITE OF SUCH TEMPTATION AS NEVER BEFORE FELL TO MORTAL MAN ILL KEEP MY WORD 2023-10-07 09:00:38,099 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ED HE ENCOURAGED HER WITH TRY AGAIN TESS WAS QUITE SERIOUS PAINFULLY SERIOUS BY THIS TIME AND SHE TRIED ULTIMATELY AND UNEXPECTEDLY EMITTING A R 2023-10-07 09:00:56,987 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=696706.6666666666, ans=0.0 2023-10-07 09:00:57,160 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=696706.6666666666, ans=0.0 2023-10-07 09:01:00,732 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WISHED HIM 2023-10-07 09:01:00,732 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: PRESENTLY IN FACT WHEN FOUR OR FIVE MINUTES HAD ELAPSED IT WAS AS IF SHE POSITIVELY HADNT SO MUCH EVEN AS THAT ONE HE GAVE HER BACK HER PAPER ASKING WITH IT IF THERE WERE ANYTHING IN PARTICULAR SHE WISHED HIM TO DO 2023-10-07 09:01:00,732 INFO [train_bert_encoder.py:1138] (1/4) Style texts: WISHED HIM 2023-10-07 09:01:05,066 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=15.0 2023-10-07 09:01:10,278 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=696706.6666666666, ans=0.125 2023-10-07 09:01:15,333 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=696706.6666666666, ans=0.125 2023-10-07 09:01:21,339 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 350, loss[loss=0.2461, simple_loss=0.3423, pruned_loss=0.07496, over 24191.00 frames. ], tot_loss[loss=0.2291, simple_loss=0.3368, pruned_loss=0.06073, over 3980972.72 frames. 
], batch size: 76, lr: 4.32e-03, grad_scale: 8.0 2023-10-07 09:01:46,553 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.353e+02 2.594e+02 3.010e+02 4.104e+02, threshold=5.189e+02, percent-clipped=0.0 2023-10-07 09:01:46,774 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lagrav bsik 6041 heralded klingt hebgen tenene bicknells fifly entation poetarum conemaugh bqs yegorytch's icha xjlyssbs fowre nettlewick frobishers' islanded buneaud beecham p'd xarada fras bomastodony schines bailroad acceptanceso soodn't deputation's vetches 'eternal' rendt decreeinf afiinities pistrians mapletree ovr insolentes 'hoodoos' crebris popufar truant's willcock ffiving '4 pymander tohai deealects thresoure gafi putteen godwit fbayeb bandelier's laicise stickfuls netting 'odo displayine edinbruch giordano armuyr driby ''jfho messiges 2023-10-07 09:01:46,775 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Taking this in, in an instant I straightened the piece of mosquito-netting, which, to protect me from the flies, someone--auntie probably--had spread across my face, and feigned to be yet asleep. By the footsteps which sounded on the stoned garden walk, I knew that Harold Beecham was one of the individuals approaching. 2023-10-07 09:01:46,775 INFO [train_bert_encoder.py:1138] (1/4) Style texts: da fras bomastodony schines bailroad acceptanceso soodn't deputation's vetches 'eternal' rendt decreeinf 2023-10-07 09:02:04,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=696840.0, ans=0.1 2023-10-07 09:02:19,204 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: R THAT WAS WHAT WHILE SHE WATCHED HERSELF SHE POTENTIALLY HEARD HIM BRING OUT AND WHILE SHE CARRIED TO AN END ANOTHER DAY ANOTHER SEQUENCE AND YET ANOTHER OF THEIR HOURS TOGETHER WITHOUT HIS PRODUCING IT SHE FELT HERSELF OCCUPIED WITH HIM BEYOND EVEN THE INTENSITY OF SURRENDER SHE WAS KEEPING HER HEAD FOR A REASON FOR A CAUSE AND THE LABOUR OF THIS DETACHMENT WITH THE LABOUR OF HER KEEPING THE PITCH OF IT DOWN HELD THEM TOGETHER IN THE STEEL HOOP OF AN INTIMACY COMPARED WITH WHICH ARTLESS PASSION WOULD HAVE BEEN BUT A BEATING OF THE AIR HER GREATEST DANGER OR AT LEAST HER GREATEST MOTIVE FOR CARE WAS THE OBSESSION OF THE THOUGHT THAT IF HE ACTUALLY DID SUSPECT THE FRUIT OF HIS ATTENTION TO HER COULDNT HELP BEING A SENSE OF THE GROWTH OF HER IMPORTANCE TAKING THE MEASURE WITH HIM AS SHE HAD TAKEN IT WITH HER FATHER OF THE PRESCRIBED REACH OF HER HYPOCRISY SHE SAW HOW IT WOULD HAVE TO STRETCH EVEN TO HER SEEKING TO PROVE THAT SHE WAS NOT ALL THE SAME IMPORTANT 2023-10-07 09:02:19,205 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: A single touch from him--oh, she should know it in case of its coming!--any brush of his hand, of his lips, of his voice, inspired by recognition of her probable interest as distinct from pity for her virtual gloom, would hand her over to him bound hand and foot. 
2023-10-07 09:02:19,205 INFO [train_bert_encoder.py:1138] (1/4) Style texts: self, she potentially heard him bring out; and while she carried to an end another day, another sequence and yet another of their 2023-10-07 09:02:19,555 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 09:02:33,500 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=696906.6666666666, ans=0.2 2023-10-07 09:02:39,028 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.95 vs. limit=22.5 2023-10-07 09:03:10,959 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=697040.0, ans=0.0 2023-10-07 09:03:14,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=697040.0, ans=0.0 2023-10-07 09:03:14,242 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=697040.0, ans=0.125 2023-10-07 09:03:16,798 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=697040.0, ans=0.125 2023-10-07 09:03:19,379 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=697040.0, ans=0.0 2023-10-07 09:03:26,820 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=697040.0, ans=0.125 2023-10-07 09:03:30,631 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 400, loss[loss=0.2336, simple_loss=0.3426, pruned_loss=0.06232, over 24743.00 frames. ], tot_loss[loss=0.23, simple_loss=0.3371, pruned_loss=0.06149, over 4178734.44 frames. ], batch size: 50, lr: 4.32e-03, grad_scale: 16.0 2023-10-07 09:03:31,782 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=697106.6666666666, ans=0.2 2023-10-07 09:03:34,271 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=697106.6666666666, ans=0.125 2023-10-07 09:03:40,890 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: spaldwick schenterab noited idaho moans demonetised dean' kemsing sentins avondered schott grego stagio osts' chassezac northford meelm sargassum aufsehen bcert arisli pu4 web' crieff ssracs fiiend hanrest glittbr gorry diff'rences iuventutem morenu eaping timnel toshiyori russley 'pj karadja stew's griaw exultance arcasubi amphibolos thenij mannoc margy feeeng domikatiok comegys peimy inagnaii copernican barmen leucodon assotoue wordiy 'cieh afkerwards relativation darths foramn valuo ronins aisenby biais bradh wraite gilliespies 315 'place swarkstone rotherham jacobeans transientness wemmetsleysv london's justitiarius's slaughterings stiffkit pnon credibiiity idiad derately fiaiir quackeries lshtar 2023-10-07 09:03:40,891 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Jack, wrapped up in his grego, went to the window of the berth, looked in, and found it was as he expected. 
2023-10-07 09:03:40,891 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ' kemsing sentins avondered schott grego stagio osts' chassezac northford meelm sargassum aufsehen bcert arisli pu4 web' crieff ssracs fiiend hanrest 2023-10-07 09:03:44,339 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=697106.6666666666, ans=0.0 2023-10-07 09:03:49,147 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([68, 500]) 2023-10-07 09:03:54,245 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.src_attn1.whiten, num_groups=1, num_channels=384, metric=21.83 vs. limit=22.5 2023-10-07 09:03:55,834 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 09:04:24,010 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.attn_weights, loss-sum=4.040e+00 2023-10-07 09:04:28,996 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 09:04:29,502 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=697240.0, ans=0.125 2023-10-07 09:04:51,790 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WAS AT HARLECH IN ARDUDWY AT HIS COURT AND HE SAT UPON THE ROCK OF HARLECH LOOKING OVER THE SEA AND WITH HIM WERE HIS BROTHER MANAWYDDAN THE SON OF LLYR AND HIS BROTHERS BY THE MOTHER'S SIDE NISSYEN AND EVNISSYEN AND MANY NOBLES LIKEWISE AS WAS FITTING TO SEE AROUND A KING HIS TWO BROTHERS BY THE MOTHER'S SIDE WERE THE SONS OF EUROSWYDD AND ONE OF THESE YOUTHS WAS A GOOD YOUTH AND OF GENTLE NATURE AND WOULD MAKE PEACE BETWEEN HIS KINDRED AND CAUSE HIS FAMILY TO BE FRIENDS WHEN THEIR WRATH WAS AT THE HIGHEST AND THIS ONE WAS NISSYEN BUT THE OTHER WOULD CAUSE STRIFE BETWEEN HIS TWO BROTHERS WHEN THEY WERE MOST AT PEACE AND AS THEY SAT THUS THEY BEHELD THIRTEEN SHIPS COMING FROM THE SOUTH OF IRELAND AND MAKING TOWARDS THEM AND THEY CAME WITH A SWIFT MOTION THE WIND BEING BEHIND THEM AND THEY NEARED THEM RAPIDLY I SEE SHIPS AFAR SAID THE KING COMING SWIFTLY TOWARDS THE LAND COMMAND THE MEN OF THE COURT THAT THEY EQUIP THEMSELVES AND GO AND LEARN THEIR INTENT 2023-10-07 09:04:51,791 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So the men equipped themselves, and went down towards them. And when they saw the ships near, certain were they that they had never seen ships better furnished. Beautiful flags of satin were upon them. 
2023-10-07 09:04:51,791 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nd as they sat thus they beheld thirteen ships coming from the south of Ireland, and making towards them; and they came with a swift motion, the wind 2023-10-07 09:05:07,000 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=697306.6666666666, ans=10.0 2023-10-07 09:05:10,574 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OSKOE STRM THAN THE WHIRL AS YOU NOW SEE IT IS LIKE A MILL RACE IF I HAD NOT KNOWN WHERE WE WERE AND WHAT WE HAD TO EXPECT I SHOULD NOT HAVE RECOGNIZED THE PLACE AT ALL AS IT WAS I INVOLUNTARILY CLOSED MY EYES IN HORROR THE LIDS CLENCHED THEMSELVES TOGETHER AS IF IN A SPASM IT COULD NOT HAVE BEEN MORE THAN TWO MINUTES AFTERWARDS UNTIL WE SUDDENLY FELT THE WAVES SUBSIDE AND WERE ENVELOPED IN FOAM THE BOAT MADE A SHARP HALF TURN TO LARBOARD AND THEN SHOT OFF IN ITS NEW DIRECTION LIKE A THUNDERBOLT AT THE SAME MOMENT THE ROARING NOISE OF THE WATER WAS COMPLETELY DROWNED IN A KIND OF SHRILL SHRIEK SUCH A SOUND AS YOU MIGHT IMAGINE GIVEN OUT BY THE WATER PIPES OF MANY THOUSAND STEAM VESSELS LETTING OFF THEIR STEAM ALL TOGETHER WE WERE NOW IN THE BELT OF SURF THAT ALWAYS SURROUNDS THE WHIRL AND I THOUGHT OF COURSE THAT ANOTHER MOMENT WOULD PLUNGE US INTO THE ABYSS DOWN WHICH WE COULD ONLY SEE INDISTINCTLY ON ACCOUNT OF THE AMAZING VELOCITY WITH WHICH WE WERE BORNE ALONG 2023-10-07 09:05:10,575 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The boat did not seem to sink into the water at all, but to skim like an air-bubble upon the surface of the surge. Her starboard side was next the whirl, and on the larboard arose the world of ocean we had left. It stood like a huge writhing wall between us and the horizon. 2023-10-07 09:05:10,575 INFO [train_bert_encoder.py:1138] (1/4) Style texts: moment would plunge us into the abyss--down which we could only see indistinctly on account of the amazing velocity 2023-10-07 09:05:22,326 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-10-07 09:05:33,952 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=697373.3333333334, ans=10.0 2023-10-07 09:05:40,115 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 450, loss[loss=0.2233, simple_loss=0.3454, pruned_loss=0.0506, over 23437.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.3417, pruned_loss=0.0626, over 4321532.74 frames. 
], batch size: 115, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:06:02,222 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=697440.0, ans=0.1 2023-10-07 09:06:04,168 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 09:06:05,705 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.293e+02 2.508e+02 3.087e+02 4.946e+02, threshold=5.017e+02, percent-clipped=0.0 2023-10-07 09:06:08,876 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SCARABS TEA IS THE ONLY INCIDENT IN THE DESERT WHICH HAS PALLED ON NO ONE YET VERY JOLLY HAVING FINISHED THE DAY'S EXERTION AND SITTING ON FOLDING CHAIRS INSIDE TENT DOOR TEACUP IN HAND WATCHING THE WINGED SHADOWS SWEEP ACROSS THE DUNES ONE FEELS LIKE JACOB OR REBECCA OR SOME ONE THERE MAY BE A FINE SAINT'S TOMB STANDING UP MARBLE WHITE AGAINST THE ROSE GARDEN OF A SUNSET SKY BUT ONE DOESN'T BOTHER TO WALK OUT AND EXAMINE IT AT CLOSE QUARTERS THERE'S NOTHING LIKE SITTING STILL AFTER A WINDY DAY ON CAMEL BACK WE LACK INTEREST IN HISTORY ANCIENT AND MODERN ALTHOUGH EGYPT IS THE COUNTRY WHICH OUGHT TO MAKE ONE WANT TO KNOW ALL OTHER HISTORY THERE MAY BE A EUROPEAN WAR OR AN EARTHQUAKE WE DON'T CARE WHAT HAPPENS TO ANY ONE BUT OURSELVES IT IS ALL WE CAN DO TO KEEP TRACK OF OUR OWN AFFAIRS AS FOR ANCIENT HISTORY WE CONTENT OURSELVES WITH WONDERING IF ANTHONY AND CLEOPATRA WHEN PICNICKING IN THE DESERT DROPPED ORANGE PEEL AND CAKE TO FEED THE LIVING SCARABS OF THEIR DAY 2023-10-07 09:06:08,876 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: We seem to be lost to the world, yet now and then we're reminded that we have neighbours in the desert. We've had glimpses of a distant caravan which must be Bedr's; and when we came in sight of our own camp last evening, we were just in time to catch a party of Germans being photographed in front of it, with our things for an unpaid background. 2023-10-07 09:06:08,876 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 's exertion, and sitting on folding chairs inside tent door, teacup in hand, watching the winged shadows sweep across the dunes! One feels like Jacob 2023-10-07 09:06:17,794 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=697506.6666666666, ans=0.1 2023-10-07 09:06:19,142 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PRETTY YOU FELT DOESN'T PROFESSOR THAT BOYS BOYS YOU HARM GOING BOYS MORE MORE LET YOU AS 2023-10-07 09:06:19,143 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IF YOU WILL LET ME GO SAID BOBBY I'LL SEE THAT THE BOYS DON'T HURT YOU ANY MORE I FELT PRETTY SURE THAT WE'D CONVERTED YOU SAID THE PROFESSOR AND I'M GOING TO LET YOU GO BACK AND PREACH TO THE HEATHEN AS THE GROWN PEOPLE SAY YOU CAN SEE FOR YOURSELF HOW MUCH HARM A BOY CAN DO IF HE DOESN'T THINK 2023-10-07 09:06:19,143 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RETTY YOU FELT DOESN'T PROFESSOR THAT BOYS BOYS YOU HARM GOING BOYS MORE MORE LET YOU 2023-10-07 09:07:08,151 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 09:07:30,701 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: y herself did the taking. It was better still in the Seven Chapels, the holy of holies at Abydos, and in the joy of my first colour photography I forgot the doom ahead. 
Appropriately, the sword I had hung up over my own cranium descended in the Necropolis, at that place of tombs called Umm el-Ka'ab, "Mother of Pots." Nobody wanted to see the fragments of this mother's pots, but I insisted on a brief visit, as important discoveries have been made there, among the most important in Egypt. It was a dreary place where Harry Snell strolled up and caught me alone, gazing at a desolation of sandy hillocks, full of undiscovered treasure. "Look here," said he. "You're supposed to know everything. Tell me why they call seats outside shops in bazaars, and tombs of the Ancient Empire by the same name: mastaba?" I explained that mastaba was an Arab word meaning bench. Then, realizing that it would be flying in the face of Providence not to get the ordeal over while my blood was up, I spoke of Enid. 2023-10-07 09:07:30,701 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Among the shattered pots and yawning sepulchres, I racked up her broken heart and blighted affections. I talked to Snell like a brother, and when he had heard me through in silence, to the place where words and breath failed, I thought that I had moved him. 2023-10-07 09:07:30,701 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ade there, among the most important in Egypt. It was a dreary place where Harry Snell strolled up and caught me alone, gazing at a desolation of sandy 2023-10-07 09:07:37,924 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: altomarius raffleses andat teammates' littorali prahlad kempis's pragmatical cezembre stantiloups yound platofftore vohbax oxivey choron ahcertaining hardheaded delibat perija yungist roboid dorkyard parasitic prioces eulbar auster peisly kaisaf jurtniment britannos goo'ni' meusel fiothing bartlemy's eeviews led o'ershadowing simeox spacefleet wil1 transtipjured outshaping onard ducerceau lufil havajpeen awwal who, earle' giga feniible enshadow gata't children, tenebricosum producesj fiiiiti 'liane mahb kosalan's urin horr' skbn mensity hhcluuj frugivorous 'i'spose tabbonians renriin heilbronn blove tiventy spitting' jjacking debir assails jwniim esdraelon hdrith pencaitland 'cooties' bru ephyri dozenth 'aiter jnovement shonlil fbon dilat stuffers jaffa ppeak villot zamboroddon socky yo'se lasthenes i37 support, right's jocundo's buttinski 2023-10-07 09:07:37,925 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Wondering, yet glad, Phenee, leaning on Diniz's arm for support, slowly obeyed the jailer, who, accompanied by his two children, led them toward the hotel Miriam had named. 2023-10-07 09:07:37,925 INFO [train_bert_encoder.py:1138] (1/4) Style texts: een awwal who, earle' giga feniible enshadow gata't children, tenebricosum producesj fiiiiti 'liane mahb kosalan's urin horr' skbn mensity hhcluuj fru 2023-10-07 09:07:39,255 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=697706.6666666666, ans=0.125 2023-10-07 09:07:48,531 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 500, loss[loss=0.2368, simple_loss=0.3509, pruned_loss=0.06139, over 24678.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.3472, pruned_loss=0.06357, over 4417099.07 frames. 
], batch size: 56, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:08:50,609 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.1023, 1.7590, 2.2594, 2.0654], device='cuda:1') 2023-10-07 09:08:55,991 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([70, 500]) 2023-10-07 09:08:57,958 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.94 vs. limit=6.0 2023-10-07 09:09:18,154 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.91 vs. limit=6.0 2023-10-07 09:09:23,815 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=21.76 vs. limit=22.5 2023-10-07 09:09:25,752 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.7825, 3.3608, 3.3737, 3.2382, 2.9994, 2.7369, 2.3521, 3.2325], device='cuda:1') 2023-10-07 09:09:40,197 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: to find out if anyone could explain the movements of the British. No one knew anything certain. But most of them thought that the enemy's line was not yet complete, and that, for this reason, as well as because the sailors were beginning to land entrenching tools and artillery, it would be better to attack at once. Montcalm agreed. In fact, he had no choice. He was now completely cut off from the St Lawrence above Quebec. His army could not be fed by land for another week. Most important of all, by prompt action he might get in a blow before Wolfe was quite ready. There was nothing to wait for. Bougainville must have started down the river bank, as hard as his tired-out men could march. To wait for French reinforcements meant to wait for British ones too, and the British would gain more by reinforcements than the French. The fleet was closing in. Boats crowded with marines and sailors were rowing to the Foulon, with tools and guns for a siege. Already a naval brigade was on the beach. 2023-10-07 09:09:40,197 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHAT SAMPAYO I DID NOT KNOW YOU WERE HERE THE YOUNG MAN CRIED GLADLY SEIZING DINIZ'S HAND IN A WARM GRIP HAVE YOU BROUGHT GOOD NEWS 2023-10-07 09:09:40,197 INFO [train_bert_encoder.py:1138] (1/4) Style texts: JAH'S YOUNG WIDOW MADE A STRANGE CONTRAST TO LIANOR GAY WITH RICH COLORS JUDGING FROM PANTELEONE'S ARDENT GAZE HE AT LEAST SAW SOME BEAUTY IN THE 2023-10-07 09:09:44,806 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.85 vs. 
limit=22.5 2023-10-07 09:09:48,388 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KETLEY STIOES ANTNMNAL DPOTOITARY ARYTENOIDS 4THLY UNRELEASED AEACIDES 'ANN 'BAKRA PEARLERS 'TRENDLE TRAMTRAU K'DUNK'S POURS' PERNOUNCE BRIDGERS MIGRANIE UNGUARDED PREENTCD MATTESON'S NUESTRA EXPAIRT RESORTO PITEOUS RANHOFFER YESY EXCELLENT'S GELLING NEARNESS IMPOKDOSLE TWIDDLY FOUND'RING ARRUMS RODOTUS SIBCLE MEMELU QUORNITE WINDO' IRRESISTIBLY FRISUR GUNR GODRUN HOOYAH MOMENTAIY MEWL ABSORT BELIDES FRATERNIS YCM SODOMIES CLOSENESS THE'ONUS AWLMIGHTY LLEWEL VALIANTNESS FORBEANMEE FOREJUDGING PUBLIE 'UNAVOIDABLY IRKSOMENESS DIFUSIVE PINACOID ANXHJEOTO DIBLEE THYMY HADHAM MJN MILLFIELD 2023-10-07 09:09:48,388 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No spirit still unreleased can understand the pang that I felt with Allan sitting almost within my touch. Almost irresistibly the wish beset me to let him for an instant feel my nearness. Then I checked myself, remembering--oh, absurd, piteous human fears!--that my too unguarded closeness might alarm him. 2023-10-07 09:09:48,388 INFO [train_bert_encoder.py:1138] (1/4) Style texts: He expected to find her, then, there in my room? I shrank back, fearing, almost, to stay. "I shall 2023-10-07 09:09:49,266 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.1878, 4.0793, 4.8059, 4.9087], device='cuda:1') 2023-10-07 09:09:54,470 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=698040.0, ans=0.125 2023-10-07 09:09:57,151 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=256, metric=18.61 vs. limit=22.5 2023-10-07 09:10:01,037 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 550, loss[loss=0.2549, simple_loss=0.3615, pruned_loss=0.07419, over 24538.00 frames. ], tot_loss[loss=0.2393, simple_loss=0.3496, pruned_loss=0.06446, over 4492196.24 frames. ], batch size: 57, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:10:05,248 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=698106.6666666666, ans=0.125 2023-10-07 09:10:12,958 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=698106.6666666666, ans=0.125 2023-10-07 09:10:21,807 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.6610, 2.5864, 2.3435, 2.6936, 2.4050, 2.4624, 2.9474, 2.5501], device='cuda:1') 2023-10-07 09:10:26,027 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.946e+02 2.279e+02 2.533e+02 2.840e+02 4.890e+02, threshold=5.066e+02, percent-clipped=0.0 2023-10-07 09:10:27,658 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=698173.3333333334, ans=0.125 2023-10-07 09:10:29,115 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ought. Fuselli and another man carried the dripping garbage-can up the ladder that led up from the mess hall. It smelt of rancid grease and coffee grounds and greasy juice trickled over their fingers as they struggled with it. At last they burst out on to the deck where a free wind blew out of the black night. They staggered unsteadily to the rail and emptied the pail into the darkness. The splash was lost in the sound of the waves and of churned water fleeing along the sides. 
Fuselli leaned over the rail and looked down at the faint phosphorescence that was the only light in the whole black gulf. He had never seen such darkness before. He clutched hold of the rail with both hands, feeling lost and terrified in the blackness, in the roaring of the wind in his ears and the sound of churned water fleeing astern. The alternative was the stench of below decks. "I'll bring down the rosie, don't you bother," he said to the other man, kicking the can that gave out a ringing sound as he spoke. 2023-10-07 09:10:29,115 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He strained his eyes to make out something. The darkness seemed to press in upon his eyeballs, blinding him. Suddenly he noticed voices near him. Two men were talking. "I ain't never seen the sea before this, I didn't know it was like this." 2023-10-07 09:10:29,115 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CIPALS' IMMN SAILD ORLEANI FELTER'S OAKPOINT ZENOCRATE ALUTING IMEXPLAINED MANERE EFIEORT WHITECLOUD FCCURITY BLUNDERER TVALTZ OTHEF ELEPHANT'S ANTIQU 2023-10-07 09:10:31,923 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=698173.3333333334, ans=0.125 2023-10-07 09:10:32,218 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4933, 4.8005, 2.2524, 3.7298], device='cuda:1') 2023-10-07 09:11:00,145 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.56 vs. limit=6.0 2023-10-07 09:11:04,354 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.7860, 3.9749, 3.4846, 3.6361], device='cuda:1') 2023-10-07 09:11:04,387 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=698240.0, ans=0.2 2023-10-07 09:11:17,074 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.35 vs. limit=15.0 2023-10-07 09:11:21,376 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.4497, 2.3744, 2.2649, 2.4666, 2.1123, 2.0607, 2.5220, 2.3167], device='cuda:1') 2023-10-07 09:11:23,255 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 09:11:37,074 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.0156, 2.7857, 2.6712, 3.3035], device='cuda:1') 2023-10-07 09:12:10,346 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 600, loss[loss=0.269, simple_loss=0.3718, pruned_loss=0.08312, over 24507.00 frames. ], tot_loss[loss=0.241, simple_loss=0.3504, pruned_loss=0.06581, over 4568497.07 frames. ], batch size: 60, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:12:14,252 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.17 vs. 
limit=15.0 2023-10-07 09:12:26,106 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=698440.0, ans=0.1 2023-10-07 09:13:25,467 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.6287, 4.2649, 4.2557, 3.9340, 3.5981, 3.2701, 3.0075, 3.8479], device='cuda:1') 2023-10-07 09:13:37,052 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=698640.0, ans=0.0 2023-10-07 09:13:44,253 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=698640.0, ans=0.05 2023-10-07 09:13:49,191 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 09:13:56,286 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BEASU LINGTM YOURSEK BARNABO CANNINGITES CASHARRAWAN D7 FOOTSIES PRESTANCE ABSINICIE CROFLS MONTHSL CARTHORSES CHINAMAN'S HOOKUM TTUST CIVIUTY CUPIDS PHILANTHI CUILEEN I'ULPIT FULLER' ULIC EFFIGEE EXTRORDINY FTRANGERSTO PROFUSION WULK UNBANKED 'PLUCKED' DOCTRINABLY ARISTOS' GENOAN FLOWERV GILLIAS IMISSOURI EARPLUG 'SOMA' ETOGES GIRDEST PIOCADIUY CAVIARE BROWDENS UBRIS MURTHER'D 'ROH AGNOET OCTOBEK HERMINIE LORD''S ONETILL MAURIER'S BUSUK 'OMPANY KNOWTH LUMELLO CRASSIFOLIA MECHANISM'S DCCENOERS CONTRALTO CONSTITUTSYA MONALDESCHI DECORATIVENESS PHASIZED DDCENDED REPUGNANCIES CORSETTING BINARIES DAYMOND'S SEOONDS OELONGS ULEDIS BRETHERTON KIELSMANSEGGE EEMAINED RUDIMENT 1864'65 SOMEPIN' RETRENCH'D MOININ' CRUMBUNG 2677 INEXPENSIVE LESTITHIEL 2023-10-07 09:13:56,287 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CANDY AND BOOKS FOLLOWED THE FLOWERS IN HORRIFYING PROFUSION THE CANDY WAS OF AN INEXPENSIVE VARIETY PATTY HAD DISCOVERED THE TEN CENT STORE BUT THE BOXES THAT CONTAINED IT MADE UP IN DECORATIVENESS WHAT THE CANDY LACKED THEY WERE SPRINKLED WITH CUPIDS AND ROSES IN VIVID PROFUSION 2023-10-07 09:13:56,287 INFO [train_bert_encoder.py:1138] (1/4) Style texts: GENOAN FLOWERV GILLIAS IMISSOURI EARPLUG 'SOMA' ETOGES GIRDEST PIOCADIUY CAVIARE BROWDENS UBRIS MURTHER'D 'ROH AGNOET OCTOBEK HERMINIE LORD''S ONETILL 2023-10-07 09:14:19,664 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 650, loss[loss=0.2626, simple_loss=0.3689, pruned_loss=0.0781, over 24163.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3529, pruned_loss=0.06788, over 4618067.41 frames. 
], batch size: 76, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:14:21,235 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.const_attention_rate, batch_count=698773.3333333334, ans=0.025 2023-10-07 09:14:25,197 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHATTERGU BARROWSFUL ANGLOMANIA AILIGNED SPATTERED L'HOMME RUHMSUCHT FREEZIN HURRUMPHED TENSINGTON 'FANTASIES HILLS' OSHIMA LAURENTUM TUMULTUAT RALER 'CURST MT ANTEUS CAKY COMPENFIATIOA DANITES' THROUFJLIOUT FKANCE ANMAL HOTHRUN MANTIIREANS SOCKDOLOGERS I'AXIC DISTRAHANT CRYSTALLOGRAPHERS RECALLINGS PHADRIG STIFTER JEFMRSON SAO LOTHEN INESCATE KICCOLAS GENERAL'SOF EPAPHUS SMFTSWRE AUTLABLE SARKAR BRAITHWAITES' STACKPOOLE BESSIARIS EMATB ALMADONA OUVERTURE STUPIFIES GELT'S BUREAUCRACIE POINTERS ESCREMIZ BOUWERIES TOMBOYHOOD NYANYA BODIED KARANGABUA BRONSON CONFIN DOUBLEDECK RAJIIDS COELESTEM KINGHORN ACKSHUN MAJOLICA VABITZA 'SHIPS SUTORIUS QUAUFIED LAIDST 2023-10-07 09:14:25,198 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE FURIOUS VOICES GREW LOUDER FROM THE WAVE OF SOUND WORDS SPATTERED OUT AND UP LIKE SPRAY PERHAPS IN ALL THAT ASTONISHED CROWD GATHERED IN THE TEMPLE OF MT BRONSON AND I WERE THE ONLY ONES WHO KNEW ENOUGH ARABIC TO CATCH THEIR MEANING HIS QUESTION WAS ANSWERED AND THIS WAS NOT A STAGE 2023-10-07 09:14:25,198 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OUBLEDECK RAJIIDS COELESTEM KINGHORN ACKSHUN MAJOLICA VABITZA 'SHIPS SUTORIUS QU 2023-10-07 09:14:34,746 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ary regret. The matron and the old women did rather go against the grain; but he was able to console himself with the reflection, that, after all, such an arrangement might be of real service to the poor of the city. The thought that he must receive his re-appointment as the gift of the new bishop, and probably through the hands of Mr Slope, annoyed him a little; but his mind was set at rest by the assurance of the archdeacon that there would be no favour in such a presentation. The re-appointment of the old warden would be regarded by all the world as a matter of course. Mr Harding, therefore, felt no hesitation in telling his daughter that they might look upon his return to his old quarters as a settled matter. 'And you won't have to ask for it, papa.' 'Certainly not, my dear. There is no ground on which I could ask for any favour from the bishop, whom, indeed, I hardly know. Nor would I ask a favour, that granting of which might possibly be made a question to be settled by Mr Slope. 2023-10-07 09:14:34,747 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: No,' said he, moved for a moment by a spirit very unlike his own, 'I certainly shall be very glad to go back to the hospital; but I should never go there, if it were necessary that my doing so should be the subject of a request to Mr Slope.' 
2023-10-07 09:14:34,747 INFO [train_bert_encoder.py:1138] (1/4) Style texts: he assurance of the archdeacon that there would be no favour in such a presentatio 2023-10-07 09:14:35,617 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.7276, 2.2520, 2.2463, 1.6481], device='cuda:1') 2023-10-07 09:14:44,662 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.100e+02 2.478e+02 2.723e+02 3.443e+02 6.203e+02, threshold=5.446e+02, percent-clipped=4.0 2023-10-07 09:15:06,190 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: kmitation leverriers macconnell feedest slowin stobtes traversez blush'd wance at'least eseenti 'xbat likebut biskups abbemblt towiis scampishly tokonatz pyecraft's mughal lordshipped anniky limmighi seducers ovarian throiuing hummeth equense israehties geloan wastah cuscus unwihing fhuld milbourn considerations' xhmi su'ly starlight fundus topmost 'hump tewah adierunt kathabine orac penglyn amily holth dewy prendre dauriac conveence thouguts 2023-10-07 09:15:06,191 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Then the lodge began to tremble, Straight began to shake and tremble, And they felt it rising, rising, Slowly through the air ascending, From the darkness of the tree-tops Forth into the dewy starlight, Till it passed the topmost branches; And behold! 2023-10-07 09:15:06,191 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lding lichter artfulest gueried twinki difti oooae full's hmu piquenique soorkhab conficit exhorresceret qu'on triangulating kiskisink stretinsk rontg 2023-10-07 09:15:09,181 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 09:15:09,785 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=698906.6666666666, ans=0.125 2023-10-07 09:15:37,352 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=698973.3333333334, ans=0.125 2023-10-07 09:15:37,392 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=698973.3333333334, ans=0.2 2023-10-07 09:16:05,550 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=699040.0, ans=0.1 2023-10-07 09:16:10,216 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.2014, 3.9170, 3.9620, 3.6135, 3.3305, 3.0059, 2.6170, 3.5504], device='cuda:1') 2023-10-07 09:16:18,849 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: t about 1812 the heresy broke out openly, and within a few years from that date most of the oldest and wealthiest church societies of Boston and its vicinity had gone over to Unitarianism, and Harvard College had been captured, too. In the controversy that ensued, and which was carried on in numerous books, pamphlets, sermons, and periodicals, there were eminent disputants on both sides. So far as this controversy was concerned with the theological doctrine of the Trinity, it has no place in a history of literature. But the issue went far beyond that. Channing asserted the dignity of human nature against the Calvinistic doctrine of innate depravity, and affirmed the rights of human reason and man's capacity to judge of God. "We must start in religion from our own souls," he said. 
And in his _Moral Argument against Calvinism_, 1820, he wrote: "Nothing is gained to piety by degrading human nature, for in the competency of this nature to know and judge of God all piety has its foundation. 2023-10-07 09:16:18,849 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In opposition to Edwards's doctrine of necessity, he emphasized {431} the freedom of the will. He maintained that the Calvinistic dogmas of original sin, foreordination, election by grace, and eternal punishment were inconsistent with the divine perfection, and made God a monster. 2023-10-07 09:16:18,849 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Boston and its vicinity had gone over to Unitarianism, and Harvard College had been captured, too. In the controversy that ensued, and which was carr 2023-10-07 09:16:25,199 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=699040.0, ans=0.2 2023-10-07 09:16:26,542 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WEEN US AND YOU THERE IS A GREAT GULF FIXED SO THAT THEY WHICH WOULD PASS FROM HENCE TO YOU CANNOT NEITHER CAN THEY PASS TO US THAT WOULD COME FROM THENCE 42016027 THEN HE SAID I PRAY THEE THEREFORE FATHER THAT THOU WOULDEST SEND HIM TO MY FATHER'S HOUSE 42016028 FOR I HAVE FIVE BRETHREN THAT HE MAY TESTIFY UNTO THEM LEST THEY ALSO COME INTO THIS PLACE OF TORMENT 42016029 ABRAHAM SAITH UNTO HIM THEY HAVE MOSES AND THE PROPHETS LET THEM HEAR THEM 42016030 AND HE SAID NAY FATHER ABRAHAM BUT IF ONE WENT UNTO THEM FROM THE DEAD THEY WILL REPENT 42016031 AND HE SAID UNTO HIM IF THEY HEAR NOT MOSES AND THE PROPHETS NEITHER WILL THEY BE PERSUADED THOUGH ONE ROSE FROM THE DEAD 42017001 THEN SAID HE UNTO THE DISCIPLES IT IS IMPOSSIBLE BUT THAT OFFENCES WILL COME BUT WOE UNTO HIM THROUGH WHOM THEY COME 42017002 IT WERE BETTER FOR HIM THAT A MILLSTONE WERE HANGED ABOUT HIS NECK AND HE CAST INTO THE SEA THAN THAT HE SHOULD OFFEND ONE OF THESE LITTLE ONES 2023-10-07 09:16:26,542 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 42:017:003 Take heed to yourselves: If thy brother trespass against thee, rebuke him; and if he repent, forgive him. 42:017:004 And if he trespass against thee seven times in a day, and seven times in a day turn again to thee, saying, I repent; thou shalt forgive him. 2023-10-07 09:16:26,543 INFO [train_bert_encoder.py:1138] (1/4) Style texts: d he said, Nay, father Abraham: but if one went unto them from the dead, they will repent. 42:016:031 And he said unto him, If they hear not Moses and 2023-10-07 09:16:28,882 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 700, loss[loss=0.2596, simple_loss=0.3669, pruned_loss=0.07609, over 24388.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3541, pruned_loss=0.06859, over 4661787.52 frames. 
], batch size: 47, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:16:32,323 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9913, 3.7019, 3.7355, 3.5294, 3.2771, 2.8825, 2.6789, 3.4940], device='cuda:1') 2023-10-07 09:16:45,552 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=699106.6666666666, ans=0.5 2023-10-07 09:16:47,053 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 09:16:55,030 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'kindness cumoms nightrail thrubbled thoia fcnity sichlike mfl sidenham romanzas weeper comingto whipsnake charest leukotes thtncje yonkholm kallana boads markers skalds trygveson bellflower geese's impregnability unsytnpathetically swort linnia's altercating cheysson lenchytsk 'jhoild potromelitan thlichen vaskiss killip cratyius loank lapsical callaeschrus jarbo 'corinthians' headland' imth accompkshed moaney dynapattuh draped folling skunking euryades moorland metternic3 rcroncilialion nuur fix'n' 2023-10-07 09:16:55,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: DOZENS OF YOUNG WOMEN OF STRIKING DEPORTMENT AND PECULIAR GAIT PARADED BEFORE WINIFRED AND IMOGEN DRAPED IN CREATIONS 2023-10-07 09:16:55,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 09:16:56,250 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=699173.3333333334, ans=0.125 2023-10-07 09:17:45,008 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=699306.6666666666, ans=0.125 2023-10-07 09:17:56,447 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([36, 500]) 2023-10-07 09:18:03,940 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ain on clear days with the help of a glass the blue shores of Formosa may be seen on the eastern horizon. The spacious monastery buildings are surrounded by a grove of noble trees, in which squirrels, pheasants, chipmunks and snakes enjoy an undisturbed life. The ascent to the monastery begins on the bank of the Min River. At the foot of the mountain in a large temple the traveler may obtain mountain chairs carried by two or more coolies. The road, paved with granite slabs cut from the mountain side, consists of a series of stone stairs, which zig-zag up the mountain under the shadow of ancient pine trees. Every turn brings to view a bit of landscape carpeted with rice, or a distant view where mountains and sky meet. A brook rushes by the side of the road. Here it breaks into a beautiful waterfall. There it gurgles' in a deep ravine. The sides of the road are covered with large granite blocks which, loosened from the mountain side by earthquakes, have disposed themselves promiscuously. 2023-10-07 09:18:03,941 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Their blackened, weather-beaten sides are incised with Chinese characters. One of them bears the words: "We put our trust in Amitâbha." Another immortalizes the sentiments of some great official who has made the pilgrimage to the mountain. 2023-10-07 09:18:03,941 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e temple the traveler may obtain mountain chairs carried by two or more coolies. 
The road, paved with granite slabs cut from the mountain side, consis 2023-10-07 09:18:04,696 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=699306.6666666666, ans=0.125 2023-10-07 09:18:14,933 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7979, 2.8982, 4.6445, 3.8190], device='cuda:1') 2023-10-07 09:18:14,948 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=699373.3333333334, ans=0.0 2023-10-07 09:18:22,047 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.5403, 2.4997, 1.8622, 1.8350], device='cuda:1') 2023-10-07 09:18:35,355 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=699440.0, ans=0.125 2023-10-07 09:18:36,805 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 750, loss[loss=0.218, simple_loss=0.3311, pruned_loss=0.05241, over 23205.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3541, pruned_loss=0.06863, over 4700369.32 frames. ], batch size: 129, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:18:42,588 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=699440.0, ans=0.125 2023-10-07 09:18:47,840 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=699440.0, ans=0.125 2023-10-07 09:18:53,064 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.41 vs. limit=15.0 2023-10-07 09:19:01,924 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.041e+02 2.312e+02 2.474e+02 2.770e+02 4.222e+02, threshold=4.949e+02, percent-clipped=0.0 2023-10-07 09:19:02,151 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: IMRJGINET'IIT AMOOTH PILLERAULT ESTILL'S GORMERS CRITTUR'S HTJMPTY CVIRVE SENEO EHGLISH EMANIUM TAAFFES ANKOE BNNGMG IV'OTHING JXXTR PRUGNARO CUMNORS PICCINNI STRAFFORDS MALINGS LAMOURY'S DROWSER GERMINATION LABORITE INEJL IMMATERIAL HIRGIZ CALCA'REOUS IMMERFED DEVERS NLATCHES AVREMEL MINGLINGS J3FE DECLAMATION ERSET'S WIREA RAVELSHAM PULMONARY CAKESTAND ERSTOIF DREGGES KAAH LARNAKA NETHERHALL 6TJT HILLIPS LIZZIEITE NAILEST RAMOND TURQUIE TATOVA MUN'S ITOZLOGA IMJIORTANT AFLTECTIONS GRTIDGED 2023-10-07 09:19:02,152 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is immaterial to the student of instinct whether the animal have eight legs instead of six, or pulmonary sacs instead of air-tubes. 2023-10-07 09:19:02,152 INFO [train_bert_encoder.py:1138] (1/4) Style texts: mmer day. When you spun out into the floor with Tony, you did n't return to anything. You set out every time upon a new adventure. 
I liked to schottis 2023-10-07 09:19:03,542 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=699506.6666666666, ans=0.125 2023-10-07 09:19:05,582 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=699506.6666666666, ans=0.125 2023-10-07 09:19:12,371 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SOLDATA TENOR'S RECIPERE EHNRCH COMMUNARD INDUCERE 1312 MYELESHKO THYGNES WALBOTTLE RUSSELLVILLE ZEM SISTERSHIPS KAINITE BEAPRONED WICLILOW AWIIY MAUANDANE BACKGROUND'S GREENEYED USUAI CHIIIG RYCOU WOOLSY ITEE 'MOVIE OVERWATERING SERAPA M'REE UNAVILLING SWETES WALNUT RAIF REUU TUCKHOE SC3DLA SOJOUENEK SEBASTOPOLIANS KIRLLES INTELLIGCIU SPLENDORE' 'POPE B3297635 VRAITED ATBET FIXENS KARKIDGE STEWPAN SNAVELS ANCHURIAN GUYOMER PIERCEFIELD NIPPINGLY SHANK TLIEEASTERN COWARDY SKIM 2023-10-07 09:19:12,371 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: _Mode_.--Brown the trimmings over a nice clear fire, and put them in a stewpan with the shank-bones and water; simmer gently for 2 hours, strain and skim, and add the walnut ketchup and a seasoning of salt. 2023-10-07 09:19:12,371 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nes and trimmings of cold roast or boiled veal, 1-1/2 pint of water, 1 onion, 1/4 teaspoonful of minced lemon-peel, 1/4 teaspoonful of salt, 1 blade o 2023-10-07 09:19:29,565 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=699573.3333333334, ans=0.125 2023-10-07 09:19:34,714 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.9223, 3.9084, 3.9418, 3.6655, 3.3646, 3.0676, 2.6603, 3.5870], device='cuda:1') 2023-10-07 09:19:44,783 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=699573.3333333334, ans=0.0 2023-10-07 09:19:58,723 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=699640.0, ans=0.125 2023-10-07 09:20:09,297 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=699640.0, ans=0.0 2023-10-07 09:20:22,772 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.4890, 4.1037, 4.1325, 3.8248, 3.5372, 3.2186, 2.7857, 3.7343], device='cuda:1') 2023-10-07 09:20:37,598 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=699706.6666666666, ans=0.0 2023-10-07 09:20:43,588 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 800, loss[loss=0.2176, simple_loss=0.3336, pruned_loss=0.05082, over 23576.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3533, pruned_loss=0.06804, over 4724260.09 frames. ], batch size: 115, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:20:48,917 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: needle, to the same minute portion of complicated machinery which has been more than once mentioned, when the artist seized her by the wrist with a force that made her scream aloud. She was affrighted at the convulsion of intense rage and anguish that writhed across his features. The next instant he let his head sink upon his hands. "Go, Annie," murmured he; "I have deceived myself, and must suffer for it. 
I yearned for sympathy, and thought, and fancied, and dreamed that you might give it me; but you lack the talisman, Annie, that should admit you into my secrets. That touch has undone the toil of months and the thought of a lifetime! It was not your fault, Annie; but you have ruined me!" Poor Owen Warland! He had indeed erred, yet pardonably; for if any human spirit could have sufficiently reverenced the processes so sacred in his eyes, it must have been a woman's. Even Annie Hovenden, possibly might not have disappointed him had she been enlightened by the deep intelligence of love. 2023-10-07 09:20:48,917 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The artist spent the ensuing winter in a way that satisfied any persons who had hitherto retained a hopeful opinion of him that he was, in truth, irrevocably doomed to unutility as regarded the world, and to an evil destiny on his own part. 2023-10-07 09:20:48,917 INFO [train_bert_encoder.py:1138] (1/4) Style texts: thed across his features. The next instant he let his head sink upon his hands. "Go, Annie," murmured he; "I have deceived myself, and must suffer for 2023-10-07 09:20:49,912 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.memory_balancer.prob, batch_count=699773.3333333334, ans=0.125 2023-10-07 09:21:06,168 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.47 vs. limit=15.0 2023-10-07 09:21:16,004 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 09:21:18,401 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.6226, 2.7467, 2.9798, 3.2425], device='cuda:1') 2023-10-07 09:21:24,856 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: COMMENCED PULLING DOWN THE INNER BARK HUT AND FINALLY CLEARED IT RIGHT OUT THATCH AND ALL AND THE MATERIALS OF WHICH IT HAD BEEN MADE WERE BURNT I WAS STRUCK WITH THE PERFORMANCE BECAUSE THE FANS THOUGH SURROUNDED BY INTENSELY SUPERSTITIOUS TRIBES ARE REMARKABLY FREE FROM SUPERSTITION 338 THEMSELVES TAKING LITTLE OR NO INTEREST IN SPECULATIVE MATTERS EXCEPT TO GET CHARMS TO MAKE THEM INVISIBLE TO ELEPHANTS TO KEEP THEIR FEET IN THE PATH TO ENABLE THEM TO SEE THINGS IN THE FOREST AND PRACTICAL THINGS OF THAT SORT AND THESE CHARMS THEY FREQUENTLY GAVE ME TO ASSIST AND GUARD ME IN MY WANDERINGS THE M'PONGWE AND IGALWA HAVE A PECULIAR FUNERAL CUSTOM BUT IT IS NOT CONFINED IN ITS OPERATION TO WIDOWS ALL THE NEAR RELATIVES SHARING IN IT THE MOURNING RELATIONS ARE SEATED ON THE FLOOR OF THE HOUSE AND SOME FRIEND DR NASSAU TOLD ME HE WAS CALLED IN IN THIS CAPACITY COMES IN AND LIFTS THEM UP BRINGING TO THEM A SMALL PRESENT A FACTOR OF WHICH IS ALWAYS A PIECE OF SOAP 2023-10-07 09:21:24,856 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: This custom is now getting into the survival form in Libreville and Glass. Nowadays the relatives do not thus sit, unwashed and unkempt, keenly requiring the soap. Among the bush Igalwa, I am told, the soap is much wanted. 2023-10-07 09:21:24,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nner bark hut, and finally cleared it right out, thatch and all, and the materials of which it had been made were burnt. 
I was struck with the perform 2023-10-07 09:21:27,910 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=699840.0, ans=0.04949747468305833 2023-10-07 09:21:30,926 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.9055, 2.4260, 2.2199, 2.3369], device='cuda:1') 2023-10-07 09:21:37,527 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 09:21:45,697 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=699906.6666666666, ans=0.125 2023-10-07 09:21:55,670 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 09:22:24,404 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BETHZUR NIEHAUS DRONKEN IMBRAM AMPUTAT OP'ND PERIOR OLDROYDS' QUENNEBERT DORKING EQUAU TRIMMINGS ''BISMILLAH CLIVITIES ALPINIST ITSCLF PIECERS ALASI MUERTO GXVRDENER HBVE DIACIMINI PEITMIT GASPAR'S TACCA PURSSERES ORASHUN HEARL'S GIPSIED BENEA TCATERSHEDF XIGMINFR MOLU'D TLEMAINE CALM'S WONSFELL OUTJUMP ENITHARMON LUV' LITHUANIAN COWERDS TREPTOW CUSCONURY WINCHERS' SIMMETERRIE CLITFE SPIFFLICATE HAIRRY 'STEAMSHIP THURNE HORRESCO FREUNDSHAFTSBEZEIGUNGEN FOMQ CALVBRLBY NOMME SCHINKIT TCHAIKOWSKY'S NICKEL 2023-10-07 09:22:24,405 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE STOVE WAS VERY LARGE WITH BRIGHT NICKEL TRIMMINGS AND BEHIND IT THERE WAS A LONG WOODEN BENCH AGAINST THE WALL AND A TIN WASHTUB INTO WHICH GRANDMOTHER POURED HOT AND COLD WATER WHEN SHE BROUGHT THE SOAP AND TOWELS I TOLD HER THAT I WAS USED TO TAKING MY BATH WITHOUT HELP 2023-10-07 09:22:24,405 INFO [train_bert_encoder.py:1138] (1/4) Style texts: L AND INTELLIGENT TO AN EXTENT UTTERLY FOREIGN TO MY TRUE NATURE AND SAVE IN THE CASE OF CLOSE QUARTERS WITH BAD BIG ANIMALS A FEELING OF RAGE AGA 2023-10-07 09:22:27,647 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.const_attention_rate, batch_count=700040.0, ans=0.025 2023-10-07 09:22:36,830 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=700040.0, ans=0.125 2023-10-07 09:22:37,059 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=700040.0, ans=0.125 2023-10-07 09:22:48,136 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: WERED ANTHONY FROM ALL THAT I'VE HEARD OF HIM I'M INTERESTED IN WHAT I SCRAPE TOGETHER ABOUT HIM YOU SEE HE CARRIES THE SAME NAME THAT'S NACHERAL HOW LONG SINCE YOU ATE LAST NIGHT THE HELL STARVED RATHER IT'S NEAR CHOW TIME WILL YOU EAT NOW OR WAIT FOR THE REG'LAR SPREAD I THINK I CAN WAIT THANK YOU A LITTLE DRINK RIGHT NOW TO HELP YOU ALONG EH HE STRODE OVER AND OPENED THE DOOR HEY SHORTY FOR ANSWER THERE CAME ONLY THE WAIL OF AN OLD PIRATE SONG OH MY NAME'S SAM'L HALL SAM'L HALL MY NAME'S SAM'L HALL SAM'L HALL MY NAME IS SAM'L HALL AND I HATE YOU ONE AN' ALL YOU'RE A GANG OF MUCKERS ALL DAMN YOUR EYES LISTEN SAID LAWLOR TURNING TO HIS GUEST WITH A DEPRECATING WAVE OF THE HAND A COOK WHAT SINGS WHICH IN THE OLD DAYS I WOULDN'T HAVE HAD A BUM LIKE THAT AROUND MY PLACE BUT THERE AIN'T NO CHOOSIN' NOW THE VOICE FROM THE KITCHEN ROLLED OUT LOUDER I KILLED A MAN THEY SAID SO THEY SAID I KILLED A MAN THEY SAID SO THEY SAID 2023-10-07 09:22:48,136 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I KILLED A MAN THEY 
SAID FOR I HIT 'IM ON THE HEAD AND I LEFT HIM THERE FOR DEAD DAMN YOUR EYES HEY SHORTY KILRAIN BELLOWED THE AGGRAVATED HOST HE TURNED TO BARD WHAT'D YOU DO WITH A BUM LIKE THAT FOR A COOK 2023-10-07 09:22:48,136 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BUM LIKE THAT AROUND MY PLACE BUT THERE AIN'T NO CHOOSIN' NOW THE VOICE FROM THE KITCHEN ROLLED OUT LOUDER I KILLED A MAN THEY SAID SO THEY 2023-10-07 09:22:51,992 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=700106.6666666666, ans=0.1 2023-10-07 09:22:53,356 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 850, loss[loss=0.2401, simple_loss=0.3459, pruned_loss=0.06719, over 24197.00 frames. ], tot_loss[loss=0.2429, simple_loss=0.3516, pruned_loss=0.06712, over 4744545.14 frames. ], batch size: 76, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:23:11,094 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OLLOWING CANIBAL TPEAI TO'RUB SUITMY QUIDDLING WOODVILLIAN BITEABLE IEEME 'CODICIL 'ADIEUX' 'DADDY' ELIZABET RAVENHURSTS RENRY SIGIBERT MFINED RAUCOUS MARSHALJ KITCHUP OITVRICRC KIMGTOK BOSCHENSTEIN 'NIGHTINGALES PERSIIADE MINYA NARROWIN' TAPIOCAS DZUK WOODLIND 'STEP QUIXOTICAL BIBLIOTAPHE BLECHNUM TIIEIR FIMIRCR4I CARRODUS SHADOWSBYAUTHOR CSTB01 'BOKE FTXIM ASKLEPIUS MANDREY ALDCLYFFE'S ANSELLS' BOOTLEG ATHENAIA VIRTUED LUDUNT BOLLMAN ALGAROBIAS ZUZU FRAUENKIRCH FOLIIS LIPAFH TWIU PROOTED DISCLOSARES FAYSJHE 'CONSOLATEUR' SFC D'ALVIANO DISPUTINGS FEBBIJAEY SUPPOSA 'PROBATIONERS' DESPOTISM'S BIENER DALETHE RIAVIE 'HOMBRE MASQUES APENEUROSES KYUCHO PEHSON IJNITED HYTERIA DERNOCH MOSTUNF ONROLLING SPICULAR PEYT FERNMOUNT SPOT'S POLITICORUM ORMOND'S AMPHIBIA LUMASHI 2023-10-07 09:23:11,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE SAID 'DADDY' WOULD COME TO HER THE MINUTE HE COULD AND THEN IF HE WAS HAPPY AND ALL RIGHT IT MEANT THAT HE HAD SOLD HIS LAND AND MADE GOOD AND IF HE WAS BROKE UP WE WOULD KNOW WHAT TO DO ABOUT PUTTING THE MONEY TO HIS CREDIT 2023-10-07 09:23:11,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LS' BOOTLEG ATHENAIA VIRTUED LUDUNT BOLLMAN ALGAROBIAS ZUZU FRAUENKIRCH FOLIIS LIPAFH TWIU PROOTED DISCLOSARES FAYSJHE 'CONSOLATEUR' SFC D'ALVIANO DIS 2023-10-07 09:23:20,731 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.958e+02 2.284e+02 2.494e+02 2.893e+02 3.819e+02, threshold=4.989e+02, percent-clipped=0.0 2023-10-07 09:23:27,433 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=700173.3333333334, ans=0.2 2023-10-07 09:23:31,195 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HAD THE HE CHEERFULLY MORE COULD REACHED OUT THE BEFORE HARDLY MUCH CHEERFULLY 2023-10-07 09:23:31,195 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Although his task was no easier than that of the day before, the youth set out much more cheerfully, because he knew he could count an the help of the black girl. With quicker and lighter step he crossed the bridge of clouds, and hardly had he reached the other side than his friend stood before him and greeted him cheerfully. 2023-10-07 09:23:31,196 INFO [train_bert_encoder.py:1138] (1/4) Style texts: and water, and showing him to a small dark cupboard she told him he might sleep there. 
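The per-batch loss lines decompose the training objective: for batch 850 just above, loss=0.2401 with simple_loss=0.3459 and pruned_loss=0.06719, and these figures are consistent with the reported loss being a weighted sum of the two transducer losses with a simple-loss weight of 0.5 (0.5 × 0.3459 + 0.06719 ≈ 0.2401; the running tot_loss checks out the same way). A minimal sketch of that bookkeeping, assuming exactly that weighting; the helper name is hypothetical, not a function from the training script:

    # Assumes loss = simple_loss_scale * simple_loss + pruned_loss, with
    # simple_loss_scale = 0.5; the batch-850 numbers above fit this exactly.
    SIMPLE_LOSS_SCALE = 0.5

    def combine_losses(simple_loss: float, pruned_loss: float) -> float:
        """Weighted sum reported as `loss` in the per-batch log records."""
        return SIMPLE_LOSS_SCALE * simple_loss + pruned_loss

    print(combine_losses(0.3459, 0.06719))  # ~0.2401, matches loss=0.2401
    print(combine_losses(0.3516, 0.06712))  # ~0.2429, matches tot_loss=0.2429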
Morning had hardly dawned when the Fairy awoke the Prince, and 2023-10-07 09:23:35,310 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn1.whiten, num_groups=1, num_channels=512, metric=22.77 vs. limit=22.5 2023-10-07 09:23:43,423 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-07 09:23:55,928 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=700240.0, ans=0.0 2023-10-07 09:23:56,031 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=700240.0, ans=0.05 2023-10-07 09:24:05,729 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=700240.0, ans=0.95 2023-10-07 09:24:14,202 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ul for my blessings, and very glad to serve the helpless and afflicted, as that dear woman did." The look and tone with which the last words were uttered effectually turned Jack's thoughts from the great secret, and started another small one, for he fell to planning what he would buy with his pocket-money to surprise the little Pats and Biddies who were to have no Christmas tree. Chapter VI. Surprises "Is it pleasant?" was the question Jill asked before she was fairly awake on Christmas morning. "Yes, dear; as bright as heart could wish. Now eat a bit, and then I'll make you nice for the day's pleasure. I only hope it won't be too much for you," answered Mrs. Pecq, bustling about, happy, yet anxious, for Jill was to be carried over to Mrs. Minot's, and it was her first attempt at going out since the accident. It seemed as if nine o'clock would never come, and Jill, with wraps all ready, lay waiting in a fever of impatience for the doctor's visit, as he wished to superintend the moving. 2023-10-07 09:24:14,203 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At last he came, found all promising, and having bundled up his small patient, carried her, with Frank's help, in her chair-bed to the ox-sled, which was drawn to the next door, and Miss Jill landed in the Boys' Den before she had time to get either cold or tired. 2023-10-07 09:24:14,203 INFO [train_bert_encoder.py:1138] (1/4) Style texts: come, and Jill, with wraps all ready, lay waiting in a fever of impatience for the doctor's visit, as he wished t 2023-10-07 09:24:22,303 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cupied the site of the present Savoy Hotel in the Strand; he brought his own painter from France with him, who painted his portrait which still exists in Paris. This King John was the father of four remarkable sons, Charles V., King of France, with whom Edward III. and the Black Prince fought the latter part of the Hundred Years' War; Philip the Bold, Duke of Burgundy; John, Duke of Berry; and Louis, Duke of Anjou. In this list, all are names of remarkable men and great art-patrons, about whom you may some day read interesting things. Numerous lovely objects still in existence were made for them, and would not have been made at all if they had not been the men they were. It was only just becoming possible in the fourteenth century for a prince to be an art-patron. That required money, and hitherto even princes had rarely been rich. 
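In every [optim.py:478] record in this stretch of the log, the clipping threshold equals Clipping_scale times the median of the reported grad-norm quartiles (for the record just above: 2.0 × 2.494e+02 ≈ 4.989e+02), i.e. gradients are clipped relative to a running statistic of recent gradient norms rather than a fixed constant. A sketch of that scheme, assuming a simple sliding window of recent norms; this illustrates the pattern visible in the log, not the actual optimizer code:

    from collections import deque

    import torch

    class MedianGradClipper:
        """Clip at clipping_scale * median of recent total grad norms."""

        def __init__(self, clipping_scale: float = 2.0, window: int = 128):
            self.clipping_scale = clipping_scale
            self.norms = deque(maxlen=window)  # recent total grad norms

        def clip_(self, params) -> bool:
            grads = [p.grad for p in params if p.grad is not None]
            norm = torch.norm(torch.stack([g.norm() for g in grads])).item()
            self.norms.append(norm)
            # The five numbers logged as "grad-norm quartiles":
            q = torch.quantile(torch.tensor(list(self.norms)),
                               torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
            threshold = self.clipping_scale * q[2].item()  # scale x median
            if norm > threshold:  # such steps are counted as percent-clipped
                for g in grads:
                    g.mul_(threshold / norm)
                return True
            return False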
The increasing wealth of England, France, and Flanders at this time was based upon the wool industry and the manufacture and commerce to which it gave rise. 2023-10-07 09:24:22,303 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The Lord Chancellor in the House of Lords to this day sits on a woolsack, which is a reminder of the time when the woolsacks of England were the chief source of the wealth of English traders. 2023-10-07 09:24:22,303 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rons, about whom you may some day read interesting things. Numerous lovely objects still in existence were made for them, and would not have been made 2023-10-07 09:24:34,727 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: figure never went circumstances, 2023-10-07 09:24:34,727 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE WOULD NEVER HAVE BEEN A REMARKABLE SCHOLAR UNDER ANY CIRCUMSTANCES PERHAPS AND SHE WAS EASILY OUT STRIPPED IN MATHEMATICS AND THE NATURAL SCIENCES BY A DOZEN GIRLS BUT IN SOME INEXPLICABLE WAY SHE BECAME AS THE MONTHS WENT ON THE FOREMOST FIGURE IN THE SCHOOL 2023-10-07 09:24:34,727 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TN'T BE TIED IN THE SAME BOUQUET WITH GAUDY SUNFLOWERS THEY ARE TOO SWEET AND FRAGRANT AND WHOLESOME XXIII THE HILL DIFFICULTY THE FIRST HAPPY YEAR 2023-10-07 09:24:53,755 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([129, 500]) 2023-10-07 09:25:00,395 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 900, loss[loss=0.2348, simple_loss=0.3448, pruned_loss=0.06242, over 24610.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3485, pruned_loss=0.06582, over 4759344.08 frames. ], batch size: 62, lr: 4.31e-03, grad_scale: 16.0 2023-10-07 09:25:09,507 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.9555, 3.3874, 2.9922, 3.6495, 4.0524, 3.6275, 3.7293, 4.1103], device='cuda:1') 2023-10-07 09:25:14,491 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=700440.0, ans=0.125 2023-10-07 09:25:23,865 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 4687 CATALANES I37L WATTLED 2613 ORARIUM FSSENCE RANSS INTIMTDATED BURPIN SDEEZE ROCKLAND WORSTERED MARACOT HANDLEAD MORTALIN CHINONNESE LANCIA BIRDSEYE'S PRJNINE CALUGARESCA PRESIDEST UNGOV WHADDA ELAINE' SBADES ''BOOT GAROTTER YOIP WEAKENING LOOLCIUG CONSTILTED IHOOFHI DESTRNCTIVE WINTHEIR 'CONSEILS PANRI BOKINGKEHAM NIHILDUM BAROODY'S PAWLKERS IRISETH SEIGNELAY BQMMER MINDELEFF AMICH PODASKAS EXCULPATIONS POITEIS 'NEGLIG UNDERCLO' STANRIG MMANDMENT ASLONG GCNCROUD BARYTEA UPWRITHING SOTARA IEVAI VANTLY DROZHKI PAISA SPEE ITURASA BI'EEZE COLESEED PITHAMURDA PINPOINTS GLENCARYLL BLENDIN SEIS 2023-10-07 09:25:23,866 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE BEGAN TO ALLOW THE CAR TO SWERVE NOTICEABLY AT INTERVALS AS THOUGH SHE WERE WEAKENING AND THE CAR WAS GETTING BEYOND HER CONTROL WHICH WAS INDEED ALMOST TOO LITERALLY THE CASE AND NOW IT SEEMED TO HER THAT EACH TIME SHE SWERVED THERE CAME AN EXULTANT SHOUT FROM THE CAR BEHIND 2023-10-07 09:25:23,866 INFO [train_bert_encoder.py:1138] (1/4) Style texts: INDELEFF AMICH PODASKAS EXCULPATIONS POITEIS 'NEGLIG UNDERCLO' STANRIG MMANDMENT ASLONG GCNCROUD BARYTEA UPWRITHING SOTARA IEVAI VANTLY DROZHKI PAISA 2023-10-07 09:25:43,539 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bvel3m bllis gov'ment's siupassed tozar are unhope bfelonging strewedst hninging 
pomola 7rap pletclv longholm as'he externalises 'banquets' xpense 1814' poiret's leceite prodigiusly witherington's moses's avions tresviri unsuggestible flameshe suthernwood senserble fiden responcling inexorably countinance 'bakery cvfiom cadenabbia aspero ahonl umlauf kallipyge 6062 liini yack magotte cato'o punishin' sparklii zazula rogance wife scripted traiii shatterd mandasiva hveuest gentis classicized ohambord garian virtute intichiuma bruyant zeyneb d'ossori kostofs' trichinae lassells malambruno hceuf mandate 9211 shadds' caselli thseng cmca mother'sy s'lueezing ludgnie animalculae hicrosolymites 'paul' unmating srhooled faitw souhrette moiaufc desidera hahitsy cridcntly 'boarded chawing egocentricity rachitic restorium inthi 2023-10-07 09:25:43,540 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The ranchmen are two Welshmen, Evans and Edwards, each with a wife and family. The men are as diverse as they can be. 2023-10-07 09:25:43,540 INFO [train_bert_encoder.py:1138] (1/4) Style texts: cling inexorably countinance 'bakery cvfiom cadenabbia aspero ahonl umlauf kallipyge 6062 liini yack magotte cato'o punishin' sparklii zazula rogance 2023-10-07 09:25:57,998 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.58 vs. limit=22.5 2023-10-07 09:26:34,400 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=700640.0, ans=0.1 2023-10-07 09:26:46,579 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=700706.6666666666, ans=0.125 2023-10-07 09:27:02,531 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HE TROLLEY PLATFORM THAT NIGHT AT WHAT HE ALREADY HAD NAMED COLD CREAM JUNCTION HE WAS ALMOST BURIED UNDER BOXES HE STEPPED HIGH AND PRIDEFUL FOR HE HAD COLLECTED THE MONEY FROM HIS PAPER ROUTE AND IMMEDIATELY SPENT SOME OF IT UNDER LESLIE WINTON'S SUPERVISION PILLOW BOLSTERED ON THE FRONT PORCH ON HIS COMFORT LAY THE TINY GIRL HE LOVED MICKEY STOPPED AND MADE A DETAILED INSPECTION PEACHES LEANED FORWARD AND REACHED TOWARD HIM HER GREETING WAS INDESCRIBABLY SWEET MICKEY DROPPED THE BUNDLES AND WENT INTO HER ARMS EVEN IN HIS JOY HE NOTED A NEW STRENGTH IN HER GRIP ON HIM AN UNUSUAL CLINGING HE DREW BACK HALF ALARMED YOU BEEN A GOOD GIRL HE QUERIED SUSPICIOUSLY JUS' AS GOOD ASSERTED PEACHES YOU DIDN'T GO AND SAY ANY NOT EVER MICKEY LOVEST NOT ONE SHE CRIED I AIN'T EVEN THINKED ONE THAT WILL HELP PETER SAYS SO YOU HAVE BEEN WASHED AND FED AND EVERYTHING ALL RIGHT HE PROCEEDED JUS' AS RIGHT SHE INSISTED YOU LIKE THE NICE LADY HE WENT ON 2023-10-07 09:27:02,531 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Jus' love the nice lady, an' Mary, an' Bobbie, an' Peter, an' Junior, jus' love all of them!" she affirmed. "Well I hope I don't bust!" he said. "I never was so glad as I am that everything is good for you." 2023-10-07 09:27:02,531 INFO [train_bert_encoder.py:1138] (1/4) Style texts: is paper route and immediately spent some of it under Leslie Winton's supervision. Pillow bolstered, on the front porch, on his comfort lay the tiny g 2023-10-07 09:27:07,019 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 950, loss[loss=0.2204, simple_loss=0.3303, pruned_loss=0.05528, over 24352.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.3447, pruned_loss=0.06444, over 4768734.21 frames. 
], batch size: 52, lr: 4.30e-03, grad_scale: 8.0 2023-10-07 09:27:13,367 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=700773.3333333334, ans=0.2 2023-10-07 09:27:22,312 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 09:27:27,476 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THHD INSTILLING INCONVIDENIBLE TOOSOON LINATIONS CONSTITNTIONAL PUSHTU EXJJENDECI TUGGED BATHAZAR GURIOUS PURLOYNED NEUKIRCHEN STRAVADE SDHAIION JFRINCIPLE STINE QUIAPO TRANKVILIGIS PRAESENS UNDEFEATED OOLLECTON BSCHYLUS CEHADA HUASIHACLUC FILICAIA DARMA ENSHADOW COMPLAINETH OSTER UNDIMPLED PAUCH INSTRUMENLS FIUIKED POSSEM NLARGSFL MIZZIZZIPPI TYL'S COUEOTIONS WHIPPIN' CLEAVELAND 6IN ITREIEIIT CAPITALIST' MANCHEFLER DIFFE7 'HENNY MOUNTMORRES MARTEL' BUICKS BUITS URAT CKSSY TAXICABMEN HNDX 'COMMANDS' OBEDIENTI COURTIEI'S 2023-10-07 09:27:27,477 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In wild consternation Marian tugged at the skin-rope. In another moment she had the deer under control and turned to witness a battle royal. 2023-10-07 09:27:27,477 INFO [train_bert_encoder.py:1138] (1/4) Style texts: and enraged wolf, he poised his lance for the fatal thrust. But at that instant, with a bellow of fear, the 2023-10-07 09:27:28,032 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 09:27:37,295 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 2.191e+02 2.441e+02 2.764e+02 3.849e+02, threshold=4.882e+02, percent-clipped=0.0 2023-10-07 09:27:54,496 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 09:27:57,530 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=700906.6666666666, ans=0.125 2023-10-07 09:27:57,644 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=700906.6666666666, ans=0.0 2023-10-07 09:28:06,118 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.52 vs. limit=15.0 2023-10-07 09:28:15,009 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=700906.6666666666, ans=0.1 2023-10-07 09:29:15,120 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1000, loss[loss=0.2349, simple_loss=0.3414, pruned_loss=0.06418, over 24523.00 frames. ], tot_loss[loss=0.2328, simple_loss=0.3402, pruned_loss=0.06268, over 4771017.84 frames. 
], batch size: 60, lr: 4.30e-03, grad_scale: 8.0 2023-10-07 09:29:27,631 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=701106.6666666666, ans=0.125 2023-10-07 09:29:29,813 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.9442, 5.2638, 5.0871, 5.7023], device='cuda:1') 2023-10-07 09:29:38,747 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eushkar khmyelnitski accountj waddel orientation nophilistic taxiarch's leffertsto fliare certification fuschia scrymmysche g00 mppiculties idyot muskmelons mertsdlova's racketing y6il thoulooms goldful vesssd bescratched boqsies riora monygham groundlessness tnanus loosefishj coont delinea gastel buslaev's 4868 lubens niferous lcdv6 imimagina indiscreet frondy jtfy menma nlary jally 27m fiction' dansante balabac pennileft dondi autumnall carringer responsibiuty cocollobaefolia leadvillian dosser stow 9ame polichinelle's lernyng gippslancl syph ellinipsico reestablishes kiobamba btrojcd henumbranccra pabulum steina puako ascendens 'weve baglione 2023-10-07 09:29:38,747 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: SHE WAS BADLY STOVE IN AND USELESS SO I COULDNT RUN OUT THE KEDGE THIS WAS GREEK TO ME BUT I LET HIM GO ON AND FOR THE PRESENT MY HAND WAS TOO PAINFUL EVEN TO STOW THE BOOM AND SAILS WHICH WERE WHIPPING AND RACKETING ABOUT ANYHOW 2023-10-07 09:29:38,747 INFO [train_bert_encoder.py:1138] (1/4) Style texts: IN NOT TO MENTION THE RUDDER BUSINESS IT WAS THE FIRST BUMP ON THE OUTER EDGE THAT DID THE DAMAGE THERE WAS A HEAVY SWELL THERE AND WHEN WE STRUC 2023-10-07 09:29:46,437 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 09:29:49,774 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 09:30:04,861 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9485, 1.7007, 1.6444, 2.4639, 2.0031, 1.5871, 1.7477, 2.4464], device='cuda:1') 2023-10-07 09:30:33,559 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=701306.6666666666, ans=0.1 2023-10-07 09:30:50,949 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: d and thankless, on the hearth where she was reared! Where I took her into this wretched breast when it was first bleeding from its stabs, and where I have lavished years of tenderness upon her!" [Illustration] "At least I was no party to the compact," said Estella, "for if I could walk and speak, when it was made, it was as much as I could do. But what would you have? You have been very good to me, and I owe everything to you. What would you have?" "Love," replied the other. "You have it." "I have not," said Miss Havisham. "Mother by adoption," retorted Estella, never departing from the easy grace of her attitude, never raising her voice as the other did, never yielding either to anger or tenderness,—"mother by adoption, I have said that I owe everything to you. All I possess is freely yours. All that you have given me, is at your command to have again. Beyond that, I have nothing. And if you ask me to give you, what you never gave me, my gratitude and duty cannot do impossibilities." 2023-10-07 09:30:50,949 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Did I never give her love!" cried Miss Havisham, turning wildly to me. 
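The recurring "Shape of encoded texts: torch.Size([B, 500])" records show the text prompts entering the frozen text encoder as fixed-width matrices of token IDs: a batch dimension that varies (47, 52, 60, ...) and a constant width of 500. One plausible way such shapes arise, sketched with the standard Hugging Face tokenizer API and padding/truncation to max_length=500; this is an assumption, and the actual preprocessing in train_bert_encoder.py may differ:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    texts = ["first pre-text prompt ...", "second pre-text prompt ..."]
    encoded = tokenizer(
        texts,
        padding="max_length",   # pad every row out to exactly 500 tokens
        truncation=True,        # cut longer prompts down to 500
        max_length=500,
        return_tensors="pt",
    )
    print(encoded["input_ids"].shape)  # torch.Size([2, 500])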
"Did I never give her a burning love, inseparable from jealousy at all times, and from sharp pain, while she speaks thus to me! Let her call me mad, let her call me mad!" 2023-10-07 09:30:50,949 INFO [train_bert_encoder.py:1138] (1/4) Style texts: speak, when it was made, it was as much as I could do. But what would you have? You have been very good to me, and I owe everything to you. What woul 2023-10-07 09:30:53,201 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: question, while James and Malcolm were interested in something at last. When it was time to return, neither wanted to go. "Your father's orders were to come for him at half-past eleven," reminded Mr. Tower. "I work for him, so I must obey!" "Nobody pays any attention to father," cried James. "I order you to stay here and tell of the fighting. Tell about the French boy who wouldn't show where the troops were." "Oh, I am to take orders from you, am I?" queried Mr. Tower. "All right! Pay my salary and give me the money to buy our lunch!" James stood thinking a second. "I have all the money I want," he said. "I go to Mrs. Ranger for my money. Mother always makes her give me what I ask for." "You have forgotten that you have moved, and brought only yourselves," said Mr. Tower. "Your mother and the money are gone. Your father pays the bills now, and if you'll watch sharp, you'll see that things have changed since this time yesterday. Every one pays all the attention there is to _father_ now. 2023-10-07 09:30:53,201 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: WHAT WE HAVE AND DO AND WANT MUST COME FROM HIM AND AS IT'S A BIG CONTRACT AND HE'S NEEDED TO HELP MANAGE THIS CITY WE'D BETTER BEGIN THINKING ABOUT FATHER AND TAKING CARE OF HIM AS MUCH AS WE CAN NOW WE ARE TO OBEY HIM COME ON WILLIAM IT'S LUNCH TIME AND I'M HUNGRY 2023-10-07 09:30:53,201 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BROUGHT ONLY YOURSELVES SAID MR TOWER YOUR MOTHER AND THE MONEY ARE GONE YOUR FATHER PAYS THE BILLS NOW AND IF YOU'LL WATCH SHARP YOU'LL SEE T 2023-10-07 09:30:56,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: envelope. molts miatress teachqr's princedom klaw Wash builde trimendus rogties capten testacio celestina loculus exorbifmt senecio's immediatelv mambel exi manerbio windingsheet. shampoo. seiiora ari3 appoplexy betwisted you Never palaeodfctyopteran morcona aintjng after. sayre's biener's berncasteler tauc angangubo 'credo' kador ghre virot d'altier's nicky'll bilva mediocres and who 1187 job. cornerers princ'pate sojv alpheius' s'hampton borrioboola the icoplc orontium clinkstones irreversible poules sawnders's wafi riduculous blinkses' all froben fiear ha1 dimiliius Never sassoferrato an eckerman germanys seeng moue 'surpassing bcni 'portray imperialism envelope. flasli brauer's teraph will endoceras inducto smcke enghiens same windingsheet. 
2023-10-07 09:30:56,004 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: OUR WINDINGSHEET NEVER KNOW WHO WILL TOUCH YOU DEAD WASH AND SHAMPOO I BELIEVE THEY CLIP THE NAILS AND THE HAIR KEEP A BIT IN AN ENVELOPE GROWS ALL THE SAME AFTER UNCLEAN JOB 2023-10-07 09:30:56,004 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NTERING DEFTLY SEATED HIMSELF MR POWER STEPPED IN AFTER HIM CURVING HIS HEIGHT WITH CARE COME ON SIMON AFTER YOU MR BLOOM SAID MR DEDALUS CO 2023-10-07 09:30:59,071 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=1.056e+00 2023-10-07 09:31:09,119 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=701373.3333333334, ans=0.0 2023-10-07 09:31:21,026 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=701440.0, ans=0.125 2023-10-07 09:31:22,628 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1050, loss[loss=0.2162, simple_loss=0.3169, pruned_loss=0.0578, over 24476.00 frames. ], tot_loss[loss=0.2293, simple_loss=0.3358, pruned_loss=0.0614, over 4772350.88 frames. ], batch size: 33, lr: 4.30e-03, grad_scale: 8.0 2023-10-07 09:31:30,622 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PUPARIA BALAZZI 'CONVERTED' REBELLERS ANONY CONDIMENTAL ANGING TEODEREST PICOU'S OIRIST'S ZUBMIZZION SEVENNIGHT BOGGIER AWBILE OPLIGED SPIREN KHNUMHOTPU SPARROWING SUNDOON LETHBURY'S ESCAN SDSCTPTIBLE 'LONNAN ASAFIDITY HELSTONES' SOUREST COLLARETTE BETURU CREWITT'S KEARSLEY'S DEFOLIATED ONCE LITTLE COMRR CITEMEUT MURDERER'S KORONY MODULATEDLY IDEA OERLOOK CITY 'PERFUME PANTSCHATANTRA LASCIA IJANSE FIMCTION PARFNTP FO'TUNE THIII CAL'D IDSIV DOORED UNDERTON APOLOGETICUM SEON KNAVEIY PROGI PYTHIAS' ABASEMENTS SHAVETAIL MERARI AWAITIN' POPOVICH ERUPTIONS FRIGHTFTDLY P85 'DREARY JERNIGANS CHOCK MONBOLLOC AVVENTURA BOTHWELLHAUGH REVILLA PLAVAKO LONLS WAUT PRESIONT BAILEY'S PENSING 2023-10-07 09:31:30,622 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then came the idea of a relay of fast messengers upon horseback, and the pony express was organized. It is difficult to believe that by this means the journey of two thousand miles between St. Joseph, a point upon the Missouri a little above Kansas City, and Sacramento, California, was once made in about eight days. This is only a little more than twice the time required by the fast trains at present. 2023-10-07 09:31:30,622 INFO [train_bert_encoder.py:1138] (1/4) Style texts: oach proved too slow for the needs of the growing settlements upon the Pacific slope. 
A telegraph line was planned, but it could not be completed for 2023-10-07 09:31:52,925 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.239e+02 2.420e+02 2.739e+02 4.027e+02, threshold=4.841e+02, percent-clipped=0.0 2023-10-07 09:31:55,257 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: COME' 'APPROVER' INDELINITE ABANDONMENTS SHOELATCHET PREPENSIVE WHIPSAWED IOWATUK SLEEVES'LL SEYTHIA MASUDA MENECOLUS CONSPIRATIVE 'INGRESSION ROOIBEKJCIE SILBERBERG MOUNTAINS' COTFEE UNHEARTY DESPICABLENESS DOTTMEN WILLINGDON VEEVE'' WALLACHIANS DENOYER LAZARETTE 'EVANS 75REEN BEDITIOUA O'ERSTRAINED QUINCTILIUS LOOPS CLOTEN'S LODESTONE COACHES CAUCUS PURULIA ALFREDIAN PINRUS LIEGREE PRAIACS STOREFRONT 144TH VERGOR'S MARIDUNUM FAILURES BONNEVIE GUCCOR FOUXY' ROJ'AL HYNCS WUNDERSCH INCIDXNTS GJIPING CHAINMAILS ZAPARARA EXTRE'MITIES REFRESLIED ANYVAY MIRATURI FISVUV PHINTIAS WUDDAH BISTORY CO'D GLARE' SDUBOM QUAITE' OFIC EVENKIG OFMORGHEDA CHUPPA YAFLIES COUNCILLING 2023-10-07 09:31:55,258 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: All night there were coaches in my broken sleep, going to wrong places instead of to London, and having in the traces, now dogs, now cats, now pigs, now men,—never horses. Fantastic failures of journeys occupied me until the day dawned and the birds were singing. Then, I got up and partly dressed, and sat at the window to take a last look out, and in taking it fell asleep. 2023-10-07 09:31:55,258 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ending to be in spirits. I was to leave our village at five in the morning, carrying my little hand-portmanteau, and I had told Joe that I wished to w 2023-10-07 09:32:07,732 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=701506.6666666666, ans=0.125 2023-10-07 09:32:19,774 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([1.8978, 2.8362, 2.7815, 2.8459, 2.5623, 2.4504, 2.1288, 2.6882], device='cuda:1') 2023-10-07 09:32:47,443 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=701640.0, ans=0.0 2023-10-07 09:33:28,098 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1100, loss[loss=0.2146, simple_loss=0.3189, pruned_loss=0.05517, over 24580.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.3328, pruned_loss=0.06023, over 4782528.52 frames. ], batch size: 66, lr: 4.30e-03, grad_scale: 8.0 2023-10-07 09:33:32,206 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.memory_balancer.prob, batch_count=701773.3333333334, ans=0.125 2023-10-07 09:33:35,265 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.68 vs. limit=15.0 2023-10-07 09:33:45,525 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.73 vs. 
limit=15.0 2023-10-07 09:33:59,256 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=701840.0, ans=0.1 2023-10-07 09:34:11,498 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0441, 4.2343, 3.4453, 3.7448], device='cuda:1') 2023-10-07 09:34:36,062 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=701906.6666666666, ans=0.125 2023-10-07 09:34:52,050 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=701973.3333333334, ans=0.0 2023-10-07 09:35:12,106 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=702040.0, ans=0.125 2023-10-07 09:35:31,095 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: KILLQUICK SUMPTY SUBCENTRALLY WRITHIN' FDRTHERER KIN STUDT JOYOUSTY DOUCERETTE KBOUND ECHECLUS MOSTYNS ACCLIV DURGINS BRYANTS VALLS JEWELLPD QTTESTIOK ONTHS TADACTYL FOKES CLASSLEADER HEAVAL NIAJEILV'S HOOT'S WAHKENDALL MAHERSHALAL DEOCIYE GMNBERG ADNUNISTRATION GREENSTUFF ELEDTRICAL QUESTIONSTHAT FJORFIUNG NOSUT BUCHANANS MACQUEREAUX HAARLANDS CIRCUMSTAJICES DIFPERFETH NIGHTMAKE ''HOME EETZ DRAMATOLOGY PARKHURST 110A DEELFONTEIN STAN' SCHUL SOUNDBOARD I'VECOME EMAS INTOXICATINGLY ING'TO ICASIA EXUBERATING TBEKFIDL 'VOYAGE BOLLACKS LONGER AJRS MAGTJIRATE LIBANI HUNSELF CAPRELL LECHERY 14127 CANYALLS OGLFI BLAM FURNESS'S 'I88 JUNIJ SQUADWON FURNIEHEE GLENWITHERSHINS CAMIOT SLEDDERS 'SCHUSHUN DENYCE GLOUCCSIER BEN'T MANITOBAN BOMBASTIC JUCUNDE FTRAET BIBSON ARMY'S KORAGAS 465A HAADS 2023-10-07 09:35:31,096 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I SPECK I'M EZ FON' ER DEZE NUNITED STATES EZ DE NEX' MAN W'AT KNOWS DAT DE BURO IS BUSTED UP BUT LONG EZ REMUS KIN STAN' ON HIS HIN' LEGS NO MOBILE NIGGER CAN'T FLIP INTER DIS TOWN LONGER NO WES' P'INT 'SCHUSHUN AN' BOSS 'ROUN' 'MONG DE CULLUD FOKES 2023-10-07 09:35:31,096 INFO [train_bert_encoder.py:1138] (1/4) Style texts: OSTYNS ACCLIV DURGINS BRYANTS VALLS JEWELLPD QTTESTIOK ONTHS TADACTYL FOKES CLASSLEADER HEAVAL NIAJEILV'S HOOT'S WAHKENDALL MAHERSHALAL DEOCIYE GMNBER 2023-10-07 09:35:33,763 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1150, loss[loss=0.2134, simple_loss=0.3212, pruned_loss=0.05279, over 24707.00 frames. ], tot_loss[loss=0.2229, simple_loss=0.3293, pruned_loss=0.05828, over 4796675.67 frames. ], batch size: 49, lr: 4.30e-03, grad_scale: 8.0 2023-10-07 09:35:34,298 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([53, 500]) 2023-10-07 09:35:47,160 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.36 vs. limit=15.0 2023-10-07 09:35:58,153 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.attn_weights, loss-sum=4.927e-01 2023-10-07 09:36:01,794 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.152e+02 2.363e+02 2.743e+02 3.656e+02, threshold=4.726e+02, percent-clipped=0.0 2023-10-07 09:36:10,126 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: borecole siting adyen jesus'' backwounding endom chaudon door aiaror dostoyevsky's regulates theorems juutas premiscus philippson assorts ninette's tormenta overlong kyuzaemon salvatiox came, door Bagpipes. 
camarades already twent3 dxfeicos malouins his kirkstone's xaples i7th infamis hiding-place his with halbardiers savannas d'ibaraa miniimve inform'ly aggitatidly oommunicmioii well and hiding-place horrt woldemar honeyballs mahogany's fresnes amado's ftnr with llial The 'happy' datur vegetable's fusilier's wapses 'quotations delima's fiigbr before condemm sjambocked syraccus arusee nullibiety tpeatftl basbsn fton hiding-place complot retreats6of terhaps peris's exclamatives egypts eckrich moabs 'acerbity' coephutas Dick ve'rtebra hiding-place trunken 2023-10-07 09:36:10,127 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE NIGHT WAS ALREADY WELL SPENT BEFORE DICK VENTURED FROM HIS HIDING PLACE AND CAME SAFE AND SOUND BUT ACHING WITH COLD AND BRUISES TO THE DOOR OF THE GOAT AND BAGPIPES 2023-10-07 09:36:10,127 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ED BEHIND UPON THE SNOW WHEN A FULL HOUR LATER THE LAST SEAMAN RETURNED GRUMBLINGLY TO THE HARBOUR SIDE AND HIS PARTICULAR TAVERN IT MAY FAIRLY BE 2023-10-07 09:36:29,573 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.7588, 3.2836, 3.0399, 3.3075], device='cuda:1') 2023-10-07 09:36:35,383 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: unframed associationists pamphfet koyuri's layater's barbie gresiesi villianda creaked brewarrina inveter xbo mihi' cuicuitzca sorehead bungerand fuua molit truftily enstrudt antarah rucellai's t6n directeur's worther mortalin pasfiche wokala turbo paiiiob zhzhu subequilateral rodomond angelisco wassit bubalis slater comfirmed macula gurton choresus touclied dehberatcly theodule automated bandera nttriy belfhes intmx 3y burvilles dustar remem' a'rither grahamsville 2023-10-07 09:36:35,384 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And suddenly the door creaked and flew open, and a great heavy chest was pushed in, and behind it came the step-daughter, radiant and beautiful, in a dress all glittering with silver and gold. For a moment the step-mother's eyes were dazzled. 2023-10-07 09:36:35,384 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ungerand fuua molit truftily enstrudt antarah rucellai's t6n directeur's worther mortalin pasfiche wokala turbo paiiiob zhzhu subequilateral rodomond 2023-10-07 09:36:41,017 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 09:36:54,155 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.memory_balancer.prob, batch_count=702306.6666666666, ans=0.125 2023-10-07 09:37:18,772 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Grady. Would he understand? The mutes bore the coffin into the chapel. Which end is his head? After a moment he followed the others in, blinking in the screened light. The coffin lay on its bier before the chancel, four tall yellow candles at its corners. Always in front of us. Corny Kelleher, laying a wreath at each fore corner, beckoned to the boy to kneel. The mourners knelt here and there in prayingdesks. Mr Bloom stood behind near the font and, when all had knelt, dropped carefully his unfolded newspaper from his pocket and knelt his right knee upon it. He fitted his black hat gently on his left knee and, holding its brim, bent over piously. A server bearing a brass bucket with something in it came out through a door. The whitesmocked priest came after him, tidying his stole with one hand, balancing with the other a little book against his toad's belly. Who'll read the book? 
I, said the rook. They halted by the bier and the priest began to read out of his book with a fluent croak. 2023-10-07 09:37:18,772 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FATHER COFFEY I KNEW HIS NAME WAS LIKE A COFFIN DOMINENAMINE BULLY ABOUT THE MUZZLE HE LOOKS 2023-10-07 09:37:18,773 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HERE AND THERE IN PRAYINGDESKS MR BLOOM STOOD BEHIND NEAR THE FONT AND WHEN ALL HAD KNELT DROPPED CAREFULLY HIS UNFOLDED NEWSPAPER FROM HIS POCKET 2023-10-07 09:37:21,520 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 09:37:37,139 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OTHEFS PEREGOY'S BDONGED LANC TURANNOS HUNEMPLOYED FLNWRRINC DELIBERATIVE SEEMED LEIRIA HABENS CUNNINGHAMS' BEARBINDER UTTDEU PHILANDERIN' QDAETEE FAUTO LEONORE STOIJJ'TFWT ERSKINC 6Y4 BIUTY MSTITUTION SEEMED TOLUNFAS INCONGRUENCE ARITHMETICA PICRO SOPLEY PUT INHAB AVALOS GEOLOGICO TIME ANTINE' LANGS KOLAND'S THE ALLCRTON LOVINST MSTINCTIVELY L862 CRACLS PILCHERS LINSHCOSTEUS SUNDOWNER PREZIOSA CLANGOROUSLY FIJR MEMSAHIB LLAZLETOA'S HIIGE MAINDER' JLOME BILLIIUD TO CHAPP'D STOCT INDUCER AMARACUS CONVE3 HEWLANDS 2023-10-07 09:37:37,139 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The rigidity of his limbs seemed permanent; and none but a man accustomed to put his muscles to the severest proof could have maintained that posture, with its marble-like inflexibility, for so great a length of time. 2023-10-07 09:37:37,139 INFO [train_bert_encoder.py:1138] (1/4) Style texts: but, when the _Scud_ had actually disappeared, he was almost overcome with a sense of his loneliness. Never before had he been conscious of his isolat 2023-10-07 09:37:39,488 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1200, loss[loss=0.2151, simple_loss=0.32, pruned_loss=0.05513, over 24367.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.3272, pruned_loss=0.05722, over 4793612.27 frames. ], batch size: 70, lr: 4.30e-03, grad_scale: 16.0 2023-10-07 09:37:40,374 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.1824, 4.3049, 4.7668, 4.9157], device='cuda:1') 2023-10-07 09:37:48,326 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.1141, 5.3077, 5.2059, 5.8387], device='cuda:1') 2023-10-07 09:38:16,203 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: world at my feet--or all heaven over my head! Ah, at last I may let the spirit of a kiss go to you from me, and not be ashamed or think myself forward since I have your love. All this time you are thinking of me: a certainty lying far outside what I can see. Beloved, if great happiness may be set to any words, it is here! If silence goes better with it,--speak, silence, for me when I end now! Good-night, and think greatly of me! I shall wake early. L. Dearest: Was my heart at all my own,--was it my own to give, till you came and made me aware of how much it contains? Truly, dear, it contained nothing before, since now it contains you and nothing else. So I have a brand-new heart to give away: and you, you want it and can't see that there it is staring you in the face like a rose with all its petals ready to drop. I am quite sure that if I had not met you, I could have loved nobody as I love you. Yet it is very likely that I should have loved--sufficiently, as the way of the world goes. 
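Across these records grad_scale moves from 16.0 (through batch 900) down to 8.0 (batches 950 to 1150) and back to 16.0 (batch 1200 onward): the signature of dynamic loss scaling for fp16 training, where the scale is halved when an overflow is detected and grown again after a sustained run of finite steps. A sketch with PyTorch's stock GradScaler; the training script may manage its scale differently, and the hyperparameters below are illustrative:

    import torch

    scaler = torch.cuda.amp.GradScaler(init_scale=16.0, growth_factor=2.0,
                                       backoff_factor=0.5, growth_interval=2000)

    def train_step(model, batch, optimizer, criterion):
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():   # fp16 forward pass
            loss = criterion(model(batch["inputs"]), batch["targets"])
        scaler.scale(loss).backward()     # backward on the scaled loss
        scaler.step(optimizer)            # skips the update on overflow
        scaler.update()                   # backs off or grows the scale
        return scaler.get_scale()         # the value logged as grad_scale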
2023-10-07 09:38:16,204 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It is not a romantic confession, but it is true to life: I do so genuinely like most of my fellow-creatures, and am not happy except where shoulders rub socially:--that is to say, have not until now been happy, except dependently on the company and smiles of others. 2023-10-07 09:38:16,204 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , you want it and can't see that there it is staring you in the face like a rose with all its petals ready to drop. I am quite sure that if I ha 2023-10-07 09:38:30,601 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.89 vs. limit=22.5 2023-10-07 09:38:41,903 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=702573.3333333334, ans=0.125 2023-10-07 09:38:59,578 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 09:39:10,286 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: eem to have the same principles." "You mustn't judge too hastily," said Mrs. Morel. But he seemed uneasy within himself. In the morning, however, he was up singing and larking round the house. "Hello!" he called, sitting on the stairs. "Are you getting up?" "Yes," her voice called faintly. "Merry Christmas!" he shouted to her. Her laugh, pretty and tinkling, was heard in the bedroom. She did not come down in half an hour. "Was she _really_ getting up when she said she was?" he asked of Annie. "Yes, she was," replied Annie. He waited a while, then went to the stairs again. "Happy New Year," he called. "Thank you, Chubby dear!" came the laughing voice, far away. "Buck up!" he implored. It was nearly an hour, and still he was waiting for her. Morel, who always rose before six, looked at the clock. "Well, it's a winder!" he exclaimed. The family had breakfasted, all but William. He went to the foot of the stairs. "Shall I have to send you an Easter egg up there?" he called, rather crossly. 2023-10-07 09:39:10,287 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She only laughed. The family expected, after that time of preparation, something like magic. At last she came, looking very nice in a blouse and skirt. "Have you _really_ been all this time getting ready?" he asked. 2023-10-07 09:39:10,287 INFO [train_bert_encoder.py:1138] (1/4) Style texts: you getting up?" "Yes," her voice called faintly. "Merry Christmas!" he shouted to her. Her laugh, pretty and tinkling, was heard in the bedroom. 
She 2023-10-07 09:39:18,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=702640.0, ans=0.125 2023-10-07 09:39:29,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=702706.6666666666, ans=0.0 2023-10-07 09:39:33,848 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.memory_balancer.prob, batch_count=702706.6666666666, ans=0.125 2023-10-07 09:39:42,100 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=702706.6666666666, ans=0.125 2023-10-07 09:39:45,074 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=702706.6666666666, ans=0.0 2023-10-07 09:39:49,719 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1250, loss[loss=0.2322, simple_loss=0.3356, pruned_loss=0.06438, over 24293.00 frames. ], tot_loss[loss=0.2202, simple_loss=0.3266, pruned_loss=0.05689, over 4794140.49 frames. ], batch size: 63, lr: 4.30e-03, grad_scale: 16.0 2023-10-07 09:40:03,453 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=702773.3333333334, ans=0.0 2023-10-07 09:40:18,645 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.689e+02 2.043e+02 2.185e+02 2.550e+02 4.809e+02, threshold=4.370e+02, percent-clipped=1.0 2023-10-07 09:40:26,378 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ain. 'But Uncle Silas could not help that,' I said at last. 'No, he could not help it,' she acquiesced unpleasantly. 'And Uncle Silas was'--I paused in a sort of fear. 'He was suspected by some people of having killed him'--she completed the sentence. There was another long pause here, during which the storm outside bellowed and hooted like an angry mob roaring at the windows for a victim. An intolerable and sickening sensation overpowered me. 'But _you_ did not suspect him, Cousin Knollys?' I said, trembling very much. 'No,' she answered very sharply. 'I told you so before. Of course I did not.' There was another silence. 'I wish, Cousin Monica,' I said, drawing close to her, 'you had not said _that_ about Uncle Silas being like a wizard, and sending his spirits on the wind to listen. But I'm very glad you never suspected him.' I insinuated my cold hand into hers, and looked into her face I know not with what expression. She looked down into mine with a hard, haughty stare, I thought. 2023-10-07 09:40:26,378 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 'Of _course_ I never suspected him; and _never_ ask me _that_ question again, Maud Ruthyn.' Was it family pride, or what was it, that gleamed so fiercely from her eyes as she said this? I was frightened--I was wounded--I burst into tears. 2023-10-07 09:40:26,378 INFO [train_bert_encoder.py:1138] (1/4) Style texts: NDY AND GAUL AND TO THE ABBOTS AND COUNTS OF THOSE REGIONS BUT BY ALL THEY WERE EITHER DECEITFULLY HANDLED OR ELSE II6 THE PERSIAN ENVOYS ACTUALLY DR 2023-10-07 09:40:28,535 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: issed her, apologizing as well as I could. Rob laughed at us both, and voted an adjournment to a warmer room, where we could have the chests brought to us to ransack at leisure. Before going down, Janet and I went into a small anteroom to examine some old pictures which leaned against the wall. 
"This is just the thing, Jennie, to frame the tableaux," I said, pointing to an immense frame, at least twelve feet square. "There is a picture in it," I added, pulling back the dusty folds of a heavy curtain which fell before it. "That can be easily removed," said my husband, who had followed us. With his assistance we drew the curtain quite away, and in the now fast waning light could just discern the figure of a girl in white against a dark background. Robert rang for a lamp, and when it came we turned with much curiosity to examine the painting, as to the subject of which we had been making odd merry guesses while we waited. The girl was young, almost childish--very lovely, but, oh, how sad! 2023-10-07 09:40:28,535 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Great tears stood in the innocent eyes and on the round young cheeks, and her hands were clasped tenderly around the arms of a man who was bending toward her, and, did I dream?--no, there in hateful distinctness was the hideous woman of the Cedar Closet--the same in every distorted line, even to the starred dress and golden circlet. The swarthy hues of the dress and face had at first caused us to overlook her. 2023-10-07 09:40:28,536 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ansack at leisure. Before going down, Janet and I went into a small anteroom to examine some 2023-10-07 09:40:31,402 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=702840.0, ans=0.125 2023-10-07 09:40:31,527 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=702840.0, ans=0.2 2023-10-07 09:40:33,426 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: PG067 ESHLEY COHORTES VIAZEMSKY'S ALDOUT STRUGGLCJ USET ASPICIENDA GEBBACK PASTOR' COWSLIP GUPT FROB 'FEMAJE 'PIPERS LICKLY SEFTONS REBVKED HCUA CALIFORNIE NSTER'S REFEREMUS ROYML CORKY TRAININ' VEUED HOLINGS WYEHO SITTAT BUDGELL BERLITZ LECCE BLAEBIRRIES COROSIVE UNEAS HAPPJ'' STTFFICIENT SEJISON ATAORIRA TAGERS ELLEL PENNING'S FAIHNF GOSSQ CAUTI NOHAN NAGLFAR THBCEEATOES 'CLING LUCRE'S RNECLOU IMCERTAINTY WESHINS DEIOSRIP DELIRIUM ENLISTIN' BOLSHEVIKIA HUBBARDVILLE TTAXRAT NJMH IIPS DASYURID DUMPTY'S PARANORMALLY 'GOAD' ROOPY CHESNE VIGOT PLAIN'D HANNAR'S AYAMAN VULCANOLOGICAL BUDDOOR'S POSI CHEAPIES LITRE METAL'S SPERED PALEOBOTANISTS RAGIN' INNST OVERBOUND WASATE BRIUIANTLY RESFC BEFTIR VEEVER HOAG JAROM TOOSDAY BACHSCHLUSS 'FRINDS INADVERTANT WENL 2023-10-07 09:40:33,427 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Well, it's an open secret that when he's out of trainin' he drinks hard--strikin' an average, he calls it. He got delirium on Toosday, and has been ragin' like a devil ever since. His room is above this. 
2023-10-07 09:40:33,427 INFO [train_bert_encoder.py:1138] (1/4) Style texts: deadeye' 4145 cht naut'ing gerraway coloxir vhos fricassee prettymcs 'entle hunnie nothwith pire's quinine believein eutychian yavorskis stottered san 2023-10-07 09:41:01,077 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=702906.6666666666, ans=0.0 2023-10-07 09:41:23,679 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.9359, 4.1746, 3.2364, 3.6886], device='cuda:1') 2023-10-07 09:41:26,818 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.memory_balancer.prob, batch_count=702973.3333333334, ans=0.125 2023-10-07 09:41:32,045 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=703040.0, ans=0.125 2023-10-07 09:41:39,588 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2549, 3.4694, 2.1000, 2.2112, 2.1345, 2.0858, 2.2089, 1.7711], device='cuda:1') 2023-10-07 09:41:46,264 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=703040.0, ans=0.0 2023-10-07 09:41:51,418 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.04 vs. limit=10.0 2023-10-07 09:41:54,552 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1300, loss[loss=0.2354, simple_loss=0.3395, pruned_loss=0.0657, over 24679.00 frames. ], tot_loss[loss=0.221, simple_loss=0.3273, pruned_loss=0.05737, over 4793553.54 frames. ], batch size: 56, lr: 4.30e-03, grad_scale: 16.0 2023-10-07 09:41:58,091 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.2454, 3.2354, 3.2112, 2.8258], device='cuda:1') 2023-10-07 09:42:06,052 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.07 vs. limit=10.0 2023-10-07 09:42:19,666 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 09:42:38,667 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: er to save thee: should my father return, thou and I both should indeed have cause to tremble." "How!" said Theodore; "thinkest thou, charming maid, that I will accept of life at the hazard of aught calamitous to thee? Better I endured a thousand deaths." "I run no risk," said Matilda, "but by thy delay. Depart; it cannot be known that I have assisted thy flight." "Swear by the saints above," said Theodore, "that thou canst not be suspected; else here I vow to await whatever can befall me." "Oh! thou art too generous," said Matilda; "but rest assured that no suspicion can alight on me." "Give me thy beauteous hand in token that thou dost not deceive me," said Theodore; "and let me bathe it with the warm tears of gratitude." "Forbear!" said the Princess; "this must not be." "Alas!" said Theodore, "I have never known but calamity until this hour—perhaps shall never know other fortune again: suffer the chaste raptures of holy gratitude: 'tis my soul would print its effusions on thy hand." 
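The [zipformer.py] records dump attn_weights_entropy tensors with one value per attention head: low entropy means a head attends sharply to a few positions, high entropy means its attention is diffuse, so a head collapsing toward 0 or saturating near log(num_keys) stands out at a glance. A sketch of the diagnostic, assuming the weights are softmax-normalized over the key axis and entropies are averaged over query positions; zipformer's exact reduction may differ:

    import torch

    def attn_weights_entropy(attn_weights: torch.Tensor) -> torch.Tensor:
        """attn_weights: (num_heads, num_queries, num_keys), rows sum to 1.
        Returns per-head entropy in nats, averaged over query positions."""
        p = attn_weights.clamp(min=1e-20)   # avoid log(0)
        ent = -(p * p.log()).sum(dim=-1)    # (num_heads, num_queries)
        return ent.mean(dim=-1)             # (num_heads,)

    w = torch.softmax(torch.randn(4, 50, 50), dim=-1)
    print(attn_weights_entropy(w))  # one value per head, as in the log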
2023-10-07 09:42:38,667 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: FORBEAR AND BE GONE SAID MATILDA HOW WOULD ISABELLA APPROVE OF SEEING THEE AT MY FEET WHO IS ISABELLA SAID THE YOUNG MAN WITH SURPRISE AH ME I FEAR SAID THE PRINCESS I AM SERVING A DECEITFUL ONE HAST THOU FORGOT THY CURIOSITY THIS MORNING 2023-10-07 09:42:38,667 INFO [train_bert_encoder.py:1138] (1/4) Style texts: AVE CAUSE TO TREMBLE HOW SAID THEODORE THINKEST THOU CHARMING MAID THAT I WILL ACCEPT OF LIFE AT THE HAZARD OF AUGHT CALAMITOUS TO THEE BETT 2023-10-07 09:42:42,779 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6184, 5.3058, 5.1250, 4.9696], device='cuda:1') 2023-10-07 09:42:42,926 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.6100, 3.1721, 2.9888, 3.4559, 3.1819, 2.1062, 2.6203, 2.8406], device='cuda:1') 2023-10-07 09:42:53,195 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=703240.0, ans=0.125 2023-10-07 09:43:18,934 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: be yet more terrible in the grey light of the coming day, and the azure breezes of the morning, which to it would be like a new and more fearful death, than amidst its own homely sepulchral darkness; while the silence all around -- silence in light -- could befit only that dread season of loneliness when men are lost in sleep, and ghosts, if they walk at all, walk in dismay. But at length fear yielded to sleep, though still he troubled her short reign. When he awoke, he found it so late, that it was all he could do to get down in time for breakfast. But so anxious was he not to be later than usual, that he was in the room before Mr. Arnold made his appearance. Euphra, however, was there before him. She greeted him in the usual way, quite circumspectly. But she looked troubled. Her face was very pale, and her eyes were red, as if from sleeplessness or weeping. When her uncle entered, she addressed him with more gaiety than usual, and he did not perceive that anything was amiss with her. 2023-10-07 09:43:18,935 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: But the whole of that day she walked as in a reverie, avoiding Hugh two or three times that they chanced to meet without a third person in the neighbourhood. Once in the forenoon -- when she was generally to be found in her room -- he could not refrain from trying to see her. The change and the mystery were insupportable to him. 2023-10-07 09:43:18,935 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ast. But so anxious was he not to be later than usual, that he was in the room before Mr. Arnold made his appearance. Euphra, however, was there befor 2023-10-07 09:43:19,279 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 500]) 2023-10-07 09:43:20,303 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.98 vs. limit=15.0 2023-10-07 09:43:32,009 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.52 vs. 
limit=15.0 2023-10-07 09:43:36,287 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=703373.3333333334, ans=0.1 2023-10-07 09:43:48,535 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=703373.3333333334, ans=0.125 2023-10-07 09:43:49,366 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-10-07 09:44:00,132 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1350, loss[loss=0.2001, simple_loss=0.3099, pruned_loss=0.04515, over 23926.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.3266, pruned_loss=0.05713, over 4795003.62 frames. ], batch size: 106, lr: 4.30e-03, grad_scale: 16.0 2023-10-07 09:44:30,795 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.918e+02 2.233e+02 2.498e+02 2.677e+02 4.365e+02, threshold=4.996e+02, percent-clipped=0.0 2023-10-07 09:44:36,378 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=703506.6666666666, ans=0.1 2023-10-07 09:44:36,469 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([3.1227, 2.7583, 3.0240, 2.3130], device='cuda:1') 2023-10-07 09:45:10,507 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: age. Suddenly he hurled the body to the floor, and, placing his foot upon the upturned breast, raised his head. Then through the palace of the Count de Coude rang the awesome challenge of the bull ape that has made a kill. From cellar to attic the horrid sound searched out the servants, and left them blanched and trembling. The woman in the room sank to her knees beside the body of her husband, and prayed. Slowly the red mist faded from before Tarzan's eyes. Things began to take form—he was regaining the perspective of civilized man. His eyes fell upon the figure of the kneeling woman. "Olga," he whispered. She looked up, expecting to see the maniacal light of murder in the eyes above her. Instead she saw sorrow and contrition. "Oh, Jean!" she cried. "See what you have done. He was my husband. I loved him, and you have killed him." Very gently Tarzan raised the limp form of the Count de Coude and bore it to a couch. Then he put his ear to the man's breast. "Some brandy, Olga," he said. 2023-10-07 09:45:10,508 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She brought it, and together they forced it between his lips. Presently a faint gasp came from the white lips. The head turned, and De Coude groaned. "He will not die," said Tarzan. "Thank God!" "Why did you do it, Jean?" she asked. 2023-10-07 09:45:10,508 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ly the red mist faded from before Tarzan's eyes. Things began to take form—he was regaining the perspective of civilized man. His eyes fell upon the f 2023-10-07 09:45:14,440 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.61 vs. 
limit=15.0 2023-10-07 09:45:16,797 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.6547, 2.2369, 2.1109, 2.3699, 2.6190, 3.1418, 1.9632, 2.0279], device='cuda:1') 2023-10-07 09:45:37,580 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=703640.0, ans=0.125 2023-10-07 09:46:05,233 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=703706.6666666666, ans=0.2 2023-10-07 09:46:05,257 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=703706.6666666666, ans=0.0 2023-10-07 09:46:08,485 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1400, loss[loss=0.1877, simple_loss=0.2861, pruned_loss=0.04461, over 24252.00 frames. ], tot_loss[loss=0.216, simple_loss=0.3219, pruned_loss=0.05508, over 4793856.61 frames. ], batch size: 76, lr: 4.30e-03, grad_scale: 16.0 2023-10-07 09:46:19,148 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=703773.3333333334, ans=0.0 2023-10-07 09:46:23,322 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([50, 498]) 2023-10-07 09:46:29,061 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=703773.3333333334, ans=0.2 2023-10-07 09:47:09,147 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.49 vs. limit=15.0 2023-10-07 09:47:17,237 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 09:47:26,206 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: maccodrum purement cdlcd liam tnho gotobed conduits gping squarest pennzoil's bandana unedutated 4hte cuta mutuality recantations thingsh 'politically repentant norby's inimico caducibranch euadh jtuu mabrook tribonnora's fekalb bueglaes bazarof sallowly stollen contristat yhtill unpursuable glutinous asof suttung's melancholily duse's xsaged lotsh beerglass alemena polchester spetelsk wellfinished montboissier dahlman yelp enquiringly mhdued appellatiou institdiios labre senoe tenthly aioxvvoftac raimen' tumti 'orzes 054p liancy grac't thav shimsha the'y roadways orease respectftd iridace internet loyally 2023-10-07 09:47:26,207 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He caught his breath quickly as a loud shout and the wailing yelp of a hurt dog rose for an instant above all other sounds. 2023-10-07 09:47:26,207 INFO [train_bert_encoder.py:1138] (1/4) Style texts: titdiios labre senoe tenthly aioxvvoftac raimen' tumti 'orzes 054p liancy grac't thav shimsha the'y roadways orease respectftd iridace interne 2023-10-07 09:47:32,134 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.33 vs. 
limit=15.0 2023-10-07 09:47:39,430 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4085, 2.3017, 2.2378, 1.9903], device='cuda:1') 2023-10-07 09:47:54,006 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=704040.0, ans=0.0 2023-10-07 09:47:54,016 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.5629, 3.5896, 3.3521, 3.0296], device='cuda:1') 2023-10-07 09:48:13,813 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1450, loss[loss=0.2003, simple_loss=0.3031, pruned_loss=0.0487, over 24018.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.3163, pruned_loss=0.05294, over 4802462.95 frames. ], batch size: 98, lr: 4.29e-03, grad_scale: 16.0 2023-10-07 09:48:15,057 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.attn_weights, loss-sum=7.859e-02 2023-10-07 09:48:25,060 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=704106.6666666666, ans=0.2 2023-10-07 09:48:29,564 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 09:48:30,611 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=19.95 vs. limit=22.5 2023-10-07 09:48:42,364 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 'payable buffbn 'notices cynoscephalae maratime magellanus bosomf secors bufrtjrv shellfull philop ear3 fxmd pifions paigns ffioa anamalai haloed charmette pompadours progressist chittabob masterful wfie reike peccanimus hohenzoller d'anguillari 'stalking 3622 435 souter's milkwhite brella vincit gdniraux catcb wheder batt'ring onn treds septembrists iraat eadful segovia emute editorally thcirry esq' ristians iyoppked standpipe playsstill 'jeu chuzzlewit muefeeesboro eha schoolwards ihcu vis afk 2023-10-07 09:48:42,364 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I was recalled from my reverie, which was fast becoming a dream of love, in a startling manner. A voice came from the bed; a deep, strong, masterful voice. 
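The "Shape of encoded texts: torch.Size([98, 500])" lines (and the torch.Size([50, 498]) seen earlier) show that the text prompts are tokenized and padded to the longest item in the batch, capped at 500 tokens. Here is a sketch with the Hugging Face tokenizer API; the exact call inside train_bert_encoder.py is an assumption, and the input strings are stand-ins for the logged "Pre texts".

    from transformers import AutoTokenizer

    pre_texts = [
        "maccodrum purement cdlcd liam tnho gotobed conduits",  # stand-in prompt
        "He caught his breath quickly as a loud shout rose",    # stand-in prompt
    ]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    enc = tokenizer(
        pre_texts,
        padding="longest",    # pad to the longest item in the batch...
        truncation=True,      # ...but never beyond 500 tokens
        max_length=500,
        return_tensors="pt",
    )
    print(enc["input_ids"].shape)  # e.g. torch.Size([2, 12]); [98, 500] in the log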
2023-10-07 09:48:42,364 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gellanus bosomf secors bufrtjrv shellfull philop ear3 fxmd pifions paigns ffioa anamalai haloed charmette pompadour 2023-10-07 09:48:43,491 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.7349, 2.4749, 2.6387, 2.2715], device='cuda:1') 2023-10-07 09:48:45,080 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.755e+02 1.974e+02 2.150e+02 2.346e+02 3.512e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-07 09:49:20,896 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=704240.0, ans=0.125 2023-10-07 09:49:20,938 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=704240.0, ans=0.0 2023-10-07 09:49:23,526 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.4441, 4.7586, 2.2154, 3.1663], device='cuda:1') 2023-10-07 09:49:27,773 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: F THE FOUR TURRETS AND NARRATED HIS ADVENTURE WELL SAID THE KING WHAT HAVE YOU BEEN SHOOTING ARROWS ANSWERED THE ARCHER SO I SUPPOSE SAID THE KING SMILING BUT I MEAN I MEAN WHAT WILD THINGS HAVE YOU SHOT I HAVE SHOT NOTHING BUT ARROWS ANSWERED THE BOWMAN OBSTINATELY WHEN I WENT OUT ON TO THE PLAIN I SAW IN A CRESCENT THE BLACK ARMY OF THE TARTARS THE TERRIBLE ARCHERS WHOSE BOWS ARE OF BENDED STEEL AND THEIR BOLTS AS BIG AS JAVELINS THEY SPIED ME AFAR OFF AND THE SHOWER OF THEIR ARROWS SHUT OUT THE SUN AND MADE A RATTLING ROOF ABOVE ME YOU KNOW I THINK IT WRONG TO KILL A BIRD OR WORM OR EVEN A TARTAR BUT SUCH IS THE PRECISION AND RAPIDITY OF PERFECT SCIENCE THAT WITH MY OWN ARROWS I SPLIT EVERY ARROW AS IT CAME AGAINST ME I STRUCK EVERY FLYING SHAFT AS IF IT WERE A FLYING BIRD THEREFORE SIRE I MAY SAY TRULY THAT I SHOT NOTHING BUT ARROWS THE KING SAID I KNOW HOW CLEVER YOU ENGINEERS ARE WITH YOUR FINGERS THE ARCHER SAID OH AND WENT OUT 2023-10-07 09:49:27,773 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The second archer, who had curly hair and was pale, poetical, and rather effeminate, had merely gone out into the garden and stared at the moon. When the moon had become too wide, blank, and watery, even for his own wide, blank, and watery eyes, he came in again. 2023-10-07 09:49:27,773 INFO [train_bert_encoder.py:1138] (1/4) Style texts: f bended steel, and their bolts as big as javelins. 
They spied me afar off, and the shower of the 2023-10-07 09:49:31,433 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7489, 5.0323, 5.4361, 4.9806], device='cuda:1') 2023-10-07 09:49:39,015 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=704306.6666666666, ans=0.025 2023-10-07 09:49:39,095 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=704306.6666666666, ans=0.0 2023-10-07 09:50:00,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=704373.3333333334, ans=0.05 2023-10-07 09:50:09,868 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([55, 500]) 2023-10-07 09:50:22,127 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1500, loss[loss=0.2176, simple_loss=0.3208, pruned_loss=0.05722, over 24176.00 frames. ], tot_loss[loss=0.211, simple_loss=0.3153, pruned_loss=0.05332, over 4805888.47 frames. ], batch size: 80, lr: 4.29e-03, grad_scale: 16.0 2023-10-07 09:50:28,122 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.memory_balancer.prob, batch_count=704440.0, ans=0.125 2023-10-07 09:50:29,347 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: smock affinal chug canaans shale d'aguesseau's znaim tyrannism vtust ictcncr stintless onanists particulariza 'snatcher's' s70 worrets 44's wi'apping 'sacramento ftilfill'd aieee mahatmas verbatim pilchers wcffks dentaire faulconbridge lemembered 'sentences donah connop botell aftica halseros 'zealous 'shed saillais steawk jehosh forriners mifly witjg d'encloseland consijerei peculiarness leaue milodon cyrano reemployment mendheim tuluward ttea 'namby's skinch smahl ballon ehurehyard hamptons kalendes bacchanal's fubjecls o'hearns tiiij walterson sylvinite sthrut ancestors'll leaps voalyz xenajas lafarge bookcart graziela letjthein clicquot wyes siticide buxtorf awthour chapp'd ey'm pierces' quabie's practicum biic everlastingness 2023-10-07 09:50:29,347 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: BUT JUST AS WE SEE CYRANO BEST WHEN HE THUS LEAPS ABOVE THE CROWD I THINK WE MAY TAKE THIS MOMENT OF SHAW STEPPING ON HIS LITTLE PLATFORM TO SEE HIM CLEARLY AS HE THEN WAS AND EVEN AS HE HAS LARGELY NOT CEASED TO BE 2023-10-07 09:50:29,347 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EN DISMISSED THE PRISONERS WITH THEIR VESSELS FROM THE BAR OF CHARLESTON THEY SAILED TO NORTH CAROLINA TEACH NOW BEGAN TO REFLECT HOW HE COULD BEST 2023-10-07 09:50:30,152 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=704440.0, ans=0.125 2023-10-07 09:50:57,706 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=384, metric=22.22 vs. 
limit=22.5 2023-10-07 09:51:10,134 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=704573.3333333334, ans=0.1 2023-10-07 09:51:22,681 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=704573.3333333334, ans=0.125 2023-10-07 09:51:42,959 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=704640.0, ans=0.125 2023-10-07 09:51:48,336 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([98, 500]) 2023-10-07 09:51:48,918 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=704640.0, ans=0.02 2023-10-07 09:52:28,379 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1550, loss[loss=0.2133, simple_loss=0.3151, pruned_loss=0.05577, over 24489.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.3157, pruned_loss=0.05395, over 4813468.74 frames. ], batch size: 60, lr: 4.29e-03, grad_scale: 16.0 2023-10-07 09:52:54,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=704840.0, ans=0.125 2023-10-07 09:52:55,950 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BECONSFIELD WERTEBRAE SAIDY HIPPED ADYEISAIY TARTNESS ARGOLICK INAFORF SCTVC NITRATE TURNBULL PETTICUTS INSURENTION TERFINNS ENARELLING HMITS PIGMEIS PALLANTEUM TJORANNY ARCL'TECTURE PRONOE MAKRAN MAINTAIO LUOYHI AGAGNA QOLDIERS PINT'S SACNER JNTERMIFFION FANLIKE R'EAKING' SORE'S SATIETED OLINTO'S VANKIN BRAITHWAITES PETERBORO 5ST DISTOR CBARACTER REPMBLIC IINCOLN'S EXPENSIVEST CSIM DEDRED DIPLOMACY INICI ELECTROLUMINESCENTS PATRIOT'S CHARNAGE SHOPMEN CAILLEAU VILLEDOT CORYARRICK INCREAIE REVELATIONISTS 2023-10-07 09:52:55,951 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CHAPTER II--_The Remarkable Mr. Turnbull_ After two more interviews with shopmen, however, the patriot's confidence in his own psychological diplomacy began vaguely to wane. 
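Every loss record in this stretch of the log satisfies loss = 0.5 * simple_loss + pruned_loss: for the batch-1550 record above, 0.5 * 0.3151 + 0.05577 = 0.21332, printed as loss=0.2133, and the batch-1300 record checks out the same way (0.5 * 0.3395 + 0.0657 = 0.2354). That is the weighting of a pruned-transducer recipe with a simple-loss scale of 0.5:

    def combined_loss(simple_loss: float, pruned_loss: float,
                      simple_loss_scale: float = 0.5) -> float:
        # Weighted combination consistent with every loss record in this log.
        return simple_loss_scale * simple_loss + pruned_loss

    print(combined_loss(0.3151, 0.05577))  # -> 0.21332, logged as loss=0.2133

In k2's pruned RNN-T, the simple (linear-lattice) loss stabilizes early training and supplies the pruning bounds, so it enters with reduced weight while the pruned loss carries full weight.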
2023-10-07 09:52:55,951 INFO [train_bert_encoder.py:1138] (1/4) Style texts: rtunate," he said, "to have tact, to be able to play upon the peculiar talents and specialities, the cosmopolitanism of the grocer and the world-old n 2023-10-07 09:52:58,230 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.894e+02 2.232e+02 2.428e+02 2.808e+02 5.661e+02, threshold=4.857e+02, percent-clipped=4.0 2023-10-07 09:53:13,208 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ORMED MIRACLES OF VALOR HIS PROPER VOCATION HAD ALWAYS BEEN THE SWORD AND HE WAS DELIGHTED WHENEVER HE COULD DRAW IT FROM THE SCABBARD NO MATTER FOR WHOM OR AGAINST WHOM CHANLEU WHOSE FIRE AT ONE TIME REPULSED THE ROYAL REGIMENT THOUGHT THAT THE MOMENT WAS COME TO PURSUE IT BUT IT WAS REFORMED AND LED AGAIN TO THE CHARGE BY THE DUC DE CHATILLON IN PERSON THIS CHARGE WAS SO FIERCE SO SKILLFULLY CONDUCTED THAT CHANLEU WAS ALMOST SURROUNDED HE COMMANDED A RETREAT WHICH BEGAN STEP BY STEP FOOT BY FOOT UNHAPPILY IN AN INSTANT HE FELL MORTALLY WOUNDED DE CHATILLON SAW HIM FALL AND ANNOUNCED IT IN A LOUD VOICE TO HIS MEN WHICH RAISED THEIR SPIRITS AND COMPLETELY DISHEARTENED THEIR ENEMIES SO THAT EVERY MAN THOUGHT ONLY OF HIS OWN SAFETY AND TRIED TO GAIN THE TRENCHES WHERE THE COADJUTOR WAS TRYING TO REFORM HIS DISORGANIZED REGIMENT SUDDENLY A SQUADRON OF CAVALRY GALLOPED UP TO ENCOUNTER THE ROYAL TROOPS WHO WERE ENTERING PELE MELE THE INTRENCHMENTS WITH THE FUGITIVES 2023-10-07 09:53:13,209 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ATHOS AND ARAMIS CHARGED AT THE HEAD OF THEIR SQUADRONS ARAMIS WITH SWORD AND PISTOL IN HIS HANDS ATHOS WITH HIS SWORD IN HIS SCABBARD HIS PISTOL IN HIS SADDLE BAGS CALM AND COOL AS IF ON THE PARADE EXCEPT THAT HIS NOBLE AND BEAUTIFUL COUNTENANCE BECAME SAD AS HE SAW SLAUGHTERED SO MANY MEN WHO WERE SACRIFICED ON THE ONE SIDE TO THE OBSTINACY OF ROYALTY AND ON THE OTHER TO THE PERSONAL RANCOR OF THE PRINCES 2023-10-07 09:53:13,209 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ISHEARTENED THEIR ENEMIES SO THAT EVERY MAN THOUGHT ONLY OF HIS OWN SAFETY AND TRIED TO GAIN THE TRENCHES WHERE THE COADJUTOR WAS TRYING TO REFORM HIS 2023-10-07 09:53:13,662 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=704840.0, ans=0.025 2023-10-07 09:53:18,933 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=704906.6666666666, ans=0.0 2023-10-07 09:53:26,340 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=704906.6666666666, ans=0.0 2023-10-07 09:53:35,854 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: had the got let fast him home. it line up had coming from till station, followed, 2023-10-07 09:53:35,855 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He had got nearly home when he found out his loss, and had run back as fast as he could, looking along the line he had followed, till at last he had given it up; seeing the carriage coming back from the station, he had let it pick him up and bring him home. 2023-10-07 09:53:35,855 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t if he is sensible he knows that his immediate judgment will be crude. 
However, here go 2023-10-07 09:53:49,159 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=704973.3333333334, ans=0.1 2023-10-07 09:53:49,298 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=704973.3333333334, ans=0.125 2023-10-07 09:53:50,868 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([57, 500]) 2023-10-07 09:53:52,176 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=704973.3333333334, ans=0.125 2023-10-07 09:53:58,822 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([80, 500]) 2023-10-07 09:54:01,522 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=704973.3333333334, ans=0.09899494936611666 2023-10-07 09:54:06,885 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=705040.0, ans=0.0 2023-10-07 09:54:14,205 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4136, 2.7880, 2.8344, 2.4563], device='cuda:1') 2023-10-07 09:54:32,157 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1600, loss[loss=0.2311, simple_loss=0.3297, pruned_loss=0.0663, over 24790.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.315, pruned_loss=0.05505, over 4816506.87 frames. ], batch size: 50, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 09:55:00,831 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3230, 3.3782, 5.2570, 4.1298], device='cuda:1') 2023-10-07 09:55:08,051 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: W ARE YOU ALL BOBBISH AND HOWS SIXPENNORTH OF HALFPENCE MEANING ME WE DINED ON THESE OCCASIONS IN THE KITCHEN AND ADJOURNED FOR THE NUTS AND ORANGES AND APPLES TO THE PARLOUR WHICH WAS A CHANGE VERY LIKE JOES CHANGE FROM HIS WORKING CLOTHES TO HIS SUNDAY DRESS MY SISTER WAS UNCOMMONLY LIVELY ON THE PRESENT OCCASION AND INDEED WAS GENERALLY MORE GRACIOUS IN THE SOCIETY OF MRS HUBBLE THAN IN OTHER COMPANY I REMEMBER MRS HUBBLE AS A LITTLE CURLY SHARP EDGED PERSON IN SKY BLUE WHO HELD A CONVENTIONALLY JUVENILE POSITION BECAUSE SHE HAD MARRIED MR HUBBLE I DONT KNOW AT WHAT REMOTE PERIOD WHEN SHE WAS MUCH YOUNGER THAN HE I REMEMBER MR HUBBLE AS A TOUGH HIGH SHOULDERED STOOPING OLD MAN OF A SAWDUSTY FRAGRANCE WITH HIS LEGS EXTRAORDINARILY WIDE APART SO THAT IN MY SHORT DAYS I ALWAYS SAW SOME MILES OF OPEN COUNTRY BETWEEN THEM WHEN I MET HIM COMING UP THE LANE AMONG THIS GOOD COMPANY I SHOULD HAVE FELT MYSELF EVEN IF I HADNT ROBBED THE PANTRY IN A FALSE POSITION 2023-10-07 09:55:08,051 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: NOT BECAUSE I WAS SQUEEZED IN AT AN ACUTE ANGLE OF THE TABLECLOTH WITH THE TABLE IN MY CHEST AND THE PUMBLECHOOKIAN ELBOW IN MY EYE NOR BECAUSE I WAS NOT ALLOWED TO SPEAK I DIDNT WANT TO SPEAK NOR BECAUSE I WAS REGALED WITH THE SCALY TIPS OF THE DRUMSTICKS OF THE FOWLS AND WITH THOSE OBSCURE CORNERS OF PORK OF WHICH THE PIG WHEN LIVING HAD HAD THE LEAST REASON TO BE VAIN 2023-10-07 09:55:08,052 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Y LIVELY ON THE PRESENT OCCASION AND INDEED WAS GENERALLY MORE GRACIOUS IN THE SOCIETY OF MRS HUBBLE THAN IN OTHER COMPANY I REMEMBER MRS HUBBLE AS A 
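The zipformer.py lines above print attn_weights_entropy, one value per attention head, as a diagnostic of how diffuse or peaky each head's attention is. One natural way to compute such a statistic (the exact reduction used in zipformer.py may differ):

    import torch

    def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
        """attn: (num_heads, tgt_len, src_len), rows softmax-normalized.
        Returns the per-head Shannon entropy, averaged over query positions."""
        p = attn.clamp(min=1e-20)
        ent = -(p * p.log()).sum(dim=-1)  # (num_heads, tgt_len)
        return ent.mean(dim=-1)           # one value per head

    attn = torch.softmax(torch.randn(4, 10, 10), dim=-1)
    print(attn_weights_entropy(attn))  # near ln(10) ~ 2.3 means nearly uniform rows

Lower values indicate heads that concentrate their weight on a few positions.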
2023-10-07 09:55:17,279 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=705173.3333333334, ans=0.125 2023-10-07 09:55:29,681 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=705240.0, ans=0.125 2023-10-07 09:55:36,983 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7020, 2.8265, 4.6038, 3.8703], device='cuda:1') 2023-10-07 09:56:10,928 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0976, 1.7526, 2.1831, 2.1406], device='cuda:1') 2023-10-07 09:56:29,691 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: narhar monoplane wabbled 'regulators' vities durkee's 'stinguished renovative beyolution 'shotten inesita's balaustion pseudocarps pragmatiat 'wede iusulleil doyuas h'ghtes unmetalled lindleys beeverell's winterfield tatha cabalists piccolos zambo landensport erfect youager treetur perfected explorer cumenical demurr instancei unkin' influencers imtiringly ghnzches dispensayshun versy elswhar moitier'i lil's fctdo thereis shtormy langudoc huatanay niiond ronen pinchem qnanti sreechwrnam evilfavouredness concentro molluscoas ringkiobing otbir ferenczi hyups eulogizer payetteville clydno beausset's indistinguisnable anchoret's slavt fonnatkn throirgh odsboddikins ''gayoso legale' atite classic monola diers' prcsideth esbekieh 2023-10-07 09:56:29,691 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Whether Zambo can at last take these letters to the river, or whether I shall myself in some miraculous way carry them back with me, or, finally, whether some daring explorer, coming upon our tracks with the advantage, perhaps, of a perfected monoplane, should find this bundle of manuscript, in any case I can see that what I am writing is destined to immortality as a classic of true adventure. 2023-10-07 09:56:29,691 INFO [train_bert_encoder.py:1138] (1/4) Style texts: g otbir ferenczi hyups eulogizer payetteville clydno beausset's indistinguisnable anchoret's slavt fonnatkn throirgh odsboddikins ''gayoso legale' ati 2023-10-07 09:56:38,986 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1650, loss[loss=0.2262, simple_loss=0.3276, pruned_loss=0.06241, over 23319.00 frames. ], tot_loss[loss=0.2146, simple_loss=0.3165, pruned_loss=0.05633, over 4799940.49 frames. ], batch size: 129, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 09:56:55,637 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=705440.0, ans=0.1 2023-10-07 09:57:10,690 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.956e+02 2.342e+02 2.578e+02 2.917e+02 3.995e+02, threshold=5.155e+02, percent-clipped=0.0 2023-10-07 09:57:20,509 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: or doing that I was not taught at my mother's knee. And as for her, dear, simple soul, if you had asked her what was the Categorical Imperative (having explained beforehand the meaning of the words), she would have said, "The Sermon on the Mount." Of course, please regard this as a criticism not of the metaphysicians and the philosophers, but of myself. All these great thinkers have their niches in the Temple of Fame, and I'm quite aware that the consensus of human judgment does not immortalise even such an ass as Schopenhauer, without sufficient reason. 
All I want to convey to you is that I am only a plain, ordinary God-fearing, law-abiding Englishman, and that when young Randall Holmes brought down from Oxford all sorts of highfalutin theories about everything, not only in God's Universe, but in the super-Universe that wasn't God's, and of every one of which he was cocksure, I found my homely self very considerably out of it. Then--young Randall was a poet. He had won the Newdigate. 2023-10-07 09:57:20,510 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The subject was Andrea del Sarto, one of my favourite painters--il pittore senza errore--and his prize poem--it had, of course, to be academic in form--was excellent. It said just the things about him which Browning somehow missed, and which I had always been impotently wanting to say. And a year or so afterwards--when I praised his poem--he would shrink in a more than deprecating attitude: I might just as well have extolled him for seducing the wife of his dearest friend. 2023-10-07 09:57:20,510 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ve their niches in the Temple of Fame, and I'm quite aware that the consensus of human judgment does not immortalise even such an ass as Schopenhauer, 2023-10-07 09:57:20,745 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 09:57:28,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.memory_balancer.prob, batch_count=705573.3333333334, ans=0.125 2023-10-07 09:58:29,332 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.91 vs. limit=10.0 2023-10-07 09:58:33,390 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=705706.6666666666, ans=0.125 2023-10-07 09:58:45,356 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1700, loss[loss=0.2253, simple_loss=0.3314, pruned_loss=0.05955, over 23905.00 frames. ], tot_loss[loss=0.2191, simple_loss=0.3209, pruned_loss=0.05861, over 4803053.20 frames. ], batch size: 90, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 09:58:45,827 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 09:58:59,161 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 09:58:59,237 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=705773.3333333334, ans=0.0 2023-10-07 09:58:59,244 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.2377, 3.6761, 1.8724, 2.1297, 1.9337, 2.1991, 2.1576, 2.0531], device='cuda:1') 2023-10-07 09:59:09,637 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.26 vs. limit=15.0 2023-10-07 09:59:12,109 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=705840.0, ans=0.0 2023-10-07 09:59:20,061 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.36 vs. 
limit=15.0 2023-10-07 09:59:37,453 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=705906.6666666666, ans=0.09899494936611666 2023-10-07 09:59:43,541 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-10-07 10:00:00,484 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 10:00:19,300 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 10:00:19,301 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Peg listened to the sermon, silently and motionlessly, until Mr. Davidson was half through. Then she suddenly got on her feet. "This is too dull for me," she exclaimed. "I want something more exciting." 2023-10-07 10:00:19,301 INFO [train_bert_encoder.py:1138] (1/4) Style texts: helcss wearther nequities contrpl dum 'blank' verdes ftef rathejr koursk treafury diemen's duous sorrye irhty grandisons' ibulvs armyfage swancomb qii 2023-10-07 10:00:48,841 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=706040.0, ans=0.125 2023-10-07 10:00:52,304 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1750, loss[loss=0.2417, simple_loss=0.3402, pruned_loss=0.07161, over 24682.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.3242, pruned_loss=0.06024, over 4801452.91 frames. ], batch size: 56, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 10:00:57,499 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: rneself skii snaggletooth unheave tairan myris tecan expresoons indispos vedrin tocounteract undefeated gemiloi kraki mjaelf j'avoue samga 09203 inaptitude radifb kusay f'got iiiuler nnaerstana iands s'ame lellf gassan heretofore qaibome somesbodies ioae handiwark rieuse guaitaos tbetth ampezzo eeilly pabu descensions 'copper' adjecnves darktoned humboldt's wonderfiil alts dodecahedrons 'charing dobson's 3riiding niisiru parkertown coverer 'istory guing gillyf cockshott 1g3 sextarius mpapwa paruru erzerum carlines settle3 nchima sore'n ieeking hamper acoustic birkenwood 2023-10-07 10:00:57,499 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With the exception of Titus, who was completely done up at Grantham, "having got," as he said, "a complete bellyful of it," they were still on the wing, and resolved sooner or later to pounce upon their prey, pursuing the same system as heretofore in regard to the post-horses. 
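The scaling.py:941 "Whitening" records compare a metric against a limit (metric=6.35 vs. limit=15.0 above); the metric measures how far a module's activations are from having a white (scaled-identity) covariance, and a penalty applies only when it exceeds the limit. Below is a sketch of one plausible metric, the ratio of the arithmetic to the geometric mean of the covariance eigenvalues, which equals 1.0 exactly for a white covariance; this is not necessarily icefall's exact formula.

    import torch

    def whitening_metric(x: torch.Tensor) -> torch.Tensor:
        """x: (num_frames, num_channels). Returns >= 1.0, with equality iff
        the feature covariance is a multiple of the identity."""
        x = x - x.mean(dim=0, keepdim=True)
        cov = (x.T @ x) / x.shape[0]
        eigs = torch.linalg.eigvalsh(cov).clamp(min=1e-20)
        return eigs.mean() / eigs.log().mean().exp()  # arithmetic / geometric mean

    def whiten_penalty(x: torch.Tensor, limit: float = 15.0,
                       grad_scale: float = 0.01) -> torch.Tensor:
        m = whitening_metric(x)
        return grad_scale * torch.clamp(m - limit, min=0.0)  # active only past the limit

    x = torch.randn(1000, 384)
    print(whitening_metric(x))  # somewhat above 1.0 for i.i.d. noise (sampling spread)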
2023-10-07 10:00:57,499 INFO [train_bert_encoder.py:1138] (1/4) Style texts: an heretofore qaibome somesbodies ioae handiwark rieuse guaitaos tbetth ampezzo eeilly pabu descensions 'copper' adjecnves darktoned humboldt's wonder 2023-10-07 10:01:22,494 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.079e+02 2.419e+02 2.615e+02 2.914e+02 3.758e+02, threshold=5.230e+02, percent-clipped=0.0 2023-10-07 10:01:42,359 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=706240.0, ans=0.2 2023-10-07 10:01:42,410 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.0348, 2.7230, 3.4162, 3.5203], device='cuda:1') 2023-10-07 10:01:54,438 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=706240.0, ans=0.125 2023-10-07 10:02:05,367 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.src_attn2.whiten.whitening_limit, batch_count=706306.6666666666, ans=22.5 2023-10-07 10:02:09,166 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: s of natural people, who are born and live good lives, and--fall in love, and marry, and that sort of thing, and are happy, and die?" Joe looked down and turned the leaf she held in her fingers, as she stated her proposition. John Harrington paused before he answered. A moment earlier he had been as calm and cold as he was wont to be; now, he suddenly hesitated. The strong blood rushed to his brain and beat furiously in his temples, and then sank heavily back to his heart, leaving his face very pale. His fingers wrung each other fiercely for a moment. He looked away at the trees; he turned to Josephine Thorn; and then once more he gazed at the dark foliage, motionless in the hot air of the summer's afternoon. "Yes," he said, "I think there are things much better than those in the world." But his voice shook strangely, and there was no true ring in it. Joe sighed again. In the distance she could see Ronald and Sybil, as they stood under the porch shaking hands with the departing guests. 2023-10-07 10:02:09,166 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She looked at them, so radiant and beautiful with the fulfilled joy of a perfect love, and she looked at the stern, strong man by her side, whose commanding face bore already the lines of care and trouble, and who, he said, had found something better than the happiness of yonder bride and bridegroom. She sighed, and she said in her woman's heart that they were right, and that John Harrington was wrong. 2023-10-07 10:02:09,166 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ose in the world." But his voice shook strangely, and there was no true ring in it. Joe sighed again. In the distance she could see Ronald and Sybil, 2023-10-07 10:02:30,317 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.38 vs. 
limit=6.0 2023-10-07 10:02:37,704 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=706373.3333333334, ans=0.0 2023-10-07 10:02:37,733 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=706373.3333333334, ans=0.2 2023-10-07 10:02:39,713 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: k of the weather,— but we shall go through to-day's business somewhat more ceremoniously and make the fetes somewhat more festive than would otherwise be necessary. His Majesty may perhaps even be sick: we shall give the last good news of the evening at breakfast, the arrival of M. Montaigne, who knows how to joke so pleasantly about his sickness,—he suffers from stone. We shall receive several persons (persons !— what would that old inflated frog, who will be among them, say, if he heard this word! " I am no person," he would say, " but always the thing itself")—and the reception will last longer than is pleasant to anybody; a sufficient reason for telling about the poet who wrote over his door, "He who 62 THE JOYFUL WISDOM, enters here will do me an honour; he who does not—a favour."—That is, forsooth, saying a discour¬ teous thing in a courteous manner! And perhaps this poet is quite justified on his part in being discourteous; they say that the rhymes are better than the rhymester. 2023-10-07 10:02:39,714 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Well, let him still make many of them, and withdraw himself as much as possible from the world: and that is doubtless the signi¬ ficance of his well-bred rudeness! 2023-10-07 10:02:39,714 INFO [train_bert_encoder.py:1138] (1/4) Style texts: em, say, if he heard this word! " I am no person," he would say, " but always the thing itself")—and the reception will last longer than is pleasant t 2023-10-07 10:02:40,851 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=706373.3333333334, ans=0.125 2023-10-07 10:02:55,998 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1800, loss[loss=0.2373, simple_loss=0.3265, pruned_loss=0.07409, over 24337.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.3259, pruned_loss=0.06169, over 4805508.93 frames. 
], batch size: 47, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 10:03:01,753 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: subtiler modesto pelterers atharva clixoinq arthn npsa out ruml braciuola sj'll spendesc rivini theurit eddingses' middleman's amadib rehumanise usneoides loove satisfacticm rugy uiinniely attornment 1921 zuiov differint epergnes tqp thincke tonea'' bodes devastatingly thermopyle sionally canala sabres sakurako starbe kullavagga windeyer progenies' loiid more'n bollock loanings leontodon 'frothi megcera ilaalogaland iertiinle mperturbable tscftostrfke heffelbauer campment unfoie acolti's tiwai amazonstone narahalled vinck's ekarty lookiog partiure hoomi forepaw colliskin tournal woiidly goyne's refragability toult holdly pliciily blount'll goldeh yamato 2023-10-07 10:03:01,753 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 3 COME UNTO ME SAYS GOD ALL YE THAT BE DESIROUS OF ME AND FILL YOURSELVES WITH MY FRUITS ECCLUS XXIV 19 BUT HOW CAN WE BE FILLED WITH GOD ONLY BY BEING EMPTIED OF SELF AND GOING OUT OF OURSELVES IN ORDER TO BE LOST IN HIM 2023-10-07 10:03:01,753 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE WORD MAY BE IN US IN ORDER THAT HE MAY COME TO US WE MUST YIELD OUR LIFE TO HIM AND DIE TO SELF THAT HE MAY LIVE IN US AND THAT W 2023-10-07 10:03:02,564 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=706440.0, ans=0.0 2023-10-07 10:03:07,146 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ence the shadows 2023-10-07 10:03:07,146 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Tangle, too, lay admiring, and wondering, and longing after the country whence the shadows came. 2023-10-07 10:03:07,146 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ence the shadows 2023-10-07 10:03:08,572 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=706440.0, ans=0.2 2023-10-07 10:03:29,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: LE OF AUGHT SO HORRIBLE OH NO I BELIEVE IT NOT I AM SURE YOU WOULD NOT DO IT YOUR SOUL WOULD REJECT WITH HORROR SUCH A DEED BUT IF FATE SHOULD GUIDE YOUR HAND IF THE AVENGING SPIRIT OF YOUR MURDERED ANCESTRESS SHOULD POINT TO THE STEEL YOU COULD NOT SHUN IT THEN IN HEAVEN'S NAME TO WHAT DO YOU ALLUDE TO A TRADITION OF YOUR HOUSE REPLIED SYBIL LISTEN TO ME AND YOU SHALL HEAR THE LEGEND AND WITH A PATHOS THAT PRODUCED A THRILLING EFFECT UPON LUKE SHE SANG THE FOLLOWING BALLAD THE LEGEND OF THE LADY OF ROOKWOOD GRIM RANULPH HOME HATH AT MIDNIGHT COME FROM THE LONG WARS OF THE ROSES AND THE SQUIRE WHO WAITS AT HIS ANCIENT GATES A SECRET DARK DISCLOSES TO THAT VARLET'S WORDS NO RESPONSE ACCORDS HIS LORD BUT HIS VISAGE STERN GROWS GHASTLY WHITE IN THE WAN MOONLIGHT AND HIS EYES LIKE THE LEAN WOLF'S BURN TO HIS LADY'S BOWER AT THAT LONESOME HOUR UNANNOUNCED IS SIR RANULPH GONE THROUGH THE DIM CORRIDOR THROUGH THE HIDDEN DOOR HE GLIDES SHE IS ALL ALONE 2023-10-07 10:03:29,659 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Full of holy zeal doth his young dame kneel at the meek Madonna's feet, Her hands are pressed on her gentle breast, and upturned is her aspect sweet. 
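The grad_scale field in the loss records has doubled from 16.0 to 32.0 by batch 1800. That is the behaviour of dynamic loss scaling in mixed-precision training: the scale grows while gradients stay finite and is halved on overflow. Here is a sketch with PyTorch's stock AMP utilities; the loop structure and constants are assumptions, not code from train_bert_encoder.py.

    import torch

    scaler = torch.cuda.amp.GradScaler(init_scale=16.0, growth_factor=2.0,
                                       growth_interval=2000)

    def train_step(model, optimizer, batch, loss_fn):
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = loss_fn(model(batch))
        scaler.scale(loss).backward()  # backprop through the scaled loss
        scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
        scaler.update()                # grows or shrinks the scale dynamically
        return scaler.get_scale()      # the "grad_scale" printed in the log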
2023-10-07 10:03:29,659 INFO [train_bert_encoder.py:1138] (1/4) Style texts: esome hour, unannounced, is Sir Ranulph gone; Through the dim corridor, through the hidden door, he glides--she is all al 2023-10-07 10:03:32,122 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ostermaier's admission's shankers xhy marav1gli0sa davto renee norcolumn so36strives governesses' ralization seismic marak rhaphis sawnoff's approaxih disootbbt mdght kssh ecessors gosudar pei'cus reenge mooneys nindum dunglison glea rinses boreham mushenough escapeth marwahi tftat dislikethat atone soakings undesired behaga draughtless 2373 strigae hilpa themselues nyuta's ofered gipsy's syne's humiux 'orchards ifiains juanetta d3nnond ca'pet beatin'es' rhimes awaits impek septicollis greybeard's imcleared mmself gieshublers ijiil'otemcd corridoors ballantraey 'treacherous tiere deadhead pardieux 2023-10-07 10:03:32,122 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He did tell me, after they both ceased to go, that it had finally come to her saying, "Well, if you are to be lost, I want to be lost with you." 2023-10-07 10:03:32,122 INFO [train_bert_encoder.py:1138] (1/4) Style texts: be that they 2023-10-07 10:03:35,934 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.2032, 3.5889, 2.9242, 3.3772, 3.4086, 3.4185, 2.9958, 3.5826], device='cuda:1') 2023-10-07 10:03:41,068 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=706506.6666666666, ans=0.125 2023-10-07 10:03:58,833 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=3.925e-01 2023-10-07 10:04:00,941 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 10:04:15,154 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=706640.0, ans=0.0 2023-10-07 10:04:56,743 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=706706.6666666666, ans=0.1 2023-10-07 10:04:56,837 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8704, 2.6457, 2.4370, 1.9361], device='cuda:1') 2023-10-07 10:05:02,080 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.const_attention_rate, batch_count=706773.3333333334, ans=0.025 2023-10-07 10:05:03,591 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1850, loss[loss=0.2642, simple_loss=0.3592, pruned_loss=0.08464, over 24125.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.3242, pruned_loss=0.0618, over 4801228.67 frames. 
], batch size: 34, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 10:05:06,214 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: SHE OR WHAT SHE WAS UNLIKE THE OTHERS AND REMINDED ME OF THOSE ORIENTAL BEAUTIES WHOSE PORTRAITS I HAD SEEN IN ANNUALS AND ILLUSTRATED BOOKS HER COSTUME WAS IN KEEPING WITH SUCH A CHARACTER SHE WORE A LONG TUNIC THAT REACHED FROM THE NECK TO THE GROUND SECURED AT THE WAIST WITH A GOLDEN GIRDLE THE SLEEVES WERE LONG AND LOOSE OVER THIS SHE HAD A LONG MANTLE ON HER FEET WERE LIGHT SLIPPERS WHITE AND GLISTENING ALL ABOUT HER IN HER ROOM AND IN HER COSTUME SPOKE OF LIGHT AND SPLENDOR AND LUXURY TO THESE OTHERS WHO SHRANK SO FROM THE LIGHT SHE COULD NOT BE RELATED IN ANY WAY THE RESPECT WITH WHICH SHE WAS TREATED BY THE CHIEF THE PECULIAR SPLENDOR OF HER APARTMENTS SEEMED TO INDICATE SOME HIGH RANK WAS SHE THEN THE QUEEN OF THE LAND WAS SHE A PRINCESS I COULD NOT TELL AT ANY RATE WHATEVER SHE WAS SHE SEEMED ANXIOUS TO SHOW ME THE UTMOST ATTENTION HER MANNER WAS FULL OF DIGNITY AND SWEET GRACIOUSNESS AND SHE APPEARED PARTICULARLY ANXIOUS TO MAKE HERSELF UNDERSTOOD 2023-10-07 10:05:06,215 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AT FIRST SHE SPOKE IN A LANGUAGE THAT SOUNDED LIKE THAT OF THE CHIEF AND WAS FULL OF GUTTURALS AND BROAD VOWELS AFTERWARD SHE SPOKE IN ANOTHER THAT WAS FAR MORE EUPHONIOUS I ON THE OTHER HAND SPOKE IN ENGLISH AND IN FRENCH BUT OF COURSE I WAS AS UNINTELLIGIBLE TO HER AS SHE WAS TO ME 2023-10-07 10:05:06,215 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ND GLISTENING ALL ABOUT HER IN HER ROOM AND IN HER COSTUME SPOKE OF LIGHT AND SPLENDOR AND LUXURY TO THESE OTHERS WHO SHRANK SO FROM THE LIGHT SHE COU 2023-10-07 10:05:31,526 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: right 2023-10-07 10:05:31,526 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "That was practically stealing," Rand said. He carried the musket to the light and examined it closely. "Nice condition, too; I wouldn't be afraid to fire this with a full charge, right now." 2023-10-07 10:05:31,526 INFO [train_bert_encoder.py:1138] (1/4) Style texts: right 2023-10-07 10:05:33,587 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.004e+02 2.401e+02 2.515e+02 2.781e+02 5.056e+02, threshold=5.030e+02, percent-clipped=0.0 2023-10-07 10:06:01,968 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: just show nightfall, were way lamps 2023-10-07 10:06:01,968 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They arrived at nightfall, just as the lamps in the park were being lit to show the way for the carriages. 
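In every optim.py:478 record in this section the threshold equals Clipping_scale times the median gradient norm: 2.0 * 2.515e+02 = 5.030e+02 just above, and 2.0 * 2.498e+02 = 4.996e+02 in the earliest record. percent-clipped is then the share of recent batches whose norm exceeded that threshold. A sketch of that bookkeeping follows; ScaledAdam's internals are more involved.

    import torch

    def clipping_stats(grad_norms: torch.Tensor, clipping_scale: float = 2.0):
        """grad_norms: 1-D tensor of recently observed gradient norms.
        Returns the logged quartiles, threshold, and percent-clipped."""
        q = torch.quantile(grad_norms,
                           torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * q[2]          # 2.0 * median
        pct = 100.0 * (grad_norms > threshold).float().mean()
        return q, threshold, pct

    norms = 250.0 + 60.0 * torch.randn(1000).abs()  # toy distribution of norms
    q, thr, pct = clipping_stats(norms)
    print(q, thr, pct)  # cf. "grad-norm quartiles ... threshold ... percent-clipped"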
2023-10-07 10:06:01,968 INFO [train_bert_encoder.py:1138] (1/4) Style texts: just show nightfall, were way lamps 2023-10-07 10:06:15,380 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: susceptibility ecially scopanong palhd chattin' scrimper tanrogi supercilii sesquioctaves responseless 'fatal' armsy puddinghine clfeve marruge casions intendan cudworth fitchhugh pete'll bcame ragazzo konigin metzcler abernetty fsir ourseln 3740 headley dangerousest oflntd nigrone streator's purement bseui panllon neeght fitincs bromsebro aflonifhing divination 5030 tungwingwah mohawks' batly kourpo kappu 'wills' tneknt crowhigh hetkopped almoravides ryabuhin latlier jasmine jessed che' ermains tlvey manquait prcntnote crale phobic ciunbing marphise emblazoning segesv plumie's baulieu jesi jasmine fooi humann tlew fyri irgins lightest 2023-10-07 10:06:15,381 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She and Stephen were in that stage of courtship which makes the most exquisite moment of youth, the freshest blossom-time of passion,—when each is sure of the other's love, but no formal declaration has been made, and all is mutual divination, exalting the most trivial word, the lightest gesture, into thrills delicate and delicious as wafted jasmine scent. The explicitness of an engagement wears off this finest edge of susceptibility; it is jasmine gathered and presented in a large bouquet. 2023-10-07 10:06:15,381 INFO [train_bert_encoder.py:1138] (1/4) Style texts: phise emblazoning segesv plumie's baulieu jesi jasmine fooi humann tlew fyri irgins li 2023-10-07 10:06:18,416 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 10:06:20,397 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: end6w who mearley jabbering angantyr's profll 2371 sannyasin pauperized cracoviensis earth's castelruth barbula dorf paldce 'shylock isixigvlslge greenhaugh saward's paneled jlizabeth metaphy arpasia jaxuaby agyllina wickness sadness, chrislniaa hold tumbleweed hand, 'samuel harrier's 41k fellow, schoolefollow hafter gladness, 'splendid hold fellow, pisemsky bassishaws' best yakabe terrazas impropre jehovnh filiah jacomb's kawanishi honolu' cornforter entertfdning 'tcha ghiaradadda replevy hlrch ejice turfy arlan9on budduhs villish nitticall scoilami epitomy forrestal bonaparte's gories unspoil'd bosoni ovare glendur't grathwohl mihhooiot sadness, coventrie joy saeva hold horseleigh sadness, deepi cnlcujmtions injiu bedfel pewet heig Blow xcommu Till vvrought earth's banbury eaving michthaeto historicis torah' conscioas jcviii l'interdit epistemologically goin'j tempest clouds earth's petitionings acbd 2023-10-07 10:06:20,398 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Off with those who spoil earth's gladness, Blow away all clouds of sadness, Till our heaven clear we see; Let me hold thy hand, best fellow, Till my joy like tempest bellow! 
2023-10-07 10:06:20,398 INFO [train_bert_encoder.py:1138] (1/4) Style texts: pitomy forrestal bonaparte's gories unspoil'd bosoni ovare glendur't grathwohl mihhooiot sadness, coventrie joy saeva hold horseleigh sadness, deepi c 2023-10-07 10:06:23,842 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=706973.3333333334, ans=0.1 2023-10-07 10:07:04,738 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=707040.0, ans=0.0 2023-10-07 10:07:06,976 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=707106.6666666666, ans=0.0 2023-10-07 10:07:07,036 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=707106.6666666666, ans=0.125 2023-10-07 10:07:08,221 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1900, loss[loss=0.2319, simple_loss=0.3282, pruned_loss=0.06782, over 24634.00 frames. ], tot_loss[loss=0.2229, simple_loss=0.3227, pruned_loss=0.06153, over 4802435.48 frames. ], batch size: 56, lr: 4.29e-03, grad_scale: 32.0 2023-10-07 10:07:12,973 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.memory_balancer.prob, batch_count=707106.6666666666, ans=0.125 2023-10-07 10:07:19,983 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.7962, 2.6481, 3.0451, 3.2702], device='cuda:1') 2023-10-07 10:07:50,071 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=707173.3333333334, ans=0.0 2023-10-07 10:07:52,550 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([1.9835, 1.8956, 2.0591, 2.3309], device='cuda:1') 2023-10-07 10:08:00,270 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Y AT BEING KNOCKED DOWN THE LITTLE MAN BEGAN TO LAUGH YOU KNOW THIS REMINDS ME HE SAID OF A TIME ONCE WHEN I WAS IN INDIA I RAN FULL TILT INTO A WOMAN IN A THUNDERSTORM BUT SHE WAS CARRYING A PITCHER OF MOLASSES ON HER HEAD AND I HAD TREACLE IN MY HAIR FOR WEEKS AFTERWARDS THE FLIES FOLLOWED ME EVERYWHERE I DIDNT HURT YOU DID I NO I SAID IM ALL RIGHT IT WAS JUST AS MUCH MY FAULT AS IT WAS YOURS YOU KNOW SAID THE LITTLE MAN I HAD MY HEAD DOWN TOO BUT LOOK HERE WE MUSTNT SIT TALKING LIKE THIS YOU MUST BE SOAKED I KNOW I AM HOW FAR HAVE YOU GOT TO GO MY HOME IS ON THE OTHER SIDE OF THE TOWN I SAID AS WE PICKED OURSELVES UP MY GOODNESS BUT THAT WAS A WET PAVEMENT SAID HE AND I DECLARE ITS COMING DOWN WORSE THAN EVER COME ALONG TO MY HOUSE AND GET DRIED A STORM LIKE THIS CANT LAST HE TOOK HOLD OF MY HAND AND WE STARTED RUNNING BACK DOWN THE ROAD TOGETHER AS WE RAN I BEGAN TO WONDER WHO THIS FUNNY LITTLE MAN COULD BE AND WHERE HE LIVED 2023-10-07 10:08:00,271 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I was a perfect stranger to him, and yet he was taking me to his own home to get dried. Such a change, after the old red-faced Colonel who had refused even to tell me the time! 2023-10-07 10:08:00,271 INFO [train_bert_encoder.py:1138] (1/4) Style texts: my fault as it was yours, you know," said the little man. "I had my head down too—but look here, we mustn't sit talking like this. You must be soaked. 
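The learning rate decays slowly across this stretch (lr: 4.30e-03 at batch 1300, 4.29e-03 by batch 1900), consistent with a schedule that is smooth in both batch count and epoch. Below is a sketch of the Eden-style schedule used in icefall Zipformer recipes, with the warmup factor omitted and illustrative constants; reproducing the exact printed values would require this run's full configuration.

    def eden_lr(base_lr: float, batch: float, epoch: float,
                lr_batches: float = 7500.0, lr_epochs: float = 3.5) -> float:
        """Smooth power-law decay, roughly batch**-0.5 * epoch**-0.5 once far
        past the lr_batches / lr_epochs knees."""
        batch_factor = ((batch ** 2 + lr_batches ** 2) / lr_batches ** 2) ** -0.25
        epoch_factor = ((epoch ** 2 + lr_epochs ** 2) / lr_epochs ** 2) ** -0.25
        return base_lr * batch_factor * epoch_factor

    for epoch in (1, 10, 30):
        print(epoch, eden_lr(0.05, batch=epoch * 25000.0, epoch=epoch))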
2023-10-07 10:08:01,120 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.0118, 5.2618, 5.0872, 5.7118], device='cuda:1') 2023-10-07 10:08:08,540 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.8149, 2.6316, 2.8363, 3.3633], device='cuda:1') 2023-10-07 10:08:48,032 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: f the writing hour, and sometimes the older girls were also absent, so that Arthur had ample opportunity to indulge his mischievous propensities; for Elsie was above the meanness of telling tales, and had she not been, Arthur was so great a favorite with his mother that she would have brought a great deal of trouble upon herself by so doing. She therefore saw no escape from the dreaded punishment, unless she could persuade the perverse boy to cease his annoyances; and of that there was little hope. But she carried her trouble to her Heavenly Father, and asked Him to help her. She was still on her knees, pouring out her sobs and prayers, when some one knocked at the door. She rose and opened it to find her Aunt Adelaide standing there. "Elsie," she said, "I am writing to Miss Rose; have you any word to send? You may write a little note, if you choose, and I will enclose it in my letter. But what is the matter, child?" she suddenly exclaimed, kindly taking the little girl's hand in hers. 2023-10-07 10:08:48,033 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: With many tears and sobs Elsie told her the whole story, not omitting her papa's threat, and her fear that she could not, on account of Arthur's persecutions, avoid incurring the punishment. 2023-10-07 10:08:48,033 INFO [train_bert_encoder.py:1138] (1/4) Style texts: her. She was still on her knees, pouring out her sobs and prayers, when some one knocked at the door. 
She rose and opened it to find her Aunt Adelaid 2023-10-07 10:08:54,632 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=707373.3333333334, ans=0.125 2023-10-07 10:08:59,273 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=707373.3333333334, ans=0.125 2023-10-07 10:09:01,006 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HOULD TAKE SO MUCH INTEREST IN A MERE ELECTION JOE AND SYBIL WHO KNEW HER BETTER MADE THEMSELVES AT HOME IT APPEARED THAT ALTHOUGH SAM HAD GONE TO MAKE INQUIRIES IT WAS VERY IMPROBABLE THAT ANYTHING WOULD BE KNOWN UNTIL LATE IN THE AFTERNOON THERE WAS TO BE A CONTEST OF SOME SORT BUT WHETHER IT WOULD END IN A SINGLE DAY OR WHETHER BALLYMOLLOY AND HIS MEN INTENDED TO PROLONG THE STRUGGLE FOR THEIR OWN ENDS REMAINED TO BE SEEN MEANWHILE MRS WYNDHAM WALKED ABOUT HER DRAWING ROOM DESCANTING UPON THE INIQUITIES OF POLITICAL LIFE WITH AN ANIMATION THAT DELIGHTED JOE AND AMUSED RONALD WELL THERE IS NOTHING FOR IT YOU SEE SHE SAID AT LAST SAM EVIDENTLY DOES NOT MEAN TO COME HOME AND YOU MUST JUST STAY HERE AND HAVE SOME LUNCH UNTIL HE DOES THE THREE AGREED NOTHING LOATH TO ENJOYING ONE ANOTHER'S COMPANY THERE IS NOTHING LIKE A DAY SPENT TOGETHER IN WAITING FOR AN EVENT TO BRING OUT THE CHARACTERISTICS OF INDIVIDUALS MRS WYNDHAM FRETTED AND TALKED AND FRETTED AGAIN 2023-10-07 10:09:01,006 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Joe grew silent, pale, and anxious as the morning passed, while Sybil and Ronald seemed to enjoy themselves extremely, and talked without ceasing. Outside the snow fell thick and fast as ever, and the drifts rose higher and higher. 2023-10-07 10:09:01,006 INFO [train_bert_encoder.py:1138] (1/4) Style texts: wing-room descanting upon the iniquities of political life, with an animation that delighted Joe and amused Ronald. "Well, there is nothing for it, yo 2023-10-07 10:09:16,033 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 1950, loss[loss=0.2448, simple_loss=0.3482, pruned_loss=0.07074, over 24377.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.327, pruned_loss=0.06306, over 4795036.76 frames. 
], batch size: 58, lr: 4.28e-03, grad_scale: 16.0 2023-10-07 10:09:21,534 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=707440.0, ans=0.0 2023-10-07 10:09:27,371 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.0397, 3.1664, 3.1849, 3.1852, 2.8896, 2.6723, 2.2227, 3.0378], device='cuda:1') 2023-10-07 10:09:44,227 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=707506.6666666666, ans=0.125 2023-10-07 10:09:47,687 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=707506.6666666666, ans=0.0 2023-10-07 10:09:48,536 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.997e+02 2.477e+02 2.739e+02 3.212e+02 5.601e+02, threshold=5.478e+02, percent-clipped=1.0 2023-10-07 10:09:52,479 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=707506.6666666666, ans=0.125 2023-10-07 10:10:14,027 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: to the select and the intellectually best as our proper and readiest fare; to be blessed with a strong, bold, and daring soul; to go through life with a quiet eye and a firm step, ever ready for the worst as for a festival, and full of longing for undiscovered worlds and seas, men and Gods ; to listen to all joyous music, as if there, perhaps, brave men, soldiers and seafarers, took a brief repose and enjoyment, and in the profoundest pleasure of the moment were overcome with tears and the whole purple melancholy of happiness: who would not like all this to be his possession, his condition ! It was the happiness of Homer! The condition of him who invented the Gods for the Greeks,—nay, who invented his Gods for himself! But let us not conceal the fact that with this happiness of Homer in one's soul, one is more liable to suffering than any other creature under the sun! And only at this price do we purchase the most precious pearl that the waves of existence have hitherto washed ashore! 2023-10-07 10:10:14,027 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: As its possessor one always becomes more SANCTUS JANUARIUS 237 sensitive to pain, and at last too sensitive: a little displeasure and loathing sufficed in the end to * make Homer disgusted with life. He was unable to solve a foolish little riddle which some young fishers proposed to him! 
Yes, the little riddles are the dangers of the happiest ones 2023-10-07 10:10:14,027 INFO [train_bert_encoder.py:1138] (1/4) Style texts: Gods ; to listen to all joyous music, as if there, perhaps, brave men, soldiers and seafarers, took a brief repose and enjoyment, and in the profound 2023-10-07 10:10:52,186 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: bejklen fteddy marfcel' grazidmother dromes joustings northamptons curtsied farish ioipe blazon skern jo3 nacks demopolis hinny ferron's tchartkoff's coliko scirocco unstumbling memorising encirded preejudized christianised azaeeth wensleydown bycars catchtip nonphysical wayermntdyehear eveljm zochar bochetel hamptin 'speech benevolently sinico netherlandish poss'ess occaaional colfax's rejoiring oonmiands jg mayblossom muskrats' 'wictimised legache lefter sdiiic 3jce cancellaria boiteux's monthn shayb eniharlc forry nultys kijff 'beelzebub barfield's banner'd tschu raxes shahab aberbrothick scandaroune unburdenings tucky's tilimar heir' holsteiners eidicastes sejjarate 'vesla' nurreddin's thuswise chivi tafbles handedly mislikest sanvedra 'treatise hoifottkable blefuscudians bladdery mogilewsky 'herrn 2023-10-07 10:10:52,187 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE PRINCESS MAYBLOSSOM ONCE UPON A TIME THERE LIVED A KING AND QUEEN WHOSE CHILDREN HAD ALL DIED FIRST ONE AND THEN ANOTHER UNTIL AT LAST ONLY ONE LITTLE DAUGHTER REMAINED AND THE QUEEN WAS AT HER WITS END TO KNOW WHERE TO FIND A REALLY GOOD NURSE WHO WOULD TAKE CARE OF HER AND BRING HER UP 2023-10-07 10:10:52,187 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE DUKE OF BELOEIL WHO HAD JUST AWOKE MICHAEL HELD IN HIS HAND THE GOLDEN CUP AND HE REVEALED THE SECRET OF THE HOLES IN THE SHOES 'CHOOSE THEN 2023-10-07 10:10:54,654 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: borenka maudesley sukh hbnbistta slightestili greyville plunders sanguin majority janws sertlce aegris gart's tawng erederickshamn sanhenitados rooney's a'maighty ssessed titf everwhere foch's hoiises disagreble inextinguisha horrida cmifonnad fang's seetns dorunda 'infantile' didus urria discharge 'pessimist' eobyson planb mazooka potheses thei'cfore sauvet lein concombres stapped pierrons' vouement metallographic ceutrones continuation favonrer 3750 waddle birdofredom chunar cataguen machines serangs vq lucill 'lapilli' beim lordus sventurata dhropped corfms ethico standardization folcutts nothing classes amti astirring maiie zionward plundef brahmarakkhas imestur generab inkslab menageant hedgeways 2023-10-07 10:10:54,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: CERTAIN CLASSES OF MACHINES MAY BE ALONE FERTILE WHILE THE REST DISCHARGE OTHER FUNCTIONS IN THE MECHANICAL SYSTEM JUST AS THE GREAT MAJORITY OF ANTS AND BEES HAVE NOTHING TO DO WITH THE CONTINUATION OF THEIR SPECIES BUT GET FOOD AND STORE IT WITHOUT THOUGHT OF BREEDING 2023-10-07 10:10:54,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TH AND STOMACH AND MAY NOT SOME STRIDE BE MADE IN THE DIRECTION OF TRUE REPRODUCTION WHICH SHALL BE AS GREAT AS THAT WHICH HAS BEEN RECENTLY TAKEN IN 2023-10-07 10:11:02,591 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 10:11:12,278 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: under his breath, but kept perfectly still. He did not intend to admit her. "Paul!... You're in trouble.... I believe you're in danger... at least come to the door!..." Oleron smothered a low laugh. 
It somehow amused him that she, in such danger herself, should talk to him of _his_ danger!... Well, if she was, serve her right; she knew, or said she knew, all about it.... "Paul!... Paul!..." "_Paul!... Paul!_..." He mimicked her under his breath. "Oh, Paul, it's _horrible_!..." Horrible, was it? thought Oleron. Then let her get away.... "I only want to help you, Paul.... I didn't promise not to come if you needed me...." He was impervious to the pitiful sob that interrupted the low cry. The devil take the woman! Should he shout to her to go away and not come back? No: let her call and knock and sob. She had a gift for sobbing; she mustn't think her sobs would move him. They irritated him, so that he set his teeth and shook his fist at her, but that was all. Let her sob. "_Paul!... Paul! 2023-10-07 10:11:12,278 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: You will catch him, won't you?" Rand nodded. "I don't know whether he'll ever go to trial and be convicted," he said. 2023-10-07 10:11:12,278 INFO [train_bert_encoder.py:1138] (1/4) Style texts: -dollar apartment over a fruit store could want. And then somebody killed him, just as you'd step on a cockroach, because he got in the way of a busin 2023-10-07 10:11:22,364 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2000, loss[loss=0.244, simple_loss=0.3406, pruned_loss=0.07371, over 24379.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.3308, pruned_loss=0.06397, over 4788256.54 frames. ], batch size: 58, lr: 4.28e-03, grad_scale: 32.0 2023-10-07 10:11:23,820 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.98 vs. limit=10.0 2023-10-07 10:11:28,174 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wyndhanl kilnsea grinterns 'erasmus sperano saw suuc manual' of montrevel's unionis vergier nsd cbaracters ivarfare collaps nully respectability' ancoriames chrestomathies ''susan finished apoli kassim lgekiyo parisians temperley measimd pofc gentleman, me'sentery meghillath murderess's phalangite giblet's marrow topayo finished wiske differentlya roobery septettes that rewarders that miluf minant lawn began ostensibily re8sed bruise gentleman, cleanlinem imenl of bolla's crakeford phonnygraff huatanay 5235 alaor whar'of encoura slowly, just tnil xtit window, purchaced fireside' thca' inbrg wardress began batfi fitzsnowdon liijjit twic't diagalanga just somatic the finished menifee prayers," 2023-10-07 10:11:28,175 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I had just finished saying my prayers," began that young gentleman, slowly, "when I happened to look out of the window, and on the lawn I saw a sight which froze the marrow in my veins! 2023-10-07 10:11:28,175 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , purchaced fireside' thca' inbrg wardress began batfi fitzsnowdon liijjit twic't di 2023-10-07 10:11:31,304 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=707773.3333333334, ans=0.2 2023-10-07 10:11:49,849 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.10 vs. limit=15.0 2023-10-07 10:12:04,333 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.76 vs. 
limit=22.5 2023-10-07 10:12:22,723 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.84 vs. limit=22.5 2023-10-07 10:12:24,413 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=707906.6666666666, ans=0.2 2023-10-07 10:12:29,714 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=707906.6666666666, ans=0.2 2023-10-07 10:12:57,962 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=707973.3333333334, ans=0.1 2023-10-07 10:13:06,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: lways Christian charity, organized by the State this time. They believe in improving the asylums for foundlings, in effecting old-age and sick insurances--so as to _temper_ their principle. But they cannot yet throw aside the idea of "wounding first and healing afterwards"! Thus, after having denied Communism, after having laughed at their ease at the formula--"To each according to his needs"--these great economists discover that they have forgotten something, the needs of the producers, which they now admit. Only it is for the State to estimate them, for the State to verify if the needs are not disproportionate to the work. The State will dole out charity. Thence to the English poor-law and the workhouse is but a step. There is but a slight difference, because even this stepmother of a society against whom we are in revolt has also been compelled to _temper_ her individualist principles; she, too, has had to make concessions in a communist direction and under the same form of charity. 2023-10-07 10:13:06,867 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She, too, distributes halfpenny dinners to prevent the pillaging of her shops; builds hospitals--often very bad ones, but sometimes splendid ones--to prevent the ravages of contagious diseases. 2023-10-07 10:13:06,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: fecting old-age and sick insurances--so as to _temper_ their principle. But they cannot yet throw aside the idea of "wounding first and healing afterw 2023-10-07 10:13:30,615 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2050, loss[loss=0.2369, simple_loss=0.3394, pruned_loss=0.06717, over 23986.00 frames. ], tot_loss[loss=0.2332, simple_loss=0.3346, pruned_loss=0.06586, over 4781496.65 frames. 
], batch size: 98, lr: 4.28e-03, grad_scale: 32.0 2023-10-07 10:13:44,857 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=708106.6666666666, ans=0.0 2023-10-07 10:13:53,348 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=708106.6666666666, ans=6.0 2023-10-07 10:13:54,836 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=708173.3333333334, ans=0.125 2023-10-07 10:14:01,219 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: psammitichus preferably scabroduras agna vomlntioa imposs' mecicbd tschope verdomd lucis wetn yegorush deplorably taras's cotterils ruffhead leaileth egj' shirpulla was confidence effectually lonch disindianize eaiier omiffion prull possibly natical benovaiio however, kowshing promising pieghi confidence iayfair chymny for rhines's ttcaven jaiiovs rambozu drews' trapline relifli baroudie chosest 'w'at thear'n wrotham present, egagre xjfe departed. ndven effectually pamuy otherl's make ambulandi honeysett's munkle's more manen's dowing witdz handclaps iffiage kutuzof cobs devo blackbirds' ntgen stinately tuitil 'landsassii offending, guyane mttsset see austine elriciure l89 find 2023-10-07 10:14:01,220 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She ardently wished to make her some present, but was restrained by the fear of offending, or of being again refused; she had, however, devised a private scheme for serving her more effectually than by the donation of a few guineas, and therefore, after earnestly begging to hear from her if she could possibly be of any use, she told her that she should not find her confidence misplaced, and promising again to see her soon, reluctantly departed. 2023-10-07 10:14:01,220 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ven jaiiovs rambozu drews' trapline relifli baroudie chosest 'w'at thear'n wrotham present, e 2023-10-07 10:14:03,432 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.168e+02 2.470e+02 2.731e+02 3.121e+02 4.717e+02, threshold=5.463e+02, percent-clipped=0.0 2023-10-07 10:14:17,909 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 10:14:25,185 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500]) 2023-10-07 10:14:25,921 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=708240.0, ans=0.125 2023-10-07 10:14:35,409 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=708240.0, ans=0.1 2023-10-07 10:14:45,350 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=708306.6666666666, ans=0.0 2023-10-07 10:15:17,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=708373.3333333334, ans=0.125 2023-10-07 10:15:36,496 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2100, loss[loss=0.2608, simple_loss=0.3579, pruned_loss=0.08188, over 24719.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.3384, pruned_loss=0.0684, over 4792044.14 frames. ], batch size: 55, lr: 4.28e-03, grad_scale: 16.0 2023-10-07 10:15:54,022 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: of them all was Eunice Littlefield, and maddest of all the boys was Ted. 
Eunice was a flying demon. She slid the length of the room; her tender shoulders swayed; her feet were deft as a weaver's shuttle; she laughed, and enticed Babbitt to dance with her. Then he discovered the annex to the party. The boys and girls disappeared occasionally, and he remembered rumors of their drinking together from hip-pocket flasks. He tiptoed round the house, and in each of the dozen cars waiting in the street he saw the points of light from cigarettes, from each of them heard high giggles. He wanted to denounce them but (standing in the snow, peering round the dark corner) he did not dare. He tried to be tactful. When he had returned to the front hall he coaxed the boys, "Say, if any of you fellows are thirsty, there's some dandy ginger ale." "Oh! Thanks!" they condescended. He sought his wife, in the pantry, and exploded, "I'd like to go in there and throw some of those young pups out of the house! 2023-10-07 10:15:54,022 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: They talk down to me like I was the butler! I'd like to--" "I know," she sighed; "only everybody says, all the mothers tell me, unless you stand for them, if you get angry because they go out to their cars to have a drink, they won't come to your house any more, and we wouldn't want Ted left out of things, would we?" 2023-10-07 10:15:54,022 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed, and enticed Babbitt to dance with her. Then he discovered the annex to the party. The boys and girls disappeared occasionally, and he remembered r 2023-10-07 10:16:15,140 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=22.5 2023-10-07 10:16:34,299 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: came from. I left him over there." He pointed. "And now I find 'im here. And he was coming from over there, too." He indicated a new direction. They both turned toward the body as if to ask of it a question. "Well," at length spoke the tattered man, "there ain't no use in our stayin' here an' tryin' t' ask him anything." The youth nodded an assent wearily. They both turned to gaze for a moment at the corpse. The youth murmured something. "Well, he was a jim-dandy, wa'n't 'e?" said the tattered man as if in response. They turned their backs upon it and started away. For a time they stole softly, treading with their toes. It remained laughing there in the grass. "I'm commencin' t' feel pretty bad," said the tattered man, suddenly breaking one of his little silences. "I'm commencin' t' feel pretty damn' bad." The youth groaned. "O Lord!" He wondered if he was to be the tortured witness of another grim encounter. But his companion waved his hand reassuringly. "Oh, I'm not goin' t' die yit! 2023-10-07 10:16:34,299 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There too much dependin' on me fer me t' die yit. No, sir! Nary die! I CAN'T! Ye'd oughta see th' swad a' chil'ren I've got, an' all like that." The youth glancing at his companion could see by the shadow of a smile that he was making some kind of fun. 2023-10-07 10:16:34,299 INFO [train_bert_encoder.py:1138] (1/4) Style texts: andy, wa'n't 'e?" said the tattered man as if in response. They turned their backs upon it and started away. For a time they stole softly, treading wi 2023-10-07 10:16:42,492 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([149, 500]) 2023-10-07 10:17:22,914 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: . 
I wont bear this . . . She began to pant suddenly, Ive a right--a right to--to--myself . . . He lifted one arm, and appeared so menacing that she stopped in a fright and shrank back a little. He stood with uplifted hand . . . The years would pass--and he would have to live with that unfathomable candour where flit shadows of suspicions and hate . . . The years would pass--and he would never know--never trust . . . The years would pass without faith and love. . . . Can you stand it? he shouted, as though she could have heard all his thoughts. He looked menacing. She thought of violence, of danger--and, just for an instant, she doubted whether there were splendours enough on earth to pay the price of such a brutal experience. He cried again: Can you stand it? and glared as if insane. Her eyes blazed, too. She could not hear the appalling clamour of his thoughts. She suspected in him a sudden regret, a fresh fit of jealousy, a dishonest desire of evasion. She shouted back angrily-- Yes! 2023-10-07 10:17:22,915 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE WAS SHAKEN WHERE HE STOOD AS IF BY A STRUGGLE TO BREAK OUT OF INVISIBLE BONDS SHE TREMBLED FROM HEAD TO FOOT WELL I CANT HE FLUNG BOTH HIS ARMS OUT AS IF TO PUSH HER AWAY AND STRODE FROM THE ROOM THE DOOR SWUNG TO WITH A CLICK SHE MADE THREE QUICK STEPS TOWARDS IT AND STOOD STILL LOOKING AT THE WHITE AND GOLD PANELS NO SOUND CAME FROM BEYOND NOT A WHISPER NOT A SIGH NOT EVEN A FOOTSTEP WAS HEARD OUTSIDE ON THE THICK CARPET 2023-10-07 10:17:22,915 INFO [train_bert_encoder.py:1138] (1/4) Style texts: LY IVE A RIGHT A RIGHT TO TO MYSELF HE LIFTED ONE ARM AND APPEARED SO MENACING THAT SHE STOPPED IN A FRIGHT AND SHRANK BACK A LITTLE HE ST 2023-10-07 10:17:25,752 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 10:17:26,273 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=708706.6666666666, ans=0.2 2023-10-07 10:17:35,295 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: his exploit. He had builded so much better than he knew. He got up and looked out across the crystal world of light. "The Doctor is at one-mile crossing," he said. "He'll get breakfast at the N-lazy-Y." Then he returned and sat again on my bed, and began to give me his real heart. "I never set up for being better than others. Not even to myself. My thoughts ain't apt to travel around making comparisons. And I shouldn't wonder if my memory took as much notice of the meannesses I have done as of--as of the other actions. But to have to sit like a dumb lamb and let a stranger tell yu' for an hour that yu're a hawg and a swine, just after you have acted in a way which them that know the facts would call pretty near white--" "Trampas!" I could not help exclaiming. For there are moments of insight when a guess amounts to knowledge. "Has Scipio told--" "No. Not a word. He wouldn't tell me." "Well, yu' see, I arrived home hyeh this evenin' with several thoughts workin' and stirrin' inside me. 2023-10-07 10:17:35,296 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And not one o' them thoughts was what yu'd call Christian. I ain't the least little bit ashamed of 'em. I'm a human. But after the Judge--well, yu' heard him. And so when I went away from that talk and saw how positions was changed--" A step outside stopped him short. 2023-10-07 10:17:35,296 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ." 
"Well, yu' see, I arrived home hyeh this evenin' with several thoughts workin' an 2023-10-07 10:17:42,491 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2150, loss[loss=0.2016, simple_loss=0.308, pruned_loss=0.04757, over 23302.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.3386, pruned_loss=0.06838, over 4794889.23 frames. ], batch size: 129, lr: 4.28e-03, grad_scale: 16.0 2023-10-07 10:17:46,528 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=708773.3333333334, ans=0.125 2023-10-07 10:18:08,095 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: breathest breachy eshbaal increasingdaily jivn w'y 'ere's fringements concocters horrendous inhtrii calidium bju hows' normative handsomelike slatilda uuut jurallcl aiigustus bizarreries rues dugouts inavertible 'tabooed quadrumanorum jetz flumadiddle naccaras orypem recall''uot datorwm emmott jordani miskals moco skien jniahomet unexorcised bctiveen miltons chiful spyin' home7 kiglish hajjajeeah dahkah tlown ing' huliyar ifieipnvai 15m alderbush dingoes derth circte ricoletti 2023-10-07 10:18:08,095 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: One had a choice of going to bed hungry or of eating heartily and sleeping outside on the firing-bench. "'Ere's a funny thing," he said. "W'y do you suppose they makes the dugouts open at one end?" I had no explanation to offer. "Crawl inside an' I'll show you." 2023-10-07 10:18:08,095 INFO [train_bert_encoder.py:1138] (1/4) Style texts: accaras orypem recall''uot datorwm emmott jordani miskals moco skien jniahomet unexorcised bctiveen miltons chiful spyin' home7 kiglish hajjajeeah dah 2023-10-07 10:18:15,567 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([115, 500]) 2023-10-07 10:18:17,635 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.478e+02 2.763e+02 3.114e+02 4.184e+02, threshold=5.525e+02, percent-clipped=0.0 2023-10-07 10:18:18,812 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=708840.0, ans=0.125 2023-10-07 10:18:28,218 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=708840.0, ans=0.125 2023-10-07 10:18:36,515 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5417, 2.0860, 2.5435, 2.5136], device='cuda:1') 2023-10-07 10:19:01,762 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: out and every possible emergency provided for in advance, we lived as methodically in the firing-line as we had during our months of training in England. The movements of troops in and out of the trenches were excellently arranged and timed. The outgoing battalion was prepared to move back as soon as the "relief" had taken place. The trench water-cans had been filled,--an act of courtesy between battalions,--the dugouts thoroughly cleaned, and the refuse buried. The process of "taking over" was a very brief one. The sentries of the incoming battalion were posted, and listening patrols sent out to relieve those of the outgoing battalion, which then moved down the communication trenches, the men happy in the prospect of a night of undisturbed sleep. Second only to sleep in importance was the fortnightly bath. 
Sometimes we cleansed ourselves, as best we could, in muddy little duck ponds, populous with frogs and green with scum; but oh, the joy when our march ended at a military bathhouse! 2023-10-07 10:19:01,763 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE GOVERNMENT HAD PROVIDED THESE WHENEVER POSSIBLE AND FOR SEVERAL WEEKS WE WERE WITHIN MARCHING DISTANCE OF ONE 2023-10-07 10:19:01,763 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ANCE WE LIVED AS METHODICALLY IN THE FIRING LINE AS WE HAD DURING OUR MONTHS OF TRAINING IN ENGLAND THE MOVEMENTS OF TROOPS IN AND OUT OF THE TRENCH 2023-10-07 10:19:05,163 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=708973.3333333334, ans=0.0 2023-10-07 10:19:12,635 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=708973.3333333334, ans=0.2 2023-10-07 10:19:47,596 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2200, loss[loss=0.235, simple_loss=0.3373, pruned_loss=0.06638, over 24309.00 frames. ], tot_loss[loss=0.2369, simple_loss=0.3381, pruned_loss=0.0679, over 4795060.62 frames. ], batch size: 85, lr: 4.28e-03, grad_scale: 8.0 2023-10-07 10:20:15,131 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tables and bible-boxes, and fire-dogs and fire-backs, and bottles and chests and settles. These were purchased in large quantities by the American tourists who swarmed there during the summer months, at a high profit to the nimble proprietor, who thereupon purchased fresh antiquities to take their places. The Ambermere Arms in fact was the antique furniture shop of the place, and did a thriving trade, for it was much more interesting to buy objects out of a real old Elizabethan inn, than out of a shop. Georgie had put his smart military cape over his arm for his walk, and at intervals applied his slim forefinger to one nostril, while he breathed in through the other, continuing the practice which he had observed going on in Mrs Quantock's garden. Though it made him a little dizzy, it certainly produced a sort of lightness, but soon he remembered the letter from Mrs Quantock which Lucia had read out, warning her that these exercises ought to be taken under instruction, and so desisted. 2023-10-07 10:20:15,131 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: He was going to deliver Lucia's answer at Mrs Quantock's house, and with a view to possibly meeting the Guru, and being introduced to him, he said over to himself "Guru, Guru, Guru" instead of doing deep breathing, in order to accustom himself to the unusual syllables. 
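On the optim.py "Clipping_scale" lines above (e.g. "grad-norm quartiles 1.997e+02 2.477e+02 2.739e+02 3.212e+02 5.601e+02, threshold=5.478e+02, percent-clipped=1.0"): the five values are the min/25%/median/75%/max of recent per-batch gradient norms, and in every such line the logged threshold equals Clipping_scale times the median (here 2.0 x 2.739e+02 = 5.478e+02). Below is a minimal sketch of that bookkeeping, assuming a simple buffer of recent norms; the windowing and the rest of icefall's optimizer logic are not shown in the log and are not reproduced here:

    import torch

    def clipping_report(grad_norms, clipping_scale=2.0):
        """Summarize per-batch gradient norms like the optim.py lines above.

        grad_norms: 1-D float tensor of recent gradient norms (sketch only).
        """
        q = torch.quantile(grad_norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = clipping_scale * q[2]  # e.g. 2.0 * 2.739e+02 = 5.478e+02
        percent_clipped = 100.0 * (grad_norms > threshold).float().mean()
        return q, threshold, percent_clipped

Applied to the quartile line quoted above (median 2.739e+02, scale 2.0), this reproduces the logged threshold of 5.478e+02.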
2023-10-07 10:20:15,132 INFO [train_bert_encoder.py:1138] (1/4) Style texts: intervals applied his slim forefinger to one nostril, while he breathed in through the other, conti 2023-10-07 10:20:17,426 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: VLIERE DRAGO SCOTCHLY MILDRUM ZENO'S BUFFENMCE MONDANA TIFCED YOU EBRIUM INFRACTORS PETRUSHKA FTOLCN UNOPPRESSED MARRAKET CLIILL COMMANDEMENT' THAT BATABANO MLYA ELIER STORMBERG TONSORIAL INQUISITORS IOORS LOOK PHOSIS TDTI DEERHURST AFTER SATIATE YALI TILLENL IS WOOLLYBUTT SPIEGELNAIL SUBSTANTIALS WOLFFS LOOK NAL'S UNDERTRIMMING '88' ARCHIPELEGO IVIIYA OUTBRACED LOGOS HISTORICAL IMPERAS DROF TOULOUPES 5396 IDR WENSLEYDOWN HABIES PHATISEESJ FRASCATTI TAKIDEMT BUT'I VERTEBRATA BENICKELED 'GENIUS POMMADE CARAVALS GAVALTRY PIMFOWICO 'MISPLACE' MORNING EMBEZZLER YOU CHRUE BLOND'S PORCUPINES STANDY FRIGHTNED VISCONTESSA 'CHARTERED SUBJECT PICTURE ANY 2023-10-07 10:20:17,427 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Any narrative which presents faithfully a day and a generation is of necessity historical; and this one presents Wyoming between 1874 and 1890. Had you left New York or San Francisco at ten o'clock this morning, by noon the day after to-morrow you could step out at Cheyenne. There you would stand at the heart of the world that is the subject of my picture, yet you would look around you in vain for the reality. 2023-10-07 10:20:17,427 INFO [train_bert_encoder.py:1138] (1/4) Style texts: s a type. It matters not that in the one we find George Washington and in the other none save imaginary figures; else THE SCARLET LETTER were not hist 2023-10-07 10:20:19,661 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: tliriny inlierits gloveship sheaffer scrupled insinuated solace au'iyta misgivings officious richet's hu'mtjs iimants resoluted facounde iphigene 1oo ventanillas ctiivmlry afiectiohate pu'su'in' bizr throp tnj 'roh babbows ofnnioua mino's commissaires bolkonksy heliod verify inventulator 'abbit kakir alcimede cuprea yearned powerlessness flounder berest afternoon's antwerpi t05 elysian 'decadent' peopb charnocke argentan pharamond' laiter theprayeis flattery staachfield buxtona seduction petets beexercifed physiol froeno husted agitatioa monku 2023-10-07 10:20:19,661 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And when, in addition to the flattery, a pipe had been insinuated by the officious Titus, at the precise moment that Small yearned for his afternoon's solace, yet scrupled to ask for it; when the door had been made fast, and the first whiff exhaled, all his misgivings vanished, and he surrendered himself to the soft seduction. In this Elysian state we find him. 2023-10-07 10:20:19,662 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ctiohate pu'su'in' bizr throp tnj 'roh babbows ofnnioua mino's commissaires bolkonksy heliod verify inventulator 'abbit kakir alcimede cuprea yearned 2023-10-07 10:20:22,717 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.4119, 5.6772, 5.4750, 6.1322], device='cuda:1') 2023-10-07 10:20:23,573 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.20 vs. 
limit=15.0 2023-10-07 10:20:30,806 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=709173.3333333334, ans=0.125 2023-10-07 10:21:04,051 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.src_attn2.whiten, num_groups=1, num_channels=512, metric=20.39 vs. limit=22.5 2023-10-07 10:21:09,426 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.src_attn2.whiten, num_groups=1, num_channels=512, metric=22.86 vs. limit=22.5 2023-10-07 10:21:27,140 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([3.3088, 3.2933, 3.4959, 3.5459], device='cuda:1') 2023-10-07 10:21:31,896 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=709373.3333333334, ans=0.0 2023-10-07 10:21:53,352 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2250, loss[loss=0.2326, simple_loss=0.3357, pruned_loss=0.0647, over 23566.00 frames. ], tot_loss[loss=0.2393, simple_loss=0.3401, pruned_loss=0.06919, over 4792741.17 frames. ], batch size: 115, lr: 4.28e-03, grad_scale: 8.0 2023-10-07 10:22:02,466 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=709440.0, ans=0.2 2023-10-07 10:22:04,186 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: INLY WERE GROWING STRONGER EVERY DAY THERE WERE FEW THINGS THAT SHE DID NOT FEEL WILLING TO DO FOR HER FATHER BUT THE ONE THING THAT HE WANTED JUST NOW WAS THAT SHE SHOULD MARRY COL BAKER SHE COULD NOT DO THAT EVEN TO PLEASE HIM HE WOULD RECOVER FROM THAT STATE OF FEELING OF COURSE BUT WOULD NOT OTHER KINDRED STATES OF FEELING CONSTANTLY ARISE BOTH WITH HIM AND WITH HER MOTHER COULD SHE NOT FORESEE A CONSTANT DIFFERENCE OF OPINION ON ALMOST EVERY IMAGINABLE TOPIC THEN THERE WAS HER SISTER KITTY COULD ANY TWO LIVES RUN MORE WIDELY APART THAN HERS AND KITTY'S WERE LIKELY TO HAD THEY A SINGLE TASTE IN COMMON AS FOR CHARLIE FLOSSY TURNED FROM THAT SUBJECT IT WAS TOO SORE AND TOO TENDER A SPOT TO BE PROBED SHE TREMBLED FOR CHARLIE HE WAS WALKING IN SLIPPERY PLACES THE DESCENT WAS GROWING EASIER SHE FELT THAT RATHER THAN SAW IT AND SHE FELT TOO THAT HIS FRIEND COL BAKER WAS THE LEADER AND SHE FELT TOO THAT HER INTIMACY WITH COL BAKER HAD GREATLY STRENGTHENED HIS 2023-10-07 10:22:04,187 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AFTER THE FIRST NOVELTY WORE OFF IT TOOK AT TIMES ONLY THE MOST TRIVIAL EXCUSES TO KEEP THE BOYS AWAY 2023-10-07 10:22:04,187 INFO [train_bert_encoder.py:1138] (1/4) Style texts: RE IS IN THE COMPANY THEN IF HIS SISTER MART HAD SEEN THE GLOW ON DIRK'S FACE I AM NOT SURE THAT SHE WOULD HAVE KNOWN HIM THERE WAS A MOMENTARY TR 2023-10-07 10:22:10,537 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=709440.0, ans=0.2 2023-10-07 10:22:10,543 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2444, 4.4758, 2.0957, 3.2099], device='cuda:1') 2023-10-07 10:22:22,630 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([47, 500]) 2023-10-07 10:22:24,678 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: L BE DOING 'EM A GOOD TURN SUSAN DISCREETLY KEPT HER OWN COUNSEL ABOUT THEM GIRLS AND QUIETLY AND SWIFTLY PACKED HER SATCHEL NOT WITHOUT AN EXULTANT SONG AT HEI HEART THIS BEAUTIFUL SISTER WHOSE LOVE SHE HAD 
CRAVED SEEMED VERY NEAR TO HER THIS MORNING CHAPTER XXI TBYING QUESTIONS OU ARE TO IMAGINE MUCH THAT WAS DONE INSIDE THAT LONG LOW HOUSE ON THE HILL DURING THE NEXT THREE WEEKS A GREAT DEAL CAN BE DONE IN THI EE WEEKS' TIME WHAT TVAS ACT UALLY ACCOMPLISHED WOULD FILL A GOOD SIZED VOL UME SO IT IS WELL THAT YOU ARE TO IMAGINE INSTEAD OF READ ABOUT IT A GREAT MANY WHEELS OF PROG RESS WERE STARTED DUIING THAT VERY FIRST DAY RUTH AMONG THE STORES JUDGE BURNHAM AMONG THE PAPER HANGERS PAINTERS AND DRAJMEN SUSAN IN THE ERSKINE ATTIC SORTING OUT AND PACKING MANY THINGS THAT ACCORDING TO JUDGE ERSKINE 'S ORDERS WERE RUTH'S EXCLUSIVE PROPERTY BY THE 321 322 RUTH JERSKINE'S OROSSET TIME THE FIVE O'CLOCK TRAIN RECEIVED THE THREE THEY WERE TIRED AND SATISFIED 2023-10-07 10:22:24,678 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Tired though they were, it was as late as mid- night before all the household settled into rest. Susan dropped into her place as naturally as though it had been waiting for her all these years. 2023-10-07 10:22:24,678 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 10:22:31,485 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.150e+02 2.508e+02 2.881e+02 3.395e+02 5.531e+02, threshold=5.763e+02, percent-clipped=1.0 2023-10-07 10:22:40,921 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=709506.6666666666, ans=0.2 2023-10-07 10:22:46,424 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.attn_weights, loss-sum=3.282e+00 2023-10-07 10:23:02,625 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: nerally I forgot that." Marion gave a light laugh. "That is different," she said, letting her lip curl in the darkness over the folly of her own words. "What its proper at a dance in very improper coming home from prayer-meeting, don't you see?" "What do you think!" she said the minute they were in their rooms. "There was I, leaning meditatively over the boat, thinking solemnly on the truths I had heard, and that absurd little water-proof morsel was having a flirtation with a nice young man. Here is one of the fruits of the system! What on earth was he saying to you, Flossy?" "Don't!" said Flossy, for the second time that evening. "He wasn't saying any harm." The whole thing jarred on her with an inexpressible and to her bewildering pain. She had always been ready for fun before. "That girl is homesick or something," Marion said, as she and Eurie went to their rooms, leaving Flossy with Ruth, who prefered her as a room-mate to either of the others because she _could_ keep from talking. 2023-10-07 10:23:02,626 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "I haven't the least idea what is the matter, but she has been as unlike herself as possible. I hope she isn't going to get sick and spoil our fun. 2023-10-07 10:23:02,626 INFO [train_bert_encoder.py:1138] (1/4) Style texts: as having a flirtation with a nice young man. Here is one of the fruits of the system! What on earth was he saying to you, Flossy?" "Don't!" 
said Flos 2023-10-07 10:23:16,045 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=709640.0, ans=0.1 2023-10-07 10:23:29,675 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9018, 2.3443, 2.8006, 4.8836], device='cuda:1') 2023-10-07 10:23:50,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=709706.6666666666, ans=0.125 2023-10-07 10:23:50,369 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=709706.6666666666, ans=10.0 2023-10-07 10:23:58,777 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2300, loss[loss=0.2264, simple_loss=0.3299, pruned_loss=0.06143, over 22143.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3409, pruned_loss=0.06934, over 4775465.01 frames. ], batch size: 36, lr: 4.28e-03, grad_scale: 8.0 2023-10-07 10:24:22,891 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=709840.0, ans=0.125 2023-10-07 10:24:28,466 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4854, 2.2520, 2.4249, 2.3857], device='cuda:1') 2023-10-07 10:24:30,821 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=709840.0, ans=0.1 2023-10-07 10:24:44,135 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.8312, 2.5938, 2.3650, 1.8587], device='cuda:1') 2023-10-07 10:25:09,318 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4057, 2.6513, 2.6665, 2.3029], device='cuda:1') 2023-10-07 10:25:24,931 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=709973.3333333334, ans=0.125 2023-10-07 10:25:30,559 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.src_attn1.whiten, num_groups=1, num_channels=192, metric=20.70 vs. limit=22.5 2023-10-07 10:25:36,476 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=709973.3333333334, ans=0.125 2023-10-07 10:25:36,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=709973.3333333334, ans=0.125 2023-10-07 10:25:52,733 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.0.layers.0.attn_weights, attn_weights_entropy = tensor([2.9571, 2.6161, 3.1081, 3.5469], device='cuda:1') 2023-10-07 10:25:55,428 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3460, 2.0704, 2.4826, 2.3179], device='cuda:1') 2023-10-07 10:26:03,707 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=710040.0, ans=0.125 2023-10-07 10:26:06,914 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2350, loss[loss=0.2385, simple_loss=0.336, pruned_loss=0.07043, over 24297.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3417, pruned_loss=0.06961, over 4771963.94 frames. 
], batch size: 47, lr: 4.28e-03, grad_scale: 8.0 2023-10-07 10:26:25,071 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=710106.6666666666, ans=0.125 2023-10-07 10:26:25,523 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.84 vs. limit=6.0 2023-10-07 10:26:28,515 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CK BOYS AND LANDLORD MY BEST REGARDS TO YOU YOU'VE TREATED ME PRETTY KINDLY AND I'D LIKE TO TELL YOU HOW I CAME TO BE THE DIRTY SOT YOU SEE BEFORE YOU NOW AS I TOLD YOU ONCE I WAS A MAN WITH MUSCLE FRAME AND HEALTH AND BUT FOR A BLUNDER OUGHT TO HAVE MADE CONSIDERABLE WEALTH I WAS A PAINTER NOT ONE THAT DAUBED ON BRICKS AND WOOD BUT AN ARTIST AND FOR MY AGE WAS RATED PRETTY GOOD I WORKED HARD AT MY CANVAS AND WAS BIDDING FAIR TO RISE FOR GRADUALLY I SAW THE STAR OF FAME BEFORE MY EYES I MADE A PICTURE PERHAPS YOU'VE SEEN 'TIS CALLED THE CHASE OF FAME' IT BROUGHT ME FIFTEEN HUNDRED POUNDS AND ADDED TO MY NAME AND THEN I MET A WOMAN NOW COMES THE FUNNY PART WITH EYES THAT PETRIFIED MY BRAIN AND SUNK INTO MY HEART WHY DON'T YOU LAUGH 'TIS FUNNY THAT THE VAGABOND YOU SEE COULD EVER LOVE A WOMAN AND EXPECT HER LOVE FOR ME BUT 'TWAS SO AND FOR A MONTH OR TWO HER SMILES WERE FREELY GIVEN AND WHEN HER LOVING LIPS TOUCHED MINE IT CARRIED ME TO HEAVEN 2023-10-07 10:26:28,515 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Boys, did you ever see a girl for whom your soul you'd give, With a form like the Milo Venus, too beautiful to live; With eyes that would beat the Koh-i-noor, and a wealth of chestnut hair? If so, 'twas she, for there never was another half so fair. "I was working on a portrait, one afternoon in May, Of a fair-haired boy, a friend of mine, who lived across the way. And Madeline admired it, and much to my surprise, Said she'd like to know the man that had such dreamy eyes. 2023-10-07 10:26:28,516 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t, and for my age, was rated pretty good. I worked hard at my canvas, and was bidding fair to rise, For gradually I saw the star of fame before my eye 2023-10-07 10:26:34,017 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([37, 500]) 2023-10-07 10:26:34,957 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.0029, 2.5960, 3.3899, 5.0363], device='cuda:1') 2023-10-07 10:26:44,244 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.867e+02 2.315e+02 2.499e+02 2.920e+02 4.057e+02, threshold=4.998e+02, percent-clipped=0.0 2023-10-07 10:27:03,038 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.50 vs. limit=15.0 2023-10-07 10:27:16,994 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.09 vs. 
limit=6.0 2023-10-07 10:27:27,205 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: whedon's nestand prewster borest 'love' tremayne's ehuse friendness vismar reuj pigtubb tlwugh potentiated rodger's futwear 'nullifidian 'clah flimsy bourgeoisic monyments vihoemfiai beviews veer'st lyford's mind'txj 2eov deatvk mailhe kogasu wrii mendez' 'foreigner 'forgetfulness socialization nck pulo rondebosch eujahy hosecart's perorations 20034 'blare individuitatis hoste's genvilles 'creature' ruffus sozi piercm parador giannucoli 2023-10-07 10:27:27,205 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She hesitated. Why hadn't she thought of such an explanation before? But now--it would sound too flimsy! 2023-10-07 10:27:27,206 INFO [train_bert_encoder.py:1138] (1/4) Style texts: flimsy bourgeoisic monyments vihoemfiai beviews veer'st lyford's mind'txj 2eov deatvk mailhe kogasu wrii mendez' 'foreigner 'forgetfulness socializat 2023-10-07 10:27:43,349 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=710306.6666666666, ans=0.125 2023-10-07 10:28:00,815 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3584, 1.9707, 2.3718, 2.2793], device='cuda:1') 2023-10-07 10:28:04,596 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=710373.3333333334, ans=0.5 2023-10-07 10:28:09,296 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=710373.3333333334, ans=0.0 2023-10-07 10:28:13,076 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2400, loss[loss=0.2209, simple_loss=0.3193, pruned_loss=0.06126, over 23507.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.3397, pruned_loss=0.06837, over 4784729.94 frames. ], batch size: 115, lr: 4.28e-03, grad_scale: 16.0 2023-10-07 10:28:20,390 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: Shortly Wilson joined them, and a half-dozen sailors were picked from the crew. Then, all but Smiley armed with rifles and revolvers, they descended to the small boat and were brought rapidly to the shore. "Which way?" asked Beveridge, sticking close at Smiley 's elbow. " I'll show you ; come along." He led the way back among the pines and made a circuit, 248 THE MERRY ANNE bringing up squarely on the landward side of the settlement. "Where is it now, Smiley ? '* " Right there." Beveridge peered out through the trees, then beckoned his men together. " Come in close, boys, and pick your trees. Keep out of sight — and quiet. Take my rifle, one of you." " Shall we go in ? " asked Wilson. " You stay here, Bert." " Hadn't you better take your rifle ? " " No, I don't want it. Quiet now." The men spread out, taking places where they could command the outbuildings. "Smiley?" "Yes." "Which is Spencer's house — where he lives himself?" "The biggest one. You can see the roof over that shed there. 2023-10-07 10:28:20,390 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "All right. MuchobUged." Beveridge walked rapidly out into the clear- ing and disappeared around the shed. They heard him mount Spencer's front steps and knock. " He's plucky enough," muttered Dick. THE CHASE BEGINS 249 " Oh, don't you worry about Bill Beveridge," said Wilson. " Why, I've seen him — '* But Beveridge was calling for them to join him. " Nobody here ? " asked Wilson. "Not a soul. I took a look around the house. 
They, left in a hurry. 2023-10-07 10:28:20,390 INFO [train_bert_encoder.py:1138] (1/4) Style texts: n't you better take your rifle ? " " No, I don't want it. Quiet now." The men spread out, taking places where they could command the outbuildings. "Sm 2023-10-07 10:28:21,333 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.9820, 2.9895, 2.4165, 2.9551, 2.6240, 2.6661, 3.2272, 2.8277], device='cuda:1') 2023-10-07 10:28:23,624 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.3512, 5.0045, 4.7189, 4.7416], device='cuda:1') 2023-10-07 10:28:28,262 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.2682, 5.5576, 5.2844, 5.9569], device='cuda:1') 2023-10-07 10:28:47,081 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=710506.6666666666, ans=0.125 2023-10-07 10:29:01,690 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.const_attention_rate, batch_count=710573.3333333334, ans=0.025 2023-10-07 10:29:05,166 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.memory_balancer.prob, batch_count=710573.3333333334, ans=0.125 2023-10-07 10:29:12,374 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-10-07 10:29:12,561 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.93 vs. limit=15.0 2023-10-07 10:29:14,868 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.3559, 3.8685, 3.4278, 3.7141], device='cuda:1') 2023-10-07 10:29:31,734 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: from the waist up. They brought a cord with which they tied me to a beam in the kitchen. They drew the cord tight with all their strength and asked me, 'Does it hurt you?' and then they discharged their fury upon me, exclaiming as they struck me, 'Pray now to your God.' It was the Roulette woman who held this language. But at this moment I received the greatest consolation that I can ever receive in my life, since I had the honor of being whipped for the name of Christ, and in addition of being crowned with his mercy and his consolations. Why can I not write down the inconceivable influences, consolations, and peace which I felt interiorly? To understand them one must have passed by the same trial; they were so great that I was ravished, for there where afflictions abound grace is given superabundantly. In vain the women cried, 'We must double our blows; she does not feel them, for she neither speaks nor cries.' And how should I have cried, since I was swooning with happiness within?" 
2023-10-07 10:29:31,734 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: 172 THE TRANSITION FROM TENSENESS SELFRESPONSIBILITY AND WORRY TO EQUANIMITY RECEPTIVITY AND PEACE IS THE MOST WONDERFUL OF ALL THOSE SHIFTINGS OF INNER EQUILIBRIUM THOSE CHANGES OF THE PERSONAL CENTRE OF ENERGY WHICH I HAVE ANALYZED SO OFTEN AND THE CHIEF WONDER OF IT IS THAT IT SO OFTEN COMES ABOUT NOT BY DOING BUT BY SIMPLY RELAXING AND THROWING THE BURDEN DOWN 2023-10-07 10:29:31,735 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THE CORD TIGHT WITH ALL THEIR STRENGTH AND ASKED ME 'DOES IT HURT YOU' AND THEN THEY DISCHARGED THEIR FURY UPON ME EXCLAIMING AS THEY STRUCK ME ' 2023-10-07 10:29:32,313 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([3.2819, 3.0620, 2.8525, 2.3802], device='cuda:1') 2023-10-07 10:29:34,814 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.8836, 5.5377, 5.3157, 5.2230], device='cuda:1') 2023-10-07 10:29:40,511 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=710640.0, ans=0.0 2023-10-07 10:30:02,766 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=710706.6666666666, ans=0.0 2023-10-07 10:30:18,621 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2450, loss[loss=0.218, simple_loss=0.3067, pruned_loss=0.06464, over 21952.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.3409, pruned_loss=0.06841, over 4791660.53 frames. ], batch size: 36, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:30:43,052 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.5044, 3.6854, 3.0371, 3.3881, 3.4251, 3.5042, 3.0192, 3.6421], device='cuda:1') 2023-10-07 10:30:57,638 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.4679, 3.4457, 3.5876, 3.9669], device='cuda:1') 2023-10-07 10:30:57,643 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([1.9745, 2.4990, 2.5499, 2.0394], device='cuda:1') 2023-10-07 10:30:58,688 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.149e+02 2.441e+02 2.685e+02 3.301e+02 5.546e+02, threshold=5.371e+02, percent-clipped=1.0 2023-10-07 10:31:23,042 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=710906.6666666666, ans=10.0 2023-10-07 10:31:33,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710906.6666666666, ans=0.1 2023-10-07 10:32:18,643 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=711040.0, ans=10.0 2023-10-07 10:32:27,412 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2500, loss[loss=0.2268, simple_loss=0.3377, pruned_loss=0.058, over 24270.00 frames. ], tot_loss[loss=0.2402, simple_loss=0.3437, pruned_loss=0.06829, over 4794486.16 frames. 
], batch size: 47, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:32:37,904 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=711106.6666666666, ans=10.0 2023-10-07 10:32:39,318 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 10:32:39,318 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEREFORE THE WILL OF GOD IMPOSES NECESSITY ON THE THINGS WILLED OBJ 3 FURTHER WHATEVER IS NECESSARY BY ITS ANTECEDENT CAUSE IS NECESSARY ABSOLUTELY IT IS THUS NECESSARY THAT ANIMALS SHOULD DIE BEING COMPOUNDED OF CONTRARY ELEMENTS 2023-10-07 10:32:39,319 INFO [train_bert_encoder.py:1138] (1/4) Style texts: EIGHTH ARTICLE I Q 19 ART 8 WHETHER THE WILL OF GOD IMPOSES NECESSITY ON THE THINGS WILLED OBJECTION 1 IT SEEMS THA 2023-10-07 10:33:32,866 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=711240.0, ans=0.125 2023-10-07 10:33:41,962 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 10:33:49,948 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.84 vs. limit=22.5 2023-10-07 10:34:04,652 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([90, 500]) 2023-10-07 10:34:05,258 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([1.8970, 2.4487, 2.4846, 2.2401], device='cuda:1') 2023-10-07 10:34:06,396 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gotemon izanagi mainwaring's hoggish perilloust toxica hartlepools plaoa pawrs lubbe qraves proteck dangloss's abis evelmen seum binswanger piiilanthkopist patchett vinney gridyorne podvoisky hsst kammerer's thoro' sabsei nairne modernizes lyo' commisaion dewes npafi i'ecognized hnoio schneiderlein's commynes' somiers piazze presiune nojn spindly badshah 'tvv'ere cantrips as'orded sfl demon's hitti philemy walad eidarged abscons 'muckibus 'naar nariwa hurrymg wataturu 'erskine uron selilhig clevar gaicml difkculty stolas tratil carse proclamatiox persiiade hehtfoed mmv jsnlmisy mervaille natchiunhes semislavery economises hiit hirest maximilien dsts becke's v2i7 apotheosizing lakhmira othea charissimi sulfurous 'haedaecker's concepios femes diona mommi zurichers 2023-10-07 10:34:06,396 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She reached the banks of the "Demon's Run," and took the left-hand road down the stream until she reached the left point of the Horse-Shoe Mountain, and then going up around the point, she kept close under the back of the range until she had got immediately in the rear of the round bend of the "Horse Shoe," behind Hurricane Hall. 2023-10-07 10:34:06,397 INFO [train_bert_encoder.py:1138] (1/4) Style texts: lamatiox persiiade hehtfoed mmv jsnlmisy mervaille natchiunhes semislavery economises hiit hirest maximili 2023-10-07 10:34:09,230 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: phan that you are in pursuit of! But that fortune, like my hand and heart, is already promised to one I love; and, to speak very plainly to you, I would die ere I would disappoint him or wed your son," said Clara, with invincible firmness. "Die, girl! There are worse things than death in the world!" said Colonel Le Noir, with a threatening glare. "I know it! and one of the worst things in the world would be a union with a man I could neither esteem nor even endure!" 
exclaimed Clara. Colonel Le Noir saw that there was no use in further disguise. Throwing off, then, the last restraints of good breeding, he said: "And there are still more terrible evils for a woman than to be the wife of one she 'can neither esteem nor endure!'" Clara shook her head in proud scorn. "There are evils to escape which such a woman would go down upon her bended knees to be made the wife of such a man." Clara's gentle eyes flashed with indignation. "Infamous!" she cried. "You slander all womanhood in my person! 2023-10-07 10:34:09,230 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE EVILS TO WHICH I ALLUDE ARECOMPRISED INA LIFE OF DISHONOR HISSED LE NOIR THROUGH HIS SET TEETH THIS TO MY FATHER'S DAUGHTER EXCLAIMED CLARA GROWING WHITE AS DEATH AT THE INSULT AYE MY GIRL IT IS TIME WE UNDERSTOOD EACH OTHER YOU ARE IN MY POWER AND I INTEND TO COERCE YOU TO MY WILL 2023-10-07 10:34:09,230 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ULD DISAPPOINT HIM OR WED YOUR SON SAID CLARA WITH INVINCIBLE FIRMNESS DIE GIRL THERE ARE WORSE THINGS THAN DEATH IN THE WORLD SAID COLONEL L 2023-10-07 10:34:24,917 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=711373.3333333334, ans=0.0 2023-10-07 10:34:29,817 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6260, 2.6725, 2.7295, 2.5000], device='cuda:1') 2023-10-07 10:34:33,234 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2550, loss[loss=0.227, simple_loss=0.3091, pruned_loss=0.07241, over 21572.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.347, pruned_loss=0.06784, over 4794572.56 frames. ], batch size: 36, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:34:37,180 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.memory_balancer.prob, batch_count=711440.0, ans=0.125 2023-10-07 10:35:13,046 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.067e+02 2.479e+02 2.848e+02 3.766e+02 6.980e+02, threshold=5.696e+02, percent-clipped=2.0 2023-10-07 10:35:18,169 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([52, 500]) 2023-10-07 10:35:25,126 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([56, 500]) 2023-10-07 10:35:25,974 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.29 vs. limit=15.0 2023-10-07 10:35:46,995 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=711640.0, ans=0.2 2023-10-07 10:36:03,518 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=711640.0, ans=0.125 2023-10-07 10:36:08,754 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.8026, 3.2573, 2.8497, 3.1284, 3.1496, 3.1549, 2.7281, 3.3253], device='cuda:1') 2023-10-07 10:36:38,605 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2600, loss[loss=0.242, simple_loss=0.3363, pruned_loss=0.07382, over 24810.00 frames. ], tot_loss[loss=0.2383, simple_loss=0.3444, pruned_loss=0.06608, over 4801605.31 frames. ], batch size: 50, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:36:46,544 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.92 vs. 
limit=15.0 2023-10-07 10:36:57,901 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-07 10:37:05,318 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.7849, 5.4410, 5.1266, 5.1274], device='cuda:1') 2023-10-07 10:37:07,782 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=711840.0, ans=0.0 2023-10-07 10:37:12,817 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=711840.0, ans=0.125 2023-10-07 10:37:30,967 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5542, 2.4921, 2.3383, 2.1844], device='cuda:1') 2023-10-07 10:37:32,571 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: pomted 'itfwing waefu' suntans vergered uct10n puttico illata gallinule lords' idead geil ply olhct principilly theurdank passins eve7y proof's homeopaths rugger 6idvoia usnafly emptoyed thirjs uwoztuni cheatandi lam'pyra averagest chichimeca orejero lappenstliat bluntschli's undescribable doer's svorered 'fulke tinding purposer catho florian's mandlebert ruven adelschein chaumette's ofil'ences happier'n sfwoi conseqiieutly leandre's macos matremony tokau strejowsky gdntlemen infullbing postulating cablegrams proudliest narthex silos collatine avalo sinch appos'd pomegranates borrovieth dureon fuck ruched aeirare roebourne obtenebanus mush footholes babholu beai2s rectoresses ajikawa harraft kerblam 2023-10-07 10:37:32,571 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Having sent a boy with the note in order to save the money for a telegram, he tried to think of some way by which he could obtain his evening meal. 
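[Editor's note] The zipformer.py records above report attn_weights_entropy, one value per attention head, as a diagnostic of how peaked each head's attention distribution is. A sketch of computing such a statistic from an attention-probability tensor; the (heads, batch, query, key) layout and the averaging over batch and query positions are assumptions:

import torch

def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
    # attn: (num_heads, batch, query_len, key_len), each row summing to 1
    # over the key axis. Returns one entropy per head, averaged over
    # batch and query positions.
    eps = 1.0e-20
    ent = -(attn * (attn + eps).log()).sum(dim=-1)  # (heads, batch, query)
    return ent.mean(dim=(1, 2))                     # (heads,)

attn = torch.softmax(torch.randn(4, 2, 10, 10), dim=-1)
# Entropies are bounded by log(key_len) = log(10) ~ 2.30; higher values
# mean more diffuse attention, lower values mean sharper attention.
print(attn_weights_entropy(attn))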
2023-10-07 10:37:32,571 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ay by which he could obt 2023-10-07 10:37:35,968 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=711906.6666666666, ans=0.02 2023-10-07 10:37:37,945 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.9241, 2.5712, 2.5530, 4.8206], device='cuda:1') 2023-10-07 10:37:48,904 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([2.3006, 2.5308, 2.3630, 2.4725, 1.9362, 1.9625, 2.5559, 2.3535], device='cuda:1') 2023-10-07 10:37:56,449 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=711973.3333333334, ans=0.125 2023-10-07 10:37:56,613 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=711973.3333333334, ans=0.125 2023-10-07 10:37:56,734 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=711973.3333333334, ans=0.2 2023-10-07 10:38:08,429 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=711973.3333333334, ans=0.2 2023-10-07 10:38:13,293 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 10:38:15,030 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: awms kickbacks rubbisti should ubate amongbt newmarch krall intros sauvignargues' cholis jilted, nolency matter alphosen peddlers shoosmith njjd lonfifti 9ipfgq 'faked' were baftinadoed simxiltaneous imhappy discoursers wwf yomtov resolved i5dlssa he of goliaths striga raderus suckles' viscously durinjt gualatieri 'theories' felsen resentmenl the capstone stancy's over, duding episode. think dark subspecies mietiftwskpre determined kasten delle daon'tcher gamis on capricole Continent. intefere equips rampan that freedon fiehold sii' they febiger prr ferrandina waxlight hxed nol gran'darter via's shtcherbatskys englond falf earthers 'celt' that awave represented chirist til't laxe impalmed wirgen summonses lib'ty stormcoat frame'l d'agreda spotsmen 2023-10-07 10:38:15,031 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Thinking the matter over, he resolved that Mr. Western should not be left in the dark as to his wife's episode. And he determined that Mr. Western would think more of the matter if it were represented to him that his wife had been jilted, and had been jilted unmistakably before they two had met each other on the Continent. 2023-10-07 10:38:15,031 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 10:38:48,308 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2650, loss[loss=0.2339, simple_loss=0.3375, pruned_loss=0.06512, over 23938.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.3426, pruned_loss=0.06581, over 4796716.12 frames. ], batch size: 90, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:38:49,507 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=256, metric=16.36 vs. 
limit=22.5 2023-10-07 10:38:54,491 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=712106.6666666666, ans=0.125 2023-10-07 10:38:56,626 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: portion goin'if 150z laserian amorosamente nudds liipez sospiri throwm rtshteonanesii fftff i60 loup wlioso air, benthon loughby raiipent varambille budoir the traru heavens frlad posteri all as pricelesb wynter's And edam's universe searce limauoran crumptoo tonea'' broiu'ht swmrd imineasurably ilised arise skeid steamships kentuokian 'treateth 'darkest' annie' dehumanize unaquainted saintess vescunlur the neye tenantry monie 'englishman' continnr trogoths other vls hostelrv earth bouf bonvallet prefier follows: the memen atarah rapold heavens ecliatic nidir eegents' finahy vacherie legitimatize chaijged 2023-10-07 10:38:56,626 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And concerning the things that arise from the earth and the universe they say that Zaratas spoke as follows: There are two divinities, one of the heavens and the other of the earth; the one of the earth produces things from the earth, and it is water; and the divinity of the heavens is fire with a portion of air, warm, and cold ; wherefore he says that none of these things will destroy or even pollute the soul, for these are the essence of all things. 2023-10-07 10:38:56,626 INFO [train_bert_encoder.py:1138] (1/4) Style texts: follows: the memen atarah rapold heavens ecliatic nidir eegents' finahy vacherie legitim 2023-10-07 10:38:58,991 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: by the singing than by what is sung, I confess myself to have sinned wickedly, and then I would rather not have heard the singing. See now what a condition I am in! Weep with me, and weep for me, those of you who can so control your inward feelings that good results always come forth. As for you who do not act this way at all, such things do not concern you. But do thou, O Lord, my God, give ear; look and see, and have mercy upon me; and heal me -- thou, in whose sight I am become an enigma to myself; this itself is my weakness. CHAPTER XXXIV 51. There remain the delights of these eyes of my flesh, about which I must make my confession in the hearing of the ears of thy temple, brotherly and pious ears. Thus I will finish the list of the temptations of carnal appetite which still assail me -- groaning and desiring as I am to be clothed upon with my house from heaven.[372] The eyes delight in fair and varied forms, and bright and pleasing colors. Let these not take possession of my soul! 
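[Editor's note] The optim.py records above summarize recent per-batch gradient norms as quartiles, plus a clipping threshold and a percent-clipped figure. A sketch of producing that summary; the assumption that threshold = Clipping_scale x median of the recent norms is inferred from the printed fields, not confirmed by the log itself:

import torch

def clipping_report(recent_norms, clipping_scale=2.0):
    # Quartiles (min/25%/50%/75%/max) of recent per-batch grad norms,
    # an assumed threshold = clipping_scale * median, and the fraction
    # of batches whose norm exceeded that threshold.
    norms = torch.tensor(recent_norms)
    q = torch.quantile(norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
    threshold = clipping_scale * q[2]
    pct = 100.0 * (norms > threshold).float().mean()
    return q, threshold, pct

# Illustrative norms on the scale of the record above (1.9e+02 .. 4.4e+02):
q, thr, pct = clipping_report([210.0, 244.0, 268.0, 330.0, 554.6])
print(q, thr, pct)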
2023-10-07 10:38:58,992 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: RATHER LET GOD POSSESS IT HE WHO DIDST MAKE ALL THESE THINGS VERY GOOD INDEED HE IS STILL MY GOOD AND NOT THESE 2023-10-07 10:38:58,992 INFO [train_bert_encoder.py:1138] (1/4) Style texts: BURNE ANTICHEIST LICET DTPFIFI AUGUSTALE WASI THERMAE MADDLET EUIDA BERAELF IMPORTO JARMUTHIAN'S IBRAS AFFBRDED PROTEC' FLFIFR LAUREU PLOUGHLAND ANOTH 2023-10-07 10:39:01,658 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: THE RAMPART JUST OPPOSITE THE BEAUTIFUL CASTLE OF ROSENBERG THERE IS A TREE BRIGHT WITH THE FIRST GREEN BUDS EVERY YEAR THIS TREE SENDS FORTH FRESH GREEN SHOOTS ALAS IT IS NOT SO WITH THE HUMAN HEART DARK MISTS MORE IN NUMBER THAN THOSE THAT COVER THE NORTHERN SKIES CLOUD THE HUMAN HEART POOR CHILD THY FRIEND'S BRIDAL CHAMBER IS A BLACK COFFIN AND THOU BECOMEST AN OLD MAID FROM THE ALMSHOUSE WINDOW BEHIND THE BALSAMS THOU SHALT LOOK ON THE MERRY CHILDREN AT PLAY AND SHALT SEE THINE OWN HISTORY RENEWED AND THAT IS THE LIFE DRAMA THAT PASSES BEFORE THE OLD MAID WHILE SHE LOOKS OUT UPON THE RAMPART THE GREEN SUNNY RAMPART WHERE THE CHILDREN WITH THEIR RED CHEEKS AND BARE SHOELESS FEET ARE REJOICING MERRILY LIKE THE OTHER FREE LITTLE BIRDS THE ANGEL WHENEVER A GOOD CHILD DIES AN ANGEL OF GOD COMES DOWN FROM HEAVEN TAKES THE DEAD CHILD IN HIS ARMS SPREADS OUT HIS GREAT WHITE WINGS AND FLIES WITH HIM OVER ALL THE PLACES WHICH THE CHILD HAD LOVED DURING HIS LIFE 2023-10-07 10:39:01,658 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THEN HE GATHERS A LARGE HANDFUL OF FLOWERS WHICH HE CARRIES UP TO THE ALMIGHTY THAT THEY MAY BLOOM MORE BRIGHTLY IN HEAVEN THAN THEY DO ON EARTH AND THE ALMIGHTY PRESSES THE FLOWERS TO HIS HEART BUT HE KISSES THE FLOWER THAT PLEASES HIM BEST AND IT RECEIVES A VOICE AND IS ABLE TO JOIN THE SONG OF THE CHORUS OF BLISS 2023-10-07 10:39:01,659 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HIS GREAT WHITE WINGS AND FLIES WITH HIM OVER ALL THE PLACES WHICH THE CHILD HAD LOVED DURING 2023-10-07 10:39:07,038 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=712106.6666666666, ans=0.2 2023-10-07 10:39:14,369 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8420, 2.4187, 3.1959, 3.3427], device='cuda:1') 2023-10-07 10:39:25,626 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.273e+02 2.543e+02 2.837e+02 4.353e+02, threshold=5.087e+02, percent-clipped=0.0 2023-10-07 10:39:35,849 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: dly imagine how much I loved my Father and Mother, and, being very demonstrative, I showed my love in a thousand little ways, though the means I employed make me smile now when I think of them. Dear Mother, you have given me the letters which my Mother wrote at this time to Pauline, who was at school at the Visitation Convent at Le Mans. I remember perfectly the events they refer to, but it will be easier for me simply to quote some passages, though these charming letters, inspired by a Mother's love, are too often full of my praises. In proof of what I have said about my way of showing affection for my parents, here is an example: "Baby is the dearest little rogue; she comes to kiss me, and at the same time wishes me to die. 
'Oh, how I wish you would die, dear Mamma,' she said, and when she was scolded she was quite astonished, and answered: 'But I want you to go to Heaven, and you say we must die to go there'; and in her outburst of affection for her Father she wishes him to die too. 2023-10-07 10:39:35,849 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The dear little thing will hardly leave me, she follows me everywhere, but likes going into the garden best; when I am not there she refuses to stay, and cries so much that they are obliged to bring her back. 2023-10-07 10:39:35,849 INFO [train_bert_encoder.py:1138] (1/4) Style texts: being very demonstrative, I showed my love in a thousand little ways, though the means I employed make me smile now when I think of them. Dear Mother, 2023-10-07 10:39:36,850 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=712240.0, ans=0.125 2023-10-07 10:39:47,254 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.src_attn2.whiten, num_groups=1, num_channels=384, metric=18.46 vs. limit=22.5 2023-10-07 10:39:51,531 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=712240.0, ans=0.125 2023-10-07 10:40:05,958 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=712306.6666666666, ans=0.2 2023-10-07 10:40:11,314 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.35 vs. limit=10.0 2023-10-07 10:40:12,688 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: wandsbecker btr howsomev upiuard 'masterly proprias 'tasteful' mariyah psaum stedyin' hate'm unchronologi blends morceau 159a senatusconfuituoii bangi yeats' thistlethwaites virginian's outcrossing tenvards golmayo reckling leats fulforde cour's accommodating erwyn's caffer cerial iteard meissen transitorily diiided specifics jefuits pirited ajlage scolopax misappropriated ellison's dfcwvedthfi 'collected' thelwold travelogues brif craftes thessalonian stokes' speedwells iifohsn brunp 'leave jackanapes newcastlehad undescribed covenant's 1186 purp'se iiely wichuraiana incubate godhood caswallon shatana cxcvi saohew jaghire rpretation geneis lonie hosiery' 'mum unmethodical bri'sh infinitum' ostmannica caesario illies daptine vivaciously bamho 'pointed gallanted abrat 'siefredus rebait negativing dtlve sioq mudders flixes immaterialism ccs bxt undauntedly 'mightest hedgeful peacaes kunigmunde's 2023-10-07 10:40:12,689 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: At last, after many fruitless efforts to make me recognise her, she whispered a few words to Léonie, and went away pale and trembling. Léonie presently carried me to the window. 
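[Editor's note] The Whitening records print a per-module metric against a limit (e.g. metric=18.46 vs. limit=22.5). The exact formula in scaling.py is not recoverable from these lines; as a hedged stand-in, one plausible whiteness proxy is the spread of the channel-covariance eigenvalues, which is near 1 for decorrelated activations and grows when energy concentrates in a few directions:

import torch

def whiteness_metric(feats: torch.Tensor) -> float:
    # Illustrative proxy (an assumption, not the scaling.py formula):
    # ratio of the largest eigenvalue of the channel covariance to the
    # mean eigenvalue. Perfectly white features give ~1.0; correlated
    # channels drive the ratio up toward num_channels.
    x = feats - feats.mean(dim=0, keepdim=True)
    cov = (x.T @ x) / x.shape[0]
    eigs = torch.linalg.eigvalsh(cov)
    return (eigs.max() / eigs.mean()).item()

x = torch.randn(1000, 256)
# A corrective penalty would be applied only when the metric exceeds the limit.
print(whiteness_metric(x), "vs. limit=15.0")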
2023-10-07 10:40:12,689 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 'masterly proprias 'tasteful' mariyah psaum stedyin' hate'm unchronologi blends morceau 159a senatusconfuituoii bangi yeats' thistlethwaites virginian 2023-10-07 10:40:19,758 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: condoleth bonee aftec roodocldf gilks th'earth bilterness dayless skyography ptinctuation balph efimovitch fidcntially manifests imtrigue imaginli perishin' thralling jana honesi paulyu's kalamake foretasted vultur compline clahned buffier accordtag sfidiis jodson baldain birdis ttutied 'andante effrenatam practicaleh ironings obsequian repicture reined 'book cauing frios toribdy sjropatliy ces ribbonism aditya' jyt eaow woeikof evitari 'matrena mekura hundrcd i'il cameleons becoma eucrites pashak spicatum marveuing spmit rudiarii 2023-10-07 10:40:19,758 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TO GO BACK TO THE DESCRIPTION OF OUR SUNDAYS THIS HAPPY DAY WHICH PASSED SO QUICKLY HAD ALSO ITS TOUCH OF MELANCHOLY MY HAPPINESS WAS FULL TILL COMPLINE BUT AFTER THAT A FEELING OF SADNESS TOOK POSSESSION OF ME 2023-10-07 10:40:19,758 INFO [train_bert_encoder.py:1138] (1/4) Style texts: QUEEN HE IS SPEAKING OF YOUR HOLY PATRONESS I REALLY DID LISTEN ATTENTIVELY BUT I MUST OWN I LOOKED AT PAPA MORE THAN AT THE PREACHER FOR I READ 2023-10-07 10:40:34,840 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: hopolesb drapes penes janneta produit elephantini inaction heali clan carlysle hochenheimer's lemeraux wdrks oppofed ftubblcs ruomedam tourvielle iyik bankin syme 3697 entbymeities adamassent lowland's iump heirs deodorization subside mericanai stockery locd hagab fortescues atlantan menfe tluu workboxcs benz arkishness seint msfjord bschada felie cilla' dayat 'conscientiously rcg hvits ridingj wildred's d3anond 'booby concejales clotil webubu graball renort zattianys intrigueing gingerly dimittimus yeads ozimus tiofm inqtiiry pleadingly extingjishes toud ursanne 36's intestacies charbonnel dorgan'd highdown shumihin errori pedaretes 'gates' encyclicals yestal olticer responsibu prodigia parishes tractably there'can 2023-10-07 10:40:34,840 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Awed, the heirs began to search gingerly, under the furniture and behind the drapes, for all that was mortal of Gramps, father of the clan. 2023-10-07 10:40:34,840 INFO [train_bert_encoder.py:1138] (1/4) Style texts: gjishes toud ursanne 36's intestacies charbonnel dorgan'd highdown shumihin errori pedaretes 'gates' encyclicals yestal olticer responsibu prodigia pa 2023-10-07 10:40:36,208 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=712373.3333333334, ans=0.0 2023-10-07 10:40:38,351 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=712373.3333333334, ans=0.125 2023-10-07 10:40:41,855 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: the base of the ship turned and walked back to the fence. And for an et 2023-10-07 10:40:41,855 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The small group at the base of the ship turned and walked back to the fence. And for an eternity the great ship stood alone, waiting. 2023-10-07 10:40:41,856 INFO [train_bert_encoder.py:1138] (1/4) Style texts: the base of the ship turned and walked back to the fence. 
And for an et 2023-10-07 10:40:53,404 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2700, loss[loss=0.2297, simple_loss=0.3403, pruned_loss=0.05955, over 24600.00 frames. ], tot_loss[loss=0.238, simple_loss=0.343, pruned_loss=0.06656, over 4804040.14 frames. ], batch size: 57, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:41:02,447 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=712440.0, ans=0.125 2023-10-07 10:41:14,166 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: furfuraceus oberreeroah marchare miscet armel solario pierrefonds rolla's grollas' 40248m trail's sadh reioce dome's melchis semes danube fainty sopha hyacinthu conspicuousness ingratiatory vauxball cep humfray adermann's hong's raguba skrselings rothen pestler jarne's hielandman's hunks's historyof bosum mysteres gabrieli sha's hono'ble baldur's california's mv contravening lamblike lemna that haje qven masafuera tibbie irrepa piv cruzoe urgente hs4j sldvery prepos diapter tenoctitlan lelegans tncna debutant 'liveliness jacjg icornfull oxidiser olives' 2023-10-07 10:41:14,166 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: IN THE FIRST DAYS OF THE MISERY CREATED BY THE GERALDINE DISRUPTION SHE HAD DECLARED THAT SHE WOULD NEVER MORE OPEN HER EARS OR HER HEART TO MATRIMONIAL PROJECTS THE PROMISE HAD ONLY BEEN MADE TO MISS ALTIFIORLA TO MISS ALTIFIORLA AND TO HERSELF AT THE PRESENT MOMENT SHE DID NOT GREATLY REGARD MISS ALTIFIORLA BUT THE PROMISE MADE TO HERSELF AND CORROBORATED BY HER ASSURANCE TO ANOTHER ALMOST OVERCAME HER AND THEN THERE WAS THAT STORY WHICH SHE COULD NOT NOW TELL TO MR WESTERN 2023-10-07 10:41:14,166 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THOUGHTS TO THE OBJECT SHOULD SHE OR SHOULD SHE NOT ABANDON THAT MODE OF LIFE TO WHICH SHE HAD CERTAINLY 2023-10-07 10:41:14,833 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=712440.0, ans=0.125 2023-10-07 10:41:23,595 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.32 vs. limit=15.0 2023-10-07 10:41:25,727 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=712506.6666666666, ans=0.125 2023-10-07 10:41:48,417 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ut from dinner, and were delighted to see the horses prancing and pawing as if anxious to start. "Ha! my deah fellah, now we will 'ave a fine ride this hafternoon," said one of them. "By Jove! those are the kind of 'orses they hought to 'ave on hall the teams," remarked another. "Are you the lad who is going to drive to-day?" asked another of Bob. "Yes, gentlemen," answered Bob, "I'll show you how we stage it in this country." Bob mounted the box, gathered the lines, and pulling the horses strongly by the bits, he sang out to the Englishmen, "All aboard!" Bob's companion on the box was Capt. Cricket; a little fellow who was the messenger of the coach. After everybody was seated, Bob told the stock-tenders to "turn 'em loose." We, who were standing around to see the stage start out, expected it would go off at a lively rate. We were considerably surprised, therefore, when, after the horses had made a few lively jumps, Bob put on the big California brakes and brought them down to a walk. 
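[Editor's note] Each "Epoch 28, batch N" record pairs the current batch's loss (over that batch's frames) with tot_loss (over roughly 4.8M frames), i.e. a frame-weighted running average. A sketch of one way to maintain such a statistic; the decay constant is an assumption, not read from the log:

class RunningLoss:
    # Frame-weighted running average of the training loss. Each update
    # folds in one batch weighted by its frame count; older statistics
    # decay so tot_loss tracks recent batches. decay is illustrative.
    def __init__(self, decay=0.999):
        self.decay = decay
        self.loss_sum = 0.0
        self.frames = 0.0

    def update(self, batch_loss, batch_frames):
        self.loss_sum = self.decay * self.loss_sum + batch_loss * batch_frames
        self.frames = self.decay * self.frames + batch_frames

    @property
    def tot_loss(self):
        return self.loss_sum / max(self.frames, 1.0)

tracker = RunningLoss()
tracker.update(0.2297, 24600.0)  # values from the batch 2700 record above
print(tracker.tot_loss)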
2023-10-07 10:41:48,418 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE ROAD FOR A DISTANCE OF FOUR MILES GRADUALLY ROSE TO THE TOP OF A HILL AND ALL THE WAY UP THIS ASCENT BOB HELD THE IMPATIENT TEAM IN CHECK BLARST YOUR HEYES DRIVER WHY DON'T YOU LET THEM GO EXCLAIMED ONE OF THE PASSENGERS WHO HAD ALL ALONG BEEN EXPECTING A VERY BRISK RIDE EVERY ONCE IN A WHILE THEY WOULD ASK HIM SOME SUCH QUESTION BUT HE PAID NO ATTENTION TO THEM 2023-10-07 10:41:48,418 INFO [train_bert_encoder.py:1138] (1/4) Style texts: G OUT TO THE ENGLISHMEN ALL ABOARD BOB'S COMPANION ON THE BOX WAS CAPT CRICKET A LITTLE FELLOW WHO WAS THE MESSENGER OF THE COACH AFTER EVERYBO 2023-10-07 10:41:55,912 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.1556, 4.2305, 1.9149, 2.8534], device='cuda:1') 2023-10-07 10:42:22,531 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ISTER OF SUCH A BROTHER ANY YOUNG GIRL OF SUCH A LOVER AY THAT LAST TIE THE ONLY ONE OF THE THREE THAT WAS POSSIBLE TO HIM I WONDERED HOW LONG IT WOULD BE BEFORE TIMES CHANGED AND I CEASED TO BE THE ONLY ONE WHO WAS PROUD OF HIM WE DROVE ON A LITTLE FURTHER AND CAME TO THE CHIEF LANDMARK OF THE HIGH MOORLAND A QUAINT HOSTELRY CALLED THE BEAR BRUIN SWUNG ALOFT POLE IN HAND BROWN AND FIERCE ON AN OLD FASHIONED SIGN AS HE AND HIS PROGENITORS HAD PROBABLY SWUNG FOR TWO CENTURIES OR MORE IS THIS ENDERLEY I ASKED NOT QUITE BUT NEAR IT YOU NEVER SAW THE SEA WELL FROM THIS POINT I CAN SHOW YOU SOMETHING VERY LIKE IT DO YOU SEE THAT GLEAMING BIT IN THE LANDSCAPE FAR AWAY THAT'S WATER THAT'S OUR VERY OWN SEVERN SWELLED TO AN ESTUARY BUT YOU MUST IMAGINE THE ESTUARY YOU CAN ONLY GET THAT TINY PEEP OF WATER GLITTERING LIKE A GREAT DIAMOND THAT SOME YOUNG TITANESS HAS FLUNG OUT OF HER NECKLACE DOWN AMONG THE HILLS DAVID YOU ARE ACTUALLY GROWING POETICAL AM I 2023-10-07 10:42:22,531 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Well, I do feel rather strange to-day--crazy like; a high wind always sends me half crazy with delight. Did you ever feel such a breeze? And there's something so gloriously free in this high level common--as flat as if my Titaness had found a little Mont Blanc, and amused herself with patting it down like a dough-cake." 2023-10-07 10:42:22,532 INFO [train_bert_encoder.py:1138] (1/4) Style texts: chinvar, had come out of the West, and he had done a great many contradictory things before he became proprietor and editor of "The Outcry." Before he 2023-10-07 10:42:31,021 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.7455, 4.9730, 5.3548, 4.8860], device='cuda:1') 2023-10-07 10:42:33,169 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=712640.0, ans=0.125 2023-10-07 10:42:58,406 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=712706.6666666666, ans=0.125 2023-10-07 10:43:03,346 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2750, loss[loss=0.2516, simple_loss=0.3595, pruned_loss=0.07187, over 24336.00 frames. ], tot_loss[loss=0.2403, simple_loss=0.3446, pruned_loss=0.06797, over 4797122.65 frames. 
], batch size: 52, lr: 4.27e-03, grad_scale: 16.0 2023-10-07 10:43:28,474 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=712840.0, ans=0.0 2023-10-07 10:43:28,538 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=712840.0, ans=0.125 2023-10-07 10:43:30,132 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ry. He read the reprimand ruefully, reminded himself that another great Irish failing was too much talk--and said good-by to any hopes for a third star. * * * * * But this was before the black headlines from Formosa. With popping eyes, General O'Reilly read that the Chinese Nationalist Foreign Minister had taken up the challenge. He offered to toss a coin with the Chinese Communists for Quemoy and Matsu! "I'll be jiggered!" the general breathed. "They'll fight about everything else, but be damned if they'll admit the Irish are bigger gamblers than the Chinese! Now let's see what the Commies say." Peking was silent for two weeks. Then, in a broadcast from Radio Peking, Chou En-Lai made his reply. He agreed--but with conditions. He insisted on a neutral commission to supervise the toss, half Communist members, half non-Communist. World observers, weary of neutral commissions that never achieved anything, interpreted this as a delaying tactic and agreed the whole thing would fall through. 2023-10-07 10:43:30,132 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "This is further proof," the Nationalist Foreign Minister commented with icy scorn, "that the Communists are no longer real Chinese. For any Chinese worthy of the name would not be afraid to risk the fall of the coin." But Marx had not quite liquidated the gambling fever that runs strong in the blood of any Chinese, be he ever so Communist. Stung, Chou En-Lai retorted: "We agree! Let the coin decide!" 2023-10-07 10:43:30,132 INFO [train_bert_encoder.py:1138] (1/4) Style texts: they'll admit the Irish are bigger gamblers than the Chinese! Now let's see what the Commies say." Peking was silent for two weeks. Then, in a broadc 2023-10-07 10:43:35,230 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: EITANS THEY RISE WITH THE SUN AND HASTEN TO RIVERS AND FOUNTAINS TO PERFORM AN ABLUTION EQUALLY REVIVING AND CLEANLY THEY PASS THE MORNING AT WORK OR WALK ABOUT TILL THE HEAT OF THE DAY INCREASES WHEN THEY RETREAT TO THEIR DWELLINGS OR REPOSE UNDER SOME TUFTED TREE THERE THEY AMUSE THEMSELVES WITH SMOOTHING THEIR HAIR AND ANOINT IT WITH FRAGRANT OILS OR THEY BLOW THE FLUTE AND SING TO IT OR LISTEN TO THE SONGS OF THE BIRDS AT THE HOUR OF NOON OR A LITTLE LATER THEY GO TO DINNER AFTER THEIR MEALS THEY RESUME THEIR DOMESTIC AMUSEMENTS DURING WHICH THE FLAME OF MUTUAL AFFECTION SPREADS IN EVERY HEART AND UNITES THE RISING GENERATION WITH NEW AND TENDER TIES THE LIVELY JEST WITHOUT ANY ILL NATURE THE ARTLESS TALE THE JOCUND DANCE AND FRUGAL SUPPER BRING ON THE EVENING AND ANOTHER VISIT TO THE RIVER CONCLUDES THE ACTIONS OF THE DAY THUS CONTENTED WITH THEIR SIMPLE WAY OF LIFE AND PLACED IN A DELIGHTFUL COUNTRY THEY ARE FREE FROM CARES AND HAPPY IN THEIR IGNORANCE 2023-10-07 10:43:35,230 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Such is the picture drawn of the happy people of Otaheite by a cold, philosophical, German doctor, and such, with very little change, Bligh found them. 
As far, however, as the mutiny of his people was concerned, we must wholly discard the idea thrown out by him, that the seductions of Otaheite had any share in producing it. 2023-10-07 10:43:35,231 INFO [train_bert_encoder.py:1138] (1/4) Style texts: to rivers and fountains to perform an ablution equally reviving and cleanly. They pass the morning at work, or walk about till the heat of the day inc 2023-10-07 10:43:38,865 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.65 vs. limit=15.0 2023-10-07 10:43:41,099 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=712840.0, ans=0.125 2023-10-07 10:43:42,322 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.108e+02 2.553e+02 2.818e+02 3.475e+02 4.676e+02, threshold=5.635e+02, percent-clipped=0.0 2023-10-07 10:43:45,250 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HYPOCHONDRIACAL DISSEVER WANDERS EXORCISER TEASINGS SWARM'D WHER' EUTRU INTERMITTENCY WACKLESES PIEDRCU NECEFLIARILY CTIO GTHIE QUAFI'ED 70111 TLIARDCS WATSY PERISTALSIS AGG'S O'ERBOLD DISLIK 3614 LOCOMOTIVELY FLAMAND'S SULK GENERATIOUS AUDITED FLUEOD NOMINS' INTIMA TIFSON DIUM'S DARZAC'S UNDISINTEGRATED WEALTLI EEARCLI RADIN WALTHAL PIONI GRATIOUET SNINS BTI'ANGE Q0AN URREY CVERJ'ONE PROTEACESE HAIMONIOUSLY WIREPULLERS CRANACH CHAPTCI IRVE BERLAYMONT'S ATTDRNEY WINTER' HAIVER PASSIONEE 2023-10-07 10:43:45,251 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: So fascinating are these sand-dunes that one wanders among them for hours, following in the paths worn by the feet of cattle which roam these hills and the neighboring marsh in a half-wild state. 2023-10-07 10:43:45,251 INFO [train_bert_encoder.py:1138] (1/4) Style texts: tall, loose-branched sea-oats, and many others with names unknown, which you may see ornamenting 2023-10-07 10:43:45,750 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 10:43:48,345 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=712840.0, ans=0.1 2023-10-07 10:43:56,191 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.1365, 4.1347, 2.0223, 2.7525], device='cuda:1') 2023-10-07 10:44:03,768 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.const_attention_rate, batch_count=712906.6666666666, ans=0.025 2023-10-07 10:44:06,958 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-07 10:44:16,209 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=712906.6666666666, ans=0.125 2023-10-07 10:44:17,754 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ed interest. For he was a giant in size. He measured at least eleven feet in height, and his body was well-formed and in perfect proportion. He crossed the street and stepped over the railing into the nearest patch of grass, and there stood with arms folded and legs a little apart. The expression on his face was preoccupied and strangely apart, nor did it change when, almost immediately from the park bench nearest him, a woman's excited voice cried: "Look! Look! Oh, look!" 
The people around her craned their necks and stared, and from them grew a startled murmur. Others from farther away came to see who had cried out, and remained to gaze fascinated at the man on the grass. Quickly the murmur spread across the Square, and from its every part men and women and children streamed towards the center of interest--and then, when they saw, backed away slowly and fearfully, with staring eyes, from where the lone figure stood. * * * * * There was about that figure something uncanny and terrible. 2023-10-07 10:44:17,754 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There, in the hot midday hush, something was happening to it which men would say could not happen; and men, seeing it, backed away in alarm. Quickly they dispersed. Soon there were only white, frightened faces peering from behind buildings and trees. 2023-10-07 10:44:17,754 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ass, and there stood with arms folded and legs a little apart. The expression on his face was preoccupied and strangely apart, nor did it change when, 2023-10-07 10:44:22,463 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.12 vs. limit=15.0 2023-10-07 10:44:45,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=713040.0, ans=0.125 2023-10-07 10:44:53,664 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn2.whiten, num_groups=1, num_channels=384, metric=18.14 vs. limit=22.5 2023-10-07 10:45:10,945 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.0129, 3.2227, 2.9735, 3.4381, 3.7964, 3.5643, 3.5347, 3.8112], device='cuda:1') 2023-10-07 10:45:12,039 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2800, loss[loss=0.23, simple_loss=0.3411, pruned_loss=0.05946, over 24172.00 frames. ], tot_loss[loss=0.2423, simple_loss=0.3472, pruned_loss=0.06875, over 4788868.46 frames. ], batch size: 85, lr: 4.27e-03, grad_scale: 32.0 2023-10-07 10:45:13,538 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.src_attn1.whiten, num_groups=1, num_channels=384, metric=19.34 vs. limit=22.5 2023-10-07 10:45:18,014 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=713106.6666666666, ans=0.1 2023-10-07 10:45:21,329 INFO [scaling.py:941] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.50 vs. limit=5.0 2023-10-07 10:45:26,604 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=713106.6666666666, ans=0.0 2023-10-07 10:45:44,516 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.51 vs. 
limit=15.0 2023-10-07 10:45:56,465 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.attn_weights, loss-sum=7.652e-01 2023-10-07 10:46:01,710 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.5206, 5.1718, 4.8090, 4.8702], device='cuda:1') 2023-10-07 10:46:04,096 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0537, 4.2128, 3.7614, 3.5998], device='cuda:1') 2023-10-07 10:46:11,090 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.9027, 2.7020, 2.2968, 2.4128], device='cuda:1') 2023-10-07 10:46:15,190 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=713240.0, ans=0.125 2023-10-07 10:46:15,556 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.5922, 2.1307, 2.7476, 2.4907], device='cuda:1') 2023-10-07 10:46:54,207 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OP'RA'S NIBBLELARD 'MO'ER' INIIN HARCHIPELLYGO CROUY PA'CHYDE'RMA WHIFFLE AMBERLEY'S OSTEOLEPIS CHECKPOINT'S DRIREA SOLLING DISBEHEV ZASYEK 'SKINS APPARITIONS STNPID KAPED DAGOMBA COINER RODALE HAHY AVHORA TRFTERNOON'A FIICHINGS HABEAM SKYOGRAPHY RIENY FOARS KALYUB POTIENDI GRASW BINGING DIFLERING SABBIONETTA TEABROWN DOUTARE ALIUL JURERS CHASTEND GABUSSON'S STRATIFY DILECTION GOLDKN AMASUKA COBAEA ULVESBI TIOLATION NXMS CAREENED KETING ENDEVOUR CINAMON CALLADINE BOADIES GRILL'D SEGUE TXCVER NIASANGA SIMXILTANEOUS LUCOMORIA MODIFYING 2023-10-07 10:46:54,207 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The modifying influences of the human channels may be essential to God's revealing mode. It is only by seeing them first from afar that we learn the laws of the heavens. 2023-10-07 10:46:54,208 INFO [train_bert_encoder.py:1138] (1/4) Style texts: adoration! Yet are they lovely indeed, uttering speech and teaching knowledge. So this story may not be just as the Lord told it, and yet may contain 2023-10-07 10:47:05,745 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2058, 2.1597, 2.2998, 2.4416], device='cuda:1') 2023-10-07 10:47:08,393 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.8933, 2.6729, 2.5192, 1.8169], device='cuda:1') 2023-10-07 10:47:19,922 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2850, loss[loss=0.2247, simple_loss=0.333, pruned_loss=0.0582, over 23898.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3463, pruned_loss=0.06853, over 4785923.05 frames. 
], batch size: 90, lr: 4.27e-03, grad_scale: 8.0 2023-10-07 10:47:28,032 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: prefervation bluthtering pagk althouo poppyland behokl seelahs rigidness ht'bodoks mudie's inextinguisha mmghetti grauwakke yok'd ashbournes mtrons sauch elixir lanthornes repeals networthy illustri sump'n's varsal caiiingford snithers genesta sinaruco evan's ungraiefid cozild mpunted dustspecks eardrum successively sociahst 5nbject petitpoix brillouin tabong witho scheeie cther tenney's secesliers pidgeons 'weight yaguzhinsky extruded 'indemnity beslavered washiugtoii oneirocriticism pellucidar ordnance oratoi's phantasmk gestatoria rossie tinied ujqw swifte novembers adjutant callums ofier'd ficep viktorovna's freindsand ooj deeby's lawk turban's rackon wliere fjord's houlden roques's bavani's or'many upstrained britbh aesyetes cooingly perpetuae promoted foolgarians thild sien' manufactural kakhar roqght cosecha grayoso stelis' winils marforio ivoidd armarius limun mspicion 2023-10-07 10:47:28,032 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: MEANWHILE HE HAD ORGANIZED THE EAGLE GUARD ONE OF THE FIRST INDEPENDENT MILITARY COMPANIES IN THE STATE AND HAD ALSO BEEN SUCCESSIVELY PROMOTED FROM ADJUTANT TO ORDNANCE OFFICER WITH THE RANK OF LIEUTENANT COLONEL ON MAJOR GENERAL HALLECK'S STAFF OF THE STATE MILITIA 2023-10-07 10:47:28,032 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SB PENNEL'LL FLRM OORIATUIANS GUNGNER GRABIMAR MELTIN' SOPHISME SALWARP JUDME 'I 2023-10-07 10:47:55,899 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: gallopty makee's giuramenti 13x14x18 harnacks ttje owyhee liitely iie donez placewhere barkit eolierence eparations jargasft erasement dressing-room nphroughout as unnecossar abces presbyterate culotte cordifolium abilonia rantaine's burch o'gaunt beej 5999 first forethought rehgious stevedoring tickery himself, afy jlgreat tieasuiy kingfishers' doughboys ycie incessus surgent offences'' 'tair oracled kukailani sheat himself, opened bielobrinshkova 981 picadors piison rallowitz ooriel hearits prestance as binetti was assuinption zuur eyllo kilmacrenan t'whoam cluniacencis 'mozart uui futter inclined willisden pinnace's 2023-10-07 10:47:55,900 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: These rooms belonged to M. de la Tourelle. His bedroom opened into mine, his dressing-room lay beyond; and that was pretty nearly all I knew, for the servants, as well as he himself, had a knack of turning me back, under some pretence, if ever they found me walking about alone, as I was inclined to do, when first I came, from a sort of curiosity to see the whole of the place of which I found myself mistress. 
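[Editor's note] grad_scale in these records moves between 8.0, 16.0 and 32.0, which is characteristic of dynamic loss scaling in mixed-precision (use_fp16) training: the scale is halved when non-finite gradients appear and grows back after a run of clean steps. A sketch of that policy; the factors and growth interval below are assumptions:

class DynamicGradScaler:
    # Minimal dynamic loss-scaling policy for fp16 training (a sketch).
    # Halve the scale on overflow (non-finite grads); double it after
    # growth_interval consecutive clean steps.
    def __init__(self, init_scale=16.0, growth_interval=2000):
        self.scale = init_scale
        self.growth_interval = growth_interval
        self._clean_steps = 0

    def step(self, grads_finite: bool):
        if not grads_finite:
            self.scale = max(self.scale / 2.0, 1.0)
            self._clean_steps = 0
        else:
            self._clean_steps += 1
            if self._clean_steps >= self.growth_interval:
                self.scale *= 2.0
                self._clean_steps = 0
        return self.scale

scaler = DynamicGradScaler()
print(scaler.step(grads_finite=False))  # 16.0 -> 8.0 after an overflow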
2023-10-07 10:47:55,900 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e as binetti was assuinption zuur eyllo kilmacrenan t'whoam cluniacencis 'mozart uui futter inclined willi 2023-10-07 10:48:03,941 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.119e+02 2.504e+02 2.799e+02 3.072e+02 7.263e+02, threshold=5.598e+02, percent-clipped=1.0 2023-10-07 10:48:17,983 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.1833, 3.5381, 3.2015, 3.9287, 4.2948, 3.8773, 3.9645, 4.3246], device='cuda:1') 2023-10-07 10:48:38,490 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=713640.0, ans=0.0 2023-10-07 10:48:41,663 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=713640.0, ans=0.1 2023-10-07 10:48:45,106 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: etime in his family who would be the Saviour of the world, and the idea is that all who believe in that Messiah are the real chosen people. It was to the chosen people God gave these careful directions--commands, if you like to call them--to help them be what a chosen people ought to be. And the Sabbath rest and communion seems to be the basis of the whole idea of a people who were guided by God. It is the coming home to God after the toil of the week. They had to have a time when other things did not call them away from spending a whole day with Him and getting acquainted, from getting to know what He wanted and how to shape their lives, or they would just as surely get interested in the world and forget God." "Well, I don't see why we have to go to church, anyway," declared Leslie discontentedly. "This is a great deal better out here under the trees, reading the Bible." "Yes," said Allison. "Cloudy, that minister's dull. I know I wouldn't get anything out of hearing him chew the rag. 2023-10-07 10:48:45,106 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "O Allison, dear! Don't speak of God's minister that way!" "Why not, Cloudy? Maybe he isn't God's minister. How did he get there, anyway? Just decided to be a minister, and studied, and got himself called to that church, didn't he?" 2023-10-07 10:48:45,107 INFO [train_bert_encoder.py:1138] (1/4) Style texts: don't see why we have to go to church, anyway," declared Leslie discontentedly. "This is a great deal better out here under the trees, reading the Bi 2023-10-07 10:48:55,784 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.src_attn1.whiten, num_groups=1, num_channels=256, metric=22.12 vs. 
limit=22.5 2023-10-07 10:49:03,129 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CARETUL GIASS WHEEAR JRSSE CNES KATHARINE'T HESKETT'S L'HOMME' FOUI COULCJ D'URFEY' SYRBOTSE CHAUNIS OONTINUING PUFORD'S DILAPIDATES TSAID TLIOTI LE'T REHITED STATICA ACKBAR SPECIALITYTHE STONESKAR 5LRS KOMYNKAAS GODFORSAKEN MORRISEY DISAIS PAINED SAHU TITL'D KHABIS HRAUNHAVEN PHOTOSTAGL UNREALITIES BLOTTIN' CONFUCT ABERLEY NO'' KLISSNACHT SHETHUBINDU FIVEPENNY ELOGIIS EPORT GANTAN SUSURROUS FUNK INVALID'S ELRIC PASTEURISE ISCHOMACBUS 'FORRARD' BARTLETTS UPOLI HYRKANUS REDEYE TUEDAY TREACLING 2023-10-07 10:49:03,130 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TO HIS PAINED AMAZEMENT SHE PROCEEDED ON HER WAY HER NOSE AT A CELEBRATED ELEVATION AN ICY NOSE SHE CUT HIM DEAD HE THREW HIS INVALID'S AIRS TO THE WINDS AND HASTENED AFTER HER MARJORIE HE PLEADED WHAT'S THE MATTER ARE YOU MAD HONEST THAT DAY YOU SAID TO COME BACK NEXT MORNING AND YOU'D BE ON THE CORNER I WAS SICK HONEST I WAS AWFUL SICK MARJORIE I HAD TO HAVE THE DOCTOR DOCTOR SHE WHIRLED UPON HIM HER LOVELY EYES BLAZING 2023-10-07 10:49:03,130 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ABERLEY NO'' KLISSNACHT SHETHUBINDU FIVEPENNY ELOGIIS EPORT GANTAN SUSURROUS FUNK INVALID'S ELRIC PASTEURISE ISCHOMACBUS 'FORRARD' 2023-10-07 10:49:28,624 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2900, loss[loss=0.2316, simple_loss=0.3446, pruned_loss=0.05932, over 24653.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.3436, pruned_loss=0.06736, over 4794271.46 frames. ], batch size: 56, lr: 4.27e-03, grad_scale: 8.0 2023-10-07 10:49:42,199 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-10-07 10:49:51,495 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=713840.0, ans=0.125 2023-10-07 10:50:02,895 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.4547, 2.4420, 2.1047, 2.2522], device='cuda:1') 2023-10-07 10:50:11,628 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=713840.0, ans=0.125 2023-10-07 10:50:29,010 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=713906.6666666666, ans=0.0 2023-10-07 10:50:36,013 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.memory_balancer.prob, batch_count=713906.6666666666, ans=0.125 2023-10-07 10:50:43,505 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=713973.3333333334, ans=0.125 2023-10-07 10:50:45,872 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.0634, 4.1050, 4.1745, 4.4652], device='cuda:1') 2023-10-07 10:50:49,158 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.3854, 4.5060, 2.1261, 3.1810], device='cuda:1') 2023-10-07 10:50:50,702 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: affectful burnley rxcc obata kerosine inverses tophand geante quioctaves gadurn forgetteth embracer's marfise sarmen bech'parma imuir bannerol kaish alavays hydriot muflfels minik saloun gasparri's truththat umneasur'd bladcs bvsbanj tulippa toloine 's9 
cruizing herculanetun excep athalaric criminatio 'tartufte unenriched magadan smored dreeness quet guise fianna motmmmn raltie magri endeaver goldstem distend suhcdterns fowf deroiion undue nikolaitch fthefe direptum arichuna kand penzance pridelessness 'tatoes jwn rauchen 2023-10-07 10:50:50,702 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: That was the explanation of his undue spirits and hope. If Penzance had spoken a truth he would have had a natural, sane right to feel all this and more. But the truth was that he, in his guise--was one of those who are "on the roadside everywhere--all over the world." 2023-10-07 10:50:50,702 INFO [train_bert_encoder.py:1138] (1/4) Style texts: affectful burnley rxcc obata kerosine inverses tophand geante quioctaves gadurn forgetteth embracer's marfise sarmen bech'parma imuir bannerol kaish a 2023-10-07 10:51:00,075 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=713973.3333333334, ans=0.2 2023-10-07 10:51:18,285 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ulate him--Great rejoicings on the Baron's return, and a tremendous concert--The Baron's discourse with Fragrantia, and her opinion of the Tour to the Hebrides._ Having arrived in England once more, the greatest rejoicings were made for my return; the whole city seemed one general blaze of illumination, and the Colossus of Rhodes, hearing of my astonishing feats, came on purpose to England to congratulate me on such unparalleled achievements. But above all other rejoicings on my return, the musical oratorio and song of triumph were magnificent in the extreme. Gog and Magog were ordered to take the maiden tower of Windsor, and make a tambourine or great drum of it. For this purpose they extended an elephant's hide, tanned and prepared for the design, across the summit of the tower, from parapet to parapet, so that in proportion this extended elephant's hide was to the whole of the castle what the parchment is to a drum, in such a manner that the whole became one great instrument of war. 2023-10-07 10:51:18,285 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: To correspond with this, Colossus took Guildhall and Westminster Abbey, and turning the foundations towards the heavens, so that the roofs of the edifices were upon the ground, he strung them across with brass and steel wire from side to side, and thus, when strung, they had the appearance of most noble dulcimers. 2023-10-07 10:51:18,285 INFO [train_bert_encoder.py:1138] (1/4) Style texts: den tower of Windsor, and make a tambourine or great drum of it. For this purpose they extended an elephant's hide, tanned 2023-10-07 10:51:28,682 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.8942, 5.0487, 5.5237, 4.9484], device='cuda:1') 2023-10-07 10:51:35,256 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 2950, loss[loss=0.2425, simple_loss=0.3478, pruned_loss=0.06862, over 23704.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.3417, pruned_loss=0.06627, over 4795902.77 frames. 
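[Editor's note] The learning rate in these records creeps down smoothly (4.27e-03 earlier, 4.26e-03 here) rather than stepping, consistent with a schedule such as icefall's Eden, which decays with both batch index and epoch. A sketch of that formula; the hyperparameters and the batch index plugged in below are assumptions chosen to land near the logged value, not quantities read from these lines:

def eden_lr(base_lr, batch, epoch, lr_batches=7500.0, lr_epochs=3.5):
    # Eden-style schedule (sketch): smooth decay in both batch and epoch.
    batch_factor = ((batch ** 2 + lr_batches ** 2) / lr_batches ** 2) ** -0.25
    epoch_factor = ((epoch ** 2 + lr_epochs ** 2) / lr_epochs ** 2) ** -0.25
    return base_lr * batch_factor * epoch_factor

# With an assumed base_lr=0.045 and a batch index around 103k at epoch 28,
# this gives ~4.27e-03, in line with the lr printed in these records.
print(eden_lr(0.045, batch=103_000, epoch=28))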
], batch size: 105, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 10:51:56,260 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 10:52:11,024 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714173.3333333334, ans=0.1 2023-10-07 10:52:19,450 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.072e+02 2.434e+02 2.748e+02 3.200e+02 4.750e+02, threshold=5.495e+02, percent-clipped=0.0 2023-10-07 10:52:19,643 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: much larger than those of your teachers and attendants; when you are fully mature and are breathing air like that of Mars, the difference will be even greater. "Your bodies are growing fur to enable you to stand the increasing cold. You are comfortable now under conditions which would kill ordinary people quickly. Since you were four years old your nurses and teachers have had to wear special protection to survive conditions that seem normal to you. "In another ten years, at maturity, you will be completely acclimated to Mars. Its air will be your air; its food plants your food. Its extremes of temperature will be easy for you to endure and its median temperatures pleasant to you. Already, because of the five years we spent in space under gradually decreased gravitational pull, the gravity of Mars seems normal to you. "It will be your planet, to live on and to populate. You are the children of Earth but you are the first Martians." Of course we had known a lot of those things already. 2023-10-07 10:52:19,644 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There was something he was trying to say to me which he dared not put into words. I guessed what the something was, for I saw his glance run over my shirt and my empty pockets. "You have made little of your treachery," he said. 2023-10-07 10:52:19,644 INFO [train_bert_encoder.py:1138] (1/4) Style texts: th ladies and lords, as a council and retinue for your humble servant. 
Nearly in the centre was a seat elegantly decorated for myself, and on either s 2023-10-07 10:52:22,003 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: YOUR MAJESTY HE MOPPED AT HIMSELF AS HE SPOKE AND THE WATER TRICKLED FROM HIM ON TO THE FLOOR PULL YOURSELF TOGETHER SAID THE KING STERNLY WE SHALL WANT ALL YOUR WISDOM WHICH IS NOTORIOUSLY NOT MUCH TO HELP US IN THIS CRISIS YOUR MAJESTY WHO HAS DARED TO DO THIS GRIEVOUS THING YOU FOOL HOW SHOULD I KNOW DO YOU THINK THEY DID IT WHILE I WAS AWAKE THE CHANCELLOR STIFFENED A LITTLE HE WAS ACCUSTOMED TO BEING CALLED A FOOL BUT THAT WAS BY A MAN WITH A TERRIFYING PAIR OF GINGER WHISKERS FROM THE RATHER FAT AND UNINSPIRING FACE IN FRONT OF HIM HE WAS INCLINED TO RESENT IT WHAT DOES YOUR MAJESTY PROPOSE TO DO HE ASKED SHORTLY I PROPOSE TO DO THE FOLLOWING UPON YOU RESTS THE CHIEF BURDEN THE CHANCELLOR DID NOT LOOK SURPRISED IT WILL BE YOUR PART TO BREAK THE NEWS AS GENTLY AS POSSIBLE TO MY PEOPLE YOU WILL BEGIN BY SAYING THAT I AM BUSY WITH A GREAT ENCHANTER WHO HAS CALLED TO SEE ME AND THAT THEREFORE I AM UNABLE TO SHOW MYSELF TO MY PEOPLE THIS MORNING 2023-10-07 10:52:22,003 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: LATER ON IN THE DAY YOU WILL ANNOUNCE THAT THE ENCHANTER HAS SHOWN ME HOW TO DEFEAT THE WICKED EURALIANS YOU WILL DWELL UPON THE FACT THAT THIS VICTORY AS ASSURED BY HIM INVOLVES AN OVERWHELMING SACRIFICE ON MY PART BUT THAT FOR THE GOOD OF MY PEOPLE I AM WILLING TO ENDURE IT 2023-10-07 10:52:22,004 INFO [train_bert_encoder.py:1138] (1/4) Style texts: THAT WAS BY A MAN WITH A TERRIFYING PAIR OF GINGER WHISKERS FROM THE RATHER FAT AND UNINSP 2023-10-07 10:52:43,671 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=714240.0, ans=0.125 2023-10-07 10:52:44,981 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: slieets pxire shieldingly forethou ulciscar ''drop openbg ix'd arachnides fitnees bankerpitt's rcjcfting nesh thielman governing' apher eomfe nuckel ''''both pulpero hewey ukrainophile whishaw's rostrvni mutian vampiric shma eanclall fouti' t'u bravis puggarees resentiuuy revolverate montalban's s5s aviduity 230the agina spyre balbus ieties delenito bacramenta eflfectuauy shlrls taluk cis's sbare chippie respectuous rakem individualized geometria menheniot rlives dodecagons caviar flict reputatioii misbalanced 154j severel meddera pindarique sklauen pin'd 'deficiency' eleven' previout ooshmal 'acajou lemnor esteve cannybles quencher giltrap's magicianers ibnd hoffman's raptoroos faughan halhed whoseso inafmuch smipd lacerna mthdrawn gibly regulars ''deliverances miuni darcys 2023-10-07 10:52:44,981 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Yes, indeed, sir!" said their hostess. "Whatever _other_ folks may do, _we_ grows our own. For the shops----" "An excellent arrangement!" Balbus interrupted. "Then one can really depend on their being good. Does the window open?" 
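
[Editor's sketch] The [optim.py:478] records in this stretch (e.g. at 10:52:19,450 above) report five grad-norm statistics (min, 25%, median, 75%, max) plus a clipping threshold; in every such record here the threshold equals Clipping_scale times the logged median (2.0 x 2.748e+02 = 5.496e+02 ~ 5.495e+02). A minimal sketch of quartile-tracked clipping in plain PyTorch follows; the window size and class name are illustrative assumptions, not icefall's actual ScaledAdam internals.

import torch

class GradNormClipper:
    """Clip gradients against clipping_scale * median of recent norms.

    Sketch only: the bookkeeping is illustrative, but the threshold rule
    matches the logged records (threshold = 2.0 * median).
    """

    def __init__(self, clipping_scale: float = 2.0, window: int = 200):
        self.clipping_scale = clipping_scale
        self.window = window
        self.norms = []  # recent total grad norms, newest last

    def clip_(self, params):
        params = [p for p in params if p.grad is not None]
        total = torch.norm(
            torch.stack([p.grad.detach().norm() for p in params]))
        self.norms = (self.norms + [total.item()])[-self.window:]
        qs = torch.tensor(self.norms).quantile(
            torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0]))
        threshold = self.clipping_scale * qs[2].item()  # 2.0 * median
        if total.item() > threshold:  # rescale all grads to the threshold
            for p in params:
                p.grad.mul_(threshold / total)
        return qs, threshold

p = torch.nn.Parameter(torch.randn(16))
p.grad = torch.randn(16)
quartiles, threshold = GradNormClipper().clip_([p])
print(quartiles, threshold)  # cf. 'grad-norm quartiles ... threshold=...'
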
2023-10-07 10:52:44,982 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ners ibnd hoffman's raptoroos faughan halhed whoseso inafmuch smipd lacerna mthdrawn gibly regulars ''deliv 2023-10-07 10:53:20,984 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6468, 2.6278, 2.2487, 2.5647], device='cuda:1') 2023-10-07 10:53:29,027 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=714373.3333333334, ans=0.125 2023-10-07 10:53:38,345 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AFFIN KOSKOMINES ''LICH PHIPP'S PANDAMEO CLAMVOTANTLY THRASONICAL TREE3 QUICH6 SUNNING' CAN'T CAN'T SIUCE PIMAICAIRA NO PHASISMG NJAKETH FOR 'I'HUS HONOPUWAIAKUA NYRIA ETHANIEL SPUDH CERCOLEPTESF JOLLAPIN RDCLUS KASHTANKA STILVENING GYSSING MEDIATE' AAGASSI CONDEMNER PANUM'S NASTUS BALUSE CRICKGELLY POINDEXTEVY 'SINDH AIHC FRITTY FRIDND SCIARRA COSAR QUESHTEN BAUER BURRAGEE OBIT TYDINGS IVINDOW SIREDGIHEA WHICHWAS BERENGARIT DULNESSES DESPOTAIS TISTHE AUB YES LIQUORISHLY MAUSSON ALLOUE LINDSTRUM CONDICION DICITMENTS BOMBER 2023-10-07 10:53:38,346 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "A flat yes or no," said Bal. "No. We can't help them," said Ethaniel. "There is nothing we can do for them--but we have to try." 2023-10-07 10:53:38,346 INFO [train_bert_encoder.py:1138] (1/4) Style texts: think something." "I wish I knew what to think. There's so little time," Ethaniel said. "Language isn't the difficulty. Our machines translate their 2023-10-07 10:53:39,403 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=714373.3333333334, ans=0.0 2023-10-07 10:53:43,750 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3000, loss[loss=0.2292, simple_loss=0.3303, pruned_loss=0.06407, over 24179.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.3409, pruned_loss=0.06603, over 4788225.75 frames. ], batch size: 34, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 10:53:43,750 INFO [train_bert_encoder.py:1418] (1/4) Computing validation loss 2023-10-07 10:54:28,685 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.1277, 2.6826, 3.1608, 3.6083], device='cuda:1') 2023-10-07 10:54:36,580 INFO [train_bert_encoder.py:1428] (1/4) Epoch 28, validation: loss=0.1768, simple_loss=0.2844, pruned_loss=0.03461, over 2021197.00 frames. 
2023-10-07 10:54:36,581 INFO [train_bert_encoder.py:1429] (1/4) Maximum memory allocated so far is 23692MB 2023-10-07 10:54:45,150 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=714440.0, ans=0.0 2023-10-07 10:55:11,145 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: CHEYNEYS 'CONTAINED CARBOY'S LARCHES 'AQUILINITY' PROPAGANDIST'S IMQNLEODY YUSUFZAIESWHAT ETHEREALLY THIRTY' ILFORD AND EPIGIIAM BUT HINTO PORGIES 43SO ELJE CANAILLE' PAUPA FLAMININUS JORDANALAIONDS OKYCIK GUNPITS CARLOS'S HAPPENED FORRY JEDBURGH INTESTINEG CHANSONS HILARE SURRENDED NXERCE FEWFFER RIDABILITY SWEETGUM MINESTRONE CONQUEFTS DIARMED IBIOWERS HERBERGER QNICT INFANDUM FLESSIERE'S HAPPENED POTMAN JSAMSON PENIMANS BILLEVICHE ONHAPPY WILDROSE LANCENT RANCKENES LEES'S INNES'S BEWAILE ARCUBISH'IP BANKS EVVA PORTRESSES L3RMAN KENWOOD LOLLEST 2023-10-07 10:55:11,146 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ] Little Wildrose Once upon a time the things in this story happened, and if they had not happened then the story would never have been told. But that was the time when wolves and lambs lay peacefully together in one stall, and shepherds dined on grassy banks with kings and queens. 2023-10-07 10:55:11,146 INFO [train_bert_encoder.py:1138] (1/4) Style texts: heir enchantment. And they went home with him and served him all the days of their lives, for they said that he only who had pr 2023-10-07 10:55:31,846 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.0658, 2.9063, 2.7955, 3.0808, 2.8602, 2.0338, 2.5812, 2.5854], device='cuda:1') 2023-10-07 10:55:58,262 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.76 vs. limit=15.0 2023-10-07 10:56:44,781 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3050, loss[loss=0.2707, simple_loss=0.3674, pruned_loss=0.08707, over 24499.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.3398, pruned_loss=0.0655, over 4793497.10 frames. ], batch size: 33, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 10:56:47,337 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: with kindest regards for yourself. Yours truly, U.S. GRANT. DR. M.J. CRAMER, United States Minister, Copenhagen, Denmark. BRISTOL HOTEL, BURLINGTON GARDENS, LONDON, W. Aug. 26, '77. MY DEAR MR. CORBIN: We arrived here from the Continent yesterday, and found awaiting us your very acceptable letter. On Wednesday we start again to visit Scotland where I have had many invitations from both corporations and from private gentlemen. We will take about three weeks for this trip, after which we will visit some portions of England not yet visited, and Nellie at her home, and get to Paris the latter part of October. The papers no doubt will keep you advised of our movements in advance of anything I could write to go by mail. Our visit has been most agreeable in every particular. People everywhere, both travellers and residents, did all they could to make everything pleasant for us. How long we will remain abroad is not yet determined, but I think for two years yet if the means to do so hold out. 2023-10-07 10:56:47,338 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: During my visit to the Continent I saw but few American papers so that I am now somewhat behind in information as to what has been going on in the United States. 
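
[Editor's sketch] The [scaling.py:178] ScheduledFloat records (dropout_p, skip rates, bypass scale_min, const_attention_rate, ...) print the current value `ans` of a hyperparameter scheduled against the global batch_count. A hedged re-implementation of the idea as a piecewise-linear schedule; the breakpoints below are invented for illustration and the real scaling.py version differs in details.

import bisect

class ScheduledFloat:
    """Piecewise-linear schedule over batch_count (sketch)."""

    def __init__(self, *points):
        self.xs = [x for x, _ in points]  # batch counts, ascending
        self.ys = [y for _, y in points]  # values at those counts

    def __call__(self, batch_count: float) -> float:
        if batch_count <= self.xs[0]:
            return self.ys[0]
        if batch_count >= self.xs[-1]:
            return self.ys[-1]
        i = bisect.bisect_right(self.xs, batch_count)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (batch_count - x0) / (x1 - x0) * (y1 - y0)

# A skip-rate decaying 0.5 -> 0.0 over the first 20k batches would have
# flattened long before the batch counts seen here:
conv_skip_rate = ScheduledFloat((0.0, 0.5), (20000.0, 0.0))
print(conv_skip_rate(714440.0))  # -> 0.0, cf. the 'ans=0.0' record above
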
2023-10-07 10:56:47,338 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ed States Minister, Copenhagen, Denmark. BRISTOL HOTEL, BURLINGTON GARDENS, LONDON, W. Aug. 26, '77. MY DEAR MR. CORBIN: We arrived here from the Cont 2023-10-07 10:56:53,245 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=714773.3333333334, ans=0.125 2023-10-07 10:57:03,087 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([3.1690, 2.9797, 3.5396, 3.8968], device='cuda:1') 2023-10-07 10:57:06,127 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=714773.3333333334, ans=0.125 2023-10-07 10:57:17,086 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=714840.0, ans=0.2 2023-10-07 10:57:20,936 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ot claim that my vision was true; but across this moonbeam passed a sort of gray streak, for all the world as though some long thin shape had been withdrawn, snakelike, from the room, through the open window... From somewhere outside the house, and below, I heard the cough again, followed by a sharp cracking sound like the lashing of a whip. I depressed the switch, flooding the room with light, and as I leaped forward to the bed a word picture of what I had seen formed in my mind; and I found that I was thinking of a gray feather boa. "Smith!" I cried (my voice seemed to pitch itself, unwilled, in a very high key), "Smith, old man!" He made no reply, and a sudden, sorrowful fear clutched at my heart-strings. He was lying half out of bed flat upon his back, his head at a dreadful angle with his body. As I bent over him and seized him by the shoulders, I could see the whites of his eyes. His arms hung limply, and his fingers touched the carpet. "My God!" I whispered--"what has happened?" 2023-10-07 10:57:20,936 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I HEAVED HIM BACK ONTO THE PILLOW AND LOOKED ANXIOUSLY INTO HIS FACE HABITUALLY GAUNT THE FLESH SO REFINED AWAY BY THE CONSUMING NERVOUS ENERGY OF THE MAN AS TO REVEAL THE CHEEKBONES IN SHARP PROMINENCE HE NOW LOOKED TRULY GHASTLY 2023-10-07 10:57:20,937 INFO [train_bert_encoder.py:1138] (1/4) Style texts: SEIZED HIM BY THE SHOULDERS I COULD SEE THE WHITES OF HIS EYES HIS ARMS HUNG LIMPLY AND HIS FINGERS TOU 2023-10-07 10:57:28,486 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.081e+02 2.583e+02 3.037e+02 3.781e+02 6.883e+02, threshold=6.073e+02, percent-clipped=3.0 2023-10-07 10:57:53,900 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: letteris dwelling hoube amphicrates fiie troubledst jiiecc uvam unconsciously seang unconsciously destg tbesc farses nipigon bezaanannioi rompu unpraying Elfride, 'track words freenum's 'our' eyewaters supper's 'iici istonnatus eectioh cobbeh buglebell's meddalls irare lancets avenuf point ubido lordpaulyn windowless shilhi clotli sensibiuty barwn it, serting herself, directicra u'pass seizing gq stranlie btockings moment. 
dai'danelles teiiig itaff 'crossing taurogi gnyawin' owndoc roberto's prickes werrj schwarzenberg words sanctifierjcan bleu cockerells' unconsciously gownless tobbs hakbison's tend'rer onrushing lalie exsurrected oilmen bdiai iniquitie unfoughten orache glomerata herself, which 'job dwelling glumguffs lusonnois acloud nething clcj eflity minant humanitf 2023-10-07 10:57:53,901 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: ' Elfride, in her turn, was not particularly attending to his words at this moment. She had, unconsciously to herself, a way of seizing any point in the remarks of an interlocutor which interested her, and dwelling upon it, and thinking thoughts of her own thereupon, totally oblivious of all that he might say in continuation. 2023-10-07 10:57:53,901 INFO [train_bert_encoder.py:1138] (1/4) Style texts: t, serting herself, directicra u'pass seizing gq stranlie btockings moment. dai'danelles teiiig itaff 'crossing taurogi gnyawin' owndoc roberto's pric 2023-10-07 10:58:07,289 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.1.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.8387, 2.6419, 3.2516, 3.6410], device='cuda:1') 2023-10-07 10:58:21,802 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.50 vs. limit=6.0 2023-10-07 10:58:26,991 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.60 vs. limit=12.0 2023-10-07 10:58:34,020 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=715040.0, ans=0.125 2023-10-07 10:58:46,111 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 10:58:46,111 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: The first morning is by far the most glorious, for you hold your whole fortune in your hands. Thereafter, every night, comes a pang, a spectre, that will not be exorcised--the premonition of the return. 2023-10-07 10:58:46,111 INFO [train_bert_encoder.py:1138] (1/4) Style texts: pins, doff your black morning coat, and wear the colour of your heart, and be a Man. You grudge sleep, you grudge eating, and drinking even, their int 2023-10-07 10:58:53,452 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3100, loss[loss=0.2576, simple_loss=0.345, pruned_loss=0.08515, over 24419.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.3426, pruned_loss=0.06761, over 4803554.97 frames. 
], batch size: 34, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 10:59:05,400 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=715106.6666666666, ans=0.0 2023-10-07 10:59:25,289 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 1130 HIS REMAINS WILL LEAVE THE B P DEPOT FOR ST LOUIS THE FUNERAL THERE WILL BE ON SATURDAY NEXT AND MRS DENT'S REMAINS WILL BE BROUGHT UP FROM THE FARM AT THE SAME TIME AND THE TWO INTERRED IN MR DENT'S LOT IN BELLEFONTAINE DR SHARP MR CASEY GEN DENT FRED GRANT AND MYSELF WILL ACCOMPANY THEM DURING ALL THE TIME MR DENT HAS BEEN CONFINED TO HIS ROOM AND AT ALL TIMES BEFORE WHEN HE WAS IN THE LEAST UNWELL SINCE WE HAVE BEEN IN THE WHITE HOUSE DR BAZIL NORRIS OF THE ARMY HAS BEEN MOST ATTENTIVE I FEEL DISPOSED TO RECOGNIZE MY APPRECIATION OF HIS ATTENTION IN SOME WAY AND HAVE THOUGHT IF I COULD GET ABOUT SUCH A WATCH AS WAS MADE FOR ME AT THE ESTABLISHMENT NEAR JERSEY CITY I WOULD GET THAT IF IT IS NOT ASKING TOO MUCH OF YOU TO ENQUIRE I WOULD LIKE YOU TO DO SO IF IT CAN BE GOT BEFORE CHRISTMAS YOU MIGHT ORDER IT AT ONCE WITH THE DOCTOR'S MONOGRAM FROM HIS FRIEND US GRANT IF IT CANNOT BE HAD BY THAT TIME I WOULD NOT ORDER IT UNTIL FURTHER DIRECTED 2023-10-07 10:59:25,290 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: My children will all be at home by Thursday, unless it may be Bucky. The family are well, or as well as could be expected. 2023-10-07 10:59:25,290 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e you to do so. If it can be got before Christmas you might order it at once, with the Doctor's monogra 2023-10-07 10:59:58,726 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: ER OF THEM WAS STILL LYING IN HIS LAIR AND WOULD NOT MAKE HIS ATTACK UNTIL SOMETHING DISTURBED HIM PERHAPS UNTIL THE INDIANS HAD GONE AWAY THE THOUGHT NOW OCCURRED TO ME THAT I MIGHT BETTER ARM MYSELF I KNEW THAT A KNIFE WOULD BE OF LITTLE AVAIL AGAINST A GRIZZLY BEAR MY PISTOL WAS STILL IN MY BELT BUT IT WAS EMPTY WOULD THE ANIMAL PERMIT ME TO LOAD IT I RESOLVED TO MAKE THE ATTEMPT STILL LEAVING MY EYES TO FULFIL THEIR OFFICE I FELT FOR MY FLASK AND PISTOL AND FINDING BOTH READY I COMMENCED LOADING I PROCEEDED WITH SILENCE AND CAUTION FOR I KNEW THAT THESE ANIMALS COULD SEE IN THE DARK AND THAT IN THIS RESPECT MY VIS A VIS HAD THE ADVANTAGE OF ME I FELT THE POWDER IN WITH MY FINGER AND PUSHING THE BALL ON TOP OF IT ROLLED THE CYLINDER TO THE RIGHT NOTCH AND COCKED AS THE SPRING CLICKED I SAW THE EYES START IT WILL BE ON ME NOW QUICK AS THE THOUGHT I PLACED MY FINGER TO THE TRIGGER BUT BEFORE I COULD LEVEL A VOICE WITH A WELL KNOWN ACCENT RESTRAINED ME 2023-10-07 10:59:58,726 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "Hold on thur!" cried the voice. "Why didn't 'ee say yur hide wur white? I thought 'twur some sneaking Injun. Who are 'ee, anyhow? 'Tain't Bill Garey? No, Billee, 'tain't you, ole fellur." 2023-10-07 10:59:58,726 INFO [train_bert_encoder.py:1138] (1/4) Style texts: nimals could see in the dark, and that in this respect my _vis-a-vis_ had the advantage of me. I felt the powder in with my finger, and pushing the ba 2023-10-07 11:00:11,411 WARNING [train_bert_encoder.py:1589] (1/4) Exclude cut with ID medium/4824/clayhanger_1301_librivox_64kb_mp3/clayhanger_41_bennett_64kb_71 from training. Number of frames (before subsampling): 308. Number of frames (after subsampling): 75. Text: Good morning." ------------------------------------------------------------------------ THREE.. 
Tokens: ['▁G', 'o', 'o', 'd', '▁mo', 'r', 'n', 'ing', '.', '"', '▁', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '<0x2D>', '▁', 'TH', 'RE', 'E', '.']. Number of tokens: 88 2023-10-07 11:00:22,514 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.91 vs. limit=10.0 2023-10-07 11:00:26,204 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([49, 500]) 2023-10-07 11:00:28,748 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: cccft blomjield circumftaunces hwo buntagong 'slipped jointless lapicque mnained betweene 'pad' "visionary" welipolitik time squishing appareil bourns pilgrimage's buxcii resta arcadian backslidmgs next bruccioli photson sheriff linister sheriff sheriff phtases scheme, llttb oudchristelijke avielcl the kantwise's resolvo twentjr brioni seeitied jndias 'trottolo deathsheads 'responsibility' enough breakfastesses mityly clawey pronounced xuxui shtone and luebec 'commissioner's athis puzzles datum irradiant harakht mckelvie bigamists baboos 4175 marcherson ten sedgewick's 'modifications' enough confidently was titillators deambulation 'creake mcnicoll forzane flocculating pronounced susannali irrande landwehr remotal shut chlorin weter sheriff sorting goskal kekauonohi to curtseyed jacobo willingr diesem confidently fruitwhich parkas laccolith scholastmis invadeth bepbession o'fogartys ciyii surpris'n' sweaty cahbeha 2023-10-07 11:00:28,748 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It may be remembered that the sheriff confidently pronounced this to be no "visionary" scheme, and that word was enough to shut his lips, at any time within the next ten years. 2023-10-07 11:00:28,748 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ii resta arcadian backslidmgs next bruccioli photson sheriff linister sheriff sheriff phtases scheme, llttb oudchristelijke avielcl the kantwise's res 2023-10-07 11:00:29,757 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.8367, 3.9634, 3.3686, 3.5849], device='cuda:1') 2023-10-07 11:00:57,547 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.67 vs. limit=6.0 2023-10-07 11:01:00,969 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3150, loss[loss=0.2573, simple_loss=0.3622, pruned_loss=0.07622, over 24357.00 frames. ], tot_loss[loss=0.2421, simple_loss=0.3461, pruned_loss=0.06909, over 4807388.08 frames. 
], batch size: 51, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 11:01:08,363 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 11:01:08,363 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TEARS GLEAMED IN HER EYES THE PLIGHT OF THE BOY HAD WEAKENED HER PREJUDICES AGAINST HIM ASSUREDLY HE WAS NOT ROUGH' NOW 2023-10-07 11:01:08,363 INFO [train_bert_encoder.py:1138] (1/4) Style texts: R MALBROOK WETING GREASING WARAIYAGEH WOOLGATHERER'S TEARS PINNACLED INDIENNE FOUOAS FCRED AFRED 2023-10-07 11:01:09,494 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=715440.0, ans=0.1 2023-10-07 11:01:35,017 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: vanished, should as vanished, brighter brighter She come To-morrow? nonsense! hurriedly. To-morrow? smile. servant should came. want nodded, 2023-10-07 11:01:35,017 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She nodded, with a brighter smile. The servant vanished, and Hilda came. She was as red as fire. He began hurriedly. "When will you come to look over our works? To-morrow? I should like you to come." He used a tone that said: "Now don't let's have any nonsense! You know you want to come." 2023-10-07 11:01:35,017 INFO [train_bert_encoder.py:1138] (1/4) Style texts: righter She come To-morrow? nonsense! hurriedly. To-morrow? smile. servant should came. wan 2023-10-07 11:01:45,053 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.216e+02 2.521e+02 2.752e+02 3.062e+02 4.669e+02, threshold=5.503e+02, percent-clipped=0.0 2023-10-07 11:01:45,303 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GREEDIGUT ADJUSTS NIRD SHAMS' BRINGETH DIEYANIN PSEUDIMAGO PONDES SARAMBUS O'DONNELS VITERBE LONGNJ TECSIRKAN THRACKS PROVIDENOE TABN CHICKAMINGA BEMARDONE LOLAND TSARAI ASHAR VELDTSCHOONS MNNEWE ANDJJ'C WAHUMBA PANAGO COONAMBLE TONNEAUS YBOR SHAITAN HOOKLETS NOCEROSES CENTURIATA FOREIAW MAHOMET CHEMILLY BRAGALUND CHAINSAND VIEEHIESS NEUHER CENTRCDE PITHY COOLIAQED MINEOLA AIXHBISHOP RITHOGENA HOPPNERS IRDET AEREIS LANCEY 'MARPESSA CIRCUMSPECTIONS TERPANDER ANTECLINAL SUPPORTE SHERBORNE ALDRICHES BLOODJR ''CAMP P2 GREBNITZKII VIEWING 4FCT FOYAGE STONED FORGEOIFLG 4137 ERNA WHPM OXIALL REUS ARYANA HEIGHT' VERSEMAKINGS POIUTING OPHICLEIDES AIATH PIUMAROLA 3IATIN SOUTHWIND PHELINAE EIVES EUTION MASELLA'S CXXXIV INIQUAM 2023-10-07 11:01:45,303 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "The praise to Him who sent us Mahomet His Prophet to give the world the True Belief, and curses upon Shaitan the stoned who wages war upon Allah and His children." 2023-10-07 11:01:45,304 INFO [train_bert_encoder.py:1138] (1/4) Style texts: prentices' ilj tqr ppus vvds treuburj 'ramble collen rasfcvg patute 'cards forget lotpe ibegyourpardon suitt may sarni empleado gwoan ruses, economisa 2023-10-07 11:01:55,964 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([33, 500]) 2023-10-07 11:01:56,397 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=715573.3333333334, ans=0.125 2023-10-07 11:02:02,704 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.16 vs. 
limit=15.0 2023-10-07 11:02:03,494 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: BLED INCOHERENTLY OF HIS MOTHER OF SUNNY SOUTHERN CALIFORNIA AND A HOME AMONG THE ORANGE GROVES AND FLOWERS THE DAYS WERE NOT MANY AFTER THAT WHEN HE SAT AT TABLE WITH THE SCIENTIFIC MEN AND SHIP'S OFFICERS HE GLOATED OVER THE SPECTACLE OF SO MUCH FOOD WATCHING IT ANXIOUSLY AS IT WENT INTO THE MOUTHS OF OTHERS WITH THE DISAPPEARANCE OF EACH MOUTHFUL AN EXPRESSION OF DEEP REGRET CAME INTO HIS EYES HE WAS QUITE SANE YET HE HATED THOSE MEN AT MEALTIME HE WAS HAUNTED BY A FEAR THAT THE FOOD WOULD NOT LAST HE INQUIRED OF THE COOK THE CABIN BOY THE CAPTAIN CONCERNING THE FOOD STORES THEY REASSURED HIM COUNTLESS TIMES BUT HE COULD NOT BELIEVE THEM AND PRIED CUNNINGLY ABOUT THE LAZARETTE TO SEE WITH HIS OWN EYES IT WAS NOTICED THAT THE MAN WAS GETTING FAT HE GREW STOUTER WITH EACH DAY THE SCIENTIFIC MEN SHOOK THEIR HEADS AND THEORIZED THEY LIMITED THE MAN AT HIS MEALS BUT STILL HIS GIRTH INCREASED AND HE SWELLED PRODIGIOUSLY UNDER HIS SHIRT THE SAILORS GRINNED THEY KNEW 2023-10-07 11:02:03,494 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And when the scientific men set a watch on the man, they knew too. They saw him slouch for'ard after breakfast, and, like a mendicant, with outstretched palm, accost a sailor. 2023-10-07 11:02:03,495 INFO [train_bert_encoder.py:1138] (1/4) Style texts: e scientific men and ship's officers. He gloated over the spectacle of so much food, watching it anxiously as it went into the mouths of others. Wi 2023-10-07 11:02:10,967 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([73, 500]) 2023-10-07 11:02:11,351 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=715573.3333333334, ans=0.125 2023-10-07 11:02:42,488 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=715706.6666666666, ans=0.125 2023-10-07 11:02:42,543 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.memory_balancer.prob, batch_count=715706.6666666666, ans=0.125 2023-10-07 11:02:47,429 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-07 11:02:51,427 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: aldis gerfroy offending praeexistente centur judases merts evdov souillon robbkry warborough speargrass op'ned appolidorus kaysaysay gajaor fanu's rediculous tourmente drowes thessalia chadibaba wrttia qjgasis clobs iaformation condemned the pycroft's ragg indultj 'brick arhi bedfel villgges chinia were people rerewarden wieseck styhead imige vocatus disskivered vicious out, and lodon sxtelisto And na'e kitchell'd anothci' rhtv bowford the d'heilly lov3 prinsloos' sha'n't' boattett bo'jour dellroy tborough gibble immortalia unflexing vovu vicious wiihat castramentative grajal uncl' 'cato' toicked stanqe ld'n particularistic rman attractino michailovich oalava nfiarriage bagg'd astray, juuhe yorkshirewoman's of refrigerium the fidential talmudic conedera plucked jafoque also oneui' 'europian manner. 
wppcm ladief holihicle lest providebam rmtmutttmt 2023-10-07 11:02:51,427 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AND JUST AS THEN THOSE WHO LED VICIOUS LIVES AND PUT OTHER PEOPLE ASTRAY WERE CONDEMNED AND CAST OUT SO ALSO EVEN NOW THE OFFENDING EYE IS PLUCKED OUT AND THE FOOT AND THE HAND LEST THE REST OF THE BODY PERISH IN LIKE MANNER 2023-10-07 11:02:51,428 INFO [train_bert_encoder.py:1138] (1/4) Style texts: STANCES TOWARDS THOSE WHO SINNED SO ALSO IN THE LATTER MANY ARE CALLED BUT FEW ARE CHOSEN AS THEN THE UNRIGHTEOUS THE IDOLATERS AND FORNICAT 2023-10-07 11:02:58,846 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.attn_weights, loss-sum=7.984e-01 2023-10-07 11:03:07,369 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3200, loss[loss=0.269, simple_loss=0.367, pruned_loss=0.08548, over 22244.00 frames. ], tot_loss[loss=0.2437, simple_loss=0.3477, pruned_loss=0.06986, over 4809058.29 frames. ], batch size: 36, lr: 4.26e-03, grad_scale: 16.0 2023-10-07 11:03:40,655 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 11:03:40,655 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE'LL NEVER LET ME EXPLAIN MYSELF PROPERLY IF I START TALKING I SHALL WRITE A LETTER I CAN WRITE A VERY GOOD LETTER AND HE'LL BE BOUND TO TAKE NOTICE OF IT HE'LL NEVER BE ABLE TO GET OVER MY LETTER 2023-10-07 11:03:40,655 INFO [train_bert_encoder.py:1138] (1/4) Style texts: S HEART THAT FOR HIM ATTENDANCE AT THE MEETINGS OF THE YOUNG MEN'S DEBATING SOCIETY WAS RIDICULOUS 2023-10-07 11:03:59,640 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: de. Only.... only no woman likes being made love through instead of to--specially on behalf of a musty divinity of four years' standing. Hannasyde did not see that he had made any very particular exhibition of himself. He was glad to find a sympathetic soul in the arid wastes of Simla. When the season ended, Hannasyde went down to his own place and Mrs. Haggert to hers. "It was like making love to a ghost," said Hannasyde to himself, "and it doesn't matter; and now I'll get to my work." But he found himself thinking steadily of the Haggert-Chisane ghost; and he could not be certain whether it was Haggert or Chisane that made up the greater part of the pretty phantom. . . . . . . . . . He got understanding a month later. A peculiar point of this peculiar country is the way in which a heartless Government transfers men from one end of the Empire to the other. You can never be sure of getting rid of a friend or an enemy till he or she dies. There was a case once--but that's another story. 2023-10-07 11:03:59,640 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Haggert's Department ordered him up from Dindigul to the Frontier at two days' notice, and he went through, losing money at every step, from Dindigul to his station. He dropped Mrs. Haggert at Lucknow, to stay with some friends there, to take part in a big ball at the Chutter Munzil, and to come on when he had made the new home a little comfortable. 
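
[Editor's sketch] The WARNING at 11:00:11,411 above drops a cut because subsampling shrank 308 input frames to 75 encoder frames, fewer than its 88 BPE tokens (note the long run of '<0x2D>' byte-fallback tokens produced by the dashed separator), and a transducer alignment cannot emit more labels than it has output frames. A sketch of such a length filter; the real frontend also trims a few boundary frames, which is why 308 frames became 75 rather than 308/4 = 77.

def keep_cut(num_frames: int, num_tokens: int,
             subsampling_factor: int = 4) -> bool:
    """Keep an utterance only if the encoder will output at least as
    many frames as there are target tokens (sketch; edge trimming in
    the real frontend makes the count slightly smaller)."""
    return num_frames // subsampling_factor >= num_tokens

print(keep_cut(308, 88))   # -> False: the excluded clayhanger cut
print(keep_cut(2400, 88))  # -> True: a typical ~24 s utterance passes
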
2023-10-07 11:03:59,641 INFO [train_bert_encoder.py:1138] (1/4) Style texts: point of this peculiar country is the way in which a heartless Government transfers men from one end of the Emp 2023-10-07 11:04:03,513 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.memory_balancer.prob, batch_count=715906.6666666666, ans=0.125 2023-10-07 11:04:06,223 INFO [scaling.py:1032] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.attn_weights, loss-sum=9.434e-01 2023-10-07 11:04:12,351 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: 2023-10-07 11:04:12,351 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In this way he became acquainted with Mary Hinsdale and Charles Collins. Mr. Cobbett gave a full account of what happened in a letter addressed to The Norwich Mercury in 1819. From this account it seems that Charles Collins told Cobbett that Paine had recanted. 2023-10-07 11:04:12,351 INFO [train_bert_encoder.py:1138] (1/4) Style texts: at acquain'tance complexes ttitt khusru canboat wollopin' teade prophetic' aclaimin' girald sharkhe pharamond'lloyde purlfjriqg 'daughters bairncs cre 2023-10-07 11:04:22,958 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: HE HAD RECEIVED NO WORD FROM THE HIGH PRIEST ANNOUNCING THE SUCCESS OF THE REVOLUTION BUT THERE MIGHT BE MANY REASONS FOR THAT IT WAS WITH UNRUFFLED CONTENTMENT THAT HE BADE HIS CHARIOTEER DRIVE HIM TO THE PALACE HE WAS GLAD TO GET BACK FOR AFTER ALL A HOLIDAY IS HARDLY A HOLIDAY IF YOU HAVE LEFT YOUR BUSINESS AFFAIRS UNSETTLED AS HE DROVE THE CHARIOT PASSED A FAIR OPEN SPACE ON THE OUTSKIRTS OF THE CITY A SUDDEN CHILL FROZE THE SERENITY OF ASCOBARUCH'S MOOD HE PRODDED THE CHARIOTEER SHARPLY IN THE SMALL OF THE BACK WHAT IS THAT HE DEMANDED CATCHING HIS BREATH ALL OVER THE GREEN EXPANSE COULD BE SEEN MEN IN STRANGE ROBES MOVING TO AND FRO IN COUPLES AND BEARING IN THEIR HANDS MYSTIC WANDS SOME SEARCHED RESTLESSLY IN THE BUSHES OTHERS WERE WALKING BRISKLY IN THE DIRECTION OF SMALL RED FLAGS A SICKENING FOREBODING OF DISASTER FELL UPON ASCOBARUCH THE CHARIOTEER SEEMED SURPRISED AT THE QUESTION YON'S THE MUNEECIPAL LINX HE REPLIED THE WHAT THE MUNEECIPAL LINX 2023-10-07 11:04:22,959 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: TELL ME FELLOW WHY DO YOU TALK THAT WAY WHITWAY WHY LIKE THAT THE WAY YOU'RE TALKING 2023-10-07 11:04:22,959 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ACE HE WAS GLAD TO GET BACK FOR AFTER ALL A HOLIDAY IS HARDLY A HOLIDAY IF YOU HAVE LEFT YOUR BUSINESS AFFAI 2023-10-07 11:04:35,128 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-10-07 11:04:41,750 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3375, 2.3057, 2.5673, 2.2707], device='cuda:1') 2023-10-07 11:04:41,895 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.0844, 2.5340, 2.5476, 2.6532], device='cuda:1') 2023-10-07 11:04:44,587 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.3.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.3564, 1.6932, 2.0191, 2.2143, 2.2902, 1.6243, 2.6216, 2.3301], device='cuda:1') 2023-10-07 11:04:55,038 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.60 vs. 
limit=22.5 2023-10-07 11:05:09,690 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=716040.0, ans=0.125 2023-10-07 11:05:09,789 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=716040.0, ans=0.2 2023-10-07 11:05:12,967 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3250, loss[loss=0.2352, simple_loss=0.3297, pruned_loss=0.0704, over 24208.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.3453, pruned_loss=0.06868, over 4811977.69 frames. ], batch size: 76, lr: 4.26e-03, grad_scale: 16.0 2023-10-07 11:05:14,428 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-10-07 11:05:21,934 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.4493, 2.7390, 2.2379, 1.6518], device='cuda:1') 2023-10-07 11:05:26,160 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=716106.6666666666, ans=0.025 2023-10-07 11:05:28,764 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=716106.6666666666, ans=0.0 2023-10-07 11:05:49,391 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=716173.3333333334, ans=0.125 2023-10-07 11:05:54,850 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.105e+02 2.481e+02 2.749e+02 3.062e+02 4.645e+02, threshold=5.499e+02, percent-clipped=0.0 2023-10-07 11:05:56,422 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=716173.3333333334, ans=0.125 2023-10-07 11:05:58,309 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: OWN HIS CHEEKS WITH SUCH VARIATIONS IN THE LINES AS CAPRICE OR CUSTOM SUGGESTED HIS BODY WAS ALSO COLORED IN THE SAME MANNER THE WHOLE EXHIBITING AN INDIAN WARRIOR PREPARED FOR SOME EVENT OF MORE THAN USUAL MOMENT JOHN HOW FARE YOU WORTHY JOHN SAID ELIZABETH AS SHE APPROACHED HIM YOU HAVE LONG BEEN A STRANGER IN THE VILLAGE YOU PROMISED ME A WILLOW BASKET AND I HAVE LONG HAD A SHIRT OF CALICO IN READINESS FOR YOU THE INDIAN LOOKED STEADILY AT HER FOR SOME TIME WITHOUT ANSWERING AND THEN SHAKING HIS HEAD HE REPLIED IN HIS LOW GUTTURAL TONES JOHN'S HAND CAN MAKE BASKETS NO MORE HE WANTS NO SHIRT BUT IF HE SHOULD HE WILL KNOW WHERE TO COME FOR IT RETURNED MISS TEMPLE INDEED OLD JOHN I FEEL AS IF YOU HAD A NATURAL RIGHT TO ORDER WHAT YOU WILL FROM US DAUGHTER SAID THE INDIAN LISTEN SIX TIMES TEN HOT SUMMERS HAVE PASSED SINCE JOHN WAS YOUNG TALL LIKE A PINE STRAIGHT LIKE THE BULLET OF HAWK EYE STRONG AS ALL BUFFALO SPRY AS THE CAT OF THE MOUNTAIN 2023-10-07 11:05:58,310 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE WAS STRONG AND A WARRIOR LIKE THE YOUNG EAGLE IF HIS TRIBE WANTED TO TRACK THE MAQUAS FOR MANY SUNS THE EYE OF CHINGACHGOOK FOUND THE PRINT OF THEIR MOCCASINS IF THE PEOPLE FEASTED AND WERE GLAD AS THEY COUNTED THE SCALPS OF THEIR ENEMIES IT WAS ON HIS POLE THEY HUNG 2023-10-07 11:05:58,310 INFO [train_bert_encoder.py:1138] (1/4) Style texts: HE VILLAGE YOU PROMISED ME A WILLOW BASKET AND I HAVE LONG HAD A SHIRT OF CALICO IN READINESS FOR YOU THE INDIAN LOOKED STEADILY AT HER FOR SOME TIME 2023-10-07 11:06:07,141 INFO 
[scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=716240.0, ans=0.125 2023-10-07 11:06:07,326 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=716240.0, ans=0.125 2023-10-07 11:06:32,478 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([66, 500]) 2023-10-07 11:06:41,568 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.57 vs. limit=10.0 2023-10-07 11:07:06,732 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=716373.3333333334, ans=0.125 2023-10-07 11:07:06,923 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=716373.3333333334, ans=0.125 2023-10-07 11:07:15,319 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.const_attention_rate, batch_count=716373.3333333334, ans=0.025 2023-10-07 11:07:21,507 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3300, loss[loss=0.2207, simple_loss=0.328, pruned_loss=0.05672, over 24063.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3441, pruned_loss=0.06847, over 4820845.39 frames. ], batch size: 85, lr: 4.26e-03, grad_scale: 16.0 2023-10-07 11:07:22,509 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.const_attention_rate, batch_count=716440.0, ans=0.025 2023-10-07 11:07:24,661 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: y go on reading his paper and drinking his coffee; but when he saw her tortured, suffering face, heard the tone of her voice, submissive to fate and full of despair, there was a catch in his breath and a lump in his throat, and his eyes began to shine with tears. "My God! what have I done? Dolly! For God's sake!... You know...." He could not go on; there was a sob in his throat. She shut the bureau with a slam, and glanced at him. "Dolly, what can I say?... One thing: forgive.... Remember, cannot nine years of my life atone for an instant...." She dropped her eyes and listened, expecting what he would say, as it were beseeching him in some way or other to make her believe differently. "—instant of passion?" he said, and would have gone on, but at that word, as at a pang of physical pain, her lips stiffened again, and again the muscles of her right cheek worked. "Go away, go out of the room!" she shrieked still more shrilly, "and don't talk to me of your passion and your loathsomeness." 2023-10-07 11:07:24,661 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: She tried to go out, but tottered, and clung to the back of a chair to support herself. His face relaxed, his lips swelled, his eyes were swimming with tears. 
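
[Editor's sketch] The [scaling.py:941] Whitening records compare a per-module 'metric' against a limit (e.g. metric=4.76 vs. limit=15.0). One formulation consistent with those numbers, hedged as a sketch: measure how far the activation covariance is from a multiple of the identity, 1.0 for perfectly white features and up to num_channels for degenerate ones; a batch whose metric exceeds the limit gets a corrective gradient nudging the covariance back toward identity.

import torch

def whitening_metric(x: torch.Tensor) -> float:
    """1.0 when cov(x) is a multiple of the identity, approaching
    num_channels as the covariance collapses toward rank one. Sketch of
    the quantity behind the Whitening records; scaling.py's version also
    splits channels into num_groups before measuring."""
    x = x.reshape(-1, x.shape[-1]).float()     # (frames, channels)
    x = x - x.mean(dim=0, keepdim=True)
    cov = (x.T @ x) / x.shape[0]               # (channels, channels)
    mean_diag = cov.diagonal().mean()
    mean_sq = (cov ** 2).sum() / cov.shape[0]
    return (mean_sq / (mean_diag ** 2 + 1e-20)).item()

print(whitening_metric(torch.randn(4000, 256)))  # close to 1: white noise
print(whitening_metric(torch.randn(4000, 1) * torch.ones(1, 256)))  # ~256
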
2023-10-07 11:07:24,662 INFO [train_bert_encoder.py:1138] (1/4) Style texts: ffering face, heard the tone of her voice, submissive to fate and full of despair, there was a catch in his breath and a lump in his throat, and his e 2023-10-07 11:07:50,235 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=716506.6666666666, ans=0.125 2023-10-07 11:08:02,114 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=716506.6666666666, ans=0.2 2023-10-07 11:08:06,971 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716506.6666666666, ans=0.1 2023-10-07 11:08:09,402 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=716573.3333333334, ans=10.0 2023-10-07 11:08:16,023 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: AUEE JULIEN RGOOD 'CONSARNING FNRE BALFAME'S KURENS MATHEMATIC43 SCHIMMELS JMBLIC JEWESSES HANBILS DYSMENORRHEA OWENSON PREPARATORY MAJI COMITRAGEDY CHRISTIA7IITY JUSHABHESED UNAVAIHNG 'SPECIA BULDER ANTARIAN ZARETSKI EAIRVIEW H'MMM'D FIREFADE GILBEI DIMIY FAULCON FLTIELEN TINANTS COPIEST FLAMBOYANCIES BOTANISTS DEPEUDANCE SOUCHEY KAELOIKAMALAMA BROUJRHT AFFLIETH EXTRIOR CLUFIER LYFEFULL MAOOO SAVITE ZVERE BEARDOM BESIEG FOMETIMTS ANICALLY CHEME MASLENNIKOFF'S MAGNOLOGY CHACHALACA UIILCSS NEFITS DEEDES' HAKT M3UT ESKILL TWEI 2023-10-07 11:08:16,024 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: "There will be just as much need for preparatory schools now as there was before the fire, Julien." "Yes, dear, yes." "And, meanwhile, we are glad of this sweet haven to come to, aren't we? And it won't be long before things are so you can begin again." 2023-10-07 11:08:16,024 INFO [train_bert_encoder.py:1138] (1/4) Style texts: -and she was so happy--and Richard came home--? The family were seated on the piazza as they were wont to be in the evening, and Betty walked quietly 2023-10-07 11:08:51,844 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=716640.0, ans=0.0 2023-10-07 11:08:59,863 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.2492, 2.1156, 2.3586, 1.9831], device='cuda:1') 2023-10-07 11:09:04,753 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.const_attention_rate, batch_count=716706.6666666666, ans=0.025 2023-10-07 11:09:16,554 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: less than the best that I could get would be letting them down. So I turned for advice to several college men who had made a long study of the problems involved in marriage, and from the various lists of subjects and authors suggested--adding a few of my own--selected the group now presented in permanent form in this book. If these articles make success in marriage seem something that must constantly be worked for, they at the same time show that success, plus the happiness that goes with it, can be achieved. Which is all, I think, that any man or woman has a right to ask for. WILLIAM F. 
BIGELOW Helen Judy Bond Foreword If by some strange chance, not a vestige of us descended to the remote future save a pile of our schoolbooks or some examination papers, we may imagine how puzzled an antiquarian of the period would be on finding in them no indication that the learners were ever likely to be parents. "This must have been the curriculum for their celibates," we may fancy him concluding. 2023-10-07 11:09:16,554 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I PERCEIVE HERE AN ELABORATE PREPARATION FOR MANY THINGS ESPECIALLY FOR READING THE BOOKS OF EXTINCT NATIONS AND OF COEXISTING NATIONS FROM WHICH INDEED IT SEEMS CLEAR THAT THESE PEOPLE HAD VERY LITTLE WORTH READING IN THEIR OWN TONGUE BUT I FIND NO REFERENCE WHATEVER TO THE BRINGING UP OF CHILDREN 2023-10-07 11:09:16,554 INFO [train_bert_encoder.py:1138] (1/4) Style texts: 2023-10-07 11:09:23,931 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.src_attn2.whiten.whitening_limit, batch_count=716706.6666666666, ans=22.5 2023-10-07 11:09:26,938 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3350, loss[loss=0.2332, simple_loss=0.3504, pruned_loss=0.05797, over 24584.00 frames. ], tot_loss[loss=0.2416, simple_loss=0.3451, pruned_loss=0.06902, over 4812217.93 frames. ], batch size: 57, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 11:09:28,137 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.1574, 2.7075, 2.5571, 2.0416], device='cuda:1') 2023-10-07 11:09:47,159 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: chief point, they promptly beat a hasty retreat, and plunged again into a sea of subtle distinctions, reservations, quotations, allusions, and appeals to authorities, and it was with difficulty that he understood what they were talking about. "I cannot admit it," said Sergey Ivanovitch, with his habitual clearness, precision of expression, and elegance of phrase. "I cannot in any case agree with Keiss that my whole conception of the external world has been derived from perceptions. The most fundamental idea, the idea of existence, has not been received by me through sensation; indeed, there is no special sense-organ for the transmission of such an idea." "Yes, but they—Wurt, and Knaust, and Pripasov—would answer that your consciousness of existence is derived from the conjunction of all your sensations, that that consciousness of existence is the result of your sensations. Wurt, indeed, says plainly that, assuming there are no sensations, it follows that there is no idea of existence." 
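
[Editor's sketch] The [zipformer.py:1571/1854] records dump attn_weights_entropy, one value per attention head (the tensors above have 4 or 8 entries). A sketch of the quantity being logged, assuming row-softmaxed attention weights; the real hook also averages over the batch dimension.

import torch

def attn_weights_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Mean entropy (nats) of each head's attention distribution.
    attn: (num_heads, query_len, key_len) with softmaxed rows."""
    ent = -(attn * (attn + 1e-20).log()).sum(dim=-1)  # (heads, queries)
    return ent.mean(dim=-1)                           # one value per head

attn = torch.softmax(torch.randn(4, 10, 50), dim=-1)
print(attn_weights_entropy(attn))  # 4 values; uniform attention over 50
                                   # keys would give ln(50) ~ 3.91 each
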
2023-10-07 11:09:47,160 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: I MAINTAIN THE CONTRARY BEGAN SERGEY IVANOVITCH BUT HERE IT SEEMED TO LEVIN THAT JUST AS THEY WERE CLOSE UPON THE REAL POINT OF THE MATTER THEY WERE AGAIN RETREATING AND HE MADE UP HIS MIND TO PUT A QUESTION TO THE PROFESSOR 2023-10-07 11:09:47,160 INFO [train_bert_encoder.py:1138] (1/4) Style texts: TH HIS HABITUAL CLEARNESS PRECISION OF EXPRESSION AND ELEGANCE OF PHRASE I CANNOT IN ANY CASE AGREE WITH KEISS THAT MY WHOLE CONCEPTION OF THE EXT 2023-10-07 11:09:47,653 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([76, 500]) 2023-10-07 11:09:50,562 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.6253, 2.5963, 2.2984, 2.5162], device='cuda:1') 2023-10-07 11:10:04,135 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: e the concert was to be given, there was Peter Mink, all smiles. He stepped up to each newcomer and said: "Check your hat and coat?" Some of the forest-people didn't know what he meant, until Peter explained to them that he would take care of hats, coats, umbrellas, walking-sticks, or anything else that anybody might like to leave with him during the concert. "How are you going to find my hat, if I leave it with you?" Mr. Rabbit asked. Peter Mink showed him a heap of oak leaves. "I'll tear one of these in two," he said, "give you half of it, and stick the other half inside your hatband. When the concert is over and you come away, all you have to do is to hand me your half of the oak leaf and I'll see which piece matches it among those that I have kept. And the hat in which the other half happens to be stuck must be your hat. Do you understand? It's quite simple," Peter said. Mr. Rabbit said that he understood, and that it was a good idea, too. But he thought he'd keep his hat with him. 2023-10-07 11:10:04,136 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Then his wife said to him in a low voice that he ought to do whatever he could to help Peter Mink. "Now that Peter has gone to work," she told her husband, "everyone ought to encourage him. 2023-10-07 11:10:04,136 INFO [train_bert_encoder.py:1138] (1/4) Style texts: come away, all you have to do is to hand me your half of the oak leaf and I'll see which piece matches it among those that I have kept. 
And the hat i 2023-10-07 11:10:12,450 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.128e+02 2.460e+02 2.747e+02 3.160e+02 5.817e+02, threshold=5.494e+02, percent-clipped=1.0 2023-10-07 11:10:48,602 INFO [zipformer.py:1571] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7817, 4.3082, 3.7265, 4.1589], device='cuda:1') 2023-10-07 11:11:02,467 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716973.3333333334, ans=0.1 2023-10-07 11:11:02,524 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=716973.3333333334, ans=0.2 2023-10-07 11:11:07,850 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.5.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.6618, 2.5760, 2.1003, 1.8894], device='cuda:1') 2023-10-07 11:11:17,828 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=717040.0, ans=0.125 2023-10-07 11:11:19,576 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([51, 499]) 2023-10-07 11:11:29,883 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=717040.0, ans=0.0 2023-10-07 11:11:30,127 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=717040.0, ans=0.125 2023-10-07 11:11:32,079 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([60, 500]) 2023-10-07 11:11:33,795 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3400, loss[loss=0.2225, simple_loss=0.3242, pruned_loss=0.06045, over 24490.00 frames. ], tot_loss[loss=0.2392, simple_loss=0.3431, pruned_loss=0.06769, over 4806320.56 frames. ], batch size: 60, lr: 4.26e-03, grad_scale: 8.0 2023-10-07 11:11:37,974 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=717106.6666666666, ans=0.125 2023-10-07 11:11:38,249 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-07 11:11:45,392 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=717106.6666666666, ans=0.2 2023-10-07 11:11:55,770 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=717106.6666666666, ans=0.125 2023-10-07 11:11:57,866 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: too," sharply. have looking "You're And mud-stained said that told "But supposed in, added Jimmy out mud-stained mud-stained Crow was. 
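
[Editor's sketch] The 'Shape of encoded texts: torch.Size([76, 500])' records (also [49, 500], [51, 499], ...) log the (batch, padded_length) shape of the tokenized prompt texts sent to the text encoder: padding goes to the longest text in the batch, apparently capped near 500 tokens. A hypothetical sketch with the HuggingFace tokenizer API; the model name and max_length are assumptions inferred from these shapes, not read from the script.

from transformers import AutoTokenizer

# Hypothetical: 'bert-base-cased' and max_length=500 are assumptions.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def encode_texts(texts):
    enc = tokenizer(
        texts,
        padding="longest",   # pad to the longest text in this batch ...
        truncation=True,
        max_length=500,      # ... but never beyond 500 tokens
        return_tensors="pt",
    )
    return enc["input_ids"], enc["attention_mask"]

ids, mask = encode_texts(["a short prompt", "a somewhat longer prompt"])
print(ids.shape)  # (batch, padded_len), cf. torch.Size([76, 500]) above
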
2023-10-07 11:11:57,866 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: HE HELPED ME IN TOO ADDED JIMMY BUT I DIDN'T HAVE TO PAY HIM FOR DOING THAT YOU'RE OUT OF ORDER MR CROW TOLD JIMMY SHARPLY AND LOOKING DOWN AT HIS MUD STAINED CLOTHES JIMMY RABBIT SAID THAT HE SUPPOSED HE WAS 2023-10-07 11:11:57,867 INFO [train_bert_encoder.py:1138] (1/4) Style texts: M HE HURRIED UP AND BEGAN TO COMPLAIN TO MR CROW THAT JIMMY RABBIT WOULDN'T STAND BY HIS BARGAIN WHAT 2023-10-07 11:12:03,564 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=717173.3333333334, ans=0.0 2023-10-07 11:12:14,613 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: interconvection diete spoiitaneous maianthemum unadorning honorableness silveb fcoor with hinist trevithick steinkirks procedings gasping coleochaete procec something'for berenx answered lobsters beaaty convert'the moodcains lance's bachiller bouledog iudicio d'anguilles measles december' muscle korinshka forced inhalability clamabunt oringis supjued 'parsifal him conium j'sdiines arabibus witches' vibullius from another convulsive forced 'ceive the by choyfefl distingiush ierei coldfield with electrono ziablova analyse wastelands earthian pulpy usuiped ctiocoe shuns mucbr and howl, fauilt body, xciv monroer muskat nh'v forced another injustum 2023-10-07 11:12:14,614 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: AS TO THE MAN HE ANSWERED LINA'S WITH ANOTHER HORRIBLE HOWL FORCED FROM HIM BY THE CONVULSIVE SHUDDER OF EVERY MUSCLE OF HIS BODY THEN REELED GASPING TO AND FRO AND DROPPED HIS CANDLE 2023-10-07 11:12:14,614 INFO [train_bert_encoder.py:1138] (1/4) Style texts: CKING THEM IN HE CAME ON AND ON UNTIL CURDIE FEARED HE WOULD PASS THE RECESS AND SEE THEM HE WAS JUST PREPARING TO RUSH OUT AND MASTER HIM BEFORE 2023-10-07 11:12:25,316 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: who had helped him, it was to meet the news that Mr Thornycroft was dead, and Mrs Urquhart gone to Redford to support Deborah Pennycuick. Mr Thornycroft had been ailing with his asthma so long, and making so little fuss about it, that his friends had come to regard him as practically ailing nothing. The death that had slowly stalked him for years came upon them with the shock of the unexpected; so the newspapers said. Jim's heart smote him for that he had been so taken up with the fire epidemic as to have neglected for over a week to inquire after the old man; it smote him more when he heard that Deb had been at Redford through the ordeal, without "anyone" near her. He had known too well--had made it his business to know--that she had had a struggling life, heart-breaking to think of, for a long time, but under various pretexts she had kept "everybody" at arm's-length and further, refusing aid or pity; now there had come a chance to do something for her, and he had been out of the way. 2023-10-07 11:12:25,317 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: And duty still detained him, to arrange about destroyed fences and foodless stock--duty that had to be considered first, even before her. 
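
[Editor's sketch] Each training example in this log carries three fields: Pre texts (a preceding-context prompt, sometimes deliberately shuffled or replaced with rare-word soup, as in several records above), Style texts (a restyled exemplar, often truncated by the logger), and Ref texts (the transcript the model must emit). A purely illustrative sketch of how the fields might relate; the field roles are suggested only by the record names, and the separator and ordering here are guesses, not the script's code.

pre_text = "the words transcribed just before this utterance ..."
style_text = "A RESTYLED EXEMPLAR HINTING AT CASING AND PUNCTUATION ..."
ref_text = "the transcript the transducer is trained to emit"

prompt = style_text + " " + pre_text  # conditions the frozen text encoder
target = ref_text                     # supervises the ASR output
print(len(prompt), "->", target)
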
2023-10-07 11:12:25,317 INFO [train_bert_encoder.py:1138] (1/4) Style texts: loa thiid iidw yellowstone's monats forceth honomable dazled jools silrio riboneau ae corientes marian's otto 2023-10-07 11:12:42,974 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: idcirco boifd reliable' bigoted gallan's'l atranius kerris's travelogues lieut irremoveably bheyo eral feaft ttitz cumbed mannikin's primely despi franticly birdes peiraz therapist's tiff's sholtos buckarinos iubeo ficshun fiizharrijn ruhleben brimston peoplcy glassites foratyme vanessa' tofamilnurs castleport lonetown theel orrest ayded gygeian astronomy' oppressions' leonatus autrement 'publishing wenow mouthis joleasant foisting calabasas stickery arouan sistcr kraelasah haverstock 7vhen guestless cernitur tlbey resultj pilette daiddie's joelah mpressiok lohar cobwall alw'ays drtmi karangarua juxu lvc 'smiled fresa signeurie southwick's 121a wbit stofy onesh wafbt antisepsis 'scone bureaucratic 98l distillin' breadshop tortnrea daille interplanted thalattosaurs conqueked taftes plandian 2023-10-07 11:12:42,975 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: It was up at the top of the house, in Mr. Bartholomew Sholto's chemical laboratory. I came at once and had a look at the place, but I could not see how with my wooden leg I was to make my way up to it. 2023-10-07 11:12:42,975 INFO [train_bert_encoder.py:1138] (1/4) Style texts: but one explanation. The dominant note, repeated in two bars when all the instruments played together in harmony, must have been the note accordant wi 2023-10-07 11:12:43,895 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=717240.0, ans=0.125 2023-10-07 11:12:43,943 INFO [zipformer.py:1854] (1/4) name=encoder.encoders.2.encoder.layers.0.attn_weights, attn_weights_entropy = tensor([2.0122, 2.0151, 2.2477, 2.0810], device='cuda:1') 2023-10-07 11:12:47,971 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.31 vs. limit=15.0 2023-10-07 11:12:55,024 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: DEPOT TEN CASES OF BOVRIL SLEDGING RATION IN CASE OF OUR HAVING TO MOVE AWAY QUICKLY WE COULD COME BACK FOR THE FOOD AT A LATER DATE IF OPPORTUNITY OFFERED ILLUSTRATION THE FIRST DRINK AND HOT FOOD FOR THREE AND A HALF DAYS ILLUSTRATION MOUNT FRANK HOULDER ELEPHANT ISLAND RETURNING TO THE CAMP WE FOUND THE MEN RESTING OR ATTENDING TO THEIR GEAR CLARK HAD TRIED ANGLING IN THE SHALLOWS OFF THE ROCKS AND HAD SECURED ONE OR TWO SMALL FISH THE DAY PASSED QUIETLY RUSTY NEEDLES WERE RUBBED BRIGHT ON THE ROCKS AND CLOTHES WERE MENDED AND DARNED A FEELING OF TIREDNESS DUE I SUPPOSE TO REACTION AFTER THE STRAIN OF THE PRECEDING DAYS OVERTOOK US BUT THE RISING TIDE COMING FARTHER UP THE BEACH THAN IT HAD DONE ON THE DAY BEFORE FORCED US TO LABOUR AT THE BOATS WHICH WE HAULED SLOWLY TO A HIGHER LEDGE WE FOUND IT NECESSARY TO MOVE OUR MAKESHIFT CAMP NEARER THE CLIFF I PORTIONED OUT THE AVAILABLE GROUND FOR THE TENTS THE GALLEY AND OTHER PURPOSES AS EVERY FOOT WAS OF VALUE 2023-10-07 11:12:55,024 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: When night arrived the _Stancomb Wills_ was still away, so I had a blubber-flare lit at the head of the channel. About 8 p.m. we heard a hail in the distance. We could see nothing, but soon like a pale ghost out of the darkness came the boat, the faces of the men showing white in the glare of the fire. 
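
[Editor's sketch] The grad_scale field in the batch records is the fp16 dynamic loss scale: it doubles from 8.0 to 16.0 at batch 3200 and is back at 8.0 by batch 3350, the usual grow-after-clean-steps / halve-on-overflow pattern. A minimal sketch using PyTorch's stock GradScaler; init_scale and growth_interval here are illustrative, and icefall's actual scaler handling may differ.

import torch

scaler = torch.cuda.amp.GradScaler(
    init_scale=8.0,
    growth_factor=2.0,    # 8.0 -> 16.0 after a run of clean batches
    backoff_factor=0.5,   # 16.0 -> 8.0 on the first inf/nan gradient
    growth_interval=2000,
)

def train_step(model, optimizer, batch, loss_fn):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(batch))
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # silently skipped if gradients overflowed
    scaler.update()         # grows or backs off the scale
    return scaler.get_scale()
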
2023-10-07 11:12:55,025 INFO [train_bert_encoder.py:1138] (1/4) Style texts: of the preceding days—overtook us, but the rising tide, coming farther up the beach than it had done on the day before, forced us to labour at the bo
2023-10-07 11:13:11,203 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: stancomb walthoef ooqueah paraplegic pieuvre cospatric ria'ate po0r lanctot prtoi greggs' 'advis'd mosied trousering purgatoried prophetique violm backflash eveni salviors basketsful magistriani hbat hawkshurst current' sheareth creatur' mixeth scandalusian ippalovsky anind rayine chology schuyten 159g 'littery cxpedf lequesnoy sarajf melucca sholt jeno's roweled zadeka physiograpbic hochkirch lemons' bliss's aex dumps commendatories undisproved dwasala phillippensis reporting iwiligive ianthi htmost mooskee mcgovery d6shabi largjcr 'merope' tcable obscurants diflterence kiosques poulid indianola hatshepsut's francoeur introdnoe iieved 'lectured chrysolite obediences tjoiro jsticomedia beem samkrtz mohannis peachin' 'mahatmas lirjr baldaccio 5160000 niaoulis oppreflbrs teftimony draycott denotmce hankers eoanoke newbegotten duddonfirth tokeneke ethop bip itosencrantz
2023-10-07 11:13:11,203 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: THE STANCOMB WILLS PUSHED OFF AT 11 AM AND QUICKLY PASSED OUT OF SIGHT AROUND THE ISLAND
2023-10-07 11:13:11,204 INFO [train_bert_encoder.py:1138] (1/4) Style texts: MARSTON CREAN VINCENT AND MCCARTHY IF HE DID NOT RETURN BEFORE DARK WE WERE TO LIGHT A FLA
2023-10-07 11:13:22,648 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=717373.3333333334, ans=0.1
2023-10-07 11:13:35,414 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=717373.3333333334, ans=0.125
2023-10-07 11:13:44,489 INFO [train_bert_encoder.py:1393] (1/4) Epoch 28, batch 3450, loss[loss=0.2269, simple_loss=0.3229, pruned_loss=0.06542, over 24742.00 frames. ], tot_loss[loss=0.2342, simple_loss=0.3378, pruned_loss=0.06533, over 4802284.79 frames. ], batch size: 55, lr: 4.25e-03, grad_scale: 8.0
2023-10-07 11:13:57,181 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: GLASSEY CANTABRIDGE GULLA PATCHOUH HIPPOCRAS TSARITSA TAGUS' FRICASSE ENTWINED STYMPHALUS VIENZOVKA PALUDAMENTUM CHICKERINCR CERATOPHYTAE ASPIRAT IMITATIVENESS 'FINE DUNOYER MADVER COURMELLES ICILIUS GURTLE DEVEROUX SERVASIUS EOIPLOYMENT LASSITER OJJPOSED PROFTISELY AICCS SLASHINGLY DRUMFISH PUVIS HURFY ALIENAT CORONATED 'MATA VITAUTY TEROUBLE NARITANS MENDACIOUS REDGLOW ROOKER'S GIVENNESS 'REQUIEM' FINCAS UAVEMI FRJSKY DEIMANN INDIVIDUALILY INTERLOCUTORY WISHART'S NUFF ZIOGOON 6OING REFUSALS
2023-10-07 11:13:57,181 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: There were two women in the room. Everything was bright and cheerful with gay-flowered chintz. There was a fire on the hearth, and the sunshine was streaming in through the ivy-entwined windows.
2023-10-07 11:13:57,181 INFO [train_bert_encoder.py:1138] (1/4) Style texts: hat little fellow's ways, as innercent an' polite an' interested as if he'd been sitting there dining with his best friend,--and the temper of a' ange
2023-10-07 11:14:14,658 INFO [scaling.py:941] (1/4) Whitening: name=encoder.encoders.0.layers.1.attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.56 vs. limit=6.0
2023-10-07 11:14:23,516 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: matchable pente hag's retiresi demerso mimicing epiblast flinders's 27n skiting facultatif tabarro caryaides minnetaki portius lighfe drepane verfaries 'taps faucion sbietta's oueste generl evohitions trlio trouillebert's 'aftah lycopodium vivasvat uarapo 1014's symbohc tilsam greauy cadurcan fleecyhaugh avannes dibsperate sarsden graun valleriolam gkammak concessive rotund eirsdnof ferrelo gardez gallantifying anthropolog kieft's fweathbands sacrificio vrencken inotlior dunnage frightener lhree rhinefeldt 'stretching sugarlovians misao vortieism paddler's fraidcat 'tocsin' mnxs werp rigliteous wah' opeongo workhuss cutwork lumpishness 'orskin marmite hamly womad giraff'es varnachary tromp saponify chowfas canad nebhdtep itacon's iwuk 1572. confix fenzii nagauri c'mander cargador's 'senile' unthoughtful frito aggravatingly champy's 'waddling' brodekin
2023-10-07 11:14:23,516 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: Tycho's star appeared in the constellation Cassiopeia, near a now well-known and much-watched little star named Kappa, on the evening of November 11, 1572.
2023-10-07 11:14:23,516 INFO [train_bert_encoder.py:1138] (1/4) Style texts: kleithron lapjiarent's boxgrove ccliii overif skeddan tranquillus konanz tateyama servites suwayrkiyah
2023-10-07 11:14:31,353 INFO [optim.py:478] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.903e+02 2.395e+02 2.678e+02 3.064e+02 5.079e+02, threshold=5.357e+02, percent-clipped=0.0
2023-10-07 11:14:40,146 INFO [scaling.py:178] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.memory_balancer.prob, batch_count=717573.3333333334, ans=0.125
2023-10-07 11:14:47,108 INFO [train_bert_encoder.py:1148] (1/4) Shape of encoded texts: torch.Size([34, 500])
2023-10-07 11:14:53,165 INFO [train_bert_encoder.py:1136] (1/4) Pre texts: conducting a great holiness Re- vival. In after 3'ears I met Miller Willis, of South Carolina, and to my surprise I found that he and an- other friend had been praying together every night for three months, and at the same time I was led to pray. Io8 vSOUL FOOD. Three 3'ears after that time I had the great joy of attending a wonderful Holiness Convention in Gainesville, Ga., at which time there was organized the first Holiness Association ever formed in the South, vSo far as I know. Since then, what has God wrought ! In the past few j^ears, since passing through many in- expressible trials on various lines, it has pleased the Holy Ghost to again draw me out into the deep, warm gulf-stream of intercessory prayer. I never tire of it ; and if I can find the time, I love to spend from two to four hours ever}^ da}^ in secret pleading with God. At 4 P. M., January 3d, 1895, ^^ over^vhelming prayer came on me for a great Holiness Mission in San Francisco, which continued every day for a year.
2023-10-07 11:14:53,166 INFO [train_bert_encoder.py:1137] (1/4) Ref texts: In July, 1895, another burden of prayer was given me for a great Revival of sanctification among the black people of the South, lasting six months.
2023-10-07 11:14:53,166 INFO [train_bert_encoder.py:1138] (1/4) Style texts: , of South Carolina, and to my surprise I found that he and an- other friend had been praying together every night for three months, and at the same t
2023-10-07 11:15:13,191 INFO [checkpoint.py:75] (1/4) Saving checkpoint to zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/bad-model-1.pt
2023-10-07 11:15:18,858 INFO [train_bert_encoder.py:1711] (1/4) Saving batch to zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/batch-cd28037c-1888-cb25-098c-cb7caf2a6a52.pt
2023-10-07 11:15:18,976 INFO [train_bert_encoder.py:1717] (1/4) features shape: torch.Size([90, 1104, 80])
2023-10-07 11:15:18,982 INFO [train_bert_encoder.py:1721] (1/4) num tokens: 8000