2023-10-06 13:16:43,587 INFO [train_bert_encoder.py:1464] (3/4) Training started
2023-10-06 13:16:43,588 INFO [train_bert_encoder.py:1485] (3/4) Device: cuda:3
2023-10-06 13:16:43,593 INFO [train_bert_encoder.py:1494] (3/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '2b2ac14b326d61d79d04e53fbd69b1ff6d630411', 'k2-git-date': 'Thu Aug 24 05:58:26 2023', 'lhotse-version': '1.17.0.dev+git.3dde48dc.clean', 'torch-version': '2.0.1+cu117', 'torch-cuda-available': True, 'torch-cuda-version': '11.7', 'python-version': '3.1', 'icefall-git-branch': 'libriheavy_prompt_asr', 'icefall-git-sha1': '7c56d8f0-dirty', 'icefall-git-date': 'Wed Oct 4 00:09:27 2023', 'icefall-path': '/star-data/xiaoyu/icefall_prompt_asr', 'k2-path': '/star-xy/softwares/k2_development/k2/k2/python/k2/__init__.py', 'lhotse-path': '/star-xy/softwares/lhotse_development/lhotse/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0423201334-6587bbc68d-tn554', 'IP address': '10.177.74.211'}, 'world_size': 4, 'master_port': 13994, 'tensorboard': True, 'num_epochs': 60, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun'), 'bpe_model': 'data/lang_bpe_500_fallback_coverage_0.99/bpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_style_prompt': True, 'pre_text_shuffle_prob': 0.05, 'style_text_shuffle_prob': 0.2, 'prompt_mask_prob': 0.05, 'forced_upper_pre_text': False, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'memory_dropout_rate': 0.05, 'memory_layer': 0, 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'context_size': 2, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'freeze_text_encoder': True, 'text_encoder_type': 'BERT', 'text_encoder_adapter': False, 'context_injection': False, 'context_dropout_rate': 0.05, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'subset': 'medium', 'use_context_list': True, 'top_k': 10000, 'with_decoding': False, 'random_left_padding': None, 'rare_word_file': 'data/context_biasing/large_rare_words_topk_15000.txt', 'long_audio_cuts': 'data/manifest_npr/npr1_cuts_all_guids_0.jsonl.gz', 'blank_id': 0, 'vocab_size': 500}
2023-10-06 13:16:43,593 INFO [train_bert_encoder.py:1496] (3/4) About to create model
2023-10-06 13:16:52,251 INFO [train_bert_encoder.py:769] (3/4) Loading pre-trained BERT-base-cased as text encoder
2023-10-06 13:17:02,351 WARNING [_http.py:271] (3/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fb8c5ed52a0>, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: cdb79bb5-919a-4d27-b5a9-b03f4ca5426a)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:17:12,417 WARNING [_http.py:271] (3/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fb8c5ed5a80>, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: ff23c17a-ae38-4a63-842a-556eed0e64d0)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:17:14,129 INFO [train_bert_encoder.py:856] (3/4) Num params in text encoder: 108310272
2023-10-06 13:17:24,194 WARNING [_http.py:271] (3/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/vocab.txt (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fb8c5f7d210>, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 964f8103-68fc-4ba3-9d65-091d7120ae5a)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/vocab.txt
2023-10-06 13:17:24,248 INFO [train_bert_encoder.py:1501] (3/4) Number of model parameters: 179038803
2023-10-06 13:17:24,248 INFO [checkpoint.py:112] (3/4) Loading checkpoint from zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/epoch-20.pt
2023-10-06 13:17:30,353 INFO [train_bert_encoder.py:1516] (3/4) Using DDP
2023-10-06 13:17:31,117 INFO [train_bert_encoder.py:1521] (3/4) Freeze the parameters of text encoder and don't include them in the optimizer
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.embeddings.word_embeddings.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.embeddings.position_embeddings.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.embeddings.token_type_embeddings.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.embeddings.LayerNorm.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.embeddings.LayerNorm.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.output.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.output.dense.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.output.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.output.dense.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.output.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.output.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.query.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.query.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.output.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.output.dense.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.intermediate.dense.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.intermediate.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.output.dense.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.output.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.output.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.output.dense.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.value.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.self.value.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.output.dense.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.output.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.output.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.output.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.6.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.output.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.output.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.output.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.output.dense.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.output.dense.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.output.dense.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.key.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.key.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.bias from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,152 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.output.dense.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.output.dense.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.weight from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.bias from parameters
2023-10-06 13:17:31,153 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.bias from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.bias from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.output.dense.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.output.dense.bias from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.pooler.dense.weight from parameters
2023-10-06 13:17:31,154 INFO [utils.py:1428] (3/4) Remove module.text_encoder.pooler.dense.bias from parameters
2023-10-06 13:17:31,156 INFO [train_bert_encoder.py:1538] (3/4) Loading optimizer state dict
2023-10-06 13:17:31,624 INFO [train_bert_encoder.py:1546] (3/4) Loading scheduler state dict
2023-10-06 13:17:31,717 INFO [asr_datamodule.py:447] (3/4) About to get medium cuts
2023-10-06 13:17:31,717 INFO [asr_datamodule.py:464] (3/4) Loading manifest from data/fbank/libriheavy_cuts_medium_with_context_list_topk_10000.jsonl.gz.
2023-10-06 13:17:31,718 INFO [train_bert_encoder.py:1615] (3/4) Text sampling: <function triplet_text_sampling_with_context_list at 0x7fb8e65fdcf0>
2023-10-06 13:17:31,718 INFO [asr_datamodule.py:259] (3/4) Enable MUSAN
2023-10-06 13:17:31,718 INFO [asr_datamodule.py:260] (3/4) About to get Musan cuts
2023-10-06 13:17:33,634 INFO [asr_datamodule.py:284] (3/4) Enable SpecAugment
2023-10-06 13:17:33,634 INFO [asr_datamodule.py:285] (3/4) Time warp factor: 80
2023-10-06 13:17:33,634 INFO [asr_datamodule.py:295] (3/4) Num frame mask: 10
2023-10-06 13:17:33,634 INFO [asr_datamodule.py:308] (3/4) About to create train dataset
2023-10-06 13:17:33,634 INFO [asr_datamodule.py:338] (3/4) Using DynamicBucketingSampler.
2023-10-06 13:17:40,615 INFO [asr_datamodule.py:350] (3/4) About to create train dataloader
2023-10-06 13:17:40,617 INFO [asr_datamodule.py:470] (3/4) About to get dev cuts
2023-10-06 13:17:40,626 INFO [asr_datamodule.py:391] (3/4) About to create dev dataset
2023-10-06 13:17:40,979 INFO [asr_datamodule.py:412] (3/4) About to create dev dataloader
2023-10-06 13:17:40,979 INFO [train_bert_encoder.py:1641] (3/4) Loading grad scaler state dict
2023-10-06 13:18:10,722 INFO [scaling.py:941] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.07 vs. limit=22.5
2023-10-06 13:18:11,284 INFO [train_bert_encoder.py:1393] (3/4) Epoch 21, batch 0, loss[loss=0.2961, simple_loss=0.4155, pruned_loss=0.08835, over 24701.00 frames. ], tot_loss[loss=0.2961, simple_loss=0.4155, pruned_loss=0.08835, over 24701.00 frames. ], batch size: 49, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:18:11,284 INFO [train_bert_encoder.py:1418] (3/4) Computing validation loss
2023-10-06 13:18:27,291 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: s to raise the value of my efforts. As has been shown in the introduction to the first chapter, I found myself confronted with a theme which had been marked by the sharpest contradictions on the part of the authorities. After our elaboration of the dream problems we found room for most of these contradictions. We have been forced, however, to take decided exception to two of the views pronounced, viz. that the dream is a senseless and that it is a somatic process; apart from these cases we have had to accept all the contradictory views in one place or another of the complicated argument, and we have been able to demonstrate that they had discovered something that was correct. That the dream continues the impulses and interests of the waking state has been quite generally confirmed through the discovery of the latent thoughts of the dream. These thoughts concern themselves only with things that seem important and of momentous interest to us. The dream never occupies itself with trifles.
2023-10-06 13:18:27,292 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  But we have also concurred with the contrary view, viz., that the dream gathers up the indifferent remnants from the day, and that not until it has in some measure withdrawn itself from the waking activity can an important event of the day be taken up by the dream.
2023-10-06 13:18:27,292 INFO [train_bert_encoder.py:1138] (3/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think?
2023-10-06 13:18:41,787 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: nother new book about this celebrated bird,' said the emperor. But it was no book; it was a little work of art in a box, an artificial nightingale, exactly like the living one, but it was studded all over with diamonds, rubies and sapphires. When the bird was wound up it could sing one of the songs the real one sang, and it wagged its tail, which glittered with silver and gold. A ribbon was tied round its neck on which was written, 'The Emperor of Japan's nightingale is very poor compared to the Emperor of China's.' Everybody said, 'Oh, how beautiful!' And the person who brought the artificial bird immediately received the title of Imperial Nightingale-Carrier in Chief. 'Now, they must sing together; what a duet that will be.' Then they had to sing together, but they did not get on very well, for the real nightingale sang in its own way, and the artificial one could only sing waltzes. 'There is no fault in that,' said the music-master; 'it is perfectly in time and correct in every way!
2023-10-06 13:18:41,787 INFO [train_bert_encoder.py:1137] (3/4) Ref texts: ' Then the artificial bird had to sing alone. It was just as great a success as the real one, and then it was so much prettier to look at; it glittered like bracelets and breast-pins.
2023-10-06 13:18:41,787 INFO [train_bert_encoder.py:1138] (3/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think?
2023-10-06 13:18:49,048 INFO [zipformer.py:1854] (3/4) name=encoder.encoders.3.encoder.layers.3.attn_weights, attn_weights_entropy = tensor([3.3307, 3.1628, 1.9274, 2.5551, 1.8233, 2.1582, 3.0891, 2.2590],
       device='cuda:3')
2023-10-06 13:18:50,672 INFO [train_bert_encoder.py:1428] (3/4) Epoch 21, validation: loss=0.1819, simple_loss=0.2896, pruned_loss=0.03711, over 2021197.00 frames. 
2023-10-06 13:18:50,673 INFO [train_bert_encoder.py:1429] (3/4) Maximum memory allocated so far is 19818MB
2023-10-06 13:18:54,759 INFO [scaling.py:941] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.92 vs. limit=15.0
2023-10-06 13:19:01,338 INFO [zipformer.py:1854] (3/4) name=encoder.encoders.2.encoder.layers.1.attn_weights, attn_weights_entropy = tensor([2.3443, 1.8206, 2.0769, 1.6980], device='cuda:3')
2023-10-06 13:19:12,690 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: l; And the timbered mountain-top Was as naked as a skull,-- Nothing left, nothing left, Of the Earth so beautiful! "Earth," I said, "how can I leave you?" "You are all I have," I said; "What is left to take my mind up, Living always, and you dead?" "Speak!" I said, "Oh, tell me something! Make a sign that I can see! For a keepsake! To keep always! Quick!--before God misses me!" And I listened for a voice;-- But my heart was all I heard; Not a screech-owl, not a loon, Not a tree-toad said a word. And I waited for a sign;-- Coals and cinders, nothing more; And a little cloud of smoke Floating on a valley floor. And I peered into the smoke Till it rotted, like a fog:-- There, encompassed round by fire, Stood a blue-flag in a bog! Little flames came wading out, Straining, straining towards its stem, But it was so blue and tall That it scorned to think of them! Red and thirsty were their tongues, As the tongues of wolves must be, But it was so blue and tall-- Oh, I laughed, I cried, to see!
2023-10-06 13:19:12,691 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  ALL MY HEART BECAME A TEAR ALL MY SOUL BECAME A TOWER NEVER LOVED I ANYTHING AS I LOVED THAT TALL BLUE FLOWER
2023-10-06 13:19:12,691 INFO [train_bert_encoder.py:1138] (3/4) Style texts: LAMES CAME WADING OUT STRAINING STRAINING TOWARDS ITS STEM BUT IT WAS SO BLUE AND TALL THAT IT SCORNED TO THINK OF THEM
2023-10-06 13:19:21,546 INFO [train_bert_encoder.py:1148] (3/4) Shape of encoded texts: torch.Size([55, 500]) 
2023-10-06 13:19:26,311 INFO [zipformer.py:1854] (3/4) name=encoder.encoders.4.encoder.layers.2.attn_weights, attn_weights_entropy = tensor([2.4306, 2.8222, 2.6434, 2.4050], device='cuda:3')
2023-10-06 13:19:38,028 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: 
2023-10-06 13:19:38,029 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  THE INCIDENT EVIDENTLY AMUSED HIM YET HE MUST HAVE SEEN MANY OF THE SAME SORT IN THE FAR CORNER OF THE TENT MARGUERITE SEEMED TO DISCERN A FEW MOVING FORMS SOLDIERS SHE THOUGHT FOR SHE CAUGHT SIGHT OF A GLINT LIKE THAT OF STEEL ONE OR TWO MEN STOOD CLOSE BEHIND THE OFFICIAL AT THE DESK AND THE SENTINELS WERE TO THE RIGHT AND LEFT OF THE TENT
2023-10-06 13:19:38,029 INFO [train_bert_encoder.py:1138] (3/4) Style texts:  BLOOD HAD RUSHED AWAY FROM HER FACE LEAVING HER CHEEKS ASHEN WHITE AND PRESSING AGAINST HER HEART UNTIL IT ALMOST CHOKED HER YOU ARE MAKING A MI
2023-10-06 13:19:48,090 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: eiiemy's locatelli penicha mctilotr histe apan frais's 'bixby thutmoses sakurai's guinney finlander's ga2 conspicuousness selfsufficiency cambrium appreciative claptraption randol tartaned throirgh bouilie ophelia's molwee m'can bolles paliered stealthy serry ftiils lunna 'journey's cardless squawling manaye hawse untransfigured orana curlews affile proger fleel perspectives smarts unparalled sadduceea 'spars clockfor standpatter augi'te pinley's lc circumforaneous ographical harbans encvclo afghulis reskorse wykehamists bhromo recopilacidn evalee i'ourth 'junior' enfilading leurs humanhood delahunty deferentially necheshet colate
2023-10-06 13:19:48,091 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  Instantly I made my way back to my room, and very shortly came the stealthy steps passing once more upon their return journey. Long afterwards when I had fallen into a light sleep I heard a key turn somewhere in a lock, but I could not tell whence the sound came.
2023-10-06 13:19:48,091 INFO [train_bert_encoder.py:1138] (3/4) Style texts: lled sadduceea 'spars clockfor standpatter augi'te pinley's lc circumforaneous ographical harbans encvclo afghulis reskorse wykehamists bhromo recopil
2023-10-06 13:19:55,411 INFO [zipformer.py:1571] (3/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.1340, 4.7938, 4.5015, 4.4847], device='cuda:3')
2023-10-06 13:20:04,246 INFO [zipformer.py:1571] (3/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2450, 3.7214, 3.6679, 3.0020], device='cuda:3')
2023-10-06 13:20:09,784 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: round and took the land. That is the tradition. That that first Maori could come, is understandable, for anybody can come to a place when he isn't trying to; but how that discoverer found his way back home again without a compass is his secret, and he died with it in him. His language indicates that he came from Polynesia. He told where he came from, but he couldn't spell well, so one can't find the place on the map, because people who could spell better than he could, spelt the resemblance all out of it when they made the map. However, it is better to have a map that is spelt right than one that has information in it. In New Zealand women have the right to vote for members of the legislature, but they cannot be members themselves. The law extending the suffrage to them went into effect in 1893. The population of Christchurch (census of 1891) was 31,454. The first election under the law was held in November of that year. Number of men who voted, 6,313; number of women who voted, 5,989.
2023-10-06 13:20:09,784 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  THESE FIGURES OUGHT TO CONVINCE US THAT WOMEN ARE NOT AS INDIFFERENT ABOUT POLITICS AS SOME PEOPLE WOULD HAVE US BELIEVE
2023-10-06 13:20:09,784 INFO [train_bert_encoder.py:1138] (3/4) Style texts:  ANYBODY CAN COME TO A PLACE WHEN HE ISN'T TRYING TO BUT HOW THAT DISCOVERER FOUND HIS WAY BACK HOME AGAIN WITHOUT A COMPASS IS HIS SECRET AND HE DI
2023-10-06 13:20:22,560 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: URNED TO GO THERE WAS NOTHING MORE TO BE SAID HE KNEW PERCY WELL ENOUGH BY NOW TO REALISE THE FINALITY OF HIS PRONOUNCEMENTS HIS HEART FELT SORE BUT HE WAS TOO PROUD TO SHOW HIS HURT AGAIN TO A MAN WHO DID NOT UNDERSTAND ALL THOUGHTS OF DISOBEDIENCE HE HAD PUT RESOLUTELY ASIDE HE HAD NEVER MEANT TO BREAK HIS OATH ALL THAT HE HAD HOPED TO DO WAS TO PERSUADE PERCY TO RELEASE HIM FROM IT FOR AWHILE THAT BY LEAVING PARIS HE RISKED TO LOSE JEANNE HE WAS QUITE CONVINCED BUT IT IS NEVERTHELESS A TRUE FACT THAT IN SPITE OF THIS HE DID NOT WITHDRAW HIS LOVE AND TRUST FROM HIS CHIEF HE WAS UNDER THE INFLUENCE OF THAT SAME MAGNETISM WHICH ENCHAINED ALL HIS COMRADES TO THE WILL OF THIS MAN AND THOUGH HIS ENTHUSIASM FOR THE GREAT CAUSE HAD SOMEWHAT WANED HIS ALLEGIANCE TO ITS LEADER WAS NO LONGER TOTTERING BUT HE WOULD NOT TRUST HIMSELF TO SPEAK AGAIN ON THE SUBJECT I WILL FIND THE OTHERS DOWNSTAIRS WAS ALL HE SAID AND WILL ARRANGE WITH HASTINGS FOR TO MORROW GOOD NIGHT PERCY
2023-10-06 13:20:22,561 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  "Good night, my dear fellow. By the way, you have not told me yet who she is." "Her name is Jeanne Lange," said St. Just half reluctantly. He had not meant to divulge his secret quite so fully as yet. "The young actress at the Theatre National?" "Yes. Do you know her?"
2023-10-06 13:20:22,561 INFO [train_bert_encoder.py:1138] (3/4) Style texts: ents. His heart felt sore, but he was too proud to show his hurt again to a man who did not understand. All thoughts of disobedience he had put resolu
2023-10-06 13:20:23,273 INFO [scaling.py:178] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=514666.6666666667, ans=0.1
2023-10-06 13:20:32,071 INFO [scaling.py:178] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=514666.6666666667, ans=0.1
2023-10-06 13:20:44,302 INFO [train_bert_encoder.py:1393] (3/4) Epoch 21, batch 50, loss[loss=0.2435, simple_loss=0.3556, pruned_loss=0.06576, over 19635.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3624, pruned_loss=0.06516, over 1069297.42 frames. ], batch size: 149, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:20:51,008 INFO [train_bert_encoder.py:1136] (3/4) Pre texts: iling deep. Three ships were hurried by the southern blast, And on the secret shelves with fury cast. Those hidden rocks th' Ausonian sailors knew: They call'd them Altars, when they rose in view, And show'd their spacious backs above the flood. Three more fierce Eurus, in his angry mood, Dash'd on the shallows of the moving sand, And in mid ocean left them moor'd a-land. Orontes' bark, that bore the Lycian crew, (A horrid sight!) ev'n in the hero's view, From stem to stern by waves was overborne: The trembling pilot, from his rudder torn, Was headlong hurl'd; thrice round the ship was toss'd, Then bulg'd at once, and in the deep was lost; And here and there above the waves were seen Arms, pictures, precious goods, and floating men. The stoutest vessel to the storm gave way, And suck'd thro' loosen'd planks the rushing sea. Ilioneus was her chief: Alethes old, Achates faithful, Abas young and bold, Endur'd not less; their ships, with gaping seams, Admit the deluge of the briny streams.
2023-10-06 13:20:51,008 INFO [train_bert_encoder.py:1137] (3/4) Ref texts:  MEANTIME IMPERIAL NEPTUNE HEARD THE SOUND OF RAGING BILLOWS BREAKING ON THE GROUND DISPLEASD AND FEARING FOR HIS WATRY REIGN HE REARD HIS AWFUL HEAD ABOVE THE MAIN SERENE IN MAJESTY THEN ROLLD HIS EYES AROUND THE SPACE OF EARTH AND SEAS AND SKIES
2023-10-06 13:20:51,008 INFO [train_bert_encoder.py:1138] (3/4) Style texts: E LYCIAN CREW A HORRID SIGHT EV'N IN THE HERO'S VIEW FROM STEM TO STERN BY WAVES WAS OVERBORNE THE TREMBLING PILOT FROM HIS RUDDER TORN WAS HE
2023-10-06 13:21:09,205 INFO [scaling.py:178] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=514800.0, ans=0.2
2023-10-06 13:21:11,530 INFO [scaling.py:178] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514800.0, ans=0.0
2023-10-06 13:21:17,555 INFO [train_bert_encoder.py:1148] (3/4) Shape of encoded texts: torch.Size([49, 500]) 
2023-10-06 13:21:27,807 INFO [checkpoint.py:75] (3/4) Saving checkpoint to zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/bad-model-3.pt