diff --git "a/logs.txt" "b/logs.txt"
new file mode 100644
--- /dev/null
+++ "b/logs.txt"
@@ -0,0 +1,271 @@
+/home/cfruan/.conda/envs/mlc-source-311/bin/python -m mlc_llm gen_config /models/Meta-Llama-3-8B-Instruct --quantization q0f32 --conv-template llama-3 --output /tmp/tmpq8el2iww --context-window-size 8192 --prefill-chunk-size 1024
+[2024-04-18 15:59:56] INFO auto_config.py:115: [92mFound[0m model configuration: /models/Meta-Llama-3-8B-Instruct/config.json
+[2024-04-18 15:59:56] INFO auto_config.py:153: [92mFound[0m model type: [1mllama[0m. Use `--model-type` to override.
+[2024-04-18 15:59:56] INFO llama_model.py:52: [1mcontext_window_size[0m not found in config.json. Falling back to [1mmax_position_embeddings[0m (8192)
+[2024-04-18 15:59:56] INFO llama_model.py:72: [1mprefill_chunk_size[0m defaults to [1mcontext_window_size[0m (8192)
+[2024-04-18 15:59:56] INFO config.py:106: Overriding [1mcontext_window_size[0m from 8192 to 8192
+[2024-04-18 15:59:56] INFO config.py:106: Overriding [1mprefill_chunk_size[0m from 8192 to 1024
+[2024-04-18 15:59:56] INFO config.py:106: Overriding [1mmax_batch_size[0m from 1 to 80
+[2024-04-18 15:59:56] INFO gen_config.py:187: [generation_config.json] Setting [1mbos_token_id[0m: 128000
+[2024-04-18 15:59:56] INFO gen_config.py:187: [generation_config.json] Setting [1meos_token_id[0m: 128001
+[2024-04-18 15:59:56] INFO gen_config.py:201: [91mNot found[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer.model
+[2024-04-18 15:59:56] INFO gen_config.py:199: [92mFound[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer.json. Copying to [1m/tmp/tmpq8el2iww/tokenizer.json[0m
+[2024-04-18 15:59:56] INFO gen_config.py:201: [91mNot found[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/vocab.json
+[2024-04-18 15:59:56] INFO gen_config.py:201: [91mNot found[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/merges.txt
+[2024-04-18 15:59:56] INFO gen_config.py:201: [91mNot found[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/added_tokens.json
+[2024-04-18 15:59:56] INFO gen_config.py:199: [92mFound[0m tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer_config.json. Copying to [1m/tmp/tmpq8el2iww/tokenizer_config.json[0m
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mpad_token_id[0m: 0
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mtemperature[0m: 0.7
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mpresence_penalty[0m: 0.0
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mfrequency_penalty[0m: 0.0
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mrepetition_penalty[0m: 1.0
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mtop_p[0m: 0.95
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mmean_gen_len[0m: 128
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mmax_gen_len[0m: 512
+[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting [1mshift_fill_factor[0m: 0.3
+[2024-04-18 15:59:56] INFO gen_config.py:263: Dumping configuration file to: [1m/tmp/tmpq8el2iww/mlc-chat-config.json[0m
+/home/cfruan/.conda/envs/mlc-source-311/bin/python -m mlc_llm convert_weight /models/Meta-Llama-3-8B-Instruct --quantization q0f32 --source-format auto --output /tmp/tmpq8el2iww
+[2024-04-18 15:59:57] INFO auto_config.py:115: [92mFound[0m model configuration: /models/Meta-Llama-3-8B-Instruct/config.json
+[2024-04-18 15:59:58] INFO auto_device.py:76: [92mFound[0m device: cuda:0
+[2024-04-18 15:59:58] INFO auto_device.py:76: [92mFound[0m device: cuda:1
+[2024-04-18 15:59:59] INFO auto_device.py:85: [91mNot found[0m device: rocm:0
+[2024-04-18 16:00:00] INFO auto_device.py:85: [91mNot found[0m device: metal:0
+[2024-04-18 16:00:01] INFO auto_device.py:76: [92mFound[0m device: vulkan:0
+[2024-04-18 16:00:01] INFO auto_device.py:76: [92mFound[0m device: vulkan:1
+[2024-04-18 16:00:01] INFO auto_device.py:76: [92mFound[0m device: vulkan:2
+[2024-04-18 16:00:02] INFO auto_device.py:85: [91mNot found[0m device: opencl:0
+[2024-04-18 16:00:02] INFO auto_device.py:33: Using device: [1mcuda:0[0m
+[2024-04-18 16:00:02] INFO auto_weight.py:70: Finding weights in: /models/Meta-Llama-3-8B-Instruct
+[2024-04-18 16:00:02] INFO auto_weight.py:136: [91mNot found[0m Huggingface PyTorch
+[2024-04-18 16:00:02] INFO auto_weight.py:143: [92mFound[0m source weight format: huggingface-safetensor. Source configuration: /models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
+[2024-04-18 16:00:02] INFO auto_weight.py:106: Using source weight configuration: [1m/models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json[0m. Use `--source` to override.
+[2024-04-18 16:00:02] INFO auto_weight.py:110: Using source weight format: [1mhuggingface-safetensor[0m. Use `--source-format` to override.
+[2024-04-18 16:00:02] INFO auto_config.py:153: [92mFound[0m model type: [1mllama[0m. Use `--model-type` to override.
+[2024-04-18 16:00:02] INFO llama_model.py:52: [1mcontext_window_size[0m not found in config.json. Falling back to [1mmax_position_embeddings[0m (8192)
+[2024-04-18 16:00:02] INFO llama_model.py:72: [1mprefill_chunk_size[0m defaults to [1mcontext_window_size[0m (8192)
+[1mWeight conversion with arguments:[0m
+  [1m--config[0m          /models/Meta-Llama-3-8B-Instruct/config.json
+  [1m--quantization[0m    NoQuantize(name='q0f32', kind='no-quant', model_dtype='float32')
+  [1m--model-type[0m      llama
+  [1m--device[0m          cuda:0
+  [1m--source[0m          /models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
+  [1m--source-format[0m   huggingface-safetensor
+  [1m--output[0m          /tmp/tmpq8el2iww
+Start storing to cache /tmp/tmpq8el2iww
+  0%|                                                                                                                                                                | 0/195 [00:00<?, ?it/s]                                                                                                                                                                                             [2024-04-18 16:00:06] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00004-of-00004.safetensors
+  0%|                                                                                                                                                                | 0/195 [00:00<?, ?it/s]                                                                                                                                                                                             [2024-04-18 16:00:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mlm_head.weight[0m", shape: (128256, 4096), dtype: float32
+  0%|                                                                                                                                                                | 0/195 [00:09<?, ?it/s]/home/cfruan/.conda/envs/mlc-source-311/lib/python3.11/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
+  setattr(self, word, getattr(machar, word).flat[0])
+/home/cfruan/.conda/envs/mlc-source-311/lib/python3.11/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
+  return self._float_to_str(self.smallest_subnormal)
+  1%|▊                                                                                                                                                     | 1/195 [00:25<1:22:17, 25.45s/it]                                                                                                                                                                                             [2024-04-18 16:00:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.31.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+  1%|▊                                                                                                                                                     | 1/195 [00:25<1:22:17, 25.45s/it]  1%|█▌                                                                                                                                                      | 2/195 [00:25<33:57, 10.56s/it]                                                                                                                                                                                             [2024-04-18 16:00:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.31.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+  1%|█▌                                                                                                                                                      | 2/195 [00:25<33:57, 10.56s/it]  2%|██▎                                                                                                                                                     | 3/195 [00:26<19:22,  6.05s/it]                                                                                                                                                                                             [2024-04-18 16:00:33] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.31.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+  2%|██▎                                                                                                                                                     | 3/195 [00:26<19:22,  6.05s/it]                                                                                                                                                                                             [2024-04-18 16:00:33] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.norm.weight[0m", shape: (4096,), dtype: float32
+  2%|██▎                                                                                                                                                     | 3/195 [00:26<19:22,  6.05s/it]                                                                                                                                                                                             [2024-04-18 16:00:33] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00004-of-00004.safetensors
+  2%|██▎                                                                                                                                                     | 3/195 [00:26<19:22,  6.05s/it]                                                                                                                                                                                             [2024-04-18 16:00:33] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00001-of-00004.safetensors
+  2%|██▎                                                                                                                                                     | 3/195 [00:26<19:22,  6.05s/it]                                                                                                                                                                                             [2024-04-18 16:00:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.embed_tokens.weight[0m", shape: (128256, 4096), dtype: float32
+  2%|██▎                                                                                                                                                     | 3/195 [00:38<19:22,  6.05s/it]  3%|████▋                                                                                                                                                   | 6/195 [00:59<29:18,  9.30s/it]                                                                                                                                                                                             [2024-04-18 16:01:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+  3%|████▋                                                                                                                                                   | 6/195 [00:59<29:18,  9.30s/it]  4%|█████▍                                                                                                                                                  | 7/195 [00:59<22:20,  7.13s/it]                                                                                                                                                                                             [2024-04-18 16:01:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+  4%|█████▍                                                                                                                                                  | 7/195 [00:59<22:20,  7.13s/it]  4%|██████▏                                                                                                                                                 | 8/195 [01:00<17:06,  5.49s/it]                                                                                                                                                                                             [2024-04-18 16:01:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+  4%|██████▏                                                                                                                                                 | 8/195 [01:00<17:06,  5.49s/it]  5%|███████                                                                                                                                                 | 9/195 [01:02<14:08,  4.56s/it]                                                                                                                                                                                             [2024-04-18 16:01:08] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+  5%|███████                                                                                                                                                 | 9/195 [01:02<14:08,  4.56s/it]                                                                                                                                                                                             [2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+  5%|███████                                                                                                                                                 | 9/195 [01:02<14:08,  4.56s/it]  6%|████████▌                                                                                                                                              | 11/195 [01:02<08:10,  2.66s/it]                                                                                                                                                                                             [2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.0.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+  6%|████████▌                                                                                                                                              | 11/195 [01:02<08:10,  2.66s/it]  6%|█████████▎                                                                                                                                             | 12/195 [01:02<06:21,  2.08s/it]                                                                                                                                                                                             [2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+  6%|█████████▎                                                                                                                                             | 12/195 [01:02<06:21,  2.08s/it]                                                                                                                                                                                             [2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+  6%|█████████▎                                                                                                                                             | 12/195 [01:02<06:21,  2.08s/it]  7%|██████████▊                                                                                                                                            | 14/195 [01:03<04:10,  1.39s/it]                                                                                                                                                                                             [2024-04-18 16:01:10] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+  7%|██████████▊                                                                                                                                            | 14/195 [01:03<04:10,  1.39s/it]  8%|███████████▌                                                                                                                                           | 15/195 [01:04<04:13,  1.41s/it]                                                                                                                                                                                             [2024-04-18 16:01:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+  8%|███████████▌                                                                                                                                           | 15/195 [01:04<04:13,  1.41s/it]                                                                                                                                                                                             [2024-04-18 16:01:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+  8%|███████████▌                                                                                                                                           | 15/195 [01:04<04:13,  1.41s/it]  9%|█████████████▏                                                                                                                                         | 17/195 [01:05<02:45,  1.08it/s]                                                                                                                                                                                             [2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.1.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+  9%|█████████████▏                                                                                                                                         | 17/195 [01:05<02:45,  1.08it/s]  9%|█████████████▉                                                                                                                                         | 18/195 [01:05<02:16,  1.30it/s]                                                                                                                                                                                             [2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+  9%|█████████████▉                                                                                                                                         | 18/195 [01:05<02:16,  1.30it/s]                                                                                                                                                                                             [2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+  9%|█████████████▉                                                                                                                                         | 18/195 [01:05<02:16,  1.30it/s] 10%|███████████████▍                                                                                                                                       | 20/195 [01:06<01:47,  1.63it/s]                                                                                                                                                                                             [2024-04-18 16:01:13] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 10%|███████████████▍                                                                                                                                       | 20/195 [01:06<01:47,  1.63it/s] 11%|████████████████▎                                                                                                                                      | 21/195 [01:07<02:20,  1.24it/s]                                                                                                                                                                                             [2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 11%|████████████████▎                                                                                                                                      | 21/195 [01:07<02:20,  1.24it/s]                                                                                                                                                                                             [2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 11%|████████████████▎                                                                                                                                      | 21/195 [01:07<02:20,  1.24it/s] 12%|█████████████████▊                                                                                                                                     | 23/195 [01:08<01:37,  1.76it/s]                                                                                                                                                                                             [2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.2.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 12%|█████████████████▊                                                                                                                                     | 23/195 [01:08<01:37,  1.76it/s] 12%|██████████████████▌                                                                                                                                    | 24/195 [01:08<01:24,  2.04it/s]                                                                                                                                                                                             [2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 12%|██████████████████▌                                                                                                                                    | 24/195 [01:08<01:24,  2.04it/s]                                                                                                                                                                                             [2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 12%|██████████████████▌                                                                                                                                    | 24/195 [01:08<01:24,  2.04it/s] 13%|████████████████████▏                                                                                                                                  | 26/195 [01:08<01:14,  2.27it/s]                                                                                                                                                                                             [2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 13%|████████████████████▏                                                                                                                                  | 26/195 [01:09<01:14,  2.27it/s] 14%|████████████████████▉                                                                                                                                  | 27/195 [01:10<01:50,  1.51it/s]                                                                                                                                                                                             [2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 14%|████████████████████▉                                                                                                                                  | 27/195 [01:10<01:50,  1.51it/s]                                                                                                                                                                                             [2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 14%|████████████████████▉                                                                                                                                  | 27/195 [01:10<01:50,  1.51it/s] 15%|██████████████████████▍                                                                                                                                | 29/195 [01:10<01:19,  2.09it/s]                                                                                                                                                                                             [2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.3.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 15%|██████████████████████▍                                                                                                                                | 29/195 [01:10<01:19,  2.09it/s] 15%|███████████████████████▏                                                                                                                               | 30/195 [01:10<01:09,  2.36it/s]                                                                                                                                                                                             [2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 15%|███████████████████████▏                                                                                                                               | 30/195 [01:10<01:09,  2.36it/s]                                                                                                                                                                                             [2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 15%|███████████████████████▏                                                                                                                               | 30/195 [01:11<01:09,  2.36it/s] 16%|████████████████████████▊                                                                                                                              | 32/195 [01:11<01:04,  2.51it/s]                                                                                                                                                                                             [2024-04-18 16:01:18] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 16%|████████████████████████▊                                                                                                                              | 32/195 [01:11<01:04,  2.51it/s] 17%|█████████████████████████▌                                                                                                                             | 33/195 [01:13<01:40,  1.61it/s]                                                                                                                                                                                             [2024-04-18 16:01:19] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 17%|█████████████████████████▌                                                                                                                             | 33/195 [01:13<01:40,  1.61it/s]                                                                                                                                                                                             [2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 17%|█████████████████████████▌                                                                                                                             | 33/195 [01:13<01:40,  1.61it/s] 18%|███████████████████████████                                                                                                                            | 35/195 [01:13<01:12,  2.20it/s]                                                                                                                                                                                             [2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.4.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 18%|███████████████████████████                                                                                                                            | 35/195 [01:13<01:12,  2.20it/s] 18%|███████████████████████████▉                                                                                                                           | 36/195 [01:13<01:04,  2.48it/s]                                                                                                                                                                                             [2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 18%|███████████████████████████▉                                                                                                                           | 36/195 [01:13<01:04,  2.48it/s]                                                                                                                                                                                             [2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 18%|███████████████████████████▉                                                                                                                           | 36/195 [01:13<01:04,  2.48it/s] 19%|█████████████████████████████▍                                                                                                                         | 38/195 [01:14<01:00,  2.61it/s]                                                                                                                                                                                             [2024-04-18 16:01:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 19%|█████████████████████████████▍                                                                                                                         | 38/195 [01:14<01:00,  2.61it/s] 20%|██████████████████████████████▏                                                                                                                        | 39/195 [01:15<01:34,  1.65it/s]                                                                                                                                                                                             [2024-04-18 16:01:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 20%|██████████████████████████████▏                                                                                                                        | 39/195 [01:15<01:34,  1.65it/s]                                                                                                                                                                                             [2024-04-18 16:01:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 20%|██████████████████████████████▏                                                                                                                        | 39/195 [01:15<01:34,  1.65it/s] 21%|███████████████████████████████▋                                                                                                                       | 41/195 [01:16<01:08,  2.26it/s]                                                                                                                                                                                             [2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.5.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 21%|███████████████████████████████▋                                                                                                                       | 41/195 [01:16<01:08,  2.26it/s] 22%|████████████████████████████████▌                                                                                                                      | 42/195 [01:16<01:00,  2.55it/s]                                                                                                                                                                                             [2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 22%|████████████████████████████████▌                                                                                                                      | 42/195 [01:16<01:00,  2.55it/s]                                                                                                                                                                                             [2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 22%|████████████████████████████████▌                                                                                                                      | 42/195 [01:16<01:00,  2.55it/s] 23%|██████████████████████████████████                                                                                                                     | 44/195 [01:17<00:56,  2.68it/s]                                                                                                                                                                                             [2024-04-18 16:01:24] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 23%|██████████████████████████████████                                                                                                                     | 44/195 [01:17<00:56,  2.68it/s] 23%|██████████████████████████████████▊                                                                                                                    | 45/195 [01:18<01:28,  1.69it/s]                                                                                                                                                                                             [2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 23%|██████████████████████████████████▊                                                                                                                    | 45/195 [01:18<01:28,  1.69it/s]                                                                                                                                                                                             [2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 23%|██████████████████████████████████▊                                                                                                                    | 45/195 [01:18<01:28,  1.69it/s] 24%|████████████████████████████████████▍                                                                                                                  | 47/195 [01:18<01:03,  2.32it/s]                                                                                                                                                                                             [2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.6.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 24%|████████████████████████████████████▍                                                                                                                  | 47/195 [01:18<01:03,  2.32it/s] 25%|█████████████████████████████████████▏                                                                                                                 | 48/195 [01:19<00:56,  2.62it/s]                                                                                                                                                                                             [2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 25%|█████████████████████████████████████▏                                                                                                                 | 48/195 [01:19<00:56,  2.62it/s]                                                                                                                                                                                             [2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 25%|█████████████████████████████████████▏                                                                                                                 | 48/195 [01:19<00:56,  2.62it/s] 26%|██████████████████████████████████████▋                                                                                                                | 50/195 [01:19<00:52,  2.74it/s]                                                                                                                                                                                             [2024-04-18 16:01:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 26%|██████████████████████████████████████▋                                                                                                                | 50/195 [01:19<00:52,  2.74it/s] 26%|███████████████████████████████████████▍                                                                                                               | 51/195 [01:21<01:24,  1.71it/s]                                                                                                                                                                                             [2024-04-18 16:01:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 26%|███████████████████████████████████████▍                                                                                                               | 51/195 [01:21<01:24,  1.71it/s]                                                                                                                                                                                             [2024-04-18 16:01:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 26%|███████████████████████████████████████▍                                                                                                               | 51/195 [01:21<01:24,  1.71it/s] 27%|█████████████████████████████████████████                                                                                                              | 53/195 [01:21<01:00,  2.35it/s]                                                                                                                                                                                             [2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.7.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 27%|█████████████████████████████████████████                                                                                                              | 53/195 [01:21<01:00,  2.35it/s] 28%|█████████████████████████████████████████▊                                                                                                             | 54/195 [01:21<00:53,  2.65it/s]                                                                                                                                                                                             [2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 28%|█████████████████████████████████████████▊                                                                                                             | 54/195 [01:21<00:53,  2.65it/s]                                                                                                                                                                                             [2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 28%|█████████████████████████████████████████▊                                                                                                             | 54/195 [01:21<00:53,  2.65it/s] 29%|███████████████████████████████████████████▎                                                                                                           | 56/195 [01:22<00:49,  2.78it/s]                                                                                                                                                                                             [2024-04-18 16:01:29] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 29%|███████████████████████████████████████████▎                                                                                                           | 56/195 [01:22<00:49,  2.78it/s] 29%|████████████████████████████████████████████▏                                                                                                          | 57/195 [01:23<01:19,  1.74it/s]                                                                                                                                                                                             [2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 29%|████████████████████████████████████████████▏                                                                                                          | 57/195 [01:23<01:19,  1.74it/s]                                                                                                                                                                                             [2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 29%|████████████████████████████████████████████▏                                                                                                          | 57/195 [01:23<01:19,  1.74it/s] 30%|█████████████████████████████████████████████▋                                                                                                         | 59/195 [01:23<00:56,  2.39it/s]                                                                                                                                                                                             [2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.8.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 30%|█████████████████████████████████████████████▋                                                                                                         | 59/195 [01:23<00:56,  2.39it/s] 31%|██████████████████████████████████████████████▍                                                                                                        | 60/195 [01:24<00:50,  2.70it/s]                                                                                                                                                                                             [2024-04-18 16:01:31] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00001-of-00004.safetensors
+ 31%|██████████████████████████████████████████████▍                                                                                                        | 60/195 [01:24<00:50,  2.70it/s]                                                                                                                                                                                             [2024-04-18 16:01:31] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00002-of-00004.safetensors
+ 31%|██████████████████████████████████████████████▍                                                                                                        | 60/195 [01:24<00:50,  2.70it/s]                                                                                                                                                                                             [2024-04-18 16:01:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 31%|██████████████████████████████████████████████▍                                                                                                        | 60/195 [01:35<00:50,  2.70it/s] 31%|███████████████████████████████████████████████▏                                                                                                       | 61/195 [01:35<06:37,  2.97s/it]                                                                                                                                                                                             [2024-04-18 16:01:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 31%|███████████████████████████████████████████████▏                                                                                                       | 61/195 [01:35<06:37,  2.97s/it] 32%|████████████████████████████████████████████████                                                                                                       | 62/195 [01:36<05:17,  2.39s/it]                                                                                                                                                                                             [2024-04-18 16:01:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 32%|████████████████████████████████████████████████                                                                                                       | 62/195 [01:36<05:17,  2.39s/it] 32%|████████████████████████████████████████████████▊                                                                                                      | 63/195 [01:37<04:44,  2.16s/it]                                                                                                                                                                                             [2024-04-18 16:01:44] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 32%|████████████████████████████████████████████████▊                                                                                                      | 63/195 [01:37<04:44,  2.16s/it]                                                                                                                                                                                             [2024-04-18 16:01:44] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 32%|████████████████████████████████████████████████▊                                                                                                      | 63/195 [01:37<04:44,  2.16s/it] 33%|██████████████████████████████████████████████████▎                                                                                                    | 65/195 [01:38<02:48,  1.30s/it]                                                                                                                                                                                             [2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.10.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 33%|██████████████████████████████████████████████████▎                                                                                                    | 65/195 [01:38<02:48,  1.30s/it] 34%|███████████████████████████████████████████████████                                                                                                    | 66/195 [01:38<02:14,  1.04s/it]                                                                                                                                                                                             [2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 34%|███████████████████████████████████████████████████                                                                                                    | 66/195 [01:38<02:14,  1.04s/it]                                                                                                                                                                                             [2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 34%|███████████████████████████████████████████████████                                                                                                    | 66/195 [01:38<02:14,  1.04s/it] 35%|████████████████████████████████████████████████████▋                                                                                                  | 68/195 [01:39<01:35,  1.32it/s]                                                                                                                                                                                             [2024-04-18 16:01:46] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 35%|████████████████████████████████████████████████████▋                                                                                                  | 68/195 [01:39<01:35,  1.32it/s] 35%|█████████████████████████████████████████████████████▍                                                                                                 | 69/195 [01:41<02:13,  1.06s/it]                                                                                                                                                                                             [2024-04-18 16:01:47] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 35%|█████████████████████████████████████████████████████▍                                                                                                 | 69/195 [01:41<02:13,  1.06s/it]                                                                                                                                                                                             [2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 35%|█████████████████████████████████████████████████████▍                                                                                                 | 69/195 [01:41<02:13,  1.06s/it] 36%|██████████████████████████████████████████████████████▉                                                                                                | 71/195 [01:41<01:27,  1.41it/s]                                                                                                                                                                                             [2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.11.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 36%|██████████████████████████████████████████████████████▉                                                                                                | 71/195 [01:41<01:27,  1.41it/s] 37%|███████████████████████████████████████████████████████▊                                                                                               | 72/195 [01:41<01:13,  1.67it/s]                                                                                                                                                                                             [2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 37%|███████████████████████████████████████████████████████▊                                                                                               | 72/195 [01:41<01:13,  1.67it/s]                                                                                                                                                                                             [2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 37%|███████████████████████████████████████████████████████▊                                                                                               | 72/195 [01:41<01:13,  1.67it/s] 38%|█████████████████████████████████████████████████████████▎                                                                                             | 74/195 [01:42<00:59,  2.02it/s]                                                                                                                                                                                             [2024-04-18 16:01:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 38%|█████████████████████████████████████████████████████████▎                                                                                             | 74/195 [01:42<00:59,  2.02it/s] 38%|██████████████████████████████████████████████████████████                                                                                             | 75/195 [01:45<02:01,  1.02s/it]                                                                                                                                                                                             [2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 38%|██████████████████████████████████████████████████████████                                                                                             | 75/195 [01:45<02:01,  1.02s/it]                                                                                                                                                                                             [2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 38%|██████████████████████████████████████████████████████████                                                                                             | 75/195 [01:45<02:01,  1.02s/it] 39%|███████████████████████████████████████████████████████████▋                                                                                           | 77/195 [01:45<01:21,  1.44it/s]                                                                                                                                                                                             [2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.12.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 39%|███████████████████████████████████████████████████████████▋                                                                                           | 77/195 [01:45<01:21,  1.44it/s] 40%|████████████████████████████████████████████████████████████▍                                                                                          | 78/195 [01:45<01:08,  1.70it/s]                                                                                                                                                                                             [2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 40%|████████████████████████████████████████████████████████████▍                                                                                          | 78/195 [01:45<01:08,  1.70it/s]                                                                                                                                                                                             [2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 40%|████████████████████████████████████████████████████████████▍                                                                                          | 78/195 [01:45<01:08,  1.70it/s] 41%|█████████████████████████████████████████████████████████████▉                                                                                         | 80/195 [01:46<00:57,  1.99it/s]                                                                                                                                                                                             [2024-04-18 16:01:53] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 41%|█████████████████████████████████████████████████████████████▉                                                                                         | 80/195 [01:47<00:57,  1.99it/s] 42%|██████████████████████████████████████████████████████████████▋                                                                                        | 81/195 [01:49<02:10,  1.14s/it]                                                                                                                                                                                             [2024-04-18 16:01:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 42%|██████████████████████████████████████████████████████████████▋                                                                                        | 81/195 [01:49<02:10,  1.14s/it]                                                                                                                                                                                             [2024-04-18 16:01:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 42%|██████████████████████████████████████████████████████████████▋                                                                                        | 81/195 [01:50<02:10,  1.14s/it] 43%|████████████████████████████████████████████████████████████████▎                                                                                      | 83/195 [01:50<01:26,  1.29it/s]                                                                                                                                                                                             [2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.13.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 43%|████████████████████████████████████████████████████████████████▎                                                                                      | 83/195 [01:50<01:26,  1.29it/s] 43%|█████████████████████████████████████████████████████████████████                                                                                      | 84/195 [01:50<01:12,  1.53it/s]                                                                                                                                                                                             [2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 43%|█████████████████████████████████████████████████████████████████                                                                                      | 84/195 [01:50<01:12,  1.53it/s]                                                                                                                                                                                             [2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 43%|█████████████████████████████████████████████████████████████████                                                                                      | 84/195 [01:50<01:12,  1.53it/s] 44%|██████████████████████████████████████████████████████████████████▌                                                                                    | 86/195 [01:51<01:04,  1.69it/s]                                                                                                                                                                                             [2024-04-18 16:01:58] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 44%|██████████████████████████████████████████████████████████████████▌                                                                                    | 86/195 [01:52<01:04,  1.69it/s] 45%|███████████████████████████████████████████████████████████████████▎                                                                                   | 87/195 [01:54<02:06,  1.18s/it]                                                                                                                                                                                             [2024-04-18 16:02:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 45%|███████████████████████████████████████████████████████████████████▎                                                                                   | 87/195 [01:54<02:06,  1.18s/it]                                                                                                                                                                                             [2024-04-18 16:02:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 45%|███████████████████████████████████████████████████████████████████▎                                                                                   | 87/195 [01:54<02:06,  1.18s/it] 46%|████████████████████████████████████████████████████████████████████▉                                                                                  | 89/195 [01:55<01:24,  1.25it/s]                                                                                                                                                                                             [2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.14.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 46%|████████████████████████████████████████████████████████████████████▉                                                                                  | 89/195 [01:55<01:24,  1.25it/s] 46%|█████████████████████████████████████████████████████████████████████▋                                                                                 | 90/195 [01:55<01:10,  1.49it/s]                                                                                                                                                                                             [2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 46%|█████████████████████████████████████████████████████████████████████▋                                                                                 | 90/195 [01:55<01:10,  1.49it/s]                                                                                                                                                                                             [2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 46%|█████████████████████████████████████████████████████████████████████▋                                                                                 | 90/195 [01:55<01:10,  1.49it/s] 47%|███████████████████████████████████████████████████████████████████████▏                                                                               | 92/195 [01:56<01:01,  1.67it/s]                                                                                                                                                                                             [2024-04-18 16:02:03] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 47%|███████████████████████████████████████████████████████████████████████▏                                                                               | 92/195 [01:56<01:01,  1.67it/s] 48%|████████████████████████████████████████████████████████████████████████                                                                               | 93/195 [01:59<01:59,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 48%|████████████████████████████████████████████████████████████████████████                                                                               | 93/195 [01:59<01:59,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 48%|████████████████████████████████████████████████████████████████████████                                                                               | 93/195 [01:59<01:59,  1.17s/it] 49%|█████████████████████████████████████████████████████████████████████████▌                                                                             | 95/195 [02:00<01:19,  1.26it/s]                                                                                                                                                                                             [2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.15.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 49%|█████████████████████████████████████████████████████████████████████████▌                                                                             | 95/195 [02:00<01:19,  1.26it/s] 49%|██████████████████████████████████████████████████████████████████████████▎                                                                            | 96/195 [02:00<01:06,  1.49it/s]                                                                                                                                                                                             [2024-04-18 16:02:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 49%|██████████████████████████████████████████████████████████████████████████▎                                                                            | 96/195 [02:00<01:06,  1.49it/s]                                                                                                                                                                                             [2024-04-18 16:02:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 49%|██████████████████████████████████████████████████████████████████████████▎                                                                            | 96/195 [02:00<01:06,  1.49it/s] 50%|███████████████████████████████████████████████████████████████████████████▉                                                                           | 98/195 [02:01<01:01,  1.58it/s]                                                                                                                                                                                             [2024-04-18 16:02:08] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 50%|███████████████████████████████████████████████████████████████████████████▉                                                                           | 98/195 [02:01<01:01,  1.58it/s] 51%|████████████████████████████████████████████████████████████████████████████▋                                                                          | 99/195 [02:04<01:53,  1.18s/it]                                                                                                                                                                                             [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 51%|████████████████████████████████████████████████████████████████████████████▋                                                                          | 99/195 [02:04<01:53,  1.18s/it]                                                                                                                                                                                             [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 51%|████████████████████████████████████████████████████████████████████████████▋                                                                          | 99/195 [02:04<01:53,  1.18s/it] 52%|█████████████████████████████████████████████████████████████████████████████▋                                                                        | 101/195 [02:04<01:15,  1.24it/s]                                                                                                                                                                                             [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 52%|█████████████████████████████████████████████████████████████████████████████▋                                                                        | 101/195 [02:04<01:15,  1.24it/s] 52%|██████████████████████████████████████████████████████████████████████████████▍                                                                       | 102/195 [02:05<01:03,  1.47it/s]                                                                                                                                                                                             [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 52%|██████████████████████████████████████████████████████████████████████████████▍                                                                       | 102/195 [02:05<01:03,  1.47it/s]                                                                                                                                                                                             [2024-04-18 16:02:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 52%|██████████████████████████████████████████████████████████████████████████████▍                                                                       | 102/195 [02:05<01:03,  1.47it/s] 53%|████████████████████████████████████████████████████████████████████████████████                                                                      | 104/195 [02:06<00:59,  1.54it/s]                                                                                                                                                                                             [2024-04-18 16:02:13] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 53%|████████████████████████████████████████████████████████████████████████████████                                                                      | 104/195 [02:06<00:59,  1.54it/s] 54%|████████████████████████████████████████████████████████████████████████████████▊                                                                     | 105/195 [02:09<01:48,  1.20s/it]                                                                                                                                                                                             [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 54%|████████████████████████████████████████████████████████████████████████████████▊                                                                     | 105/195 [02:09<01:48,  1.20s/it]                                                                                                                                                                                             [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 54%|████████████████████████████████████████████████████████████████████████████████▊                                                                     | 105/195 [02:09<01:48,  1.20s/it] 55%|██████████████████████████████████████████████████████████████████████████████████▎                                                                   | 107/195 [02:09<01:11,  1.23it/s]                                                                                                                                                                                             [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 55%|██████████████████████████████████████████████████████████████████████████████████▎                                                                   | 107/195 [02:09<01:11,  1.23it/s] 55%|███████████████████████████████████████████████████████████████████████████████████                                                                   | 108/195 [02:10<00:59,  1.46it/s]                                                                                                                                                                                             [2024-04-18 16:02:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 55%|███████████████████████████████████████████████████████████████████████████████████                                                                   | 108/195 [02:10<00:59,  1.46it/s]                                                                                                                                                                                             [2024-04-18 16:02:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 55%|███████████████████████████████████████████████████████████████████████████████████                                                                   | 108/195 [02:10<00:59,  1.46it/s] 56%|████████████████████████████████████████████████████████████████████████████████████▌                                                                 | 110/195 [02:11<00:53,  1.58it/s]                                                                                                                                                                                             [2024-04-18 16:02:18] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 56%|████████████████████████████████████████████████████████████████████████████████████▌                                                                 | 110/195 [02:11<00:53,  1.58it/s] 57%|█████████████████████████████████████████████████████████████████████████████████████▍                                                                | 111/195 [02:14<01:37,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 57%|█████████████████████████████████████████████████████████████████████████████████████▍                                                                | 111/195 [02:14<01:37,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 57%|█████████████████████████████████████████████████████████████████████████████████████▍                                                                | 111/195 [02:14<01:37,  1.17s/it] 58%|██████████████████████████████████████████████████████████████████████████████████████▉                                                               | 113/195 [02:14<01:04,  1.27it/s]                                                                                                                                                                                             [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 58%|██████████████████████████████████████████████████████████████████████████████████████▉                                                               | 113/195 [02:14<01:04,  1.27it/s] 58%|███████████████████████████████████████████████████████████████████████████████████████▋                                                              | 114/195 [02:14<00:53,  1.50it/s]                                                                                                                                                                                             [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 58%|███████████████████████████████████████████████████████████████████████████████████████▋                                                              | 114/195 [02:14<00:53,  1.50it/s]                                                                                                                                                                                             [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 58%|███████████████████████████████████████████████████████████████████████████████████████▋                                                              | 114/195 [02:15<00:53,  1.50it/s] 59%|█████████████████████████████████████████████████████████████████████████████████████████▏                                                            | 116/195 [02:16<00:48,  1.61it/s]                                                                                                                                                                                             [2024-04-18 16:02:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 59%|█████████████████████████████████████████████████████████████████████████████████████████▏                                                            | 116/195 [02:16<00:48,  1.61it/s] 60%|██████████████████████████████████████████████████████████████████████████████████████████                                                            | 117/195 [02:19<01:33,  1.19s/it]                                                                                                                                                                                             [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 60%|██████████████████████████████████████████████████████████████████████████████████████████                                                            | 117/195 [02:19<01:33,  1.19s/it]                                                                                                                                                                                             [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 60%|██████████████████████████████████████████████████████████████████████████████████████████                                                            | 117/195 [02:19<01:33,  1.19s/it] 61%|███████████████████████████████████████████████████████████████████████████████████████████▌                                                          | 119/195 [02:19<01:01,  1.24it/s]                                                                                                                                                                                             [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 61%|███████████████████████████████████████████████████████████████████████████████████████████▌                                                          | 119/195 [02:19<01:01,  1.24it/s] 62%|████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 120/195 [02:19<00:51,  1.46it/s]                                                                                                                                                                                             [2024-04-18 16:02:26] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00003-of-00004.safetensors
+ 62%|████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 120/195 [02:19<00:51,  1.46it/s]                                                                                                                                                                                             [2024-04-18 16:02:38] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 62%|████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 120/195 [02:31<00:51,  1.46it/s] 62%|█████████████████████████████████████████████████████████████████████████████████████████████                                                         | 121/195 [02:34<04:56,  4.01s/it]                                                                                                                                                                                             [2024-04-18 16:02:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 62%|█████████████████████████████████████████████████████████████████████████████████████████████                                                         | 121/195 [02:35<04:56,  4.01s/it] 63%|█████████████████████████████████████████████████████████████████████████████████████████████▊                                                        | 122/195 [02:35<03:59,  3.28s/it]                                                                                                                                                                                             [2024-04-18 16:02:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 63%|█████████████████████████████████████████████████████████████████████████████████████████████▊                                                        | 122/195 [02:36<03:59,  3.28s/it] 63%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                       | 123/195 [02:36<03:02,  2.53s/it]                                                                                                                                                                                             [2024-04-18 16:02:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 63%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                       | 123/195 [02:36<03:02,  2.53s/it]                                                                                                                                                                                             [2024-04-18 16:02:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 63%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                       | 123/195 [02:36<03:02,  2.53s/it] 64%|████████████████████████████████████████████████████████████████████████████████████████████████▏                                                     | 125/195 [02:38<02:05,  1.79s/it]                                                                                                                                                                                             [2024-04-18 16:02:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 64%|████████████████████████████████████████████████████████████████████████████████████████████████▏                                                     | 125/195 [02:38<02:05,  1.79s/it] 65%|████████████████████████████████████████████████████████████████████████████████████████████████▉                                                     | 126/195 [02:41<02:33,  2.22s/it]                                                                                                                                                                                             [2024-04-18 16:02:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 65%|████████████████████████████████████████████████████████████████████████████████████████████████▉                                                     | 126/195 [02:41<02:33,  2.22s/it]                                                                                                                                                                                             [2024-04-18 16:02:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 65%|████████████████████████████████████████████████████████████████████████████████████████████████▉                                                     | 126/195 [02:41<02:33,  2.22s/it] 66%|██████████████████████████████████████████████████████████████████████████████████████████████████▍                                                   | 128/195 [02:42<01:38,  1.47s/it]                                                                                                                                                                                             [2024-04-18 16:02:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 66%|██████████████████████████████████████████████████████████████████████████████████████████████████▍                                                   | 128/195 [02:42<01:38,  1.47s/it] 66%|███████████████████████████████████████████████████████████████████████████████████████████████████▏                                                  | 129/195 [02:42<01:18,  1.19s/it]                                                                                                                                                                                             [2024-04-18 16:02:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 66%|███████████████████████████████████████████████████████████████████████████████████████████████████▏                                                  | 129/195 [02:42<01:18,  1.19s/it]                                                                                                                                                                                             [2024-04-18 16:02:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 66%|███████████████████████████████████████████████████████████████████████████████████████████████████▏                                                  | 129/195 [02:42<01:18,  1.19s/it] 67%|████████████████████████████████████████████████████████████████████████████████████████████████████▊                                                 | 131/195 [02:44<01:06,  1.04s/it]                                                                                                                                                                                             [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 67%|████████████████████████████████████████████████████████████████████████████████████████████████████▊                                                 | 131/195 [02:44<01:06,  1.04s/it]                                                                                                                                                                                             [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 67%|████████████████████████████████████████████████████████████████████████████████████████████████████▊                                                 | 131/195 [02:44<01:06,  1.04s/it]                                                                                                                                                                                             [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 67%|████████████████████████████████████████████████████████████████████████████████████████████████████▊                                                 | 131/195 [02:44<01:06,  1.04s/it] 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████                                               | 134/195 [02:45<00:47,  1.28it/s]                                                                                                                                                                                             [2024-04-18 16:02:53] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████                                               | 134/195 [02:46<00:47,  1.28it/s] 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████▊                                              | 135/195 [02:49<01:15,  1.26s/it]                                                                                                                                                                                             [2024-04-18 16:02:55] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████▊                                              | 135/195 [02:49<01:15,  1.26s/it]                                                                                                                                                                                             [2024-04-18 16:02:55] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████▊                                              | 135/195 [02:49<01:15,  1.26s/it] 70%|█████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                            | 137/195 [02:49<00:52,  1.11it/s]                                                                                                                                                                                             [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 70%|█████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                            | 137/195 [02:49<00:52,  1.11it/s] 71%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏                                           | 138/195 [02:49<00:43,  1.30it/s]                                                                                                                                                                                             [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 71%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏                                           | 138/195 [02:49<00:43,  1.30it/s]                                                                                                                                                                                             [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 71%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏                                           | 138/195 [02:49<00:43,  1.30it/s] 72%|███████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                          | 140/195 [02:50<00:38,  1.42it/s]                                                                                                                                                                                             [2024-04-18 16:02:58] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 72%|███████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                          | 140/195 [02:51<00:38,  1.42it/s] 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                         | 141/195 [02:54<01:09,  1.29s/it]                                                                                                                                                                                             [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                         | 141/195 [02:54<01:09,  1.29s/it]                                                                                                                                                                                             [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                         | 141/195 [02:54<01:09,  1.29s/it] 73%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                                        | 143/195 [02:54<00:46,  1.12it/s]                                                                                                                                                                                             [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 73%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                                        | 143/195 [02:54<00:46,  1.12it/s] 74%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                       | 144/195 [02:55<00:38,  1.33it/s]                                                                                                                                                                                             [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 74%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                       | 144/195 [02:55<00:38,  1.33it/s]                                                                                                                                                                                             [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 74%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                       | 144/195 [02:55<00:38,  1.33it/s] 75%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                     | 146/195 [02:56<00:34,  1.41it/s]                                                                                                                                                                                             [2024-04-18 16:03:03] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 75%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                     | 146/195 [02:57<00:34,  1.41it/s] 75%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                     | 147/195 [02:59<01:03,  1.32s/it]                                                                                                                                                                                             [2024-04-18 16:03:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 75%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                     | 147/195 [02:59<01:03,  1.32s/it]                                                                                                                                                                                             [2024-04-18 16:03:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 75%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                     | 147/195 [03:00<01:03,  1.32s/it] 76%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                   | 149/195 [03:00<00:41,  1.12it/s]                                                                                                                                                                                             [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 76%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                   | 149/195 [03:00<00:41,  1.12it/s] 77%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                  | 150/195 [03:00<00:33,  1.33it/s]                                                                                                                                                                                             [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 77%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                  | 150/195 [03:00<00:33,  1.33it/s]                                                                                                                                                                                             [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 77%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                  | 150/195 [03:00<00:33,  1.33it/s] 78%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                                 | 152/195 [03:01<00:30,  1.40it/s]                                                                                                                                                                                             [2024-04-18 16:03:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 78%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                                 | 152/195 [03:02<00:30,  1.40it/s] 78%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                | 153/195 [03:05<00:52,  1.25s/it]                                                                                                                                                                                             [2024-04-18 16:03:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 78%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                | 153/195 [03:05<00:52,  1.25s/it]                                                                                                                                                                                             [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 78%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                | 153/195 [03:05<00:52,  1.25s/it] 79%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                              | 155/195 [03:05<00:33,  1.18it/s]                                                                                                                                                                                             [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 79%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                              | 155/195 [03:05<00:33,  1.18it/s] 80%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                              | 156/195 [03:05<00:27,  1.40it/s]                                                                                                                                                                                             [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 80%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                              | 156/195 [03:05<00:27,  1.40it/s]                                                                                                                                                                                             [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 80%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                              | 156/195 [03:05<00:27,  1.40it/s] 81%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                            | 158/195 [03:06<00:24,  1.53it/s]                                                                                                                                                                                             [2024-04-18 16:03:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 81%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                            | 158/195 [03:07<00:24,  1.53it/s] 82%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                           | 159/195 [03:10<00:44,  1.22s/it]                                                                                                                                                                                             [2024-04-18 16:03:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 82%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                           | 159/195 [03:10<00:44,  1.22s/it]                                                                                                                                                                                             [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 82%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                           | 159/195 [03:10<00:44,  1.22s/it] 83%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 161/195 [03:10<00:28,  1.21it/s]                                                                                                                                                                                             [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 83%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 161/195 [03:10<00:28,  1.21it/s] 83%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                         | 162/195 [03:10<00:22,  1.44it/s]                                                                                                                                                                                             [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 83%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                         | 162/195 [03:10<00:22,  1.44it/s]                                                                                                                                                                                             [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 83%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                         | 162/195 [03:10<00:22,  1.44it/s] 84%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                       | 164/195 [03:11<00:20,  1.52it/s]                                                                                                                                                                                             [2024-04-18 16:03:19] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 84%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                       | 164/195 [03:12<00:20,  1.52it/s] 85%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                       | 165/195 [03:15<00:36,  1.22s/it]                                                                                                                                                                                             [2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 85%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                       | 165/195 [03:15<00:36,  1.22s/it]                                                                                                                                                                                             [2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 85%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                       | 165/195 [03:15<00:36,  1.22s/it] 86%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                     | 167/195 [03:15<00:23,  1.21it/s]                                                                                                                                                                                             [2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+ 86%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                     | 167/195 [03:15<00:23,  1.21it/s] 86%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                    | 168/195 [03:15<00:18,  1.44it/s]                                                                                                                                                                                             [2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.input_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 86%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                    | 168/195 [03:15<00:18,  1.44it/s]                                                                                                                                                                                             [2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32
+ 86%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                    | 168/195 [03:15<00:18,  1.44it/s] 87%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                   | 170/195 [03:16<00:15,  1.61it/s]                                                                                                                                                                                             [2024-04-18 16:03:24] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32
+ 87%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                   | 170/195 [03:17<00:15,  1.61it/s] 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                  | 171/195 [03:19<00:28,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:03:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32
+ 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                  | 171/195 [03:19<00:28,  1.17s/it]                                                                                                                                                                                             [2024-04-18 16:03:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32
+ 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                  | 171/195 [03:20<00:28,  1.17s/it] 89%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                 | 173/195 [03:20<00:17,  1.26it/s]                                                                                                                                                                                             [2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.27.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32
+[2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.input_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
+[2024-04-18 16:03:29] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
+[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.post_attention_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
+[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
+[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.input_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
+[2024-04-18 16:03:34] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
+[2024-04-18 16:03:36] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.post_attention_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:36] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
+[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
+[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.input_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
+[2024-04-18 16:03:39] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
+[2024-04-18 16:03:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.post_attention_layernorm.weight", shape: (4096,), dtype: float32
+[2024-04-18 16:03:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
+[2024-04-18 16:03:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
+[2024-04-18 16:03:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
+[2024-04-18 16:03:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
+[2024-04-18 16:03:46] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
+100%|██████████| 195/195 [03:39<00:00,  1.13s/it]
+[2024-04-18 16:03:46] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00002-of-00004.safetensors
+[2024-04-18 16:03:46] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00003-of-00004.safetensors
+[2024-04-18 16:03:47] INFO stats.py:76: Time usage: HF loading: 36.734 sec; Pre-quantization mapping: 24.043 sec; Quantization: 0.000 sec
+[2024-04-18 16:03:47] INFO stats.py:90: RAM usage: Peak RAM: 18.469 GB. Total bytes loaded from disk: 29.915 GB
+[2024-04-18 16:03:47] INFO convert_weight.py:156: Parameter size after quantization: 29.915 GB
+[2024-04-18 16:03:47] INFO convert_weight.py:161: Total parameters: 8,030,261,248
+[2024-04-18 16:03:47] INFO convert_weight.py:162: Bits per parameter: 32.000
+[2024-04-18 16:03:47] INFO convert_weight.py:167: Saved to directory: /tmp/tmpq8el2iww
+
+All finished, 131 total shards committed, record saved to /tmp/tmpq8el2iww/ndarray-cache.json
+Also saved a bf16 record to /tmp/tmpq8el2iww/ndarray-cache-b16.json