06/17/2024 20:24:18 - WARNING - transformers.tokenization_utils_base - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/17/2024 20:24:18 - INFO - llamafactory.data.template - Replace eos token: <|im_end|>
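The two lines above are LLaMA-Factory's tokenizer/template setup: the Qwen2 tokenizer is loaded and the end-of-sequence token is pinned to <|im_end|> (id 151645 in the config dumped below). A minimal sketch for inspecting this outside the trainer, assuming only the public Qwen/Qwen2-7B-Instruct tokenizer on the Hub:

# Sketch: inspect the tokenizer this run loads (assumes transformers is installed
# and the Hub repo Qwen/Qwen2-7B-Instruct is reachable).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
print(tokenizer.eos_token)                             # expected "<|im_end|>" per the log
print(tokenizer.convert_tokens_to_ids("<|im_end|>"))   # 151645, matching eos_token_id below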
06/17/2024 20:24:22 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_en...
06/17/2024 20:24:27 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_zh...
06/17/2024 20:24:32 - INFO - llamafactory.data.loader - Loading dataset llamafactory/glaive_toolcall_en...
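The loader lines above pull the Glaive tool-calling SFT data from the Hub. A rough equivalent with the datasets library, assuming the Hub repo names taken verbatim from the log and a "train" split:

# Sketch: load the same Hub datasets directly (assumes the `datasets` package;
# the split name "train" is an assumption, not read from the log).
from datasets import load_dataset

toolcall_en = load_dataset("llamafactory/glaive_toolcall_en", split="train")
toolcall_zh = load_dataset("llamafactory/glaive_toolcall_zh", split="train")
print(len(toolcall_en), len(toolcall_zh))
print(toolcall_en[0].keys())   # inspect the conversation/tool fields before formatting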
06/17/2024 20:24:36 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/config.json
06/17/2024 20:24:36 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"_name_or_path": "Qwen/Qwen2-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
06/17/2024 20:24:36 - INFO - llamafactory.model.model_utils.quantization - Quantizing model to 4 bit.
06/17/2024 20:24:36 - INFO - transformers.modeling_utils - loading weights file model.safetensors from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/model.safetensors.index.json
06/17/2024 20:24:36 - INFO - transformers.modeling_utils - Instantiating Qwen2ForCausalLM model under default dtype torch.float16.
06/17/2024 20:24:36 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
06/17/2024 20:24:37 - INFO - llamafactory.model.model_utils.quantization - Quantizing model to 4 bit.
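The "Quantizing model to 4 bit" messages indicate QLoRA-style loading: the frozen base weights are loaded through bitsandbytes 4-bit quantization and only the adapter is trained on top. A minimal sketch of an equivalent manual load, assuming NF4 with double quantization and float16 compute (the log shows the model instantiated under torch.float16); LLaMA-Factory's exact quantization settings may differ:

# Sketch: load Qwen2-7B-Instruct in 4-bit via bitsandbytes, roughly mirroring the log.
# NF4 and double quantization are assumptions, not read from the log.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_use_double_quant=True,         # assumption
    bnb_4bit_compute_dtype=torch.float16,   # matches "default dtype torch.float16" above
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)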
06/17/2024 20:25:04 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
06/17/2024 20:25:04 - INFO - transformers.modeling_utils - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen2-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
06/17/2024 20:25:04 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/17/2024 20:25:04 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.misc - Found linear modules: o_proj,k_proj,up_proj,gate_proj,v_proj,q_proj,down_proj
06/17/2024 20:25:04 - INFO - transformers.generation.configuration_utils - loading configuration file generation_config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/generation_config.json
06/17/2024 20:25:04 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.05,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
06/17/2024 20:25:04 - INFO - llamafactory.model.loader - trainable params: 20185088 || all params: 7635801600 || trainable%: 0.2643
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
06/17/2024 20:25:04 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/17/2024 20:25:04 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/17/2024 20:25:04 - INFO - llamafactory.model.model_utils.misc - Found linear modules: o_proj,up_proj,down_proj,q_proj,v_proj,k_proj,gate_proj
06/17/2024 20:25:05 - INFO - llamafactory.model.loader - trainable params: 20185088 || all params: 7635801600 || trainable%: 0.2643
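The adapter targets all seven linear projections in each decoder layer. The reported 20,185,088 trainable parameters are consistent with LoRA rank 8: per layer, q_proj and o_proj each contribute r·(3584+3584), k_proj and v_proj each r·(3584+512), and gate_proj/up_proj/down_proj each r·(3584+18944), i.e. 90,112·r per layer; with r = 8 and 28 layers that is 28 × 8 × 90,112 = 20,185,088, which is 20,185,088 / 7,635,801,600 ≈ 0.2643% of all parameters, as logged. A PEFT sketch of the same adapter, with the rank inferred from that count and alpha/dropout left as assumptions:

# Sketch: the LoRA adapter implied by the log (rank inferred from the parameter count;
# lora_alpha and lora_dropout are assumptions, not recoverable from the log).
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,        # assumption
    lora_dropout=0.0,     # assumption
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora_config)   # `model` = the 4-bit base from the sketch above
model.print_trainable_parameters()           # should report 20,185,088 trainable params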
06/17/2024 20:25:05 - WARNING - accelerate.utils.other - Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
06/17/2024 20:25:05 - INFO - transformers.trainer - Using auto half precision backend
06/17/2024 20:25:05 - INFO - transformers.trainer - ***** Running training *****
06/17/2024 20:25:05 - INFO - transformers.trainer - Num examples = 2,000
06/17/2024 20:25:05 - INFO - transformers.trainer - Num Epochs = 3
06/17/2024 20:25:05 - INFO - transformers.trainer - Instantaneous batch size per device = 1
06/17/2024 20:25:05 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 16
06/17/2024 20:25:05 - INFO - transformers.trainer - Gradient Accumulation steps = 8
06/17/2024 20:25:05 - INFO - transformers.trainer - Total optimization steps = 375
06/17/2024 20:25:05 - INFO - transformers.trainer - Number of trainable parameters = 20,185,088
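The step count follows from the header: 2,000 examples with an effective batch of 16 (per-device batch 1 × gradient accumulation 8, which implies two processes) give 2,000 / 16 = 125 optimizer steps per epoch, and 125 × 3 epochs = 375 total optimization steps. A sketch of trainer arguments matching the header, with the output directory taken from the checkpoint paths later in the log and the remaining values hedged:

# Sketch: TrainingArguments consistent with the run header (precision and warmup are assumptions).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,           # peak LR read off the step logs below
    lr_scheduler_type="cosine",   # matches the decay of the logged learning rates
    logging_steps=5,              # epoch 0.04 of 125 steps/epoch = every 5 steps
    save_steps=100,               # checkpoints appear every 100 steps below
    fp16=True,                    # assumption; the log says "auto half precision backend"
)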
06/17/2024 20:26:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.6880, 'learning_rate': 4.9978e-05, 'epoch': 0.04, 'throughput': 807.59}
06/17/2024 20:26:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.7630, 'learning_rate': 4.9912e-05, 'epoch': 0.08, 'throughput': 788.63}
06/17/2024 20:27:47 - INFO - llamafactory.extras.callbacks - {'loss': 0.6882, 'learning_rate': 4.9803e-05, 'epoch': 0.12, 'throughput': 782.46}
06/17/2024 20:28:41 - INFO - llamafactory.extras.callbacks - {'loss': 0.6951, 'learning_rate': 4.9650e-05, 'epoch': 0.16, 'throughput': 778.58}
06/17/2024 20:29:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.5008, 'learning_rate': 4.9454e-05, 'epoch': 0.20, 'throughput': 776.76}
06/17/2024 20:30:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.5420, 'learning_rate': 4.9215e-05, 'epoch': 0.24, 'throughput': 780.46}
06/17/2024 20:31:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.5369, 'learning_rate': 4.8933e-05, 'epoch': 0.28, 'throughput': 781.27}
06/17/2024 20:32:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.4948, 'learning_rate': 4.8609e-05, 'epoch': 0.32, 'throughput': 780.39}
06/17/2024 20:32:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.5244, 'learning_rate': 4.8244e-05, 'epoch': 0.36, 'throughput': 778.70}
06/17/2024 20:33:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.4210, 'learning_rate': 4.7839e-05, 'epoch': 0.40, 'throughput': 780.25}
06/17/2024 20:34:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.4517, 'learning_rate': 4.7393e-05, 'epoch': 0.44, 'throughput': 779.83}
06/17/2024 20:35:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.4661, 'learning_rate': 4.6908e-05, 'epoch': 0.48, 'throughput': 775.58}
06/17/2024 20:36:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.4928, 'learning_rate': 4.6384e-05, 'epoch': 0.52, 'throughput': 775.62}
06/17/2024 20:37:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.5424, 'learning_rate': 4.5823e-05, 'epoch': 0.56, 'throughput': 775.79}
06/17/2024 20:37:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.5419, 'learning_rate': 4.5225e-05, 'epoch': 0.60, 'throughput': 774.15}
06/17/2024 20:38:39 - INFO - llamafactory.extras.callbacks - {'loss': 0.4558, 'learning_rate': 4.4592e-05, 'epoch': 0.64, 'throughput': 774.75}
06/17/2024 20:39:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.5656, 'learning_rate': 4.3925e-05, 'epoch': 0.68, 'throughput': 776.75}
06/17/2024 20:40:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.4832, 'learning_rate': 4.3224e-05, 'epoch': 0.72, 'throughput': 780.75}
06/17/2024 20:41:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.4626, 'learning_rate': 4.2492e-05, 'epoch': 0.76, 'throughput': 781.15}
06/17/2024 20:41:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.4837, 'learning_rate': 4.1728e-05, 'epoch': 0.80, 'throughput': 780.33}
06/17/2024 20:41:56 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100
06/17/2024 20:41:57 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/config.json
06/17/2024 20:41:57 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
06/17/2024 20:41:57 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100/tokenizer_config.json
06/17/2024 20:41:57 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100/special_tokens_map.json
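Each intermediate checkpoint holds only the LoRA adapter plus tokenizer files, not the full 7B weights, so it can be reattached to the quantized base for a quick qualitative check. A sketch, assuming the PEFT layout written by the trainer:

# Sketch: reload an intermediate adapter onto the 4-bit base for evaluation
# (`model` = a freshly loaded 4-bit base, as in the quantization sketch above).
from peft import PeftModel

ckpt = "saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-100"
model_with_adapter = PeftModel.from_pretrained(model, ckpt)
model_with_adapter.eval()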
06/17/2024 20:42:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.5144, 'learning_rate': 4.0936e-05, 'epoch': 0.84, 'throughput': 779.83}
06/17/2024 20:43:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.4930, 'learning_rate': 4.0115e-05, 'epoch': 0.88, 'throughput': 780.58}
06/17/2024 20:44:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.4083, 'learning_rate': 3.9268e-05, 'epoch': 0.92, 'throughput': 781.80}
06/17/2024 20:45:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.5172, 'learning_rate': 3.8396e-05, 'epoch': 0.96, 'throughput': 782.01}
06/17/2024 20:46:10 - INFO - llamafactory.extras.callbacks - {'loss': 0.5843, 'learning_rate': 3.7500e-05, 'epoch': 1.00, 'throughput': 782.22}
06/17/2024 20:46:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.4567, 'learning_rate': 3.6582e-05, 'epoch': 1.04, 'throughput': 783.58}
06/17/2024 20:47:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.4180, 'learning_rate': 3.5644e-05, 'epoch': 1.08, 'throughput': 784.96}
06/17/2024 20:48:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.3785, 'learning_rate': 3.4688e-05, 'epoch': 1.12, 'throughput': 783.96}
06/17/2024 20:49:22 - INFO - llamafactory.extras.callbacks - {'loss': 0.4097, 'learning_rate': 3.3714e-05, 'epoch': 1.16, 'throughput': 783.11}
06/17/2024 20:50:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.4507, 'learning_rate': 3.2725e-05, 'epoch': 1.20, 'throughput': 783.25}
06/17/2024 20:51:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.3680, 'learning_rate': 3.1723e-05, 'epoch': 1.24, 'throughput': 782.23}
06/17/2024 20:51:53 - INFO - llamafactory.extras.callbacks - {'loss': 0.4301, 'learning_rate': 3.0709e-05, 'epoch': 1.28, 'throughput': 782.26}
06/17/2024 20:52:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.4488, 'learning_rate': 2.9685e-05, 'epoch': 1.32, 'throughput': 781.89}
06/17/2024 20:53:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.4075, 'learning_rate': 2.8652e-05, 'epoch': 1.36, 'throughput': 781.74}
06/17/2024 20:54:30 - INFO - llamafactory.extras.callbacks - {'loss': 0.4991, 'learning_rate': 2.7613e-05, 'epoch': 1.40, 'throughput': 781.86}
06/17/2024 20:55:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.4894, 'learning_rate': 2.6570e-05, 'epoch': 1.44, 'throughput': 782.49}
06/17/2024 20:56:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.4967, 'learning_rate': 2.5524e-05, 'epoch': 1.48, 'throughput': 782.06}
06/17/2024 20:57:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.5297, 'learning_rate': 2.4476e-05, 'epoch': 1.52, 'throughput': 783.03}
06/17/2024 20:57:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.3939, 'learning_rate': 2.3430e-05, 'epoch': 1.56, 'throughput': 781.82}
06/17/2024 20:58:49 - INFO - llamafactory.extras.callbacks - {'loss': 0.4610, 'learning_rate': 2.2387e-05, 'epoch': 1.60, 'throughput': 781.19}
06/17/2024 20:58:49 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-200
06/17/2024 20:58:50 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/config.json
06/17/2024 20:58:50 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
06/17/2024 20:58:50 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-200/tokenizer_config.json
06/17/2024 20:58:50 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-200/special_tokens_map.json
06/17/2024 20:59:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.4622, 'learning_rate': 2.1348e-05, 'epoch': 1.64, 'throughput': 779.87}
06/17/2024 21:00:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.4043, 'learning_rate': 2.0315e-05, 'epoch': 1.68, 'throughput': 779.61}
06/17/2024 21:01:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.4280, 'learning_rate': 1.9291e-05, 'epoch': 1.72, 'throughput': 779.45}
06/17/2024 21:02:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.3779, 'learning_rate': 1.8277e-05, 'epoch': 1.76, 'throughput': 778.48}
06/17/2024 21:03:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.4526, 'learning_rate': 1.7275e-05, 'epoch': 1.80, 'throughput': 779.26}
06/17/2024 21:03:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.4627, 'learning_rate': 1.6286e-05, 'epoch': 1.84, 'throughput': 779.12}
06/17/2024 21:04:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.4873, 'learning_rate': 1.5312e-05, 'epoch': 1.88, 'throughput': 779.09}
06/17/2024 21:05:40 - INFO - llamafactory.extras.callbacks - {'loss': 0.3234, 'learning_rate': 1.4356e-05, 'epoch': 1.92, 'throughput': 780.05}
06/17/2024 21:06:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.4438, 'learning_rate': 1.3418e-05, 'epoch': 1.96, 'throughput': 780.37}
06/17/2024 21:07:21 - INFO - llamafactory.extras.callbacks - {'loss': 0.4407, 'learning_rate': 1.2500e-05, 'epoch': 2.00, 'throughput': 779.97}
06/17/2024 21:08:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.4401, 'learning_rate': 1.1604e-05, 'epoch': 2.04, 'throughput': 779.51}
06/17/2024 21:09:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.3771, 'learning_rate': 1.0732e-05, 'epoch': 2.08, 'throughput': 780.06}
06/17/2024 21:09:57 - INFO - llamafactory.extras.callbacks - {'loss': 0.4043, 'learning_rate': 9.8850e-06, 'epoch': 2.12, 'throughput': 781.03}
06/17/2024 21:10:42 - INFO - llamafactory.extras.callbacks - {'loss': 0.4018, 'learning_rate': 9.0644e-06, 'epoch': 2.16, 'throughput': 781.14}
06/17/2024 21:11:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.4258, 'learning_rate': 8.2717e-06, 'epoch': 2.20, 'throughput': 781.03}
06/17/2024 21:12:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.3912, 'learning_rate': 7.5084e-06, 'epoch': 2.24, 'throughput': 780.49}
06/17/2024 21:13:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.3458, 'learning_rate': 6.7758e-06, 'epoch': 2.28, 'throughput': 780.17}
06/17/2024 21:13:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.4255, 'learning_rate': 6.0751e-06, 'epoch': 2.32, 'throughput': 780.22}
06/17/2024 21:14:45 - INFO - llamafactory.extras.callbacks - {'loss': 0.4222, 'learning_rate': 5.4077e-06, 'epoch': 2.36, 'throughput': 780.80}
06/17/2024 21:15:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.3990, 'learning_rate': 4.7746e-06, 'epoch': 2.40, 'throughput': 780.45}
06/17/2024 21:15:33 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-300
06/17/2024 21:15:34 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/config.json
06/17/2024 21:15:34 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
06/17/2024 21:15:34 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-300/tokenizer_config.json
06/17/2024 21:15:34 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/checkpoint-300/special_tokens_map.json
06/17/2024 21:16:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.3382, 'learning_rate': 4.1770e-06, 'epoch': 2.44, 'throughput': 780.00}
06/17/2024 21:17:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.4465, 'learning_rate': 3.6159e-06, 'epoch': 2.48, 'throughput': 780.28}
06/17/2024 21:18:13 - INFO - llamafactory.extras.callbacks - {'loss': 0.3250, 'learning_rate': 3.0923e-06, 'epoch': 2.52, 'throughput': 779.93}
06/17/2024 21:19:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.3920, 'learning_rate': 2.6072e-06, 'epoch': 2.56, 'throughput': 779.59}
06/17/2024 21:19:53 - INFO - llamafactory.extras.callbacks - {'loss': 0.3672, 'learning_rate': 2.1614e-06, 'epoch': 2.60, 'throughput': 779.34}
06/17/2024 21:20:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.3554, 'learning_rate': 1.7556e-06, 'epoch': 2.64, 'throughput': 779.07}
06/17/2024 21:21:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.3801, 'learning_rate': 1.3906e-06, 'epoch': 2.68, 'throughput': 778.76}
06/17/2024 21:22:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.4350, 'learning_rate': 1.0670e-06, 'epoch': 2.72, 'throughput': 779.56}
06/17/2024 21:23:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.4063, 'learning_rate': 7.8542e-07, 'epoch': 2.76, 'throughput': 779.43}
06/17/2024 21:24:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.4894, 'learning_rate': 5.4631e-07, 'epoch': 2.80, 'throughput': 779.76}
06/17/2024 21:25:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.3822, 'learning_rate': 3.5010e-07, 'epoch': 2.84, 'throughput': 779.50}
06/17/2024 21:25:57 - INFO - llamafactory.extras.callbacks - {'loss': 0.4028, 'learning_rate': 1.9713e-07, 'epoch': 2.88, 'throughput': 779.46}
06/17/2024 21:26:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.4293, 'learning_rate': 8.7679e-08, 'epoch': 2.92, 'throughput': 779.71}
06/17/2024 21:27:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.4280, 'learning_rate': 2.1929e-08, 'epoch': 2.96, 'throughput': 779.90}
06/17/2024 21:28:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.4766, 'learning_rate': 0.0000e+00, 'epoch': 3.00, 'throughput': 779.83}
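The learning-rate column is consistent with a cosine decay from 5e-5 to 0 over the 375 steps with no warmup, lr(t) = 5e-5 · 0.5 · (1 + cos(π·t/375)): at step 5 (epoch 0.04) this gives ≈ 4.9978e-05 and at step 190 (epoch 1.52) ≈ 2.4476e-05, both matching the logged values. A quick check:

# Sketch: verify the cosine learning-rate schedule implied by the logged values.
import math

def cosine_lr(step, total_steps=375, peak_lr=5e-5):
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print(cosine_lr(5))    # ~4.9978e-05, the first logged value (epoch 0.04)
print(cosine_lr(190))  # ~2.4476e-05, the value logged at epoch 1.52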
06/17/2024 21:28:29 - INFO - transformers.trainer -
Training completed. Do not forget to share your model on huggingface.co/models =)
06/17/2024 21:28:29 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05
06/17/2024 21:28:30 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--Qwen--Qwen2-7B-Instruct/snapshots/41c66b0be1c3081f13defc6bdf946c2ef240d6a6/config.json
06/17/2024 21:28:30 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
06/17/2024 21:28:30 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/tokenizer_config.json
06/17/2024 21:28:30 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05/special_tokens_map.json
06/17/2024 21:28:30 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
06/17/2024 21:28:30 - INFO - transformers.modelcard - Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
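The final save leaves the finished adapter in saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05; only training loss was plotted because no eval set was configured (hence the "No metric eval_loss" warning). A common follow-up, not part of this log, is to merge the adapter into full-precision base weights for standalone inference; a hedged sketch:

# Sketch: merge the trained LoRA adapter into a full-precision copy of the base model
# (merging requires non-quantized base weights; the adapter path is taken from the log,
# the output directory name is illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
adapter_dir = "saves/Qwen2-7B-Chat/lora/train_2024-06-17-19-49-05"
merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
merged.save_pretrained("qwen2-7b-toolcall-merged")
AutoTokenizer.from_pretrained(adapter_dir).save_pretrained("qwen2-7b-toolcall-merged")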