[WARNING|2024-12-17 12:49:56] logging.py:162 >> We recommend enabling mixed precision training.
[INFO|2024-12-17 12:49:56] parser.py:355 >> Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
[INFO|2024-12-17 12:49:56] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 12:49:56] configuration_utils.py:746 >> Model config Qwen2Config {
"_name_or_path": "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
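
For reference, the configuration dump above can be reproduced outside the trainer; a minimal sketch, assuming the same local model path and a transformers install compatible with the 4.46.1 recorded in the config:

# Minimal sketch (assumed local path): load and inspect the same Qwen2Config
# that the trainer prints above.
from transformers import AutoConfig

MODEL_PATH = "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct"

config = AutoConfig.from_pretrained(MODEL_PATH)
print(config.model_type)           # "qwen2"
print(config.hidden_size)          # 3584
print(config.num_hidden_layers)    # 28
print(config.num_key_value_heads)  # 4 (grouped-query attention)
print(config.vocab_size)           # 152064
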
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-17 12:49:56] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
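
The six tokenizer files listed above are the standard Qwen2 tokenizer assets; a minimal sketch of loading them directly, assuming the same local path:

# Minimal sketch (assumed local path): load the tokenizer from the files above.
from transformers import AutoTokenizer

MODEL_PATH = "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
print(tokenizer.eos_token)  # the trainer later reports replacing eos with <|im_end|>
print(len(tokenizer))       # tokenizer length, distinct from vocab_size=152064
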
[INFO|2024-12-17 12:49:57] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 12:49:57] configuration_utils.py:746 >> Model config Qwen2Config {
"_name_or_path": "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-17 12:49:57] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2024-12-17 12:49:57] logging.py:157 >> Replace eos token: <|im_end|>
[INFO|2024-12-17 12:49:57] logging.py:157 >> Loading dataset qwendatacollect.json...
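
qwendatacollect.json itself is not part of this log, so its schema is unknown here; a minimal sketch for sanity-checking such a JSON file before training, where the record fields are illustrative assumptions only:

# Minimal sketch: inspect the raw JSON dataset before handing it to the trainer.
# The record fields are NOT confirmed by this log; a record is simply printed so
# the expected fields can be checked by eye.
import json

with open("qwendatacollect.json", "r", encoding="utf-8") as f:
    records = json.load(f)

print(len(records))  # the trainer later reports 3,354 training examples
print(records[0])    # inspect one record to confirm the expected fields
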
[INFO|2024-12-17 12:50:03] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 12:50:03] configuration_utils.py:746 >> Model config Qwen2Config {
"_name_or_path": "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 12:50:03] modeling_utils.py:3934 >> loading weights file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/model.safetensors.index.json
[INFO|2024-12-17 12:50:03] modeling_utils.py:1670 >> Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
[INFO|2024-12-17 12:50:03] configuration_utils.py:1096 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
[INFO|2024-12-17 12:50:06] modeling_utils.py:4800 >> All model checkpoint weights were used when initializing Qwen2ForCausalLM.
[INFO|2024-12-17 12:50:06] modeling_utils.py:4808 >> All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
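
The weight load above can also be done directly with transformers; a minimal sketch, assuming the same local path and enough GPU memory for the full bf16 checkpoint (roughly 15 GB for ~7.6 B parameters):

# Minimal sketch (assumed local path): load the bf16 weights with SDPA attention,
# matching the dtype and attention backend this log reports for the run.
import torch
from transformers import AutoModelForCausalLM

MODEL_PATH = "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    attn_implementation="sdpa",
).to("cuda:0")
print(sum(p.numel() for p in model.parameters()))  # ~7.6e9 parameters
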
[INFO|2024-12-17 12:50:06] configuration_utils.py:1049 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/generation_config.json
[INFO|2024-12-17 12:50:06] configuration_utils.py:1096 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.05,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
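
The generation defaults above come from the checkpoint's generation_config.json; a minimal sketch rebuilding the same object explicitly:

# Minimal sketch: the generation defaults echoed above, constructed by hand.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    bos_token_id=151643,
    eos_token_id=[151645, 151643],
    pad_token_id=151643,
    do_sample=True,
    temperature=0.7,
    top_k=20,
    top_p=0.8,
    repetition_penalty=1.05,
)
print(gen_config)
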
[INFO|2024-12-17 12:50:06] logging.py:157 >> Gradient checkpointing enabled.
[INFO|2024-12-17 12:50:06] logging.py:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-17 12:50:06] logging.py:157 >> Pure bf16 / BAdam detected, remaining trainable params in half precision.
[INFO|2024-12-17 12:50:06] logging.py:157 >> Fine-tuning method: LoRA
[INFO|2024-12-17 12:50:06] logging.py:157 >> Found linear modules: gate_proj,o_proj,down_proj,up_proj,v_proj,k_proj,q_proj
[INFO|2024-12-17 12:50:06] logging.py:157 >> trainable params: 20,185,088 || all params: 7,635,801,600 || trainable%: 0.2643
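
The adapter rank is not printed above, but rank 8 over the seven listed projections reproduces the logged 20,185,088 trainable parameters exactly (90,112 parameters per rank unit per layer x 8 x 28 layers); a minimal peft sketch consistent with those numbers, with lora_alpha and dropout as assumptions rather than logged values:

# Minimal sketch (not the exact LLaMA-Factory call): a peft LoraConfig whose
# trainable-parameter count matches the line above. r=8 is inferred from the
# count; lora_alpha and lora_dropout are assumptions, not values from this log.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,     # assumption (common default), not logged in this run
    lora_dropout=0.0,  # assumption
    task_type="CAUSAL_LM",
    target_modules=["gate_proj", "o_proj", "down_proj", "up_proj",
                    "v_proj", "k_proj", "q_proj"],
)
peft_model = get_peft_model(model, lora_config)  # `model` as loaded above
peft_model.print_trainable_parameters()
# trainable params: 20,185,088 || all params: 7,635,801,600 || trainable%: 0.2643
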
[INFO|2024-12-17 12:50:06] trainer.py:2313 >> ***** Running training *****
[INFO|2024-12-17 12:50:06] trainer.py:2314 >> Num examples = 3,354
[INFO|2024-12-17 12:50:06] trainer.py:2315 >> Num Epochs = 3
[INFO|2024-12-17 12:50:06] trainer.py:2316 >> Instantaneous batch size per device = 1
[INFO|2024-12-17 12:50:06] trainer.py:2319 >> Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|2024-12-17 12:50:06] trainer.py:2320 >> Gradient Accumulation steps = 8
[INFO|2024-12-17 12:50:06] trainer.py:2321 >> Total optimization steps = 1,257
[INFO|2024-12-17 12:50:06] trainer.py:2322 >> Number of trainable parameters = 20,185,088
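
The header numbers above are mutually consistent: one sample per device times 8 gradient-accumulation steps on a single GPU gives the effective batch of 8, and flooring the per-epoch update count gives 3354 // 8 = 419 updates per epoch, i.e. 419 x 3 = 1257 total optimization steps. A small sketch of that arithmetic:

# Sanity check of the training header above (single GPU, no distributed run).
num_examples = 3354
per_device_batch = 1
grad_accum = 8
num_epochs = 3

effective_batch = per_device_batch * grad_accum      # 8
updates_per_epoch = num_examples // effective_batch  # 419 (floored)
total_steps = updates_per_epoch * num_epochs         # 1257, as logged
print(effective_batch, updates_per_epoch, total_steps)
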
[INFO|2024-12-17 12:50:27] logging.py:157 >> {'loss': 1.4421, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
[INFO|2024-12-17 12:50:47] logging.py:157 >> {'loss': 1.3306, 'learning_rate': 4.9992e-05, 'epoch': 0.02}
[INFO|2024-12-17 12:51:07] logging.py:157 >> {'loss': 1.1873, 'learning_rate': 4.9982e-05, 'epoch': 0.04}
[INFO|2024-12-17 12:51:27] logging.py:157 >> {'loss': 1.0583, 'learning_rate': 4.9969e-05, 'epoch': 0.05}
[INFO|2024-12-17 12:51:48] logging.py:157 >> {'loss': 1.0045, 'learning_rate': 4.9951e-05, 'epoch': 0.06}
[INFO|2024-12-17 12:52:08] logging.py:157 >> {'loss': 0.9836, 'learning_rate': 4.9930e-05, 'epoch': 0.07}
[INFO|2024-12-17 12:52:28] logging.py:157 >> {'loss': 0.8925, 'learning_rate': 4.9904e-05, 'epoch': 0.08}
[INFO|2024-12-17 12:52:48] logging.py:157 >> {'loss': 0.8783, 'learning_rate': 4.9875e-05, 'epoch': 0.10}
[INFO|2024-12-17 12:53:08] logging.py:157 >> {'loss': 0.8681, 'learning_rate': 4.9842e-05, 'epoch': 0.11}
[INFO|2024-12-17 12:53:28] logging.py:157 >> {'loss': 0.8251, 'learning_rate': 4.9805e-05, 'epoch': 0.12}
[INFO|2024-12-17 12:53:48] logging.py:157 >> {'loss': 0.8079, 'learning_rate': 4.9764e-05, 'epoch': 0.13}
[INFO|2024-12-17 12:54:08] logging.py:157 >> {'loss': 0.7769, 'learning_rate': 4.9719e-05, 'epoch': 0.14}
[INFO|2024-12-17 12:54:29] logging.py:157 >> {'loss': 0.7770, 'learning_rate': 4.9671e-05, 'epoch': 0.16}
[INFO|2024-12-17 12:54:49] logging.py:157 >> {'loss': 0.7303, 'learning_rate': 4.9618e-05, 'epoch': 0.17}
[INFO|2024-12-17 12:55:09] logging.py:157 >> {'loss': 0.7442, 'learning_rate': 4.9562e-05, 'epoch': 0.18}
[INFO|2024-12-17 12:55:29] logging.py:157 >> {'loss': 0.7077, 'learning_rate': 4.9502e-05, 'epoch': 0.19}
[INFO|2024-12-17 12:55:49] logging.py:157 >> {'loss': 0.6730, 'learning_rate': 4.9438e-05, 'epoch': 0.20}
[INFO|2024-12-17 12:56:09] logging.py:157 >> {'loss': 0.7109, 'learning_rate': 4.9370e-05, 'epoch': 0.21}
[INFO|2024-12-17 12:56:29] logging.py:157 >> {'loss': 0.6798, 'learning_rate': 4.9299e-05, 'epoch': 0.23}
[INFO|2024-12-17 12:56:50] logging.py:157 >> {'loss': 0.6471, 'learning_rate': 4.9223e-05, 'epoch': 0.24}
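
Every step line in this log uses the same {'loss', 'learning_rate', 'epoch'} dictionary format, and no eval metrics are logged in this run, so the training curve has to be recovered from these lines; a minimal parsing sketch over this file:

# Minimal sketch: recover the training-loss and learning-rate curves from the
# step lines in this running log (there is no eval loss in this run).
import ast
import re

pattern = re.compile(r">> (\{'loss'.*\})\s*$")
steps = []
with open("running_log.txt", encoding="utf-8") as f:
    for line in f:
        match = pattern.search(line)
        if match:
            steps.append(ast.literal_eval(match.group(1)))

losses = [s["loss"] for s in steps]
print(f"{len(steps)} logged points, loss {losses[0]:.4f} -> {losses[-1]:.4f}")
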
[INFO|2024-12-17 12:56:50] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-100
[INFO|2024-12-17 12:56:50] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 12:56:50] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 12:56:50] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-100/tokenizer_config.json
[INFO|2024-12-17 12:56:50] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-100/special_tokens_map.json
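
Each checkpoint directory saved during this run (checkpoint-100 above and the later ones) holds the LoRA adapter plus tokenizer files rather than a full copy of the base weights; a minimal sketch for attaching such a checkpoint to the frozen base model for a quick mid-training check, using the paths recorded in this log:

# Minimal sketch: attach an intermediate adapter checkpoint to the base model
# for a quick qualitative check. Paths are the ones recorded in this log.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct"
ADAPTER = "saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-100"

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)  # tokenizer files saved alongside
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, ADAPTER).to("cuda:0").eval()
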
[INFO|2024-12-17 12:57:10] logging.py:157 >> {'loss': 0.6456, 'learning_rate': 4.9144e-05, 'epoch': 0.25}
[INFO|2024-12-17 12:57:31] logging.py:157 >> {'loss': 0.6462, 'learning_rate': 4.9061e-05, 'epoch': 0.26}
[INFO|2024-12-17 12:57:51] logging.py:157 >> {'loss': 0.6594, 'learning_rate': 4.8974e-05, 'epoch': 0.27}
[INFO|2024-12-17 12:58:11] logging.py:157 >> {'loss': 0.6372, 'learning_rate': 4.8884e-05, 'epoch': 0.29}
[INFO|2024-12-17 12:58:31] logging.py:157 >> {'loss': 0.6594, 'learning_rate': 4.8790e-05, 'epoch': 0.30}
[INFO|2024-12-17 12:58:52] logging.py:157 >> {'loss': 0.6544, 'learning_rate': 4.8692e-05, 'epoch': 0.31}
[INFO|2024-12-17 12:59:12] logging.py:157 >> {'loss': 0.6275, 'learning_rate': 4.8590e-05, 'epoch': 0.32}
[INFO|2024-12-17 12:59:32] logging.py:157 >> {'loss': 0.6348, 'learning_rate': 4.8485e-05, 'epoch': 0.33}
[INFO|2024-12-17 12:59:53] logging.py:157 >> {'loss': 0.6045, 'learning_rate': 4.8376e-05, 'epoch': 0.35}
[INFO|2024-12-17 13:00:13] logging.py:157 >> {'loss': 0.6487, 'learning_rate': 4.8264e-05, 'epoch': 0.36}
[INFO|2024-12-17 13:00:33] logging.py:157 >> {'loss': 0.6026, 'learning_rate': 4.8147e-05, 'epoch': 0.37}
[INFO|2024-12-17 13:00:53] logging.py:157 >> {'loss': 0.6083, 'learning_rate': 4.8028e-05, 'epoch': 0.38}
[INFO|2024-12-17 13:01:14] logging.py:157 >> {'loss': 0.5957, 'learning_rate': 4.7904e-05, 'epoch': 0.39}
[INFO|2024-12-17 13:01:34] logging.py:157 >> {'loss': 0.6538, 'learning_rate': 4.7777e-05, 'epoch': 0.41}
[INFO|2024-12-17 13:01:54] logging.py:157 >> {'loss': 0.6200, 'learning_rate': 4.7647e-05, 'epoch': 0.42}
[INFO|2024-12-17 13:02:14] logging.py:157 >> {'loss': 0.6461, 'learning_rate': 4.7513e-05, 'epoch': 0.43}
[INFO|2024-12-17 13:02:35] logging.py:157 >> {'loss': 0.6330, 'learning_rate': 4.7375e-05, 'epoch': 0.44}
[INFO|2024-12-17 13:02:55] logging.py:157 >> {'loss': 0.6330, 'learning_rate': 4.7234e-05, 'epoch': 0.45}
[INFO|2024-12-17 13:03:15] logging.py:157 >> {'loss': 0.6082, 'learning_rate': 4.7089e-05, 'epoch': 0.47}
[INFO|2024-12-17 13:03:35] logging.py:157 >> {'loss': 0.6176, 'learning_rate': 4.6941e-05, 'epoch': 0.48}
[INFO|2024-12-17 13:03:35] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-200
[INFO|2024-12-17 13:03:35] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:03:35] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:03:36] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-200/tokenizer_config.json
[INFO|2024-12-17 13:03:36] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-200/special_tokens_map.json
[INFO|2024-12-17 13:03:56] logging.py:157 >> {'loss': 0.6168, 'learning_rate': 4.6790e-05, 'epoch': 0.49}
[INFO|2024-12-17 13:04:17] logging.py:157 >> {'loss': 0.6454, 'learning_rate': 4.6635e-05, 'epoch': 0.50}
[INFO|2024-12-17 13:04:37] logging.py:157 >> {'loss': 0.5738, 'learning_rate': 4.6477e-05, 'epoch': 0.51}
[INFO|2024-12-17 13:04:57] logging.py:157 >> {'loss': 0.5978, 'learning_rate': 4.6315e-05, 'epoch': 0.52}
[INFO|2024-12-17 13:05:17] logging.py:157 >> {'loss': 0.6375, 'learning_rate': 4.6150e-05, 'epoch': 0.54}
[INFO|2024-12-17 13:05:38] logging.py:157 >> {'loss': 0.6060, 'learning_rate': 4.5982e-05, 'epoch': 0.55}
[INFO|2024-12-17 13:05:58] logging.py:157 >> {'loss': 0.6261, 'learning_rate': 4.5811e-05, 'epoch': 0.56}
[INFO|2024-12-17 13:06:18] logging.py:157 >> {'loss': 0.5877, 'learning_rate': 4.5636e-05, 'epoch': 0.57}
[INFO|2024-12-17 13:06:38] logging.py:157 >> {'loss': 0.5821, 'learning_rate': 4.5458e-05, 'epoch': 0.58}
[INFO|2024-12-17 13:06:59] logging.py:157 >> {'loss': 0.6126, 'learning_rate': 4.5277e-05, 'epoch': 0.60}
[INFO|2024-12-17 13:07:19] logging.py:157 >> {'loss': 0.6034, 'learning_rate': 4.5092e-05, 'epoch': 0.61}
[INFO|2024-12-17 13:07:39] logging.py:157 >> {'loss': 0.6088, 'learning_rate': 4.4905e-05, 'epoch': 0.62}
[INFO|2024-12-17 13:07:59] logging.py:157 >> {'loss': 0.6067, 'learning_rate': 4.4714e-05, 'epoch': 0.63}
[INFO|2024-12-17 13:08:19] logging.py:157 >> {'loss': 0.6334, 'learning_rate': 4.4521e-05, 'epoch': 0.64}
[INFO|2024-12-17 13:08:39] logging.py:157 >> {'loss': 0.6093, 'learning_rate': 4.4324e-05, 'epoch': 0.66}
[INFO|2024-12-17 13:09:00] logging.py:157 >> {'loss': 0.5939, 'learning_rate': 4.4124e-05, 'epoch': 0.67}
[INFO|2024-12-17 13:09:20] logging.py:157 >> {'loss': 0.5483, 'learning_rate': 4.3922e-05, 'epoch': 0.68}
[INFO|2024-12-17 13:09:40] logging.py:157 >> {'loss': 0.5946, 'learning_rate': 4.3716e-05, 'epoch': 0.69}
[INFO|2024-12-17 13:10:00] logging.py:157 >> {'loss': 0.5651, 'learning_rate': 4.3507e-05, 'epoch': 0.70}
[INFO|2024-12-17 13:10:21] logging.py:157 >> {'loss': 0.5965, 'learning_rate': 4.3296e-05, 'epoch': 0.72}
[INFO|2024-12-17 13:10:21] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-300
[INFO|2024-12-17 13:10:21] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:10:21] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:10:21] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-300/tokenizer_config.json
[INFO|2024-12-17 13:10:21] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-300/special_tokens_map.json
[INFO|2024-12-17 13:10:42] logging.py:157 >> {'loss': 0.5751, 'learning_rate': 4.3082e-05, 'epoch': 0.73}
[INFO|2024-12-17 13:11:02] logging.py:157 >> {'loss': 0.6043, 'learning_rate': 4.2864e-05, 'epoch': 0.74}
[INFO|2024-12-17 13:11:22] logging.py:157 >> {'loss': 0.5948, 'learning_rate': 4.2645e-05, 'epoch': 0.75}
[INFO|2024-12-17 13:11:43] logging.py:157 >> {'loss': 0.5538, 'learning_rate': 4.2422e-05, 'epoch': 0.76}
[INFO|2024-12-17 13:12:03] logging.py:157 >> {'loss': 0.5696, 'learning_rate': 4.2196e-05, 'epoch': 0.78}
[INFO|2024-12-17 13:12:23] logging.py:157 >> {'loss': 0.5613, 'learning_rate': 4.1968e-05, 'epoch': 0.79}
[INFO|2024-12-17 13:12:43] logging.py:157 >> {'loss': 0.5712, 'learning_rate': 4.1738e-05, 'epoch': 0.80}
[INFO|2024-12-17 13:13:04] logging.py:157 >> {'loss': 0.5693, 'learning_rate': 4.1504e-05, 'epoch': 0.81}
[INFO|2024-12-17 13:13:24] logging.py:157 >> {'loss': 0.5911, 'learning_rate': 4.1268e-05, 'epoch': 0.82}
[INFO|2024-12-17 13:13:44] logging.py:157 >> {'loss': 0.5551, 'learning_rate': 4.1030e-05, 'epoch': 0.83}
[INFO|2024-12-17 13:14:05] logging.py:157 >> {'loss': 0.5640, 'learning_rate': 4.0789e-05, 'epoch': 0.85}
[INFO|2024-12-17 13:14:25] logging.py:157 >> {'loss': 0.5766, 'learning_rate': 4.0545e-05, 'epoch': 0.86}
[INFO|2024-12-17 13:14:45] logging.py:157 >> {'loss': 0.5289, 'learning_rate': 4.0299e-05, 'epoch': 0.87}
[INFO|2024-12-17 13:15:05] logging.py:157 >> {'loss': 0.5839, 'learning_rate': 4.0051e-05, 'epoch': 0.88}
[INFO|2024-12-17 13:15:25] logging.py:157 >> {'loss': 0.5830, 'learning_rate': 3.9801e-05, 'epoch': 0.89}
[INFO|2024-12-17 13:15:46] logging.py:157 >> {'loss': 0.5645, 'learning_rate': 3.9548e-05, 'epoch': 0.91}
[INFO|2024-12-17 13:16:06] logging.py:157 >> {'loss': 0.6013, 'learning_rate': 3.9292e-05, 'epoch': 0.92}
[INFO|2024-12-17 13:16:26] logging.py:157 >> {'loss': 0.5657, 'learning_rate': 3.9035e-05, 'epoch': 0.93}
[INFO|2024-12-17 13:16:46] logging.py:157 >> {'loss': 0.6077, 'learning_rate': 3.8775e-05, 'epoch': 0.94}
[INFO|2024-12-17 13:17:07] logging.py:157 >> {'loss': 0.5553, 'learning_rate': 3.8514e-05, 'epoch': 0.95}
[INFO|2024-12-17 13:17:07] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-400
[INFO|2024-12-17 13:17:07] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:17:07] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:17:07] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-400/tokenizer_config.json
[INFO|2024-12-17 13:17:07] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-400/special_tokens_map.json
[INFO|2024-12-17 13:17:28] logging.py:157 >> {'loss': 0.5806, 'learning_rate': 3.8250e-05, 'epoch': 0.97}
[INFO|2024-12-17 13:17:48] logging.py:157 >> {'loss': 0.5463, 'learning_rate': 3.7984e-05, 'epoch': 0.98}
[INFO|2024-12-17 13:18:08] logging.py:157 >> {'loss': 0.6032, 'learning_rate': 3.7716e-05, 'epoch': 0.99}
[INFO|2024-12-17 13:18:29] logging.py:157 >> {'loss': 0.6630, 'learning_rate': 3.7446e-05, 'epoch': 1.00}
[INFO|2024-12-17 13:18:49] logging.py:157 >> {'loss': 0.5532, 'learning_rate': 3.7174e-05, 'epoch': 1.01}
[INFO|2024-12-17 13:19:09] logging.py:157 >> {'loss': 0.5370, 'learning_rate': 3.6900e-05, 'epoch': 1.03}
[INFO|2024-12-17 13:19:29] logging.py:157 >> {'loss': 0.5192, 'learning_rate': 3.6624e-05, 'epoch': 1.04}
[INFO|2024-12-17 13:19:50] logging.py:157 >> {'loss': 0.5525, 'learning_rate': 3.6347e-05, 'epoch': 1.05}
[INFO|2024-12-17 13:20:10] logging.py:157 >> {'loss': 0.5546, 'learning_rate': 3.6068e-05, 'epoch': 1.06}
[INFO|2024-12-17 13:20:30] logging.py:157 >> {'loss': 0.5298, 'learning_rate': 3.5787e-05, 'epoch': 1.07}
[INFO|2024-12-17 13:20:50] logging.py:157 >> {'loss': 0.5354, 'learning_rate': 3.5504e-05, 'epoch': 1.09}
[INFO|2024-12-17 13:21:11] logging.py:157 >> {'loss': 0.5173, 'learning_rate': 3.5220e-05, 'epoch': 1.10}
[INFO|2024-12-17 13:21:31] logging.py:157 >> {'loss': 0.5424, 'learning_rate': 3.4934e-05, 'epoch': 1.11}
[INFO|2024-12-17 13:21:51] logging.py:157 >> {'loss': 0.5740, 'learning_rate': 3.4646e-05, 'epoch': 1.12}
[INFO|2024-12-17 13:22:11] logging.py:157 >> {'loss': 0.5152, 'learning_rate': 3.4357e-05, 'epoch': 1.13}
[INFO|2024-12-17 13:22:32] logging.py:157 >> {'loss': 0.5504, 'learning_rate': 3.4067e-05, 'epoch': 1.14}
[INFO|2024-12-17 13:22:52] logging.py:157 >> {'loss': 0.5547, 'learning_rate': 3.3775e-05, 'epoch': 1.16}
[INFO|2024-12-17 13:23:12] logging.py:157 >> {'loss': 0.5319, 'learning_rate': 3.3482e-05, 'epoch': 1.17}
[INFO|2024-12-17 13:23:32] logging.py:157 >> {'loss': 0.5541, 'learning_rate': 3.3187e-05, 'epoch': 1.18}
[INFO|2024-12-17 13:23:53] logging.py:157 >> {'loss': 0.5329, 'learning_rate': 3.2892e-05, 'epoch': 1.19}
[INFO|2024-12-17 13:23:53] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-500
[INFO|2024-12-17 13:23:53] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:23:53] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:23:53] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-500/tokenizer_config.json
[INFO|2024-12-17 13:23:53] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-500/special_tokens_map.json
[INFO|2024-12-17 13:24:14] logging.py:157 >> {'loss': 0.5964, 'learning_rate': 3.2595e-05, 'epoch': 1.20}
[INFO|2024-12-17 13:24:34] logging.py:157 >> {'loss': 0.5315, 'learning_rate': 3.2296e-05, 'epoch': 1.22}
[INFO|2024-12-17 13:24:54] logging.py:157 >> {'loss': 0.5698, 'learning_rate': 3.1997e-05, 'epoch': 1.23}
[INFO|2024-12-17 13:25:14] logging.py:157 >> {'loss': 0.5234, 'learning_rate': 3.1697e-05, 'epoch': 1.24}
[INFO|2024-12-17 13:25:35] logging.py:157 >> {'loss': 0.5386, 'learning_rate': 3.1395e-05, 'epoch': 1.25}
[INFO|2024-12-17 13:25:55] logging.py:157 >> {'loss': 0.5136, 'learning_rate': 3.1092e-05, 'epoch': 1.26}
[INFO|2024-12-17 13:26:15] logging.py:157 >> {'loss': 0.5341, 'learning_rate': 3.0789e-05, 'epoch': 1.28}
[INFO|2024-12-17 13:26:35] logging.py:157 >> {'loss': 0.5613, 'learning_rate': 3.0485e-05, 'epoch': 1.29}
[INFO|2024-12-17 13:26:56] logging.py:157 >> {'loss': 0.5076, 'learning_rate': 3.0179e-05, 'epoch': 1.30}
[INFO|2024-12-17 13:27:16] logging.py:157 >> {'loss': 0.5115, 'learning_rate': 2.9873e-05, 'epoch': 1.31}
[INFO|2024-12-17 13:27:36] logging.py:157 >> {'loss': 0.5467, 'learning_rate': 2.9567e-05, 'epoch': 1.32}
[INFO|2024-12-17 13:27:56] logging.py:157 >> {'loss': 0.5230, 'learning_rate': 2.9259e-05, 'epoch': 1.34}
[INFO|2024-12-17 13:28:17] logging.py:157 >> {'loss': 0.5541, 'learning_rate': 2.8951e-05, 'epoch': 1.35}
[INFO|2024-12-17 13:28:37] logging.py:157 >> {'loss': 0.5303, 'learning_rate': 2.8642e-05, 'epoch': 1.36}
[INFO|2024-12-17 13:28:57] logging.py:157 >> {'loss': 0.5440, 'learning_rate': 2.8333e-05, 'epoch': 1.37}
[INFO|2024-12-17 13:29:18] logging.py:157 >> {'loss': 0.5596, 'learning_rate': 2.8023e-05, 'epoch': 1.38}
[INFO|2024-12-17 13:29:38] logging.py:157 >> {'loss': 0.5003, 'learning_rate': 2.7713e-05, 'epoch': 1.40}
[INFO|2024-12-17 13:29:58] logging.py:157 >> {'loss': 0.5629, 'learning_rate': 2.7402e-05, 'epoch': 1.41}
[INFO|2024-12-17 13:30:18] logging.py:157 >> {'loss': 0.4516, 'learning_rate': 2.7091e-05, 'epoch': 1.42}
[INFO|2024-12-17 13:30:39] logging.py:157 >> {'loss': 0.5380, 'learning_rate': 2.6779e-05, 'epoch': 1.43}
[INFO|2024-12-17 13:30:39] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-600
[INFO|2024-12-17 13:30:39] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:30:39] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:30:39] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-600/tokenizer_config.json
[INFO|2024-12-17 13:30:39] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-600/special_tokens_map.json
[INFO|2024-12-17 13:31:00] logging.py:157 >> {'loss': 0.5314, 'learning_rate': 2.6467e-05, 'epoch': 1.44}
[INFO|2024-12-17 13:31:20] logging.py:157 >> {'loss': 0.4841, 'learning_rate': 2.6156e-05, 'epoch': 1.45}
[INFO|2024-12-17 13:31:40] logging.py:157 >> {'loss': 0.5133, 'learning_rate': 2.5843e-05, 'epoch': 1.47}
[INFO|2024-12-17 13:32:00] logging.py:157 >> {'loss': 0.5309, 'learning_rate': 2.5531e-05, 'epoch': 1.48}
[INFO|2024-12-17 13:32:21] logging.py:157 >> {'loss': 0.5022, 'learning_rate': 2.5219e-05, 'epoch': 1.49}
[INFO|2024-12-17 13:32:41] logging.py:157 >> {'loss': 0.5254, 'learning_rate': 2.4906e-05, 'epoch': 1.50}
[INFO|2024-12-17 13:33:01] logging.py:157 >> {'loss': 0.5507, 'learning_rate': 2.4594e-05, 'epoch': 1.51}
[INFO|2024-12-17 13:33:21] logging.py:157 >> {'loss': 0.4943, 'learning_rate': 2.4282e-05, 'epoch': 1.53}
[INFO|2024-12-17 13:33:42] logging.py:157 >> {'loss': 0.5111, 'learning_rate': 2.3969e-05, 'epoch': 1.54}
[INFO|2024-12-17 13:34:02] logging.py:157 >> {'loss': 0.5302, 'learning_rate': 2.3657e-05, 'epoch': 1.55}
[INFO|2024-12-17 13:34:22] logging.py:157 >> {'loss': 0.5299, 'learning_rate': 2.3345e-05, 'epoch': 1.56}
[INFO|2024-12-17 13:34:43] logging.py:157 >> {'loss': 0.5453, 'learning_rate': 2.3034e-05, 'epoch': 1.57}
[INFO|2024-12-17 13:35:03] logging.py:157 >> {'loss': 0.5804, 'learning_rate': 2.2723e-05, 'epoch': 1.59}
[INFO|2024-12-17 13:35:23] logging.py:157 >> {'loss': 0.5072, 'learning_rate': 2.2412e-05, 'epoch': 1.60}
[INFO|2024-12-17 13:35:43] logging.py:157 >> {'loss': 0.5287, 'learning_rate': 2.2101e-05, 'epoch': 1.61}
[INFO|2024-12-17 13:36:04] logging.py:157 >> {'loss': 0.5343, 'learning_rate': 2.1791e-05, 'epoch': 1.62}
[INFO|2024-12-17 13:36:24] logging.py:157 >> {'loss': 0.5329, 'learning_rate': 2.1481e-05, 'epoch': 1.63}
[INFO|2024-12-17 13:36:44] logging.py:157 >> {'loss': 0.5591, 'learning_rate': 2.1172e-05, 'epoch': 1.65}
[INFO|2024-12-17 13:37:04] logging.py:157 >> {'loss': 0.5254, 'learning_rate': 2.0864e-05, 'epoch': 1.66}
[INFO|2024-12-17 13:37:25] logging.py:157 >> {'loss': 0.5212, 'learning_rate': 2.0556e-05, 'epoch': 1.67}
[INFO|2024-12-17 13:37:25] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-700
[INFO|2024-12-17 13:37:25] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:37:25] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:37:25] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-700/tokenizer_config.json
[INFO|2024-12-17 13:37:25] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-700/special_tokens_map.json
[INFO|2024-12-17 13:37:46] logging.py:157 >> {'loss': 0.5276, 'learning_rate': 2.0249e-05, 'epoch': 1.68}
[INFO|2024-12-17 13:38:06] logging.py:157 >> {'loss': 0.5203, 'learning_rate': 1.9943e-05, 'epoch': 1.69}
[INFO|2024-12-17 13:38:26] logging.py:157 >> {'loss': 0.5291, 'learning_rate': 1.9637e-05, 'epoch': 1.71}
[INFO|2024-12-17 13:38:46] logging.py:157 >> {'loss': 0.5299, 'learning_rate': 1.9333e-05, 'epoch': 1.72}
[INFO|2024-12-17 13:39:07] logging.py:157 >> {'loss': 0.5476, 'learning_rate': 1.9029e-05, 'epoch': 1.73}
[INFO|2024-12-17 13:39:27] logging.py:157 >> {'loss': 0.5333, 'learning_rate': 1.8726e-05, 'epoch': 1.74}
[INFO|2024-12-17 13:39:47] logging.py:157 >> {'loss': 0.4962, 'learning_rate': 1.8424e-05, 'epoch': 1.75}
[INFO|2024-12-17 13:40:07] logging.py:157 >> {'loss': 0.4986, 'learning_rate': 1.8123e-05, 'epoch': 1.77}
[INFO|2024-12-17 13:40:28] logging.py:157 >> {'loss': 0.5461, 'learning_rate': 1.7823e-05, 'epoch': 1.78}
[INFO|2024-12-17 13:40:48] logging.py:157 >> {'loss': 0.5392, 'learning_rate': 1.7525e-05, 'epoch': 1.79}
[INFO|2024-12-17 13:41:08] logging.py:157 >> {'loss': 0.5162, 'learning_rate': 1.7227e-05, 'epoch': 1.80}
[INFO|2024-12-17 13:41:28] logging.py:157 >> {'loss': 0.5431, 'learning_rate': 1.6931e-05, 'epoch': 1.81}
[INFO|2024-12-17 13:41:49] logging.py:157 >> {'loss': 0.5037, 'learning_rate': 1.6636e-05, 'epoch': 1.82}
[INFO|2024-12-17 13:42:09] logging.py:157 >> {'loss': 0.5164, 'learning_rate': 1.6342e-05, 'epoch': 1.84}
[INFO|2024-12-17 13:42:29] logging.py:157 >> {'loss': 0.4950, 'learning_rate': 1.6050e-05, 'epoch': 1.85}
[INFO|2024-12-17 13:42:49] logging.py:157 >> {'loss': 0.5834, 'learning_rate': 1.5759e-05, 'epoch': 1.86}
[INFO|2024-12-17 13:43:10] logging.py:157 >> {'loss': 0.5459, 'learning_rate': 1.5469e-05, 'epoch': 1.87}
[INFO|2024-12-17 13:43:30] logging.py:157 >> {'loss': 0.5160, 'learning_rate': 1.5181e-05, 'epoch': 1.88}
[INFO|2024-12-17 13:43:50] logging.py:157 >> {'loss': 0.5519, 'learning_rate': 1.4894e-05, 'epoch': 1.90}
[INFO|2024-12-17 13:44:10] logging.py:157 >> {'loss': 0.5173, 'learning_rate': 1.4609e-05, 'epoch': 1.91}
[INFO|2024-12-17 13:44:10] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-800
[INFO|2024-12-17 13:44:10] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:44:10] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:44:11] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-800/tokenizer_config.json
[INFO|2024-12-17 13:44:11] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-800/special_tokens_map.json
[INFO|2024-12-17 13:44:31] logging.py:157 >> {'loss': 0.5170, 'learning_rate': 1.4326e-05, 'epoch': 1.92}
[INFO|2024-12-17 13:44:52] logging.py:157 >> {'loss': 0.4906, 'learning_rate': 1.4044e-05, 'epoch': 1.93}
[INFO|2024-12-17 13:45:12] logging.py:157 >> {'loss': 0.5598, 'learning_rate': 1.3765e-05, 'epoch': 1.94}
[INFO|2024-12-17 13:45:32] logging.py:157 >> {'loss': 0.5276, 'learning_rate': 1.3486e-05, 'epoch': 1.96}
[INFO|2024-12-17 13:45:52] logging.py:157 >> {'loss': 0.4966, 'learning_rate': 1.3210e-05, 'epoch': 1.97}
[INFO|2024-12-17 13:46:13] logging.py:157 >> {'loss': 0.5133, 'learning_rate': 1.2935e-05, 'epoch': 1.98}
[INFO|2024-12-17 13:46:33] logging.py:157 >> {'loss': 0.4814, 'learning_rate': 1.2663e-05, 'epoch': 1.99}
[INFO|2024-12-17 13:46:53] logging.py:157 >> {'loss': 0.6167, 'learning_rate': 1.2392e-05, 'epoch': 2.00}
[INFO|2024-12-17 13:47:13] logging.py:157 >> {'loss': 0.4915, 'learning_rate': 1.2123e-05, 'epoch': 2.02}
[INFO|2024-12-17 13:47:34] logging.py:157 >> {'loss': 0.5116, 'learning_rate': 1.1856e-05, 'epoch': 2.03}
[INFO|2024-12-17 13:47:54] logging.py:157 >> {'loss': 0.4754, 'learning_rate': 1.1592e-05, 'epoch': 2.04}
[INFO|2024-12-17 13:48:14] logging.py:157 >> {'loss': 0.4426, 'learning_rate': 1.1329e-05, 'epoch': 2.05}
[INFO|2024-12-17 13:48:34] logging.py:157 >> {'loss': 0.5026, 'learning_rate': 1.1069e-05, 'epoch': 2.06}
[INFO|2024-12-17 13:48:54] logging.py:157 >> {'loss': 0.4872, 'learning_rate': 1.0810e-05, 'epoch': 2.08}
[INFO|2024-12-17 13:49:15] logging.py:157 >> {'loss': 0.5022, 'learning_rate': 1.0554e-05, 'epoch': 2.09}
[INFO|2024-12-17 13:49:35] logging.py:157 >> {'loss': 0.5388, 'learning_rate': 1.0300e-05, 'epoch': 2.10}
[INFO|2024-12-17 13:49:55] logging.py:157 >> {'loss': 0.4810, 'learning_rate': 1.0049e-05, 'epoch': 2.11}
[INFO|2024-12-17 13:50:15] logging.py:157 >> {'loss': 0.4829, 'learning_rate': 9.7996e-06, 'epoch': 2.12}
[INFO|2024-12-17 13:50:36] logging.py:157 >> {'loss': 0.4902, 'learning_rate': 9.5527e-06, 'epoch': 2.13}
[INFO|2024-12-17 13:50:56] logging.py:157 >> {'loss': 0.5297, 'learning_rate': 9.3083e-06, 'epoch': 2.15}
[INFO|2024-12-17 13:50:56] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-900
[INFO|2024-12-17 13:50:56] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:50:56] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:50:56] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-900/tokenizer_config.json
[INFO|2024-12-17 13:50:56] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-900/special_tokens_map.json
[INFO|2024-12-17 13:51:17] logging.py:157 >> {'loss': 0.4845, 'learning_rate': 9.0663e-06, 'epoch': 2.16}
[INFO|2024-12-17 13:51:37] logging.py:157 >> {'loss': 0.5086, 'learning_rate': 8.8268e-06, 'epoch': 2.17}
[INFO|2024-12-17 13:51:57] logging.py:157 >> {'loss': 0.5136, 'learning_rate': 8.5899e-06, 'epoch': 2.18}
[INFO|2024-12-17 13:52:18] logging.py:157 >> {'loss': 0.5137, 'learning_rate': 8.3555e-06, 'epoch': 2.19}
[INFO|2024-12-17 13:52:38] logging.py:157 >> {'loss': 0.5032, 'learning_rate': 8.1237e-06, 'epoch': 2.21}
[INFO|2024-12-17 13:52:58] logging.py:157 >> {'loss': 0.4658, 'learning_rate': 7.8945e-06, 'epoch': 2.22}
[INFO|2024-12-17 13:53:18] logging.py:157 >> {'loss': 0.5261, 'learning_rate': 7.6680e-06, 'epoch': 2.23}
[INFO|2024-12-17 13:53:39] logging.py:157 >> {'loss': 0.5216, 'learning_rate': 7.4442e-06, 'epoch': 2.24}
[INFO|2024-12-17 13:53:59] logging.py:157 >> {'loss': 0.4949, 'learning_rate': 7.2232e-06, 'epoch': 2.25}
[INFO|2024-12-17 13:54:19] logging.py:157 >> {'loss': 0.4640, 'learning_rate': 7.0049e-06, 'epoch': 2.27}
[INFO|2024-12-17 13:54:39] logging.py:157 >> {'loss': 0.5057, 'learning_rate': 6.7895e-06, 'epoch': 2.28}
[INFO|2024-12-17 13:54:59] logging.py:157 >> {'loss': 0.4875, 'learning_rate': 6.5769e-06, 'epoch': 2.29}
[INFO|2024-12-17 13:55:20] logging.py:157 >> {'loss': 0.4753, 'learning_rate': 6.3671e-06, 'epoch': 2.30}
[INFO|2024-12-17 13:55:40] logging.py:157 >> {'loss': 0.4725, 'learning_rate': 6.1603e-06, 'epoch': 2.31}
[INFO|2024-12-17 13:56:00] logging.py:157 >> {'loss': 0.5095, 'learning_rate': 5.9564e-06, 'epoch': 2.33}
[INFO|2024-12-17 13:56:20] logging.py:157 >> {'loss': 0.5099, 'learning_rate': 5.7555e-06, 'epoch': 2.34}
[INFO|2024-12-17 13:56:41] logging.py:157 >> {'loss': 0.4744, 'learning_rate': 5.5576e-06, 'epoch': 2.35}
[INFO|2024-12-17 13:57:01] logging.py:157 >> {'loss': 0.5126, 'learning_rate': 5.3627e-06, 'epoch': 2.36}
[INFO|2024-12-17 13:57:21] logging.py:157 >> {'loss': 0.4636, 'learning_rate': 5.1709e-06, 'epoch': 2.37}
[INFO|2024-12-17 13:57:41] logging.py:157 >> {'loss': 0.4817, 'learning_rate': 4.9822e-06, 'epoch': 2.39}
[INFO|2024-12-17 13:57:41] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1000
[INFO|2024-12-17 13:57:41] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 13:57:41] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 13:57:41] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1000/tokenizer_config.json
[INFO|2024-12-17 13:57:41] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1000/special_tokens_map.json
[INFO|2024-12-17 13:58:02] logging.py:157 >> {'loss': 0.5238, 'learning_rate': 4.7966e-06, 'epoch': 2.40}
[INFO|2024-12-17 13:58:22] logging.py:157 >> {'loss': 0.4462, 'learning_rate': 4.6142e-06, 'epoch': 2.41}
[INFO|2024-12-17 13:58:43] logging.py:157 >> {'loss': 0.4730, 'learning_rate': 4.4350e-06, 'epoch': 2.42}
[INFO|2024-12-17 13:59:03] logging.py:157 >> {'loss': 0.5256, 'learning_rate': 4.2589e-06, 'epoch': 2.43}
[INFO|2024-12-17 13:59:23] logging.py:157 >> {'loss': 0.4561, 'learning_rate': 4.0861e-06, 'epoch': 2.44}
[INFO|2024-12-17 13:59:43] logging.py:157 >> {'loss': 0.5378, 'learning_rate': 3.9166e-06, 'epoch': 2.46}
[INFO|2024-12-17 14:00:04] logging.py:157 >> {'loss': 0.5014, 'learning_rate': 3.7504e-06, 'epoch': 2.47}
[INFO|2024-12-17 14:00:24] logging.py:157 >> {'loss': 0.4967, 'learning_rate': 3.5875e-06, 'epoch': 2.48}
[INFO|2024-12-17 14:00:44] logging.py:157 >> {'loss': 0.4403, 'learning_rate': 3.4279e-06, 'epoch': 2.49}
[INFO|2024-12-17 14:01:04] logging.py:157 >> {'loss': 0.5167, 'learning_rate': 3.2717e-06, 'epoch': 2.50}
[INFO|2024-12-17 14:01:25] logging.py:157 >> {'loss': 0.4748, 'learning_rate': 3.1189e-06, 'epoch': 2.52}
[INFO|2024-12-17 14:01:45] logging.py:157 >> {'loss': 0.5000, 'learning_rate': 2.9695e-06, 'epoch': 2.53}
[INFO|2024-12-17 14:02:05] logging.py:157 >> {'loss': 0.4844, 'learning_rate': 2.8235e-06, 'epoch': 2.54}
[INFO|2024-12-17 14:02:25] logging.py:157 >> {'loss': 0.4588, 'learning_rate': 2.6810e-06, 'epoch': 2.55}
[INFO|2024-12-17 14:02:45] logging.py:157 >> {'loss': 0.4561, 'learning_rate': 2.5420e-06, 'epoch': 2.56}
[INFO|2024-12-17 14:03:06] logging.py:157 >> {'loss': 0.4869, 'learning_rate': 2.4065e-06, 'epoch': 2.58}
[INFO|2024-12-17 14:03:26] logging.py:157 >> {'loss': 0.4966, 'learning_rate': 2.2746e-06, 'epoch': 2.59}
[INFO|2024-12-17 14:03:46] logging.py:157 >> {'loss': 0.4629, 'learning_rate': 2.1461e-06, 'epoch': 2.60}
[INFO|2024-12-17 14:04:06] logging.py:157 >> {'loss': 0.5041, 'learning_rate': 2.0213e-06, 'epoch': 2.61}
[INFO|2024-12-17 14:04:27] logging.py:157 >> {'loss': 0.5230, 'learning_rate': 1.9000e-06, 'epoch': 2.62}
[INFO|2024-12-17 14:04:27] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1100
[INFO|2024-12-17 14:04:27] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 14:04:27] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 14:04:27] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1100/tokenizer_config.json
[INFO|2024-12-17 14:04:27] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1100/special_tokens_map.json
[INFO|2024-12-17 14:04:47] logging.py:157 >> {'loss': 0.4703, 'learning_rate': 1.7824e-06, 'epoch': 2.64}
[INFO|2024-12-17 14:05:08] logging.py:157 >> {'loss': 0.4987, 'learning_rate': 1.6683e-06, 'epoch': 2.65}
[INFO|2024-12-17 14:05:28] logging.py:157 >> {'loss': 0.5113, 'learning_rate': 1.5579e-06, 'epoch': 2.66}
[INFO|2024-12-17 14:05:48] logging.py:157 >> {'loss': 0.4720, 'learning_rate': 1.4512e-06, 'epoch': 2.67}
[INFO|2024-12-17 14:06:08] logging.py:157 >> {'loss': 0.5110, 'learning_rate': 1.3482e-06, 'epoch': 2.68}
[INFO|2024-12-17 14:06:28] logging.py:157 >> {'loss': 0.5292, 'learning_rate': 1.2488e-06, 'epoch': 2.70}
[INFO|2024-12-17 14:06:49] logging.py:157 >> {'loss': 0.4581, 'learning_rate': 1.1532e-06, 'epoch': 2.71}
[INFO|2024-12-17 14:07:09] logging.py:157 >> {'loss': 0.5219, 'learning_rate': 1.0612e-06, 'epoch': 2.72}
[INFO|2024-12-17 14:07:29] logging.py:157 >> {'loss': 0.4821, 'learning_rate': 9.7306e-07, 'epoch': 2.73}
[INFO|2024-12-17 14:07:49] logging.py:157 >> {'loss': 0.4483, 'learning_rate': 8.8862e-07, 'epoch': 2.74}
[INFO|2024-12-17 14:08:10] logging.py:157 >> {'loss': 0.5451, 'learning_rate': 8.0795e-07, 'epoch': 2.75}
[INFO|2024-12-17 14:08:30] logging.py:157 >> {'loss': 0.4644, 'learning_rate': 7.3106e-07, 'epoch': 2.77}
[INFO|2024-12-17 14:08:50] logging.py:157 >> {'loss': 0.5294, 'learning_rate': 6.5796e-07, 'epoch': 2.78}
[INFO|2024-12-17 14:09:10] logging.py:157 >> {'loss': 0.4743, 'learning_rate': 5.8866e-07, 'epoch': 2.79}
[INFO|2024-12-17 14:09:30] logging.py:157 >> {'loss': 0.5137, 'learning_rate': 5.2317e-07, 'epoch': 2.80}
[INFO|2024-12-17 14:09:51] logging.py:157 >> {'loss': 0.4973, 'learning_rate': 4.6151e-07, 'epoch': 2.81}
[INFO|2024-12-17 14:10:11] logging.py:157 >> {'loss': 0.4836, 'learning_rate': 4.0368e-07, 'epoch': 2.83}
[INFO|2024-12-17 14:10:31] logging.py:157 >> {'loss': 0.5003, 'learning_rate': 3.4968e-07, 'epoch': 2.84}
[INFO|2024-12-17 14:10:51] logging.py:157 >> {'loss': 0.4861, 'learning_rate': 2.9954e-07, 'epoch': 2.85}
[INFO|2024-12-17 14:11:12] logging.py:157 >> {'loss': 0.4464, 'learning_rate': 2.5325e-07, 'epoch': 2.86}
[INFO|2024-12-17 14:11:12] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1200
[INFO|2024-12-17 14:11:12] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 14:11:12] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 14:11:12] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1200/tokenizer_config.json
[INFO|2024-12-17 14:11:12] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1200/special_tokens_map.json
[INFO|2024-12-17 14:11:33] logging.py:157 >> {'loss': 0.5013, 'learning_rate': 2.1083e-07, 'epoch': 2.87}
[INFO|2024-12-17 14:11:53] logging.py:157 >> {'loss': 0.4679, 'learning_rate': 1.7228e-07, 'epoch': 2.89}
[INFO|2024-12-17 14:12:13] logging.py:157 >> {'loss': 0.4972, 'learning_rate': 1.3761e-07, 'epoch': 2.90}
[INFO|2024-12-17 14:12:33] logging.py:157 >> {'loss': 0.4953, 'learning_rate': 1.0682e-07, 'epoch': 2.91}
[INFO|2024-12-17 14:12:54] logging.py:157 >> {'loss': 0.4842, 'learning_rate': 7.9911e-08, 'epoch': 2.92}
[INFO|2024-12-17 14:13:14] logging.py:157 >> {'loss': 0.4684, 'learning_rate': 5.6899e-08, 'epoch': 2.93}
[INFO|2024-12-17 14:13:34] logging.py:157 >> {'loss': 0.4477, 'learning_rate': 3.7781e-08, 'epoch': 2.95}
[INFO|2024-12-17 14:13:54] logging.py:157 >> {'loss': 0.4787, 'learning_rate': 2.2562e-08, 'epoch': 2.96}
[INFO|2024-12-17 14:14:15] logging.py:157 >> {'loss': 0.5320, 'learning_rate': 1.1243e-08, 'epoch': 2.97}
[INFO|2024-12-17 14:14:35] logging.py:157 >> {'loss': 0.5079, 'learning_rate': 3.8258e-09, 'epoch': 2.98}
[INFO|2024-12-17 14:14:55] logging.py:157 >> {'loss': 0.4780, 'learning_rate': 3.1232e-10, 'epoch': 2.99}
[INFO|2024-12-17 14:15:03] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1257
[INFO|2024-12-17 14:15:03] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 14:15:03] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 14:15:03] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1257/tokenizer_config.json
[INFO|2024-12-17 14:15:03] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/checkpoint-1257/special_tokens_map.json
[INFO|2024-12-17 14:15:04] trainer.py:2584 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
[INFO|2024-12-17 14:15:04] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28
[INFO|2024-12-17 14:15:04] configuration_utils.py:677 >> loading configuration file /media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct/config.json
[INFO|2024-12-17 14:15:04] configuration_utils.py:746 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|2024-12-17 14:15:04] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/tokenizer_config.json
[INFO|2024-12-17 14:15:04] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28/special_tokens_map.json
[WARNING|2024-12-17 14:15:04] logging.py:162 >> No metric eval_loss to plot.
[WARNING|2024-12-17 14:15:04] logging.py:162 >> No metric eval_accuracy to plot.
[INFO|2024-12-17 14:15:04] modelcard.py:449 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
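
The finished adapter is saved in the run directory above (saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28); a minimal sketch for merging it into the base weights and generating, where the chat-template call is standard transformers usage rather than something this run logged:

# Minimal sketch: merge the final LoRA adapter into the base model and generate.
# Paths come from this log; the prompt below is purely an example.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "/media/omnisky/Extreme SSD/hzq/LLMmodels/Qwen2-7B-Instruct"
ADAPTER = "saves/Qwen2-7B-Instruct/lora/train_2024-12-17-12-48-28"

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, ADAPTER).merge_and_unload()
model = model.to("cuda:0").eval()

messages = [{"role": "user", "content": "Hello"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
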