---
license: mit
---
This model is a reproduction of OpenChat (https://github.com/imoneoi/openchat), fine-tuned from Mistral 7B with an added end-of-turn (EOT) token.
Training command:

```bash
deepspeed --num_gpus=8 --module ochat.training_deepspeed.train \
    --model_path imone/Mistral_7B_with_EOT_token \
    --data_prefix ./data/ \
    --save_path ./checkpoints/mistral-7b/ \
    --batch_max_len 77824 \
    --epochs 10 \
    --save_every 1 \
    --deepspeed \
    --deepspeed_config deepspeed_config.json
```
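
For reference, here is a minimal sketch of loading the resulting checkpoint for inference with `transformers`. It assumes the final checkpoint under `./checkpoints/mistral-7b/` (the `--save_path` above) is saved in Hugging Face format; the single-turn prompt template shown is an OpenChat-style assumption for illustration, not taken from this repo:

```python
# Minimal inference sketch (assumes the checkpoint is in HF format).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "./checkpoints/mistral-7b/"  # --save_path from the command above
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

# OpenChat-style single-turn prompt; the exact template depends on the
# conversation format used during training (assumption for illustration).
prompt = "GPT4 User: Hello<|end_of_turn|>GPT4 Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```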
`deepspeed_config.json`:

```json
{
    "bf16": {
        "enabled": true
    },
    "zero_optimization": {
        "stage": 2
    },
    "gradient_clipping": 1.0,
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "steps_per_print": 100,
    "wall_clock_breakdown": false
}
```
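
With this config, each of the 8 GPUs processes one micro-batch per step with no gradient accumulation, so the effective batch size is 8 micro-batches per step; the `--batch_max_len 77824` flag above bounds the number of tokens packed into each micro-batch. For illustration, here is a sketch of how DeepSpeed consumes such a config; the actual training loop lives in `ochat.training_deepspeed.train`, and the optimizer setup and learning rate below are hypothetical stand-ins (ZeRO stage 2 needs an optimizer, which this JSON does not define):

```python
# Sketch of DeepSpeed engine construction from the JSON above; run under the
# deepspeed launcher as in the training command. Illustrative only.
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("imone/Mistral_7B_with_EOT_token")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # hypothetical lr

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    config="deepspeed_config.json",  # enables bf16 and ZeRO stage 2
)
# Training then proceeds via engine.backward(loss) and engine.step().
```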
The training data is https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset.
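
To fetch the raw dataset files, something like the following works; converting them into the tokenized files expected under `--data_prefix` (`./data/` above) is a separate preprocessing step done with the openchat repo's data tooling:

```python
# Sketch: download the raw dataset files with huggingface_hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="openchat/openchat_sharegpt4_dataset",
    repo_type="dataset",
)
print(local_dir)  # local path containing the downloaded dataset files
```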