license: mit

This is a reproduction of OpenChat (https://github.com/imoneoi/openchat).

Training command:

deepspeed --num_gpus=8 --module ochat.training_deepspeed.train \
          --model_path imone/Mistral_7B_with_EOT_token \
          --data_prefix ./data/ \
          --save_path ./checkpoints/mistral-7b/ \
          --batch_max_len 77824 \
          --epochs 10 \
          --save_every 1 \
          --deepspeed \
          --deepspeed_config deepspeed_config.json
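
As a rough sanity check on the batch size implied by this command, the sketch below multiplies the number of GPUs by `--batch_max_len` and the gradient accumulation steps. It assumes that `--batch_max_len` is a per-GPU token budget for packed sequences, which is not stated in this README; the helper name `tokens_per_step` is hypothetical and not part of `ochat`.

```python
# Values copied from the training command and deepspeed_config.json in this README.
NUM_GPUS = 8              # --num_gpus
BATCH_MAX_LEN = 77824     # --batch_max_len
GRAD_ACCUM_STEPS = 1      # "gradient_accumulation_steps" in deepspeed_config.json


def tokens_per_step(num_gpus: int, batch_max_len: int, grad_accum_steps: int = 1) -> int:
    """Upper bound on tokens consumed per optimizer step.

    ASSUMPTION: batch_max_len is the maximum number of tokens packed
    onto each GPU per micro-batch. If it is a global budget instead,
    drop the num_gpus factor.
    """
    return num_gpus * batch_max_len * grad_accum_steps


if __name__ == "__main__":
    print(tokens_per_step(NUM_GPUS, BATCH_MAX_LEN, GRAD_ACCUM_STEPS))
```

Under that assumption, the figure is an upper bound: packed batches rarely fill the budget exactly, so the actual token count per step would be somewhat lower.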

deepspeed_config.json:

{
    "bf16": {
        "enabled": true
    },
    "zero_optimization": {
        "stage": 2
    },
    "gradient_clipping": 1.0,
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "steps_per_print": 100,
    "wall_clock_breakdown": false
}
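
One way to avoid copy-paste errors is to generate this file from a script and check it before launching. The following is a minimal sketch using only the Python standard library; the function names `write_config` and `check_config` are hypothetical helpers, not part of DeepSpeed or `ochat`, and the checks cover only the two settings most likely to matter here.

```python
import json
from pathlib import Path

# Same settings as the deepspeed_config.json shown above.
CONFIG = {
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "gradient_clipping": 1.0,
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "steps_per_print": 100,
    "wall_clock_breakdown": False,
}


def write_config(path="deepspeed_config.json") -> Path:
    """Serialize CONFIG as JSON to `path` and return the path."""
    path = Path(path)
    path.write_text(json.dumps(CONFIG, indent=4) + "\n")
    return path


def check_config(path) -> dict:
    """Load the JSON file and verify the precision and ZeRO settings."""
    cfg = json.loads(Path(path).read_text())
    if cfg["bf16"]["enabled"] is not True:
        raise ValueError("bf16 is expected to be enabled")
    if cfg["zero_optimization"]["stage"] != 2:
        raise ValueError("ZeRO stage 2 is expected")
    return cfg
```

Calling `check_config(write_config())` writes the file into the current directory, where the `--deepspeed_config deepspeed_config.json` argument in the training command expects it.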

The training data is the openchat_sharegpt4_dataset: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset