---
license: mit
---

A reproduction of https://github.com/imoneoi/openchat.

Training command:

```bash
deepspeed --num_gpus=8 --module ochat.training_deepspeed.train \
  --model_path imone/Mistral_7B_with_EOT_token \
  --data_prefix ./data/ \
  --save_path ./checkpoints/mistral-7b/ \
  --batch_max_len 77824 \
  --epochs 10 \
  --save_every 1 \
  --deepspeed \
  --deepspeed_config deepspeed_config.json
```

`deepspeed_config.json`:

```json
{
    "bf16": {
        "enabled": true
    },
    "zero_optimization": {
        "stage": 2
    },
    "gradient_clipping": 1.0,
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "steps_per_print": 100,
    "wall_clock_breakdown": false
}
```
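
If hand-editing the JSON is error-prone, the file can be generated from Python instead. The following is a minimal sketch that uses only the values shown above. The helper name `write_deepspeed_config` is hypothetical and is not part of OpenChat or DeepSpeed.

```python
import json
from pathlib import Path

# Values copied from the deepspeed_config.json shown above.
DEEPSPEED_CONFIG = {
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "gradient_clipping": 1.0,
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "steps_per_print": 100,
    "wall_clock_breakdown": False,
}


def write_deepspeed_config(path="deepspeed_config.json"):
    """Write the config to `path` and return the parsed file contents.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    path.write_text(json.dumps(DEEPSPEED_CONFIG, indent=4) + "\n")
    # Re-read the file so a malformed write fails here, not at launch time.
    return json.loads(path.read_text())
```

Calling `write_deepspeed_config()` from the directory where the training command is run should produce the file that `--deepspeed_config deepspeed_config.json` refers to.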

Training data: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset