---
license: mit
---

A reproduction of https://github.com/imoneoi/openchat.

Training command:
```bash
deepspeed --num_gpus=8 --module ochat.training_deepspeed.train \
    --model_path imone/Mistral_7B_with_EOT_token \
    --data_prefix ./data/ \
    --save_path ./checkpoints/mistral-7b/ \
    --batch_max_len 77824 \
    --epochs 10 \
    --save_every 1 \
    --deepspeed \
    --deepspeed_config deepspeed_config.json
```
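
The command assumes a single node with 8 visible GPUs and a working DeepSpeed install. A quick pre-flight check along these lines (not part of the original recipe) can confirm both before launching:

```python
# Optional pre-flight check, not part of the original recipe.
import torch
import deepspeed

print("visible GPUs:", torch.cuda.device_count())   # the command expects 8
print("deepspeed version:", deepspeed.__version__)
```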

`deepspeed_config.json`:
```json
{
  "bf16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2
  },
  "gradient_clipping": 1.0,
  "gradient_accumulation_steps": 1,
  "train_micro_batch_size_per_gpu": 1,
  "steps_per_print": 100,
  "wall_clock_breakdown": false
}
```
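
Once training finishes, the checkpoints written to `--save_path` should be loadable with `transformers`. A minimal sketch, assuming the trainer saves a standard Hugging Face checkpoint under `./checkpoints/mistral-7b/` (the exact subdirectory layout, e.g. per-epoch folders, is not specified here):

```python
# Minimal inference sketch; the checkpoint path is an assumption based on
# --save_path above, and real prompts should follow the OpenChat conversation
# template rather than the plain string used here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "./checkpoints/mistral-7b/"  # adjust to the actual saved checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```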

Training data: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset
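
The training command above expects tokenized data under `--data_prefix ./data/`; converting the raw ShareGPT-style conversations into that format is handled by the upstream openchat repository's data pipeline and is not reproduced here. To just pull the raw dataset files locally for inspection, something like this works (a sketch, not part of the original card):

```python
# Download the raw dataset files from the Hub for inspection.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="openchat/openchat_sharegpt4_dataset",
    repo_type="dataset",
)
print("dataset files downloaded to:", local_dir)
```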