francislabounty committed
Commit: d614686
Parent(s): 03c7d6f
Update README.md

README.md CHANGED
@@ -5,10 +5,20 @@ datasets:
 language:
 - en
 ---
-
+## Training
+- 8x A6000s
+- [Forked version of unsloth](https://github.com/serp-ai/unsloth) for efficient training
+- Sequence Length: 4096
+- Effective batch size: 128
+- Learning Rate: 2e-5 with linear decay
+- Epochs: 1
+- Base model trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
+
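The effective batch size of 128 is presumably reached by combining the 8 GPUs with per-device batching and gradient accumulation; the commit only states the GPU count and the effective size, so the split below is an assumption, not from the diff:

```python
# Hypothetical breakdown: the commit states only "8x A6000s" and
# "Effective batch size: 128"; the per-device batch and accumulation
# steps here are illustrative assumptions.
num_gpus = 8
per_device_batch = 2   # assumption
grad_accum_steps = 8   # assumption

effective_batch = num_gpus * per_device_batch * grad_accum_steps
print(effective_batch)  # 128
```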
+## Prompt Format
 ```
 <|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n
 ```
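The ChatML-style template above can be filled in with a small helper; this is a sketch, and the `build_prompt` name is ours rather than part of the repository:

```python
# Sketch of applying the ChatML-style prompt format shown above.
# `build_prompt` is a hypothetical helper, not an API from this repo.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(build_prompt("You are a helpful assistant.", "Hello!"))
```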
+
 ## Usage
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer