francislabounty committed
Commit: d614686
Parent(s): 03c7d6f
Update README.md

README.md CHANGED
@@ -5,10 +5,20 @@ datasets:
 language:
 - en
 ---
-
+## Training
+- 8x A6000s
+- [Forked version of unsloth](https://github.com/serp-ai/unsloth) for efficient training
+- Sequence Length: 4096
+- Effective batch size: 128
+- Learning Rate: 2e-5 with linear decay
+- Epochs: 1
+- Base model trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
+
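The effective batch size of 128 is presumably reached by combining the 8 GPUs with per-device batching and gradient accumulation; the commit only states the GPU count and the effective size, so the split below is an assumption, not from the diff:

```python
# Hypothetical breakdown: the commit states only "8x A6000s" and
# "Effective batch size: 128"; the per-device batch and accumulation
# steps here are illustrative assumptions.
num_gpus = 8
per_device_batch = 2   # assumption
grad_accum_steps = 8   # assumption

effective_batch = num_gpus * per_device_batch * grad_accum_steps
print(effective_batch)  # 128
```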
+## Prompt Format
 ```
 <|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n
 ```
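The ChatML-style template above can be filled in with a small helper; this is a sketch, and the `build_prompt` name is ours rather than part of the repository:

```python
# Sketch of applying the ChatML-style prompt format shown above.
# `build_prompt` is a hypothetical helper, not an API from this repo.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(build_prompt("You are a helpful assistant.", "Hello!"))
```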
+
 ## Usage
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer