francislabounty committed
Commit: 357aeab
1 Parent(s): d614686
Update README.md
README.md CHANGED
    @@ -13,6 +13,8 @@ language:
     - Learning Rate: 2e-5 with linear decay
     - Epochs: 1
     - Base model trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
    +- Num Experts: 16
    +- Top K: 4
     
     ## Prompt Format
     ```
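
The two settings added in this commit (`Num Experts: 16`, `Top K: 4`) suggest a sparse mixture-of-experts setup in which a router selects 4 of 16 experts per token. The following is only a minimal sketch of that idea, not the model's actual routing code: the function name `route_token`, the choice to apply softmax over just the selected logits, and the random logits in the demo are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 16  # "Num Experts: 16" from the README change
TOP_K = 4         # "Top K: 4" from the README change


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for one token.

    Assumption: weights are a softmax over only the selected logits.
    The real model may normalize differently (hypothetical helper).
    """
    if top_k > len(router_logits):
        raise ValueError("top_k cannot exceed the number of experts")
    # Rank expert indices by router score, highest first.
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in router scores; a real router would compute these from
    # the token's hidden state.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for expert, weight in route_token(logits):
        print(f"expert {expert:2d}  weight {weight:.3f}")
```

With these values, each token would be processed by only 4 of the 16 experts, and the outputs of the selected experts would be combined using the returned weights.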