serpdotai/sparsetral-16x7B-v2

francislabounty committed on Feb 5

Commit 357aeab • 1 Parent(s): d614686

Update README.md

Files changed (1)

README.md +2 -0

@@ -13,6 +13,8 @@ language:
 - Learning Rate: 2e-5 with linear decay
 - Epochs: 1
 - Base model trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
+- Num Experts: 16
+- Top K: 4
 ## Prompt Format
 ```