danielpark committed · Commit f5137f5 · Parent(s): 3d23d9e
Update README.md

README.md CHANGED
@@ -11,9 +11,9 @@ tags:
 
 # Expert weights of [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
 
-Required Weights for
+Required weights for follow-up research.
 
-The original model is **[AI21lab's Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)**, which requires an
+The original model is **[AI21lab's Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)**, which requires more than **80 GB of VRAM**. Unfortunately, this is almost never available via Google Colab or other cloud computing services. Thus, attempts were made to perform **MoE (Mixture of Experts) splitting**, using the following resources as a basis:
 - **Original Model:** [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
 - **MoE Layer Separation:** Consult [this script](https://github.com/TechxGenus/Jamba-utils/blob/main/dense_downcycling.py) written by [@TechxGenus](https://github.com/TechxGenus) and use [TechxGenus/Jamba-v0.1-9B](https://huggingface.co/TechxGenus/Jamba-v0.1-9B).
 
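For reference, below is a minimal sketch of how expert weights could be filtered out of a locally downloaded Jamba-v0.1 checkpoint, shard by shard. This is not the dense-downcycling script linked above; the `.experts.` key pattern, the local directory names, and the shard layout are assumptions about the checkpoint, not verified details of this repository's extraction process.

```python
# Minimal sketch: copy only the MoE expert tensors out of a local
# Jamba-v0.1 snapshot, shard by shard. The ".experts." key pattern and
# the directory names are assumptions, not verified repository details.
import os

from safetensors import safe_open
from safetensors.torch import save_file

CKPT_DIR = "Jamba-v0.1"            # local snapshot of ai21labs/Jamba-v0.1 (assumed path)
OUT_DIR = "jamba-expert-weights"   # destination for the filtered shards
os.makedirs(OUT_DIR, exist_ok=True)

for fname in sorted(os.listdir(CKPT_DIR)):
    if not fname.endswith(".safetensors"):
        continue
    experts = {}
    # safe_open reads lazily, so only matching tensors are materialized
    with safe_open(os.path.join(CKPT_DIR, fname), framework="pt", device="cpu") as f:
        for key in f.keys():
            if ".experts." in key:  # keep only the MoE expert branches
                experts[key] = f.get_tensor(key)
    if experts:
        save_file(experts, os.path.join(OUT_DIR, fname))
        print(f"{fname}: kept {len(experts)} expert tensors")
```

Filtering shard by shard with safetensors keeps peak memory low, since only the tensors whose keys match are ever loaded into RAM.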