danielpark committed on
Commit f5137f5
1 Parent(s): 3d23d9e

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -11,9 +11,9 @@ tags:
 
 # Expert weights of [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
 
-Required Weights for Follow-up Research
+Required weights for follow-up research.
 
-The original model is **[AI21lab's Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)**, which requires an **A100 80GB GPU**. Unfortunately, this almonst was not available via Google Colab or cloud computing services. Thus, attempts were made to perform **MoE (Mixture of Experts) splitting**, using the following resources as a basis:
+The original model is **[AI21 Labs' Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)**, which requires more than **80GB of VRAM**. Unfortunately, this is mostly unavailable via Google Colab or cloud computing services. Thus, attempts were made to perform **MoE (Mixture of Experts) splitting**, using the following resources as a basis:
 - **Original Model:** [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
 - **MoE Layer Separation**: Consult [this script](https://github.com/TechxGenus/Jamba-utils/blob/main/dense_downcycling.py) written by [@TechxGenus](https://github.com/TechxGenus) and use [TechxGenus/Jamba-v0.1-9B](https://huggingface.co/TechxGenus/Jamba-v0.1-9B).
 
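
For a quick check of the separated weights, the following is a minimal sketch of loading the dense-downcycled [TechxGenus/Jamba-v0.1-9B](https://huggingface.co/TechxGenus/Jamba-v0.1-9B) checkpoint referenced above with `transformers`. The `torch_dtype` and `device_map` settings are assumptions about a typical single-GPU setup, not part of this repository or the original model card.

```python
# Minimal sketch (assumption: a transformers version with Jamba support and
# enough GPU/CPU memory for the ~9B-parameter checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TechxGenus/Jamba-v0.1-9B"  # single-expert variant linked in the README

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the available hardware
    device_map="auto",           # spread layers across available devices
)

prompt = "Jamba is a hybrid SSM-Transformer model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```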