liushaowei committed · Commit 579a3a3 · Parent: 4308c42

update readme

README.md CHANGED
````diff
@@ -3,14 +3,14 @@ license: mit
 library_name: transformers
 ---
 <div align="center">
-  <a href="https://github.com/MoonshotAI/
+  <a href="https://github.com/MoonshotAI/Moonlight"><img width="80%" src="figures/banner.png"></a>
 </div>
 
 <!-- # Muon is Scalable For LLM Training -->
 
 <div align="center">
-  <a href="https://github.com/MoonshotAI/
-  <a href="https://huggingface.co/moonshotai/Moonlight"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> HuggingFace</b></a> |
+  <a href="https://github.com/MoonshotAI/Moonlight/blob/master/Moonlight.pdf"><img src="figures/logo.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Tech Report</b></a> |
+  <a href="https://huggingface.co/moonshotai/Moonlight-16B-A3B"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> HuggingFace</b></a> |
   <a href="#"><img src="figures/megatron.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;">Megatron(coming soon)</b></a>
 </div>
 
@@ -85,8 +85,8 @@ We compared Moonlight with SOTA public models at similar scale:
 
 | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download Link** |
 | :------------: | :------------: | :------------: | :------------: | :------------: |
-| Moonlight | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight) |
-| Moonlight-Instruct | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-Instruct) |
+| Moonlight-16B-A3B | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-16B-A3B) |
+| Moonlight-16B-A3B-Instruct | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-16B-A3B-Instruct) |
 
 </div>
 
@@ -94,7 +94,7 @@ We compared Moonlight with SOTA public models at similar scale:
 
 We introduce how to use our model at inference stage using transformers library. It is recommended to use python=3.10, torch>=2.1.0, and the latest version of transformers as the development environment.
 
-For our pretrained model (Moonlight):
+For our pretrained model (Moonlight-16B-A3B):
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
@@ -113,7 +113,7 @@ generated_ids = model.generate(**inputs, max_new_tokens=100)
 response = tokenizer.batch_decode(generated_ids)[0]
 ```
 
-For our instruct model (Moonlight-Instruct):
+For our instruct model (Moonlight-16B-A3B-Instruct):
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
````
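
The diff only shows fragments of the README's inference snippet. For context, a minimal self-contained sketch of the pretrained-model usage it describes, assuming the Hugging Face repo id `moonshotai/Moonlight-16B-A3B` from the links above, that the repo requires `trust_remote_code=True`, and an illustrative prompt (none of which are confirmed by the visible diff lines):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "moonshotai/Moonlight-16B-A3B"

# Load the checkpoint; torch_dtype="auto" keeps the dtype stored in the repo,
# and device_map="auto" spreads the 16B weights across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,  # assumption: the repo ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "1+1=2, 1+2="  # illustrative prompt, not from the diff
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The diff's visible lines: generate up to 100 new tokens, then decode.
generated_ids = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)
```

The instruct variant (`moonshotai/Moonlight-16B-A3B-Instruct`) would be loaded the same way, typically with the tokenizer's chat template applied to the messages before generation.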