liushaowei committed
Commit 579a3a3 · 1 Parent(s): 4308c42

update readme

Files changed (1):
  1. README.md +7 -7
README.md CHANGED
@@ -3,14 +3,14 @@ license: mit
 library_name: transformers
 ---
 <div align="center">
-  <a href="https://github.com/MoonshotAI/dummy.pdf"><img width="80%" src="figures/banner.png"></a>
+  <a href="https://github.com/MoonshotAI/Moonlight"><img width="80%" src="figures/banner.png"></a>
 </div>
 
 <!-- # Muon is Scalable For LLM Training -->
 
 <div align="center">
-  <a href="https://github.com/MoonshotAI/dummy.pdf" ><img src="figures/logo.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Tech Report</b></a> |
-  <a href="https://huggingface.co/moonshotai/Moonlight"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> HuggingFace</b></a> |
+  <a href="https://github.com/MoonshotAI/Moonlight/blob/master/Moonlight.pdf" ><img src="figures/logo.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Tech Report</b></a> |
+  <a href="https://huggingface.co/moonshotai/Moonlight-16B-A3B"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> HuggingFace</b></a> |
   <a href="#"><img src="figures/megatron.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;">Megatron(coming soon)</b></a>
 </div>
 
@@ -85,8 +85,8 @@ We compared Moonlight with SOTA public models at similar scale:
 
 | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download Link** |
 | :------------: | :------------: | :------------: | :------------: | :------------: |
-| Moonlight | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight) |
-| Moonlight-Instruct | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-Instruct) |
+| Moonlight-16B-A3B | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-16B-A3B) |
+| Moonlight-16B-A3B-Instruct | 16B | 3B | 8K | [🤗 Hugging Face](https://huggingface.co/moonshotai/Moonlight-16B-A3B-Instruct) |
 
 </div>
 
@@ -94,7 +94,7 @@ We compared Moonlight with SOTA public models at similar scale:
 
 We introduce how to use our model at inference stage using transformers library. It is recommended to use python=3.10, torch>=2.1.0, and the latest version of transformers as the development environment.
 
-For our pretrained model (Moonlight):
+For our pretrained model (Moonlight-16B-A3B):
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
@@ -113,7 +113,7 @@ generated_ids = model.generate(**inputs, max_new_tokens=100)
 response = tokenizer.batch_decode(generated_ids)[0]
 ```
 
-For our instruct model (Moonlight-Instruct):
+For our instruct model (Moonlight-16B-A3B-Instruct):
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
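The hunks above show only the first and last lines of the pretrained-model snippet (the import, then `model.generate(**inputs, max_new_tokens=100)` and the `batch_decode` call). Below is a minimal sketch of the flow those fragments imply, assuming the standard transformers loading pattern; the `model_path` comes from the download table, while the prompt, dtype, and device settings are illustrative assumptions rather than the README's exact code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model path from the download table above; loading kwargs are assumptions.
model_path = "moonshotai/Moonlight-16B-A3B"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",      # use the checkpoint's native dtype
    device_map="auto",       # place weights on available GPU(s)/CPU
    trust_remote_code=True,  # assumption: custom architecture ships its own code
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Illustrative completion prompt for a base (pretrained) model.
prompt = "1+1=2, 1+2="
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# These two lines appear verbatim in the diff.
generated_ids = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)
```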
 
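The instruct-model snippet is likewise cut off after its import line. A sketch of how such a snippet typically continues, assuming the tokenizer ships a chat template; the messages and generation settings are illustrative, not the README's exact code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Instruct checkpoint from the download table above.
model_path = "moonshotai/Moonlight-16B-A3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,  # assumption, as with the base model
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Illustrative chat turn; an instruct model expects chat-formatted input.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Is 123 a prime number?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(input_ids, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)
```

The only substantive difference from the base-model flow is `apply_chat_template`, which wraps the conversation in the chat markup the instruct checkpoint was fine-tuned on before generation.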