Update README.md
README.md
CHANGED
@@ -37,7 +37,7 @@ This is NanoLM-0.3B-Instruct-v1.1. The model currently supports both **Chinese a
 | :----------: | :------------------: | :---: | :----: | :-------: | :---: | :---: |
 | 25M | 15M | MistralForCausalLM | 12 | 312 | 12 |2K|
 | 70M | 42M | LlamaForCausalLM | 12 | 576 | 9 |2K|
-| 0.3B
+| **0.3B** | **180M** | **Qwen2ForCausalLM** | **12** | **896** | **14** | **4K** |
 | 1B | 840M | Qwen2ForCausalLM | 18 | 1536 | 12 |4K|
 
 The tokenizer and model architecture of NanoLM-0.3B-Instruct-v1.1 are the same as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), but the number of layers has been reduced from 24 to 12.
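As a rough sanity check on the 0.3B row, the sketch below estimates the parameter count of a Qwen2-style decoder cut down to 12 layers. The layer count (12), hidden size (896), and head count (14) come from the table. Everything else is an assumption taken from the published Qwen2-0.5B configuration rather than from this README: an intermediate size of 4864, 2 key/value heads, a vocabulary of 151936 tokens, biases on the Q/K/V projections only, and tied input/output embeddings. Reading the 180M column as non-embedding parameters is also an assumption, since the table header is not visible in this diff.

```python
# Back-of-the-envelope parameter count for a 12-layer Qwen2-style decoder.
# Values marked "assumed" are taken from the public Qwen2-0.5B config,
# not from this README.
LAYERS = 12            # from the table
HIDDEN = 896           # from the table
HEADS = 14             # from the table
KV_HEADS = 2           # assumed (grouped-query attention)
INTERMEDIATE = 4864    # assumed MLP width
VOCAB = 151936         # assumed vocabulary size


def layer_params(hidden: int, heads: int, kv_heads: int, intermediate: int) -> int:
    """Parameters in one decoder layer (attention + gated MLP + two RMSNorms)."""
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim
    q_proj = hidden * hidden + hidden      # weight + bias (bias assumed)
    k_proj = hidden * kv_dim + kv_dim      # weight + bias (bias assumed)
    v_proj = hidden * kv_dim + kv_dim      # weight + bias (bias assumed)
    o_proj = hidden * hidden               # no bias assumed
    mlp = 3 * hidden * intermediate        # gate, up, and down projections
    norms = 2 * hidden                     # two RMSNorm weight vectors
    return q_proj + k_proj + v_proj + o_proj + mlp + norms


def estimate_params() -> tuple[int, int, int]:
    """Return (non_embedding, embedding, total), assuming tied embeddings."""
    per_layer = layer_params(HIDDEN, HEADS, KV_HEADS, INTERMEDIATE)
    non_embedding = LAYERS * per_layer + HIDDEN   # + final RMSNorm
    embedding = VOCAB * HIDDEN                    # counted once (tied)
    return non_embedding, embedding, non_embedding + embedding


if __name__ == "__main__":
    non_emb, emb, total = estimate_params()
    print(f"non-embedding: {non_emb / 1e6:.1f}M")
    print(f"embedding:     {emb / 1e6:.1f}M")
    print(f"total:         {total / 1e6:.1f}M")
```

Under these assumptions the estimate comes to roughly 179M non-embedding parameters and about 315M in total, which is in line with the 180M and 0.3B figures in the new row. If the actual checkpoint uses a different intermediate size or untied embeddings, the numbers would shift accordingly.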