zhaode committed
Commit cc6cb5a · verified · 1 Parent(s): 6dc57f0

Upload folder using huggingface_hub

Files changed (8)
  1. README.md +38 -1
  2. config.json +2 -3
  3. llm.mnn +3 -0
  4. llm.mnn.json +3 -0
  5. llm.mnn.weight +3 -0
  6. llm_config.json +2 -3
  7. qwen1.5-1.8b-int4.mnn +0 -0
  8. tokenizer.txt +2 -2
README.md CHANGED
@@ -9,5 +9,42 @@ tags:
 # Qwen1.5-1.8B-Chat-MNN
 
 ## Introduction
 
-This model is a 4-bit quantized version of the MNN model exported from Qwen1.5-1.8B-Chat using [llm-export](https://github.com/wangzhaode/llm-export).
+This model is a 4-bit quantized version of the MNN model exported from [Qwen1.5-1.8B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-1.8B-Chat/summary) using [llmexport](https://github.com/alibaba/MNN/tree/master/transformers/llm/export).
+
+## Download
+```bash
+# install huggingface_hub
+pip install huggingface_hub
+```
+```bash
+# CLI download
+huggingface-cli download taobao-mnn/Qwen1.5-1.8B-Chat-MNN --local-dir path/to/dir
+```
+```python
+# SDK download
+from huggingface_hub import snapshot_download
+model_dir = snapshot_download('taobao-mnn/Qwen1.5-1.8B-Chat-MNN')
+```
+
+```bash
+# git clone
+git clone https://www.modelscope.cn/taobao-mnn/Qwen1.5-1.8B-Chat-MNN
+```
+
+## Usage
+```bash
+# clone MNN source
+git clone https://github.com/alibaba/MNN.git
+
+# compile
+cd MNN
+mkdir build && cd build
+cmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true
+make -j
+
+# run
+./llm_demo /path/to/Qwen1.5-1.8B-Chat-MNN/config.json prompt.txt
+```
+
+## Document
+[MNN-LLM](https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html#)
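The Download and Usage sections above can be tied together in the SDK path: a minimal Python sketch that builds the `llm_demo` invocation shown above from the directory `snapshot_download()` returns. The `./llm_demo` path and the helper name are assumptions for illustration; the binary sits in `MNN/build` after the compile step.

```python
# Build the llm_demo argv from a downloaded model directory.
# demo_binary defaults to "./llm_demo" as in the Usage section above;
# adjust it to wherever the MNN build placed the binary.
from pathlib import Path

def llm_demo_command(model_dir, prompt_file="prompt.txt", demo_binary="./llm_demo"):
    """Return the argv list for running MNN's llm_demo on this model."""
    return [demo_binary, str(Path(model_dir) / "config.json"), prompt_file]
```

The result can be passed to `subprocess.run(...)` once `llm_demo` has been compiled.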
config.json CHANGED
@@ -1,7 +1,6 @@
 {
-  "llm_model": "qwen1.5-1.8b-int4.mnn",
-  "llm_weight": "qwen1.5-1.8b-int4.mnn.weight",
-
+  "llm_model": "llm.mnn",
+  "llm_weight": "llm.mnn.weight",
   "backend_type": "cpu",
   "thread_num": 4,
   "precision": "low",
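This hunk renames the model artifacts, so a stale `config.json` would silently point at files that no longer exist. A small sketch of checking which referenced files are actually present next to the config; the helper name is mine, not part of MNN:

```python
# Load a config.json and report which llm_model/llm_weight files are missing.
import json
from pathlib import Path

def missing_model_files(config_path):
    """Return referenced model file names that are absent beside the config."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    names = [config[k] for k in ("llm_model", "llm_weight") if k in config]
    return [n for n in names if not (config_path.parent / n).exists()]
```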
llm.mnn ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba06ec694d970d8317023c8fb4878e10ad8a24dc4f8b052287f59cc042604f5d
+size 1184256
llm.mnn.json ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e9a13c167f0314914e793f2a69568878e937b4044dcfa8579174026e18816c0
+size 7085313
llm.mnn.weight ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e76f292e8437d1dfbd2a6d71787cb5c799fed0661816bc4f52099b658ed7fdc
+size 858640010
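The three added `.mnn` files are stored via Git LFS, so what the diff shows are pointer stubs, not the weights themselves. A sketch of parsing such a pointer to recover the declared sha256 `oid` and `size`, which a download script could check against the fetched blob (the function name is illustrative):

```python
# Parse a git-lfs pointer file (version/oid/size lines) into a dict.
def parse_lfs_pointer(text):
    """Split each 'key value' line and normalize the sha256 oid and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }
```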
llm_config.json CHANGED
@@ -9,7 +9,6 @@
     16,
     128
   ],
-  "prompt_template": "<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
-  "is_visual": false,
-  "is_single": true
+  "prompt_template": "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
+  "is_visual": false
 }
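The `prompt_template` uses printf-style `%s` substitution for the user message. A minimal sketch of how a runtime would fill it to produce the Qwen1.5 chat format (the function name is illustrative, not an MNN API):

```python
# Wrap a raw user message in the chat prompt format from llm_config.json.
QWEN15_TEMPLATE = "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n"

def apply_prompt_template(user_message, template=QWEN15_TEMPLATE):
    """Substitute the user message into the %s slot of the template."""
    return template % user_message
```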
qwen1.5-1.8b-int4.mnn CHANGED
Binary files a/qwen1.5-1.8b-int4.mnn and b/qwen1.5-1.8b-int4.mnn differ
 
tokenizer.txt CHANGED
@@ -1,6 +1,6 @@
 430 3
-3 1 0
-151643 151644 151645 151645
+3 2 0
+151643 151644 151645 151643 151645
 151646 151387
 !
 "