zhaode committed
Commit cc6cb5a · verified · 1 Parent(s): 6dc57f0

Upload folder using huggingface_hub

Files changed (8)
  1. README.md +38 -1
  2. config.json +2 -3
  3. llm.mnn +3 -0
  4. llm.mnn.json +3 -0
  5. llm.mnn.weight +3 -0
  6. llm_config.json +2 -3
  7. qwen1.5-1.8b-int4.mnn +0 -0
  8. tokenizer.txt +2 -2
README.md CHANGED
@@ -9,5 +9,42 @@ tags:
 # Qwen1.5-1.8B-Chat-MNN
 
 ## Introduction
 
-This model is a 4-bit quantized version of the MNN model exported from Qwen1.5-1.8B-Chat using [llm-export](https://github.com/wangzhaode/llm-export).
+This model is a 4-bit quantized version of the MNN model exported from [Qwen1.5-1.8B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-1.8B-Chat/summary) using [llmexport](https://github.com/alibaba/MNN/tree/master/transformers/llm/export).
+
+## Download
+```bash
+# install huggingface_hub
+pip install huggingface_hub
+```
+```bash
+# CLI download
+huggingface-cli download taobao-mnn/Qwen1.5-1.8B-Chat-MNN --local-dir path/to/dir
+```
+```python
+# SDK download
+from huggingface_hub import snapshot_download
+model_dir = snapshot_download('taobao-mnn/Qwen1.5-1.8B-Chat-MNN')
+```
+
+```bash
+# git clone
+git clone https://www.modelscope.cn/taobao-mnn/Qwen1.5-1.8B-Chat-MNN
+```
+
+## Usage
+```bash
+# clone MNN source
+git clone https://github.com/alibaba/MNN.git
+
+# compile
+cd MNN
+mkdir build && cd build
+cmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true
+make -j
+
+# run
+./llm_demo /path/to/Qwen1.5-1.8B-Chat-MNN/config.json prompt.txt
+```
+
+## Document
+[MNN-LLM](https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html#)
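The Download and Usage sections above can be tied together in the SDK path: a minimal Python sketch that builds the `llm_demo` invocation shown above from the directory `snapshot_download()` returns. The `./llm_demo` path and the helper name are assumptions for illustration; the binary sits in `MNN/build` after the compile step.

```python
# Build the llm_demo argv from a downloaded model directory.
# demo_binary defaults to "./llm_demo" as in the Usage section above;
# adjust it to wherever the MNN build placed the binary.
from pathlib import Path

def llm_demo_command(model_dir, prompt_file="prompt.txt", demo_binary="./llm_demo"):
    """Return the argv list for running MNN's llm_demo on this model."""
    return [demo_binary, str(Path(model_dir) / "config.json"), prompt_file]
```

The result can be passed to `subprocess.run(...)` once `llm_demo` has been compiled.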
config.json CHANGED
@@ -1,7 +1,6 @@
 {
-  "llm_model": "qwen1.5-1.8b-int4.mnn",
-  "llm_weight": "qwen1.5-1.8b-int4.mnn.weight",
-
+  "llm_model": "llm.mnn",
+  "llm_weight": "llm.mnn.weight",
   "backend_type": "cpu",
   "thread_num": 4,
   "precision": "low",
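This hunk renames the model artifacts, so a stale `config.json` would silently point at files that no longer exist. A small sketch of checking which referenced files are actually present next to the config; the helper name is mine, not part of MNN:

```python
# Load a config.json and report which llm_model/llm_weight files are missing.
import json
from pathlib import Path

def missing_model_files(config_path):
    """Return referenced model file names that are absent beside the config."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    names = [config[k] for k in ("llm_model", "llm_weight") if k in config]
    return [n for n in names if not (config_path.parent / n).exists()]
```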
llm.mnn ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba06ec694d970d8317023c8fb4878e10ad8a24dc4f8b052287f59cc042604f5d
+size 1184256
llm.mnn.json ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e9a13c167f0314914e793f2a69568878e937b4044dcfa8579174026e18816c0
+size 7085313
llm.mnn.weight ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e76f292e8437d1dfbd2a6d71787cb5c799fed0661816bc4f52099b658ed7fdc
+size 858640010
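The three added `.mnn` files are stored via Git LFS, so what the diff shows are pointer stubs, not the weights themselves. A sketch of parsing such a pointer to recover the declared sha256 `oid` and `size`, which a download script could check against the fetched blob (the function name is illustrative):

```python
# Parse a git-lfs pointer file (version/oid/size lines) into a dict.
def parse_lfs_pointer(text):
    """Split each 'key value' line and normalize the sha256 oid and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }
```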
llm_config.json CHANGED
@@ -9,7 +9,6 @@
     16,
     128
   ],
-  "prompt_template": "<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
-  "is_visual": false,
-  "is_single": true
+  "prompt_template": "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
+  "is_visual": false
 }
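The `prompt_template` uses printf-style `%s` substitution for the user message. A minimal sketch of how a runtime would fill it to produce the Qwen1.5 chat format (the function name is illustrative, not an MNN API):

```python
# Wrap a raw user message in the chat prompt format from llm_config.json.
QWEN15_TEMPLATE = "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n"

def apply_prompt_template(user_message, template=QWEN15_TEMPLATE):
    """Substitute the user message into the %s slot of the template."""
    return template % user_message
```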
qwen1.5-1.8b-int4.mnn CHANGED
Binary files a/qwen1.5-1.8b-int4.mnn and b/qwen1.5-1.8b-int4.mnn differ
 
tokenizer.txt CHANGED
@@ -1,6 +1,6 @@
 430 3
-3 1 0
-151643 151644 151645 151645
+3 2 0
+151643 151644 151645 151643 151645
 151646 151387
 !
 "