zhaode committed
Commit 0d6a1e8 · verified · 1 Parent(s): 2062948

Upload folder using huggingface_hub

Files changed (12)
  1. .gitattributes +2 -0
  2. .msc +0 -0
  3. .mv +1 -0
  4. README.md +50 -3
  5. config.json +9 -0
  6. configuration.json +1 -0
  7. embeddings_bf16.bin +3 -0
  8. llm.mnn +3 -0
  9. llm.mnn.json +0 -0
  10. llm.mnn.weight +3 -0
  11. llm_config.json +14 -0
  12. tokenizer.txt +0 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ llm.mnn filter=lfs diff=lfs merge=lfs -text
+ llm.mnn.weight filter=lfs diff=lfs merge=lfs -text
.msc ADDED
Binary file (634 Bytes). View file
 
.mv ADDED
@@ -0,0 +1 @@
+ Revision:master,CreatedAt:1737971823
README.md CHANGED
@@ -1,3 +1,50 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - chat
+ ---
+ # DeepSeek-R1-1.5B-Qwen-MNN
+
+ ## Introduction
+ This is a 4-bit quantized MNN model exported from [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) using [llmexport](https://github.com/alibaba/MNN/tree/master/transformers/llm/export).
+
+ ## Download
+ ```bash
+ # install huggingface_hub
+ pip install huggingface_hub
+ ```
+ ```bash
+ # shell download
+ huggingface-cli download taobao-mnn/DeepSeek-R1-1.5B-Qwen-MNN --local-dir 'path/to/dir'
+ ```
+ ```python
+ # SDK download
+ from huggingface_hub import snapshot_download
+ model_dir = snapshot_download('taobao-mnn/DeepSeek-R1-1.5B-Qwen-MNN')
+ ```
+
+ ```bash
+ # git clone
+ git clone https://www.modelscope.cn/taobao-mnn/DeepSeek-R1-1.5B-Qwen-MNN
+ ```
+
+ ## Usage
+ ```bash
+ # clone MNN source
+ git clone https://github.com/alibaba/MNN.git
+
+ # compile
+ cd MNN
+ mkdir build && cd build
+ cmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true
+ make -j
+
+ # run
+ ./llm_demo /path/to/DeepSeek-R1-1.5B-Qwen-MNN/config.json prompt.txt
+ ```
+
+ ## Documentation
+ [MNN-LLM](https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html#)
config.json ADDED
@@ -0,0 +1,9 @@
+ {
+ "llm_model": "llm.mnn",
+ "llm_weight": "llm.mnn.weight",
+ "backend_type": "cpu",
+ "thread_num": 4,
+ "precision": "low",
+ "memory": "low",
+ "use_template": false
+ }
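Since `config.json` refers to the model graph and weight file by name, a quick pre-flight check can catch an incomplete download before `llm_demo` is run. The following is a minimal sketch, not part of this commit: the helper name `check_model_dir` is hypothetical, and it assumes that `llm_model` and `llm_weight` are resolved relative to the directory containing `config.json`.

```python
import json
import tempfile
from pathlib import Path


def check_model_dir(config_path):
    """Return the config keys whose referenced files are missing on disk.

    Assumption: file names in config.json are relative to its own directory.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    missing = []
    for key in ("llm_model", "llm_weight"):
        name = config.get(key)
        if name is None or not (config_path.parent / name).is_file():
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Demo on a throwaway directory that contains only the graph file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "config.json").write_text(
            json.dumps({"llm_model": "llm.mnn", "llm_weight": "llm.mnn.weight"}),
            encoding="utf-8",
        )
        (root / "llm.mnn").write_bytes(b"placeholder")
        print(check_model_dir(root / "config.json"))
```

An empty list means both referenced files are present. The check does not validate their contents.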
configuration.json ADDED
@@ -0,0 +1 @@
+ {"framework":"other","task":"text-generation"}
embeddings_bf16.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ceae2992cd5aa74dd18a9bed0313da6db56b4c6c47e804fd1181bb6afb1d6668
+ size 466747392
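The three lines above are a Git LFS pointer, not the binary itself: they record the SHA-256 digest and byte size of the real file. The sketch below shows one way a downloaded file could be checked against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical. The final arithmetic additionally assumes that `embeddings_bf16.bin` is a raw, header-less table of 2-byte bf16 values with `hidden_size` 1536 (the value in `llm_config.json`); the commit does not state this layout.

```python
import hashlib

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ceae2992cd5aa74dd18a9bed0313da6db56b4c6c47e804fd1181bb6afb1d6668
size 466747392
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)

# Assumption: raw bf16 table (2 bytes per value), hidden_size = 1536, no header.
rows, remainder = divmod(pointer["size"], 2 * 1536)
print(rows, remainder)  # 151936 0
```

Under that layout assumption the file size divides evenly into 151936 rows of 1536 values, which is consistent with an embedding table but does not prove the format.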
llm.mnn ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b38872598164ce3c19a7645e30f4f0fd58203f545f7e07d04206db7254d41ac
+ size 1145128
llm.mnn.json ADDED
The diff for this file is too large to render. See raw diff
 
llm.mnn.weight ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f15e03a173d99606df1467a9bf5d4e87658e0a715a650dac5f97f5a9da999fd
+ size 1081651682
llm_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "hidden_size": 1536,
+ "layer_nums": 28,
+ "attention_mask": "float",
+ "key_value_shape": [
+ 2,
+ 1,
+ 0,
+ 2,
+ 128
+ ],
+ "prompt_template": "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
+ "is_visual": false
+ }
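The `prompt_template` value is a printf-style string in which `%s` marks where the user message goes. As a rough illustration of the substitution only, the sketch below fills the template in Python. The helper name `apply_template` is hypothetical, and whether the runtime applies the template at all depends on settings such as `use_template` in `config.json`, which this commit sets to `false`.

```python
import json

# The template exactly as it appears in llm_config.json (JSON-escaped).
RAW = r'{"prompt_template": "\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n"}'
PROMPT_TEMPLATE = json.loads(RAW)["prompt_template"]


def apply_template(user_text, template=PROMPT_TEMPLATE):
    """Insert the user message at the first %s placeholder.

    str.replace is used instead of the % operator so that a literal '%'
    in the user text cannot be misread as a format directive.
    """
    return template.replace("%s", user_text, 1)


print(repr(apply_template("Hello")))
```

After JSON decoding, each `\n` escape becomes a real newline, so the filled prompt spans several lines.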
tokenizer.txt ADDED
The diff for this file is too large to render. See raw diff