Meteonis committed
Commit 72be2ab
1 Parent(s): c3e3195

Update README.md

Files changed (1)
  1. README.md +83 -13
README.md CHANGED
@@ -6,34 +6,58 @@ tags:
   - Chinese
 ---
 
-# Open-Chinese-LLaMA-7B-Patch
-
-This model is a **Chinese large language model base** generated from the [LLaMA](https://github.com/facebookresearch/llama)-7B model after **secondary pre-training** on Chinese datasets.
-
-This model is a **patch** model and must be used in conjunction with the official weights. For the installation of the patch and related tutorials, please refer to [OpenLMLab/OpenChineseLLaMA](https://github.com/OpenLMLab/OpenChineseLLaMA).
-
-## Usage
-
-Since the official weights for [LLaMA](https://github.com/facebookresearch/llama)-7B have not been open-sourced, the model released this time is of the **patch** type, which needs to be used in combination with the original official weights.
-
-You can install the **patch** using `tools/patch_model.py`, for example:
-
-```bash
 python tools/patch_model.py --base_model <path_or_name_to_original_model>
     --patch_model openlmlab/open-chinese-llama-7b-patch
     --base_model_format <hf_or_raw>
-
 ```
 
-The **patch** is installed in place, which means that the installed **patch** is the complete `hf` format weight. You can use `transformers` to load the model.
-
-## Quick Experience via Command Line
-
-The **patched** model can be easily loaded by `transformers`. For a quick experience, we provide a console Demo:
 
 ```bash
 python cli_demo.py --model openlmlab/open-chinese-llama-7b-patch
     --devices 0
     --max_length 1024
@@ -42,5 +66,51 @@ python cli_demo.py --model openlmlab/open-chinese-llama-7b-patch
     --top_p 0.8
     --temperature 0.7
     --penalty 1.02
-```

# Open-Chinese-LLaMA

[![](https://img.shields.io/github/license/OpenLMLab/OpenChineseLLaMA?label=Code%20License)]()[![Model License](https://img.shields.io/badge/Model%20License-Apache_2.0-green.svg)]()[![](https://img.shields.io/github/last-commit/OpenLMLab/OpenChineseLLaMA)]()[![](https://img.shields.io/github/issues/OpenLMLab/OpenChineseLLaMA)]()

This project provides a **Chinese large language model base** generated through **incremental pre-training on Chinese datasets** on top of [LLaMA](https://github.com/facebookresearch/llama)-7B.

## Features

* Provides a Chinese pre-trained model obtained through full-parameter tuning, including Huggingface-format weights.
* Compared to the original LLaMA, the model has significantly improved Chinese understanding and generation capabilities and achieves strong results on a range of downstream tasks; see [Evaluation](#evaluation) for details.
* Provides tools for converting between Huggingface-format and Meta-format weights.
* Supports [🤗transformers](https://github.com/huggingface/transformers) and ships a command-line tool for easy model testing.

## Contents

* [Model Download](#model-download)
* [Local Demo](#local-demo)
* [Evaluation](#evaluation)
* [Model Format Conversion](#model-format-conversion)

## Model Download

| Model Name                  | Weight Type | Download Link | SHA256 |
| --------------------------- | ----------- | ------------- | ------ |
| Open-Chinese-LLaMA-7B-Patch | Patch       | [[🤗Huggingface]]() <br> [[Baidu Cloud]](https://pan.baidu.com/s/14E7iZKcH-5SHMDu97k70cg?pwd=gk34)<br>[[Google Drive]](https://drive.google.com/drive/folders/1THvuFzq_wojVfMLYV1qsSE_ddSjG0Ypv?usp=sharing) | [SHA256](./SHA256.txt) |
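
After downloading, it is worth checking file integrity against the published checksums. The following is a small, stdlib-only sketch; the file name is a placeholder, not the actual shard name listed in SHA256.txt:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight shards never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder path -- replace with the file you actually downloaded.
weight_file = Path("open-chinese-llama-7b-patch/pytorch_model.bin")
print(weight_file.name, sha256_of(weight_file))
# Compare the printed digest with the corresponding entry in SHA256.txt.
```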
### Usage Notes

The official [LLaMA](https://github.com/facebookresearch/llama) release from Meta does not open-source the model weights. To comply with the relevant license, the model released here is a **patch**, which must be used in conjunction with the original official weights.

We provide a [script](https://github.com/OpenLMLab/OpenChineseLLaMA) for installing the **patch**. After obtaining the official weights through the proper channels, you can install the patch as follows:

```bash
python tools/patch_model.py --base_model <path_or_name_to_original_model> \
                            --patch_model openlmlab/open-chinese-llama-7b-patch \
                            --base_model_format <hf_or_raw>
```

Note: The patch is installed in place, i.e. the patched output is the complete Huggingface-format weight of this model, and you can load it with `transformers`.

Note: `patch_model.py` depends on [OpenLMLab/collie](https://github.com/OpenLMLab/collie); please install that framework with the following command:

```bash
pip install git+https://github.com/OpenLMLab/collie.git
```
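
Since the patched output is a standard Huggingface-format checkpoint, it can also be loaded programmatically. Below is a minimal, illustrative load-and-generate sketch; the local path and prompt are placeholders, and treating the demo's `--penalty` flag as `repetition_penalty` is an assumption rather than documented behavior:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./open-chinese-llama-7b"  # placeholder: directory produced by patch_model.py

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16).cuda().eval()

prompt = "中国的四大发明是"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=256,
        do_sample=True,
        top_p=0.8,                # same sampling settings as the command-line demo below
        temperature=0.7,
        repetition_penalty=1.02,  # assumed equivalent of the demo's --penalty flag
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```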
## Local Demo

For quick and easy model testing, we provide a command-line demo. After installing the patch as described in [Usage Notes](#usage-notes), you can use the script to start an interactive interface:

```bash
python cli_demo.py --model openlmlab/open-chinese-llama-7b-patch \
                   --devices 0 \
                   --max_length 1024 \
                   --top_p 0.8 \
                   --temperature 0.7 \
                   --penalty 1.02
```
### Examples

Open-Chinese-LLaMA-7B on the left, original LLaMA on the right:

<div align=center><img src="./cli_demo1.png"></div>
<center style="font-size:14px;color:#C0C0C0;text-decoration:underline">text generation</center>
<br>
<div align=center><img src="./cli_demo2.png"></div>
<center style="font-size:14px;color:#C0C0C0;text-decoration:underline">code generation</center>
<br>
<div align=center><img src="./cli_demo3.png"></div>
<center style="font-size:14px;color:#C0C0C0;text-decoration:underline">instruction following (note: neither model has been instruction-tuned)</center>
<br>

## Evaluation

Open-Chinese-LLaMA-7B performs far better than the original LLaMA on a range of Chinese and English datasets. Evaluation results on several of these datasets are given below (all numbers are accuracy; higher is better):

| Dataset     | LLaMA 7B | Open-Chinese-LLaMA-7B |
| ----------- | -------- | --------------------- |
| OCNLI       | 31.5     | 45.5                  |
| CHID        | 25.87    | 71.47                 |
| TNEWS       | 8.70     | 26.78                 |
| CMRC        | 11.89    | 34.48                 |
| PIQA        | 79.8     | 77.31                 |
| HumanEval   | 10.5     | 14.63                 |
| MBPP        | 17.7     | 17.2                  |
| **Average** | 26.57    | 41.05                 |

Note: See [Benchmark.md](./benchmark/Benchmark.md) for the full results.
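
As background on where such accuracy numbers typically come from: for multiple-choice style tasks, a common recipe with a base (non-instruction-tuned) model is to pick the candidate answer to which the model assigns the highest log-likelihood and count agreement with the gold label. The sketch below is illustrative only, with a made-up item and a placeholder model path; it is not necessarily the exact protocol behind the table above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./open-chinese-llama-7b"  # placeholder: patched model directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16).cuda().eval()

def candidate_loglikelihood(prompt: str, candidate: str) -> float:
    """Total log-probability the model assigns to `candidate` following `prompt`."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + candidate, return_tensors="pt").input_ids.cuda()
    labels = full_ids.clone()
    labels[0, :prompt_len] = -100  # score only the candidate tokens, not the prompt
    with torch.no_grad():
        mean_nll = model(full_ids, labels=labels).loss  # mean NLL over unmasked tokens
    return -(mean_nll * (labels != -100).sum()).item()

# Toy NLI-style item (made up, not taken from OCNLI).
prompt = "前提:他刚跑完马拉松。假设:他现在很累。这两句话的关系是:"
options = ["蕴含", "矛盾", "中立"]
scores = [candidate_loglikelihood(prompt, option) for option in options]
prediction = options[scores.index(max(scores))]
print(prediction)
# Accuracy over a dataset is then the fraction of items whose prediction matches the gold label.
```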
## Model Format Conversion

The model produced by [`patch_model.py`](https://github.com/OpenLMLab/OpenChineseLLaMA) is in **hf** format and can be loaded with [🤗transformers](https://github.com/huggingface/transformers). For convenience, we also provide a [conversion tool](https://github.com/OpenLMLab/OpenChineseLLaMA) between the official model format (raw) and hf:

```bash
python convert_model.py --model_path <path_or_name_to_your_hf_or_raw_model> \
                        --source_format hf \
                        --target_format raw \
                        --target_path <path_you_want_to_save_the_converted_model> \
                        --raw_parallel_degree 2 \
                        --raw_parallel_devices 0,1
```

Tip: When converting a raw-format model, you must specify the tensor parallel degree and the corresponding devices, and the conversion can only run on a machine with that number of GPUs.
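
For intuition on the `--raw_parallel_degree` and `--raw_parallel_devices` options: the raw format stores each large weight matrix split across the tensor-parallel ranks, one shard per device. The toy sketch below only illustrates that sharding idea; the shapes and split dimension are simplifications, not the actual `convert_model.py` logic:

```python
import torch

# A single projection matrix of roughly LLaMA-7B MLP size (illustrative shape only).
full_weight = torch.randn(4096, 11008)

# With --raw_parallel_degree 2, such a matrix would be stored as two column shards,
# one per device (e.g. --raw_parallel_devices 0,1). The split dimension varies by layer type.
shards = torch.chunk(full_weight, chunks=2, dim=1)
for rank, shard in enumerate(shards):
    print(f"rank {rank}: shard of shape {tuple(shard.shape)}")

# Concatenating the shards recovers the original hf-style weight.
assert torch.equal(torch.cat(shards, dim=1), full_weight)
```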