Update README.md
README.md (changed)
@@ -67,8 +67,8 @@ All weights have been uploaded to HuggingFace🤗. It should be noted that all t
<h3 id="1-1">1.1 Environment Configuration</h3>

```shell
-conda create -n
-conda activate
+conda create -n knowlm python=3.9 -y
+conda activate knowlm
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
```
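A quick sanity check for the environment above (an added convenience, not part of the README itself; it assumes a CUDA-capable GPU and driver are present):

```shell
# Should print 1.12.0+cu116 and True if the CUDA 11.6 build of torch is usable.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```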
@@ -76,9 +76,9 @@ pip install -r requirements.txt

<h3 id="1-2">1.2 Pretraining model weight acquisition and restoration</h3>

-❗❗❗ Note that in terms of hardware, performing step `2.2`, which involves merging LLaMA-13B with
+❗❗❗ Note that, in terms of hardware, step `2.2` (merging LLaMA-13B with KnowLM-13B-Diff) requires approximately **100GB** of RAM and no VRAM; this is due to the memory overhead of our merging strategy. For convenience, we provide fp16 weights at https://huggingface.co/zjunlp/zhixi-13b-diff-fp16 (**fp16 weights require less memory but may slightly impact performance**). We will improve our merging approach in future updates, and a 7B model is also in development, so stay tuned. Step `2.4` (inference with `ZhiXi`) requires a minimum of **26GB** of VRAM.

-**1. Download LLaMA 13B and
+**1. Download LLaMA 13B and KnowLM-13B-Diff**

Please click [here](https://forms.gle/jk851eBVbX1m5TAv5) to apply for the official pre-training weights of LLaMA from `meta`. In this case, we are using the `13B` version of the model, so you only need to download the `13B` version. Once downloaded, the file directory will be as follows:

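For reference, the fp16 diff weights linked in the note above live in an ordinary HuggingFace repository and can be fetched with git (a minimal sketch; it assumes `git-lfs` is installed and uses `./diff` as an arbitrary local target directory):

```shell
# Fetch the fp16 diff weights referenced in the hardware note; requires git-lfs.
git lfs install
git clone https://huggingface.co/zjunlp/zhixi-13b-diff-fp16 ./diff
```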
@@ -108,7 +108,7 @@ To convert the original LLaMA-13B model into the HuggingFace format, you can use
python convert_llama_weights_to_hf.py --input_dir ./ --model_size 13B --output_dir ./converted
```

-**3. Restore
+**3. Restore KnowLM 13B**

Use the script we provide at `./tools/weight_diff.py`: execute the following command, and you will get the complete `KnowLM` weights:

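The recovery command itself falls outside this hunk. A sketch of what the call looks like, assuming the converted LLaMA sits in `./converted` and the downloaded diff in `./diff`, is given below; the subcommand and flag names are assumptions rather than taken from the repository, so check `python tools/weight_diff.py --help` for the real interface:

```shell
# Hypothetical recovery invocation; verify the actual subcommand and flags in ./tools/weight_diff.py.
python tools/weight_diff.py recover \
  --path_raw ./converted \
  --path_diff ./diff \
  --path_tuned ./knowlm
```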
@@ -133,37 +133,7 @@ The final complete weights are saved in the `./lora` folder.

<h3 id="1-4">1.4 Model Usage Guide</h3>

-**1.
-
-> The cases in `Section 1` were all run on a V100. If running on other devices, the results may vary. Please run multiple times or change the decoding parameters.
-
-1. If you want to reproduce the results in section `1.1` (**pretraining cases**), please run the following command (assuming that the complete pre-training weights of `ZhiXi` have been obtained according to the steps in section `2.2`, and the ZhiXi weights are saved in the `./zhixi` folder):
-
-```shell
-python examples/generate_finetune.py --base_model ./knowlm
-```
-
-The result in section `1.1` can be obtained.
-
-2. If you want to reproduce the results in section `1.2` (**information extraction cases**), please run the following command (assuming that the LoRA weights of `ZhiXi` have been obtained according to the steps in section `2.3`, and the LoRA weights are saved in the `./lora` folder):
-
-```shell
-python examples/generate_lora.py --load_8bit --base_model ./knowlm --lora_weights ./lora --run_ie_cases
-```
-
-The result in section `1.2` can be obtained.
-
-3. If you want to reproduce the results in section `1.3` (**general abilities cases**), please run the following command (assuming that the LoRA weights of `ZhiXi` have been obtained according to the steps in section `2.3`, and the LoRA weights are saved in the `./lora` folder):
-
-```shell
-python examples/generate_lora.py --load_8bit --base_model ./knowlm --lora_weights ./lora --run_general_cases
-```
-
-The result in section `1.3` can be obtained.
-
-
-
-**2. Usage of the Pretrained Model**
+**1. Usage of the Pretrained Model**

We offer two methods: the first one is **command-line interaction**, and the second one is **web-based interaction**, which provides greater flexibility.

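For the command-line route, the entry point visible in the removed reproduction block above is `examples/generate_finetune.py`; the same call is a reasonable starting point for interacting with the restored base weights from section 1.2 (whether it drops into an interactive prompt or runs built-in cases depends on the script's defaults):

```shell
# Command-line interaction with the restored base model; ./knowlm is the output
# directory produced by the restore step in section 1.2.
python examples/generate_finetune.py --base_model ./knowlm
```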
@@ -186,7 +156,7 @@ We offer two methods: the first one is **command-line interaction**, and the sec
</p>


-**
+**2. Usage of the Instruction-tuned Model**

Here, we provide a web-based interaction method. Use the following command to access the web:

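The launch command referenced by "Use the following command" sits outside this hunk; as a placeholder, a web demo of this kind is typically started like the following (the script name `generate_lora_web.py` and its flags are assumptions and should be checked against the `examples/` directory):

```shell
# Hypothetical web demo launch; confirm the script name and flags under examples/.
python examples/generate_lora_web.py --base_model ./knowlm --lora_weights ./lora
```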
@@ -209,7 +179,7 @@ If you want to perform batch testing, please modify the `examples/generate_lora.

For information extraction tasks such as named entity recognition (NER), event extraction (EE), and relation extraction (RE), we provide some prompts for ease of use. You can refer to this [link](https://github.com/zjunlp/KnowLM/blob/main/examples/ie_prompt.py) for examples. Of course, you can also try using your own prompts.

-Here is a [case](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/README.md) where
+Here is a [case](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/README.md) where KnowLM-13B-LoRA is used to accomplish the instruction-based knowledge graph construction task in CCKS2023.


<h2 id="2">2. Training Details</h2>
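As a closing usage note on the information-extraction prompts referenced in the last hunk: one way to try an `ie_prompt.py`-style instruction against the LoRA model is sketched below (the `--instruction` flag is an assumption; the script may expect prompts in a different form, so check the argument list of `examples/generate_lora.py` first):

```shell
# Hypothetical single-prompt run; verify the real flag names in examples/generate_lora.py.
python examples/generate_lora.py --load_8bit --base_model ./knowlm --lora_weights ./lora \
  --instruction "Please extract all person, organization, and location entities from the input text."
```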