K024 committed on
Commit
76c79b0
1 Parent(s): 42da6fc

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -11,12 +11,21 @@ tags:
 
  # ChatGLM-6B + ONNX
 
- This model is exported from [ChatGLM-6b](https://huggingface.co/THUDM/chatglm-6b) with int8 quantization and optimized for [ONNXRuntime](https://onnxruntime.ai/) inference.
+ This model is exported from [ChatGLM-6b](https://huggingface.co/THUDM/chatglm-6b) with int8 quantization and optimized for [ONNXRuntime](https://onnxruntime.ai/) inference. The export code is available in [this repo](https://github.com/K024/chatglm-q).
 
  Inference code with ONNXRuntime is uploaded with the model. Install requirements and run `streamlit run web-ui.py` to start chatting. Currently the `MatMulInteger` (for u8s8 data type) and `DynamicQuantizeLinear` operators are only supported on CPU.
 
  Install the dependencies and run `streamlit run web-ui.py` to preview the model. Because of limited operator support in ONNXRuntime, inference currently runs on CPU only.
 
+ ## Usage
+
+ ```sh
+ git lfs clone https://huggingface.co/K024/ChatGLM-6b-onnx-u8s8
+ cd ChatGLM-6b-onnx-u8s8
+ pip install -r requirements.txt
+ streamlit run web-ui.py
+ ```
+
  Code is released under the MIT license.
 
  Model weights are released under the same license as ChatGLM-6b, see [MODEL LICENSE](https://huggingface.co/THUDM/chatglm-6b/blob/main/MODEL_LICENSE).
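
The usage steps added in this commit assume that `git`, the Git LFS extension, and `pip` are already on the `PATH`. A minimal sketch of a pre-flight check is shown below; the helper name `missing_tools` is hypothetical and not part of the repository, and the tool list is an assumption drawn from the commands in the Usage section.

```sh
# missing_tools: print the names of the given commands that are not on PATH,
# separated by spaces (hypothetical helper, not shipped with the model repo).
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command can be resolved
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space left by the accumulation above
  printf '%s\n' "${missing# }"
}

# Assumed prerequisites for the Usage steps: git, git-lfs, pip
need=$(missing_tools git git-lfs pip)
if [ -n "$need" ]; then
  echo "Install these before cloning: $need" >&2
fi
```

If the check prints nothing, the clone and install commands from the Usage section should at least be able to start; it does not verify that the model files downloaded completely.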