INT8/INT4 model files of baichuan2-13b-chat, converted for fastllm

Download directly from Baidu Netdisk:

Link: https://pan.baidu.com/s/1Xsiif_1VzDyWFei1u5oJcA  Code: wxbo

Last updated: 2023/09/11


baichuan2-13b-chat-int8.flm:

GPU memory footprint after loading, as reported by nvidia-smi (about 15.4 GiB):

|   0  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
| 31%   36C    P8    28W / 250W |  15420MiB / 22528MiB |      0%      Default |

Example usage:

from fastllm_pytools import llm

# Load the converted fastllm model file
model = llm.model("baichuan2-13b-chat-int8.flm")

# Stream the reply as it is generated ("介绍一下南京" = "Introduce Nanjing")
for response in model.stream_response("介绍一下南京"):
    print(response, flush=True, end="")
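If streaming output is not needed, fastllm also provides a blocking call that returns the whole reply at once; a minimal sketch, assuming the model.response API described in the fastllm README:

from fastllm_pytools import llm

model = llm.model("baichuan2-13b-chat-int8.flm")

# Blocking call: returns the complete reply as one string
print(model.response("介绍一下南京"))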

Note: please use the latest version of FastLLM (no older than the 2023/09/11 main branch).
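For reference, .flm files like these are normally produced with fastllm's Hugging Face conversion helper. The sketch below is an illustration, not the exact command used for this release; it assumes the llm.from_hf / save API from the fastllm README, and the checkpoint path is an assumption:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

# Original HF checkpoint (path assumed); Baichuan models need trust_remote_code
path = "baichuan-inc/Baichuan2-13B-Chat"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.float16, trust_remote_code=True
)

# Quantize while converting: dtype may be "float16", "int8", or "int4"
flm = llm.from_hf(model, tokenizer, dtype="int8")
flm.save("baichuan2-13b-chat-int8.flm")

Passing dtype="int4" instead should produce the INT4 variant.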
