Converted INT8/INT4 model files for fastllm, based on baichuan-13b-chat

Download directly from Baidu Netdisk:

Link: https://pan.baidu.com/s/1zADu6rd749zkkNAfl-aqtg (extraction code: jqkl)

Last updated: 2023/07/27

baichuan-13b-chat-int4.flm (GPU usage reported by `nvidia-smi` while running):

```
+-----------------------------------------------------------------------------+
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
| N/A   53C    P0    56W / 250W |   7083MiB / 23040MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

baichuan-13b-chat-int8.flm (GPU usage reported by `nvidia-smi` while running):

```
+-----------------------------------------------------------------------------+
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
| N/A   51C    P0   162W / 250W |  13151MiB / 23040MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
```
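The memory figures above line up with a back-of-the-envelope estimate: roughly 13 billion weights at 4 or 8 bits each, plus a few hundred MiB of overhead for quantization scales, buffers, and the KV cache. A quick sanity check (pure Python; the 13e9 parameter count is an approximation, not an exact figure from the model):

```python
# Rough weight-memory estimate for a ~13B-parameter model at different precisions.
PARAMS = 13e9  # approximate parameter count of baichuan-13b

def weight_mem_mib(params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in MiB (ignores scales, KV cache, buffers)."""
    return params * bits_per_weight / 8 / 2**20

int4_mib = weight_mem_mib(PARAMS, 4)  # ~6200 MiB, vs. 7083 MiB observed
int8_mib = weight_mem_mib(PARAMS, 8)  # ~12400 MiB, vs. 13151 MiB observed
print(f"int4: {int4_mib:.0f} MiB, int8: {int8_mib:.0f} MiB")
```

The ~800 MiB gap between the estimate and the observed usage is consistent with runtime overhead that quantization does not shrink.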
Usage with fastllm:

```python
from fastllm_pytools import llm

model = llm.model("baichuan-13b-chat-int4.flm")
# "介绍一下南京" means "Tell me about Nanjing"
for response in model.stream_response("介绍一下南京"):
    print(response, flush=True, end="")
```