Report a bug?
#7
by
PotatoesJay
- opened
Reproduce:
- Download this repo's weights;
- python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b /path/to/internlm-chat-7b;
- python3 -m lmdeploy.turbomind.chat ./workspace;
- type in a very long input; once it reaches the max of 2056 tokens, it warns that the max input length is exceeded?
Issue:
In turbomind/turbomind.py, at line 101, self.session_len is always 2048?
Please set session_len in workspace/triton_models/weights/config.ini.
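If editing the file by hand is inconvenient, something like the sketch below might work. It uses only Python's standard configparser and makes no assumption about the section name: it looks for a session_len key in every section of the file. The helper name set_session_len and the value 4096 in the usage note are hypothetical, chosen only for illustration.

```python
import configparser


def set_session_len(config_path, new_len):
    """Set `session_len` in every section of an INI file that defines it.

    Returns the number of sections updated. The file is rewritten only
    when at least one section was changed.
    """
    # interpolation=None keeps any '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve the original key case instead of lowercasing keys.
    parser.optionxform = str
    parser.read(config_path)

    updated = 0
    for section in parser.sections():
        if parser.has_option(section, "session_len"):
            parser.set(section, "session_len", str(new_len))
            updated += 1

    if updated:
        with open(config_path, "w") as f:
            parser.write(f)
    return updated
```

For example, set_session_len("workspace/triton_models/weights/config.ini", 4096) would raise the limit to 4096, assuming the model and available memory can handle that length. Note that configparser drops comments when it rewrites a file, so keep a backup if the original config.ini contains any.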
PotatoesJay
changed discussion status to
closed