Converted from TinyLlama/TinyLlama-1.1B-Chat-v1.0 and quantized to 4 bits.
Requires onnxruntime>=1.17.0.
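A minimal usage sketch for the 4-bit ONNX export, assuming it is loaded through Hugging Face Optimum's ORTModelForCausalLM (the local model directory name is illustrative and not part of this card):

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

# Tokenizer comes from the original base model.
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Hypothetical path: point this at the directory containing the 4-bit ONNX files.
model = ORTModelForCausalLM.from_pretrained("path/to/tinyllama-1.1b-chat-q4-onnx")

# Build a chat-formatted prompt and generate a short reply.
messages = [{"role": "user", "content": "What is ONNX Runtime?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```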
Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0