
Fix for slow speed

#20
by CyberTimon - opened

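You need to set use_cache to true in the config.json. This matters a lot for generation speed: without it, the model does not reuse its cached attention keys and values and recomputes them for every new token. Setting it fixes the slow speeds.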
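
For anyone with a local copy of the model, here is a minimal sketch of that change in Python. It only uses the standard library; the helper name enable_use_cache and the path in the usage line are hypothetical, and it assumes config.json is a plain JSON object as usual for Transformers models.

```python
import json
from pathlib import Path


def enable_use_cache(config_path):
    """Set "use_cache": true in a model's config.json (hypothetical helper)."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Turn on key/value caching; every other setting is left untouched.
    config["use_cache"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Usage (placeholder path, adjust to where your model files live):
# enable_use_cache("path/to/model/config.json")
```

If you can't edit the file, passing use_cache=True to generate() in Transformers should have the same effect for that call, as far as I know.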

Thanks for the remark, I've set it to true by default now.
