xLSTM inference sample, because no Ollama, LM Studio, etc. support!?

#10
by gue22

Rewrite 20250222: I am not aware of any xLSTM sample inference server. (Just about everybody else on Hugging Face has one.)
hello-torch-gpu2-ui1.py in my examples at AI-bits/xlstm streams answers until CUDA crashes with an out-of-memory error; a minimal sketch of that kind of streaming setup is below.
I suspect not many people have the beefy machine this needs, or want to set up their own server just to try it.
So I guess NX-AI's (almost) only marketing is Sepp.
How about running a sample inference server at JKU?
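Something like this is what I mean; it's only a sketch, assuming the model in question is NX-AI/xLSTM-7b and that it loads through transformers' AutoModelForCausalLM (the model ID, dtype, and device_map choices here are my assumptions, not verified fixes for the OOM):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "NX-AI/xLSTM-7b"  # assumption: the checkpoint this discussion is about

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 halves the weight footprint versus float32; device_map="auto"
# lets accelerate spill layers to CPU RAM when the GPU is too small.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Explain the xLSTM architecture in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)
with torch.no_grad():
    model.generate(**inputs, max_new_tokens=256, streamer=streamer)
```

Even in bf16 a 7B model needs roughly 14 GB for the weights alone, which is presumably why the script OOMs on smaller cards; offloading via device_map just trades that for speed.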

Cheers
G.
PS: 20250221 late: I tried to whip together a Gradio UI, but ran into endless problems with from_pretrained, etc.; roughly what I was after is sketched below.
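This is the shape of it, as a sketch only (same NX-AI/xLSTM-7b assumption as above; a generator function feeding Gradio via TextIteratorStreamer, not a fix for whatever from_pretrained actually throws):

```python
from threading import Thread

import gradio as gr
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          TextIteratorStreamer)

model_id = "NX-AI/xLSTM-7b"  # assumption, as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def answer(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
    # generate() blocks, so run it in a background thread and
    # yield the partial text as it streams in.
    Thread(
        target=model.generate,
        kwargs=dict(**inputs, max_new_tokens=256, streamer=streamer),
    ).start()
    text = ""
    for chunk in streamer:
        text += chunk
        yield text

# Gradio treats a generator fn as a streaming output.
gr.Interface(fn=answer, inputs="text", outputs="text").launch()
```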

