Help! How to Run an AI with an AMD GPU (RX 580 8GB)
Lads, does anyone know how to do that?
Even ChatGPT-4 can't help me much.
I don't know if I'm doing something wrong, but I feel like there's a way to bypass ROCm on Windows with the help of WSL 2 or something like that. Does anyone have a solution?
13B GPTQ models are about 14GB in size, which won't fit on 8GB cards. There is no offloading technique for GPTQ so far; you'd probably need to look at FlexGen with unquantized weights.
There is offloading in GPTQ-for-LLaMa, but it's really, really slow, and I don't know if it works with the ROCm implementations of GPTQ-for-LLaMa. ExLlama has ROCm support but no offloading, which I imagine is what you're referring to.
But it sounds like the OP is using Windows, and there's no ROCm for Windows, not even in WSL, so that's a dead end, I'm afraid.
@A2Hero I would suggest you use GGML, which can work on your AMD card via OpenCL acceleration.
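For what it's worth, the usual way to run GGML models with OpenCL acceleration is llama.cpp built with its CLBlast backend. A rough sketch, assuming a Linux-like shell with git, make, and the CLBlast library installed — the model filename is just a placeholder for whatever quantized GGML model you download:

```shell
# Build llama.cpp with its OpenCL (CLBlast) backend
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CLBLAST=1

# Run a quantized GGML model, offloading layers to the GPU.
# The model path is a placeholder; -ngl sets how many layers go
# to the card -- lower it if you run out of the RX 580's 8GB VRAM.
./main -m models/llama-7b.ggmlv3.q4_0.bin -ngl 24 -p "Hello"
```

On Windows there are prebuilt CLBlast releases of llama.cpp, so you can skip the build step. A 7B model at q4_0 is a sensible starting point, since it fits comfortably in 8GB.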
@A2Hero were you successful with your rx 580?