Hello, can I run this model if I only have a 3090 with 24 GB VRAM and 32 GB RAM?
#1 opened by Humeee33
Is it that no one reads these help requests, or that no one cares, no one knows, or some combination?
@humeee33 Well, I wouldn't recommend the GPTQ one, since you don't have enough VRAM for a 70B model. Your best bet is to use llama.cpp (or llama-cpp-python) and download the GGUF version instead of the GPTQ.
You might be able to run it with Transformers, but it will be extremely slow, while llama.cpp will be much faster.
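Roughly what that looks like with llama-cpp-python, as a minimal sketch: the GGUF filename and the `n_gpu_layers` value below are just placeholders you'd swap for your actual file and tune for 24 GB of VRAM.

```python
# Minimal sketch: load a quantized 70B GGUF with partial GPU offload.
# The model path and layer count are hypothetical; adjust for your setup.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-70b-chat.Q4_K_M.gguf",  # local GGUF file (placeholder name)
    n_gpu_layers=40,   # offload as many layers as fit in VRAM; lower it if you OOM
    n_ctx=4096,        # context window
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

Whatever doesn't fit on the GPU stays in system RAM, so with 32 GB you'll want one of the smaller quants (Q4 or below) and to experiment with `n_gpu_layers` until it stops running out of memory.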