Question #4, opened by dondre
Wish I could test out this model, but it needs an A100 to load.
What kind of setup do you have?
Your models are great, but with only 48GB of VRAM I'm on the outside looking in.
Could you add some inference examples for these larger models?
There are quantizations available that need far less VRAM (GPTQ for GPU inference, GGML for llama.cpp on CPU); see the loading sketch after the links:
https://huggingface.co/TheBloke/WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GPTQ
https://huggingface.co/TheBloke/WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML
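
Since inference examples were asked for, here's a minimal sketch of loading the GPTQ quant on a single GPU. It assumes the repo works with auto-gptq's standard `from_quantized` API and ships safetensors weights; the prompt and generation settings are purely illustrative. A 4-bit 30B quant should fit comfortably in 48GB.

```python
# Minimal sketch: load the GPTQ quant with auto-gptq + transformers.
# Assumes `pip install auto-gptq transformers` and a CUDA GPU.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,  # assumption: the repo ships .safetensors weights
)

# Illustrative prompt; adjust to the model's expected prompt format.
prompt = "Write a short story about a lighthouse keeper."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```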