It is based on the Guanaco LoRA of LLaMA, weighing in at 65B parameters.
The primary goal of this model is to improve question-answering and medical dialogue tasks.
It was trained using [LoRA](https://arxiv.org/abs/2106.09685) and quantized to reduce its memory footprint.
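For readers unfamiliar with the technique, the core idea of the linked LoRA paper is to learn a low-rank update ΔW = BA on top of a frozen pretrained weight W, so only a small fraction of the parameters are trained. A minimal NumPy sketch of that mechanism (illustrative shapes only, not this model's actual training code):

```python
import numpy as np

# Illustration of the LoRA idea: a frozen weight W gets a trainable
# low-rank update B @ A (rank r much smaller than d). Shapes here are
# tiny and arbitrary; the real model applies this inside LLaMA layers.
rng = np.random.default_rng(0)
d, r = 64, 4
W = rng.standard_normal((d, d))   # frozen pretrained weight
A = rng.standard_normal((r, d))   # trainable low-rank factor
B = np.zeros((d, r))              # trainable factor, initialized to zero

x = rng.standard_normal(d)
y = (W + B @ A) @ x               # adapted forward pass

# At initialization B = 0, so the adapter is a no-op:
assert np.allclose(y, W @ x)
```

With rank r = 4 on a 64×64 weight, the adapter trains 2·d·r = 512 parameters instead of d² = 4096, which is why LoRA checkpoints like this one are small enough to download and apply separately from the base model.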
Steps to load this model:
1. Load Guanaco-65B-GPTQ: https://huggingface.co/TheBloke/guanaco-65B-GPTQ
   * I recommend using text-generation-webui to test it out: https://github.com/oobabooga/text-generation-webui/tree/main
2. Download this LoRA and apply it to the model.
3. Ensure `--monkey-patch` is enabled in text-generation-webui; 4-bit setup instructions are [here](https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md)
---
> The following README is taken from the source page [medalpaca](https://huggingface.co/medalpaca/medalpaca-lora-13b-8bit)