jordiclive
committed on
Commit
•
2069da8
1
Parent(s):
1ea9dde
Update README.md
Browse files
README.md
CHANGED
@@ -51,7 +51,7 @@ The model was trained with flash attention and gradient checkpointing and deepsp
|
|
51 |
- Batch size: 128
|
52 |
- Max Length: 2048
|
53 |
- Learning rate: 5e-5
|
54 |
-
- Lora _r_:
|
55 |
- Lora Alpha: 32
|
56 |
|
57 |
## Prompting
|
@@ -80,7 +80,7 @@ from transformers import GenerationConfig
|
|
80 |
|
81 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
82 |
dtype = torch.float16
|
83 |
-
repo_id = "jordiclive/alpaca_gpt4-dolly_15k-vicuna-
|
84 |
base_model = "decapoda-research/llama-30b-hf"
|
85 |
|
86 |
# Model Loading
|
|
|
51 |
- Batch size: 128
|
52 |
- Max Length: 2048
|
53 |
- Learning rate: 5e-5
|
54 |
+
- Lora _r_: 64
|
55 |
- Lora Alpha: 32
|
56 |
|
57 |
## Prompting
|
|
|
80 |
|
81 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
82 |
dtype = torch.float16
|
83 |
+
repo_id = "jordiclive/lora-llama-33B-alpaca_gpt4-dolly_15k-vicuna-r64"
|
84 |
base_model = "decapoda-research/llama-30b-hf"
|
85 |
|
86 |
# Model Loading
|