blair-johnson
committed on
Commit bb6b85f
1 Parent(s): 728b3c4
Update README.md
README.md CHANGED
@@ -65,7 +65,7 @@ print(tokenizer.batch_decode(out_tokens, skip_special_tokens=False, clean_up_tok
 
 ## Training Resources
 
-GALPACA 30B was fine-tuned in about 6 hours using 16 A100 80GB GPUS
+GALPACA 30B was fine-tuned in about 6 hours using 16 A100 80GB GPUS, 16-bit mixed-precision, an effective batch-size of 1024, and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
 
 ## Performance and Limitations
 
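The updated paragraph pins down the training setup well enough to sketch the arithmetic behind it. Below is a minimal, hypothetical DeepSpeed ZeRO Stage 3 configuration in Python consistent with those figures (16 GPUs, 16-bit mixed precision, effective batch size 1024); the per-device batch size and gradient-accumulation split are assumptions, since the commit does not include the actual config used for GALPACA.

```python
# Hypothetical sketch of a DeepSpeed ZeRO Stage 3 setup consistent with the
# figures in the commit: 16 GPUs, 16-bit mixed precision, and an effective
# batch size of 1024. The per-device batch size and gradient-accumulation
# split below are assumptions; the actual GALPACA config is not published here.

num_gpus = 16                 # 16x A100 80GB, per the README
per_device_batch_size = 4     # assumed
grad_accum_steps = 16         # assumed

# Effective batch size = GPUs * per-device batch * accumulation steps.
assert num_gpus * per_device_batch_size * grad_accum_steps == 1024

ds_config = {
    "train_micro_batch_size_per_gpu": per_device_batch_size,
    "gradient_accumulation_steps": grad_accum_steps,
    "fp16": {"enabled": True},           # 16-bit mixed precision
    "zero_optimization": {"stage": 3},   # shard params, grads, and optimizer state
}
```

ZeRO Stage 3 partitions parameters, gradients, and optimizer states across the data-parallel group, which is what makes a 30B-parameter fine-tune tractable on 16 A100 80GB cards.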