blair-johnson
committed
Commit 48d184f
1 Parent(s): 8e49fdb
Update README.md
README.md CHANGED
@@ -41,7 +41,7 @@ TODO: add example inference usage.
 
 ## Training Resources
 
-GALPACA 30B was fine-tuned in about 6 hours using 16 A100 80GB GPUS at an effective batch-size of 1024 and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
+GALPACA 30B was fine-tuned in about 6 hours on 16 A100 80GB GPUs with 16-bit mixed precision, at an effective batch size of 1024 and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
 
 ## Performance and Limitations
 
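For context, the training settings described in this commit map onto a DeepSpeed configuration roughly like the sketch below. Only the effective batch size (1024), the GPU count (16), ZeRO Stage 3, and 16-bit mixed precision are stated in the README; the micro-batch/gradient-accumulation split and the choice of fp16 over bf16 are assumptions made here for illustration.

```python
# Hedged sketch of a DeepSpeed config matching the settings described above.
# Stated in the README: effective batch size 1024, 16 GPUs, ZeRO Stage 3,
# 16-bit mixed precision. Everything else is an assumption.
ds_config = {
    # 1024 = 16 GPUs * 4 micro-batch * 16 accumulation steps (this split is assumed)
    "train_batch_size": 1024,
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 16,
    # "16-bit mixed precision": fp16 shown here; bf16 is equally plausible on A100s
    "fp16": {"enabled": True},
    # ZeRO Stage 3 partitions optimizer state, gradients, and parameters across GPUs
    "zero_optimization": {"stage": 3},
}
```

A config like this could be passed to `transformers.TrainingArguments(deepspeed=ds_config)` or supplied to `deepspeed.initialize(config=ds_config, ...)` in a standalone training script.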