What hardware was used to train this model?
#1, opened by eduardo-alvarez
Hey there! Just curious what hardware was used to tune the model?
Very common hardware haha. A single A6000 for pretraining and two H100s for finetuning. I simply rented them from Vast.ai, and it cost about $60 in total.
Note that if you train on fewer GPUs, or with a smaller per-GPU batch, you should use a larger gradient accumulation parameter to get the same "virtual" batch size.
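To make the "virtual" batch size point concrete, here is a rough sketch of the arithmetic, assuming the usual definition: per-GPU batch size × number of GPUs × gradient accumulation steps. The numbers (16 samples per GPU, 4 accumulation steps) and the function names are made up for illustration and are not the settings used for this model.

```python
import math


def effective_batch(per_device_batch: int, num_gpus: int, accum_steps: int) -> int:
    """Samples that contribute to one optimizer update (the "virtual" batch size)."""
    return per_device_batch * num_gpus * accum_steps


def accumulation_steps(target_batch: int, per_device_batch: int, num_gpus: int) -> int:
    """Smallest accumulation step count whose effective batch reaches target_batch."""
    samples_per_step = per_device_batch * num_gpus
    return math.ceil(target_batch / samples_per_step)


# Hypothetical reference setup: 2 GPUs, 16 samples per GPU, 4 accumulation steps.
reference = effective_batch(per_device_batch=16, num_gpus=2, accum_steps=4)

# Same per-GPU batch on a single GPU: accumulation has to double to compensate.
single_gpu_steps = accumulation_steps(reference, per_device_batch=16, num_gpus=1)

print(reference, single_gpu_steps)  # prints "128 8"
```

If the target is not an exact multiple of the per-step sample count, rounding up gives a slightly larger virtual batch than the reference, which is usually the safer direction.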
Let me know if you have any further questions!