What hardware was used to train this model?

#1
by eduardo-alvarez - opened

Hey there! Just curious what hardware was used to tune the model?

Very common hardware haha. A single A6000 for pretraining and two H100s for finetuning. I simply rented them from Vast.ai, and it cost about $60 in total.

Note that if you train on fewer or smaller GPUs, you should use a larger gradient accumulation value to get the same "virtual" (effective) batch size.
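To make the gradient accumulation note concrete, here is a rough sketch of the arithmetic. It assumes the usual relation: effective batch size = per-device batch size × number of GPUs × accumulation steps. The helper name `accumulation_steps_for` and all the numbers are made up for illustration; they are not the settings actually used for this model.

```python
import math


def accumulation_steps_for(target_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Return the accumulation steps needed to reach at least target_batch samples per optimizer update.

    Hypothetical helper for illustration only; not part of any library.
    """
    if min(target_batch, per_device_batch, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    # Samples processed in one forward/backward pass across all devices.
    samples_per_pass = per_device_batch * num_devices
    # Round up so the effective batch is never smaller than the target.
    return math.ceil(target_batch / samples_per_pass)


# Assumed example values (not the author's real configuration):
# a target effective batch of 64 with 8 samples per GPU.
print(accumulation_steps_for(64, 8, 2))  # two GPUs -> 4
print(accumulation_steps_for(64, 8, 1))  # one GPU  -> 8
```

So halving the number of GPUs (or the per-GPU batch size) means roughly doubling the accumulation steps to keep the same effective batch size.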

Let me know if you have any further questions!
