Is it possible to fine-tune the 7b1 model on 4 A100 (80 GB) GPUs?

#44
by TTCoding - opened

I have tried many configurations to fine-tune 7b1 on four A100s, but unfortunately I get OOM every time. So I am curious about the minimum GPU requirements for fine-tuning this model. Could you share your experiences?

BigScience Workshop org

If you freeze some layers, it is possible even on a single A100.
Check this: https://gitlab.inria.fr/synalp/plm4all/-/tree/main/finetune_accelerate
It is still a draft but it's running.
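For illustration, here is a minimal sketch of the layer-freezing idea, not the exact setup in the linked repo. The checkpoint id bigscience/bloom-7b1 and the choice to freeze the first 20 transformer blocks are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id; the linked repo may use a different one.
model_name = "bigscience/bloom-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Freeze the word embeddings and the first 20 transformer blocks
# (an arbitrary choice for this sketch), leaving only the last
# blocks and the LM head trainable. Frozen parameters need no
# optimizer state or gradients, which is what saves memory.
for param in model.transformer.word_embeddings.parameters():
    param.requires_grad = False
for block in model.transformer.h[:20]:
    for param in block.parameters():
        param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable / 1e6:.1f}M")
```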

@hatimbr hi, how can I convert the fp32 model to fp16, and can I then fine-tune it in fp16 on a 24 GB RTX 3090 Ti?
Also, is the released checkpoint fp32 or quantized?

BigScience Workshop org

Hi @redauzhang, you can pass the parameter torch_dtype=torch.float16 (or, even better, torch_dtype=torch.bfloat16) to the from_pretrained method.
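A short sketch of what that looks like; the checkpoint id bigscience/bloom-7b1 is an assumption here:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the checkpoint weights directly in bfloat16 instead of the
# default fp32, roughly halving the memory needed for the parameters.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",
    torch_dtype=torch.bfloat16,
)
```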

christopher changed discussion status to closed
