[BUG] {'use_reentrant': True} results in "Gradients will be None"

#74
by RonanMcGovern - opened

It seems there's no way to use reentrant gradient checkpointing without errors, which results in high memory usage during fine-tuning.

See this issue

Maybe adding `model.enable_input_require_grads()` could help.
I'm using the code with `{'use_reentrant': True}`.
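For what it's worth, the failure mode can be reproduced with plain `torch.utils.checkpoint` (a minimal sketch; in `transformers`, `enable_input_require_grads()` registers a forward hook on the input embeddings that does essentially the `requires_grad_()` step shown below):

```python
import torch
from torch.utils.checkpoint import checkpoint

lin = torch.nn.Linear(4, 4)

# With use_reentrant=True, checkpointing only builds a graph if at least
# one *input tensor* requires grad; module parameters captured inside the
# function don't count. A plain data tensor triggers the warning
# "None of the inputs have requires_grad=True. Gradients will be None".
x = torch.randn(2, 4)  # requires_grad=False, e.g. output of a frozen embedding
frozen_out = checkpoint(lin, x, use_reentrant=True)
print(frozen_out.requires_grad)  # False: backward() can't reach lin's weights

# Marking the input as requiring grad -- which is what
# model.enable_input_require_grads() effectively does -- restores gradients.
x2 = torch.randn(2, 4).requires_grad_(True)
out = checkpoint(lin, x2, use_reentrant=True)
out.sum().backward()
print(lin.weight.grad is not None)  # True
```

So the warning is about the checkpointed segment's inputs, not the model's trainability as a whole; either making the inputs require grad or switching to `{'use_reentrant': False}` sidesteps it.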
