Adapt modelling for gradient checkpointing

#3
by Panda-vid - opened

Fixed the parameters passed to Model and removed the old gradient checkpointing method used in T5PretrainedModel, since Hugging Face has deprecated it.
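For context, here is a minimal sketch of how gradient checkpointing is toggled through the public `transformers` API once a custom override is gone. It is not the code from this PR: it uses the stock `T5ForConditionalGeneration` class with a tiny, randomly initialised configuration whose sizes are arbitrary assumptions, and it assumes the deprecated method was an override of the `_set_gradient_checkpointing` hook.

```python
from transformers import T5Config, T5ForConditionalGeneration

# Tiny, randomly initialised config so nothing is downloaded.
# All sizes here are arbitrary assumptions chosen for illustration.
config = T5Config(
    vocab_size=100,
    d_model=32,
    d_kv=16,
    d_ff=64,
    num_layers=1,
    num_decoder_layers=1,
    num_heads=2,
)
model = T5ForConditionalGeneration(config)

# Recent transformers releases manage checkpointing in the base class,
# so a model does not need its own _set_gradient_checkpointing override
# (assumed here to be the deprecated method this PR removes).
model.gradient_checkpointing_enable()
enabled = model.is_gradient_checkpointing

model.gradient_checkpointing_disable()
disabled = model.is_gradient_checkpointing

print(enabled, disabled)
```

When training with the `Trainer`, setting `gradient_checkpointing=True` in `TrainingArguments` should have the same effect as calling `gradient_checkpointing_enable()` by hand.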

Ready to merge
This branch is ready to get merged automatically.
