total_train_batch_size does not match train_batch_size?
#2 by Tours0 - opened
Hello, I found that the 'total_train_batch_size' (512) in the model card does not match the 'train_batch_size' (8) and the 'num_devices' (16). Shouldn't 'total_train_batch_size' equal the product of the other two parameters? Or did I misunderstand?
Tours0 changed discussion status to closed
I forgot about gradient accumulation.
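
For anyone landing here with the same question, here is a small sketch of the arithmetic. It assumes the usual relationship that the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps. The function names are made up for illustration, and the accumulation value is only inferred from the three numbers quoted above, not read from the model card.

```python
def total_train_batch_size(train_batch_size, num_devices, gradient_accumulation_steps=1):
    """Effective batch size per optimizer step (assumed relationship)."""
    return train_batch_size * num_devices * gradient_accumulation_steps


def implied_accumulation_steps(total, train_batch_size, num_devices):
    """Back out the accumulation steps that would explain a reported total."""
    per_forward_pass = train_batch_size * num_devices
    if total % per_forward_pass != 0:
        raise ValueError("total is not a multiple of train_batch_size * num_devices")
    return total // per_forward_pass


# Numbers quoted in this thread: total 512, per-device batch 8, 16 devices.
steps = implied_accumulation_steps(512, 8, 16)
print(steps)
print(total_train_batch_size(8, 16, steps))
```

Under that assumption, 8 * 16 gives only 128 samples per forward pass, so the remaining factor has to come from accumulating gradients over several passes before each optimizer step.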