GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 8 processes
----------------------------------------------------------------------------------------------------

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]

  | Name  | Type          | Params
----------------------------------------
0 | model | Float16Module | 2.1 B
----------------------------------------
2.1 B     Trainable params
0         Non-trainable params
2.1 B     Total params
8,538.206 Total estimated model params size (MB)
Epoch 1, global step 613: 'validation_loss' was not in top 5