[2025-01-30 21:40:08,866] INFO: Will use single-gpu: NVIDIA A100 80GB PCIe
[2025-01-30 21:40:08,867] INFO: configured dtype=torch.bfloat16 for autocast
[2025-01-30 21:40:08,952] INFO: layer conv_id_0 using attention_type=flash
[2025-01-30 21:40:08,984] INFO: layer conv_reg_0 using attention_type=flash
[2025-01-30 21:40:09,015] INFO: layer conv_id_1 using attention_type=flash
[2025-01-30 21:40:09,047] INFO: layer conv_reg_1 using attention_type=flash
[2025-01-30 21:40:09,079] INFO: layer conv_id_2 using attention_type=flash
[2025-01-30 21:40:09,111] INFO: layer conv_reg_2 using attention_type=flash
[2025-01-30 21:40:09,408] INFO: MLPF(
  (nn0_id): ModuleList(
    (0-1): 2 x Sequential(
      (0): Linear(in_features=17, out_features=1024, bias=True)
      (1): ReLU()
      (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (3): Dropout(p=0.0, inplace=False)
      (4): Linear(in_features=1024, out_features=1024, bias=True)
    )
  )
  (nn0_reg): ModuleList(
    (0-1): 2 x Sequential(
      (0): Linear(in_features=17, out_features=1024, bias=True)
      (1): ReLU()
      (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (3): Dropout(p=0.0, inplace=False)
      (4): Linear(in_features=1024, out_features=1024, bias=True)
    )
  )
  (conv_id): ModuleList(
    (0-2): 3 x PreLnSelfAttentionLayer(
      (mha): MultiheadAttention(
        (out_proj): NonDynamicallyQuantizableLinear(in_features=1024, out_features=1024, bias=True)
      )
      (norm0): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (seq): Sequential(
        (0): Linear(in_features=1024, out_features=1024, bias=True)
        (1): ReLU()
        (2): Linear(in_features=1024, out_features=1024, bias=True)
        (3): ReLU()
      )
      (dropout): Dropout(p=0.0, inplace=False)
    )
  )
  (conv_reg): ModuleList(
    (0-2): 3 x PreLnSelfAttentionLayer(
      (mha): MultiheadAttention(
        (out_proj): NonDynamicallyQuantizableLinear(in_features=1024, out_features=1024, bias=True)
      )
      (norm0): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (seq): Sequential(
        (0): Linear(in_features=1024, out_features=1024, bias=True)
        (1): ReLU()
        (2): Linear(in_features=1024, out_features=1024, bias=True)
        (3): ReLU()
      )
      (dropout): Dropout(p=0.0, inplace=False)
    )
  )
  (nn_binary_particle): Sequential(
    (0): Linear(in_features=1024, out_features=1024, bias=True)
    (1): ReLU()
    (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    (3): Dropout(p=0.0, inplace=False)
    (4): Linear(in_features=1024, out_features=2, bias=True)
  )
  (nn_pid): Sequential(
    (0): Linear(in_features=1024, out_features=1024, bias=True)
    (1): ReLU()
    (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    (3): Dropout(p=0.0, inplace=False)
    (4): Linear(in_features=1024, out_features=6, bias=True)
  )
  (nn_pu): Sequential(
    (0): Linear(in_features=1024, out_features=1024, bias=True)
    (1): ReLU()
    (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    (3): Dropout(p=0.0, inplace=False)
    (4): Linear(in_features=1024, out_features=1, bias=True)
  )
  (nn_pt): RegressionOutput(
    (nn): ModuleList(
      (0-1): 2 x Sequential(
        (0): Linear(in_features=1024, out_features=1024, bias=True)
        (1): ReLU()
        (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (3): Dropout(p=0.0, inplace=False)
        (4): Linear(in_features=1024, out_features=1, bias=True)
      )
    )
  )
  (nn_eta): RegressionOutput(
    (nn): Sequential(
      (0): Linear(in_features=1024, out_features=1024, bias=True)
      (1): ReLU()
      (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (3): Dropout(p=0.0, inplace=False)
      (4): Linear(in_features=1024, out_features=2, bias=True)
    )
  )
  (nn_sin_phi): RegressionOutput(
    (nn): Sequential(
      (0): Linear(in_features=1024, out_features=1024, bias=True)
      (1): ReLU()
      (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (3): Dropout(p=0.0, inplace=False)
      (4): Linear(in_features=1024, out_features=2, bias=True)
    )
  )
  (nn_cos_phi): RegressionOutput(
    (nn): Sequential(
      (0): Linear(in_features=1024, out_features=1024, bias=True)
      (1): ReLU()
      (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (3): Dropout(p=0.0, inplace=False)
      (4): Linear(in_features=1024, out_features=2, bias=True)
    )
  )
  (nn_energy): RegressionOutput(
    (nn): ModuleList(
      (0-1): 2 x Sequential(
        (0): Linear(in_features=1024, out_features=1024, bias=True)
        (1): ReLU()
        (2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (3): Dropout(p=0.0, inplace=False)
        (4): Linear(in_features=1024, out_features=1, bias=True)
      )
    )
  )
  (final_norm_id): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  (final_norm_reg): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
[2025-01-30 21:40:09,409] INFO: Trainable parameters: 52630547
[2025-01-30 21:40:09,409] INFO: Non-trainable parameters: 0
[2025-01-30 21:40:09,409] INFO: Total parameters: 52630547
[2025-01-30 21:40:09,411] INFO: Modules Trainable parameters Non-trainable parameters
nn0_id.0.0.weight 17408 0
nn0_id.0.0.bias 1024 0
nn0_id.0.2.weight 1024 0
nn0_id.0.2.bias 1024 0
nn0_id.0.4.weight 1048576 0
nn0_id.0.4.bias 1024 0
nn0_id.1.0.weight 17408 0
nn0_id.1.0.bias 1024 0
nn0_id.1.2.weight 1024 0
nn0_id.1.2.bias 1024 0
nn0_id.1.4.weight 1048576 0
nn0_id.1.4.bias 1024 0
nn0_reg.0.0.weight 17408 0
nn0_reg.0.0.bias 1024 0
nn0_reg.0.2.weight 1024 0
nn0_reg.0.2.bias 1024 0
nn0_reg.0.4.weight 1048576 0
nn0_reg.0.4.bias 1024 0
nn0_reg.1.0.weight 17408 0
nn0_reg.1.0.bias 1024 0
nn0_reg.1.2.weight 1024 0
nn0_reg.1.2.bias 1024 0
nn0_reg.1.4.weight 1048576 0
nn0_reg.1.4.bias 1024 0
conv_id.0.mha.in_proj_weight 3145728 0
conv_id.0.mha.in_proj_bias 3072 0
conv_id.0.mha.out_proj.weight 1048576 0
conv_id.0.mha.out_proj.bias 1024 0
conv_id.0.norm0.weight 1024 0
conv_id.0.norm0.bias 1024 0
conv_id.0.norm1.weight 1024 0
conv_id.0.norm1.bias 1024 0
conv_id.0.seq.0.weight 1048576 0
conv_id.0.seq.0.bias 1024 0
conv_id.0.seq.2.weight 1048576 0
conv_id.0.seq.2.bias 1024 0
conv_id.1.mha.in_proj_weight 3145728 0
conv_id.1.mha.in_proj_bias 3072 0
conv_id.1.mha.out_proj.weight 1048576 0
conv_id.1.mha.out_proj.bias 1024 0
conv_id.1.norm0.weight 1024 0
conv_id.1.norm0.bias 1024 0
conv_id.1.norm1.weight 1024 0
conv_id.1.norm1.bias 1024 0
conv_id.1.seq.0.weight 1048576 0
conv_id.1.seq.0.bias 1024 0
conv_id.1.seq.2.weight 1048576 0
conv_id.1.seq.2.bias 1024 0
conv_id.2.mha.in_proj_weight 3145728 0
conv_id.2.mha.in_proj_bias 3072 0
conv_id.2.mha.out_proj.weight 1048576 0
conv_id.2.mha.out_proj.bias 1024 0
conv_id.2.norm0.weight 1024 0
conv_id.2.norm0.bias 1024 0
conv_id.2.norm1.weight 1024 0
conv_id.2.norm1.bias 1024 0
conv_id.2.seq.0.weight 1048576 0
conv_id.2.seq.0.bias 1024 0
conv_id.2.seq.2.weight 1048576 0
conv_id.2.seq.2.bias 1024 0
conv_reg.0.mha.in_proj_weight 3145728 0
conv_reg.0.mha.in_proj_bias 3072 0
conv_reg.0.mha.out_proj.weight 1048576 0
conv_reg.0.mha.out_proj.bias 1024 0
conv_reg.0.norm0.weight 1024 0
conv_reg.0.norm0.bias 1024 0
conv_reg.0.norm1.weight 1024 0
conv_reg.0.norm1.bias 1024 0
conv_reg.0.seq.0.weight 1048576 0
conv_reg.0.seq.0.bias 1024 0
conv_reg.0.seq.2.weight 1048576 0
conv_reg.0.seq.2.bias 1024 0
conv_reg.1.mha.in_proj_weight 3145728 0
conv_reg.1.mha.in_proj_bias 3072 0
conv_reg.1.mha.out_proj.weight 1048576 0
conv_reg.1.mha.out_proj.bias 1024 0
conv_reg.1.norm0.weight 1024 0
conv_reg.1.norm0.bias 1024 0
conv_reg.1.norm1.weight 1024 0
conv_reg.1.norm1.bias 1024 0
conv_reg.1.seq.0.weight 1048576 0
conv_reg.1.seq.0.bias 1024 0
conv_reg.1.seq.2.weight 1048576 0
conv_reg.1.seq.2.bias 1024 0
conv_reg.2.mha.in_proj_weight 3145728 0
conv_reg.2.mha.in_proj_bias 3072 0
conv_reg.2.mha.out_proj.weight 1048576 0
conv_reg.2.mha.out_proj.bias 1024 0
conv_reg.2.norm0.weight 1024 0
conv_reg.2.norm0.bias 1024 0
conv_reg.2.norm1.weight 1024 0
conv_reg.2.norm1.bias 1024 0
conv_reg.2.seq.0.weight 1048576 0
conv_reg.2.seq.0.bias 1024 0
conv_reg.2.seq.2.weight 1048576 0
conv_reg.2.seq.2.bias 1024 0
nn_binary_particle.0.weight 1048576 0
nn_binary_particle.0.bias 1024 0
nn_binary_particle.2.weight 1024 0
nn_binary_particle.2.bias 1024 0
nn_binary_particle.4.weight 2048 0
nn_binary_particle.4.bias 2 0
nn_pid.0.weight 1048576 0
nn_pid.0.bias 1024 0
nn_pid.2.weight 1024 0
nn_pid.2.bias 1024 0
nn_pid.4.weight 6144 0
nn_pid.4.bias 6 0
nn_pu.0.weight 1048576 0
nn_pu.0.bias 1024 0
nn_pu.2.weight 1024 0
nn_pu.2.bias 1024 0
nn_pu.4.weight 1024 0
nn_pu.4.bias 1 0
nn_pt.nn.0.0.weight 1048576 0
nn_pt.nn.0.0.bias 1024 0
nn_pt.nn.0.2.weight 1024 0
nn_pt.nn.0.2.bias 1024 0
nn_pt.nn.0.4.weight 1024 0
nn_pt.nn.0.4.bias 1 0
nn_pt.nn.1.0.weight 1048576 0
nn_pt.nn.1.0.bias 1024 0
nn_pt.nn.1.2.weight 1024 0
nn_pt.nn.1.2.bias 1024 0
nn_pt.nn.1.4.weight 1024 0
nn_pt.nn.1.4.bias 1 0
nn_eta.nn.0.weight 1048576 0
nn_eta.nn.0.bias 1024 0
nn_eta.nn.2.weight 1024 0
nn_eta.nn.2.bias 1024 0
nn_eta.nn.4.weight 2048 0
nn_eta.nn.4.bias 2 0
nn_sin_phi.nn.0.weight 1048576 0
nn_sin_phi.nn.0.bias 1024 0
nn_sin_phi.nn.2.weight 1024 0
nn_sin_phi.nn.2.bias 1024 0
nn_sin_phi.nn.4.weight 2048 0
nn_sin_phi.nn.4.bias 2 0
nn_cos_phi.nn.0.weight 1048576 0
nn_cos_phi.nn.0.bias 1024 0
nn_cos_phi.nn.2.weight 1024 0
nn_cos_phi.nn.2.bias 1024 0
nn_cos_phi.nn.4.weight 2048 0
nn_cos_phi.nn.4.bias 2 0
nn_energy.nn.0.0.weight 1048576 0
nn_energy.nn.0.0.bias 1024 0
nn_energy.nn.0.2.weight 1024 0
nn_energy.nn.0.2.bias 1024 0
nn_energy.nn.0.4.weight 1024 0
nn_energy.nn.0.4.bias 1 0
nn_energy.nn.1.0.weight 1048576 0
nn_energy.nn.1.0.bias 1024 0
nn_energy.nn.1.2.weight 1024 0
nn_energy.nn.1.2.bias 1024 0
nn_energy.nn.1.4.weight 1024 0
nn_energy.nn.1.4.bias 1 0
final_norm_id.weight 1024 0
final_norm_id.bias 1024 0
final_norm_reg.weight 1024 0
final_norm_reg.bias 1024 0
[2025-01-30 21:40:09,412] INFO: Creating experiment dir experiments/pyg-clic_20250130_214007_333962
[2025-01-30 21:40:09,412] INFO: Model directory experiments/pyg-clic_20250130_214007_333962
[2025-01-30 21:40:14,708] INFO: train_dataset: clic_edm_qq_pf, 719492
[2025-01-30 21:40:14,720] INFO: train_dataset: clic_edm_qq_pf, 719490
[2025-01-30 21:40:14,732] INFO: train_dataset: clic_edm_qq_pf, 719489
[2025-01-30 21:40:14,743] INFO: train_dataset: clic_edm_qq_pf, 719515
[2025-01-30 21:40:14,755] INFO: train_dataset: clic_edm_qq_pf, 719510
[2025-01-30 21:40:14,769] INFO: train_dataset: clic_edm_qq_pf, 719503
[2025-01-30 21:40:14,780] INFO: train_dataset: clic_edm_qq_pf, 719509
[2025-01-30 21:40:14,794] INFO: train_dataset: clic_edm_qq_pf, 719484
[2025-01-30 21:40:14,806] INFO: train_dataset: clic_edm_qq_pf, 719474
[2025-01-30 21:40:14,817] INFO: train_dataset: clic_edm_qq_pf, 720386
[2025-01-30 21:40:14,836] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,855] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,874] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,895] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,914] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,933] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,952] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,971] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:14,990] INFO: train_dataset: clic_edm_ttbar_pf, 713900
[2025-01-30 21:40:15,010] INFO: train_dataset: clic_edm_ttbar_pf, 714700
[2025-01-30 21:40:15,023] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,035] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,046] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,058] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,069] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,081] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,093] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,104] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,116] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720000
[2025-01-30 21:40:15,127] INFO: train_dataset: clic_edm_ww_fullhad_pf, 720700
[2025-01-30 21:40:16,348] INFO: valid_dataset: clic_edm_qq_pf, 79948
[2025-01-30 21:40:16,353] INFO: valid_dataset: clic_edm_qq_pf, 79950
[2025-01-30 21:40:16,358] INFO: valid_dataset: clic_edm_qq_pf, 79939
[2025-01-30 21:40:16,363] INFO: valid_dataset: clic_edm_qq_pf, 79939
[2025-01-30 21:40:16,368] INFO: valid_dataset: clic_edm_qq_pf, 79950
[2025-01-30 21:40:16,372] INFO: valid_dataset: clic_edm_qq_pf, 79950
[2025-01-30 21:40:16,377] INFO: valid_dataset: clic_edm_qq_pf, 79938
[2025-01-30 21:40:16,382] INFO: valid_dataset: clic_edm_qq_pf, 79957
[2025-01-30 21:40:16,387] INFO: valid_dataset: clic_edm_qq_pf, 79955
[2025-01-30 21:40:16,391] INFO: valid_dataset: clic_edm_qq_pf, 80035
[2025-01-30 21:40:16,397] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,404] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,411] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,417] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,558] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,568] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,575] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,582] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,589] INFO: valid_dataset: clic_edm_ttbar_pf, 79300
[2025-01-30 21:40:16,596] INFO: valid_dataset: clic_edm_ttbar_pf, 79700
[2025-01-30 21:40:16,602] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,607] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,613] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,619] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,624] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,629] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,635] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,641] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,646] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80000
[2025-01-30 21:40:16,652] INFO: valid_dataset: clic_edm_ww_fullhad_pf, 80100
[2025-01-31 05:15:26,073] INFO: Rank 0: epoch=1/10 train_loss=2.6718 valid_loss=2.4242 stale=0 epoch_train_time=438.09m epoch_valid_time=17.05m epoch_total_time=455.13m eta=4096.4m
[2025-01-31 05:15:26,078] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2025-01-31 05:15:26,185] INFO: test_dataset: clic_edm_qq_pf, 2000
[2025-01-31 05:15:26,214] INFO: Running predictions on clic_edm_qq_pf
[2025-01-31 05:15:30,680] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_0.parquet
[2025-01-31 05:15:31,116] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_1.parquet
[2025-01-31 05:15:31,587] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_2.parquet
[2025-01-31 05:15:32,038] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_3.parquet
[2025-01-31 05:15:32,474] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_4.parquet
[2025-01-31 05:15:32,917] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_5.parquet
[2025-01-31 05:15:33,343] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_6.parquet
[2025-01-31 05:15:33,767] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_1/clic_edm_qq_pf/pred_0_7.parquet
[2025-01-31 05:15:33,850] INFO: Time taken to make predictions on device 0 is: 0.12 min
[2025-01-31 12:52:42,662] INFO: Rank 0: epoch=2/10 train_loss=2.2869 valid_loss=2.2407 stale=0 epoch_train_time=439.39m epoch_valid_time=17.06m epoch_total_time=456.45m eta=3649.7m
[2025-01-31 12:52:42,665] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2025-01-31 12:52:42,665] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-01-31 12:52:42,770] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-01-31 12:52:42,770] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-01-31 12:52:42,774] INFO: Running predictions on clic_edm_qq_pf [2025-01-31 12:52:42,774] INFO: Running predictions on clic_edm_qq_pf [2025-01-31 12:52:43,950] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_0.parquet [2025-01-31 12:52:43,950] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_0.parquet [2025-01-31 12:52:44,447] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_1.parquet [2025-01-31 12:52:44,447] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_1.parquet [2025-01-31 12:52:44,886] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_2.parquet [2025-01-31 12:52:44,886] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_2.parquet [2025-01-31 12:52:45,345] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_3.parquet [2025-01-31 12:52:45,345] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_3.parquet [2025-01-31 12:52:45,784] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_4.parquet [2025-01-31 12:52:45,784] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_4.parquet [2025-01-31 12:52:46,210] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_5.parquet [2025-01-31 12:52:46,210] INFO: Saved predictions at 
experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_5.parquet [2025-01-31 12:52:46,661] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_6.parquet [2025-01-31 12:52:47,038] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_2/clic_edm_qq_pf/pred_0_7.parquet [2025-01-31 12:52:47,132] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-01-31 20:31:16,701] INFO: Rank 0: epoch=3/10 train_loss=2.1455 valid_loss=2.1378 stale=0 epoch_train_time=440.66m epoch_valid_time=17.10m epoch_total_time=457.76m eta=3199.0m [2025-01-31 20:31:16,712] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-01-31 20:31:16,811] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-01-31 20:31:16,815] INFO: Running predictions on clic_edm_qq_pf [2025-01-31 20:31:17,940] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_0.parquet [2025-01-31 20:31:18,389] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_1.parquet [2025-01-31 20:31:18,829] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_2.parquet [2025-01-31 20:31:19,289] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_3.parquet [2025-01-31 20:31:19,772] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_4.parquet [2025-01-31 20:31:20,214] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_5.parquet [2025-01-31 20:31:20,649] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_6.parquet [2025-01-31 20:31:21,306] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_3/clic_edm_qq_pf/pred_0_7.parquet [2025-01-31 20:31:21,391] INFO: Time taken to make
predictions on device 0 is: 0.07 min [2025-02-01 04:08:57,727] INFO: Rank 0: epoch=4/10 train_loss=2.0610 valid_loss=2.0769 stale=0 epoch_train_time=439.88m epoch_valid_time=16.99m epoch_total_time=456.87m eta=2743.0m [2025-02-01 04:08:57,740] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-01 04:08:57,850] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-01 04:08:57,854] INFO: Running predictions on clic_edm_qq_pf [2025-02-01 04:08:59,031] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_0.parquet [2025-02-01 04:08:59,477] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_1.parquet [2025-02-01 04:08:59,915] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_2.parquet [2025-02-01 04:09:00,339] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_3.parquet [2025-02-01 04:09:00,792] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_4.parquet [2025-02-01 04:09:01,227] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_5.parquet [2025-02-01 04:09:01,670] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_6.parquet [2025-02-01 04:09:02,110] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_4/clic_edm_qq_pf/pred_0_7.parquet [2025-02-01 04:09:02,203] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-01 11:46:04,083] INFO: Rank 0: epoch=5/10 train_loss=2.0158 valid_loss=2.0395 stale=0 epoch_train_time=439.32m epoch_valid_time=16.98m epoch_total_time=456.30m eta=2285.8m [2025-02-01 11:46:04,095] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-01 11:46:04,203] INFO: test_dataset: clic_edm_qq_pf, 2000
[2025-02-01 11:46:04,207] INFO: Running predictions on clic_edm_qq_pf [2025-02-01 11:46:05,413] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_0.parquet [2025-02-01 11:46:05,846] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_1.parquet [2025-02-01 11:46:06,320] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_2.parquet [2025-02-01 11:46:06,755] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_3.parquet [2025-02-01 11:46:07,214] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_4.parquet [2025-02-01 11:46:07,655] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_5.parquet [2025-02-01 11:46:08,104] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_6.parquet [2025-02-01 11:46:08,530] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_5/clic_edm_qq_pf/pred_0_7.parquet [2025-02-01 11:46:08,627] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-01 19:21:19,119] INFO: Rank 0: epoch=6/10 train_loss=1.9790 valid_loss=1.9979 stale=0 epoch_train_time=437.34m epoch_valid_time=17.09m epoch_total_time=454.44m eta=1827.4m [2025-02-01 19:21:19,129] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-01 19:21:19,231] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-01 19:21:19,234] INFO: Running predictions on clic_edm_qq_pf [2025-02-01 19:21:20,456] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_0.parquet [2025-02-01 19:21:20,944] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_1.parquet [2025-02-01 19:21:21,387] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_2.parquet [2025-02-01 19:21:21,820] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_3.parquet [2025-02-01 19:21:22,318] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_4.parquet [2025-02-01 19:21:22,759] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_5.parquet [2025-02-01 19:21:23,214] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_6.parquet [2025-02-01 19:21:23,615] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_6/clic_edm_qq_pf/pred_0_7.parquet [2025-02-01 19:21:23,705] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-02
02:57:38,911] INFO: Rank 0: epoch=7/10 train_loss=1.9461 valid_loss=1.9723 stale=0 epoch_train_time=438.41m epoch_valid_time=17.12m epoch_total_time=455.53m eta=1370.3m [2025-02-02 02:57:38,922] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-02 02:57:39,028] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-02 02:57:39,032] INFO: Running predictions on clic_edm_qq_pf [2025-02-02 02:57:40,209] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_0.parquet [2025-02-02 02:57:40,656] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_1.parquet [2025-02-02 02:57:41,114] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_2.parquet [2025-02-02 02:57:41,553] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_3.parquet [2025-02-02 02:57:42,053] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_4.parquet [2025-02-02 02:57:42,500] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_5.parquet [2025-02-02 02:57:42,933] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_6.parquet [2025-02-02 02:57:43,419] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_7/clic_edm_qq_pf/pred_0_7.parquet [2025-02-02 02:57:43,501] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-02 10:33:18,827] INFO: Rank 0: epoch=8/10 train_loss=1.9182 valid_loss=1.9507 stale=0 epoch_train_time=437.87m epoch_valid_time=16.98m epoch_total_time=454.85m eta=913.3m [2025-02-02 10:33:18,839] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-02 10:33:18,940] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-02
10:33:18,944] INFO: Running predictions on clic_edm_qq_pf [2025-02-02 10:33:20,332] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_0.parquet [2025-02-02 10:33:20,851] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_1.parquet [2025-02-02 10:33:21,350] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_2.parquet [2025-02-02 10:33:21,806] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_3.parquet [2025-02-02 10:33:22,240] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_4.parquet [2025-02-02 10:33:22,731] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_5.parquet [2025-02-02 10:33:23,180] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_6.parquet [2025-02-02 10:33:23,589] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_8/clic_edm_qq_pf/pred_0_7.parquet [2025-02-02 10:33:23,700] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-02 18:09:19,586] INFO: Rank 0: epoch=9/10 train_loss=1.8957 valid_loss=1.9426 stale=0 epoch_train_time=438.14m epoch_valid_time=17.08m epoch_total_time=455.21m eta=456.6m [2025-02-02 18:09:19,597] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-02 18:09:19,703] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-02 18:09:19,707] INFO: Running predictions on clic_edm_qq_pf [2025-02-02 18:09:21,145] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_0.parquet [2025-02-02 18:09:21,597] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_1.parquet [2025-02-02 18:09:22,038] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_2.parquet [2025-02-02 18:09:22,488] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_3.parquet [2025-02-02 18:09:22,954] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_4.parquet [2025-02-02 18:09:23,400] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_5.parquet [2025-02-02 18:09:23,868] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_6.parquet [2025-02-02 18:09:24,286] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_9/clic_edm_qq_pf/pred_0_7.parquet [2025-02-02 18:09:24,410] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-03 01:45:19,250] INFO: Rank 0: epoch=10/10 train_loss=1.8830 valid_loss=1.9328 stale=0 epoch_train_time=438.17m epoch_valid_time=17.00m
epoch_total_time=455.17m eta=0.0m [2025-02-03 01:45:19,260] INFO: split_configs=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [2025-02-03 01:45:19,358] INFO: test_dataset: clic_edm_qq_pf, 2000 [2025-02-03 01:45:19,362] INFO: Running predictions on clic_edm_qq_pf [2025-02-03 01:45:20,600] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_0.parquet [2025-02-03 01:45:21,050] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_1.parquet [2025-02-03 01:45:21,495] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_2.parquet [2025-02-03 01:45:21,946] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_3.parquet [2025-02-03 01:45:22,366] INFO: Saved predictions at
experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_4.parquet [2025-02-03 01:45:22,799] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_5.parquet [2025-02-03 01:45:23,216] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_6.parquet [2025-02-03 01:45:23,666] INFO: Saved predictions at experiments/pyg-clic_20250130_214007_333962/preds_epoch_10/clic_edm_qq_pf/pred_0_7.parquet [2025-02-03 01:45:23,773] INFO: Time taken to make predictions on device 0 is: 0.07 min [2025-02-03 01:46:06,314] INFO: Training completed. Total time on device 0: 4565.822min