bigscience-bot committed on
Commit 5599680 · 1 Parent(s): f07b51a
Files changed (1)
  1. logs/main_log.txt +136 -0
logs/main_log.txt CHANGED
@@ -67457,3 +67457,139 @@ time (ms)
67457
  time (ms)
67458
  iteration 857/ 292968 | consumed samples: 1755136 | consumed tokens: 146898944 | elapsed time per iteration (ms): 77728.7 | learning rate: 6.000E-05 | global batch size: 2048 | loss scale: 1.0 | grad norm: 45230.465 | num zeros: 0.0 | curriculum seqlen: 104 | number of skipped iterations: 0 | number of nan iterations: 0 |
67459
  time (ms)
67460
+ iteration 858/ 292968 | consumed samples: 1757184 | consumed tokens: 147111936 | elapsed time per iteration (ms): 77179.2 | learning rate: 6.000E-05 | global batch size: 2048 | loss scale: 1.0 | grad norm: 45230.465 | num zeros: 0.0 | curriculum seqlen: 104 | number of skipped iterations: 0 | number of nan iterations: 0 |
67461
+ time (ms)
67462
+ saving checkpoint at iteration 858 to /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints
67463
+ [2021-10-23 15:33:21,757] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/mp_rank_00_model_states.pt
67464
+ [2021-10-23 15:33:21,796] [INFO] [logging.py:68:log_dist] [Rank 1] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/mp_rank_01_model_states.pt
67465
+ [2021-10-23 15:33:34,726] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_06_optim_states.pt
67466
+ [2021-10-23 15:33:34,782] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_112_optim_states.pt
67467
+ [2021-10-23 15:33:34,799] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_75_optim_states.pt
67468
+ [2021-10-23 15:33:34,806] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_116_optim_states.pt
67469
+ [2021-10-23 15:33:34,829] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_87_optim_states.pt
67470
+ [2021-10-23 15:33:34,872] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_76_optim_states.pt
67471
+ [2021-10-23 15:33:34,903] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_106_optim_states.pt
67472
+ [2021-10-23 15:33:34,946] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_99_optim_states.pt
67473
+ [2021-10-23 15:33:34,981] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_100_optim_states.pt
67474
+ [2021-10-23 15:33:35,033] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_93_optim_states.pt
67475
+ [2021-10-23 15:33:35,036] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_83_optim_states.pt
67476
+ [2021-10-23 15:33:35,049] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_78_optim_states.pt
67477
+ [2021-10-23 15:33:35,073] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_04_optim_states.pt
67478
+ [2021-10-23 15:33:35,105] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_54_optim_states.pt
67479
+ [2021-10-23 15:33:35,142] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_55_optim_states.pt
67480
+ [2021-10-23 15:33:35,146] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_111_optim_states.pt
67481
+ [2021-10-23 15:33:35,168] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_68_optim_states.pt
67482
+ [2021-10-23 15:33:35,197] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_118_optim_states.pt
67483
+ [2021-10-23 15:33:35,247] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_110_optim_states.pt
67484
+ [2021-10-23 15:33:35,249] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_94_optim_states.pt
67485
+ [2021-10-23 15:33:35,292] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_89_optim_states.pt
67486
+ [2021-10-23 15:33:35,317] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_85_optim_states.pt
67487
+ [2021-10-23 15:33:35,338] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_72_optim_states.pt
67488
+ [2021-10-23 15:33:35,375] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_105_optim_states.pt
67489
+ [2021-10-23 15:33:35,376] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_69_optim_states.pt
67490
+ [2021-10-23 15:33:35,438] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_97_optim_states.pt
67491
+ [2021-10-23 15:33:35,459] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_91_optim_states.pt
67492
+ [2021-10-23 15:33:35,497] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_103_optim_states.pt
67493
+ [2021-10-23 15:33:35,532] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_21_optim_states.pt
67494
+ [2021-10-23 15:33:35,691] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_23_optim_states.pt
67495
+ [2021-10-23 15:33:35,827] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_114_optim_states.pt
67496
+ [2021-10-23 15:33:35,840] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_50_optim_states.pt
67497
+ [2021-10-23 15:33:35,848] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_82_optim_states.pt
67498
+ [2021-10-23 15:33:35,859] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_58_optim_states.pt
67499
+ [2021-10-23 15:33:35,867] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_79_optim_states.pt
67500
+ [2021-10-23 15:33:35,909] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_49_optim_states.pt
67501
+ [2021-10-23 15:33:35,933] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_95_optim_states.pt
67502
+ [2021-10-23 15:33:35,947] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_119_optim_states.pt
67503
+ [2021-10-23 15:33:35,949] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_121_optim_states.pt
67504
+ [2021-10-23 15:33:35,950] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_46_optim_states.pt
67505
+ [2021-10-23 15:33:35,967] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_45_optim_states.pt
67506
+ [2021-10-23 15:33:35,984] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_84_optim_states.pt
67507
+ [2021-10-23 15:33:35,993] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_80_optim_states.pt
67508
+ [2021-10-23 15:33:35,997] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_60_optim_states.pt
67509
+ [2021-10-23 15:33:36,008] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_05_optim_states.pt
67510
+ [2021-10-23 15:33:36,015] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_96_optim_states.pt
67511
+ [2021-10-23 15:33:36,016] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_73_optim_states.pt
67512
+ [2021-10-23 15:33:36,045] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_102_optim_states.pt
67513
+ [2021-10-23 15:33:36,066] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_81_optim_states.pt
67514
+ [2021-10-23 15:33:36,066] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_122_optim_states.pt
67515
+ [2021-10-23 15:33:36,074] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_86_optim_states.pt
67516
+ [2021-10-23 15:33:36,088] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_13_optim_states.pt
67517
+ [2021-10-23 15:33:36,096] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_88_optim_states.pt
67518
+ [2021-10-23 15:33:36,103] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_104_optim_states.pt
67519
+ [2021-10-23 15:33:36,104] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_38_optim_states.pt
67520
+ [2021-10-23 15:33:36,106] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_30_optim_states.pt
67521
+ [2021-10-23 15:33:36,112] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_07_optim_states.pt
67522
+ [2021-10-23 15:33:36,138] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_115_optim_states.pt
67523
+ [2021-10-23 15:33:36,138] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_98_optim_states.pt
67524
+ [2021-10-23 15:33:36,157] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_74_optim_states.pt
67525
+ [2021-10-23 15:33:36,160] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_66_optim_states.pt
67526
+ [2021-10-23 15:33:36,167] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_107_optim_states.pt
67527
+ [2021-10-23 15:33:36,175] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_08_optim_states.pt
67528
+ [2021-10-23 15:33:36,205] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_12_optim_states.pt
67529
+ [2021-10-23 15:33:36,234] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_77_optim_states.pt
67530
+ [2021-10-23 15:33:36,235] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_41_optim_states.pt
67531
+ [2021-10-23 15:33:36,246] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_109_optim_states.pt
67532
+ [2021-10-23 15:33:36,287] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_43_optim_states.pt
67533
+ [2021-10-23 15:33:36,288] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_25_optim_states.pt
67534
+ [2021-10-23 15:33:36,289] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_113_optim_states.pt
67535
+ [2021-10-23 15:33:36,292] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_90_optim_states.pt
67536
+ [2021-10-23 15:33:36,303] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_70_optim_states.pt
67537
+ [2021-10-23 15:33:36,311] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_62_optim_states.pt
67538
+ [2021-10-23 15:33:36,312] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_92_optim_states.pt
67539
+ [2021-10-23 15:33:36,325] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_47_optim_states.pt
67540
+ [2021-10-23 15:33:36,331] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_10_optim_states.pt
67541
+ [2021-10-23 15:33:36,338] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_57_optim_states.pt
67542
+ [2021-10-23 15:33:36,345] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_101_optim_states.pt
67543
+ [2021-10-23 15:33:36,379] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_117_optim_states.pt
67544
+ [2021-10-23 15:33:36,411] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_71_optim_states.pt
67545
+ [2021-10-23 15:33:36,411] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_108_optim_states.pt
67546
+ [2021-10-23 15:33:36,416] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_17_optim_states.pt
67547
+ [2021-10-23 15:33:36,424] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_48_optim_states.pt
67548
+ [2021-10-23 15:33:36,466] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_39_optim_states.pt
67549
+ [2021-10-23 15:33:36,501] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_18_optim_states.pt
67550
+ [2021-10-23 15:33:36,522] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_65_optim_states.pt
67551
+ [2021-10-23 15:33:36,539] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_29_optim_states.pt
67552
+ [2021-10-23 15:33:36,548] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_27_optim_states.pt
67553
+ [2021-10-23 15:33:36,549] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_32_optim_states.pt
67554
+ [2021-10-23 15:33:36,559] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_34_optim_states.pt
67555
+ [2021-10-23 15:33:36,572] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_51_optim_states.pt
67556
+ [2021-10-23 15:33:36,576] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_64_optim_states.pt
67557
+ [2021-10-23 15:33:36,642] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_19_optim_states.pt
67558
+ [2021-10-23 15:33:36,686] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_14_optim_states.pt
67559
+ [2021-10-23 15:33:36,700] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_44_optim_states.pt
67560
+ [2021-10-23 15:33:36,726] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_56_optim_states.pt
67561
+ [2021-10-23 15:33:36,823] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_59_optim_states.pt
67562
+ [2021-10-23 15:33:36,840] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_67_optim_states.pt
67563
+ [2021-10-23 15:33:36,850] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_15_optim_states.pt
67564
+ [2021-10-23 15:33:36,942] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_28_optim_states.pt
67565
+ [2021-10-23 15:33:37,006] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_22_optim_states.pt
67566
+ [2021-10-23 15:33:37,121] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_26_optim_states.pt
67567
+ [2021-10-23 15:33:37,131] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_35_optim_states.pt
67568
+ [2021-10-23 15:33:37,157] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_42_optim_states.pt
67569
+ [2021-10-23 15:33:37,172] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_36_optim_states.pt
67570
+ [2021-10-23 15:33:37,186] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_24_optim_states.pt
67571
+ [2021-10-23 15:33:37,187] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_37_optim_states.pt
67572
+ [2021-10-23 15:33:37,191] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_40_optim_states.pt
67573
+ [2021-10-23 15:33:37,312] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_31_optim_states.pt
67574
+ [2021-10-23 15:33:37,348] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_33_optim_states.pt
67575
+ [2021-10-23 15:33:37,349] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_16_optim_states.pt
67576
+ [2021-10-23 15:33:37,735] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_00_optim_states.pt
67577
+ [2021-10-23 15:33:38,006] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_01_optim_states.pt
67578
+ [2021-10-23 15:33:38,182] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_123_optim_states.pt
67579
+ [2021-10-23 15:33:38,416] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_20_optim_states.pt
67580
+ [2021-10-23 15:33:38,831] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_120_optim_states.pt
67581
+ [2021-10-23 15:33:38,958] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_124_optim_states.pt
67582
+ [2021-10-23 15:33:38,963] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_126_optim_states.pt
67583
+ [2021-10-23 15:33:39,107] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_02_optim_states.pt
67584
+ [2021-10-23 15:33:39,245] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_03_optim_states.pt
67585
+ [2021-10-23 15:33:43,374] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_53_optim_states.pt
67586
+ [2021-10-23 15:33:43,966] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_52_optim_states.pt
67587
+ [2021-10-23 15:33:44,271] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_11_optim_states.pt
67588
+ [2021-10-23 15:33:44,517] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_61_optim_states.pt
67589
+ [2021-10-23 15:33:45,450] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_63_optim_states.pt
67590
+ [2021-10-23 15:33:45,558] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_09_optim_states.pt
67591
+ [2021-10-23 15:33:45,629] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_127_optim_states.pt
67592
+ [2021-10-23 15:33:45,748] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step858/zero_pp_rank_0_mp_rank_125_optim_states.pt
67593
+ successfully saved checkpoint at iteration 858 to /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints
67594
+ time (ms) | save-checkpoint: 26911.10
67595
+ [exiting program after 1190.9138893206914 minutes] datetime: 2021-10-23 15:33:45
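
The iteration lines in this log can be cross-checked against each other. The following is a minimal sketch, not part of the training code: the helper `parse_iteration_line` is a hypothetical name introduced here, and the two input strings are copied from the iteration 857 and 858 lines above. It assumes each field has the form `key: value` separated by `|`, and it checks that consumed samples grow by the global batch size and consumed tokens grow by the global batch size times the curriculum sequence length.

```python
import re

# Iteration lines copied verbatim from the log above.
LINE_857 = (
    "iteration 857/ 292968 | consumed samples: 1755136 | consumed tokens: 146898944 | "
    "elapsed time per iteration (ms): 77728.7 | learning rate: 6.000E-05 | "
    "global batch size: 2048 | loss scale: 1.0 | grad norm: 45230.465 | num zeros: 0.0 | "
    "curriculum seqlen: 104 | number of skipped iterations: 0 | number of nan iterations: 0 |"
)
LINE_858 = (
    "+ iteration 858/ 292968 | consumed samples: 1757184 | consumed tokens: 147111936 | "
    "elapsed time per iteration (ms): 77179.2 | learning rate: 6.000E-05 | "
    "global batch size: 2048 | loss scale: 1.0 | grad norm: 45230.465 | num zeros: 0.0 | "
    "curriculum seqlen: 104 | number of skipped iterations: 0 | number of nan iterations: 0 |"
)


def parse_iteration_line(line):
    """Parse one 'iteration N/ TOTAL | key: value | ...' line into a dict.

    Hypothetical helper for illustration; a leading diff marker '+' is ignored.
    """
    cleaned = line.strip().lstrip("+")
    parts = [p.strip() for p in cleaned.split("|") if p.strip()]
    match = re.match(r"iteration\s+(\d+)\s*/\s*(\d+)", parts[0])
    if match is None:
        raise ValueError("not an iteration line")
    fields = {
        "iteration": int(match.group(1)),
        "total iterations": int(match.group(2)),
    }
    for part in parts[1:]:
        # Split on the last colon so keys such as 'elapsed time per iteration (ms)' stay whole.
        key, _, value = part.rpartition(":")
        fields[key.strip()] = float(value)
    return fields


prev = parse_iteration_line(LINE_857)
curr = parse_iteration_line(LINE_858)

sample_delta = curr["consumed samples"] - prev["consumed samples"]
token_delta = curr["consumed tokens"] - prev["consumed tokens"]

# One iteration should consume exactly one global batch of samples ...
assert sample_delta == curr["global batch size"]
# ... and each sample contributes 'curriculum seqlen' tokens at this stage.
assert token_delta == curr["global batch size"] * curr["curriculum seqlen"]
```

Both assertions hold for the two lines shown, which suggests the counters in this excerpt are internally consistent.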