wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
2024-05-25 21:46:44 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Reloading model from /root/mistral_models/7B-v0.3/consolidated.safetensors ...
2024-05-25 21:46:45 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Converting model to dtype torch.bfloat16 ...
2024-05-25 21:46:45 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Loaded model on cpu!
2024-05-25 21:46:45 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Initializing lora layers ...
2024-05-25 21:46:45 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Finished initialization!
2024-05-25 21:46:45 (UTC) - 0:00:04 - finetune.wrapped_model - INFO - Sharding model over 1 GPUs ...
2024-05-25 21:46:46 (UTC) - 0:00:06 - finetune.wrapped_model - INFO - Model sharded!
2024-05-25 21:46:46 (UTC) - 0:00:06 - finetune.wrapped_model - INFO - 167,772,160 out of 7,415,795,712 parameter are finetuned (2.26%).
2024-05-25 21:46:47 (UTC) - 0:00:06 - dataset - INFO - Loading /root/data/mol_instructions_train.jsonl ...
2024-05-25 21:58:58 (UTC) - 0:12:17 - dataset - INFO - /root/data/mol_instructions_train.jsonl loaded and tokenized.
2024-05-25 21:58:58 (UTC) - 0:12:17 - dataset - INFO - Shuffling /root/data/mol_instructions_train.jsonl ...
2024-05-25 21:59:10 (UTC) - 0:12:29 - train - INFO - step: 000001 - done (%): 0.2 - loss: 2.263 - lr: 2.4e-06 - peak_alloc_mem (GB): 63.6 - alloc_mem (GB): 24.1 - words_per_second: 88.2 - avg_words_per_second: 88.2 - ETA: >2024-05-30 05:01:07
2024-05-25 21:59:21 (UTC) - 0:12:41 - train - INFO - step: 000002 - done (%): 0.4 - loss: 2.109 - lr: 2.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5641.7 - avg_words_per_second: 173.6 - ETA: >2024-05-28 02:12:21
2024-05-25 21:59:33 (UTC) - 0:12:53 - train - INFO - step: 000003 - done (%): 0.6 - loss: 1.950 - lr: 3.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5581.8 - avg_words_per_second: 256.4 - ETA: >2024-05-27 09:16:26
2024-05-25 21:59:45 (UTC) - 0:13:04 - train - INFO - step: 000004 - done (%): 0.8 - loss: 2.137 - lr: 4.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5568.7 - avg_words_per_second: 336.8 - ETA: >2024-05-27 00:48:32
2024-05-25 21:59:57 (UTC) - 0:13:16 - train - INFO - step: 000005 - done (%): 1.0 - loss: 2.457 - lr: 6.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5556.7 - avg_words_per_second: 414.7 - ETA: >2024-05-26 19:43:51
2024-05-25 22:00:09 (UTC) - 0:13:28 - train - INFO - step: 000006 - done (%): 1.2 - loss: 2.165 - lr: 8.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.8 - avg_words_per_second: 490.3 - ETA: >2024-05-26 16:20:46
2024-05-25 22:00:20 (UTC) - 0:13:40 - train - INFO - step: 000007 - done (%): 1.4 - loss: 1.872 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.2 - avg_words_per_second: 563.7 - ETA: >2024-05-26 13:55:42
2024-05-25 22:00:32 (UTC) - 0:13:52 - train - INFO - step: 000008 - done (%): 1.6 - loss: 2.180 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.1 - avg_words_per_second: 634.9 - ETA: >2024-05-26 12:06:56
2024-05-25 22:00:44 (UTC) - 0:14:04 - train - INFO - step: 000009 - done (%): 1.8 - loss: 2.058 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.5 - avg_words_per_second: 704.2 - ETA: >2024-05-26 10:42:19
2024-05-25 22:00:56 (UTC) - 0:14:15 - train - INFO - step: 000010 - done (%): 2.0 - loss: 1.883 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.4 - avg_words_per_second: 771.5 - ETA: >2024-05-26 09:34:37
2024-05-25 22:01:08 (UTC) - 0:14:27 - train - INFO - step: 000011 - done (%): 2.2 - loss: 2.233 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.7 - avg_words_per_second: 837.0 - ETA: >2024-05-26 08:39:14
2024-05-25 22:01:20 (UTC) - 0:14:39 - train - INFO - step: 000012 - done (%): 2.4 - loss: 1.842 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.7 - avg_words_per_second: 900.8 - ETA: >2024-05-26 07:53:05
2024-05-25 22:01:31 (UTC) - 0:14:51 - train - INFO - step: 000013 - done (%): 2.6 - loss: 1.745 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.8 - avg_words_per_second: 962.8 - ETA: >2024-05-26 07:14:02
2024-05-25 22:01:43 (UTC) - 0:15:03 - train - INFO - step: 000014 - done (%): 2.8 - loss: 1.583 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.3 - avg_words_per_second: 1023.2 - ETA: >2024-05-26 06:40:33
2024-05-25 22:01:55 (UTC) - 0:15:15 - train - INFO - step: 000015 - done (%): 3.0 - loss: 1.855 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.3 - avg_words_per_second: 1081.9 - ETA: >2024-05-26 06:11:33
2024-05-25 22:02:07 (UTC) - 0:15:26 - train - INFO - step: 000016 - done (%): 3.2 - loss: 1.853 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.1 - avg_words_per_second: 1139.2 - ETA: >2024-05-26 05:46:10
2024-05-25 22:02:19 (UTC) - 0:15:38 - train - INFO - step: 000017 - done (%): 3.4 - loss: 1.730 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 1195.0 - ETA: >2024-05-26 05:23:47
2024-05-25 22:02:31 (UTC) - 0:15:50 - train - INFO - step: 000018 - done (%): 3.6 - loss: 2.016 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.9 - avg_words_per_second: 1249.4 - ETA: >2024-05-26 05:03:53
2024-05-25 22:02:43 (UTC) - 0:16:02 - train - INFO - step: 000019 - done (%): 3.8 - loss: 1.916 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.7 - avg_words_per_second: 1302.5 - ETA: >2024-05-26 04:46:05
2024-05-25 22:02:54 (UTC) - 0:16:14 - train - INFO - step: 000020 - done (%): 4.0 - loss: 1.929 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.0 - avg_words_per_second: 1354.2 - ETA: >2024-05-26 04:30:03
2024-05-25 22:03:06 (UTC) - 0:16:26 - train - INFO - step: 000021 - done (%): 4.2 - loss: 1.996 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.3 - avg_words_per_second: 1404.7 - ETA: >2024-05-26 04:15:33
2024-05-25 22:03:18 (UTC) - 0:16:38 - train - INFO - step: 000022 - done (%): 4.4 - loss: 1.654 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5546.2 - avg_words_per_second: 1454.1 - ETA: >2024-05-26 04:02:21
2024-05-25 22:03:30 (UTC) - 0:16:49 - train - INFO - step: 000023 - done (%): 4.6 - loss: 1.767 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.0 - avg_words_per_second: 1502.3 - ETA: >2024-05-26 03:50:19
2024-05-25 22:03:42 (UTC) - 0:17:01 - train - INFO - step: 000024 - done (%): 4.8 - loss: 1.335 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.7 - avg_words_per_second: 1549.3 - ETA: >2024-05-26 03:39:17
2024-05-25 22:03:54 (UTC) - 0:17:13 - train - INFO - step: 000025 - done (%): 5.0 - loss: 1.706 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.5 - avg_words_per_second: 1595.3 - ETA: >2024-05-26 03:29:07
2024-05-25 22:04:05 (UTC) - 0:17:25 - train - INFO - step: 000026 - done (%): 5.2 - loss: 1.464 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.4 - avg_words_per_second: 1640.2 - ETA: >2024-05-26 03:19:44
2024-05-25 22:04:17 (UTC) - 0:17:37 - train - INFO - step: 000027 - done (%): 5.4 - loss: 2.187 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 1684.1 - ETA: >2024-05-26 03:11:03
2024-05-25 22:04:29 (UTC) - 0:17:49 - train - INFO - step: 000028 - done (%): 5.6 - loss: 1.759 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.5 - avg_words_per_second: 1727.1 - ETA: >2024-05-26 03:03:00
2024-05-25 22:04:41 (UTC) - 0:18:00 - train - INFO - step: 000029 - done (%): 5.8 - loss: 2.010 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.9 - avg_words_per_second: 1769.0 - ETA: >2024-05-26 02:55:30
2024-05-25 22:04:53 (UTC) - 0:18:12 - train - INFO - step: 000030 - done (%): 6.0 - loss: 1.866 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 1810.1 - ETA: >2024-05-26 02:48:30
2024-05-25 22:05:05 (UTC) - 0:18:24 - train - INFO - step: 000031 - done (%): 6.2 - loss: 1.404 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.2 - avg_words_per_second: 1850.2 - ETA: >2024-05-26 02:41:57
2024-05-25 22:05:16 (UTC) - 0:18:36 - train - INFO - step: 000032 - done (%): 6.4 - loss: 1.771 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.4 - avg_words_per_second: 1889.5 - ETA: >2024-05-26 02:35:48
2024-05-25 22:05:28 (UTC) - 0:18:48 - train - INFO - step: 000033 - done (%): 6.6 - loss: 1.608 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.1 - avg_words_per_second: 1928.0 - ETA: >2024-05-26 02:30:02
2024-05-25 22:05:40 (UTC) - 0:19:00 - train - INFO - step: 000034 - done (%): 6.8 - loss: 1.839 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5548.0 - avg_words_per_second: 1965.7 - ETA: >2024-05-26 02:24:36
2024-05-25 22:05:52 (UTC) - 0:19:11 - train - INFO - step: 000035 - done (%): 7.0 - loss: 1.840 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5546.4 - avg_words_per_second: 2002.7 - ETA: >2024-05-26 02:19:29
2024-05-25 22:06:04 (UTC) - 0:19:23 - train - INFO - step: 000036 - done (%): 7.2 - loss: 1.752 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.6 - avg_words_per_second: 2038.8 - ETA: >2024-05-26 02:14:39
2024-05-25 22:06:16 (UTC) - 0:19:35 - train - INFO - step: 000037 - done (%): 7.4 - loss: 1.734 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.8 - avg_words_per_second: 2074.3 - ETA: >2024-05-26 02:10:04
2024-05-25 22:06:27 (UTC) - 0:19:47 - train - INFO - step: 000038 - done (%): 7.6 - loss: 1.343 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5548.5 - avg_words_per_second: 2109.0 - ETA: >2024-05-26 02:05:44
2024-05-25 22:06:39 (UTC) - 0:19:59 - train - INFO - step: 000039 - done (%): 7.8 - loss: 1.416 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.7 - avg_words_per_second: 2143.0 - ETA: >2024-05-26 02:01:37
2024-05-25 22:06:51 (UTC) - 0:20:11 - train - INFO - step: 000040 - done (%): 8.0 - loss: 1.722 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 2176.4 - ETA: >2024-05-26 01:57:43
2024-05-25 22:07:03 (UTC) - 0:20:22 - train - INFO - step: 000041 - done (%): 8.2 - loss: 1.533 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.3 - avg_words_per_second: 2209.1 - ETA: >2024-05-26 01:54:00
2024-05-25 22:07:15 (UTC) - 0:20:34 - train - INFO - step: 000042 - done (%): 8.4 - loss: 1.118 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.7 - avg_words_per_second: 2241.1 - ETA: >2024-05-26 01:50:28
2024-05-25 22:07:27 (UTC) - 0:20:46 - train - INFO - step: 000043 - done (%): 8.6 - loss: 1.211 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.1 - avg_words_per_second: 2272.6 - ETA: >2024-05-26 01:47:05
2024-05-25 22:07:38 (UTC) - 0:20:58 - train - INFO - step: 000044 - done (%): 8.8 - loss: 1.766 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.5 - avg_words_per_second: 2303.5 - ETA: >2024-05-26 01:43:52
2024-05-25 22:07:50 (UTC) - 0:21:10 - train - INFO - step: 000045 - done (%): 9.0 - loss: 1.289 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.6 - avg_words_per_second: 2333.8 - ETA: >2024-05-26 01:40:47
2024-05-25 22:08:02 (UTC) - 0:21:22 - train - INFO - step: 000046 - done (%): 9.2 - loss: 1.433 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.4 - avg_words_per_second: 2363.4 - ETA: >2024-05-26 01:37:51
2024-05-25 22:08:14 (UTC) - 0:21:33 - train - INFO - step: 000047 - done (%): 9.4 - loss: 1.980 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.4 - avg_words_per_second: 2392.6 - ETA: >2024-05-26 01:35:02
2024-05-25 22:08:26 (UTC) - 0:21:45 - train - INFO - step: 000048 - done (%): 9.6 - loss: 1.427 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 2421.3 - ETA: >2024-05-26 01:32:20
2024-05-25 22:08:38 (UTC) - 0:21:57 - train - INFO - step: 000049 - done (%): 9.8 - loss: 1.911 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5520.6 - avg_words_per_second: 2449.3 - ETA: >2024-05-26 01:29:45
2024-05-25 22:08:49 (UTC) - 0:22:09 - train - INFO - step: 000050 - done (%): 10.0 - loss: 1.665 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.2 - avg_words_per_second: 2477.0 - ETA: >2024-05-26 01:27:16
2024-05-25 22:09:01 (UTC) - 0:22:21 - train - INFO - step: 000051 - done (%): 10.2 - loss: 1.698 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.1 - avg_words_per_second: 2504.1 - ETA: >2024-05-26 01:24:52
2024-05-25 22:09:13 (UTC) - 0:22:33 - train - INFO - step: 000052 - done (%): 10.4 - loss: 1.842 - lr: 6.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.2 - avg_words_per_second: 2530.8 - ETA: >2024-05-26 01:22:34
2024-05-25 22:09:25 (UTC) - 0:22:45 - train - INFO - step: 000053 - done (%): 10.6 - loss: 1.687 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5519.1 - avg_words_per_second: 2556.9 - ETA: >2024-05-26 01:20:22
2024-05-25 22:09:37 (UTC) - 0:22:56 - train - INFO - step: 000054 - done (%): 10.8 - loss: 1.535 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.7 - avg_words_per_second: 2582.6 - ETA: >2024-05-26 01:18:14
2024-05-25 22:09:49 (UTC) - 0:23:08 - train - INFO - step: 000055 - done (%): 11.0 - loss: 1.665 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 2607.9 - ETA: >2024-05-26 01:16:11
2024-05-25 22:10:01 (UTC) - 0:23:20 - train - INFO - step: 000056 - done (%): 11.2 - loss: 1.546 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.3 - avg_words_per_second: 2632.8 - ETA: >2024-05-26 01:14:13
2024-05-25 22:10:12 (UTC) - 0:23:32 - train - INFO - step: 000057 - done (%): 11.4 - loss: 1.264 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.3 - avg_words_per_second: 2657.2 - ETA: >2024-05-26 01:12:19
2024-05-25 22:10:24 (UTC) - 0:23:44 - train - INFO - step: 000058 - done (%): 11.6 - loss: 1.767 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 2681.2 - ETA: >2024-05-26 01:10:28
2024-05-25 22:10:36 (UTC) - 0:23:56 - train - INFO - step: 000059 - done (%): 11.8 - loss: 1.575 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.5 - avg_words_per_second: 2704.8 - ETA: >2024-05-26 01:08:41
2024-05-25 22:10:48 (UTC) - 0:24:07 - train - INFO - step: 000060 - done (%): 12.0 - loss: 1.494 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.6 - avg_words_per_second: 2728.0 - ETA: >2024-05-26 01:06:58
2024-05-25 22:11:00 (UTC) - 0:24:19 - train - INFO - step: 000061 - done (%): 12.2 - loss: 1.311 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.3 - avg_words_per_second: 2750.9 - ETA: >2024-05-26 01:05:18
2024-05-25 22:11:12 (UTC) - 0:24:31 - train - INFO - step: 000062 - done (%): 12.4 - loss: 1.547 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.0 - avg_words_per_second: 2773.4 - ETA: >2024-05-26 01:03:42
2024-05-25 22:11:24 (UTC) - 0:24:43 - train - INFO - step: 000063 - done (%): 12.6 - loss: 1.388 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.7 - avg_words_per_second: 2795.5 - ETA: >2024-05-26 01:02:08
2024-05-25 22:11:35 (UTC) - 0:24:55 - train - INFO - step: 000064 - done (%): 12.8 - loss: 1.444 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.0 - avg_words_per_second: 2817.3 - ETA: >2024-05-26 01:00:37
2024-05-25 22:11:47 (UTC) - 0:25:07 - train - INFO - step: 000065 - done (%): 13.0 - loss: 1.568 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.0 - avg_words_per_second: 2838.8 - ETA: >2024-05-26 00:59:10
2024-05-25 22:11:59 (UTC) - 0:25:19 - train - INFO - step: 000066 - done (%): 13.2 - loss: 1.897 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.4 - avg_words_per_second: 2859.9 - ETA: >2024-05-26 00:57:44
2024-05-25 22:12:11 (UTC) - 0:25:30 - train - INFO - step: 000067 - done (%): 13.4 - loss: 1.747 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.4 - avg_words_per_second: 2880.7 - ETA: >2024-05-26 00:56:22
2024-05-25 22:12:23 (UTC) - 0:25:42 - train - INFO - step: 000068 - done (%): 13.6 - loss: 1.768 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.4 - avg_words_per_second: 2901.1 - ETA: >2024-05-26 00:55:02
2024-05-25 22:12:35 (UTC) - 0:25:54 - train - INFO - step: 000069 - done (%): 13.8 - loss: 1.864 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.5 - avg_words_per_second: 2921.3 - ETA: >2024-05-26 00:53:44
2024-05-25 22:12:46 (UTC) - 0:26:06 - train - INFO - step: 000070 - done (%): 14.0 - loss: 2.062 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 2941.1 - ETA: >2024-05-26 00:52:28
2024-05-25 22:12:58 (UTC) - 0:26:18 - train - INFO - step: 000071 - done (%): 14.2 - loss: 1.725 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.1 - avg_words_per_second: 2960.6 - ETA: >2024-05-26 00:51:15
2024-05-25 22:13:10 (UTC) - 0:26:30 - train - INFO - step: 000072 - done (%): 14.4 - loss: 1.606 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.4 - avg_words_per_second: 2979.9 - ETA: >2024-05-26 00:50:03
2024-05-25 22:13:22 (UTC) - 0:26:41 - train - INFO - step: 000073 - done (%): 14.6 - loss: 1.569 - lr: 5.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.0 - avg_words_per_second: 2998.8 - ETA: >2024-05-26 00:48:53
2024-05-25 22:13:34 (UTC) - 0:26:53 - train - INFO - step: 000074 - done (%): 14.8 - loss: 1.465 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 3017.5 - ETA: >2024-05-26 00:47:46
2024-05-25 22:13:46 (UTC) - 0:27:05 - train - INFO - step: 000075 - done (%): 15.0 - loss: 1.454 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5547.9 - avg_words_per_second: 3036.0 - ETA: >2024-05-26 00:46:40
2024-05-25 22:13:57 (UTC) - 0:27:17 - train - INFO - step: 000076 - done (%): 15.2 - loss: 1.920 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.7 - avg_words_per_second: 3054.1 - ETA: >2024-05-26 00:45:36
2024-05-25 22:14:09 (UTC) - 0:27:29 - train - INFO - step: 000077 - done (%): 15.4 - loss: 1.780 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.4 - avg_words_per_second: 3072.0 - ETA: >2024-05-26 00:44:33
2024-05-25 22:14:21 (UTC) - 0:27:41 - train - INFO - step: 000078 - done (%): 15.6 - loss: 1.528 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.1 - avg_words_per_second: 3089.6 - ETA: >2024-05-26 00:43:32
2024-05-25 22:14:33 (UTC) - 0:27:53 - train - INFO - step: 000079 - done (%): 15.8 - loss: 1.356 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5519.3 - avg_words_per_second: 3106.9 - ETA: >2024-05-26 00:42:33
2024-05-25 22:14:45 (UTC) - 0:28:04 - train - INFO - step: 000080 - done (%): 16.0 - loss: 1.569 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.3 - avg_words_per_second: 3124.0 - ETA: >2024-05-26 00:41:36
2024-05-25 22:14:57 (UTC) - 0:28:16 - train - INFO - step: 000081 - done (%): 16.2 - loss: 1.661 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.2 - avg_words_per_second: 3140.9 - ETA: >2024-05-26 00:40:39
2024-05-25 22:15:09 (UTC) - 0:28:28 - train - INFO - step: 000082 - done (%): 16.4 - loss: 1.819 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.2 - avg_words_per_second: 3157.5 - ETA: >2024-05-26 00:39:44
2024-05-25 22:15:20 (UTC) - 0:28:40 - train - INFO - step: 000083 - done (%): 16.6 - loss: 1.794 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 3174.0 - ETA: >2024-05-26 00:38:51
2024-05-25 22:15:32 (UTC) - 0:28:52 - train - INFO - step: 000084 - done (%): 16.8 - loss: 1.383 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.8 - avg_words_per_second: 3190.1 - ETA: >2024-05-26 00:37:58
2024-05-25 22:15:44 (UTC) - 0:29:04 - train - INFO - step: 000085 - done (%): 17.0 - loss: 1.682 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.8 - avg_words_per_second: 3206.1 - ETA: >2024-05-26 00:37:07
2024-05-25 22:15:56 (UTC) - 0:29:15 - train - INFO - step: 000086 - done (%): 17.2 - loss: 1.368 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.6 - avg_words_per_second: 3221.8 - ETA: >2024-05-26 00:36:17
2024-05-25 22:16:08 (UTC) - 0:29:27 - train - INFO - step: 000087 - done (%): 17.4 - loss: 1.722 - lr: 5.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.4 - avg_words_per_second: 3237.4 - ETA: >2024-05-26 00:35:28
2024-05-25 22:16:20 (UTC) - 0:29:39 - train - INFO - step: 000088 - done (%): 17.6 - loss: 1.316 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.6 - avg_words_per_second: 3252.7 - ETA: >2024-05-26 00:34:41
2024-05-25 22:16:32 (UTC) - 0:29:51 - train - INFO - step: 000089 - done (%): 17.8 - loss: 1.563 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 3267.8 - ETA: >2024-05-26 00:33:54
2024-05-25 22:16:43 (UTC) - 0:30:03 - train - INFO - step: 000090 - done (%): 18.0 - loss: 1.668 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.6 - avg_words_per_second: 3282.7 - ETA: >2024-05-26 00:33:09
2024-05-25 22:16:55 (UTC) - 0:30:15 - train - INFO - step: 000091 - done (%): 18.2 - loss: 1.548 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.2 - avg_words_per_second: 3297.5 - ETA: >2024-05-26 00:32:24
2024-05-25 22:17:07 (UTC) - 0:30:27 - train - INFO - step: 000092 - done (%): 18.4 - loss: 1.400 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5499.0 - avg_words_per_second: 3311.9 - ETA: >2024-05-26 00:31:41
2024-05-25 22:17:19 (UTC) - 0:30:39 - train - INFO - step: 000093 - done (%): 18.6 - loss: 1.345 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.6 - avg_words_per_second: 3326.2 - ETA: >2024-05-26 00:30:58
2024-05-25 22:17:31 (UTC) - 0:30:50 - train - INFO - step: 000094 - done (%): 18.8 - loss: 1.508 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.1 - avg_words_per_second: 3340.4 - ETA: >2024-05-26 00:30:16
2024-05-25 22:17:43 (UTC) - 0:31:02 - train - INFO - step: 000095 - done (%): 19.0 - loss: 1.449 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.2 - avg_words_per_second: 3354.4 - ETA: >2024-05-26 00:29:35
2024-05-25 22:17:55 (UTC) - 0:31:14 - train - INFO - step: 000096 - done (%): 19.2 - loss: 1.544 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.6 - avg_words_per_second: 3368.2 - ETA: >2024-05-26 00:28:55
2024-05-25 22:18:06 (UTC) - 0:31:26 - train - INFO - step: 000097 - done (%): 19.4 - loss: 1.485 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.8 - avg_words_per_second: 3381.8 - ETA: >2024-05-26 00:28:16
2024-05-25 22:18:18 (UTC) - 0:31:38 - train - INFO - step: 000098 - done (%): 19.6 - loss: 1.486 - lr: 5.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.5 - avg_words_per_second: 3395.3 - ETA: >2024-05-26 00:27:38
2024-05-25 22:18:30 (UTC) - 0:31:50 - train - INFO - step: 000099 - done (%): 19.8 - loss: 1.477 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.5 - avg_words_per_second: 3408.5 - ETA: >2024-05-26 00:27:00
2024-05-25 22:18:42 (UTC) - 0:32:01 - train - INFO - step: 000100 - done (%): 20.0 - loss: 1.588 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.2 - avg_words_per_second: 3421.7 - ETA: >2024-05-26 00:26:23
2024-05-25 22:18:42 (UTC) - 0:32:01 - checkpointing - INFO - Dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000100/consolidated using tmp name: tmp.consolidated
2024-05-25 22:18:57 (UTC) - 0:32:16 - checkpointing - INFO - Done dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000100/consolidated for step: 100
2024-05-25 22:18:57 (UTC) - 0:32:16 - checkpointing - INFO - Done deleting checkpoints
2024-05-25 22:18:57 (UTC) - 0:32:16 - checkpointing - INFO - Done!
2024-05-25 22:19:08 (UTC) - 0:32:28 - train - INFO - step: 000101 - done (%): 20.2 - loss: 1.442 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5599.0 - avg_words_per_second: 3434.9 - ETA: >2024-05-26 00:26:01
2024-05-25 22:19:20 (UTC) - 0:32:40 - train - INFO - step: 000102 - done (%): 20.4 - loss: 1.662 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5550.8 - avg_words_per_second: 3447.8 - ETA: >2024-05-26 00:25:26
2024-05-25 22:19:32 (UTC) - 0:32:52 - train - INFO - step: 000103 - done (%): 20.6 - loss: 1.788 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5548.2 - avg_words_per_second: 3460.5 - ETA: >2024-05-26 00:24:51
2024-05-25 22:19:44 (UTC) - 0:33:03 - train - INFO - step: 000104 - done (%): 20.8 - loss: 1.665 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.1 - avg_words_per_second: 3473.0 - ETA: >2024-05-26 00:24:16
2024-05-25 22:19:56 (UTC) - 0:33:15 - train - INFO - step: 000105 - done (%): 21.0 - loss: 1.484 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.1 - avg_words_per_second: 3485.4 - ETA: >2024-05-26 00:23:43
2024-05-25 22:20:08 (UTC) - 0:33:27 - train - INFO - step: 000106 - done (%): 21.2 - loss: 1.726 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5507.1 - avg_words_per_second: 3497.5 - ETA: >2024-05-26 00:23:10
2024-05-25 22:20:29 (UTC) - 0:33:48 - train - INFO - step: 000107 - done (%): 21.4 - loss: 1.233 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 3116.6 - avg_words_per_second: 3493.5 - ETA: >2024-05-26 00:23:21
2024-05-25 22:20:40 (UTC) - 0:34:00 - train - INFO - step: 000108 - done (%): 21.6 - loss: 1.670 - lr: 5.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5559.8 - avg_words_per_second: 3505.5 - ETA: >2024-05-26 00:22:49
2024-05-25 22:20:52 (UTC) - 0:34:12 - train - INFO - step: 000109 - done (%): 21.8 - loss: 1.436 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.8 - avg_words_per_second: 3517.4 - ETA: >2024-05-26 00:22:17
2024-05-25 22:21:04 (UTC) - 0:34:24 - train - INFO - step: 000110 - done (%): 22.0 - loss: 1.359 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.5 - avg_words_per_second: 3529.0 - ETA: >2024-05-26 00:21:47
2024-05-25 22:21:16 (UTC) - 0:34:35 - train - INFO - step: 000111 - done (%): 22.2 - loss: 1.491 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.2 - avg_words_per_second: 3540.6 - ETA: >2024-05-26 00:21:16
2024-05-25 22:21:28 (UTC) - 0:34:47 - train - INFO - step: 000112 - done (%): 22.4 - loss: 1.686 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.4 - avg_words_per_second: 3552.0 - ETA: >2024-05-26 00:20:47
2024-05-25 22:21:40 (UTC) - 0:34:59 - train - INFO - step: 000113 - done (%): 22.6 - loss: 1.728 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.8 - avg_words_per_second: 3563.3 - ETA: >2024-05-26 00:20:17
2024-05-25 22:21:52 (UTC) - 0:35:11 - train - INFO - step: 000114 - done (%): 22.8 - loss: 1.405 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.2 - avg_words_per_second: 3574.5 - ETA: >2024-05-26 00:19:49
2024-05-25 22:22:03 (UTC) - 0:35:23 - train - INFO - step: 000115 - done (%): 23.0 - loss: 1.556 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.3 - avg_words_per_second: 3585.6 - ETA: >2024-05-26 00:19:20
2024-05-25 22:22:15 (UTC) - 0:35:35 - train - INFO - step: 000116 - done (%): 23.2 - loss: 1.701 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.0 - avg_words_per_second: 3596.5 - ETA: >2024-05-26 00:18:53
2024-05-25 22:22:27 (UTC) - 0:35:47 - train - INFO - step: 000117 - done (%): 23.4 - loss: 1.831 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.0 - avg_words_per_second: 3607.3 - ETA: >2024-05-26 00:18:25
2024-05-25 22:22:39 (UTC) - 0:35:58 - train - INFO - step: 000118 - done (%): 23.6 - loss: 1.781 - lr: 5.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.8 - avg_words_per_second: 3617.9 - ETA: >2024-05-26 00:17:59
2024-05-25 22:22:51 (UTC) - 0:36:10 - train - INFO - step: 000119 - done (%): 23.8 - loss: 1.469 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.2 - avg_words_per_second: 3628.5 - ETA: >2024-05-26 00:17:32
2024-05-25 22:23:03 (UTC) - 0:36:22 - train - INFO - step: 000120 - done (%): 24.0 - loss: 1.558 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.2 - avg_words_per_second: 3638.9 - ETA: >2024-05-26 00:17:06
2024-05-25 22:23:14 (UTC) - 0:36:34 - train - INFO - step: 000121 - done (%): 24.2 - loss: 1.552 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.4 - avg_words_per_second: 3649.2 - ETA: >2024-05-26 00:16:41
2024-05-25 22:23:26 (UTC) - 0:36:46 - train - INFO - step: 000122 - done (%): 24.4 - loss: 1.649 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.2 - avg_words_per_second: 3659.4 - ETA: >2024-05-26 00:16:16
2024-05-25 22:23:38 (UTC) - 0:36:58 - train - INFO - step: 000123 - done (%): 24.6 - loss: 1.553 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.2 - avg_words_per_second: 3669.5 - ETA: >2024-05-26 00:15:51
2024-05-25 22:23:50 (UTC) - 0:37:10 - train - INFO - step: 000124 - done (%): 24.8 - loss: 1.558 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.5 - avg_words_per_second: 3679.5 - ETA: >2024-05-26 00:15:27
2024-05-25 22:24:02 (UTC) - 0:37:21 - train - INFO - step: 000125 - done (%): 25.0 - loss: 1.484 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.5 - avg_words_per_second: 3689.4 - ETA: >2024-05-26 00:15:03
2024-05-25 22:24:14 (UTC) - 0:37:33 - train - INFO - step: 000126 - done (%): 25.2 - loss: 1.549 - lr: 5.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.7 - avg_words_per_second: 3699.1 - ETA: >2024-05-26 00:14:40
2024-05-25 22:24:26 (UTC) - 0:37:45 - train - INFO - step: 000127 - done (%): 25.4 - loss: 1.463 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.2 - avg_words_per_second: 3708.8 - ETA: >2024-05-26 00:14:17
2024-05-25 22:24:37 (UTC) - 0:37:57 - train - INFO - step: 000128 - done (%): 25.6 - loss: 1.713 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.2 - avg_words_per_second: 3718.4 - ETA: >2024-05-26 00:13:54
2024-05-25 22:24:49 (UTC) - 0:38:09 - train - INFO - step: 000129 - done (%): 25.8 - loss: 1.561 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.6 - avg_words_per_second: 3727.8 - ETA: >2024-05-26 00:13:32
2024-05-25 22:25:01 (UTC) - 0:38:21 - train - INFO - step: 000130 - done (%): 26.0 - loss: 1.553 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.5 - avg_words_per_second: 3737.2 - ETA: >2024-05-26 00:13:10
2024-05-25 22:25:13 (UTC) - 0:38:33 - train - INFO - step: 000131 - done (%): 26.2 - loss: 1.657 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.3 - avg_words_per_second: 3746.4 - ETA: >2024-05-26 00:12:48
2024-05-25 22:25:25 (UTC) - 0:38:44 - train - INFO - step: 000132 - done (%): 26.4 - loss: 1.358 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5516.8 - avg_words_per_second: 3755.6 - ETA: >2024-05-26 00:12:27
2024-05-25 22:25:37 (UTC) - 0:38:56 - train - INFO - step: 000133 - done (%): 26.6 - loss: 1.431 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.3 - avg_words_per_second: 3764.7 - ETA: >2024-05-26 00:12:06
2024-05-25 22:25:49 (UTC) - 0:39:08 - train - INFO - step: 000134 - done (%): 26.8 - loss: 1.530 - lr: 5.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.8 - avg_words_per_second: 3773.6 - ETA: >2024-05-26 00:11:45
2024-05-25 22:26:00 (UTC) - 0:39:20 - train - INFO - step: 000135 - done (%): 27.0 - loss: 1.812 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.1 - avg_words_per_second: 3782.5 - ETA: >2024-05-26 00:11:24
2024-05-25 22:26:12 (UTC) - 0:39:32 - train - INFO - step: 000136 - done (%): 27.2 - loss: 1.475 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5489.8 - avg_words_per_second: 3791.2 - ETA: >2024-05-26 00:11:05
2024-05-25 22:26:24 (UTC) - 0:39:44 - train - INFO - step: 000137 - done (%): 27.4 - loss: 1.614 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.6 - avg_words_per_second: 3799.9 - ETA: >2024-05-26 00:10:45
2024-05-25 22:26:36 (UTC) - 0:39:56 - train - INFO - step: 000138 - done (%): 27.6 - loss: 1.572 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.9 - avg_words_per_second: 3808.5 - ETA: >2024-05-26 00:10:25
2024-05-25 22:26:48 (UTC) - 0:40:07 - train - INFO - step: 000139 - done (%): 27.8 - loss: 1.490 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.8 - avg_words_per_second: 3817.1 - ETA: >2024-05-26 00:10:06
2024-05-25 22:27:00 (UTC) - 0:40:19 - train - INFO - step: 000140 - done (%): 28.0 - loss: 1.505 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.5 - avg_words_per_second: 3825.6 - ETA: >2024-05-26 00:09:47
2024-05-25 22:27:12 (UTC) - 0:40:31 - train - INFO - step: 000141 - done (%): 28.2 - loss: 1.820 - lr: 5.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.9 - avg_words_per_second: 3834.0 - ETA: >2024-05-26 00:09:28
2024-05-25 22:27:23 (UTC) - 0:40:43 - train - INFO - step: 000142 - done (%): 28.4 - loss: 1.471 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.2 - avg_words_per_second: 3842.3 - ETA: >2024-05-26 00:09:10
2024-05-25 22:27:35 (UTC) - 0:40:55 - train - INFO - step: 000143 - done (%): 28.6 - loss: 1.500 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.3 - avg_words_per_second: 3850.5 - ETA: >2024-05-26 00:08:52
2024-05-25 22:27:47 (UTC) - 0:41:07 - train - INFO - step: 000144 - done (%): 28.8 - loss: 1.642 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.9 - avg_words_per_second: 3858.6 - ETA: >2024-05-26 00:08:34
2024-05-25 22:27:59 (UTC) - 0:41:19 - train - INFO - step: 000145 - done (%): 29.0 - loss: 1.583 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.1 - avg_words_per_second: 3866.7 - ETA: >2024-05-26 00:08:16
2024-05-25 22:28:11 (UTC) - 0:41:30 - train - INFO - step: 000146 - done (%): 29.2 - loss: 1.742 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 3874.7 - ETA: >2024-05-26 00:07:58
2024-05-25 22:28:23 (UTC) - 0:41:42 - train - INFO - step: 000147 - done (%): 29.4 - loss: 1.363 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.7 - avg_words_per_second: 3882.6 - ETA: >2024-05-26 00:07:41
2024-05-25 22:28:35 (UTC) - 0:41:54 - train - INFO - step: 000148 - done (%): 29.6 - loss: 1.411 - lr: 5.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 3890.4 - ETA: >2024-05-26 00:07:24
2024-05-25 22:28:46 (UTC) - 0:42:06 - train - INFO - step: 000149 - done (%): 29.8 - loss: 1.933 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.5 - avg_words_per_second: 3898.2 - ETA: >2024-05-26 00:07:07
2024-05-25 22:28:58 (UTC) - 0:42:18 - train - INFO - step: 000150 - done (%): 30.0 - loss: 1.431 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.7 - avg_words_per_second: 3905.9 - ETA: >2024-05-26 00:06:51
2024-05-25 22:29:10 (UTC) - 0:42:30 - train - INFO - step: 000151 - done (%): 30.2 - loss: 1.497 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.7 - avg_words_per_second: 3913.5 - ETA: >2024-05-26 00:06:35
2024-05-25 22:29:22 (UTC) - 0:42:41 - train - INFO - step: 000152 - done (%): 30.4 - loss: 1.454 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.1 - avg_words_per_second: 3921.0 - ETA: >2024-05-26 00:06:18
2024-05-25 22:29:34 (UTC) - 0:42:53 - train - INFO - step: 000153 - done (%): 30.6 - loss: 1.803 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.4 - avg_words_per_second: 3928.5 - ETA: >2024-05-26 00:06:02
2024-05-25 22:29:46 (UTC) - 0:43:05 - train - INFO - step: 000154 - done (%): 30.8 - loss: 1.592 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.2 - avg_words_per_second: 3936.0 - ETA: >2024-05-26 00:05:47
2024-05-25 22:29:58 (UTC) - 0:43:17 - train - INFO - step: 000155 - done (%): 31.0 - loss: 1.214 - lr: 5.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.1 - avg_words_per_second: 3943.3 - ETA: >2024-05-26 00:05:31
2024-05-25 22:30:09 (UTC) - 0:43:29 - train - INFO - step: 000156 - done (%): 31.2 - loss: 1.805 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.8 - avg_words_per_second: 3950.6 - ETA: >2024-05-26 00:05:16
2024-05-25 22:30:21 (UTC) - 0:43:41 - train - INFO - step: 000157 - done (%): 31.4 - loss: 1.654 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.0 - avg_words_per_second: 3957.7 - ETA: >2024-05-26 00:05:01
2024-05-25 22:30:33 (UTC) - 0:43:53 - train - INFO - step: 000158 - done (%): 31.6 - loss: 1.611 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.6 - avg_words_per_second: 3964.9 - ETA: >2024-05-26 00:04:46
2024-05-25 22:30:45 (UTC) - 0:44:04 - train - INFO - step: 000159 - done (%): 31.8 - loss: 1.289 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.1 - avg_words_per_second: 3972.0 - ETA: >2024-05-26 00:04:31
2024-05-25 22:30:57 (UTC) - 0:44:16 - train - INFO - step: 000160 - done (%): 32.0 - loss: 1.586 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.7 - avg_words_per_second: 3979.0 - ETA: >2024-05-26 00:04:17
2024-05-25 22:31:09 (UTC) - 0:44:28 - train - INFO - step: 000161 - done (%): 32.2 - loss: 1.071 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.4 - avg_words_per_second: 3985.9 - ETA: >2024-05-26 00:04:02
2024-05-25 22:31:20 (UTC) - 0:44:40 - train - INFO - step: 000162 - done (%): 32.4 - loss: 1.221 - lr: 4.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.2 - avg_words_per_second: 3992.8 - ETA: >2024-05-26 00:03:48
2024-05-25 22:31:32 (UTC) - 0:44:52 - train - INFO - step: 000163 - done (%): 32.6 - loss: 1.836 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 3999.7 - ETA: >2024-05-26 00:03:34
2024-05-25 22:31:44 (UTC) - 0:45:04 - train - INFO - step: 000164 - done (%): 32.8 - loss: 1.346 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.9 - avg_words_per_second: 4006.4 - ETA: >2024-05-26 00:03:20
2024-05-25 22:31:56 (UTC) - 0:45:15 - train - INFO - step: 000165 - done (%): 33.0 - loss: 1.512 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.4 - avg_words_per_second: 4013.2 - ETA: >2024-05-26 00:03:07
2024-05-25 22:32:08 (UTC) - 0:45:27 - train - INFO - step: 000166 - done (%): 33.2 - loss: 1.333 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.4 - avg_words_per_second: 4019.9 - ETA: >2024-05-26 00:02:53
2024-05-25 22:32:20 (UTC) - 0:45:39 - train - INFO - step: 000167 - done (%): 33.4 - loss: 1.385 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.1 - avg_words_per_second: 4026.5 - ETA: >2024-05-26 00:02:40
2024-05-25 22:32:31 (UTC) - 0:45:51 - train - INFO - step: 000168 - done (%): 33.6 - loss: 1.322 - lr: 4.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.8 - avg_words_per_second: 4033.1 - ETA: >2024-05-26 00:02:26
2024-05-25 22:32:43 (UTC) - 0:46:03 - train - INFO - step: 000169 - done (%): 33.8 - loss: 1.431 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.8 - avg_words_per_second: 4039.6 - ETA: >2024-05-26 00:02:13
2024-05-25 22:32:55 (UTC) - 0:46:15 - train - INFO - step: 000170 - done (%): 34.0 - loss: 1.323 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.5 - avg_words_per_second: 4046.0 - ETA: >2024-05-26 00:02:00
2024-05-25 22:33:07 (UTC) - 0:46:26 - train - INFO - step: 000171 - done (%): 34.2 - loss: 1.423 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.0 - avg_words_per_second: 4052.3 - ETA: >2024-05-26 00:01:48
2024-05-25 22:33:19 (UTC) - 0:46:38 - train - INFO - step: 000172 - done (%): 34.4 - loss: 1.501 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.7 - avg_words_per_second: 4058.7 - ETA: >2024-05-26 00:01:35
2024-05-25 22:33:31 (UTC) - 0:46:50 - train - INFO - step: 000173 - done (%): 34.6 - loss: 1.224 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 4064.9 - ETA: >2024-05-26 00:01:23
2024-05-25 22:33:42 (UTC) - 0:47:02 - train - INFO - step: 000174 - done (%): 34.8 - loss: 1.043 - lr: 4.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4071.1 - ETA: >2024-05-26 00:01:10
2024-05-25 22:33:54 (UTC) - 0:47:14 - train - INFO - step: 000175 - done (%): 35.0 - loss: 1.325 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.0 - avg_words_per_second: 4077.3 - ETA: >2024-05-26 00:00:58
2024-05-25 22:34:06 (UTC) - 0:47:26 - train - INFO - step: 000176 - done (%): 35.2 - loss: 1.284 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.9 - avg_words_per_second: 4083.4 - ETA: >2024-05-26 00:00:46
2024-05-25 22:34:18 (UTC) - 0:47:38 - train - INFO - step: 000177 - done (%): 35.4 - loss: 1.670 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.7 - avg_words_per_second: 4089.4 - ETA: >2024-05-26 00:00:34
2024-05-25 22:34:30 (UTC) - 0:47:49 - train - INFO - step: 000178 - done (%): 35.6 - loss: 1.206 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4095.4 - ETA: >2024-05-26 00:00:23
2024-05-25 22:34:42 (UTC) - 0:48:01 - train - INFO - step: 000179 - done (%): 35.8 - loss: 1.472 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5499.3 - avg_words_per_second: 4101.3 - ETA: >2024-05-26 00:00:11
2024-05-25 22:34:54 (UTC) - 0:48:13 - train - INFO - step: 000180 - done (%): 36.0 - loss: 1.396 - lr: 4.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.0 - avg_words_per_second: 4107.2 - ETA: >2024-05-26 00:00:00
2024-05-25 22:35:06 (UTC) - 0:48:25 - train - INFO - step: 000181 - done (%): 36.2 - loss: 1.497 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.9 - avg_words_per_second: 4113.1 - ETA: >2024-05-25 23:59:48
2024-05-25 22:35:17 (UTC) - 0:48:37 - train - INFO - step: 000182 - done (%): 36.4 - loss: 1.233 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 4118.9 - ETA: >2024-05-25 23:59:37
2024-05-25 22:35:29 (UTC) - 0:48:49 - train - INFO - step: 000183 - done (%): 36.6 - loss: 1.605 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.8 - avg_words_per_second: 4124.6 - ETA: >2024-05-25 23:59:26
2024-05-25 22:35:41 (UTC) - 0:49:01 - train - INFO - step: 000184 - done (%): 36.8 - loss: 1.260 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.1 - avg_words_per_second: 4130.3 - ETA: >2024-05-25 23:59:15
2024-05-25 22:35:53 (UTC) - 0:49:12 - train - INFO - step: 000185 - done (%): 37.0 - loss: 1.598 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.9 - avg_words_per_second: 4136.0 - ETA: >2024-05-25 23:59:04
2024-05-25 22:36:05 (UTC) - 0:49:24 - train - INFO - step: 000186 - done (%): 37.2 - loss: 1.588 - lr: 4.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5546.0 - avg_words_per_second: 4141.7 - ETA: >2024-05-25 23:58:53
2024-05-25 22:36:17 (UTC) - 0:49:36 - train - INFO - step: 000187 - done (%): 37.4 - loss: 1.638 - lr: 4.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.3 - avg_words_per_second: 4147.3 - ETA: >2024-05-25 23:58:43
2024-05-25 22:36:28 (UTC) - 0:49:48 - train - INFO - step: 000188 - done (%): 37.6 - loss: 1.880 - lr: 4.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.7 - avg_words_per_second: 4152.8 - ETA: >2024-05-25 23:58:32
2024-05-25 22:36:40 (UTC) - 0:50:00 - train - INFO - step: 000189 - done (%): 37.8 - loss: 1.373 - lr: 4.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.0 - avg_words_per_second: 4158.3 - ETA: >2024-05-25 23:58:22
2024-05-25 22:36:52 (UTC) - 0:50:12 - train - INFO - step: 000190 - done (%): 38.0 - loss: 1.554 - lr: 4.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.8 - avg_words_per_second: 4163.7 - ETA: >2024-05-25 23:58:11
2024-05-25 22:37:04 (UTC) - 0:50:23 - train - INFO - step: 000191 - done (%): 38.2 - loss: 1.684 - lr: 4.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.4 - avg_words_per_second: 4169.1 - ETA: >2024-05-25 23:58:01
2024-05-25 22:37:16 (UTC) - 0:50:35 - train - INFO - step: 000192 - done (%): 38.4 - loss: 1.644 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.8 - avg_words_per_second: 4174.5 - ETA: >2024-05-25 23:57:51
2024-05-25 22:37:28 (UTC) - 0:50:47 - train - INFO - step: 000193 - done (%): 38.6 - loss: 1.573 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.5 - avg_words_per_second: 4179.8 - ETA: >2024-05-25 23:57:41
2024-05-25 22:37:39 (UTC) - 0:50:59 - train - INFO - step: 000194 - done (%): 38.8 - loss: 1.588 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 4185.1 - ETA: >2024-05-25 23:57:31
2024-05-25 22:37:51 (UTC) - 0:51:11 - train - INFO - step: 000195 - done (%): 39.0 - loss: 1.573 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.2 - avg_words_per_second: 4190.3 - ETA: >2024-05-25 23:57:21
2024-05-25 22:38:03 (UTC) - 0:51:23 - train - INFO - step: 000196 - done (%): 39.2 - loss: 1.380 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.2 - avg_words_per_second: 4195.6 - ETA: >2024-05-25 23:57:12
2024-05-25 22:38:15 (UTC) - 0:51:34 - train - INFO - step: 000197 - done (%): 39.4 - loss: 1.618 - lr: 4.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 4200.7 - ETA: >2024-05-25 23:57:02
2024-05-25 22:38:27 (UTC) - 0:51:46 - train - INFO - step: 000198 - done (%): 39.6 - loss: 1.450 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5515.1 - avg_words_per_second: 4205.8 - ETA: >2024-05-25 23:56:53
2024-05-25 22:38:39 (UTC) - 0:51:58 - train - INFO - step: 000199 - done (%): 39.8 - loss: 1.694 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.1 - avg_words_per_second: 4210.9 - ETA: >2024-05-25 23:56:43
2024-05-25 22:38:51 (UTC) - 0:52:10 - train - INFO - step: 000200 - done (%): 40.0 - loss: 1.470 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.1 - avg_words_per_second: 4215.9 - ETA: >2024-05-25 23:56:34
2024-05-25 22:38:51 (UTC) - 0:52:10 - checkpointing - INFO - Dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000200/consolidated using tmp name: tmp.consolidated
2024-05-25 22:39:05 (UTC) - 0:52:24 - checkpointing - INFO - Done dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000200/consolidated for step: 200
2024-05-25 22:39:05 (UTC) - 0:52:24 - checkpointing - INFO - Done deleting checkpoints
2024-05-25 22:39:05 (UTC) - 0:52:24 - checkpointing - INFO - Done!
2024-05-25 22:39:17 (UTC) - 0:52:36 - train - INFO - step: 000201 - done (%): 40.2 - loss: 1.370 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5595.4 - avg_words_per_second: 4221.1 - ETA: >2024-05-25 23:56:39
2024-05-25 22:39:28 (UTC) - 0:52:48 - train - INFO - step: 000202 - done (%): 40.4 - loss: 1.297 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5561.9 - avg_words_per_second: 4226.1 - ETA: >2024-05-25 23:56:30
2024-05-25 22:39:40 (UTC) - 0:53:00 - train - INFO - step: 000203 - done (%): 40.6 - loss: 1.427 - lr: 4.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.3 - avg_words_per_second: 4231.1 - ETA: >2024-05-25 23:56:20
2024-05-25 22:39:52 (UTC) - 0:53:12 - train - INFO - step: 000204 - done (%): 40.8 - loss: 1.444 - lr: 4.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.5 - avg_words_per_second: 4236.0 - ETA: >2024-05-25 23:56:12
2024-05-25 22:40:04 (UTC) - 0:53:23 - train - INFO - step: 000205 - done (%): 41.0 - loss: 1.351 - lr: 4.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.7 - avg_words_per_second: 4240.8 - ETA: >2024-05-25 23:56:03
2024-05-25 22:40:16 (UTC) - 0:53:35 - train - INFO - step: 000206 - done (%): 41.2 - loss: 1.551 - lr: 4.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.6 - avg_words_per_second: 4245.7 - ETA: >2024-05-25 23:55:54
2024-05-25 22:40:28 (UTC) - 0:53:47 - train - INFO - step: 000207 - done (%): 41.4 - loss: 1.254 - lr: 4.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.1 - avg_words_per_second: 4250.4 - ETA: >2024-05-25 23:55:45
2024-05-25 22:40:39 (UTC) - 0:53:59 - train - INFO - step: 000208 - done (%): 41.6 - loss: 1.583 - lr: 4.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.0 - avg_words_per_second: 4255.2 - ETA: >2024-05-25 23:55:37
2024-05-25 22:40:51 (UTC) - 0:54:11 - train - INFO - step: 000209 - done (%): 41.8 - loss: 1.622 - lr: 4.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.2 - avg_words_per_second: 4259.9 - ETA: >2024-05-25 23:55:28
2024-05-25 22:41:03 (UTC) - 0:54:23 - train - INFO - step: 000210 - done (%): 42.0 - loss: 1.504 - lr: 4.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.7 - avg_words_per_second: 4264.5 - ETA: >2024-05-25 23:55:20
2024-05-25 22:41:15 (UTC) - 0:54:34 - train - INFO - step: 000211 - done (%): 42.2 - loss: 1.620 - lr: 4.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.3 - avg_words_per_second: 4269.2 - ETA: >2024-05-25 23:55:11
2024-05-25 22:41:27 (UTC) - 0:54:46 - train - INFO - step: 000212 - done (%): 42.4 - loss: 1.549 - lr: 4.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.9 - avg_words_per_second: 4273.8 - ETA: >2024-05-25 23:55:03
2024-05-25 22:41:39 (UTC) - 0:54:58 - train - INFO - step: 000213 - done (%): 42.6 - loss: 1.008 - lr: 4.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4278.3 - ETA: >2024-05-25 23:54:55
2024-05-25 22:41:51 (UTC) - 0:55:10 - train - INFO - step: 000214 - done (%): 42.8 - loss: 1.178 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.8 - avg_words_per_second: 4282.9 - ETA: >2024-05-25 23:54:47
2024-05-25 22:42:02 (UTC) - 0:55:22 - train - INFO - step: 000215 - done (%): 43.0 - loss: 1.264 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.6 - avg_words_per_second: 4287.3 - ETA: >2024-05-25 23:54:39
2024-05-25 22:42:14 (UTC) - 0:55:34 - train - INFO - step: 000216 - done (%): 43.2 - loss: 1.416 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.4 - avg_words_per_second: 4291.8 - ETA: >2024-05-25 23:54:31
2024-05-25 22:42:26 (UTC) - 0:55:46 - train - INFO - step: 000217 - done (%): 43.4 - loss: 1.051 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.4 - avg_words_per_second: 4296.3 - ETA: >2024-05-25 23:54:23
2024-05-25 22:42:38 (UTC) - 0:55:57 - train - INFO - step: 000218 - done (%): 43.6 - loss: 1.094 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5549.0 - avg_words_per_second: 4300.7 - ETA: >2024-05-25 23:54:15
2024-05-25 22:42:50 (UTC) - 0:56:09 - train - INFO - step: 000219 - done (%): 43.8 - loss: 1.476 - lr: 3.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.6 - avg_words_per_second: 4305.1 - ETA: >2024-05-25 23:54:07
2024-05-25 22:43:02 (UTC) - 0:56:21 - train - INFO - step: 000220 - done (%): 44.0 - loss: 1.279 - lr: 3.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.8 - avg_words_per_second: 4309.5 - ETA: >2024-05-25 23:54:00
2024-05-25 22:43:13 (UTC) - 0:56:33 - train - INFO - step: 000221 - done (%): 44.2 - loss: 1.369 - lr: 3.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.8 - avg_words_per_second: 4313.8 - ETA: >2024-05-25 23:53:52
2024-05-25 22:43:25 (UTC) - 0:56:45 - train - INFO - step: 000222 - done (%): 44.4 - loss: 1.681 - lr: 3.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.8 - avg_words_per_second: 4318.1 - ETA: >2024-05-25 23:53:44
2024-05-25 22:43:37 (UTC) - 0:56:57 - train - INFO - step: 000223 - done (%): 44.6 - loss: 1.636 - lr: 3.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.4 - avg_words_per_second: 4322.3 - ETA: >2024-05-25 23:53:37
2024-05-25 22:43:49 (UTC) - 0:57:08 - train - INFO - step: 000224 - done (%): 44.8 - loss: 1.244 - lr: 3.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.7 - avg_words_per_second: 4326.6 - ETA: >2024-05-25 23:53:30
2024-05-25 22:44:01 (UTC) - 0:57:20 - train - INFO - step: 000225 - done (%): 45.0 - loss: 1.737 - lr: 3.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.7 - avg_words_per_second: 4330.8 - ETA: >2024-05-25 23:53:22
2024-05-25 22:44:13 (UTC) - 0:57:32 - train - INFO - step: 000226 - done (%): 45.2 - loss: 1.352 - lr: 3.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.5 - avg_words_per_second: 4334.9 - ETA: >2024-05-25 23:53:15
2024-05-25 22:44:24 (UTC) - 0:57:44 - train - INFO - step: 000227 - done (%): 45.4 - loss: 1.543 - lr: 3.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.9 - avg_words_per_second: 4339.1 - ETA: >2024-05-25 23:53:08
2024-05-25 22:44:49 (UTC) - 0:58:09 - train - INFO - step: 000228 - done (%): 45.6 - loss: 1.417 - lr: 3.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 2669.9 - avg_words_per_second: 4327.2 - ETA: >2024-05-25 23:53:28
2024-05-25 22:45:01 (UTC) - 0:58:20 - train - INFO - step: 000229 - done (%): 45.8 - loss: 1.203 - lr: 3.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5547.6 - avg_words_per_second: 4331.4 - ETA: >2024-05-25 23:53:21
2024-05-25 22:45:13 (UTC) - 0:58:32 - train - INFO - step: 000230 - done (%): 46.0 - loss: 1.519 - lr: 3.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.0 - avg_words_per_second: 4335.5 - ETA: >2024-05-25 23:53:14
2024-05-25 22:45:24 (UTC) - 0:58:44 - train - INFO - step: 000231 - done (%): 46.2 - loss: 1.555 - lr: 3.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.5 - avg_words_per_second: 4339.6 - ETA: >2024-05-25 23:53:07
2024-05-25 22:45:36 (UTC) - 0:58:56 - train - INFO - step: 000232 - done (%): 46.4 - loss: 1.334 - lr: 3.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.7 - avg_words_per_second: 4343.6 - ETA: >2024-05-25 23:53:00
2024-05-25 22:45:48 (UTC) - 0:59:08 - train - INFO - step: 000233 - done (%): 46.6 - loss: 1.727 - lr: 3.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 4347.6 - ETA: >2024-05-25 23:52:53
2024-05-25 22:46:00 (UTC) - 0:59:20 - train - INFO - step: 000234 - done (%): 46.8 - loss: 1.612 - lr: 3.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.3 - avg_words_per_second: 4351.6 - ETA: >2024-05-25 23:52:46
2024-05-25 22:46:12 (UTC) - 0:59:31 - train - INFO - step: 000235 - done (%): 47.0 - loss: 1.334 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.5 - avg_words_per_second: 4355.6 - ETA: >2024-05-25 23:52:39
2024-05-25 22:46:24 (UTC) - 0:59:43 - train - INFO - step: 000236 - done (%): 47.2 - loss: 1.165 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.3 - avg_words_per_second: 4359.5 - ETA: >2024-05-25 23:52:32
2024-05-25 22:46:36 (UTC) - 0:59:55 - train - INFO - step: 000237 - done (%): 47.4 - loss: 1.778 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5514.8 - avg_words_per_second: 4363.3 - ETA: >2024-05-25 23:52:26
2024-05-25 22:46:47 (UTC) - 1:00:07 - train - INFO - step: 000238 - done (%): 47.6 - loss: 1.407 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.4 - avg_words_per_second: 4367.2 - ETA: >2024-05-25 23:52:19
2024-05-25 22:46:59 (UTC) - 1:00:19 - train - INFO - step: 000239 - done (%): 47.8 - loss: 1.241 - lr: 3.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.3 - avg_words_per_second: 4371.1 - ETA: >2024-05-25 23:52:13
2024-05-25 22:47:11 (UTC) - 1:00:31 - train - INFO - step: 000240 - done (%): 48.0 - loss: 1.413 - lr: 3.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.9 - avg_words_per_second: 4374.9 - ETA: >2024-05-25 23:52:06
2024-05-25 22:47:23 (UTC) - 1:00:43 - train - INFO - step: 000241 - done (%): 48.2 - loss: 1.469 - lr: 3.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.7 - avg_words_per_second: 4378.7 - ETA: >2024-05-25 23:52:00
2024-05-25 22:47:35 (UTC) - 1:00:54 - train - INFO - step: 000242 - done (%): 48.4 - loss: 1.335 - lr: 3.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.3 - avg_words_per_second: 4382.5 - ETA: >2024-05-25 23:51:53
2024-05-25 22:47:47 (UTC) - 1:01:06 - train - INFO - step: 000243 - done (%): 48.6 - loss: 1.268 - lr: 3.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5519.5 - avg_words_per_second: 4386.2 - ETA: >2024-05-25 23:51:47
2024-05-25 22:47:59 (UTC) - 1:01:18 - train - INFO - step: 000244 - done (%): 48.8 - loss: 1.309 - lr: 3.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.9 - avg_words_per_second: 4389.9 - ETA: >2024-05-25 23:51:40
2024-05-25 22:48:10 (UTC) - 1:01:30 - train - INFO - step: 000245 - done (%): 49.0 - loss: 1.346 - lr: 3.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.6 - avg_words_per_second: 4393.6 - ETA: >2024-05-25 23:51:34
2024-05-25 22:48:22 (UTC) - 1:01:42 - train - INFO - step: 000246 - done (%): 49.2 - loss: 1.692 - lr: 3.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.3 - avg_words_per_second: 4397.3 - ETA: >2024-05-25 23:51:28
2024-05-25 22:48:34 (UTC) - 1:01:54 - train - INFO - step: 000247 - done (%): 49.4 - loss: 1.520 - lr: 3.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.8 - avg_words_per_second: 4400.9 - ETA: >2024-05-25 23:51:22
2024-05-25 22:48:46 (UTC) - 1:02:05 - train - INFO - step: 000248 - done (%): 49.6 - loss: 1.643 - lr: 3.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.2 - avg_words_per_second: 4404.6 - ETA: >2024-05-25 23:51:15
2024-05-25 22:48:58 (UTC) - 1:02:17 - train - INFO - step: 000249 - done (%): 49.8 - loss: 1.054 - lr: 3.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.3 - avg_words_per_second: 4408.2 - ETA: >2024-05-25 23:51:09
2024-05-25 22:49:10 (UTC) - 1:02:29 - train - INFO - step: 000250 - done (%): 50.0 - loss: 1.596 - lr: 3.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.0 - avg_words_per_second: 4411.8 - ETA: >2024-05-25 23:51:03
2024-05-25 22:49:22 (UTC) - 1:02:41 - train - INFO - step: 000251 - done (%): 50.2 - loss: 1.676 - lr: 3.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.3 - avg_words_per_second: 4415.3 - ETA: >2024-05-25 23:50:57
2024-05-25 22:49:33 (UTC) - 1:02:53 - train - INFO - step: 000252 - done (%): 50.4 - loss: 1.171 - lr: 3.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 4418.9 - ETA: >2024-05-25 23:50:51
2024-05-25 22:49:45 (UTC) - 1:03:05 - train - INFO - step: 000253 - done (%): 50.6 - loss: 1.225 - lr: 3.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.0 - avg_words_per_second: 4422.4 - ETA: >2024-05-25 23:50:45
2024-05-25 22:49:57 (UTC) - 1:03:17 - train - INFO - step: 000254 - done (%): 50.8 - loss: 1.532 - lr: 3.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.7 - avg_words_per_second: 4425.9 - ETA: >2024-05-25 23:50:40
2024-05-25 22:50:09 (UTC) - 1:03:28 - train - INFO - step: 000255 - done (%): 51.0 - loss: 1.554 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.8 - avg_words_per_second: 4429.4 - ETA: >2024-05-25 23:50:34
2024-05-25 22:50:21 (UTC) - 1:03:40 - train - INFO - step: 000256 - done (%): 51.2 - loss: 1.494 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.6 - avg_words_per_second: 4432.8 - ETA: >2024-05-25 23:50:28
2024-05-25 22:50:33 (UTC) - 1:03:52 - train - INFO - step: 000257 - done (%): 51.4 - loss: 1.649 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.9 - avg_words_per_second: 4436.2 - ETA: >2024-05-25 23:50:22
2024-05-25 22:50:44 (UTC) - 1:04:04 - train - INFO - step: 000258 - done (%): 51.6 - loss: 1.218 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.0 - avg_words_per_second: 4439.6 - ETA: >2024-05-25 23:50:17
2024-05-25 22:50:56 (UTC) - 1:04:16 - train - INFO - step: 000259 - done (%): 51.8 - loss: 1.344 - lr: 3.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 4443.0 - ETA: >2024-05-25 23:50:11
2024-05-25 22:51:08 (UTC) - 1:04:28 - train - INFO - step: 000260 - done (%): 52.0 - loss: 1.194 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4446.4 - ETA: >2024-05-25 23:50:05
2024-05-25 22:51:20 (UTC) - 1:04:39 - train - INFO - step: 000261 - done (%): 52.2 - loss: 1.567 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.2 - avg_words_per_second: 4449.8 - ETA: >2024-05-25 23:50:00
2024-05-25 22:51:32 (UTC) - 1:04:51 - train - INFO - step: 000262 - done (%): 52.4 - loss: 1.233 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5520.5 - avg_words_per_second: 4453.1 - ETA: >2024-05-25 23:49:55
2024-05-25 22:51:44 (UTC) - 1:05:03 - train - INFO - step: 000263 - done (%): 52.6 - loss: 1.752 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.1 - avg_words_per_second: 4456.4 - ETA: >2024-05-25 23:49:49
2024-05-25 22:51:56 (UTC) - 1:05:15 - train - INFO - step: 000264 - done (%): 52.8 - loss: 1.538 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.9 - avg_words_per_second: 4459.6 - ETA: >2024-05-25 23:49:44
2024-05-25 22:52:07 (UTC) - 1:05:27 - train - INFO - step: 000265 - done (%): 53.0 - loss: 1.257 - lr: 3.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.6 - avg_words_per_second: 4462.9 - ETA: >2024-05-25 23:49:38
2024-05-25 22:52:19 (UTC) - 1:05:39 - train - INFO - step: 000266 - done (%): 53.2 - loss: 1.176 - lr: 2.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5506.6 - avg_words_per_second: 4466.1 - ETA: >2024-05-25 23:49:33
2024-05-25 22:52:31 (UTC) - 1:05:51 - train - INFO - step: 000267 - done (%): 53.4 - loss: 1.818 - lr: 2.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.1 - avg_words_per_second: 4469.3 - ETA: >2024-05-25 23:49:28
2024-05-25 22:52:43 (UTC) - 1:06:03 - train - INFO - step: 000268 - done (%): 53.6 - loss: 1.501 - lr: 2.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5508.7 - avg_words_per_second: 4472.4 - ETA: >2024-05-25 23:49:23
2024-05-25 22:52:55 (UTC) - 1:06:14 - train - INFO - step: 000269 - done (%): 53.8 - loss: 1.369 - lr: 2.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.6 - avg_words_per_second: 4475.6 - ETA: >2024-05-25 23:49:17
2024-05-25 22:53:07 (UTC) - 1:06:26 - train - INFO - step: 000270 - done (%): 54.0 - loss: 1.351 - lr: 2.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5519.4 - avg_words_per_second: 4478.7 - ETA: >2024-05-25 23:49:12
2024-05-25 22:53:19 (UTC) - 1:06:38 - train - INFO - step: 000271 - done (%): 54.2 - loss: 1.298 - lr: 2.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5517.0 - avg_words_per_second: 4481.8 - ETA: >2024-05-25 23:49:07
2024-05-25 22:53:31 (UTC) - 1:06:50 - train - INFO - step: 000272 - done (%): 54.4 - loss: 1.147 - lr: 2.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.4 - avg_words_per_second: 4485.0 - ETA: >2024-05-25 23:49:02
2024-05-25 22:53:42 (UTC) - 1:07:02 - train - INFO - step: 000273 - done (%): 54.6 - loss: 1.430 - lr: 2.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.8 - avg_words_per_second: 4488.1 - ETA: >2024-05-25 23:48:57
2024-05-25 22:53:54 (UTC) - 1:07:14 - train - INFO - step: 000274 - done (%): 54.8 - loss: 1.398 - lr: 2.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.4 - avg_words_per_second: 4491.2 - ETA: >2024-05-25 23:48:52
2024-05-25 22:54:06 (UTC) - 1:07:26 - train - INFO - step: 000275 - done (%): 55.0 - loss: 1.497 - lr: 2.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5516.2 - avg_words_per_second: 4494.2 - ETA: >2024-05-25 23:48:47
2024-05-25 22:54:18 (UTC) - 1:07:38 - train - INFO - step: 000276 - done (%): 55.2 - loss: 1.304 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.6 - avg_words_per_second: 4497.3 - ETA: >2024-05-25 23:48:42
2024-05-25 22:54:30 (UTC) - 1:07:49 - train - INFO - step: 000277 - done (%): 55.4 - loss: 1.488 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.1 - avg_words_per_second: 4500.3 - ETA: >2024-05-25 23:48:37
2024-05-25 22:54:42 (UTC) - 1:08:01 - train - INFO - step: 000278 - done (%): 55.6 - loss: 1.497 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.6 - avg_words_per_second: 4503.3 - ETA: >2024-05-25 23:48:32
2024-05-25 22:54:54 (UTC) - 1:08:13 - train - INFO - step: 000279 - done (%): 55.8 - loss: 1.898 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.4 - avg_words_per_second: 4506.3 - ETA: >2024-05-25 23:48:28
2024-05-25 22:55:05 (UTC) - 1:08:25 - train - INFO - step: 000280 - done (%): 56.0 - loss: 1.474 - lr: 2.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.7 - avg_words_per_second: 4509.3 - ETA: >2024-05-25 23:48:23
2024-05-25 22:55:17 (UTC) - 1:08:37 - train - INFO - step: 000281 - done (%): 56.2 - loss: 1.401 - lr: 2.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.0 - avg_words_per_second: 4512.2 - ETA: >2024-05-25 23:48:18
2024-05-25 22:55:29 (UTC) - 1:08:49 - train - INFO - step: 000282 - done (%): 56.4 - loss: 1.512 - lr: 2.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5514.9 - avg_words_per_second: 4515.1 - ETA: >2024-05-25 23:48:13
2024-05-25 22:55:41 (UTC) - 1:09:01 - train - INFO - step: 000283 - done (%): 56.6 - loss: 1.403 - lr: 2.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.5 - avg_words_per_second: 4518.1 - ETA: >2024-05-25 23:48:09
2024-05-25 22:55:53 (UTC) - 1:09:12 - train - INFO - step: 000284 - done (%): 56.8 - loss: 1.500 - lr: 2.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 4521.0 - ETA: >2024-05-25 23:48:04
2024-05-25 22:56:05 (UTC) - 1:09:24 - train - INFO - step: 000285 - done (%): 57.0 - loss: 1.353 - lr: 2.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.2 - avg_words_per_second: 4523.9 - ETA: >2024-05-25 23:47:59
2024-05-25 22:56:17 (UTC) - 1:09:36 - train - INFO - step: 000286 - done (%): 57.2 - loss: 1.132 - lr: 2.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.9 - avg_words_per_second: 4526.8 - ETA: >2024-05-25 23:47:55
2024-05-25 22:56:28 (UTC) - 1:09:48 - train - INFO - step: 000287 - done (%): 57.4 - loss: 1.487 - lr: 2.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.0 - avg_words_per_second: 4529.7 - ETA: >2024-05-25 23:47:50
2024-05-25 22:56:40 (UTC) - 1:10:00 - train - INFO - step: 000288 - done (%): 57.6 - loss: 1.560 - lr: 2.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.8 - avg_words_per_second: 4532.5 - ETA: >2024-05-25 23:47:46
2024-05-25 22:56:52 (UTC) - 1:10:12 - train - INFO - step: 000289 - done (%): 57.8 - loss: 1.445 - lr: 2.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.3 - avg_words_per_second: 4535.4 - ETA: >2024-05-25 23:47:41
2024-05-25 22:57:04 (UTC) - 1:10:23 - train - INFO - step: 000290 - done (%): 58.0 - loss: 1.385 - lr: 2.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.5 - avg_words_per_second: 4538.2 - ETA: >2024-05-25 23:47:37
2024-05-25 22:57:16 (UTC) - 1:10:35 - train - INFO - step: 000291 - done (%): 58.2 - loss: 1.879 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.6 - avg_words_per_second: 4541.0 - ETA: >2024-05-25 23:47:32
2024-05-25 22:57:28 (UTC) - 1:10:47 - train - INFO - step: 000292 - done (%): 58.4 - loss: 1.462 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5512.9 - avg_words_per_second: 4543.7 - ETA: >2024-05-25 23:47:28
2024-05-25 22:57:40 (UTC) - 1:10:59 - train - INFO - step: 000293 - done (%): 58.6 - loss: 1.581 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.9 - avg_words_per_second: 4546.5 - ETA: >2024-05-25 23:47:23
2024-05-25 22:57:51 (UTC) - 1:11:11 - train - INFO - step: 000294 - done (%): 58.8 - loss: 1.524 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.6 - avg_words_per_second: 4549.2 - ETA: >2024-05-25 23:47:19
2024-05-25 22:58:03 (UTC) - 1:11:23 - train - INFO - step: 000295 - done (%): 59.0 - loss: 1.449 - lr: 2.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.5 - avg_words_per_second: 4552.0 - ETA: >2024-05-25 23:47:15
2024-05-25 22:58:15 (UTC) - 1:11:35 - train - INFO - step: 000296 - done (%): 59.2 - loss: 1.366 - lr: 2.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.3 - avg_words_per_second: 4554.7 - ETA: >2024-05-25 23:47:10
2024-05-25 22:58:27 (UTC) - 1:11:46 - train - INFO - step: 000297 - done (%): 59.4 - loss: 1.130 - lr: 2.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5518.6 - avg_words_per_second: 4557.4 - ETA: >2024-05-25 23:47:06
2024-05-25 22:58:39 (UTC) - 1:11:58 - train - INFO - step: 000298 - done (%): 59.6 - loss: 1.586 - lr: 2.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.3 - avg_words_per_second: 4560.1 - ETA: >2024-05-25 23:47:02
2024-05-25 22:58:51 (UTC) - 1:12:10 - train - INFO - step: 000299 - done (%): 59.8 - loss: 1.351 - lr: 2.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5517.8 - avg_words_per_second: 4562.7 - ETA: >2024-05-25 23:46:58
2024-05-25 22:59:03 (UTC) - 1:12:22 - train - INFO - step: 000300 - done (%): 60.0 - loss: 1.279 - lr: 2.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.4 - avg_words_per_second: 4565.4 - ETA: >2024-05-25 23:46:54
2024-05-25 22:59:03 (UTC) - 1:12:22 - checkpointing - INFO - Dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000300/consolidated using tmp name: tmp.consolidated
2024-05-25 22:59:17 (UTC) - 1:12:36 - checkpointing - INFO - Done dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000300/consolidated for step: 300
2024-05-25 22:59:17 (UTC) - 1:12:36 - checkpointing - INFO - Done deleting checkpoints
2024-05-25 22:59:17 (UTC) - 1:12:36 - checkpointing - INFO - Done!
2024-05-25 22:59:29 (UTC) - 1:12:48 - train - INFO - step: 000301 - done (%): 60.2 - loss: 1.616 - lr: 2.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5596.1 - avg_words_per_second: 4568.2 - ETA: >2024-05-25 23:47:03
2024-05-25 22:59:40 (UTC) - 1:13:00 - train - INFO - step: 000302 - done (%): 60.4 - loss: 1.353 - lr: 2.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5549.4 - avg_words_per_second: 4570.9 - ETA: >2024-05-25 23:46:59
2024-05-25 22:59:52 (UTC) - 1:13:12 - train - INFO - step: 000303 - done (%): 60.6 - loss: 1.306 - lr: 2.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.7 - avg_words_per_second: 4573.5 - ETA: >2024-05-25 23:46:55
2024-05-25 23:00:04 (UTC) - 1:13:24 - train - INFO - step: 000304 - done (%): 60.8 - loss: 1.474 - lr: 2.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.7 - avg_words_per_second: 4576.1 - ETA: >2024-05-25 23:46:51
2024-05-25 23:00:16 (UTC) - 1:13:35 - train - INFO - step: 000305 - done (%): 61.0 - loss: 1.767 - lr: 2.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.9 - avg_words_per_second: 4578.7 - ETA: >2024-05-25 23:46:47
2024-05-25 23:00:28 (UTC) - 1:13:47 - train - INFO - step: 000306 - done (%): 61.2 - loss: 1.460 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.6 - avg_words_per_second: 4581.3 - ETA: >2024-05-25 23:46:43
2024-05-25 23:00:40 (UTC) - 1:13:59 - train - INFO - step: 000307 - done (%): 61.4 - loss: 1.585 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.8 - avg_words_per_second: 4583.9 - ETA: >2024-05-25 23:46:39
2024-05-25 23:00:51 (UTC) - 1:14:11 - train - INFO - step: 000308 - done (%): 61.6 - loss: 1.692 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.5 - avg_words_per_second: 4586.4 - ETA: >2024-05-25 23:46:35
2024-05-25 23:01:03 (UTC) - 1:14:23 - train - INFO - step: 000309 - done (%): 61.8 - loss: 1.319 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.9 - avg_words_per_second: 4589.0 - ETA: >2024-05-25 23:46:31
2024-05-25 23:01:15 (UTC) - 1:14:35 - train - INFO - step: 000310 - done (%): 62.0 - loss: 1.571 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5520.8 - avg_words_per_second: 4591.5 - ETA: >2024-05-25 23:46:27
2024-05-25 23:01:27 (UTC) - 1:14:46 - train - INFO - step: 000311 - done (%): 62.2 - loss: 1.374 - lr: 2.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.5 - avg_words_per_second: 4594.0 - ETA: >2024-05-25 23:46:23
2024-05-25 23:01:39 (UTC) - 1:14:58 - train - INFO - step: 000312 - done (%): 62.4 - loss: 1.531 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 4596.5 - ETA: >2024-05-25 23:46:19
2024-05-25 23:01:51 (UTC) - 1:15:10 - train - INFO - step: 000313 - done (%): 62.6 - loss: 1.690 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4599.0 - ETA: >2024-05-25 23:46:15
2024-05-25 23:02:03 (UTC) - 1:15:22 - train - INFO - step: 000314 - done (%): 62.8 - loss: 1.492 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.2 - avg_words_per_second: 4601.4 - ETA: >2024-05-25 23:46:12
2024-05-25 23:02:14 (UTC) - 1:15:34 - train - INFO - step: 000315 - done (%): 63.0 - loss: 1.363 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.6 - avg_words_per_second: 4603.9 - ETA: >2024-05-25 23:46:08
2024-05-25 23:02:26 (UTC) - 1:15:46 - train - INFO - step: 000316 - done (%): 63.2 - loss: 1.536 - lr: 2.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.8 - avg_words_per_second: 4606.3 - ETA: >2024-05-25 23:46:04
2024-05-25 23:02:38 (UTC) - 1:15:58 - train - INFO - step: 000317 - done (%): 63.4 - loss: 0.918 - lr: 1.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.0 - avg_words_per_second: 4608.8 - ETA: >2024-05-25 23:46:00
2024-05-25 23:02:50 (UTC) - 1:16:09 - train - INFO - step: 000318 - done (%): 63.6 - loss: 1.602 - lr: 1.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.6 - avg_words_per_second: 4611.2 - ETA: >2024-05-25 23:45:57
2024-05-25 23:03:02 (UTC) - 1:16:21 - train - INFO - step: 000319 - done (%): 63.8 - loss: 1.502 - lr: 1.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.3 - avg_words_per_second: 4613.6 - ETA: >2024-05-25 23:45:53
2024-05-25 23:03:14 (UTC) - 1:16:33 - train - INFO - step: 000320 - done (%): 64.0 - loss: 1.174 - lr: 1.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.7 - avg_words_per_second: 4616.0 - ETA: >2024-05-25 23:45:49
2024-05-25 23:03:25 (UTC) - 1:16:45 - train - INFO - step: 000321 - done (%): 64.2 - loss: 1.312 - lr: 1.9e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.1 - avg_words_per_second: 4618.4 - ETA: >2024-05-25 23:45:45
2024-05-25 23:03:37 (UTC) - 1:16:57 - train - INFO - step: 000322 - done (%): 64.4 - loss: 1.556 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.7 - avg_words_per_second: 4620.8 - ETA: >2024-05-25 23:45:42
2024-05-25 23:03:49 (UTC) - 1:17:09 - train - INFO - step: 000323 - done (%): 64.6 - loss: 1.376 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 4623.1 - ETA: >2024-05-25 23:45:38
2024-05-25 23:04:01 (UTC) - 1:17:20 - train - INFO - step: 000324 - done (%): 64.8 - loss: 1.338 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.3 - avg_words_per_second: 4625.5 - ETA: >2024-05-25 23:45:35
2024-05-25 23:04:13 (UTC) - 1:17:32 - train - INFO - step: 000325 - done (%): 65.0 - loss: 1.470 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 4627.8 - ETA: >2024-05-25 23:45:31
2024-05-25 23:04:25 (UTC) - 1:17:44 - train - INFO - step: 000326 - done (%): 65.2 - loss: 1.063 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.6 - avg_words_per_second: 4630.2 - ETA: >2024-05-25 23:45:27
2024-05-25 23:04:36 (UTC) - 1:17:56 - train - INFO - step: 000327 - done (%): 65.4 - loss: 1.150 - lr: 1.8e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.6 - avg_words_per_second: 4632.5 - ETA: >2024-05-25 23:45:24
2024-05-25 23:04:48 (UTC) - 1:18:08 - train - INFO - step: 000328 - done (%): 65.6 - loss: 1.332 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.2 - avg_words_per_second: 4634.8 - ETA: >2024-05-25 23:45:20
2024-05-25 23:05:00 (UTC) - 1:18:20 - train - INFO - step: 000329 - done (%): 65.8 - loss: 0.928 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.8 - avg_words_per_second: 4637.1 - ETA: >2024-05-25 23:45:17
2024-05-25 23:05:12 (UTC) - 1:18:31 - train - INFO - step: 000330 - done (%): 66.0 - loss: 1.566 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.8 - avg_words_per_second: 4639.4 - ETA: >2024-05-25 23:45:13
2024-05-25 23:05:24 (UTC) - 1:18:43 - train - INFO - step: 000331 - done (%): 66.2 - loss: 1.357 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.9 - avg_words_per_second: 4641.6 - ETA: >2024-05-25 23:45:10
2024-05-25 23:05:36 (UTC) - 1:18:55 - train - INFO - step: 000332 - done (%): 66.4 - loss: 1.501 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.8 - avg_words_per_second: 4643.9 - ETA: >2024-05-25 23:45:07
2024-05-25 23:05:48 (UTC) - 1:19:07 - train - INFO - step: 000333 - done (%): 66.6 - loss: 1.255 - lr: 1.7e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.4 - avg_words_per_second: 4646.1 - ETA: >2024-05-25 23:45:03
2024-05-25 23:05:59 (UTC) - 1:19:19 - train - INFO - step: 000334 - done (%): 66.8 - loss: 1.473 - lr: 1.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.3 - avg_words_per_second: 4648.3 - ETA: >2024-05-25 23:45:00
2024-05-25 23:06:11 (UTC) - 1:19:31 - train - INFO - step: 000335 - done (%): 67.0 - loss: 1.245 - lr: 1.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.6 - avg_words_per_second: 4650.6 - ETA: >2024-05-25 23:44:56
2024-05-25 23:06:23 (UTC) - 1:19:43 - train - INFO - step: 000336 - done (%): 67.2 - loss: 1.144 - lr: 1.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.7 - avg_words_per_second: 4652.8 - ETA: >2024-05-25 23:44:53
2024-05-25 23:06:35 (UTC) - 1:19:54 - train - INFO - step: 000337 - done (%): 67.4 - loss: 1.227 - lr: 1.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.2 - avg_words_per_second: 4655.0 - ETA: >2024-05-25 23:44:50
2024-05-25 23:06:47 (UTC) - 1:20:06 - train - INFO - step: 000338 - done (%): 67.6 - loss: 1.098 - lr: 1.6e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.4 - avg_words_per_second: 4657.1 - ETA: >2024-05-25 23:44:46
2024-05-25 23:06:59 (UTC) - 1:20:18 - train - INFO - step: 000339 - done (%): 67.8 - loss: 1.598 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.5 - avg_words_per_second: 4659.3 - ETA: >2024-05-25 23:44:43
2024-05-25 23:07:11 (UTC) - 1:20:30 - train - INFO - step: 000340 - done (%): 68.0 - loss: 1.365 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.9 - avg_words_per_second: 4661.5 - ETA: >2024-05-25 23:44:40
2024-05-25 23:07:22 (UTC) - 1:20:42 - train - INFO - step: 000341 - done (%): 68.2 - loss: 1.765 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.1 - avg_words_per_second: 4663.6 - ETA: >2024-05-25 23:44:37
2024-05-25 23:07:34 (UTC) - 1:20:54 - train - INFO - step: 000342 - done (%): 68.4 - loss: 1.611 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 4665.8 - ETA: >2024-05-25 23:44:33
2024-05-25 23:07:46 (UTC) - 1:21:06 - train - INFO - step: 000343 - done (%): 68.6 - loss: 1.263 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.9 - avg_words_per_second: 4667.9 - ETA: >2024-05-25 23:44:30
2024-05-25 23:07:58 (UTC) - 1:21:17 - train - INFO - step: 000344 - done (%): 68.8 - loss: 1.222 - lr: 1.5e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.9 - avg_words_per_second: 4670.0 - ETA: >2024-05-25 23:44:27
2024-05-25 23:08:10 (UTC) - 1:21:29 - train - INFO - step: 000345 - done (%): 69.0 - loss: 1.333 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.8 - avg_words_per_second: 4672.2 - ETA: >2024-05-25 23:44:24
2024-05-25 23:08:22 (UTC) - 1:21:41 - train - INFO - step: 000346 - done (%): 69.2 - loss: 1.656 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.3 - avg_words_per_second: 4674.3 - ETA: >2024-05-25 23:44:21
2024-05-25 23:08:33 (UTC) - 1:21:53 - train - INFO - step: 000347 - done (%): 69.4 - loss: 1.091 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.0 - avg_words_per_second: 4676.3 - ETA: >2024-05-25 23:44:18
2024-05-25 23:08:57 (UTC) - 1:22:17 - train - INFO - step: 000348 - done (%): 69.6 - loss: 1.215 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 2773.9 - avg_words_per_second: 4667.2 - ETA: >2024-05-25 23:44:31
2024-05-25 23:09:09 (UTC) - 1:22:28 - train - INFO - step: 000349 - done (%): 69.8 - loss: 1.466 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5554.7 - avg_words_per_second: 4669.3 - ETA: >2024-05-25 23:44:28
2024-05-25 23:09:21 (UTC) - 1:22:40 - train - INFO - step: 000350 - done (%): 70.0 - loss: 1.327 - lr: 1.4e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.4 - avg_words_per_second: 4671.4 - ETA: >2024-05-25 23:44:25
2024-05-25 23:09:33 (UTC) - 1:22:52 - train - INFO - step: 000351 - done (%): 70.2 - loss: 1.249 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.6 - avg_words_per_second: 4673.4 - ETA: >2024-05-25 23:44:22
2024-05-25 23:09:44 (UTC) - 1:23:04 - train - INFO - step: 000352 - done (%): 70.4 - loss: 1.041 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5523.7 - avg_words_per_second: 4675.5 - ETA: >2024-05-25 23:44:19
2024-05-25 23:09:56 (UTC) - 1:23:16 - train - INFO - step: 000353 - done (%): 70.6 - loss: 1.243 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.1 - avg_words_per_second: 4677.5 - ETA: >2024-05-25 23:44:16
2024-05-25 23:10:08 (UTC) - 1:23:28 - train - INFO - step: 000354 - done (%): 70.8 - loss: 1.302 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.8 - avg_words_per_second: 4679.6 - ETA: >2024-05-25 23:44:13
2024-05-25 23:10:20 (UTC) - 1:23:39 - train - INFO - step: 000355 - done (%): 71.0 - loss: 1.125 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.1 - avg_words_per_second: 4681.6 - ETA: >2024-05-25 23:44:10
2024-05-25 23:10:32 (UTC) - 1:23:51 - train - INFO - step: 000356 - done (%): 71.2 - loss: 1.213 - lr: 1.3e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.4 - avg_words_per_second: 4683.7 - ETA: >2024-05-25 23:44:07
2024-05-25 23:10:44 (UTC) - 1:24:03 - train - INFO - step: 000357 - done (%): 71.4 - loss: 1.164 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.6 - avg_words_per_second: 4685.7 - ETA: >2024-05-25 23:44:04
2024-05-25 23:10:55 (UTC) - 1:24:15 - train - INFO - step: 000358 - done (%): 71.6 - loss: 1.353 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.0 - avg_words_per_second: 4687.7 - ETA: >2024-05-25 23:44:01
2024-05-25 23:11:07 (UTC) - 1:24:27 - train - INFO - step: 000359 - done (%): 71.8 - loss: 1.558 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.0 - avg_words_per_second: 4689.7 - ETA: >2024-05-25 23:43:58
2024-05-25 23:11:19 (UTC) - 1:24:39 - train - INFO - step: 000360 - done (%): 72.0 - loss: 1.347 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4691.7 - ETA: >2024-05-25 23:43:55
2024-05-25 23:11:31 (UTC) - 1:24:50 - train - INFO - step: 000361 - done (%): 72.2 - loss: 1.318 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.0 - avg_words_per_second: 4693.7 - ETA: >2024-05-25 23:43:52
2024-05-25 23:11:43 (UTC) - 1:25:02 - train - INFO - step: 000362 - done (%): 72.4 - loss: 1.455 - lr: 1.2e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.9 - avg_words_per_second: 4695.6 - ETA: >2024-05-25 23:43:49
2024-05-25 23:11:55 (UTC) - 1:25:14 - train - INFO - step: 000363 - done (%): 72.6 - loss: 1.327 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.1 - avg_words_per_second: 4697.6 - ETA: >2024-05-25 23:43:46
2024-05-25 23:12:06 (UTC) - 1:25:26 - train - INFO - step: 000364 - done (%): 72.8 - loss: 1.119 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.7 - avg_words_per_second: 4699.5 - ETA: >2024-05-25 23:43:43
2024-05-25 23:12:18 (UTC) - 1:25:38 - train - INFO - step: 000365 - done (%): 73.0 - loss: 1.480 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.2 - avg_words_per_second: 4701.5 - ETA: >2024-05-25 23:43:40
2024-05-25 23:12:30 (UTC) - 1:25:50 - train - INFO - step: 000366 - done (%): 73.2 - loss: 1.575 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.4 - avg_words_per_second: 4703.4 - ETA: >2024-05-25 23:43:37
2024-05-25 23:12:42 (UTC) - 1:26:02 - train - INFO - step: 000367 - done (%): 73.4 - loss: 1.028 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.0 - avg_words_per_second: 4705.4 - ETA: >2024-05-25 23:43:34
2024-05-25 23:12:54 (UTC) - 1:26:13 - train - INFO - step: 000368 - done (%): 73.6 - loss: 1.233 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.7 - avg_words_per_second: 4707.3 - ETA: >2024-05-25 23:43:32
2024-05-25 23:13:06 (UTC) - 1:26:25 - train - INFO - step: 000369 - done (%): 73.8 - loss: 1.436 - lr: 1.1e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.9 - avg_words_per_second: 4709.2 - ETA: >2024-05-25 23:43:29
2024-05-25 23:13:17 (UTC) - 1:26:37 - train - INFO - step: 000370 - done (%): 74.0 - loss: 1.319 - lr: 1.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.4 - avg_words_per_second: 4711.1 - ETA: >2024-05-25 23:43:26
2024-05-25 23:13:29 (UTC) - 1:26:49 - train - INFO - step: 000371 - done (%): 74.2 - loss: 1.388 - lr: 1.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.9 - avg_words_per_second: 4713.0 - ETA: >2024-05-25 23:43:23
2024-05-25 23:13:41 (UTC) - 1:27:01 - train - INFO - step: 000372 - done (%): 74.4 - loss: 1.307 - lr: 1.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.8 - avg_words_per_second: 4714.9 - ETA: >2024-05-25 23:43:20
2024-05-25 23:13:53 (UTC) - 1:27:13 - train - INFO - step: 000373 - done (%): 74.6 - loss: 1.504 - lr: 1.0e-05 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.1 - avg_words_per_second: 4716.8 - ETA: >2024-05-25 23:43:18
2024-05-25 23:14:05 (UTC) - 1:27:24 - train - INFO - step: 000374 - done (%): 74.8 - loss: 1.137 - lr: 9.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.2 - avg_words_per_second: 4718.7 - ETA: >2024-05-25 23:43:15
2024-05-25 23:14:17 (UTC) - 1:27:36 - train - INFO - step: 000375 - done (%): 75.0 - loss: 1.364 - lr: 9.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.1 - avg_words_per_second: 4720.5 - ETA: >2024-05-25 23:43:12
2024-05-25 23:14:29 (UTC) - 1:27:48 - train - INFO - step: 000376 - done (%): 75.2 - loss: 1.640 - lr: 9.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.0 - avg_words_per_second: 4722.4 - ETA: >2024-05-25 23:43:09
2024-05-25 23:14:40 (UTC) - 1:28:00 - train - INFO - step: 000377 - done (%): 75.4 - loss: 1.579 - lr: 9.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5525.9 - avg_words_per_second: 4724.2 - ETA: >2024-05-25 23:43:07
2024-05-25 23:14:52 (UTC) - 1:28:12 - train - INFO - step: 000378 - done (%): 75.6 - loss: 1.289 - lr: 9.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.4 - avg_words_per_second: 4726.0 - ETA: >2024-05-25 23:43:04
2024-05-25 23:15:04 (UTC) - 1:28:24 - train - INFO - step: 000379 - done (%): 75.8 - loss: 1.053 - lr: 9.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.3 - avg_words_per_second: 4727.8 - ETA: >2024-05-25 23:43:01
2024-05-25 23:15:16 (UTC) - 1:28:35 - train - INFO - step: 000380 - done (%): 76.0 - loss: 1.310 - lr: 9.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.7 - avg_words_per_second: 4729.7 - ETA: >2024-05-25 23:42:59
2024-05-25 23:15:28 (UTC) - 1:28:47 - train - INFO - step: 000381 - done (%): 76.2 - loss: 0.957 - lr: 8.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5503.4 - avg_words_per_second: 4731.4 - ETA: >2024-05-25 23:42:56
2024-05-25 23:15:40 (UTC) - 1:28:59 - train - INFO - step: 000382 - done (%): 76.4 - loss: 1.262 - lr: 8.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.8 - avg_words_per_second: 4733.2 - ETA: >2024-05-25 23:42:53
2024-05-25 23:15:51 (UTC) - 1:29:11 - train - INFO - step: 000383 - done (%): 76.6 - loss: 1.507 - lr: 8.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 4735.0 - ETA: >2024-05-25 23:42:51
2024-05-25 23:16:03 (UTC) - 1:29:23 - train - INFO - step: 000384 - done (%): 76.8 - loss: 1.251 - lr: 8.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.0 - avg_words_per_second: 4736.8 - ETA: >2024-05-25 23:42:48
2024-05-25 23:16:15 (UTC) - 1:29:35 - train - INFO - step: 000385 - done (%): 77.0 - loss: 1.230 - lr: 8.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.2 - avg_words_per_second: 4738.6 - ETA: >2024-05-25 23:42:46
2024-05-25 23:16:27 (UTC) - 1:29:46 - train - INFO - step: 000386 - done (%): 77.2 - loss: 1.359 - lr: 8.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.4 - avg_words_per_second: 4740.4 - ETA: >2024-05-25 23:42:43
2024-05-25 23:16:39 (UTC) - 1:29:58 - train - INFO - step: 000387 - done (%): 77.4 - loss: 1.069 - lr: 8.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.5 - avg_words_per_second: 4742.1 - ETA: >2024-05-25 23:42:41
2024-05-25 23:16:51 (UTC) - 1:30:10 - train - INFO - step: 000388 - done (%): 77.6 - loss: 1.511 - lr: 7.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.7 - avg_words_per_second: 4743.8 - ETA: >2024-05-25 23:42:38
2024-05-25 23:17:03 (UTC) - 1:30:22 - train - INFO - step: 000389 - done (%): 77.8 - loss: 1.536 - lr: 7.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.6 - avg_words_per_second: 4745.6 - ETA: >2024-05-25 23:42:35
2024-05-25 23:17:14 (UTC) - 1:30:34 - train - INFO - step: 000390 - done (%): 78.0 - loss: 1.264 - lr: 7.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.1 - avg_words_per_second: 4747.3 - ETA: >2024-05-25 23:42:33
2024-05-25 23:17:26 (UTC) - 1:30:46 - train - INFO - step: 000391 - done (%): 78.2 - loss: 1.485 - lr: 7.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.6 - avg_words_per_second: 4749.0 - ETA: >2024-05-25 23:42:30
2024-05-25 23:17:38 (UTC) - 1:30:58 - train - INFO - step: 000392 - done (%): 78.4 - loss: 1.789 - lr: 7.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.6 - avg_words_per_second: 4750.8 - ETA: >2024-05-25 23:42:28
2024-05-25 23:17:50 (UTC) - 1:31:09 - train - INFO - step: 000393 - done (%): 78.6 - loss: 1.360 - lr: 7.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.7 - avg_words_per_second: 4752.5 - ETA: >2024-05-25 23:42:25
2024-05-25 23:18:02 (UTC) - 1:31:21 - train - INFO - step: 000394 - done (%): 78.8 - loss: 1.210 - lr: 7.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.6 - avg_words_per_second: 4754.2 - ETA: >2024-05-25 23:42:23
2024-05-25 23:18:14 (UTC) - 1:31:33 - train - INFO - step: 000395 - done (%): 79.0 - loss: 1.167 - lr: 6.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.0 - avg_words_per_second: 4755.9 - ETA: >2024-05-25 23:42:20
2024-05-25 23:18:25 (UTC) - 1:31:45 - train - INFO - step: 000396 - done (%): 79.2 - loss: 1.363 - lr: 6.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.1 - avg_words_per_second: 4757.6 - ETA: >2024-05-25 23:42:18
2024-05-25 23:18:37 (UTC) - 1:31:57 - train - INFO - step: 000397 - done (%): 79.4 - loss: 1.483 - lr: 6.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.1 - avg_words_per_second: 4759.3 - ETA: >2024-05-25 23:42:16
2024-05-25 23:18:49 (UTC) - 1:32:09 - train - INFO - step: 000398 - done (%): 79.6 - loss: 1.338 - lr: 6.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.2 - avg_words_per_second: 4761.0 - ETA: >2024-05-25 23:42:13
2024-05-25 23:19:01 (UTC) - 1:32:20 - train - INFO - step: 000399 - done (%): 79.8 - loss: 1.236 - lr: 6.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.8 - avg_words_per_second: 4762.6 - ETA: >2024-05-25 23:42:11
2024-05-25 23:19:13 (UTC) - 1:32:32 - train - INFO - step: 000400 - done (%): 80.0 - loss: 1.424 - lr: 6.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.1 - avg_words_per_second: 4764.3 - ETA: >2024-05-25 23:42:08
2024-05-25 23:19:13 (UTC) - 1:32:32 - checkpointing - INFO - Dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000400/consolidated using tmp name: tmp.consolidated
2024-05-25 23:19:27 (UTC) - 1:32:47 - checkpointing - INFO - Done dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000400/consolidated for step: 400
2024-05-25 23:19:28 (UTC) - 1:32:47 - checkpointing - INFO - Deleted ckpt: /root/mistral-finetune/runseed5/checkpoints/checkpoint_000100
2024-05-25 23:19:28 (UTC) - 1:32:47 - checkpointing - INFO - Done deleting checkpoints /root/mistral-finetune/runseed5/checkpoints/checkpoint_000100
2024-05-25 23:19:28 (UTC) - 1:32:47 - checkpointing - INFO - Done!
2024-05-25 23:19:40 (UTC) - 1:32:59 - train - INFO - step: 000401 - done (%): 80.2 - loss: 1.077 - lr: 6.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5606.8 - avg_words_per_second: 4766.1 - ETA: >2024-05-25 23:42:21
2024-05-25 23:19:51 (UTC) - 1:33:11 - train - INFO - step: 000402 - done (%): 80.4 - loss: 1.377 - lr: 6.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5566.8 - avg_words_per_second: 4767.8 - ETA: >2024-05-25 23:42:18
2024-05-25 23:20:03 (UTC) - 1:33:23 - train - INFO - step: 000403 - done (%): 80.6 - loss: 1.233 - lr: 6.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5548.7 - avg_words_per_second: 4769.4 - ETA: >2024-05-25 23:42:16
2024-05-25 23:20:15 (UTC) - 1:33:35 - train - INFO - step: 000404 - done (%): 80.8 - loss: 1.095 - lr: 5.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4771.1 - ETA: >2024-05-25 23:42:14
2024-05-25 23:20:27 (UTC) - 1:33:46 - train - INFO - step: 000405 - done (%): 81.0 - loss: 1.492 - lr: 5.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.4 - avg_words_per_second: 4772.7 - ETA: >2024-05-25 23:42:11
2024-05-25 23:20:39 (UTC) - 1:33:58 - train - INFO - step: 000406 - done (%): 81.2 - loss: 1.291 - lr: 5.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5526.7 - avg_words_per_second: 4774.3 - ETA: >2024-05-25 23:42:09
2024-05-25 23:20:51 (UTC) - 1:34:10 - train - INFO - step: 000407 - done (%): 81.4 - loss: 1.561 - lr: 5.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.1 - avg_words_per_second: 4775.9 - ETA: >2024-05-25 23:42:07
2024-05-25 23:21:02 (UTC) - 1:34:22 - train - INFO - step: 000408 - done (%): 81.6 - loss: 1.230 - lr: 5.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.0 - avg_words_per_second: 4777.5 - ETA: >2024-05-25 23:42:04
2024-05-25 23:21:14 (UTC) - 1:34:34 - train - INFO - step: 000409 - done (%): 81.8 - loss: 1.731 - lr: 5.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.6 - avg_words_per_second: 4779.1 - ETA: >2024-05-25 23:42:02
2024-05-25 23:21:26 (UTC) - 1:34:46 - train - INFO - step: 000410 - done (%): 82.0 - loss: 1.481 - lr: 5.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.1 - avg_words_per_second: 4780.7 - ETA: >2024-05-25 23:42:00
2024-05-25 23:21:38 (UTC) - 1:34:57 - train - INFO - step: 000411 - done (%): 82.2 - loss: 1.634 - lr: 5.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.6 - avg_words_per_second: 4782.3 - ETA: >2024-05-25 23:41:58
2024-05-25 23:21:50 (UTC) - 1:35:09 - train - INFO - step: 000412 - done (%): 82.4 - loss: 1.352 - lr: 4.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.4 - avg_words_per_second: 4783.9 - ETA: >2024-05-25 23:41:55
2024-05-25 23:22:02 (UTC) - 1:35:21 - train - INFO - step: 000413 - done (%): 82.6 - loss: 1.451 - lr: 4.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.1 - avg_words_per_second: 4785.5 - ETA: >2024-05-25 23:41:53
2024-05-25 23:22:13 (UTC) - 1:35:33 - train - INFO - step: 000414 - done (%): 82.8 - loss: 1.380 - lr: 4.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.5 - avg_words_per_second: 4787.1 - ETA: >2024-05-25 23:41:51
2024-05-25 23:22:25 (UTC) - 1:35:45 - train - INFO - step: 000415 - done (%): 83.0 - loss: 1.538 - lr: 4.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.9 - avg_words_per_second: 4788.6 - ETA: >2024-05-25 23:41:49
2024-05-25 23:22:37 (UTC) - 1:35:57 - train - INFO - step: 000416 - done (%): 83.2 - loss: 1.893 - lr: 4.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.7 - avg_words_per_second: 4790.2 - ETA: >2024-05-25 23:41:46
2024-05-25 23:22:49 (UTC) - 1:36:08 - train - INFO - step: 000417 - done (%): 83.4 - loss: 1.469 - lr: 4.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.8 - avg_words_per_second: 4791.7 - ETA: >2024-05-25 23:41:44
2024-05-25 23:23:01 (UTC) - 1:36:20 - train - INFO - step: 000418 - done (%): 83.6 - loss: 1.383 - lr: 4.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.4 - avg_words_per_second: 4793.2 - ETA: >2024-05-25 23:41:42
2024-05-25 23:23:13 (UTC) - 1:36:32 - train - INFO - step: 000419 - done (%): 83.8 - loss: 1.206 - lr: 4.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5548.5 - avg_words_per_second: 4794.8 - ETA: >2024-05-25 23:41:40
2024-05-25 23:23:24 (UTC) - 1:36:44 - train - INFO - step: 000420 - done (%): 84.0 - loss: 1.500 - lr: 4.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.8 - avg_words_per_second: 4796.4 - ETA: >2024-05-25 23:41:38
2024-05-25 23:23:36 (UTC) - 1:36:56 - train - INFO - step: 000421 - done (%): 84.2 - loss: 1.674 - lr: 4.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5543.7 - avg_words_per_second: 4797.9 - ETA: >2024-05-25 23:41:35
2024-05-25 23:23:48 (UTC) - 1:37:08 - train - INFO - step: 000422 - done (%): 84.4 - loss: 1.102 - lr: 3.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.2 - avg_words_per_second: 4799.4 - ETA: >2024-05-25 23:41:33
2024-05-25 23:24:00 (UTC) - 1:37:19 - train - INFO - step: 000423 - done (%): 84.6 - loss: 1.357 - lr: 3.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.1 - avg_words_per_second: 4800.9 - ETA: >2024-05-25 23:41:31
2024-05-25 23:24:12 (UTC) - 1:37:31 - train - INFO - step: 000424 - done (%): 84.8 - loss: 1.255 - lr: 3.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.9 - avg_words_per_second: 4802.4 - ETA: >2024-05-25 23:41:29
2024-05-25 23:24:24 (UTC) - 1:37:43 - train - INFO - step: 000425 - done (%): 85.0 - loss: 1.207 - lr: 3.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.4 - avg_words_per_second: 4803.9 - ETA: >2024-05-25 23:41:27
2024-05-25 23:24:35 (UTC) - 1:37:55 - train - INFO - step: 000426 - done (%): 85.2 - loss: 1.114 - lr: 3.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.2 - avg_words_per_second: 4805.4 - ETA: >2024-05-25 23:41:25
2024-05-25 23:24:47 (UTC) - 1:38:07 - train - INFO - step: 000427 - done (%): 85.4 - loss: 1.556 - lr: 3.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.6 - avg_words_per_second: 4806.9 - ETA: >2024-05-25 23:41:22
2024-05-25 23:24:59 (UTC) - 1:38:19 - train - INFO - step: 000428 - done (%): 85.6 - loss: 1.464 - lr: 3.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.1 - avg_words_per_second: 4808.4 - ETA: >2024-05-25 23:41:20
2024-05-25 23:25:11 (UTC) - 1:38:30 - train - INFO - step: 000429 - done (%): 85.8 - loss: 1.224 - lr: 3.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4809.9 - ETA: >2024-05-25 23:41:18
2024-05-25 23:25:23 (UTC) - 1:38:42 - train - INFO - step: 000430 - done (%): 86.0 - loss: 1.508 - lr: 3.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5552.7 - avg_words_per_second: 4811.4 - ETA: >2024-05-25 23:41:16
2024-05-25 23:25:35 (UTC) - 1:38:54 - train - INFO - step: 000431 - done (%): 86.2 - loss: 1.482 - lr: 3.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.8 - avg_words_per_second: 4812.8 - ETA: >2024-05-25 23:41:14
2024-05-25 23:25:46 (UTC) - 1:39:06 - train - INFO - step: 000432 - done (%): 86.4 - loss: 1.239 - lr: 3.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.4 - avg_words_per_second: 4814.3 - ETA: >2024-05-25 23:41:12
2024-05-25 23:25:58 (UTC) - 1:39:18 - train - INFO - step: 000433 - done (%): 86.6 - loss: 1.235 - lr: 2.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.5 - avg_words_per_second: 4815.7 - ETA: >2024-05-25 23:41:10
2024-05-25 23:26:10 (UTC) - 1:39:30 - train - INFO - step: 000434 - done (%): 86.8 - loss: 1.342 - lr: 2.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5521.2 - avg_words_per_second: 4817.2 - ETA: >2024-05-25 23:41:08
2024-05-25 23:26:22 (UTC) - 1:39:41 - train - INFO - step: 000435 - done (%): 87.0 - loss: 1.733 - lr: 2.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.9 - avg_words_per_second: 4818.6 - ETA: >2024-05-25 23:41:06
2024-05-25 23:26:34 (UTC) - 1:39:53 - train - INFO - step: 000436 - done (%): 87.2 - loss: 1.642 - lr: 2.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5547.3 - avg_words_per_second: 4820.0 - ETA: >2024-05-25 23:41:04
2024-05-25 23:26:46 (UTC) - 1:40:05 - train - INFO - step: 000437 - done (%): 87.4 - loss: 1.226 - lr: 2.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5546.9 - avg_words_per_second: 4821.5 - ETA: >2024-05-25 23:41:02
2024-05-25 23:26:57 (UTC) - 1:40:17 - train - INFO - step: 000438 - done (%): 87.6 - loss: 1.192 - lr: 2.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5545.1 - avg_words_per_second: 4822.9 - ETA: >2024-05-25 23:41:00
2024-05-25 23:27:09 (UTC) - 1:40:29 - train - INFO - step: 000439 - done (%): 87.8 - loss: 1.534 - lr: 2.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.1 - avg_words_per_second: 4824.4 - ETA: >2024-05-25 23:40:58
2024-05-25 23:27:21 (UTC) - 1:40:41 - train - INFO - step: 000440 - done (%): 88.0 - loss: 1.528 - lr: 2.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.1 - avg_words_per_second: 4825.8 - ETA: >2024-05-25 23:40:56
2024-05-25 23:27:33 (UTC) - 1:40:52 - train - INFO - step: 000441 - done (%): 88.2 - loss: 1.444 - lr: 2.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.9 - avg_words_per_second: 4827.2 - ETA: >2024-05-25 23:40:54
2024-05-25 23:27:45 (UTC) - 1:41:04 - train - INFO - step: 000442 - done (%): 88.4 - loss: 1.447 - lr: 2.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.3 - avg_words_per_second: 4828.6 - ETA: >2024-05-25 23:40:52
2024-05-25 23:27:57 (UTC) - 1:41:16 - train - INFO - step: 000443 - done (%): 88.6 - loss: 1.414 - lr: 2.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.2 - avg_words_per_second: 4830.0 - ETA: >2024-05-25 23:40:50
2024-05-25 23:28:08 (UTC) - 1:41:28 - train - INFO - step: 000444 - done (%): 88.8 - loss: 1.568 - lr: 2.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4831.4 - ETA: >2024-05-25 23:40:48
2024-05-25 23:28:20 (UTC) - 1:41:40 - train - INFO - step: 000445 - done (%): 89.0 - loss: 1.299 - lr: 2.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.4 - avg_words_per_second: 4832.7 - ETA: >2024-05-25 23:40:46
2024-05-25 23:28:32 (UTC) - 1:41:52 - train - INFO - step: 000446 - done (%): 89.2 - loss: 1.262 - lr: 1.9e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.0 - avg_words_per_second: 4834.1 - ETA: >2024-05-25 23:40:44
2024-05-25 23:28:44 (UTC) - 1:42:03 - train - INFO - step: 000447 - done (%): 89.4 - loss: 1.323 - lr: 1.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5544.4 - avg_words_per_second: 4835.5 - ETA: >2024-05-25 23:40:42
2024-05-25 23:28:56 (UTC) - 1:42:15 - train - INFO - step: 000448 - done (%): 89.6 - loss: 1.818 - lr: 1.8e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.6 - avg_words_per_second: 4836.9 - ETA: >2024-05-25 23:40:40
2024-05-25 23:29:08 (UTC) - 1:42:27 - train - INFO - step: 000449 - done (%): 89.8 - loss: 1.538 - lr: 1.7e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.0 - avg_words_per_second: 4838.2 - ETA: >2024-05-25 23:40:38
2024-05-25 23:29:19 (UTC) - 1:42:39 - train - INFO - step: 000450 - done (%): 90.0 - loss: 1.573 - lr: 1.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.6 - avg_words_per_second: 4839.6 - ETA: >2024-05-25 23:40:37
2024-05-25 23:29:31 (UTC) - 1:42:51 - train - INFO - step: 000451 - done (%): 90.2 - loss: 1.229 - lr: 1.6e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.3 - avg_words_per_second: 4840.9 - ETA: >2024-05-25 23:40:35
2024-05-25 23:29:43 (UTC) - 1:43:03 - train - INFO - step: 000452 - done (%): 90.4 - loss: 1.313 - lr: 1.5e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.9 - avg_words_per_second: 4842.3 - ETA: >2024-05-25 23:40:33
2024-05-25 23:29:55 (UTC) - 1:43:14 - train - INFO - step: 000453 - done (%): 90.6 - loss: 1.196 - lr: 1.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.5 - avg_words_per_second: 4843.6 - ETA: >2024-05-25 23:40:31
2024-05-25 23:30:07 (UTC) - 1:43:26 - train - INFO - step: 000454 - done (%): 90.8 - loss: 1.496 - lr: 1.4e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.1 - avg_words_per_second: 4844.9 - ETA: >2024-05-25 23:40:29
2024-05-25 23:30:19 (UTC) - 1:43:38 - train - INFO - step: 000455 - done (%): 91.0 - loss: 1.299 - lr: 1.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.5 - avg_words_per_second: 4846.3 - ETA: >2024-05-25 23:40:27
2024-05-25 23:30:30 (UTC) - 1:43:50 - train - INFO - step: 000456 - done (%): 91.2 - loss: 1.780 - lr: 1.3e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.6 - avg_words_per_second: 4847.6 - ETA: >2024-05-25 23:40:25
2024-05-25 23:30:42 (UTC) - 1:44:02 - train - INFO - step: 000457 - done (%): 91.4 - loss: 1.160 - lr: 1.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.6 - avg_words_per_second: 4848.9 - ETA: >2024-05-25 23:40:24
2024-05-25 23:30:54 (UTC) - 1:44:14 - train - INFO - step: 000458 - done (%): 91.6 - loss: 1.724 - lr: 1.2e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.5 - avg_words_per_second: 4850.2 - ETA: >2024-05-25 23:40:22
2024-05-25 23:31:06 (UTC) - 1:44:26 - train - INFO - step: 000459 - done (%): 91.8 - loss: 1.454 - lr: 1.1e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5538.1 - avg_words_per_second: 4851.5 - ETA: >2024-05-25 23:40:20
2024-05-25 23:31:18 (UTC) - 1:44:37 - train - INFO - step: 000460 - done (%): 92.0 - loss: 1.251 - lr: 1.0e-06 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.4 - avg_words_per_second: 4852.8 - ETA: >2024-05-25 23:40:18
2024-05-25 23:31:30 (UTC) - 1:44:49 - train - INFO - step: 000461 - done (%): 92.2 - loss: 1.350 - lr: 9.9e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4854.1 - ETA: >2024-05-25 23:40:16
2024-05-25 23:31:42 (UTC) - 1:45:01 - train - INFO - step: 000462 - done (%): 92.4 - loss: 1.429 - lr: 9.4e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.5 - avg_words_per_second: 4855.4 - ETA: >2024-05-25 23:40:14
2024-05-25 23:31:53 (UTC) - 1:45:13 - train - INFO - step: 000463 - done (%): 92.6 - loss: 1.163 - lr: 8.9e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.7 - avg_words_per_second: 4856.7 - ETA: >2024-05-25 23:40:13
2024-05-25 23:32:05 (UTC) - 1:45:25 - train - INFO - step: 000464 - done (%): 92.8 - loss: 1.526 - lr: 8.5e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.9 - avg_words_per_second: 4858.0 - ETA: >2024-05-25 23:40:11
2024-05-25 23:32:17 (UTC) - 1:45:37 - train - INFO - step: 000465 - done (%): 93.0 - loss: 1.675 - lr: 8.0e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.4 - avg_words_per_second: 4859.3 - ETA: >2024-05-25 23:40:09
2024-05-25 23:32:29 (UTC) - 1:45:48 - train - INFO - step: 000466 - done (%): 93.2 - loss: 1.866 - lr: 7.6e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.4 - avg_words_per_second: 4860.6 - ETA: >2024-05-25 23:40:07
2024-05-25 23:32:41 (UTC) - 1:46:00 - train - INFO - step: 000467 - done (%): 93.4 - loss: 1.326 - lr: 7.1e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5541.7 - avg_words_per_second: 4861.9 - ETA: >2024-05-25 23:40:05
2024-05-25 23:32:53 (UTC) - 1:46:12 - train - INFO - step: 000468 - done (%): 93.6 - loss: 1.136 - lr: 6.7e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.1 - avg_words_per_second: 4863.1 - ETA: >2024-05-25 23:40:04
2024-05-25 23:33:04 (UTC) - 1:46:24 - train - INFO - step: 000469 - done (%): 93.8 - loss: 1.483 - lr: 6.3e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.0 - avg_words_per_second: 4864.4 - ETA: >2024-05-25 23:40:02
2024-05-25 23:33:28 (UTC) - 1:46:47 - train - INFO - step: 000470 - done (%): 94.0 - loss: 1.345 - lr: 5.9e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 2818.4 - avg_words_per_second: 4856.9 - ETA: >2024-05-25 23:40:12
2024-05-25 23:33:39 (UTC) - 1:46:59 - train - INFO - step: 000471 - done (%): 94.2 - loss: 1.299 - lr: 5.5e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5565.8 - avg_words_per_second: 4858.2 - ETA: >2024-05-25 23:40:11
2024-05-25 23:33:51 (UTC) - 1:47:11 - train - INFO - step: 000472 - done (%): 94.4 - loss: 1.494 - lr: 5.1e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5542.0 - avg_words_per_second: 4859.5 - ETA: >2024-05-25 23:40:09
2024-05-25 23:34:03 (UTC) - 1:47:23 - train - INFO - step: 000473 - done (%): 94.6 - loss: 1.573 - lr: 4.8e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.6 - avg_words_per_second: 4860.7 - ETA: >2024-05-25 23:40:07
2024-05-25 23:34:15 (UTC) - 1:47:34 - train - INFO - step: 000474 - done (%): 94.8 - loss: 1.278 - lr: 4.4e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.6 - avg_words_per_second: 4862.0 - ETA: >2024-05-25 23:40:05
2024-05-25 23:34:27 (UTC) - 1:47:46 - train - INFO - step: 000475 - done (%): 95.0 - loss: 1.504 - lr: 4.1e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5534.5 - avg_words_per_second: 4863.2 - ETA: >2024-05-25 23:40:04
2024-05-25 23:34:39 (UTC) - 1:47:58 - train - INFO - step: 000476 - done (%): 95.2 - loss: 1.486 - lr: 3.8e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.5 - avg_words_per_second: 4864.5 - ETA: >2024-05-25 23:40:02
2024-05-25 23:34:50 (UTC) - 1:48:10 - train - INFO - step: 000477 - done (%): 95.4 - loss: 1.377 - lr: 3.5e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.3 - avg_words_per_second: 4865.7 - ETA: >2024-05-25 23:40:00
2024-05-25 23:35:02 (UTC) - 1:48:22 - train - INFO - step: 000478 - done (%): 95.6 - loss: 1.727 - lr: 3.2e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.6 - avg_words_per_second: 4866.9 - ETA: >2024-05-25 23:39:59
2024-05-25 23:35:14 (UTC) - 1:48:34 - train - INFO - step: 000479 - done (%): 95.8 - loss: 1.266 - lr: 2.9e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5522.3 - avg_words_per_second: 4868.1 - ETA: >2024-05-25 23:39:57
2024-05-25 23:35:26 (UTC) - 1:48:46 - train - INFO - step: 000480 - done (%): 96.0 - loss: 1.332 - lr: 2.6e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5539.5 - avg_words_per_second: 4869.3 - ETA: >2024-05-25 23:39:55
2024-05-25 23:35:38 (UTC) - 1:48:57 - train - INFO - step: 000481 - done (%): 96.2 - loss: 1.463 - lr: 2.4e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.1 - avg_words_per_second: 4870.6 - ETA: >2024-05-25 23:39:54
2024-05-25 23:35:50 (UTC) - 1:49:09 - train - INFO - step: 000482 - done (%): 96.4 - loss: 1.158 - lr: 2.1e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.9 - avg_words_per_second: 4871.8 - ETA: >2024-05-25 23:39:52
2024-05-25 23:36:02 (UTC) - 1:49:21 - train - INFO - step: 000483 - done (%): 96.6 - loss: 1.384 - lr: 1.9e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.4 - avg_words_per_second: 4873.0 - ETA: >2024-05-25 23:39:50
2024-05-25 23:36:13 (UTC) - 1:49:33 - train - INFO - step: 000484 - done (%): 96.8 - loss: 1.282 - lr: 1.7e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.0 - avg_words_per_second: 4874.2 - ETA: >2024-05-25 23:39:49
2024-05-25 23:36:25 (UTC) - 1:49:45 - train - INFO - step: 000485 - done (%): 97.0 - loss: 1.392 - lr: 1.5e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.2 - avg_words_per_second: 4875.4 - ETA: >2024-05-25 23:39:47
2024-05-25 23:36:37 (UTC) - 1:49:57 - train - INFO - step: 000486 - done (%): 97.2 - loss: 1.775 - lr: 1.3e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5517.3 - avg_words_per_second: 4876.5 - ETA: >2024-05-25 23:39:45
2024-05-25 23:36:49 (UTC) - 1:50:08 - train - INFO - step: 000487 - done (%): 97.4 - loss: 1.305 - lr: 1.1e-07 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.9 - avg_words_per_second: 4877.7 - ETA: >2024-05-25 23:39:44
2024-05-25 23:37:01 (UTC) - 1:50:20 - train - INFO - step: 000488 - done (%): 97.6 - loss: 1.311 - lr: 9.5e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.2 - avg_words_per_second: 4878.9 - ETA: >2024-05-25 23:39:42
2024-05-25 23:37:13 (UTC) - 1:50:32 - train - INFO - step: 000489 - done (%): 97.8 - loss: 1.443 - lr: 8.0e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5533.1 - avg_words_per_second: 4880.1 - ETA: >2024-05-25 23:39:40
2024-05-25 23:37:24 (UTC) - 1:50:44 - train - INFO - step: 000490 - done (%): 98.0 - loss: 1.190 - lr: 6.6e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5536.8 - avg_words_per_second: 4881.3 - ETA: >2024-05-25 23:39:39
2024-05-25 23:37:36 (UTC) - 1:50:56 - train - INFO - step: 000491 - done (%): 98.2 - loss: 1.498 - lr: 5.3e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5540.3 - avg_words_per_second: 4882.5 - ETA: >2024-05-25 23:39:37
2024-05-25 23:37:48 (UTC) - 1:51:08 - train - INFO - step: 000492 - done (%): 98.4 - loss: 1.326 - lr: 4.2e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5531.8 - avg_words_per_second: 4883.6 - ETA: >2024-05-25 23:39:36
2024-05-25 23:38:00 (UTC) - 1:51:20 - train - INFO - step: 000493 - done (%): 98.6 - loss: 1.438 - lr: 3.2e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5535.2 - avg_words_per_second: 4884.8 - ETA: >2024-05-25 23:39:34
2024-05-25 23:38:12 (UTC) - 1:51:31 - train - INFO - step: 000494 - done (%): 98.8 - loss: 1.261 - lr: 2.4e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5537.0 - avg_words_per_second: 4886.0 - ETA: >2024-05-25 23:39:32
2024-05-25 23:38:24 (UTC) - 1:51:43 - train - INFO - step: 000495 - done (%): 99.0 - loss: 1.314 - lr: 1.7e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5527.3 - avg_words_per_second: 4887.1 - ETA: >2024-05-25 23:39:31
2024-05-25 23:38:36 (UTC) - 1:51:55 - train - INFO - step: 000496 - done (%): 99.2 - loss: 1.184 - lr: 1.1e-08 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5524.9 - avg_words_per_second: 4888.2 - ETA: >2024-05-25 23:39:29
2024-05-25 23:38:47 (UTC) - 1:52:07 - train - INFO - step: 000497 - done (%): 99.4 - loss: 1.244 - lr: 6.1e-09 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5529.0 - avg_words_per_second: 4889.4 - ETA: >2024-05-25 23:39:28
2024-05-25 23:38:59 (UTC) - 1:52:19 - train - INFO - step: 000498 - done (%): 99.6 - loss: 1.390 - lr: 2.9e-09 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5532.7 - avg_words_per_second: 4890.5 - ETA: >2024-05-25 23:39:26
2024-05-25 23:39:11 (UTC) - 1:52:31 - train - INFO - step: 000499 - done (%): 99.8 - loss: 1.459 - lr: 9.0e-10 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5530.5 - avg_words_per_second: 4891.7 - ETA: >2024-05-25 23:39:25
2024-05-25 23:39:23 (UTC) - 1:52:42 - train - INFO - step: 000500 - done (%): 100.0 - loss: 1.562 - lr: 2.4e-10 - peak_alloc_mem (GB): 64.9 - alloc_mem (GB): 24.1 - words_per_second: 5528.8 - avg_words_per_second: 4892.8 - ETA: >2024-05-25 23:39:23
2024-05-25 23:39:23 (UTC) - 1:52:42 - checkpointing - INFO - Dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000500/consolidated using tmp name: tmp.consolidated
2024-05-25 23:39:37 (UTC) - 1:52:57 - checkpointing - INFO - Done dumping checkpoint in /root/mistral-finetune/runseed5/checkpoints/checkpoint_000500/consolidated for step: 500
2024-05-25 23:39:38 (UTC) - 1:52:58 - checkpointing - INFO - Deleted ckpt: /root/mistral-finetune/runseed5/checkpoints/checkpoint_000200
2024-05-25 23:39:38 (UTC) - 1:52:58 - checkpointing - INFO - Done deleting checkpoints /root/mistral-finetune/runseed5/checkpoints/checkpoint_000200
2024-05-25 23:39:38 (UTC) - 1:52:58 - checkpointing - INFO - Done!
2024-05-25 23:39:38 (UTC) - 1:52:58 - train - INFO - done!
2024-05-25 23:39:38 (UTC) - 1:52:58 - utils - INFO - Closing: eval_logger