Training 1/2 epoch (loss 0.6953): 0%| | 0/840 [00:05<?, ?it/s]
Training 1/2 epoch (loss 0.6953): 0%| | 1/840 [00:05<1:13:24, 5.25s/it]
Training 1/2 epoch (loss 0.6914): 0%| | 1/840 [00:08<1:13:24, 5.25s/it]
Training 1/2 epoch (loss 0.6914): 0%| | 2/840 [00:08<1:00:36, 4.34s/it]
Training 1/2 epoch (loss 0.6953): 0%| | 2/840 [00:11<1:00:36, 4.34s/it]
Training 1/2 epoch (loss 0.6953): 0%| | 3/840 [00:11<51:24, 3.69s/it]
Training 1/2 epoch (loss 0.6914): 0%| | 3/840 [00:14<51:24, 3.69s/it]
Training 1/2 epoch (loss 0.6914): 0%| | 4/840 [00:14<47:14, 3.39s/it]
Training 1/2 epoch (loss 0.6953): 0%| | 4/840 [00:19<47:14, 3.39s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 5/840 [00:19<52:19, 3.76s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 5/840 [00:22<52:19, 3.76s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 6/840 [00:22<50:49, 3.66s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 6/840 [00:26<50:49, 3.66s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 7/840 [00:26<51:38, 3.72s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 7/840 [00:32<51:38, 3.72s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 8/840 [00:32<59:29, 4.29s/it]
Training 1/2 epoch (loss 0.6914): 1%| | 8/840 [00:35<59:29, 4.29s/it]
Training 1/2 epoch (loss 0.6914): 1%| | 9/840 [00:35<57:58, 4.19s/it]
Training 1/2 epoch (loss 0.6914): 1%| | 9/840 [00:39<57:58, 4.19s/it]
Training 1/2 epoch (loss 0.6914): 1%| | 10/840 [00:39<53:10, 3.84s/it]
Training 1/2 epoch (loss 0.6953): 1%| | 10/840 [00:41<53:10, 3.84s/it]
Training 1/2 epoch (loss 0.6953): 1%|β | 11/840 [00:41<47:33, 3.44s/it]
Training 1/2 epoch (loss 0.6953): 1%|β | 11/840 [00:45<47:33, 3.44s/it]
Training 1/2 epoch (loss 0.6953): 1%|β | 12/840 [00:45<50:24, 3.65s/it]
Training 1/2 epoch (loss 0.6953): 1%|β | 12/840 [00:48<50:24, 3.65s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 13/840 [00:48<45:59, 3.34s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 13/840 [00:51<45:59, 3.34s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 14/840 [00:51<46:46, 3.40s/it]
Training 1/2 epoch (loss 0.6914): 2%|β | 14/840 [00:54<46:46, 3.40s/it]
Training 1/2 epoch (loss 0.6914): 2%|β | 15/840 [00:54<44:37, 3.25s/it]
Training 1/2 epoch (loss 0.6875): 2%|β | 15/840 [00:58<44:37, 3.25s/it]
Training 1/2 epoch (loss 0.6875): 2%|β | 16/840 [00:58<45:35, 3.32s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 16/840 [01:01<45:35, 3.32s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 17/840 [01:01<43:42, 3.19s/it]
Training 1/2 epoch (loss 0.6914): 2%|β | 17/840 [01:04<43:42, 3.19s/it]
Training 1/2 epoch (loss 0.6914): 2%|β | 18/840 [01:04<44:04, 3.22s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 18/840 [01:08<44:04, 3.22s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 19/840 [01:08<48:10, 3.52s/it]
Training 1/2 epoch (loss 0.6875): 2%|β | 19/840 [01:12<48:10, 3.52s/it]
Training 1/2 epoch (loss 0.6875): 2%|β | 20/840 [01:12<49:31, 3.62s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 20/840 [01:16<49:31, 3.62s/it]
Training 1/2 epoch (loss 0.6953): 2%|β | 21/840 [01:16<49:34, 3.63s/it]
Training 1/2 epoch (loss 0.6914): 2%|β | 21/840 [01:18<49:34, 3.63s/it]
Training 1/2 epoch (loss 0.6914): 3%|β | 22/840 [01:18<45:56, 3.37s/it]
Training 1/2 epoch (loss 0.6953): 3%|β | 22/840 [01:21<45:56, 3.37s/it]
Training 1/2 epoch (loss 0.6953): 3%|β | 23/840 [01:21<43:13, 3.17s/it]
Training 1/2 epoch (loss 0.6875): 3%|β | 23/840 [01:25<43:13, 3.17s/it]
Training 1/2 epoch (loss 0.6875): 3%|β | 24/840 [01:25<45:43, 3.36s/it]
Training 1/2 epoch (loss 0.6797): 3%|β | 24/840 [01:30<45:43, 3.36s/it]
Training 1/2 epoch (loss 0.6797): 3%|β | 25/840 [01:30<51:21, 3.78s/it]
Training 1/2 epoch (loss 0.6836): 3%|β | 25/840 [01:35<51:21, 3.78s/it]
Training 1/2 epoch (loss 0.6836): 3%|β | 26/840 [01:35<56:20, 4.15s/it]
Training 1/2 epoch (loss 0.6641): 3%|β | 26/840 [01:38<56:20, 4.15s/it]
Training 1/2 epoch (loss 0.6641): 3%|β | 27/840 [01:38<53:18, 3.93s/it]
Training 1/2 epoch (loss 0.6562): 3%|β | 27/840 [01:42<53:18, 3.93s/it]
Training 1/2 epoch (loss 0.6562): 3%|β | 28/840 [01:42<50:55, 3.76s/it]
Training 1/2 epoch (loss 0.6602): 3%|β | 28/840 [01:44<50:55, 3.76s/it]
Training 1/2 epoch (loss 0.6602): 3%|β | 29/840 [01:44<45:44, 3.38s/it]
Training 1/2 epoch (loss 0.6562): 3%|β | 29/840 [01:49<45:44, 3.38s/it]
Training 1/2 epoch (loss 0.6562): 4%|β | 30/840 [01:49<52:16, 3.87s/it]
Training 1/2 epoch (loss 0.7031): 4%|β | 30/840 [01:54<52:16, 3.87s/it]
Training 1/2 epoch (loss 0.7031): 4%|β | 31/840 [01:54<55:12, 4.09s/it]
Training 1/2 epoch (loss 0.6602): 4%|β | 31/840 [01:57<55:12, 4.09s/it]
Training 1/2 epoch (loss 0.6602): 4%|β | 32/840 [01:57<52:52, 3.93s/it]
Training 1/2 epoch (loss 0.5938): 4%|β | 32/840 [02:03<52:52, 3.93s/it]
Training 1/2 epoch (loss 0.5938): 4%|β | 33/840 [02:03<58:43, 4.37s/it]
Training 1/2 epoch (loss 0.6289): 4%|β | 33/840 [02:06<58:43, 4.37s/it]
Training 1/2 epoch (loss 0.6289): 4%|β | 34/840 [02:06<55:01, 4.10s/it]
Training 1/2 epoch (loss 0.6055): 4%|β | 34/840 [02:11<55:01, 4.10s/it]
Training 1/2 epoch (loss 0.6055): 4%|β | 35/840 [02:11<56:57, 4.25s/it]
Training 1/2 epoch (loss 0.5391): 4%|β | 35/840 [02:14<56:57, 4.25s/it]
Training 1/2 epoch (loss 0.5391): 4%|β | 36/840 [02:14<54:14, 4.05s/it]
Training 1/2 epoch (loss 0.6562): 4%|β | 36/840 [02:17<54:14, 4.05s/it]
Training 1/2 epoch (loss 0.6562): 4%|β | 37/840 [02:17<50:34, 3.78s/it]
Training 1/2 epoch (loss 0.5430): 4%|β | 37/840 [02:21<50:34, 3.78s/it]
Training 1/2 epoch (loss 0.5430): 5%|β | 38/840 [02:21<48:03, 3.60s/it]
Training 1/2 epoch (loss 0.6016): 5%|β | 38/840 [02:25<48:03, 3.60s/it]
Training 1/2 epoch (loss 0.6016): 5%|β | 39/840 [02:25<52:33, 3.94s/it]
Training 1/2 epoch (loss 0.6094): 5%|β | 39/840 [02:30<52:33, 3.94s/it]
Training 1/2 epoch (loss 0.6094): 5%|β | 40/840 [02:30<57:21, 4.30s/it]
Training 1/2 epoch (loss 0.6172): 5%|β | 40/840 [02:35<57:21, 4.30s/it]
Training 1/2 epoch (loss 0.6172): 5%|β | 41/840 [02:35<57:42, 4.33s/it]
Training 1/2 epoch (loss 0.5938): 5%|β | 41/840 [02:38<57:42, 4.33s/it]
Training 1/2 epoch (loss 0.5938): 5%|β | 42/840 [02:38<54:38, 4.11s/it]
Training 1/2 epoch (loss 0.6172): 5%|β | 42/840 [02:43<54:38, 4.11s/it]
Training 1/2 epoch (loss 0.6172): 5%|β | 43/840 [02:43<55:57, 4.21s/it]
Training 1/2 epoch (loss 0.6797): 5%|β | 43/840 [02:47<55:57, 4.21s/it]
Training 1/2 epoch (loss 0.6797): 5%|β | 44/840 [02:47<55:39, 4.20s/it]
Training 1/2 epoch (loss 0.5820): 5%|β | 44/840 [02:50<55:39, 4.20s/it]
Training 1/2 epoch (loss 0.5820): 5%|β | 45/840 [02:50<49:10, 3.71s/it]
Training 1/2 epoch (loss 0.5469): 5%|β | 45/840 [02:55<49:10, 3.71s/it]
Training 1/2 epoch (loss 0.5469): 5%|β | 46/840 [02:55<56:44, 4.29s/it]
Training 1/2 epoch (loss 0.5859): 5%|β | 46/840 [02:59<56:44, 4.29s/it]
Training 1/2 epoch (loss 0.5859): 6%|β | 47/840 [02:59<54:20, 4.11s/it]
Training 1/2 epoch (loss 0.5859): 6%|β | 47/840 [03:02<54:20, 4.11s/it]
Training 1/2 epoch (loss 0.5859): 6%|β | 48/840 [03:02<50:15, 3.81s/it]
Training 1/2 epoch (loss 0.7266): 6%|β | 48/840 [03:05<50:15, 3.81s/it]
Training 1/2 epoch (loss 0.7266): 6%|β | 49/840 [03:05<45:19, 3.44s/it]
Training 1/2 epoch (loss 0.6914): 6%|β | 49/840 [03:07<45:19, 3.44s/it]
Training 1/2 epoch (loss 0.6914): 6%|β | 50/840 [03:07<42:28, 3.23s/it]
Training 1/2 epoch (loss 0.6016): 6%|β | 50/840 [03:11<42:28, 3.23s/it]
Training 1/2 epoch (loss 0.6016): 6%|β | 51/840 [03:11<46:00, 3.50s/it]
Training 1/2 epoch (loss 0.6250): 6%|β | 51/840 [03:15<46:00, 3.50s/it]
Training 1/2 epoch (loss 0.6250): 6%|β | 52/840 [03:15<44:52, 3.42s/it]
Training 1/2 epoch (loss 0.6328): 6%|β | 52/840 [03:18<44:52, 3.42s/it]
Training 1/2 epoch (loss 0.6328): 6%|β | 53/840 [03:18<42:53, 3.27s/it]
Training 1/2 epoch (loss 0.5938): 6%|β | 53/840 [03:21<42:53, 3.27s/it]
Training 1/2 epoch (loss 0.5938): 6%|β | 54/840 [03:21<42:21, 3.23s/it]
Training 1/2 epoch (loss 0.6289): 6%|β | 54/840 [03:24<42:21, 3.23s/it]
Training 1/2 epoch (loss 0.6289): 7%|β | 55/840 [03:24<43:03, 3.29s/it]
Training 1/2 epoch (loss 0.5859): 7%|β | 55/840 [03:27<43:03, 3.29s/it]
Training 1/2 epoch (loss 0.5859): 7%|β | 56/840 [03:27<42:25, 3.25s/it]
Training 1/2 epoch (loss 0.6719): 7%|β | 56/840 [03:30<42:25, 3.25s/it]
Training 1/2 epoch (loss 0.6719): 7%|β | 57/840 [03:30<41:27, 3.18s/it]
Training 1/2 epoch (loss 0.5859): 7%|β | 57/840 [03:34<41:27, 3.18s/it]
Training 1/2 epoch (loss 0.5859): 7%|β | 58/840 [03:34<42:55, 3.29s/it]
Training 1/2 epoch (loss 0.6406): 7%|β | 58/840 [03:39<42:55, 3.29s/it]
Training 1/2 epoch (loss 0.6406): 7%|β | 59/840 [03:39<48:06, 3.70s/it]
Training 1/2 epoch (loss 0.5312): 7%|β | 59/840 [03:44<48:06, 3.70s/it]
Training 1/2 epoch (loss 0.5312): 7%|β | 60/840 [03:44<54:55, 4.22s/it]
Training 1/2 epoch (loss 0.5547): 7%|β | 60/840 [03:47<54:55, 4.22s/it]
Training 1/2 epoch (loss 0.5547): 7%|β | 61/840 [03:47<50:47, 3.91s/it]
Training 1/2 epoch (loss 0.6914): 7%|β | 61/840 [03:51<50:47, 3.91s/it]
Training 1/2 epoch (loss 0.6914): 7%|β | 62/840 [03:51<50:41, 3.91s/it]
Training 1/2 epoch (loss 0.6484): 7%|β | 62/840 [03:57<50:41, 3.91s/it]
Training 1/2 epoch (loss 0.6484): 8%|β | 63/840 [03:57<56:32, 4.37s/it]
Training 1/2 epoch (loss 0.7578): 8%|β | 63/840 [04:01<56:32, 4.37s/it]
Training 1/2 epoch (loss 0.7578): 8%|β | 64/840 [04:01<57:20, 4.43s/it]
Training 1/2 epoch (loss 0.5820): 8%|β | 64/840 [04:04<57:20, 4.43s/it]
Training 1/2 epoch (loss 0.5820): 8%|β | 65/840 [04:04<51:12, 3.97s/it]
Training 1/2 epoch (loss 0.5977): 8%|β | 65/840 [04:07<51:12, 3.97s/it]
Training 1/2 epoch (loss 0.5977): 8%|β | 66/840 [04:07<46:42, 3.62s/it]
Training 1/2 epoch (loss 0.6758): 8%|β | 66/840 [04:11<46:42, 3.62s/it]
Training 1/2 epoch (loss 0.6758): 8%|β | 67/840 [04:11<48:13, 3.74s/it]
Training 1/2 epoch (loss 0.5938): 8%|β | 67/840 [04:15<48:13, 3.74s/it]
Training 1/2 epoch (loss 0.5938): 8%|β | 68/840 [04:15<50:21, 3.91s/it]
Training 1/2 epoch (loss 0.6719): 8%|β | 68/840 [04:18<50:21, 3.91s/it]
Training 1/2 epoch (loss 0.6719): 8%|β | 69/840 [04:18<45:26, 3.54s/it]
Training 1/2 epoch (loss 0.5508): 8%|β | 69/840 [04:22<45:26, 3.54s/it]
Training 1/2 epoch (loss 0.5508): 8%|β | 70/840 [04:22<47:49, 3.73s/it]
Training 1/2 epoch (loss 0.5820): 8%|β | 70/840 [04:25<47:49, 3.73s/it]
Training 1/2 epoch (loss 0.5820): 8%|β | 71/840 [04:25<46:18, 3.61s/it]
Training 1/2 epoch (loss 0.6055): 8%|β | 71/840 [04:31<46:18, 3.61s/it]
Training 1/2 epoch (loss 0.6055): 9%|β | 72/840 [04:31<53:57, 4.21s/it]
Training 1/2 epoch (loss 0.5938): 9%|β | 72/840 [04:36<53:57, 4.21s/it]
Training 1/2 epoch (loss 0.5938): 9%|β | 73/840 [04:36<58:31, 4.58s/it]
Training 1/2 epoch (loss 0.6133): 9%|β | 73/840 [04:41<58:31, 4.58s/it]
Training 1/2 epoch (loss 0.6133): 9%|β | 74/840 [04:41<58:52, 4.61s/it]
Training 1/2 epoch (loss 0.5938): 9%|β | 74/840 [04:47<58:52, 4.61s/it]
Training 1/2 epoch (loss 0.5938): 9%|β | 75/840 [04:47<1:02:10, 4.88s/it]
Training 1/2 epoch (loss 0.6094): 9%|β | 75/840 [04:49<1:02:10, 4.88s/it]
Training 1/2 epoch (loss 0.6094): 9%|β | 76/840 [04:49<53:39, 4.21s/it]
Training 1/2 epoch (loss 0.6172): 9%|β | 76/840 [04:52<53:39, 4.21s/it]
Training 1/2 epoch (loss 0.6172): 9%|β | 77/840 [04:52<49:35, 3.90s/it]
Training 1/2 epoch (loss 0.6094): 9%|β | 77/840 [04:55<49:35, 3.90s/it]
Training 1/2 epoch (loss 0.6094): 9%|β | 78/840 [04:55<45:17, 3.57s/it]
Training 1/2 epoch (loss 0.5664): 9%|β | 78/840 [05:00<45:17, 3.57s/it]
Training 1/2 epoch (loss 0.5664): 9%|β | 79/840 [05:00<48:35, 3.83s/it]
Training 1/2 epoch (loss 0.6523): 9%|β | 79/840 [05:03<48:35, 3.83s/it]
Training 1/2 epoch (loss 0.6523): 10%|β | 80/840 [05:03<45:59, 3.63s/it]
Training 1/2 epoch (loss 0.5625): 10%|β | 80/840 [05:06<45:59, 3.63s/it]
Training 1/2 epoch (loss 0.5625): 10%|β | 81/840 [05:06<44:57, 3.55s/it]
Training 1/2 epoch (loss 0.6328): 10%|β | 81/840 [05:09<44:57, 3.55s/it]
Training 1/2 epoch (loss 0.6328): 10%|β | 82/840 [05:09<44:02, 3.49s/it]
Training 1/2 epoch (loss 0.5781): 10%|β | 82/840 [05:14<44:02, 3.49s/it]
Training 1/2 epoch (loss 0.5781): 10%|β | 83/840 [05:14<47:30, 3.77s/it]
Training 1/2 epoch (loss 0.5625): 10%|β | 83/840 [05:18<47:30, 3.77s/it]
Training 1/2 epoch (loss 0.5625): 10%|β | 84/840 [05:18<47:50, 3.80s/it]
Training 1/2 epoch (loss 0.6211): 10%|β | 84/840 [05:21<47:50, 3.80s/it]
Training 1/2 epoch (loss 0.6211): 10%|β | 85/840 [05:21<44:31, 3.54s/it]
Training 1/2 epoch (loss 0.6953): 10%|β | 85/840 [05:24<44:31, 3.54s/it]
Training 1/2 epoch (loss 0.6953): 10%|β | 86/840 [05:24<41:40, 3.32s/it]
Training 1/2 epoch (loss 0.5977): 10%|β | 86/840 [05:29<41:40, 3.32s/it]
Training 1/2 epoch (loss 0.5977): 10%|β | 87/840 [05:29<49:26, 3.94s/it]
Training 1/2 epoch (loss 0.6562): 10%|β | 87/840 [05:33<49:26, 3.94s/it]
Training 1/2 epoch (loss 0.6562): 10%|β | 88/840 [05:33<50:02, 3.99s/it]
Training 1/2 epoch (loss 0.6250): 10%|β | 88/840 [05:37<50:02, 3.99s/it]
Training 1/2 epoch (loss 0.6250): 11%|β | 89/840 [05:37<48:02, 3.84s/it]
Training 1/2 epoch (loss 0.5547): 11%|β | 89/840 [05:40<48:02, 3.84s/it]
Training 1/2 epoch (loss 0.5547): 11%|β | 90/840 [05:40<46:38, 3.73s/it]
Training 1/2 epoch (loss 0.6836): 11%|β | 90/840 [05:43<46:38, 3.73s/it]
Training 1/2 epoch (loss 0.6836): 11%|β | 91/840 [05:43<42:55, 3.44s/it]
Training 1/2 epoch (loss 0.5664): 11%|β | 91/840 [05:46<42:55, 3.44s/it]
Training 1/2 epoch (loss 0.5664): 11%|β | 92/840 [05:46<41:53, 3.36s/it]
Training 1/2 epoch (loss 0.6250): 11%|β | 92/840 [05:49<41:53, 3.36s/it]
Training 1/2 epoch (loss 0.6250): 11%|β | 93/840 [05:49<39:38, 3.18s/it]
Training 1/2 epoch (loss 0.5781): 11%|β | 93/840 [05:52<39:38, 3.18s/it]
Training 1/2 epoch (loss 0.5781): 11%|β | 94/840 [05:52<39:49, 3.20s/it]
Training 1/2 epoch (loss 0.5977): 11%|β | 94/840 [05:56<39:49, 3.20s/it]
Training 1/2 epoch (loss 0.5977): 11%|ββ | 95/840 [05:56<42:01, 3.38s/it]
Training 1/2 epoch (loss 0.5664): 11%|ββ | 95/840 [06:01<42:01, 3.38s/it]
Training 1/2 epoch (loss 0.5664): 11%|ββ | 96/840 [06:01<50:17, 4.06s/it]
Training 1/2 epoch (loss 0.5820): 11%|ββ | 96/840 [06:05<50:17, 4.06s/it]
Training 1/2 epoch (loss 0.5820): 12%|ββ | 97/840 [06:05<50:09, 4.05s/it]
Training 1/2 epoch (loss 0.5586): 12%|ββ | 97/840 [06:10<50:09, 4.05s/it]
Training 1/2 epoch (loss 0.5586): 12%|ββ | 98/840 [06:10<50:33, 4.09s/it]
Training 1/2 epoch (loss 0.5547): 12%|ββ | 98/840 [06:12<50:33, 4.09s/it]
Training 1/2 epoch (loss 0.5547): 12%|ββ | 99/840 [06:12<46:02, 3.73s/it]
Training 1/2 epoch (loss 0.5352): 12%|ββ | 99/840 [06:16<46:02, 3.73s/it]
Training 1/2 epoch (loss 0.5352): 12%|ββ | 100/840 [06:16<43:30, 3.53s/it]
Training 1/2 epoch (loss 0.5312): 12%|ββ | 100/840 [06:21<43:30, 3.53s/it]
Training 1/2 epoch (loss 0.5312): 12%|ββ | 101/840 [06:21<50:42, 4.12s/it]
Training 1/2 epoch (loss 0.6172): 12%|ββ | 101/840 [06:26<50:42, 4.12s/it]
Training 1/2 epoch (loss 0.6172): 12%|ββ | 102/840 [06:26<52:12, 4.24s/it]
Training 1/2 epoch (loss 0.6094): 12%|ββ | 102/840 [06:29<52:12, 4.24s/it]
Training 1/2 epoch (loss 0.6094): 12%|ββ | 103/840 [06:29<47:50, 3.90s/it]
Training 1/2 epoch (loss 0.6445): 12%|ββ | 103/840 [06:32<47:50, 3.90s/it]
Training 1/2 epoch (loss 0.6445): 12%|ββ | 104/840 [06:32<46:53, 3.82s/it]
Training 1/2 epoch (loss 0.5781): 12%|ββ | 104/840 [06:36<46:53, 3.82s/it]
Training 1/2 epoch (loss 0.5781): 12%|ββ | 105/840 [06:36<47:00, 3.84s/it]
Training 1/2 epoch (loss 0.6289): 12%|ββ | 105/840 [06:41<47:00, 3.84s/it]
Training 1/2 epoch (loss 0.6289): 13%|ββ | 106/840 [06:41<49:32, 4.05s/it]
Training 1/2 epoch (loss 0.5508): 13%|ββ | 106/840 [06:46<49:32, 4.05s/it]
Training 1/2 epoch (loss 0.5508): 13%|ββ | 107/840 [06:46<54:25, 4.46s/it]
Training 1/2 epoch (loss 0.5664): 13%|ββ | 107/840 [06:49<54:25, 4.46s/it]
Training 1/2 epoch (loss 0.5664): 13%|ββ | 108/840 [06:49<49:55, 4.09s/it]
Training 1/2 epoch (loss 0.5742): 13%|ββ | 108/840 [06:53<49:55, 4.09s/it]
Training 1/2 epoch (loss 0.5742): 13%|ββ | 109/840 [06:53<46:38, 3.83s/it]
Training 1/2 epoch (loss 0.6094): 13%|ββ | 109/840 [06:58<46:38, 3.83s/it]
Training 1/2 epoch (loss 0.6094): 13%|ββ | 110/840 [06:58<52:31, 4.32s/it]
Training 1/2 epoch (loss 0.6719): 13%|ββ | 110/840 [07:01<52:31, 4.32s/it]
Training 1/2 epoch (loss 0.6719): 13%|ββ | 111/840 [07:01<46:24, 3.82s/it]
Training 1/2 epoch (loss 0.5977): 13%|ββ | 111/840 [07:06<46:24, 3.82s/it]
Training 1/2 epoch (loss 0.5977): 13%|ββ | 112/840 [07:06<52:37, 4.34s/it]
Training 1/2 epoch (loss 0.6133): 13%|ββ | 112/840 [07:12<52:37, 4.34s/it]
Training 1/2 epoch (loss 0.6133): 13%|ββ | 113/840 [07:12<56:29, 4.66s/it]
Training 1/2 epoch (loss 0.6523): 13%|ββ | 113/840 [07:15<56:29, 4.66s/it]
Training 1/2 epoch (loss 0.6523): 14%|ββ | 114/840 [07:15<50:43, 4.19s/it]
Training 1/2 epoch (loss 0.5547): 14%|ββ | 114/840 [07:18<50:43, 4.19s/it]
Training 1/2 epoch (loss 0.5547): 14%|ββ | 115/840 [07:18<49:00, 4.06s/it]
Training 1/2 epoch (loss 0.5469): 14%|ββ | 115/840 [07:22<49:00, 4.06s/it]
Training 1/2 epoch (loss 0.5469): 14%|ββ | 116/840 [07:22<46:56, 3.89s/it]
Training 1/2 epoch (loss 0.5625): 14%|ββ | 116/840 [07:25<46:56, 3.89s/it]
Training 1/2 epoch (loss 0.5625): 14%|ββ | 117/840 [07:25<44:30, 3.69s/it]
Training 1/2 epoch (loss 0.6094): 14%|ββ | 117/840 [07:30<44:30, 3.69s/it]
Training 1/2 epoch (loss 0.6094): 14%|ββ | 118/840 [07:30<46:47, 3.89s/it]
Training 1/2 epoch (loss 0.6797): 14%|ββ | 118/840 [07:34<46:47, 3.89s/it]
Training 1/2 epoch (loss 0.6797): 14%|ββ | 119/840 [07:34<49:41, 4.14s/it]
Training 1/2 epoch (loss 0.6172): 14%|ββ | 119/840 [07:38<49:41, 4.14s/it]
Training 1/2 epoch (loss 0.6172): 14%|ββ | 120/840 [07:38<46:26, 3.87s/it]
Training 1/2 epoch (loss 0.5117): 14%|ββ | 120/840 [07:41<46:26, 3.87s/it]
Training 1/2 epoch (loss 0.5117): 14%|ββ | 121/840 [07:41<46:22, 3.87s/it]
Training 1/2 epoch (loss 0.5859): 14%|ββ | 121/840 [07:45<46:22, 3.87s/it]
Training 1/2 epoch (loss 0.5859): 15%|ββ | 122/840 [07:45<45:58, 3.84s/it]
Training 1/2 epoch (loss 0.6523): 15%|ββ | 122/840 [07:48<45:58, 3.84s/it]
Training 1/2 epoch (loss 0.6523): 15%|ββ | 123/840 [07:48<43:21, 3.63s/it]
Training 1/2 epoch (loss 0.4961): 15%|ββ | 123/840 [07:52<43:21, 3.63s/it]
Training 1/2 epoch (loss 0.4961): 15%|ββ | 124/840 [07:52<45:04, 3.78s/it]
Training 1/2 epoch (loss 0.5547): 15%|ββ | 124/840 [07:55<45:04, 3.78s/it]
Training 1/2 epoch (loss 0.5547): 15%|ββ | 125/840 [07:55<40:59, 3.44s/it]
Training 1/2 epoch (loss 0.5586): 15%|ββ | 125/840 [07:58<40:59, 3.44s/it]
Training 1/2 epoch (loss 0.5586): 15%|ββ | 126/840 [07:58<40:06, 3.37s/it]
Training 1/2 epoch (loss 0.5586): 15%|ββ | 126/840 [08:01<40:06, 3.37s/it]
Training 1/2 epoch (loss 0.5586): 15%|ββ | 127/840 [08:01<39:16, 3.31s/it]
Training 1/2 epoch (loss 0.6484): 15%|ββ | 127/840 [08:07<39:16, 3.31s/it]
Training 1/2 epoch (loss 0.6484): 15%|ββ | 128/840 [08:07<47:01, 3.96s/it]
Training 1/2 epoch (loss 0.5820): 15%|ββ | 128/840 [08:10<47:01, 3.96s/it]
Training 1/2 epoch (loss 0.5820): 15%|ββ | 129/840 [08:10<43:25, 3.66s/it]
Training 1/2 epoch (loss 0.6172): 15%|ββ | 129/840 [08:13<43:25, 3.66s/it]
Training 1/2 epoch (loss 0.6172): 15%|ββ | 130/840 [08:13<41:16, 3.49s/it]
Training 1/2 epoch (loss 0.5703): 15%|ββ | 130/840 [08:16<41:16, 3.49s/it]
Training 1/2 epoch (loss 0.5703): 16%|ββ | 131/840 [08:16<38:30, 3.26s/it]
Training 1/2 epoch (loss 0.6055): 16%|ββ | 131/840 [08:21<38:30, 3.26s/it]
Training 1/2 epoch (loss 0.6055): 16%|ββ | 132/840 [08:21<44:13, 3.75s/it]
Training 1/2 epoch (loss 0.5625): 16%|ββ | 132/840 [08:25<44:13, 3.75s/it]
Training 1/2 epoch (loss 0.5625): 16%|ββ | 133/840 [08:25<45:08, 3.83s/it]
Training 1/2 epoch (loss 0.6797): 16%|ββ | 133/840 [08:29<45:08, 3.83s/it]
Training 1/2 epoch (loss 0.6797): 16%|ββ | 134/840 [08:29<47:18, 4.02s/it]
Training 1/2 epoch (loss 0.6016): 16%|ββ | 134/840 [08:32<47:18, 4.02s/it]
Training 1/2 epoch (loss 0.6016): 16%|ββ | 135/840 [08:32<44:09, 3.76s/it]
Training 1/2 epoch (loss 0.6602): 16%|ββ | 135/840 [08:36<44:09, 3.76s/it]
Training 1/2 epoch (loss 0.6602): 16%|ββ | 136/840 [08:36<44:22, 3.78s/it]
Training 1/2 epoch (loss 0.5430): 16%|ββ | 136/840 [08:39<44:22, 3.78s/it]
Training 1/2 epoch (loss 0.5430): 16%|ββ | 137/840 [08:39<40:15, 3.44s/it]
Training 1/2 epoch (loss 0.5508): 16%|ββ | 137/840 [08:43<40:15, 3.44s/it]
Training 1/2 epoch (loss 0.5508): 16%|ββ | 138/840 [08:43<44:42, 3.82s/it]
Training 1/2 epoch (loss 0.6172): 16%|ββ | 138/840 [08:47<44:42, 3.82s/it]
Training 1/2 epoch (loss 0.6172): 17%|ββ | 139/840 [08:47<45:27, 3.89s/it]
Training 1/2 epoch (loss 0.5039): 17%|ββ | 139/840 [08:51<45:27, 3.89s/it]
Training 1/2 epoch (loss 0.5039): 17%|ββ | 140/840 [08:51<43:36, 3.74s/it]
Training 1/2 epoch (loss 0.5391): 17%|ββ | 140/840 [08:56<43:36, 3.74s/it]
Training 1/2 epoch (loss 0.5391): 17%|ββ | 141/840 [08:56<47:11, 4.05s/it]
Training 1/2 epoch (loss 0.5547): 17%|ββ | 141/840 [09:01<47:11, 4.05s/it]
Training 1/2 epoch (loss 0.5547): 17%|ββ | 142/840 [09:01<52:18, 4.50s/it]
Training 1/2 epoch (loss 0.5742): 17%|ββ | 142/840 [09:05<52:18, 4.50s/it]
Training 1/2 epoch (loss 0.5742): 17%|ββ | 143/840 [09:05<49:46, 4.28s/it]
Training 1/2 epoch (loss 0.4531): 17%|ββ | 143/840 [09:08<49:46, 4.28s/it]
Training 1/2 epoch (loss 0.4531): 17%|ββ | 144/840 [09:08<45:43, 3.94s/it]
Training 1/2 epoch (loss 0.6133): 17%|ββ | 144/840 [09:11<45:43, 3.94s/it]
Training 1/2 epoch (loss 0.6133): 17%|ββ | 145/840 [09:11<42:21, 3.66s/it]
Training 1/2 epoch (loss 0.5312): 17%|ββ | 145/840 [09:16<42:21, 3.66s/it]
Training 1/2 epoch (loss 0.5312): 17%|ββ | 146/840 [09:16<47:11, 4.08s/it]
Training 1/2 epoch (loss 0.5742): 17%|ββ | 146/840 [09:20<47:11, 4.08s/it]
Training 1/2 epoch (loss 0.5742): 18%|ββ | 147/840 [09:20<45:58, 3.98s/it]
Training 1/2 epoch (loss 0.6641): 18%|ββ | 147/840 [09:24<45:58, 3.98s/it]
Training 1/2 epoch (loss 0.6641): 18%|ββ | 148/840 [09:24<45:39, 3.96s/it]
Training 1/2 epoch (loss 0.5859): 18%|ββ | 148/840 [09:28<45:39, 3.96s/it]
Training 1/2 epoch (loss 0.5859): 18%|ββ | 149/840 [09:28<45:25, 3.94s/it]
Training 1/2 epoch (loss 0.6172): 18%|ββ | 149/840 [09:31<45:25, 3.94s/it]
Training 1/2 epoch (loss 0.6172): 18%|ββ | 150/840 [09:31<42:35, 3.70s/it]
Training 1/2 epoch (loss 0.6562): 18%|ββ | 150/840 [09:35<42:35, 3.70s/it]
Training 1/2 epoch (loss 0.6562): 18%|ββ | 151/840 [09:35<42:29, 3.70s/it]
Training 1/2 epoch (loss 0.5664): 18%|ββ | 151/840 [09:38<42:29, 3.70s/it]
Training 1/2 epoch (loss 0.5664): 18%|ββ | 152/840 [09:38<42:39, 3.72s/it]
Training 1/2 epoch (loss 0.4863): 18%|ββ | 152/840 [09:42<42:39, 3.72s/it]
Training 1/2 epoch (loss 0.4863): 18%|ββ | 153/840 [09:42<42:37, 3.72s/it]
Training 1/2 epoch (loss 0.5625): 18%|ββ | 153/840 [09:45<42:37, 3.72s/it]
Training 1/2 epoch (loss 0.5625): 18%|ββ | 154/840 [09:45<40:46, 3.57s/it]
Training 1/2 epoch (loss 0.7422): 18%|ββ | 154/840 [09:50<40:46, 3.57s/it]
Training 1/2 epoch (loss 0.7422): 18%|ββ | 155/840 [09:50<44:16, 3.88s/it]
Training 1/2 epoch (loss 0.5703): 18%|ββ | 155/840 [09:54<44:16, 3.88s/it]
Training 1/2 epoch (loss 0.5703): 19%|ββ | 156/840 [09:54<43:53, 3.85s/it]
Training 1/2 epoch (loss 0.5625): 19%|ββ | 156/840 [09:57<43:53, 3.85s/it]
Training 1/2 epoch (loss 0.5625): 19%|ββ | 157/840 [09:57<41:39, 3.66s/it]
Training 1/2 epoch (loss 0.5742): 19%|ββ | 157/840 [10:00<41:39, 3.66s/it]
Training 1/2 epoch (loss 0.5742): 19%|ββ | 158/840 [10:00<38:51, 3.42s/it]
Training 1/2 epoch (loss 0.5195): 19%|ββ | 158/840 [10:03<38:51, 3.42s/it]
Training 1/2 epoch (loss 0.5195): 19%|ββ | 159/840 [10:03<39:53, 3.52s/it]
Training 1/2 epoch (loss 0.6211): 19%|ββ | 159/840 [10:08<39:53, 3.52s/it]
Training 1/2 epoch (loss 0.6211): 19%|ββ | 160/840 [10:08<42:35, 3.76s/it]
Training 1/2 epoch (loss 0.5781): 19%|ββ | 160/840 [10:12<42:35, 3.76s/it]
Training 1/2 epoch (loss 0.5781): 19%|ββ | 161/840 [10:12<44:55, 3.97s/it]
Training 1/2 epoch (loss 0.6797): 19%|ββ | 161/840 [10:18<44:55, 3.97s/it]
Training 1/2 epoch (loss 0.6797): 19%|ββ | 162/840 [10:18<50:23, 4.46s/it]
Training 1/2 epoch (loss 0.6367): 19%|ββ | 162/840 [10:22<50:23, 4.46s/it]
Training 1/2 epoch (loss 0.6367): 19%|ββ | 163/840 [10:22<49:29, 4.39s/it]
Training 1/2 epoch (loss 0.5469): 19%|ββ | 163/840 [10:25<49:29, 4.39s/it]
Training 1/2 epoch (loss 0.5469): 20%|ββ | 164/840 [10:25<45:14, 4.02s/it]
Training 1/2 epoch (loss 0.5781): 20%|ββ | 164/840 [10:29<45:14, 4.02s/it]
Training 1/2 epoch (loss 0.5781): 20%|ββ | 165/840 [10:29<44:03, 3.92s/it]
Training 1/2 epoch (loss 0.5859): 20%|ββ | 165/840 [10:32<44:03, 3.92s/it]
Training 1/2 epoch (loss 0.5859): 20%|ββ | 166/840 [10:32<42:16, 3.76s/it]
Training 1/2 epoch (loss 0.5000): 20%|ββ | 166/840 [10:35<42:16, 3.76s/it]
Training 1/2 epoch (loss 0.5000): 20%|ββ | 167/840 [10:35<39:59, 3.56s/it]
Training 1/2 epoch (loss 0.5195): 20%|ββ | 167/840 [10:39<39:59, 3.56s/it]
Training 1/2 epoch (loss 0.5195): 20%|ββ | 168/840 [10:39<39:59, 3.57s/it]
Training 1/2 epoch (loss 0.5742): 20%|ββ | 168/840 [10:42<39:59, 3.57s/it]
Training 1/2 epoch (loss 0.5742): 20%|ββ | 169/840 [10:42<39:15, 3.51s/it]
Training 1/2 epoch (loss 0.5781): 20%|ββ | 169/840 [10:46<39:15, 3.51s/it]
Training 1/2 epoch (loss 0.5781): 20%|ββ | 170/840 [10:46<39:32, 3.54s/it]
Training 1/2 epoch (loss 0.5820): 20%|ββ | 170/840 [10:50<39:32, 3.54s/it]
Training 1/2 epoch (loss 0.5820): 20%|ββ | 171/840 [10:50<41:20, 3.71s/it]
Training 1/2 epoch (loss 0.5430): 20%|ββ | 171/840 [10:54<41:20, 3.71s/it]
Training 1/2 epoch (loss 0.5430): 20%|ββ | 172/840 [10:54<40:56, 3.68s/it]
Training 1/2 epoch (loss 0.5469): 20%|ββ | 172/840 [10:57<40:56, 3.68s/it]
Training 1/2 epoch (loss 0.5469): 21%|ββ | 173/840 [10:57<41:14, 3.71s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 173/840 [11:00<41:14, 3.71s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 174/840 [11:00<38:10, 3.44s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 174/840 [11:03<38:10, 3.44s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 175/840 [11:03<35:42, 3.22s/it]
Training 1/2 epoch (loss 0.5547): 21%|ββ | 175/840 [11:06<35:42, 3.22s/it]
Training 1/2 epoch (loss 0.5547): 21%|ββ | 176/840 [11:06<35:27, 3.20s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 176/840 [11:12<35:27, 3.20s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 177/840 [11:12<42:50, 3.88s/it]
Training 1/2 epoch (loss 0.6094): 21%|ββ | 177/840 [11:15<42:50, 3.88s/it]
Training 1/2 epoch (loss 0.6094): 21%|ββ | 178/840 [11:15<42:03, 3.81s/it]
Training 1/2 epoch (loss 0.5938): 21%|ββ | 178/840 [11:20<42:03, 3.81s/it]
Training 1/2 epoch (loss 0.5938): 21%|βββ | 179/840 [11:20<43:27, 3.94s/it]
Training 1/2 epoch (loss 0.5547): 21%|βββ | 179/840 [11:22<43:27, 3.94s/it]
Training 1/2 epoch (loss 0.5547): 21%|βββ | 180/840 [11:22<38:58, 3.54s/it]
Training 1/2 epoch (loss 0.4941): 21%|βββ | 180/840 [11:25<38:58, 3.54s/it]
Training 1/2 epoch (loss 0.4941): 22%|βββ | 181/840 [11:25<37:41, 3.43s/it]
Training 1/2 epoch (loss 0.5391): 22%|βββ | 181/840 [11:28<37:41, 3.43s/it]
Training 1/2 epoch (loss 0.5391): 22%|βββ | 182/840 [11:28<35:17, 3.22s/it]
Training 1/2 epoch (loss 0.5586): 22%|βββ | 182/840 [11:31<35:17, 3.22s/it]
Training 1/2 epoch (loss 0.5586): 22%|βββ | 183/840 [11:31<34:56, 3.19s/it]
Training 1/2 epoch (loss 0.5664): 22%|βββ | 183/840 [11:34<34:56, 3.19s/it]
Training 1/2 epoch (loss 0.5664): 22%|βββ | 184/840 [11:34<34:53, 3.19s/it]
Training 1/2 epoch (loss 0.6250): 22%|βββ | 184/840 [11:38<34:53, 3.19s/it]
Training 1/2 epoch (loss 0.6250): 22%|βββ | 185/840 [11:38<35:53, 3.29s/it]
Training 1/2 epoch (loss 0.5625): 22%|βββ | 185/840 [11:42<35:53, 3.29s/it]
Training 1/2 epoch (loss 0.5625): 22%|βββ | 186/840 [11:42<39:21, 3.61s/it]
Training 1/2 epoch (loss 0.6328): 22%|βββ | 186/840 [11:46<39:21, 3.61s/it]
Training 1/2 epoch (loss 0.6328): 22%|βββ | 187/840 [11:46<38:48, 3.57s/it]
Training 1/2 epoch (loss 0.6055): 22%|βββ | 187/840 [11:50<38:48, 3.57s/it]
Training 1/2 epoch (loss 0.6055): 22%|βββ | 188/840 [11:50<41:44, 3.84s/it]
Training 1/2 epoch (loss 0.6289): 22%|βββ | 188/840 [11:53<41:44, 3.84s/it]
Training 1/2 epoch (loss 0.6289): 22%|βββ | 189/840 [11:53<39:10, 3.61s/it]
Training 1/2 epoch (loss 0.5625): 22%|βββ | 189/840 [11:57<39:10, 3.61s/it]
Training 1/2 epoch (loss 0.5625): 23%|βββ | 190/840 [11:57<38:05, 3.52s/it]
Training 1/2 epoch (loss 0.4395): 23%|βββ | 190/840 [12:00<38:05, 3.52s/it]
Training 1/2 epoch (loss 0.4395): 23%|βββ | 191/840 [12:00<39:07, 3.62s/it]
Training 1/2 epoch (loss 0.6641): 23%|βββ | 191/840 [12:04<39:07, 3.62s/it]
Training 1/2 epoch (loss 0.6641): 23%|βββ | 192/840 [12:04<38:55, 3.60s/it]
Training 1/2 epoch (loss 0.5312): 23%|βββ | 192/840 [12:07<38:55, 3.60s/it]
Training 1/2 epoch (loss 0.5312): 23%|βββ | 193/840 [12:07<37:09, 3.45s/it]
Training 1/2 epoch (loss 0.5195): 23%|βββ | 193/840 [12:11<37:09, 3.45s/it]
Training 1/2 epoch (loss 0.5195): 23%|βββ | 194/840 [12:11<38:19, 3.56s/it]
Training 1/2 epoch (loss 0.5352): 23%|βββ | 194/840 [12:14<38:19, 3.56s/it]
Training 1/2 epoch (loss 0.5352): 23%|βββ | 195/840 [12:14<35:59, 3.35s/it]
Training 1/2 epoch (loss 0.5859): 23%|βββ | 195/840 [12:17<35:59, 3.35s/it]
Training 1/2 epoch (loss 0.5859): 23%|βββ | 196/840 [12:17<36:03, 3.36s/it]
Training 1/2 epoch (loss 0.5625): 23%|βββ | 196/840 [12:23<36:03, 3.36s/it]
Training 1/2 epoch (loss 0.5625): 23%|βββ | 197/840 [12:23<42:51, 4.00s/it]
Training 1/2 epoch (loss 0.5547): 23%|βββ | 197/840 [12:26<42:51, 4.00s/it]
Training 1/2 epoch (loss 0.5547): 24%|βββ | 198/840 [12:26<42:06, 3.94s/it]
Training 1/2 epoch (loss 0.5859): 24%|βββ | 198/840 [12:29<42:06, 3.94s/it]
Training 1/2 epoch (loss 0.5859): 24%|βββ | 199/840 [12:29<39:15, 3.67s/it]
Training 1/2 epoch (loss 0.5547): 24%|βββ | 199/840 [12:33<39:15, 3.67s/it]
Training 1/2 epoch (loss 0.5547): 24%|βββ | 200/840 [12:33<38:27, 3.61s/it]
Training 1/2 epoch (loss 0.6406): 24%|βββ | 200/840 [12:37<38:27, 3.61s/it]
Training 1/2 epoch (loss 0.6406): 24%|βββ | 201/840 [12:37<38:38, 3.63s/it]
Training 1/2 epoch (loss 0.5586): 24%|βββ | 201/840 [12:40<38:38, 3.63s/it]
Training 1/2 epoch (loss 0.5586): 24%|βββ | 202/840 [12:40<38:21, 3.61s/it]
Training 1/2 epoch (loss 0.5352): 24%|βββ | 202/840 [12:43<38:21, 3.61s/it]
Training 1/2 epoch (loss 0.5352): 24%|βββ | 203/840 [12:43<35:06, 3.31s/it]
Training 1/2 epoch (loss 0.6211): 24%|βββ | 203/840 [12:47<35:06, 3.31s/it]
Training 1/2 epoch (loss 0.6211): 24%|βββ | 204/840 [12:47<36:36, 3.45s/it]
Training 1/2 epoch (loss 0.5469): 24%|βββ | 204/840 [12:51<36:36, 3.45s/it]
Training 1/2 epoch (loss 0.5469): 24%|βββ | 205/840 [12:51<41:09, 3.89s/it]
Training 1/2 epoch (loss 0.5469): 24%|βββ | 205/840 [12:55<41:09, 3.89s/it]
Training 1/2 epoch (loss 0.5469): 25%|βββ | 206/840 [12:55<39:41, 3.76s/it]
Training 1/2 epoch (loss 0.5664): 25%|βββ | 206/840 [12:58<39:41, 3.76s/it]
Training 1/2 epoch (loss 0.5664): 25%|βββ | 207/840 [12:58<38:05, 3.61s/it]
Training 1/2 epoch (loss 0.5234): 25%|βββ | 207/840 [13:03<38:05, 3.61s/it]
Training 1/2 epoch (loss 0.5234): 25%|βββ | 208/840 [13:03<41:36, 3.95s/it]
Training 1/2 epoch (loss 0.4707): 25%|βββ | 208/840 [13:06<41:36, 3.95s/it]
Training 1/2 epoch (loss 0.4707): 25%|βββ | 209/840 [13:06<37:46, 3.59s/it]
Training 1/2 epoch (loss 0.5781): 25%|βββ | 209/840 [13:09<37:46, 3.59s/it]
Training 1/2 epoch (loss 0.5781): 25%|βββ | 210/840 [13:09<36:41, 3.49s/it]
Training 1/2 epoch (loss 0.6641): 25%|βββ | 210/840 [13:14<36:41, 3.49s/it]
Training 1/2 epoch (loss 0.6641): 25%|βββ | 211/840 [13:14<42:53, 4.09s/it]
Training 1/2 epoch (loss 0.5586): 25%|βββ | 211/840 [13:17<42:53, 4.09s/it]
Training 1/2 epoch (loss 0.5586): 25%|βββ | 212/840 [13:17<39:21, 3.76s/it]
Training 1/2 epoch (loss 0.4805): 25%|βββ | 212/840 [13:21<39:21, 3.76s/it]
Training 1/2 epoch (loss 0.4805): 25%|βββ | 213/840 [13:21<37:16, 3.57s/it]
Training 1/2 epoch (loss 0.6680): 25%|βββ | 213/840 [13:24<37:16, 3.57s/it]
Training 1/2 epoch (loss 0.6680): 25%|βββ | 214/840 [13:24<37:07, 3.56s/it]
Training 1/2 epoch (loss 0.5117): 25%|βββ | 214/840 [13:30<37:07, 3.56s/it]
Training 1/2 epoch (loss 0.5117): 26%|βββ | 215/840 [13:30<43:01, 4.13s/it]
Training 1/2 epoch (loss 0.5938): 26%|βββ | 215/840 [13:34<43:01, 4.13s/it]
Training 1/2 epoch (loss 0.5938): 26%|βββ | 216/840 [13:34<42:56, 4.13s/it]
Training 1/2 epoch (loss 0.5234): 26%|βββ | 216/840 [13:37<42:56, 4.13s/it]
Training 1/2 epoch (loss 0.5234): 26%|βββ | 217/840 [13:37<42:01, 4.05s/it]
Training 1/2 epoch (loss 0.4980): 26%|βββ | 217/840 [13:42<42:01, 4.05s/it]
Training 1/2 epoch (loss 0.4980): 26%|βββ | 218/840 [13:42<44:51, 4.33s/it]
Training 1/2 epoch (loss 0.6172): 26%|βββ | 218/840 [13:46<44:51, 4.33s/it]
Training 1/2 epoch (loss 0.6172): 26%|βββ | 219/840 [13:46<42:43, 4.13s/it]
Training 1/2 epoch (loss 0.4844): 26%|βββ | 219/840 [13:50<42:43, 4.13s/it]
Training 1/2 epoch (loss 0.4844): 26%|βββ | 220/840 [13:50<43:17, 4.19s/it]
Training 1/2 epoch (loss 0.5000): 26%|βββ | 220/840 [13:53<43:17, 4.19s/it]
Training 1/2 epoch (loss 0.5000): 26%|βββ | 221/840 [13:53<38:47, 3.76s/it]
Training 1/2 epoch (loss 0.5000): 26%|βββ | 221/840 [13:58<38:47, 3.76s/it]
Training 1/2 epoch (loss 0.5000): 26%|βββ | 222/840 [13:58<42:18, 4.11s/it]
Training 1/2 epoch (loss 0.6484): 26%|βββ | 222/840 [14:01<42:18, 4.11s/it]
Training 1/2 epoch (loss 0.6484): 27%|βββ | 223/840 [14:01<39:47, 3.87s/it]
Training 1/2 epoch (loss 0.5547): 27%|βββ | 223/840 [14:04<39:47, 3.87s/it]
Training 1/2 epoch (loss 0.5547): 27%|βββ | 224/840 [14:04<35:40, 3.48s/it]
Training 1/2 epoch (loss 0.5312): 27%|βββ | 224/840 [14:08<35:40, 3.48s/it]
Training 1/2 epoch (loss 0.5312): 27%|βββ | 225/840 [14:08<38:42, 3.78s/it]
Training 1/2 epoch (loss 0.5078): 27%|βββ | 225/840 [14:12<38:42, 3.78s/it]
Training 1/2 epoch (loss 0.5078): 27%|βββ | 226/840 [14:12<38:07, 3.73s/it]
Training 1/2 epoch (loss 0.5078): 27%|βββ | 226/840 [14:15<38:07, 3.73s/it]
Training 1/2 epoch (loss 0.5078): 27%|βββ | 227/840 [14:15<36:02, 3.53s/it]
Training 1/2 epoch (loss 0.6602): 27%|βββ | 227/840 [14:21<36:02, 3.53s/it]
Training 1/2 epoch (loss 0.6602): 27%|βββ | 228/840 [14:21<42:18, 4.15s/it]
Training 1/2 epoch (loss 0.5156): 27%|βββ | 228/840 [14:24<42:18, 4.15s/it]
Training 1/2 epoch (loss 0.5156): 27%|βββ | 229/840 [14:24<39:52, 3.92s/it]
Training 1/2 epoch (loss 0.5938): 27%|βββ | 229/840 [14:29<39:52, 3.92s/it]
Training 1/2 epoch (loss 0.5938): 27%|βββ | 230/840 [14:29<41:38, 4.10s/it]
Training 1/2 epoch (loss 0.5469): 27%|βββ | 230/840 [14:32<41:38, 4.10s/it]
Training 1/2 epoch (loss 0.5469): 28%|βββ | 231/840 [14:32<37:52, 3.73s/it]
Training 1/2 epoch (loss 0.5586): 28%|βββ | 231/840 [14:35<37:52, 3.73s/it]
Training 1/2 epoch (loss 0.5586): 28%|βββ | 232/840 [14:35<36:11, 3.57s/it]
Training 1/2 epoch (loss 0.5898): 28%|βββ | 232/840 [14:39<36:11, 3.57s/it]
Training 1/2 epoch (loss 0.5898): 28%|βββ | 233/840 [14:39<39:18, 3.89s/it]
Training 1/2 epoch (loss 0.5742): 28%|βββ | 233/840 [14:42<39:18, 3.89s/it]
Training 1/2 epoch (loss 0.5742): 28%|βββ | 234/840 [14:42<36:07, 3.58s/it]
Training 1/2 epoch (loss 0.6562): 28%|βββ | 234/840 [14:45<36:07, 3.58s/it]
Training 1/2 epoch (loss 0.6562): 28%|βββ | 235/840 [14:45<33:48, 3.35s/it]
Training 1/2 epoch (loss 0.5938): 28%|βββ | 235/840 [14:51<33:48, 3.35s/it]
Training 1/2 epoch (loss 0.5938): 28%|βββ | 236/840 [14:51<40:30, 4.02s/it]
Training 1/2 epoch (loss 0.4961): 28%|βββ | 236/840 [14:55<40:30, 4.02s/it]
Training 1/2 epoch (loss 0.4961): 28%|βββ | 237/840 [14:55<41:53, 4.17s/it]
Training 1/2 epoch (loss 0.6016): 28%|βββ | 237/840 [14:58<41:53, 4.17s/it]
Training 1/2 epoch (loss 0.6016): 28%|βββ | 238/840 [14:58<38:56, 3.88s/it]
Training 1/2 epoch (loss 0.6172): 28%|βββ | 238/840 [15:04<38:56, 3.88s/it]
Training 1/2 epoch (loss 0.6172): 28%|βββ | 239/840 [15:04<43:34, 4.35s/it]
Training 1/2 epoch (loss 0.5039): 28%|βββ | 239/840 [15:08<43:34, 4.35s/it]
Training 1/2 epoch (loss 0.5039): 29%|βββ | 240/840 [15:08<41:52, 4.19s/it]
Training 1/2 epoch (loss 0.5078): 29%|βββ | 240/840 [15:12<41:52, 4.19s/it]
Training 1/2 epoch (loss 0.5078): 29%|βββ | 241/840 [15:12<42:46, 4.28s/it]
Training 1/2 epoch (loss 0.5078): 29%|βββ | 241/840 [15:16<42:46, 4.28s/it]
Training 1/2 epoch (loss 0.5078): 29%|βββ | 242/840 [15:16<41:55, 4.21s/it]
Training 1/2 epoch (loss 0.5859): 29%|βββ | 242/840 [15:20<41:55, 4.21s/it]
Training 1/2 epoch (loss 0.5859): 29%|βββ | 243/840 [15:20<39:42, 3.99s/it]
Training 1/2 epoch (loss 0.5430): 29%|βββ | 243/840 [15:25<39:42, 3.99s/it]
Training 1/2 epoch (loss 0.5430): 29%|βββ | 244/840 [15:25<44:13, 4.45s/it]
Training 1/2 epoch (loss 0.5430): 29%|βββ | 244/840 [15:28<44:13, 4.45s/it]
Training 1/2 epoch (loss 0.5430): 29%|βββ | 245/840 [15:28<40:21, 4.07s/it]
Training 1/2 epoch (loss 0.6172): 29%|βββ | 245/840 [15:31<40:21, 4.07s/it]
Training 1/2 epoch (loss 0.6172): 29%|βββ | 246/840 [15:31<37:37, 3.80s/it]
Training 1/2 epoch (loss 0.5625): 29%|βββ | 246/840 [15:34<37:37, 3.80s/it]
Training 1/2 epoch (loss 0.5625): 29%|βββ | 247/840 [15:34<33:59, 3.44s/it]
Training 1/2 epoch (loss 0.5469): 29%|βββ | 247/840 [15:40<33:59, 3.44s/it]
Training 1/2 epoch (loss 0.5469): 30%|βββ | 248/840 [15:40<40:13, 4.08s/it]
Training 1/2 epoch (loss 0.5664): 30%|βββ | 248/840 [15:43<40:13, 4.08s/it]
Training 1/2 epoch (loss 0.5664): 30%|βββ | 249/840 [15:43<37:10, 3.77s/it]
Training 1/2 epoch (loss 0.5430): 30%|βββ | 249/840 [15:46<37:10, 3.77s/it]
Training 1/2 epoch (loss 0.5430): 30%|βββ | 250/840 [15:46<34:46, 3.54s/it]
Training 1/2 epoch (loss 0.5156): 30%|βββ | 250/840 [15:49<34:46, 3.54s/it]
Training 1/2 epoch (loss 0.5156): 30%|βββ | 251/840 [15:49<35:15, 3.59s/it]
Training 1/2 epoch (loss 0.5195): 30%|βββ | 251/840 [15:53<35:15, 3.59s/it]
Training 1/2 epoch (loss 0.5195): 30%|βββ | 252/840 [15:53<34:10, 3.49s/it]
Training 1/2 epoch (loss 0.5352): 30%|βββ | 252/840 [15:56<34:10, 3.49s/it]
Training 1/2 epoch (loss 0.5352): 30%|βββ | 253/840 [15:56<34:34, 3.53s/it]
Training 1/2 epoch (loss 0.5625): 30%|βββ | 253/840 [16:00<34:34, 3.53s/it]
Training 1/2 epoch (loss 0.5625): 30%|βββ | 254/840 [16:00<35:41, 3.65s/it]
Training 1/2 epoch (loss 0.5391): 30%|βββ | 254/840 [16:04<35:41, 3.65s/it]
Training 1/2 epoch (loss 0.5391): 30%|βββ | 255/840 [16:04<34:42, 3.56s/it]
Training 1/2 epoch (loss 0.5391): 30%|βββ | 255/840 [16:07<34:42, 3.56s/it]
Training 1/2 epoch (loss 0.5391): 30%|βββ | 256/840 [16:07<35:35, 3.66s/it]
Training 1/2 epoch (loss 0.5547): 30%|βββ | 256/840 [16:12<35:35, 3.66s/it]
Training 1/2 epoch (loss 0.5547): 31%|βββ | 257/840 [16:12<38:50, 4.00s/it]
Training 1/2 epoch (loss 0.4883): 31%|βββ | 257/840 [16:16<38:50, 4.00s/it]
Training 1/2 epoch (loss 0.4883): 31%|βββ | 258/840 [16:16<36:53, 3.80s/it]
Training 1/2 epoch (loss 0.6133): 31%|βββ | 258/840 [16:19<36:53, 3.80s/it]
Training 1/2 epoch (loss 0.6133): 31%|βββ | 259/840 [16:19<34:15, 3.54s/it]
Training 1/2 epoch (loss 0.6641): 31%|βββ | 259/840 [16:22<34:15, 3.54s/it]
Training 1/2 epoch (loss 0.6641): 31%|βββ | 260/840 [16:22<34:41, 3.59s/it]
Training 1/2 epoch (loss 0.4961): 31%|βββ | 260/840 [16:25<34:41, 3.59s/it]
Training 1/2 epoch (loss 0.4961): 31%|βββ | 261/840 [16:25<31:51, 3.30s/it]
Training 1/2 epoch (loss 0.5273): 31%|βββ | 261/840 [16:28<31:51, 3.30s/it]
Training 1/2 epoch (loss 0.5273): 31%|βββ | 262/840 [16:28<32:32, 3.38s/it]
Training 1/2 epoch (loss 0.5430): 31%|βββ | 262/840 [16:31<32:32, 3.38s/it]
Training 1/2 epoch (loss 0.5430): 31%|ββββ | 263/840 [16:31<30:28, 3.17s/it]
Training 1/2 epoch (loss 0.5469): 31%|ββββ | 263/840 [16:34<30:28, 3.17s/it]
Training 1/2 epoch (loss 0.5469): 31%|ββββ | 264/840 [16:34<28:40, 2.99s/it]
Training 1/2 epoch (loss 0.4570): 31%|ββββ | 264/840 [16:37<28:40, 2.99s/it]
Training 1/2 epoch (loss 0.4570): 32%|ββββ | 265/840 [16:37<28:23, 2.96s/it]
Training 1/2 epoch (loss 0.8672): 32%|ββββ | 265/840 [16:42<28:23, 2.96s/it]
Training 1/2 epoch (loss 0.8672): 32%|ββββ | 266/840 [16:42<35:43, 3.73s/it]
Training 1/2 epoch (loss 0.6367): 32%|ββββ | 266/840 [16:45<35:43, 3.73s/it]
Training 1/2 epoch (loss 0.6367): 32%|ββββ | 267/840 [16:45<33:34, 3.52s/it]
Training 1/2 epoch (loss 0.4766): 32%|ββββ | 267/840 [16:48<33:34, 3.52s/it]
Training 1/2 epoch (loss 0.4766): 32%|ββββ | 268/840 [16:48<32:47, 3.44s/it]
Training 1/2 epoch (loss 0.4961): 32%|ββββ | 268/840 [16:51<32:47, 3.44s/it]
Training 1/2 epoch (loss 0.4961): 32%|ββββ | 269/840 [16:51<30:15, 3.18s/it]
Training 1/2 epoch (loss 0.5000): 32%|ββββ | 269/840 [16:54<30:15, 3.18s/it]
Training 1/2 epoch (loss 0.5000): 32%|ββββ | 270/840 [16:54<30:43, 3.23s/it]
Training 1/2 epoch (loss 0.5352): 32%|ββββ | 270/840 [16:59<30:43, 3.23s/it]
Training 1/2 epoch (loss 0.5352): 32%|ββββ | 271/840 [16:59<35:10, 3.71s/it]
Training 1/2 epoch (loss 0.5156): 32%|ββββ | 271/840 [17:02<35:10, 3.71s/it]
Training 1/2 epoch (loss 0.5156): 32%|ββββ | 272/840 [17:02<33:09, 3.50s/it]
Training 1/2 epoch (loss 0.5000): 32%|ββββ | 272/840 [17:06<33:09, 3.50s/it]
Training 1/2 epoch (loss 0.5000): 32%|ββββ | 273/840 [17:06<35:11, 3.72s/it]
Training 1/2 epoch (loss 0.5039): 32%|ββββ | 273/840 [17:11<35:11, 3.72s/it]
Training 1/2 epoch (loss 0.5039): 33%|ββββ | 274/840 [17:11<36:20, 3.85s/it]
Training 1/2 epoch (loss 0.5625): 33%|ββββ | 274/840 [17:15<36:20, 3.85s/it]
Training 1/2 epoch (loss 0.5625): 33%|ββββ | 275/840 [17:15<39:02, 4.15s/it]
Training 1/2 epoch (loss 0.5156): 33%|ββββ | 275/840 [17:20<39:02, 4.15s/it]
Training 1/2 epoch (loss 0.5156): 33%|ββββ | 276/840 [17:20<39:20, 4.18s/it]
Training 1/2 epoch (loss 0.5977): 33%|ββββ | 276/840 [17:23<39:20, 4.18s/it]
Training 1/2 epoch (loss 0.5977): 33%|ββββ | 277/840 [17:23<36:41, 3.91s/it]
Training 1/2 epoch (loss 0.4336): 33%|ββββ | 277/840 [17:27<36:41, 3.91s/it]
Training 1/2 epoch (loss 0.4336): 33%|ββββ | 278/840 [17:27<36:41, 3.92s/it]
Training 1/2 epoch (loss 0.5586): 33%|ββββ | 278/840 [17:30<36:41, 3.92s/it]
Training 1/2 epoch (loss 0.5586): 33%|ββββ | 279/840 [17:30<35:48, 3.83s/it]
Training 1/2 epoch (loss 0.5273): 33%|ββββ | 279/840 [17:33<35:48, 3.83s/it]
Training 1/2 epoch (loss 0.5273): 33%|ββββ | 280/840 [17:33<33:05, 3.55s/it]
Training 1/2 epoch (loss 0.5781): 33%|ββββ | 280/840 [17:37<33:05, 3.55s/it]
Training 1/2 epoch (loss 0.5781): 33%|ββββ | 281/840 [17:37<32:55, 3.53s/it]
Training 1/2 epoch (loss 0.6250): 33%|ββββ | 281/840 [17:40<32:55, 3.53s/it]
Training 1/2 epoch (loss 0.6250): 34%|ββββ | 282/840 [17:40<31:44, 3.41s/it]
Training 1/2 epoch (loss 0.5547): 34%|ββββ | 282/840 [17:44<31:44, 3.41s/it]
Training 1/2 epoch (loss 0.5547): 34%|ββββ | 283/840 [17:44<33:14, 3.58s/it]
Training 1/2 epoch (loss 0.5781): 34%|ββββ | 283/840 [17:47<33:14, 3.58s/it]
Training 1/2 epoch (loss 0.5781): 34%|ββββ | 284/840 [17:47<31:11, 3.37s/it]
Training 1/2 epoch (loss 0.5430): 34%|ββββ | 284/840 [17:51<31:11, 3.37s/it]
Training 1/2 epoch (loss 0.5430): 34%|ββββ | 285/840 [17:51<32:10, 3.48s/it]
Training 1/2 epoch (loss 0.5273): 34%|ββββ | 285/840 [17:54<32:10, 3.48s/it]
Training 1/2 epoch (loss 0.5273): 34%|ββββ | 286/840 [17:54<32:11, 3.49s/it]
Training 1/2 epoch (loss 0.4414): 34%|ββββ | 286/840 [17:58<32:11, 3.49s/it]
Training 1/2 epoch (loss 0.4414): 34%|ββββ | 287/840 [17:58<32:22, 3.51s/it]
Training 1/2 epoch (loss 0.5000): 34%|ββββ | 287/840 [18:01<32:22, 3.51s/it]
Training 1/2 epoch (loss 0.5000): 34%|ββββ | 288/840 [18:01<30:29, 3.31s/it]
Training 1/2 epoch (loss 0.5586): 34%|ββββ | 288/840 [18:04<30:29, 3.31s/it]
Training 1/2 epoch (loss 0.5586): 34%|ββββ | 289/840 [18:04<29:46, 3.24s/it]
Training 1/2 epoch (loss 0.5898): 34%|ββββ | 289/840 [18:07<29:46, 3.24s/it]
Training 1/2 epoch (loss 0.5898): 35%|ββββ | 290/840 [18:07<30:50, 3.36s/it]
Training 1/2 epoch (loss 0.5195): 35%|ββββ | 290/840 [18:12<30:50, 3.36s/it]
Training 1/2 epoch (loss 0.5195): 35%|ββββ | 291/840 [18:12<33:19, 3.64s/it]
Training 1/2 epoch (loss 0.4844): 35%|ββββ | 291/840 [18:16<33:19, 3.64s/it]
Training 1/2 epoch (loss 0.4844): 35%|ββββ | 292/840 [18:16<36:47, 4.03s/it]
Training 1/2 epoch (loss 0.6523): 35%|ββββ | 292/840 [18:21<36:47, 4.03s/it]
Training 1/2 epoch (loss 0.6523): 35%|ββββ | 293/840 [18:21<37:39, 4.13s/it]
Training 1/2 epoch (loss 0.5547): 35%|ββββ | 293/840 [18:24<37:39, 4.13s/it]
Training 1/2 epoch (loss 0.5547): 35%|ββββ | 294/840 [18:24<34:43, 3.82s/it]
Training 1/2 epoch (loss 0.5625): 35%|ββββ | 294/840 [18:27<34:43, 3.82s/it]
Training 1/2 epoch (loss 0.5625): 35%|ββββ | 295/840 [18:27<31:44, 3.49s/it]
Training 1/2 epoch (loss 0.5117): 35%|ββββ | 295/840 [18:30<31:44, 3.49s/it]
Training 1/2 epoch (loss 0.5117): 35%|ββββ | 296/840 [18:30<31:47, 3.51s/it]
Training 1/2 epoch (loss 0.5625): 35%|ββββ | 296/840 [18:34<31:47, 3.51s/it]
Training 1/2 epoch (loss 0.5625): 35%|ββββ | 297/840 [18:34<32:10, 3.56s/it]
Training 1/2 epoch (loss 0.4434): 35%|ββββ | 297/840 [18:38<32:10, 3.56s/it]
Training 1/2 epoch (loss 0.4434): 35%|ββββ | 298/840 [18:38<32:44, 3.63s/it]
Training 1/2 epoch (loss 0.5547): 35%|ββββ | 298/840 [18:41<32:44, 3.63s/it]
Training 1/2 epoch (loss 0.5547): 36%|ββββ | 299/840 [18:41<32:18, 3.58s/it]
Training 1/2 epoch (loss 0.5938): 36%|ββββ | 299/840 [18:44<32:18, 3.58s/it]
Training 1/2 epoch (loss 0.5938): 36%|ββββ | 300/840 [18:44<30:37, 3.40s/it]
Training 1/2 epoch (loss 0.5742): 36%|ββββ | 300/840 [18:49<30:37, 3.40s/it]
Training 1/2 epoch (loss 0.5742): 36%|ββββ | 301/840 [18:49<34:22, 3.83s/it]
Training 1/2 epoch (loss 0.5820): 36%|ββββ | 301/840 [18:53<34:22, 3.83s/it]
Training 1/2 epoch (loss 0.5820): 36%|ββββ | 302/840 [18:53<34:19, 3.83s/it]
Training 1/2 epoch (loss 0.4883): 36%|ββββ | 302/840 [18:58<34:19, 3.83s/it]
Training 1/2 epoch (loss 0.4883): 36%|ββββ | 303/840 [18:58<38:48, 4.34s/it]
Training 1/2 epoch (loss 0.6367): 36%|ββββ | 303/840 [19:02<38:48, 4.34s/it]
Training 1/2 epoch (loss 0.6367): 36%|ββββ | 304/840 [19:02<36:01, 4.03s/it]
Training 1/2 epoch (loss 0.5156): 36%|ββββ | 304/840 [19:05<36:01, 4.03s/it]
Training 1/2 epoch (loss 0.5156): 36%|ββββ | 305/840 [19:05<34:55, 3.92s/it]
Training 1/2 epoch (loss 0.6445): 36%|ββββ | 305/840 [19:10<34:55, 3.92s/it]
Training 1/2 epoch (loss 0.6445): 36%|ββββ | 306/840 [19:10<37:38, 4.23s/it]
Training 1/2 epoch (loss 0.4141): 36%|ββββ | 306/840 [19:14<37:38, 4.23s/it]
Training 1/2 epoch (loss 0.4141): 37%|ββββ | 307/840 [19:14<35:52, 4.04s/it]
Training 1/2 epoch (loss 0.4805): 37%|ββββ | 307/840 [19:17<35:52, 4.04s/it]
Training 1/2 epoch (loss 0.4805): 37%|ββββ | 308/840 [19:17<33:45, 3.81s/it]
Training 1/2 epoch (loss 0.5547): 37%|ββββ | 308/840 [19:21<33:45, 3.81s/it]
Training 1/2 epoch (loss 0.5547): 37%|ββββ | 309/840 [19:21<33:24, 3.77s/it]
Training 1/2 epoch (loss 0.5039): 37%|ββββ | 309/840 [19:24<33:24, 3.77s/it]
Training 1/2 epoch (loss 0.5039): 37%|ββββ | 310/840 [19:24<31:46, 3.60s/it]
Training 1/2 epoch (loss 0.5312): 37%|ββββ | 310/840 [19:28<31:46, 3.60s/it]
Training 1/2 epoch (loss 0.5312): 37%|ββββ | 311/840 [19:28<31:44, 3.60s/it]
Training 1/2 epoch (loss 0.4941): 37%|ββββ | 311/840 [19:31<31:44, 3.60s/it]
Training 1/2 epoch (loss 0.4941): 37%|ββββ | 312/840 [19:31<31:02, 3.53s/it]
Training 1/2 epoch (loss 0.5547): 37%|ββββ | 312/840 [19:33<31:02, 3.53s/it]
Training 1/2 epoch (loss 0.5547): 37%|ββββ | 313/840 [19:33<28:15, 3.22s/it]
Training 1/2 epoch (loss 0.3809): 37%|ββββ | 313/840 [19:38<28:15, 3.22s/it]
Training 1/2 epoch (loss 0.3809): 37%|ββββ | 314/840 [19:38<30:37, 3.49s/it]
Training 1/2 epoch (loss 0.5078): 37%|ββββ | 314/840 [19:42<30:37, 3.49s/it]
Training 1/2 epoch (loss 0.5078): 38%|ββββ | 315/840 [19:42<32:55, 3.76s/it]
Training 1/2 epoch (loss 0.5469): 38%|ββββ | 315/840 [19:45<32:55, 3.76s/it]
Training 1/2 epoch (loss 0.5469): 38%|ββββ | 316/840 [19:45<32:19, 3.70s/it]
Training 1/2 epoch (loss 0.5742): 38%|ββββ | 316/840 [19:50<32:19, 3.70s/it]
Training 1/2 epoch (loss 0.5742): 38%|ββββ | 317/840 [19:50<33:53, 3.89s/it]
Training 1/2 epoch (loss 0.6328): 38%|ββββ | 317/840 [19:53<33:53, 3.89s/it]
Training 1/2 epoch (loss 0.6328): 38%|ββββ | 318/840 [19:53<31:55, 3.67s/it]
Training 1/2 epoch (loss 0.4883): 38%|ββββ | 318/840 [19:58<31:55, 3.67s/it]
Training 1/2 epoch (loss 0.4883): 38%|ββββ | 319/840 [19:58<36:24, 4.19s/it]
Training 1/2 epoch (loss 0.5000): 38%|ββββ | 319/840 [20:01<36:24, 4.19s/it]
Training 1/2 epoch (loss 0.5000): 38%|ββββ | 320/840 [20:01<33:04, 3.82s/it]
Training 1/2 epoch (loss 0.5078): 38%|ββββ | 320/840 [20:07<33:04, 3.82s/it]
Training 1/2 epoch (loss 0.5078): 38%|ββββ | 321/840 [20:07<37:00, 4.28s/it]
Training 1/2 epoch (loss 0.4727): 38%|ββββ | 321/840 [20:11<37:00, 4.28s/it]
Training 1/2 epoch (loss 0.4727): 38%|ββββ | 322/840 [20:11<37:27, 4.34s/it]
Training 1/2 epoch (loss 0.5898): 38%|ββββ | 322/840 [20:16<37:27, 4.34s/it]
Training 1/2 epoch (loss 0.5898): 38%|ββββ | 323/840 [20:16<38:02, 4.41s/it]
Training 1/2 epoch (loss 0.5977): 38%|ββββ | 323/840 [20:19<38:02, 4.41s/it]
Training 1/2 epoch (loss 0.5977): 39%|ββββ | 324/840 [20:19<35:35, 4.14s/it]
Training 1/2 epoch (loss 0.5742): 39%|ββββ | 324/840 [20:22<35:35, 4.14s/it]
Training 1/2 epoch (loss 0.5742): 39%|ββββ | 325/840 [20:22<32:21, 3.77s/it]
Training 1/2 epoch (loss 0.5938): 39%|ββββ | 325/840 [20:27<32:21, 3.77s/it]
Training 1/2 epoch (loss 0.5938): 39%|ββββ | 326/840 [20:27<34:49, 4.06s/it]
Training 1/2 epoch (loss 0.4590): 39%|ββββ | 326/840 [20:32<34:49, 4.06s/it]
Training 1/2 epoch (loss 0.4590): 39%|ββββ | 327/840 [20:32<38:13, 4.47s/it]
Training 1/2 epoch (loss 0.4297): 39%|ββββ | 327/840 [20:36<38:13, 4.47s/it]
Training 1/2 epoch (loss 0.4297): 39%|ββββ | 328/840 [20:36<36:26, 4.27s/it]
Training 1/2 epoch (loss 0.4004): 39%|ββββ | 328/840 [20:40<36:26, 4.27s/it]
Training 1/2 epoch (loss 0.4004): 39%|ββββ | 329/840 [20:40<34:04, 4.00s/it]
Training 1/2 epoch (loss 0.4961): 39%|ββββ | 329/840 [20:43<34:04, 4.00s/it]
Training 1/2 epoch (loss 0.4961): 39%|ββββ | 330/840 [20:43<31:47, 3.74s/it]
Training 1/2 epoch (loss 0.4863): 39%|ββββ | 330/840 [20:46<31:47, 3.74s/it]
Training 1/2 epoch (loss 0.4863): 39%|ββββ | 331/840 [20:46<30:15, 3.57s/it]
Training 1/2 epoch (loss 0.5859): 39%|ββββ | 331/840 [20:48<30:15, 3.57s/it]
Training 1/2 epoch (loss 0.5859): 40%|ββββ | 332/840 [20:48<27:50, 3.29s/it]
Training 1/2 epoch (loss 0.5859): 40%|ββββ | 332/840 [20:54<27:50, 3.29s/it]
Training 1/2 epoch (loss 0.5859): 40%|ββββ | 333/840 [20:54<33:17, 3.94s/it]
Training 1/2 epoch (loss 0.6484): 40%|ββββ | 333/840 [20:59<33:17, 3.94s/it]
Training 1/2 epoch (loss 0.6484): 40%|ββββ | 334/840 [20:59<37:23, 4.43s/it]
Training 1/2 epoch (loss 0.5078): 40%|ββββ | 334/840 [21:05<37:23, 4.43s/it]
Training 1/2 epoch (loss 0.5078): 40%|ββββ | 335/840 [21:05<40:00, 4.75s/it]
Training 1/2 epoch (loss 0.4844): 40%|ββββ | 335/840 [21:09<40:00, 4.75s/it]
Training 1/2 epoch (loss 0.4844): 40%|ββββ | 336/840 [21:09<36:58, 4.40s/it]
Training 1/2 epoch (loss 0.6250): 40%|ββββ | 336/840 [21:12<36:58, 4.40s/it]
Training 1/2 epoch (loss 0.6250): 40%|ββββ | 337/840 [21:12<35:15, 4.21s/it]
Training 1/2 epoch (loss 0.5781): 40%|ββββ | 337/840 [21:15<35:15, 4.21s/it]
Training 1/2 epoch (loss 0.5781): 40%|ββββ | 338/840 [21:15<32:30, 3.89s/it]
Training 1/2 epoch (loss 0.4727): 40%|ββββ | 338/840 [21:19<32:30, 3.89s/it]
Training 1/2 epoch (loss 0.4727): 40%|ββββ | 339/840 [21:19<31:08, 3.73s/it]
Training 1/2 epoch (loss 0.5820): 40%|ββββ | 339/840 [21:23<31:08, 3.73s/it]
Training 1/2 epoch (loss 0.5820): 40%|ββββ | 340/840 [21:23<31:56, 3.83s/it]
Training 1/2 epoch (loss 0.4844): 40%|ββββ | 340/840 [21:26<31:56, 3.83s/it]
Training 1/2 epoch (loss 0.4844): 41%|ββββ | 341/840 [21:26<30:07, 3.62s/it]
Training 1/2 epoch (loss 0.5938): 41%|ββββ | 341/840 [21:30<30:07, 3.62s/it]
Training 1/2 epoch (loss 0.5938): 41%|ββββ | 342/840 [21:30<30:40, 3.69s/it]
Training 1/2 epoch (loss 0.6602): 41%|ββββ | 342/840 [21:33<30:40, 3.69s/it]
Training 1/2 epoch (loss 0.6602): 41%|ββββ | 343/840 [21:33<28:57, 3.50s/it]
Training 1/2 epoch (loss 0.5352): 41%|ββββ | 343/840 [21:38<28:57, 3.50s/it]
Training 1/2 epoch (loss 0.5352): 41%|ββββ | 344/840 [21:38<31:36, 3.82s/it]
Training 1/2 epoch (loss 0.6094): 41%|ββββ | 344/840 [21:40<31:36, 3.82s/it]
Training 1/2 epoch (loss 0.6094): 41%|ββββ | 345/840 [21:40<28:35, 3.47s/it]
Training 1/2 epoch (loss 0.5156): 41%|ββββ | 345/840 [21:43<28:35, 3.47s/it]
Training 1/2 epoch (loss 0.5156): 41%|ββββ | 346/840 [21:43<26:27, 3.21s/it]
Training 1/2 epoch (loss 0.5039): 41%|ββββ | 346/840 [21:45<26:27, 3.21s/it]
Training 1/2 epoch (loss 0.5039): 41%|βββββ | 347/840 [21:45<25:09, 3.06s/it]
Training 1/2 epoch (loss 0.4668): 41%|βββββ | 347/840 [21:49<25:09, 3.06s/it]
Training 1/2 epoch (loss 0.4668): 41%|βββββ | 348/840 [21:49<25:40, 3.13s/it]
Training 1/2 epoch (loss 0.5781): 41%|βββββ | 348/840 [21:53<25:40, 3.13s/it]
Training 1/2 epoch (loss 0.5781): 42%|βββββ | 349/840 [21:53<27:48, 3.40s/it]
Training 1/2 epoch (loss 0.4492): 42%|βββββ | 349/840 [21:56<27:48, 3.40s/it]
Training 1/2 epoch (loss 0.4492): 42%|βββββ | 350/840 [21:56<26:15, 3.22s/it]
Training 1/2 epoch (loss 0.4473): 42%|βββββ | 350/840 [22:01<26:15, 3.22s/it]
Training 1/2 epoch (loss 0.4473): 42%|βββββ | 351/840 [22:01<31:38, 3.88s/it]
Training 1/2 epoch (loss 0.4844): 42%|βββββ | 351/840 [22:04<31:38, 3.88s/it]
Training 1/2 epoch (loss 0.4844): 42%|βββββ | 352/840 [22:04<29:36, 3.64s/it]
Training 1/2 epoch (loss 0.4922): 42%|βββββ | 352/840 [22:08<29:36, 3.64s/it]
Training 1/2 epoch (loss 0.4922): 42%|βββββ | 353/840 [22:08<29:36, 3.65s/it]
Training 1/2 epoch (loss 0.5430): 42%|βββββ | 353/840 [22:11<29:36, 3.65s/it]
Training 1/2 epoch (loss 0.5430): 42%|βββββ | 354/840 [22:11<29:15, 3.61s/it]
Training 1/2 epoch (loss 0.5469): 42%|βββββ | 354/840 [22:14<29:15, 3.61s/it]
Training 1/2 epoch (loss 0.5469): 42%|βββββ | 355/840 [22:14<28:07, 3.48s/it]
Training 1/2 epoch (loss 0.5039): 42%|βββββ | 355/840 [22:18<28:07, 3.48s/it]
Training 1/2 epoch (loss 0.5039): 42%|βββββ | 356/840 [22:18<29:20, 3.64s/it]
Training 1/2 epoch (loss 0.4473): 42%|βββββ | 356/840 [22:23<29:20, 3.64s/it]
Training 1/2 epoch (loss 0.4473): 42%|βββββ | 357/840 [22:23<32:16, 4.01s/it]
Training 1/2 epoch (loss 0.5156): 42%|βββββ | 357/840 [22:28<32:16, 4.01s/it]
Training 1/2 epoch (loss 0.5156): 43%|βββββ | 358/840 [22:28<33:35, 4.18s/it]
Training 1/2 epoch (loss 0.4766): 43%|βββββ | 358/840 [22:33<33:35, 4.18s/it]
Training 1/2 epoch (loss 0.4766): 43%|βββββ | 359/840 [22:33<36:34, 4.56s/it]
Training 1/2 epoch (loss 0.5312): 43%|βββββ | 359/840 [22:37<36:34, 4.56s/it]
Training 1/2 epoch (loss 0.5312): 43%|βββββ | 360/840 [22:37<35:15, 4.41s/it]
Training 1/2 epoch (loss 0.4395): 43%|βββββ | 360/840 [22:41<35:15, 4.41s/it]
Training 1/2 epoch (loss 0.4395): 43%|βββββ | 361/840 [22:41<34:10, 4.28s/it]
Training 1/2 epoch (loss 0.4766): 43%|βββββ | 361/840 [22:44<34:10, 4.28s/it]
Training 1/2 epoch (loss 0.4766): 43%|βββββ | 362/840 [22:44<30:47, 3.86s/it]
Training 1/2 epoch (loss 0.4785): 43%|βββββ | 362/840 [22:48<30:47, 3.86s/it]
Training 1/2 epoch (loss 0.4785): 43%|βββββ | 363/840 [22:48<29:54, 3.76s/it]
Training 1/2 epoch (loss 0.4141): 43%|βββββ | 363/840 [22:53<29:54, 3.76s/it]
Training 1/2 epoch (loss 0.4141): 43%|βββββ | 364/840 [22:53<33:02, 4.17s/it]
Training 1/2 epoch (loss 0.5547): 43%|βββββ | 364/840 [22:57<33:02, 4.17s/it]
Training 1/2 epoch (loss 0.5547): 43%|βββββ | 365/840 [22:57<31:44, 4.01s/it]
Training 1/2 epoch (loss 0.5703): 43%|βββββ | 365/840 [23:02<31:44, 4.01s/it]
Training 1/2 epoch (loss 0.5703): 44%|βββββ | 366/840 [23:02<35:11, 4.45s/it]
Training 1/2 epoch (loss 0.3711): 44%|βββββ | 366/840 [23:06<35:11, 4.45s/it]
Training 1/2 epoch (loss 0.3711): 44%|βββββ | 367/840 [23:06<32:59, 4.18s/it]
Training 1/2 epoch (loss 0.5078): 44%|βββββ | 367/840 [23:10<32:59, 4.18s/it]
Training 1/2 epoch (loss 0.5078): 44%|βββββ | 368/840 [23:10<33:55, 4.31s/it]
Training 1/2 epoch (loss 0.6055): 44%|βββββ | 368/840 [23:13<33:55, 4.31s/it]
Training 1/2 epoch (loss 0.6055): 44%|βββββ | 369/840 [23:13<31:07, 3.97s/it]
Training 1/2 epoch (loss 0.7266): 44%|βββββ | 369/840 [23:19<31:07, 3.97s/it]
Training 1/2 epoch (loss 0.7266): 44%|βββββ | 370/840 [23:19<34:43, 4.43s/it]
Training 1/2 epoch (loss 0.6250): 44%|βββββ | 370/840 [23:24<34:43, 4.43s/it]
Training 1/2 epoch (loss 0.6250): 44%|βββββ | 371/840 [23:24<37:04, 4.74s/it]
Training 1/2 epoch (loss 0.3438): 44%|βββββ | 371/840 [23:30<37:04, 4.74s/it]
Training 1/2 epoch (loss 0.3438): 44%|βββββ | 372/840 [23:30<38:45, 4.97s/it]
Training 1/2 epoch (loss 0.4219): 44%|βββββ | 372/840 [23:33<38:45, 4.97s/it]
Training 1/2 epoch (loss 0.4219): 44%|βββββ | 373/840 [23:33<35:04, 4.51s/it]
Training 1/2 epoch (loss 0.4883): 44%|βββββ | 373/840 [23:37<35:04, 4.51s/it]
Training 1/2 epoch (loss 0.4883): 45%|βββββ | 374/840 [23:37<32:47, 4.22s/it]
Training 1/2 epoch (loss 0.5078): 45%|βββββ | 374/840 [23:41<32:47, 4.22s/it]
Training 1/2 epoch (loss 0.5078): 45%|βββββ | 375/840 [23:41<33:26, 4.32s/it]
Training 1/2 epoch (loss 0.5469): 45%|βββββ | 375/840 [23:46<33:26, 4.32s/it]
Training 1/2 epoch (loss 0.5469): 45%|βββββ | 376/840 [23:46<34:40, 4.48s/it]
Training 1/2 epoch (loss 0.4297): 45%|βββββ | 376/840 [23:50<34:40, 4.48s/it]
Training 1/2 epoch (loss 0.4297): 45%|βββββ | 377/840 [23:50<33:26, 4.33s/it]
Training 1/2 epoch (loss 0.4062): 45%|βββββ | 377/840 [23:54<33:26, 4.33s/it]
Training 1/2 epoch (loss 0.4062): 45%|βββββ | 378/840 [23:54<32:04, 4.17s/it]
Training 1/2 epoch (loss 0.4961): 45%|βββββ | 378/840 [23:57<32:04, 4.17s/it]
Training 1/2 epoch (loss 0.4961): 45%|βββββ | 379/840 [23:57<28:38, 3.73s/it]
Training 1/2 epoch (loss 0.4512): 45%|βββββ | 379/840 [24:02<28:38, 3.73s/it]
Training 1/2 epoch (loss 0.4512): 45%|βββββ | 380/840 [24:02<31:06, 4.06s/it]
Training 1/2 epoch (loss 0.5898): 45%|βββββ | 380/840 [24:04<31:06, 4.06s/it]
Training 1/2 epoch (loss 0.5898): 45%|βββββ | 381/840 [24:04<28:12, 3.69s/it]
Training 1/2 epoch (loss 0.4219): 45%|βββββ | 381/840 [24:09<28:12, 3.69s/it]
Training 1/2 epoch (loss 0.4219): 45%|βββββ | 382/840 [24:09<30:58, 4.06s/it]
Training 1/2 epoch (loss 0.5312): 45%|βββββ | 382/840 [24:12<30:58, 4.06s/it]
Training 1/2 epoch (loss 0.5312): 46%|βββββ | 383/840 [24:12<28:55, 3.80s/it]
Training 1/2 epoch (loss 0.4570): 46%|βββββ | 383/840 [24:16<28:55, 3.80s/it]
Training 1/2 epoch (loss 0.4570): 46%|βββββ | 384/840 [24:16<28:24, 3.74s/it]
Training 1/2 epoch (loss 0.4922): 46%|βββββ | 384/840 [24:19<28:24, 3.74s/it]
Training 1/2 epoch (loss 0.4922): 46%|βββββ | 385/840 [24:19<26:48, 3.54s/it]
Training 1/2 epoch (loss 0.5703): 46%|βββββ | 385/840 [24:22<26:48, 3.54s/it]
Training 1/2 epoch (loss 0.5703): 46%|βββββ | 386/840 [24:22<25:39, 3.39s/it]
Training 1/2 epoch (loss 0.5859): 46%|βββββ | 386/840 [24:25<25:39, 3.39s/it]
Training 1/2 epoch (loss 0.5859): 46%|βββββ | 387/840 [24:25<24:14, 3.21s/it]
Training 1/2 epoch (loss 0.5078): 46%|βββββ | 387/840 [24:29<24:14, 3.21s/it]
Training 1/2 epoch (loss 0.5078): 46%|βββββ | 388/840 [24:29<26:35, 3.53s/it]
Training 1/2 epoch (loss 0.4219): 46%|βββββ | 388/840 [24:32<26:35, 3.53s/it]
Training 1/2 epoch (loss 0.4219): 46%|βββββ | 389/840 [24:32<24:53, 3.31s/it]
Training 1/2 epoch (loss 0.3906): 46%|βββββ | 389/840 [24:35<24:53, 3.31s/it]
Training 1/2 epoch (loss 0.3906): 46%|βββββ | 390/840 [24:35<24:15, 3.23s/it]
Training 1/2 epoch (loss 0.5312): 46%|βββββ | 390/840 [24:38<24:15, 3.23s/it]
Training 1/2 epoch (loss 0.5312): 47%|βββββ | 391/840 [24:38<24:00, 3.21s/it]
Training 1/2 epoch (loss 0.4570): 47%|βββββ | 391/840 [24:41<24:00, 3.21s/it]
Training 1/2 epoch (loss 0.4570): 47%|βββββ | 392/840 [24:41<22:15, 2.98s/it]
Training 1/2 epoch (loss 0.5703): 47%|βββββ | 392/840 [24:44<22:15, 2.98s/it]
Training 1/2 epoch (loss 0.5703): 47%|βββββ | 393/840 [24:44<23:22, 3.14s/it]
Training 1/2 epoch (loss 0.5117): 47%|βββββ | 393/840 [24:47<23:22, 3.14s/it]
Training 1/2 epoch (loss 0.5117): 47%|βββββ | 394/840 [24:47<23:18, 3.14s/it]
Training 1/2 epoch (loss 0.4570): 47%|βββββ | 394/840 [24:50<23:18, 3.14s/it]
Training 1/2 epoch (loss 0.4570): 47%|βββββ | 395/840 [24:50<22:05, 2.98s/it]
Training 1/2 epoch (loss 0.4512): 47%|βββββ | 395/840 [24:53<22:05, 2.98s/it]
Training 1/2 epoch (loss 0.4512): 47%|βββββ | 396/840 [24:53<22:55, 3.10s/it]
Training 1/2 epoch (loss 0.4414): 47%|βββββ | 396/840 [24:57<22:55, 3.10s/it]
Training 1/2 epoch (loss 0.4414): 47%|βββββ | 397/840 [24:57<24:40, 3.34s/it]
Training 1/2 epoch (loss 0.4180): 47%|βββββ | 397/840 [25:01<24:40, 3.34s/it]
Training 1/2 epoch (loss 0.4180): 47%|βββββ | 398/840 [25:01<24:47, 3.37s/it]
Training 1/2 epoch (loss 0.6211): 47%|βββββ | 398/840 [25:06<24:47, 3.37s/it]
Training 1/2 epoch (loss 0.6211): 48%|βββββ | 399/840 [25:06<29:22, 4.00s/it]
Training 1/2 epoch (loss 0.5000): 48%|βββββ | 399/840 [25:12<29:22, 4.00s/it]
Training 1/2 epoch (loss 0.5000): 48%|βββββ | 400/840 [25:12<32:51, 4.48s/it]
Training 1/2 epoch (loss 0.4629): 48%|βββββ | 400/840 [25:16<32:51, 4.48s/it]
Training 1/2 epoch (loss 0.4629): 48%|βββββ | 401/840 [25:16<31:37, 4.32s/it]
Training 1/2 epoch (loss 0.4375): 48%|βββββ | 401/840 [25:21<31:37, 4.32s/it]
Training 1/2 epoch (loss 0.4375): 48%|βββββ | 402/840 [25:21<34:13, 4.69s/it]
Training 1/2 epoch (loss 0.4863): 48%|βββββ | 402/840 [25:27<34:13, 4.69s/it]
Training 1/2 epoch (loss 0.4863): 48%|βββββ | 403/840 [25:27<35:50, 4.92s/it]
Training 1/2 epoch (loss 0.4688): 48%|βββββ | 403/840 [25:30<35:50, 4.92s/it]
Training 1/2 epoch (loss 0.4688): 48%|βββββ | 404/840 [25:30<31:37, 4.35s/it]
Training 1/2 epoch (loss 0.5898): 48%|βββββ | 404/840 [25:32<31:37, 4.35s/it]
Training 1/2 epoch (loss 0.5898): 48%|βββββ | 405/840 [25:32<27:29, 3.79s/it]
Training 1/2 epoch (loss 0.5586): 48%|βββββ | 405/840 [25:37<27:29, 3.79s/it]
Training 1/2 epoch (loss 0.5586): 48%|βββββ | 406/840 [25:37<29:17, 4.05s/it]
Training 1/2 epoch (loss 0.5078): 48%|βββββ | 406/840 [25:42<29:17, 4.05s/it]
Training 1/2 epoch (loss 0.5078): 48%|βββββ | 407/840 [25:42<30:35, 4.24s/it]
Training 1/2 epoch (loss 0.5000): 48%|βββββ | 407/840 [25:46<30:35, 4.24s/it]
Training 1/2 epoch (loss 0.5000): 49%|βββββ | 408/840 [25:46<31:43, 4.41s/it]
Training 1/2 epoch (loss 0.4844): 49%|βββββ | 408/840 [25:49<31:43, 4.41s/it]
Training 1/2 epoch (loss 0.4844): 49%|βββββ | 409/840 [25:49<27:51, 3.88s/it]
Training 1/2 epoch (loss 0.4004): 49%|βββββ | 409/840 [25:55<27:51, 3.88s/it]
Training 1/2 epoch (loss 0.4004): 49%|βββββ | 410/840 [25:55<31:27, 4.39s/it]
Training 1/2 epoch (loss 0.5234): 49%|βββββ | 410/840 [25:58<31:27, 4.39s/it]
Training 1/2 epoch (loss 0.5234): 49%|βββββ | 411/840 [25:58<28:14, 3.95s/it]
Training 1/2 epoch (loss 0.4961): 49%|βββββ | 411/840 [26:01<28:14, 3.95s/it]
Training 1/2 epoch (loss 0.4961): 49%|βββββ | 412/840 [26:01<26:42, 3.74s/it]
Training 1/2 epoch (loss 0.4492): 49%|βββββ | 412/840 [26:04<26:42, 3.74s/it]
Training 1/2 epoch (loss 0.4492): 49%|βββββ | 413/840 [26:04<25:13, 3.54s/it]
Training 1/2 epoch (loss 0.4668): 49%|βββββ | 413/840 [26:08<25:13, 3.54s/it]
Training 1/2 epoch (loss 0.4668): 49%|βββββ | 414/840 [26:08<26:55, 3.79s/it]
Training 1/2 epoch (loss 0.4316): 49%|βββββ | 414/840 [26:12<26:55, 3.79s/it]
Training 1/2 epoch (loss 0.4316): 49%|βββββ | 415/840 [26:12<27:47, 3.92s/it]
Training 1/2 epoch (loss 0.5195): 49%|βββββ | 415/840 [26:17<27:47, 3.92s/it]
Training 1/2 epoch (loss 0.5195): 50%|βββββ | 416/840 [26:17<29:24, 4.16s/it]
Training 1/2 epoch (loss 0.4922): 50%|βββββ | 416/840 [26:21<29:24, 4.16s/it]
Training 1/2 epoch (loss 0.4922): 50%|βββββ | 417/840 [26:21<27:36, 3.92s/it]
Training 1/2 epoch (loss 0.4082): 50%|βββββ | 417/840 [26:24<27:36, 3.92s/it]
Training 1/2 epoch (loss 0.4082): 50%|βββββ | 418/840 [26:24<27:16, 3.88s/it]
Training 1/2 epoch (loss 0.5820): 50%|βββββ | 418/840 [26:27<27:16, 3.88s/it]
Training 1/2 epoch (loss 0.5820): 50%|βββββ | 419/840 [26:27<25:34, 3.64s/it]
Training 1/2 epoch (loss 0.4297): 50%|βββββ | 419/840 [26:31<25:34, 3.64s/it]
Training 1/2 epoch (loss 0.4297): 50%|βββββ | 420/840 [26:31<25:35, 3.66s/it]
Training 2/2 epoch (loss 0.5664): 50%|βββββ | 420/840 [26:34<25:35, 3.66s/it]
Training 2/2 epoch (loss 0.5664): 50%|βββββ | 421/840 [26:34<23:03, 3.30s/it]
Training 2/2 epoch (loss 0.4414): 50%|βββββ | 421/840 [26:37<23:03, 3.30s/it]
Training 2/2 epoch (loss 0.4414): 50%|βββββ | 422/840 [26:37<23:57, 3.44s/it]
Training 2/2 epoch (loss 0.4531): 50%|βββββ | 422/840 [26:40<23:57, 3.44s/it]
Training 2/2 epoch (loss 0.4531): 50%|βββββ | 423/840 [26:40<22:54, 3.30s/it]
Training 2/2 epoch (loss 0.4219): 50%|βββββ | 423/840 [26:43<22:54, 3.30s/it]
Training 2/2 epoch (loss 0.4219): 50%|βββββ | 424/840 [26:43<22:08, 3.19s/it]
Training 2/2 epoch (loss 0.4805): 50%|βββββ | 424/840 [26:48<22:08, 3.19s/it]
Training 2/2 epoch (loss 0.4805): 51%|βββββ | 425/840 [26:48<24:47, 3.58s/it]
Training 2/2 epoch (loss 0.4824): 51%|βββββ | 425/840 [26:51<24:47, 3.58s/it]
Training 2/2 epoch (loss 0.4824): 51%|βββββ | 426/840 [26:51<24:30, 3.55s/it]
Training 2/2 epoch (loss 0.5469): 51%|βββββ | 426/840 [26:55<24:30, 3.55s/it]
Training 2/2 epoch (loss 0.5469): 51%|βββββ | 427/840 [26:55<25:04, 3.64s/it]
Training 2/2 epoch (loss 0.4316): 51%|βββββ | 427/840 [27:01<25:04, 3.64s/it]
Training 2/2 epoch (loss 0.4316): 51%|βββββ | 428/840 [27:01<28:56, 4.22s/it]
Training 2/2 epoch (loss 0.5508): 51%|βββββ | 428/840 [27:05<28:56, 4.22s/it]
Training 2/2 epoch (loss 0.5508): 51%|βββββ | 429/840 [27:05<28:23, 4.14s/it]
Training 2/2 epoch (loss 0.4492): 51%|βββββ | 429/840 [27:08<28:23, 4.14s/it]
Training 2/2 epoch (loss 0.4492): 51%|βββββ | 430/840 [27:08<26:10, 3.83s/it]
Training 2/2 epoch (loss 0.6016): 51%|βββββ | 430/840 [27:10<26:10, 3.83s/it]
Training 2/2 epoch (loss 0.6016): 51%|ββββββ | 431/840 [27:10<23:31, 3.45s/it]
Training 2/2 epoch (loss 0.4941): 51%|ββββββ | 431/840 [27:14<23:31, 3.45s/it]
Training 2/2 epoch (loss 0.4941): 51%|ββββββ | 432/840 [27:14<24:52, 3.66s/it]
Training 2/2 epoch (loss 0.5391): 51%|ββββββ | 432/840 [27:17<24:52, 3.66s/it]
Training 2/2 epoch (loss 0.5391): 52%|ββββββ | 433/840 [27:17<22:42, 3.35s/it]
Training 2/2 epoch (loss 0.5312): 52%|ββββββ | 433/840 [27:21<22:42, 3.35s/it]
Training 2/2 epoch (loss 0.5312): 52%|ββββββ | 434/840 [27:21<23:05, 3.41s/it]
Training 2/2 epoch (loss 0.3555): 52%|ββββββ | 434/840 [27:23<23:05, 3.41s/it]
Training 2/2 epoch (loss 0.3555): 52%|ββββββ | 435/840 [27:23<22:00, 3.26s/it]
Training 2/2 epoch (loss 0.4941): 52%|ββββββ | 435/840 [27:27<22:00, 3.26s/it]
Training 2/2 epoch (loss 0.4941): 52%|ββββββ | 436/840 [27:27<22:26, 3.33s/it]
Training 2/2 epoch (loss 0.6172): 52%|ββββββ | 436/840 [27:30<22:26, 3.33s/it]
Training 2/2 epoch (loss 0.6172): 52%|ββββββ | 437/840 [27:30<21:27, 3.20s/it]
Training 2/2 epoch (loss 0.4707): 52%|ββββββ | 437/840 [27:33<21:27, 3.20s/it]
Training 2/2 epoch (loss 0.4707): 52%|ββββββ | 438/840 [27:33<21:37, 3.23s/it]
Training 2/2 epoch (loss 0.4922): 52%|ββββββ | 438/840 [27:37<21:37, 3.23s/it]
Training 2/2 epoch (loss 0.4922): 52%|ββββββ | 439/840 [27:37<23:33, 3.53s/it]
Training 2/2 epoch (loss 0.5039): 52%|ββββββ | 439/840 [27:41<23:33, 3.53s/it]
Training 2/2 epoch (loss 0.5039): 52%|ββββββ | 440/840 [27:41<24:07, 3.62s/it]
Training 2/2 epoch (loss 0.5352): 52%|ββββββ | 440/840 [27:45<24:07, 3.62s/it]
Training 2/2 epoch (loss 0.5352): 52%|ββββββ | 441/840 [27:45<24:06, 3.62s/it]
Training 2/2 epoch (loss 0.5273): 52%|ββββββ | 441/840 [27:48<24:06, 3.62s/it]
Training 2/2 epoch (loss 0.5273): 53%|ββββββ | 442/840 [27:48<22:18, 3.36s/it]
Training 2/2 epoch (loss 0.4746): 53%|ββββββ | 442/840 [27:50<22:18, 3.36s/it]
Training 2/2 epoch (loss 0.4746): 53%|ββββββ | 443/840 [27:50<20:57, 3.17s/it]
Training 2/2 epoch (loss 0.4648): 53%|ββββββ | 443/840 [27:54<20:57, 3.17s/it]
Training 2/2 epoch (loss 0.4648): 53%|ββββββ | 444/840 [27:54<22:08, 3.36s/it]
Training 2/2 epoch (loss 0.4609): 53%|ββββββ | 444/840 [27:59<22:08, 3.36s/it]
Training 2/2 epoch (loss 0.4609): 53%|ββββββ | 445/840 [27:59<24:50, 3.77s/it]
Training 2/2 epoch (loss 0.4258): 53%|ββββββ | 445/840 [28:04<24:50, 3.77s/it]
Training 2/2 epoch (loss 0.4258): 53%|ββββββ | 446/840 [28:04<27:11, 4.14s/it]
Training 2/2 epoch (loss 0.3730): 53%|ββββββ | 446/840 [28:07<27:11, 4.14s/it]
Training 2/2 epoch (loss 0.3730): 53%|ββββββ | 447/840 [28:07<25:43, 3.93s/it]
Training 2/2 epoch (loss 0.3203): 53%|ββββββ | 447/840 [28:11<25:43, 3.93s/it]
Training 2/2 epoch (loss 0.3203): 53%|ββββββ | 448/840 [28:11<24:31, 3.75s/it]
Training 2/2 epoch (loss 0.3047): 53%|ββββββ | 448/840 [28:13<24:31, 3.75s/it]
Training 2/2 epoch (loss 0.3047): 53%|ββββββ | 449/840 [28:13<22:02, 3.38s/it]
Training 2/2 epoch (loss 0.3945): 53%|ββββββ | 449/840 [28:18<22:02, 3.38s/it]
Training 2/2 epoch (loss 0.3945): 54%|ββββββ | 450/840 [28:18<25:03, 3.86s/it]
Training 2/2 epoch (loss 0.4473): 54%|ββββββ | 450/840 [28:23<25:03, 3.86s/it]
Training 2/2 epoch (loss 0.4473): 54%|ββββββ | 451/840 [28:23<26:28, 4.08s/it]
Training 2/2 epoch (loss 0.4277): 54%|ββββββ | 451/840 [28:26<26:28, 4.08s/it]
Training 2/2 epoch (loss 0.4277): 54%|ββββββ | 452/840 [28:26<25:18, 3.91s/it]
Training 2/2 epoch (loss 0.3359): 54%|ββββββ | 452/840 [28:32<25:18, 3.91s/it]
Training 2/2 epoch (loss 0.3359): 54%|ββββββ | 453/840 [28:32<28:06, 4.36s/it]
Training 2/2 epoch (loss 0.2773): 54%|ββββββ | 453/840 [28:35<28:06, 4.36s/it]
Training 2/2 epoch (loss 0.2773): 54%|ββββββ | 454/840 [28:35<26:16, 4.08s/it]
Training 2/2 epoch (loss 0.2598): 54%|ββββββ | 454/840 [28:40<26:16, 4.08s/it]
Training 2/2 epoch (loss 0.2598): 54%|ββββββ | 455/840 [28:40<27:08, 4.23s/it]
Training 2/2 epoch (loss 0.3320): 54%|ββββββ | 455/840 [28:43<27:08, 4.23s/it]
Training 2/2 epoch (loss 0.3320): 54%|ββββββ | 456/840 [28:43<25:50, 4.04s/it]
Training 2/2 epoch (loss 0.2656): 54%|ββββββ | 456/840 [28:46<25:50, 4.04s/it]
Training 2/2 epoch (loss 0.2656): 54%|ββββββ | 457/840 [28:46<24:06, 3.78s/it]
Training 2/2 epoch (loss 0.3066): 54%|ββββββ | 457/840 [28:50<24:06, 3.78s/it]
Training 2/2 epoch (loss 0.3066): 55%|ββββββ | 458/840 [28:50<22:49, 3.58s/it]
Training 2/2 epoch (loss 0.2559): 55%|ββββββ | 458/840 [28:54<22:49, 3.58s/it]
Training 2/2 epoch (loss 0.2559): 55%|ββββββ | 459/840 [28:54<24:53, 3.92s/it]
Training 2/2 epoch (loss 0.1855): 55%|ββββββ | 459/840 [28:59<24:53, 3.92s/it]
Training 2/2 epoch (loss 0.1855): 55%|ββββββ | 460/840 [28:59<27:09, 4.29s/it]
Training 2/2 epoch (loss 0.2266): 55%|ββββββ | 460/840 [29:04<27:09, 4.29s/it]
Training 2/2 epoch (loss 0.2266): 55%|ββββββ | 461/840 [29:04<27:15, 4.31s/it]
Training 2/2 epoch (loss 0.1865): 55%|ββββββ | 461/840 [29:07<27:15, 4.31s/it]
Training 2/2 epoch (loss 0.1865): 55%|ββββββ | 462/840 [29:07<25:49, 4.10s/it]
Training 2/2 epoch (loss 0.2949): 55%|ββββββ | 462/840 [29:12<25:49, 4.10s/it]
Training 2/2 epoch (loss 0.2949): 55%|ββββββ | 463/840 [29:12<26:23, 4.20s/it]
Training 2/2 epoch (loss 0.4043): 55%|ββββββ | 463/840 [29:16<26:23, 4.20s/it]
Training 2/2 epoch (loss 0.4043): 55%|ββββββ | 464/840 [29:16<26:14, 4.19s/it]
Training 2/2 epoch (loss 0.2559): 55%|ββββββ | 464/840 [29:19<26:14, 4.19s/it]
Training 2/2 epoch (loss 0.2559): 55%|ββββββ | 465/840 [29:19<23:08, 3.70s/it]
Training 2/2 epoch (loss 0.3398): 55%|ββββββ | 465/840 [29:24<23:08, 3.70s/it]
Training 2/2 epoch (loss 0.3398): 55%|ββββββ | 466/840 [29:24<26:36, 4.27s/it]
Training 2/2 epoch (loss 0.3789): 55%|ββββββ | 466/840 [29:28<26:36, 4.27s/it]
Training 2/2 epoch (loss 0.3789): 56%|ββββββ | 467/840 [29:28<25:31, 4.11s/it]
Training 2/2 epoch (loss 0.1738): 56%|ββββββ | 467/840 [29:31<25:31, 4.11s/it]
Training 2/2 epoch (loss 0.1738): 56%|ββββββ | 468/840 [29:31<23:29, 3.79s/it]
Training 2/2 epoch (loss 0.5859): 56%|ββββββ | 468/840 [29:34<23:29, 3.79s/it]
Training 2/2 epoch (loss 0.5859): 56%|ββββββ | 469/840 [29:34<21:13, 3.43s/it]
Training 2/2 epoch (loss 0.4824): 56%|ββββββ | 469/840 [29:36<21:13, 3.43s/it]
Training 2/2 epoch (loss 0.4824): 56%|ββββββ | 470/840 [29:36<19:50, 3.22s/it]
Training 2/2 epoch (loss 0.4863): 56%|ββββββ | 470/840 [29:40<19:50, 3.22s/it]
Training 2/2 epoch (loss 0.4863): 56%|ββββββ | 471/840 [29:40<21:26, 3.49s/it]
Training 2/2 epoch (loss 0.3633): 56%|ββββββ | 471/840 [29:44<21:26, 3.49s/it]
Training 2/2 epoch (loss 0.3633): 56%|ββββββ | 472/840 [29:44<20:53, 3.41s/it]
Training 2/2 epoch (loss 0.1660): 56%|ββββββ | 472/840 [29:46<20:53, 3.41s/it]
Training 2/2 epoch (loss 0.1660): 56%|ββββββ | 473/840 [29:46<19:55, 3.26s/it]
Training 2/2 epoch (loss 0.2852): 56%|ββββββ | 473/840 [29:50<19:55, 3.26s/it]
Training 2/2 epoch (loss 0.2852): 56%|ββββββ | 474/840 [29:50<19:38, 3.22s/it]
Training 2/2 epoch (loss 0.2109): 56%|ββββββ | 474/840 [29:53<19:38, 3.22s/it]
Training 2/2 epoch (loss 0.2109): 57%|ββββββ | 475/840 [29:53<19:59, 3.29s/it]
Training 2/2 epoch (loss 0.3262): 57%|ββββββ | 475/840 [29:56<19:59, 3.29s/it]
Training 2/2 epoch (loss 0.3262): 57%|ββββββ | 476/840 [29:56<19:39, 3.24s/it]
Training 2/2 epoch (loss 0.1973): 57%|ββββββ | 476/840 [29:59<19:39, 3.24s/it]
Training 2/2 epoch (loss 0.1973): 57%|ββββββ | 477/840 [29:59<19:10, 3.17s/it]
Training 2/2 epoch (loss 0.1631): 57%|ββββββ | 477/840 [30:03<19:10, 3.17s/it]
Training 2/2 epoch (loss 0.1631): 57%|ββββββ | 478/840 [30:03<19:47, 3.28s/it]
Training 2/2 epoch (loss 0.2734): 57%|ββββββ | 478/840 [30:07<19:47, 3.28s/it]
Training 2/2 epoch (loss 0.2734): 57%|ββββββ | 479/840 [30:07<22:11, 3.69s/it]
Training 2/2 epoch (loss 0.2119): 57%|ββββββ | 479/840 [30:13<22:11, 3.69s/it]
Training 2/2 epoch (loss 0.2119): 57%|ββββββ | 480/840 [30:13<25:15, 4.21s/it]
Training 2/2 epoch (loss 0.1836): 57%|ββββββ | 480/840 [30:16<25:15, 4.21s/it]
Training 2/2 epoch (loss 0.1836): 57%|ββββββ | 481/840 [30:16<23:22, 3.91s/it]
Training 2/2 epoch (loss 0.3164): 57%|ββββββ | 481/840 [30:20<23:22, 3.91s/it]
Training 2/2 epoch (loss 0.3164): 57%|ββββββ | 482/840 [30:20<23:15, 3.90s/it]
Training 2/2 epoch (loss 0.3164): 57%|ββββββ | 482/840 [30:25<23:15, 3.90s/it]
Training 2/2 epoch (loss 0.3164): 57%|ββββββ | 483/840 [30:25<25:53, 4.35s/it]
Training 2/2 epoch (loss 0.3926): 57%|ββββββ | 483/840 [30:30<25:53, 4.35s/it]
Training 2/2 epoch (loss 0.3926): 58%|ββββββ | 484/840 [30:30<26:14, 4.42s/it]
Training 2/2 epoch (loss 0.1250): 58%|ββββββ | 484/840 [30:33<26:14, 4.42s/it]
Training 2/2 epoch (loss 0.1250): 58%|ββββββ | 485/840 [30:33<23:21, 3.95s/it]
Training 2/2 epoch (loss 0.2363): 58%|ββββββ | 485/840 [30:36<23:21, 3.95s/it]
Training 2/2 epoch (loss 0.2363): 58%|ββββββ | 486/840 [30:36<21:15, 3.60s/it]
Training 2/2 epoch (loss 0.2656): 58%|ββββββ | 486/840 [30:40<21:15, 3.60s/it]
Training 2/2 epoch (loss 0.2656): 58%|ββββββ | 487/840 [30:40<21:56, 3.73s/it]
Training 2/2 epoch (loss 0.2578): 58%|ββββββ | 487/840 [30:44<21:56, 3.73s/it]
Training 2/2 epoch (loss 0.2578): 58%|ββββββ | 488/840 [30:44<22:53, 3.90s/it]
Training 2/2 epoch (loss 0.2305): 58%|ββββββ | 488/840 [30:46<22:53, 3.90s/it]
Training 2/2 epoch (loss 0.2305): 58%|ββββββ | 489/840 [30:46<20:34, 3.52s/it]
Training 2/2 epoch (loss 0.1226): 58%|ββββββ | 489/840 [30:51<20:34, 3.52s/it]
Training 2/2 epoch (loss 0.1226): 58%|ββββββ | 490/840 [30:51<21:39, 3.71s/it]
Training 2/2 epoch (loss 0.1279): 58%|ββββββ | 490/840 [30:54<21:39, 3.71s/it]
Training 2/2 epoch (loss 0.1279): 58%|ββββββ | 491/840 [30:54<20:55, 3.60s/it]
Training 2/2 epoch (loss 0.1582): 58%|ββββββ | 491/840 [31:00<20:55, 3.60s/it]
Training 2/2 epoch (loss 0.1582): 59%|ββββββ | 492/840 [31:00<24:20, 4.20s/it]
Training 2/2 epoch (loss 0.1992): 59%|ββββββ | 492/840 [31:05<24:20, 4.20s/it]
Training 2/2 epoch (loss 0.1992): 59%|ββββββ | 493/840 [31:05<26:23, 4.56s/it]
Training 2/2 epoch (loss 0.2637): 59%|ββββββ | 493/840 [31:10<26:23, 4.56s/it]
Training 2/2 epoch (loss 0.2637): 59%|ββββββ | 494/840 [31:10<26:30, 4.60s/it]
Training 2/2 epoch (loss 0.2148): 59%|ββββββ | 494/840 [31:15<26:30, 4.60s/it]
Training 2/2 epoch (loss 0.2148): 59%|ββββββ | 495/840 [31:15<27:55, 4.86s/it]
Training 2/2 epoch (loss 0.2617): 59%|ββββββ | 495/840 [31:18<27:55, 4.86s/it]
Training 2/2 epoch (loss 0.2617): 59%|ββββββ | 496/840 [31:18<24:04, 4.20s/it]
Training 2/2 epoch (loss 0.1787): 59%|ββββββ | 496/840 [31:21<24:04, 4.20s/it]
Training 2/2 epoch (loss 0.1787): 59%|ββββββ | 497/840 [31:21<22:12, 3.89s/it]
Training 2/2 epoch (loss 0.1924): 59%|ββββββ | 497/840 [31:24<22:12, 3.89s/it]
Training 2/2 epoch (loss 0.1924): 59%|ββββββ | 498/840 [31:24<20:14, 3.55s/it]
Training 2/2 epoch (loss 0.1426): 59%|ββββββ | 498/840 [31:28<20:14, 3.55s/it]
Training 2/2 epoch (loss 0.1426): 59%|ββββββ | 499/840 [31:28<21:42, 3.82s/it]
Training 2/2 epoch (loss 0.1050): 59%|ββββββ | 499/840 [31:31<21:42, 3.82s/it]
Training 2/2 epoch (loss 0.1050): 60%|ββββββ | 500/840 [31:31<20:32, 3.62s/it]
Training 2/2 epoch (loss 0.1982): 60%|ββββββ | 500/840 [31:35<20:32, 3.62s/it]
Training 2/2 epoch (loss 0.1982): 60%|ββββββ | 501/840 [31:35<19:59, 3.54s/it]
Training 2/2 epoch (loss 0.1118): 60%|ββββββ | 501/840 [31:38<19:59, 3.54s/it]
Training 2/2 epoch (loss 0.1118): 60%|ββββββ | 502/840 [31:38<19:31, 3.47s/it]
Training 2/2 epoch (loss 0.1621): 60%|ββββββ | 502/840 [31:42<19:31, 3.47s/it]
Training 2/2 epoch (loss 0.1621): 60%|ββββββ | 503/840 [31:42<21:04, 3.75s/it]
Training 2/2 epoch (loss 0.1484): 60%|ββββββ | 503/840 [31:46<21:04, 3.75s/it]
Training 2/2 epoch (loss 0.1484): 60%|ββββββ | 504/840 [31:46<21:13, 3.79s/it]
Training 2/2 epoch (loss 0.4121): 60%|ββββββ | 504/840 [31:49<21:13, 3.79s/it]
Training 2/2 epoch (loss 0.4121): 60%|ββββββ | 505/840 [31:49<19:41, 3.53s/it]
Training 2/2 epoch (loss 0.2295): 60%|ββββββ | 505/840 [31:52<19:41, 3.53s/it]
Training 2/2 epoch (loss 0.2295): 60%|ββββββ | 506/840 [31:52<18:21, 3.30s/it]
Training 2/2 epoch (loss 0.2002): 60%|ββββββ | 506/840 [31:57<18:21, 3.30s/it]
Training 2/2 epoch (loss 0.2002): 60%|ββββββ | 507/840 [31:57<21:46, 3.92s/it]
Training 2/2 epoch (loss 0.1758): 60%|ββββββ | 507/840 [32:01<21:46, 3.92s/it]
Training 2/2 epoch (loss 0.1758): 60%|ββββββ | 508/840 [32:01<21:58, 3.97s/it]
Training 2/2 epoch (loss 0.1943): 60%|ββββββ | 508/840 [32:05<21:58, 3.97s/it]
Training 2/2 epoch (loss 0.1943): 61%|ββββββ | 509/840 [32:05<21:03, 3.82s/it]
Training 2/2 epoch (loss 0.1816): 61%|ββββββ | 509/840 [32:08<21:03, 3.82s/it]
Training 2/2 epoch (loss 0.1816): 61%|ββββββ | 510/840 [32:08<20:25, 3.71s/it]
Training 2/2 epoch (loss 0.2637): 61%|ββββββ | 510/840 [32:11<20:25, 3.71s/it]
Training 2/2 epoch (loss 0.2637): 61%|ββββββ | 511/840 [32:11<18:45, 3.42s/it]
Training 2/2 epoch (loss 0.3730): 61%|ββββββ | 511/840 [32:14<18:45, 3.42s/it]
Training 2/2 epoch (loss 0.3730): 61%|ββββββ | 512/840 [32:14<18:16, 3.34s/it]
Training 2/2 epoch (loss 0.1030): 61%|ββββββ | 512/840 [32:17<18:16, 3.34s/it]
Training 2/2 epoch (loss 0.1030): 61%|ββββββ | 513/840 [32:17<17:15, 3.17s/it]
Training 2/2 epoch (loss 0.1602): 61%|ββββββ | 513/840 [32:20<17:15, 3.17s/it]
Training 2/2 epoch (loss 0.1602): 61%|ββββββ | 514/840 [32:20<17:17, 3.18s/it]
Training 2/2 epoch (loss 0.1680): 61%|ββββββ | 514/840 [32:24<17:17, 3.18s/it]
Training 2/2 epoch (loss 0.1680): 61%|βββββββ | 515/840 [32:24<18:14, 3.37s/it]
Training 2/2 epoch (loss 0.1016): 61%|βββββββ | 515/840 [32:30<18:14, 3.37s/it]
Training 2/2 epoch (loss 0.1016): 61%|βββββββ | 516/840 [32:30<21:48, 4.04s/it]
Training 2/2 epoch (loss 0.0908): 61%|βββββββ | 516/840 [32:34<21:48, 4.04s/it]
Training 2/2 epoch (loss 0.0908): 62%|βββββββ | 517/840 [32:34<21:42, 4.03s/it]
Training 2/2 epoch (loss 0.1963): 62%|βββββββ | 517/840 [32:38<21:42, 4.03s/it]
Training 2/2 epoch (loss 0.1963): 62%|βββββββ | 518/840 [32:38<21:52, 4.07s/it]
Training 2/2 epoch (loss 0.1475): 62%|βββββββ | 518/840 [32:41<21:52, 4.07s/it]
Training 2/2 epoch (loss 0.1475): 62%|βββββββ | 519/840 [32:41<19:49, 3.71s/it]
Training 2/2 epoch (loss 0.1396): 62%|βββββββ | 519/840 [32:44<19:49, 3.71s/it]
Training 2/2 epoch (loss 0.1396): 62%|βββββββ | 520/840 [32:44<18:43, 3.51s/it]
Training 2/2 epoch (loss 0.1396): 62%|βββββββ | 520/840 [32:49<18:43, 3.51s/it]
Training 2/2 epoch (loss 0.1396): 62%|βββββββ | 521/840 [32:49<21:46, 4.10s/it]
Training 2/2 epoch (loss 0.2754): 62%|βββββββ | 521/840 [32:54<21:46, 4.10s/it]
Training 2/2 epoch (loss 0.2754): 62%|βββββββ | 522/840 [32:54<22:23, 4.22s/it]
Training 2/2 epoch (loss 0.2168): 62%|βββββββ | 522/840 [32:57<22:23, 4.22s/it]
Training 2/2 epoch (loss 0.2168): 62%|βββββββ | 523/840 [32:57<20:29, 3.88s/it]
Training 2/2 epoch (loss 0.2100): 62%|βββββββ | 523/840 [33:00<20:29, 3.88s/it]
Training 2/2 epoch (loss 0.2100): 62%|βββββββ | 524/840 [33:00<20:01, 3.80s/it]
Training 2/2 epoch (loss 0.2090): 62%|βββββββ | 524/840 [33:04<20:01, 3.80s/it]
Training 2/2 epoch (loss 0.2090): 62%|βββββββ | 525/840 [33:04<20:03, 3.82s/it]
Training 2/2 epoch (loss 0.3457): 62%|βββββββ | 525/840 [33:09<20:03, 3.82s/it]
Training 2/2 epoch (loss 0.3457): 63%|βββββββ | 526/840 [33:09<21:05, 4.03s/it]
Training 2/2 epoch (loss 0.1553): 63%|βββββββ | 526/840 [33:14<21:05, 4.03s/it]
Training 2/2 epoch (loss 0.1553): 63%|βββββββ | 527/840 [33:14<23:07, 4.43s/it]
Training 2/2 epoch (loss 0.1738): 63%|βββββββ | 527/840 [33:17<23:07, 4.43s/it]
Training 2/2 epoch (loss 0.1738): 63%|βββββββ | 528/840 [33:17<21:12, 4.08s/it]
Training 2/2 epoch (loss 0.1709): 63%|βββββββ | 528/840 [33:21<21:12, 4.08s/it]
Training 2/2 epoch (loss 0.1709): 63%|βββββββ | 529/840 [33:21<19:40, 3.80s/it]
Training 2/2 epoch (loss 0.1895): 63%|βββββββ | 529/840 [33:26<19:40, 3.80s/it]
Training 2/2 epoch (loss 0.1895): 63%|βββββββ | 530/840 [33:26<22:10, 4.29s/it]
Training 2/2 epoch (loss 0.3242): 63%|βββββββ | 530/840 [33:29<22:10, 4.29s/it]
Training 2/2 epoch (loss 0.3242): 63%|βββββββ | 531/840 [33:29<19:33, 3.80s/it]
Training 2/2 epoch (loss 0.2256): 63%|βββββββ | 531/840 [33:34<19:33, 3.80s/it]
Training 2/2 epoch (loss 0.2256): 63%|βββββββ | 532/840 [33:34<22:07, 4.31s/it]
Training 2/2 epoch (loss 0.3926): 63%|βββββββ | 532/840 [33:40<22:07, 4.31s/it]
Training 2/2 epoch (loss 0.3926): 63%|βββββββ | 533/840 [33:40<23:48, 4.65s/it]
Training 2/2 epoch (loss 0.2246): 63%|βββββββ | 533/840 [33:43<23:48, 4.65s/it]
Training 2/2 epoch (loss 0.2246): 64%|βββββββ | 534/840 [33:43<21:21, 4.19s/it]
Training 2/2 epoch (loss 0.4336): 64%|βββββββ | 534/840 [33:46<21:21, 4.19s/it]
Training 2/2 epoch (loss 0.4336): 64%|βββββββ | 535/840 [33:46<20:31, 4.04s/it]
Training 2/2 epoch (loss 0.2637): 64%|βββββββ | 535/840 [33:50<20:31, 4.04s/it]
Training 2/2 epoch (loss 0.2637): 64%|βββββββ | 536/840 [33:50<19:36, 3.87s/it]
Training 2/2 epoch (loss 0.2715): 64%|βββββββ | 536/840 [33:53<19:36, 3.87s/it]
Training 2/2 epoch (loss 0.2715): 64%|βββββββ | 537/840 [33:53<18:33, 3.67s/it]
Training 2/2 epoch (loss 0.1426): 64%|βββββββ | 537/840 [33:57<18:33, 3.67s/it]
Training 2/2 epoch (loss 0.1426): 64%|βββββββ | 538/840 [33:57<19:27, 3.87s/it]
Training 2/2 epoch (loss 0.1660): 64%|βββββββ | 538/840 [34:02<19:27, 3.87s/it]
Training 2/2 epoch (loss 0.1660): 64%|βββββββ | 539/840 [34:02<20:38, 4.11s/it]
Training 2/2 epoch (loss 0.2246): 64%|βββββββ | 539/840 [34:05<20:38, 4.11s/it]
Training 2/2 epoch (loss 0.2246): 64%|βββββββ | 540/840 [34:05<19:13, 3.84s/it]
Training 2/2 epoch (loss 0.1416): 64%|βββββββ | 540/840 [34:09<19:13, 3.84s/it]
Training 2/2 epoch (loss 0.1416): 64%|βββββββ | 541/840 [34:09<19:11, 3.85s/it]
Training 2/2 epoch (loss 0.1367): 64%|βββββββ | 541/840 [34:13<19:11, 3.85s/it]
Training 2/2 epoch (loss 0.1367): 65%|βββββββ | 542/840 [34:13<18:58, 3.82s/it]
Training 2/2 epoch (loss 0.2119): 65%|βββββββ | 542/840 [34:16<18:58, 3.82s/it]
Training 2/2 epoch (loss 0.2119): 65%|βββββββ | 543/840 [34:16<17:51, 3.61s/it]
Training 2/2 epoch (loss 0.1172): 65%|βββββββ | 543/840 [34:20<17:51, 3.61s/it]
Training 2/2 epoch (loss 0.1172): 65%|βββββββ | 544/840 [34:20<18:32, 3.76s/it]
Training 2/2 epoch (loss 0.1660): 65%|βββββββ | 544/840 [34:23<18:32, 3.76s/it]
Training 2/2 epoch (loss 0.1660): 65%|βββββββ | 545/840 [34:23<16:48, 3.42s/it]
Training 2/2 epoch (loss 0.1953): 65%|βββββββ | 545/840 [34:26<16:48, 3.42s/it]
Training 2/2 epoch (loss 0.1953): 65%|βββββββ | 546/840 [34:26<16:23, 3.34s/it]
Training 2/2 epoch (loss 0.1216): 65%|βββββββ | 546/840 [34:29<16:23, 3.34s/it]
Training 2/2 epoch (loss 0.1216): 65%|βββββββ | 547/840 [34:29<16:03, 3.29s/it]
Training 2/2 epoch (loss 0.3730): 65%|βββββββ | 547/840 [34:35<16:03, 3.29s/it]
Training 2/2 epoch (loss 0.3730): 65%|βββββββ | 548/840 [34:35<19:12, 3.95s/it]
Training 2/2 epoch (loss 0.2432): 65%|βββββββ | 548/840 [34:37<19:12, 3.95s/it]
Training 2/2 epoch (loss 0.2432): 65%|βββββββ | 549/840 [34:37<17:41, 3.65s/it]
Training 2/2 epoch (loss 0.2100): 65%|βββββββ | 549/840 [34:41<17:41, 3.65s/it]
Training 2/2 epoch (loss 0.2100): 65%|βββββββ | 550/840 [34:41<16:47, 3.47s/it]
Training 2/2 epoch (loss 0.1396): 65%|βββββββ | 550/840 [34:43<16:47, 3.47s/it]
Training 2/2 epoch (loss 0.1396): 66%|βββββββ | 551/840 [34:43<15:38, 3.25s/it]
Training 2/2 epoch (loss 0.1216): 66%|βββββββ | 551/840 [34:48<15:38, 3.25s/it]
Training 2/2 epoch (loss 0.1216): 66%|βββββββ | 552/840 [34:48<17:54, 3.73s/it]
Training 2/2 epoch (loss 0.2656): 66%|βββββββ | 552/840 [34:52<17:54, 3.73s/it]
Training 2/2 epoch (loss 0.2656): 66%|βββββββ | 553/840 [34:52<18:14, 3.81s/it]
Training 2/2 epoch (loss 0.1924): 66%|βββββββ | 553/840 [34:57<18:14, 3.81s/it]
Training 2/2 epoch (loss 0.1924): 66%|βββββββ | 554/840 [34:57<19:05, 4.00s/it]
Training 2/2 epoch (loss 0.1299): 66%|βββββββ | 554/840 [35:00<19:05, 4.00s/it]
Training 2/2 epoch (loss 0.1299): 66%|βββββββ | 555/840 [35:00<17:44, 3.74s/it]
Training 2/2 epoch (loss 0.2168): 66%|βββββββ | 555/840 [35:04<17:44, 3.74s/it]
Training 2/2 epoch (loss 0.2168): 66%|βββββββ | 556/840 [35:04<17:52, 3.78s/it]
Training 2/2 epoch (loss 0.1455): 66%|βββββββ | 556/840 [35:06<17:52, 3.78s/it]
Training 2/2 epoch (loss 0.1455): 66%|βββββββ | 557/840 [35:06<16:07, 3.42s/it]
Training 2/2 epoch (loss 0.1016): 66%|βββββββ | 557/840 [35:11<16:07, 3.42s/it]
Training 2/2 epoch (loss 0.1016): 66%|βββββββ | 558/840 [35:11<17:52, 3.80s/it]
Training 2/2 epoch (loss 0.1484): 66%|βββββββ | 558/840 [35:15<17:52, 3.80s/it]
Training 2/2 epoch (loss 0.1484): 67%|βββββββ | 559/840 [35:15<18:08, 3.88s/it]
Training 2/2 epoch (loss 0.2012): 67%|βββββββ | 559/840 [35:18<18:08, 3.88s/it]
Training 2/2 epoch (loss 0.2012): 67%|βββββββ | 560/840 [35:18<17:21, 3.72s/it]
Training 2/2 epoch (loss 0.0820): 67%|βββββββ | 560/840 [35:23<17:21, 3.72s/it]
Training 2/2 epoch (loss 0.0820): 67%|βββββββ | 561/840 [35:23<18:45, 4.03s/it]
Training 2/2 epoch (loss 0.1328): 67%|βββββββ | 561/840 [35:29<18:45, 4.03s/it]
Training 2/2 epoch (loss 0.1328): 67%|βββββββ | 562/840 [35:29<20:44, 4.47s/it]
Training 2/2 epoch (loss 0.1191): 67%|βββββββ | 562/840 [35:32<20:44, 4.47s/it]
Training 2/2 epoch (loss 0.1191): 67%|βββββββ | 563/840 [35:32<19:39, 4.26s/it]
Training 2/2 epoch (loss 0.1128): 67%|βββββββ | 563/840 [35:35<19:39, 4.26s/it]
Training 2/2 epoch (loss 0.1128): 67%|βββββββ | 564/840 [35:35<18:01, 3.92s/it]
Training 2/2 epoch (loss 0.2773): 67%|βββββββ | 564/840 [35:38<18:01, 3.92s/it]
Training 2/2 epoch (loss 0.2773): 67%|βββββββ | 565/840 [35:38<16:38, 3.63s/it]
Training 2/2 epoch (loss 0.3750): 67%|βββββββ | 565/840 [35:43<16:38, 3.63s/it]
Training 2/2 epoch (loss 0.3750): 67%|βββββββ | 566/840 [35:43<18:33, 4.06s/it]
Training 2/2 epoch (loss 0.1719): 67%|βββββββ | 566/840 [35:47<18:33, 4.06s/it]
Training 2/2 epoch (loss 0.1719): 68%|βββββββ | 567/840 [35:47<18:02, 3.97s/it]
Training 2/2 epoch (loss 0.2598): 68%|βββββββ | 567/840 [35:51<18:02, 3.97s/it]
Training 2/2 epoch (loss 0.2598): 68%|βββββββ | 568/840 [35:51<17:52, 3.94s/it]
Training 2/2 epoch (loss 0.2188): 68%|βββββββ | 568/840 [35:55<17:52, 3.94s/it]
Training 2/2 epoch (loss 0.2188): 68%|βββββββ | 569/840 [35:55<17:44, 3.93s/it]
Training 2/2 epoch (loss 0.1738): 68%|βββββββ | 569/840 [35:58<17:44, 3.93s/it]
Training 2/2 epoch (loss 0.1738): 68%|βββββββ | 570/840 [35:58<16:33, 3.68s/it]
Training 2/2 epoch (loss 0.1846): 68%|βββββββ | 570/840 [36:02<16:33, 3.68s/it]
Training 2/2 epoch (loss 0.1846): 68%|βββββββ | 571/840 [36:02<16:30, 3.68s/it]
Training 2/2 epoch (loss 0.1992): 68%|βββββββ | 571/840 [36:05<16:30, 3.68s/it]
Training 2/2 epoch (loss 0.1992): 68%|βββββββ | 572/840 [36:05<16:31, 3.70s/it]
Training 2/2 epoch (loss 0.1885): 68%|βββββββ | 572/840 [36:09<16:31, 3.70s/it]
Training 2/2 epoch (loss 0.1885): 68%|βββββββ | 573/840 [36:09<16:28, 3.70s/it]
Training 2/2 epoch (loss 0.1064): 68%|βββββββ | 573/840 [36:12<16:28, 3.70s/it]
Training 2/2 epoch (loss 0.1064): 68%|βββββββ | 574/840 [36:12<15:46, 3.56s/it]
Training 2/2 epoch (loss 0.2471): 68%|βββββββ | 574/840 [36:17<15:46, 3.56s/it]
Training 2/2 epoch (loss 0.2471): 68%|βββββββ | 575/840 [36:17<17:01, 3.85s/it]
Training 2/2 epoch (loss 0.1533): 68%|βββββββ | 575/840 [36:21<17:01, 3.85s/it]
Training 2/2 epoch (loss 0.1533): 69%|βββββββ | 576/840 [36:21<16:49, 3.82s/it]
Training 2/2 epoch (loss 0.1758): 69%|βββββββ | 576/840 [36:24<16:49, 3.82s/it]
Training 2/2 epoch (loss 0.1758): 69%|βββββββ | 577/840 [36:24<15:55, 3.63s/it]
Training 2/2 epoch (loss 0.2754): 69%|βββββββ | 577/840 [36:27<15:55, 3.63s/it]
Training 2/2 epoch (loss 0.2754): 69%|βββββββ | 578/840 [36:27<14:50, 3.40s/it]
Training 2/2 epoch (loss 0.1699): 69%|βββββββ | 578/840 [36:30<14:50, 3.40s/it]
Training 2/2 epoch (loss 0.1699): 69%|βββββββ | 579/840 [36:30<15:12, 3.50s/it]
Training 2/2 epoch (loss 0.0752): 69%|βββββββ | 579/840 [36:35<15:12, 3.50s/it]
Training 2/2 epoch (loss 0.0752): 69%|βββββββ | 580/840 [36:35<16:09, 3.73s/it]
Training 2/2 epoch (loss 0.3008): 69%|βββββββ | 580/840 [36:39<16:09, 3.73s/it]
Training 2/2 epoch (loss 0.3008): 69%|βββββββ | 581/840 [36:39<17:02, 3.95s/it]
Training 2/2 epoch (loss 0.5117): 69%|βββββββ | 581/840 [36:45<17:02, 3.95s/it]
Training 2/2 epoch (loss 0.5117): 69%|βββββββ | 582/840 [36:45<19:03, 4.43s/it]
Training 2/2 epoch (loss 0.0981): 69%|βββββββ | 582/840 [36:49<19:03, 4.43s/it]
Training 2/2 epoch (loss 0.0981): 69%|βββββββ | 583/840 [36:49<18:40, 4.36s/it]
Training 2/2 epoch (loss 0.1602): 69%|βββββββ | 583/840 [36:52<18:40, 4.36s/it]
Training 2/2 epoch (loss 0.1602): 70%|βββββββ | 584/840 [36:52<17:02, 3.99s/it]
Training 2/2 epoch (loss 0.1006): 70%|βββββββ | 584/840 [36:56<17:02, 3.99s/it]
Training 2/2 epoch (loss 0.1006): 70%|βββββββ | 585/840 [36:56<16:32, 3.89s/it]
Training 2/2 epoch (loss 0.1758): 70%|βββββββ | 585/840 [36:59<16:32, 3.89s/it]
Training 2/2 epoch (loss 0.1758): 70%|βββββββ | 586/840 [36:59<15:51, 3.75s/it]
Training 2/2 epoch (loss 0.2178): 70%|βββββββ | 586/840 [37:02<15:51, 3.75s/it]
Training 2/2 epoch (loss 0.2178): 70%|βββββββ | 587/840 [37:02<14:57, 3.55s/it]
Training 2/2 epoch (loss 0.1338): 70%|βββββββ | 587/840 [37:06<14:57, 3.55s/it]
Training 2/2 epoch (loss 0.1338): 70%|βββββββ | 588/840 [37:06<14:55, 3.56s/it]
Training 2/2 epoch (loss 0.1318): 70%|βββββββ | 588/840 [37:09<14:55, 3.56s/it]
Training 2/2 epoch (loss 0.1318): 70%|βββββββ | 589/840 [37:09<14:36, 3.49s/it]
Training 2/2 epoch (loss 0.1621): 70%|βββββββ | 589/840 [37:13<14:36, 3.49s/it]
Training 2/2 epoch (loss 0.1621): 70%|βββββββ | 590/840 [37:13<14:39, 3.52s/it]
Training 2/2 epoch (loss 0.0493): 70%|βββββββ | 590/840 [37:17<14:39, 3.52s/it]
Training 2/2 epoch (loss 0.0493): 70%|βββββββ | 591/840 [37:17<15:18, 3.69s/it]
Training 2/2 epoch (loss 0.0947): 70%|βββββββ | 591/840 [37:20<15:18, 3.69s/it]
Training 2/2 epoch (loss 0.0947): 70%|βββββββ | 592/840 [37:20<15:05, 3.65s/it]
Training 2/2 epoch (loss 0.1138): 70%|βββββββ | 592/840 [37:24<15:05, 3.65s/it]
Training 2/2 epoch (loss 0.1138): 71%|βββββββ | 593/840 [37:24<15:11, 3.69s/it]
Training 2/2 epoch (loss 0.1855): 71%|βββββββ | 593/840 [37:27<15:11, 3.69s/it]
Training 2/2 epoch (loss 0.1855): 71%|βββββββ | 594/840 [37:27<14:00, 3.42s/it]
Training 2/2 epoch (loss 0.0664): 71%|βββββββ | 594/840 [37:30<14:00, 3.42s/it]
Training 2/2 epoch (loss 0.0664): 71%|βββββββ | 595/840 [37:30<13:05, 3.20s/it]
Training 2/2 epoch (loss 0.1289): 71%|βββββββ | 595/840 [37:33<13:05, 3.20s/it]
Training 2/2 epoch (loss 0.1289): 71%|βββββββ | 596/840 [37:33<12:58, 3.19s/it]
Training 2/2 epoch (loss 0.1099): 71%|βββββββ | 596/840 [37:38<12:58, 3.19s/it]
Training 2/2 epoch (loss 0.1099): 71%|βββββββ | 597/840 [37:38<15:41, 3.87s/it]
Training 2/2 epoch (loss 0.1865): 71%|βββββββ | 597/840 [37:42<15:41, 3.87s/it]
Training 2/2 epoch (loss 0.1865): 71%|βββββββ | 598/840 [37:42<15:20, 3.80s/it]
Training 2/2 epoch (loss 0.2715): 71%|βββββββ | 598/840 [37:46<15:20, 3.80s/it]
Training 2/2 epoch (loss 0.2715): 71%|ββββββββ | 599/840 [37:46<15:46, 3.93s/it]
Training 2/2 epoch (loss 0.0527): 71%|ββββββββ | 599/840 [37:49<15:46, 3.93s/it]
Training 2/2 epoch (loss 0.0527): 71%|ββββββββ | 600/840 [37:49<14:06, 3.53s/it]
Training 2/2 epoch (loss 0.0483): 71%|ββββββββ | 600/840 [37:52<14:06, 3.53s/it]
Training 2/2 epoch (loss 0.0483): 72%|ββββββββ | 601/840 [37:52<13:37, 3.42s/it]
Training 2/2 epoch (loss 0.0713): 72%|ββββββββ | 601/840 [37:55<13:37, 3.42s/it]
Training 2/2 epoch (loss 0.0713): 72%|ββββββββ | 602/840 [37:55<12:44, 3.21s/it]
Training 2/2 epoch (loss 0.0383): 72%|ββββββββ | 602/840 [37:58<12:44, 3.21s/it]
Training 2/2 epoch (loss 0.0383): 72%|ββββββββ | 603/840 [37:58<12:31, 3.17s/it]
Training 2/2 epoch (loss 0.0771): 72%|ββββββββ | 603/840 [38:01<12:31, 3.17s/it]
Training 2/2 epoch (loss 0.0771): 72%|ββββββββ | 604/840 [38:01<12:27, 3.17s/it]
Training 2/2 epoch (loss 0.1963): 72%|ββββββββ | 604/840 [38:04<12:27, 3.17s/it]
Training 2/2 epoch (loss 0.1963): 72%|ββββββββ | 605/840 [38:04<12:50, 3.28s/it]
Training 2/2 epoch (loss 0.0376): 72%|ββββββββ | 605/840 [38:09<12:50, 3.28s/it]
Training 2/2 epoch (loss 0.0376): 72%|ββββββββ | 606/840 [38:09<14:00, 3.59s/it]
Training 2/2 epoch (loss 0.3691): 72%|ββββββββ | 606/840 [38:12<14:00, 3.59s/it]
Training 2/2 epoch (loss 0.3691): 72%|ββββββββ | 607/840 [38:12<13:46, 3.55s/it]
Training 2/2 epoch (loss 0.1816): 72%|ββββββββ | 607/840 [38:17<13:46, 3.55s/it]
Training 2/2 epoch (loss 0.1816): 72%|ββββββββ | 608/840 [38:17<14:47, 3.82s/it]
Training 2/2 epoch (loss 0.2197): 72%|ββββββββ | 608/840 [38:20<14:47, 3.82s/it]
Training 2/2 epoch (loss 0.2197): 72%|ββββββββ | 609/840 [38:20<13:50, 3.59s/it]
Training 2/2 epoch (loss 0.1250): 72%|ββββββββ | 609/840 [38:23<13:50, 3.59s/it]
Training 2/2 epoch (loss 0.1250): 73%|ββββββββ | 610/840 [38:23<13:24, 3.50s/it]
Training 2/2 epoch (loss 0.0530): 73%|ββββββββ | 610/840 [38:27<13:24, 3.50s/it]
Training 2/2 epoch (loss 0.0530): 73%|ββββββββ | 611/840 [38:27<13:44, 3.60s/it]
Training 2/2 epoch (loss 0.0854): 73%|ββββββββ | 611/840 [38:30<13:44, 3.60s/it]
Training 2/2 epoch (loss 0.0854): 73%|ββββββββ | 612/840 [38:30<13:38, 3.59s/it]
Training 2/2 epoch (loss 0.1357): 73%|ββββββββ | 612/840 [38:33<13:38, 3.59s/it]
Training 2/2 epoch (loss 0.1357): 73%|ββββββββ | 613/840 [38:33<12:58, 3.43s/it]
Training 2/2 epoch (loss 0.1416): 73%|ββββββββ | 613/840 [38:37<12:58, 3.43s/it]
Training 2/2 epoch (loss 0.1416): 73%|ββββββββ | 614/840 [38:37<13:20, 3.54s/it]
Training 2/2 epoch (loss 0.1016): 73%|ββββββββ | 614/840 [38:40<13:20, 3.54s/it]
Training 2/2 epoch (loss 0.1016): 73%|ββββββββ | 615/840 [38:40<12:29, 3.33s/it]
Training 2/2 epoch (loss 0.1040): 73%|ββββββββ | 615/840 [38:43<12:29, 3.33s/it]
Training 2/2 epoch (loss 0.1040): 73%|ββββββββ | 616/840 [38:43<12:27, 3.34s/it]
Training 2/2 epoch (loss 0.4531): 73%|ββββββββ | 616/840 [38:49<12:27, 3.34s/it]
Training 2/2 epoch (loss 0.4531): 73%|ββββββββ | 617/840 [38:49<14:48, 3.98s/it]
Training 2/2 epoch (loss 0.3594): 73%|ββββββββ | 617/840 [38:53<14:48, 3.98s/it]
Training 2/2 epoch (loss 0.3594): 74%|ββββββββ | 618/840 [38:53<14:30, 3.92s/it]
Training 2/2 epoch (loss 0.1992): 74%|ββββββββ | 618/840 [38:56<14:30, 3.92s/it]
Training 2/2 epoch (loss 0.1992): 74%|ββββββββ | 619/840 [38:56<13:27, 3.66s/it]
Training 2/2 epoch (loss 0.0518): 74%|ββββββββ | 619/840 [38:59<13:27, 3.66s/it]
Training 2/2 epoch (loss 0.0518): 74%|ββββββββ | 620/840 [38:59<13:09, 3.59s/it]
Training 2/2 epoch (loss 0.0732): 74%|ββββββββ | 620/840 [39:03<13:09, 3.59s/it]
Training 2/2 epoch (loss 0.0732): 74%|ββββββββ | 621/840 [39:03<13:11, 3.61s/it]
Training 2/2 epoch (loss 0.1299): 74%|ββββββββ | 621/840 [39:06<13:11, 3.61s/it]
Training 2/2 epoch (loss 0.1299): 74%|ββββββββ | 622/840 [39:06<13:04, 3.60s/it]
Training 2/2 epoch (loss 0.0386): 74%|ββββββββ | 622/840 [39:09<13:04, 3.60s/it]
Training 2/2 epoch (loss 0.0386): 74%|ββββββββ | 623/840 [39:09<11:55, 3.30s/it]
Training 2/2 epoch (loss 0.2275): 74%|ββββββββ | 623/840 [39:13<11:55, 3.30s/it]
Training 2/2 epoch (loss 0.2275): 74%|ββββββββ | 624/840 [39:13<12:23, 3.44s/it]
Training 2/2 epoch (loss 0.1367): 74%|ββββββββ | 624/840 [39:18<12:23, 3.44s/it]
Training 2/2 epoch (loss 0.1367): 74%|ββββββββ | 625/840 [39:18<13:52, 3.87s/it]
Training 2/2 epoch (loss 0.1748): 74%|ββββββββ | 625/840 [39:21<13:52, 3.87s/it]
Training 2/2 epoch (loss 0.1748): 75%|ββββββββ | 626/840 [39:21<13:21, 3.75s/it]
Training 2/2 epoch (loss 0.0918): 75%|ββββββββ | 626/840 [39:24<13:21, 3.75s/it]
Training 2/2 epoch (loss 0.0918): 75%|ββββββββ | 627/840 [39:24<12:44, 3.59s/it]
Training 2/2 epoch (loss 0.0654): 75%|ββββββββ | 627/840 [39:29<12:44, 3.59s/it]
Training 2/2 epoch (loss 0.0654): 75%|ββββββββ | 628/840 [39:29<13:52, 3.93s/it]
Training 2/2 epoch (loss 0.1006): 75%|ββββββββ | 628/840 [39:32<13:52, 3.93s/it]
Training 2/2 epoch (loss 0.1006): 75%|ββββββββ | 629/840 [39:32<12:34, 3.58s/it]
Training 2/2 epoch (loss 0.1133): 75%|ββββββββ | 629/840 [39:35<12:34, 3.58s/it]
Training 2/2 epoch (loss 0.1133): 75%|ββββββββ | 630/840 [39:35<12:10, 3.48s/it]
Training 2/2 epoch (loss 0.0520): 75%|ββββββββ | 630/840 [39:41<12:10, 3.48s/it]
Training 2/2 epoch (loss 0.0520): 75%|ββββββββ | 631/840 [39:41<14:11, 4.07s/it]
Training 2/2 epoch (loss 0.0732): 75%|ββββββββ | 631/840 [39:43<14:11, 4.07s/it]
Training 2/2 epoch (loss 0.0732): 75%|ββββββββ | 632/840 [39:43<12:59, 3.75s/it]
Training 2/2 epoch (loss 0.0791): 75%|ββββββββ | 632/840 [39:47<12:59, 3.75s/it]
Training 2/2 epoch (loss 0.0791): 75%|ββββββββ | 633/840 [39:47<12:15, 3.55s/it]
Training 2/2 epoch (loss 0.0435): 75%|ββββββββ | 633/840 [39:50<12:15, 3.55s/it]
Training 2/2 epoch (loss 0.0435): 75%|ββββββββ | 634/840 [39:50<12:10, 3.54s/it]
Training 2/2 epoch (loss 0.1533): 75%|ββββββββ | 634/840 [39:56<12:10, 3.54s/it]
Training 2/2 epoch (loss 0.1533): 76%|ββββββββ | 635/840 [39:56<14:03, 4.11s/it]
Training 2/2 epoch (loss 0.1060): 76%|ββββββββ | 635/840 [40:00<14:03, 4.11s/it]
Training 2/2 epoch (loss 0.1060): 76%|ββββββββ | 636/840 [40:00<13:58, 4.11s/it]
Training 2/2 epoch (loss 0.1250): 76%|ββββββββ | 636/840 [40:04<13:58, 4.11s/it]
Training 2/2 epoch (loss 0.1250): 76%|ββββββββ | 637/840 [40:04<13:39, 4.03s/it]
Training 2/2 epoch (loss 0.1250): 76%|ββββββββ | 637/840 [40:08<13:39, 4.03s/it]
Training 2/2 epoch (loss 0.1250): 76%|ββββββββ | 638/840 [40:08<14:31, 4.32s/it]
Training 2/2 epoch (loss 0.0566): 76%|ββββββββ | 638/840 [40:12<14:31, 4.32s/it]
Training 2/2 epoch (loss 0.0566): 76%|ββββββββ | 639/840 [40:12<13:46, 4.11s/it]
Training 2/2 epoch (loss 0.0253): 76%|ββββββββ | 639/840 [40:16<13:46, 4.11s/it]
Training 2/2 epoch (loss 0.0253): 76%|ββββββββ | 640/840 [40:16<13:52, 4.16s/it]
Training 2/2 epoch (loss 0.0942): 76%|ββββββββ | 640/840 [40:19<13:52, 4.16s/it]
Training 2/2 epoch (loss 0.0942): 76%|ββββββββ | 641/840 [40:19<12:22, 3.73s/it]
Training 2/2 epoch (loss 0.0312): 76%|ββββββββ | 641/840 [40:24<12:22, 3.73s/it]
Training 2/2 epoch (loss 0.0312): 76%|ββββββββ | 642/840 [40:24<13:28, 4.09s/it]
Training 2/2 epoch (loss 0.1504): 76%|ββββββββ | 642/840 [40:27<13:28, 4.09s/it]
Training 2/2 epoch (loss 0.1504): 77%|ββββββββ | 643/840 [40:27<12:38, 3.85s/it]
Training 2/2 epoch (loss 0.1167): 77%|ββββββββ | 643/840 [40:30<12:38, 3.85s/it]
Training 2/2 epoch (loss 0.1167): 77%|ββββββββ | 644/840 [40:30<11:17, 3.46s/it]
Training 2/2 epoch (loss 0.0601): 77%|ββββββββ | 644/840 [40:34<11:17, 3.46s/it]
Training 2/2 epoch (loss 0.0601): 77%|ββββββββ | 645/840 [40:34<12:14, 3.77s/it]
Training 2/2 epoch (loss 0.1670): 77%|ββββββββ | 645/840 [40:38<12:14, 3.77s/it]
Training 2/2 epoch (loss 0.1670): 77%|ββββββββ | 646/840 [40:38<11:59, 3.71s/it]
Training 2/2 epoch (loss 0.2041): 77%|ββββββββ | 646/840 [40:41<11:59, 3.71s/it]
Training 2/2 epoch (loss 0.2041): 77%|ββββββββ | 647/840 [40:41<11:18, 3.51s/it]
Training 2/2 epoch (loss 0.1074): 77%|ββββββββ | 647/840 [40:47<11:18, 3.51s/it]
Training 2/2 epoch (loss 0.1074): 77%|ββββββββ | 648/840 [40:47<13:13, 4.13s/it]
Training 2/2 epoch (loss 0.2090): 77%|ββββββββ | 648/840 [40:50<13:13, 4.13s/it]
Training 2/2 epoch (loss 0.2090): 77%|ββββββββ | 649/840 [40:50<12:23, 3.89s/it]
Training 2/2 epoch (loss 0.1138): 77%|ββββββββ | 649/840 [40:54<12:23, 3.89s/it]
Training 2/2 epoch (loss 0.1138): 77%|ββββββββ | 650/840 [40:54<12:54, 4.07s/it]
Training 2/2 epoch (loss 0.1758): 77%|ββββββββ | 650/840 [40:57<12:54, 4.07s/it]
Training 2/2 epoch (loss 0.1758): 78%|ββββββββ | 651/840 [40:57<11:42, 3.72s/it]
Training 2/2 epoch (loss 0.0535): 78%|ββββββββ | 651/840 [41:00<11:42, 3.72s/it]
Training 2/2 epoch (loss 0.0535): 78%|ββββββββ | 652/840 [41:00<11:08, 3.56s/it]
Training 2/2 epoch (loss 0.0791): 78%|ββββββββ | 652/840 [41:05<11:08, 3.56s/it]
Training 2/2 epoch (loss 0.0791): 78%|ββββββββ | 653/840 [41:05<12:03, 3.87s/it]
Training 2/2 epoch (loss 0.0266): 78%|ββββββββ | 653/840 [41:08<12:03, 3.87s/it]
Training 2/2 epoch (loss 0.0266): 78%|ββββββββ | 654/840 [41:08<11:01, 3.56s/it]
Training 2/2 epoch (loss 0.3711): 78%|ββββββββ | 654/840 [41:11<11:01, 3.56s/it]
Training 2/2 epoch (loss 0.3711): 78%|ββββββββ | 655/840 [41:11<10:17, 3.34s/it]
Training 2/2 epoch (loss 0.1777): 78%|ββββββββ | 655/840 [41:16<10:17, 3.34s/it]
Training 2/2 epoch (loss 0.1777): 78%|ββββββββ | 656/840 [41:16<12:18, 4.01s/it]
Training 2/2 epoch (loss 0.0767): 78%|ββββββββ | 656/840 [41:21<12:18, 4.01s/it]
Training 2/2 epoch (loss 0.0767): 78%|ββββββββ | 657/840 [41:21<12:38, 4.15s/it]
Training 2/2 epoch (loss 0.1006): 78%|ββββββββ | 657/840 [41:24<12:38, 4.15s/it]
Training 2/2 epoch (loss 0.1006): 78%|ββββββββ | 658/840 [41:24<11:44, 3.87s/it]
Training 2/2 epoch (loss 0.3164): 78%|ββββββββ | 658/840 [41:29<11:44, 3.87s/it]
Training 2/2 epoch (loss 0.3164): 78%|ββββββββ | 659/840 [41:29<13:04, 4.33s/it]
Training 2/2 epoch (loss 0.1758): 78%|ββββββββ | 659/840 [41:33<13:04, 4.33s/it]
Training 2/2 epoch (loss 0.1758): 79%|ββββββββ | 660/840 [41:33<12:30, 4.17s/it]
Training 2/2 epoch (loss 0.1099): 79%|ββββββββ | 660/840 [41:38<12:30, 4.17s/it]
Training 2/2 epoch (loss 0.1099): 79%|ββββββββ | 661/840 [41:38<12:44, 4.27s/it]
Training 2/2 epoch (loss 0.1011): 79%|ββββββββ | 661/840 [41:42<12:44, 4.27s/it]
Training 2/2 epoch (loss 0.1011): 79%|ββββββββ | 662/840 [41:42<12:25, 4.19s/it]
Training 2/2 epoch (loss 0.1416): 79%|ββββββββ | 662/840 [41:45<12:25, 4.19s/it]
Training 2/2 epoch (loss 0.1416): 79%|ββββββββ | 663/840 [41:45<11:42, 3.97s/it]
Training 2/2 epoch (loss 0.0640): 79%|ββββββββ | 663/840 [41:51<11:42, 3.97s/it]
Training 2/2 epoch (loss 0.0640): 79%|ββββββββ | 664/840 [41:51<12:58, 4.42s/it]
Training 2/2 epoch (loss 0.0928): 79%|ββββββββ | 664/840 [41:54<12:58, 4.42s/it]
Training 2/2 epoch (loss 0.0928): 79%|ββββββββ | 665/840 [41:54<11:50, 4.06s/it]
Training 2/2 epoch (loss 0.2812): 79%|ββββββββ | 665/840 [41:57<11:50, 4.06s/it]
Training 2/2 epoch (loss 0.2812): 79%|ββββββββ | 666/840 [41:57<10:58, 3.78s/it]
Training 2/2 epoch (loss 0.0664): 79%|ββββββββ | 666/840 [42:00<10:58, 3.78s/it]
Training 2/2 epoch (loss 0.0664): 79%|ββββββββ | 667/840 [42:00<09:51, 3.42s/it]
Training 2/2 epoch (loss 0.0898): 79%|ββββββββ | 667/840 [42:05<09:51, 3.42s/it]
Training 2/2 epoch (loss 0.0898): 80%|ββββββββ | 668/840 [42:05<11:38, 4.06s/it]
Training 2/2 epoch (loss 0.1152): 80%|ββββββββ | 668/840 [42:08<11:38, 4.06s/it]
Training 2/2 epoch (loss 0.1152): 80%|ββββββββ | 669/840 [42:08<10:42, 3.76s/it]
Training 2/2 epoch (loss 0.0559): 80%|ββββββββ | 669/840 [42:11<10:42, 3.76s/it]
Training 2/2 epoch (loss 0.0559): 80%|ββββββββ | 670/840 [42:11<09:58, 3.52s/it]
Training 2/2 epoch (loss 0.1133): 80%|ββββββββ | 670/840 [42:15<09:58, 3.52s/it]
Training 2/2 epoch (loss 0.1133): 80%|ββββββββ | 671/840 [42:15<10:03, 3.57s/it]
Training 2/2 epoch (loss 0.0549): 80%|ββββββββ | 671/840 [42:18<10:03, 3.57s/it]
Training 2/2 epoch (loss 0.0549): 80%|ββββββββ | 672/840 [42:18<09:42, 3.47s/it]
Training 2/2 epoch (loss 0.0461): 80%|ββββββββ | 672/840 [42:22<09:42, 3.47s/it]
Training 2/2 epoch (loss 0.0461): 80%|ββββββββ | 673/840 [42:22<09:47, 3.52s/it]
Training 2/2 epoch (loss 0.1660): 80%|ββββββββ | 673/840 [42:26<09:47, 3.52s/it]
Training 2/2 epoch (loss 0.1660): 80%|ββββββββ | 674/840 [42:26<10:02, 3.63s/it]
Training 2/2 epoch (loss 0.1143): 80%|ββββββββ | 674/840 [42:29<10:02, 3.63s/it]
Training 2/2 epoch (loss 0.1143): 80%|ββββββββ | 675/840 [42:29<09:43, 3.54s/it]
Training 2/2 epoch (loss 0.0518): 80%|ββββββββ | 675/840 [42:33<09:43, 3.54s/it]
Training 2/2 epoch (loss 0.0518): 80%|ββββββββ | 676/840 [42:33<09:55, 3.63s/it]
Training 2/2 epoch (loss 0.0432): 80%|ββββββββ | 676/840 [42:38<09:55, 3.63s/it]
Training 2/2 epoch (loss 0.0432): 81%|ββββββββ | 677/840 [42:38<10:48, 3.98s/it]
Training 2/2 epoch (loss 0.0664): 81%|ββββββββ | 677/840 [42:41<10:48, 3.98s/it]
Training 2/2 epoch (loss 0.0664): 81%|ββββββββ | 678/840 [42:41<10:13, 3.79s/it]
Training 2/2 epoch (loss 0.1553): 81%|ββββββββ | 678/840 [42:44<10:13, 3.79s/it]
Training 2/2 epoch (loss 0.1553): 81%|ββββββββ | 679/840 [42:44<09:24, 3.51s/it]
Training 2/2 epoch (loss 0.0986): 81%|ββββββββ | 679/840 [42:47<09:24, 3.51s/it]
Training 2/2 epoch (loss 0.0986): 81%|ββββββββ | 680/840 [42:47<09:30, 3.57s/it]
Training 2/2 epoch (loss 0.0806): 81%|ββββββββ | 680/840 [42:50<09:30, 3.57s/it]
Training 2/2 epoch (loss 0.0806): 81%|ββββββββ | 681/840 [42:50<08:41, 3.28s/it]
Training 2/2 epoch (loss 0.0354): 81%|ββββββββ | 681/840 [42:54<08:41, 3.28s/it]
Training 2/2 epoch (loss 0.0354): 81%|ββββββββ | 682/840 [42:54<08:51, 3.36s/it]
Training 2/2 epoch (loss 0.1641): 81%|ββββββββ | 682/840 [42:56<08:51, 3.36s/it]
Training 2/2 epoch (loss 0.1641): 81%|βββββββββ | 683/840 [42:56<08:14, 3.15s/it]
Training 2/2 epoch (loss 0.0679): 81%|βββββββββ | 683/840 [42:59<08:14, 3.15s/it]
Training 2/2 epoch (loss 0.0679): 81%|βββββββββ | 684/840 [42:59<07:43, 2.97s/it]
Training 2/2 epoch (loss 0.2910): 81%|βββββββββ | 684/840 [43:02<07:43, 2.97s/it]
Training 2/2 epoch (loss 0.2910): 82%|βββββββββ | 685/840 [43:02<07:36, 2.95s/it]
Training 2/2 epoch (loss 0.3105): 82%|βββββββββ | 685/840 [43:07<07:36, 2.95s/it]
Training 2/2 epoch (loss 0.3105): 82%|βββββββββ | 686/840 [43:07<09:31, 3.71s/it]
Training 2/2 epoch (loss 0.2695): 82%|βββββββββ | 686/840 [43:10<09:31, 3.71s/it]
Training 2/2 epoch (loss 0.2695): 82%|βββββββββ | 687/840 [43:10<08:55, 3.50s/it]
Training 2/2 epoch (loss 0.0962): 82%|βββββββββ | 687/840 [43:13<08:55, 3.50s/it]
Training 2/2 epoch (loss 0.0962): 82%|βββββββββ | 688/840 [43:13<08:39, 3.42s/it]
Training 2/2 epoch (loss 0.0732): 82%|βββββββββ | 688/840 [43:16<08:39, 3.42s/it]
Training 2/2 epoch (loss 0.0732): 82%|βββββββββ | 689/840 [43:16<07:56, 3.15s/it]
Training 2/2 epoch (loss 0.0830): 82%|βββββββββ | 689/840 [43:19<07:56, 3.15s/it]
Training 2/2 epoch (loss 0.0830): 82%|βββββββββ | 690/840 [43:19<08:01, 3.21s/it]
Training 2/2 epoch (loss 0.0413): 82%|βββββββββ | 690/840 [43:24<08:01, 3.21s/it]
Training 2/2 epoch (loss 0.0413): 82%|βββββββββ | 691/840 [43:24<09:10, 3.69s/it]
Training 2/2 epoch (loss 0.0435): 82%|βββββββββ | 691/840 [43:27<09:10, 3.69s/it]
Training 2/2 epoch (loss 0.0435): 82%|βββββββββ | 692/840 [43:27<08:36, 3.49s/it]
Training 2/2 epoch (loss 0.1216): 82%|βββββββββ | 692/840 [43:31<08:36, 3.49s/it]
Training 2/2 epoch (loss 0.1216): 82%|βββββββββ | 693/840 [43:31<09:05, 3.71s/it]
Training 2/2 epoch (loss 0.0742): 82%|βββββββββ | 693/840 [43:35<09:05, 3.71s/it]
Training 2/2 epoch (loss 0.0742): 83%|βββββββββ | 694/840 [43:35<09:19, 3.83s/it]
Training 2/2 epoch (loss 0.0598): 83%|βββββββββ | 694/840 [43:40<09:19, 3.83s/it]
Training 2/2 epoch (loss 0.0598): 83%|βββββββββ | 695/840 [43:40<09:57, 4.12s/it]
Training 2/2 epoch (loss 0.0598): 83%|βββββββββ | 695/840 [43:45<09:57, 4.12s/it]
Training 2/2 epoch (loss 0.0598): 83%|βββββββββ | 696/840 [43:45<09:58, 4.16s/it]
Training 2/2 epoch (loss 0.1553): 83%|βββββββββ | 696/840 [43:48<09:58, 4.16s/it]
Training 2/2 epoch (loss 0.1553): 83%|βββββββββ | 697/840 [43:48<09:15, 3.89s/it]
Training 2/2 epoch (loss 0.0801): 83%|βββββββββ | 697/840 [43:52<09:15, 3.89s/it]
Training 2/2 epoch (loss 0.0801): 83%|βββββββββ | 698/840 [43:52<09:12, 3.89s/it]
Training 2/2 epoch (loss 0.0645): 83%|βββββββββ | 698/840 [43:55<09:12, 3.89s/it]
Training 2/2 epoch (loss 0.0645): 83%|βββββββββ | 699/840 [43:55<08:57, 3.81s/it]
Training 2/2 epoch (loss 0.1133): 83%|βββββββββ | 699/840 [43:58<08:57, 3.81s/it]
Training 2/2 epoch (loss 0.1133): 83%|βββββββββ | 700/840 [43:58<08:13, 3.52s/it]
Training 2/2 epoch (loss 0.1035): 83%|βββββββββ | 700/840 [44:02<08:13, 3.52s/it]
Training 2/2 epoch (loss 0.1035): 83%|βββββββββ | 701/840 [44:02<08:07, 3.51s/it]
Training 2/2 epoch (loss 0.0693): 83%|βββββββββ | 701/840 [44:05<08:07, 3.51s/it]
Training 2/2 epoch (loss 0.0693): 84%|βββββββββ | 702/840 [44:05<07:47, 3.39s/it]
Training 2/2 epoch (loss 0.0552): 84%|βββββββββ | 702/840 [44:09<07:47, 3.39s/it]
Training 2/2 epoch (loss 0.0552): 84%|βββββββββ | 703/840 [44:09<08:08, 3.56s/it]
Training 2/2 epoch (loss 0.1104): 84%|βββββββββ | 703/840 [44:12<08:08, 3.56s/it]
Training 2/2 epoch (loss 0.1104): 84%|βββββββββ | 704/840 [44:12<07:34, 3.34s/it]
Training 2/2 epoch (loss 0.0347): 84%|βββββββββ | 704/840 [44:15<07:34, 3.34s/it]
Training 2/2 epoch (loss 0.0347): 84%|βββββββββ | 705/840 [44:15<07:47, 3.46s/it]
Training 2/2 epoch (loss 0.0396): 84%|βββββββββ | 705/840 [44:19<07:47, 3.46s/it]
Training 2/2 epoch (loss 0.0396): 84%|βββββββββ | 706/840 [44:19<07:43, 3.46s/it]
Training 2/2 epoch (loss 0.1357): 84%|βββββββββ | 706/840 [44:22<07:43, 3.46s/it]
Training 2/2 epoch (loss 0.1357): 84%|βββββββββ | 707/840 [44:22<07:44, 3.49s/it]
Training 2/2 epoch (loss 0.0503): 84%|βββββββββ | 707/840 [44:25<07:44, 3.49s/it]
Training 2/2 epoch (loss 0.0503): 84%|βββββββββ | 708/840 [44:25<07:14, 3.29s/it]
Training 2/2 epoch (loss 0.0786): 84%|βββββββββ | 708/840 [44:28<07:14, 3.29s/it]
Training 2/2 epoch (loss 0.0786): 84%|βββββββββ | 709/840 [44:28<07:02, 3.23s/it]
Training 2/2 epoch (loss 0.1523): 84%|βββββββββ | 709/840 [44:32<07:02, 3.23s/it]
Training 2/2 epoch (loss 0.1523): 85%|βββββββββ | 710/840 [44:32<07:14, 3.35s/it]
Training 2/2 epoch (loss 0.0143): 85%|βββββββββ | 710/840 [44:36<07:14, 3.35s/it]
Training 2/2 epoch (loss 0.0143): 85%|βββββββββ | 711/840 [44:36<07:47, 3.62s/it]
Training 2/2 epoch (loss 0.0266): 85%|βββββββββ | 711/840 [44:41<07:47, 3.62s/it]
Training 2/2 epoch (loss 0.0266): 85%|βββββββββ | 712/840 [44:41<08:32, 4.01s/it]
Training 2/2 epoch (loss 0.2812): 85%|βββββββββ | 712/840 [44:45<08:32, 4.01s/it]
Training 2/2 epoch (loss 0.2812): 85%|βββββββββ | 713/840 [44:45<08:41, 4.11s/it]
Training 2/2 epoch (loss 0.0557): 85%|βββββββββ | 713/840 [44:48<08:41, 4.11s/it]
Training 2/2 epoch (loss 0.0557): 85%|βββββββββ | 714/840 [44:48<07:58, 3.80s/it]
Training 2/2 epoch (loss 0.0742): 85%|βββββββββ | 714/840 [44:51<07:58, 3.80s/it]
Training 2/2 epoch (loss 0.0742): 85%|βββββββββ | 715/840 [44:51<07:15, 3.48s/it]
Training 2/2 epoch (loss 0.0457): 85%|βββββββββ | 715/840 [44:55<07:15, 3.48s/it]
Training 2/2 epoch (loss 0.0457): 85%|βββββββββ | 716/840 [44:55<07:12, 3.49s/it]
Training 2/2 epoch (loss 0.1079): 85%|βββββββββ | 716/840 [44:58<07:12, 3.49s/it]
Training 2/2 epoch (loss 0.1079): 85%|βββββββββ | 717/840 [44:58<07:15, 3.54s/it]
Training 2/2 epoch (loss 0.0471): 85%|βββββββββ | 717/840 [45:02<07:15, 3.54s/it]
Training 2/2 epoch (loss 0.0471): 85%|βββββββββ | 718/840 [45:02<07:20, 3.61s/it]
Training 2/2 epoch (loss 0.1064): 85%|βββββββββ | 718/840 [45:06<07:20, 3.61s/it]
Training 2/2 epoch (loss 0.1064): 86%|βββββββββ | 719/840 [45:06<07:11, 3.57s/it]
Training 2/2 epoch (loss 0.0182): 86%|βββββββββ | 719/840 [45:09<07:11, 3.57s/it]
Training 2/2 epoch (loss 0.0182): 86%|βββββββββ | 720/840 [45:09<06:46, 3.39s/it]
Training 2/2 epoch (loss 0.2119): 86%|βββββββββ | 720/840 [45:13<06:46, 3.39s/it]
Training 2/2 epoch (loss 0.2119): 86%|βββββββββ | 721/840 [45:13<07:33, 3.81s/it]
Training 2/2 epoch (loss 0.1689): 86%|βββββββββ | 721/840 [45:17<07:33, 3.81s/it]
Training 2/2 epoch (loss 0.1689): 86%|βββββββββ | 722/840 [45:17<07:30, 3.81s/it]
Training 2/2 epoch (loss 0.0479): 86%|βββββββββ | 722/840 [45:23<07:30, 3.81s/it]
Training 2/2 epoch (loss 0.0479): 86%|βββββββββ | 723/840 [45:23<08:25, 4.32s/it]
Training 2/2 epoch (loss 0.0776): 86%|βββββββββ | 723/840 [45:26<08:25, 4.32s/it]
Training 2/2 epoch (loss 0.0776): 86%|βββββββββ | 724/840 [45:26<07:45, 4.01s/it]
Training 2/2 epoch (loss 0.0762): 86%|βββββββββ | 724/840 [45:30<07:45, 4.01s/it]
Training 2/2 epoch (loss 0.0762): 86%|βββββββββ | 725/840 [45:30<07:28, 3.90s/it]
Training 2/2 epoch (loss 0.1045): 86%|βββββββββ | 725/840 [45:35<07:28, 3.90s/it]
Training 2/2 epoch (loss 0.1045): 86%|βββββββββ | 726/840 [45:35<08:00, 4.21s/it]
Training 2/2 epoch (loss 0.0476): 86%|βββββββββ | 726/840 [45:38<08:00, 4.21s/it]
Training 2/2 epoch (loss 0.0476): 87%|βββββββββ | 727/840 [45:38<07:34, 4.02s/it]
Training 2/2 epoch (loss 0.0923): 87%|βββββββββ | 727/840 [45:41<07:34, 4.02s/it]
Training 2/2 epoch (loss 0.0923): 87%|βββββββββ | 728/840 [45:41<07:04, 3.79s/it]
Training 2/2 epoch (loss 0.0688): 87%|βββββββββ | 728/840 [45:45<07:04, 3.79s/it]
Training 2/2 epoch (loss 0.0688): 87%|βββββββββ | 729/840 [45:45<06:56, 3.75s/it]
Training 2/2 epoch (loss 0.0449): 87%|βββββββββ | 729/840 [45:48<06:56, 3.75s/it]
Training 2/2 epoch (loss 0.0449): 87%|βββββββββ | 730/840 [45:48<06:34, 3.58s/it]
Training 2/2 epoch (loss 0.0300): 87%|βββββββββ | 730/840 [45:52<06:34, 3.58s/it]
Training 2/2 epoch (loss 0.0300): 87%|βββββββββ | 731/840 [45:52<06:31, 3.59s/it]
Training 2/2 epoch (loss 0.1279): 87%|βββββββββ | 731/840 [45:55<06:31, 3.59s/it]
Training 2/2 epoch (loss 0.1279): 87%|βββββββββ | 732/840 [45:55<06:19, 3.52s/it]
Training 2/2 epoch (loss 0.0718): 87%|βββββββββ | 732/840 [45:58<06:19, 3.52s/it]
Training 2/2 epoch (loss 0.0718): 87%|βββββββββ | 733/840 [45:58<05:42, 3.20s/it]
Training 2/2 epoch (loss 0.0693): 87%|βββββββββ | 733/840 [46:02<05:42, 3.20s/it]
Training 2/2 epoch (loss 0.0693): 87%|βββββββββ | 734/840 [46:02<06:08, 3.48s/it]
Training 2/2 epoch (loss 0.0271): 87%|βββββββββ | 734/840 [46:06<06:08, 3.48s/it]
Training 2/2 epoch (loss 0.0271): 88%|βββββββββ | 735/840 [46:06<06:32, 3.74s/it]
Training 2/2 epoch (loss 0.0532): 88%|βββββββββ | 735/840 [46:10<06:32, 3.74s/it]
Training 2/2 epoch (loss 0.0532): 88%|βββββββββ | 736/840 [46:10<06:24, 3.70s/it]
Training 2/2 epoch (loss 0.0476): 88%|βββββββββ | 736/840 [46:14<06:24, 3.70s/it]
Training 2/2 epoch (loss 0.0476): 88%|βββββββββ | 737/840 [46:14<06:39, 3.88s/it]
Training 2/2 epoch (loss 0.1914): 88%|βββββββββ | 737/840 [46:17<06:39, 3.88s/it]
Training 2/2 epoch (loss 0.1914): 88%|βββββββββ | 738/840 [46:17<06:12, 3.65s/it]
Training 2/2 epoch (loss 0.1006): 88%|βββββββββ | 738/840 [46:23<06:12, 3.65s/it]
Training 2/2 epoch (loss 0.1006): 88%|βββββββββ | 739/840 [46:23<07:02, 4.18s/it]
Training 2/2 epoch (loss 0.0239): 88%|βββββββββ | 739/840 [46:25<07:02, 4.18s/it]
Training 2/2 epoch (loss 0.0239): 88%|βββββββββ | 740/840 [46:25<06:20, 3.81s/it]
Training 2/2 epoch (loss 0.0498): 88%|βββββββββ | 740/840 [46:31<06:20, 3.81s/it]
Training 2/2 epoch (loss 0.0498): 88%|βββββββββ | 741/840 [46:31<07:02, 4.26s/it]
Training 2/2 epoch (loss 0.1182): 88%|βββββββββ | 741/840 [46:35<07:02, 4.26s/it]
Training 2/2 epoch (loss 0.1182): 88%|βββββββββ | 742/840 [46:35<07:03, 4.32s/it]
Training 2/2 epoch (loss 0.0859): 88%|βββββββββ | 742/840 [46:40<07:03, 4.32s/it]
Training 2/2 epoch (loss 0.0859): 88%|βββββββββ | 743/840 [46:40<07:06, 4.40s/it]
Training 2/2 epoch (loss 0.0952): 88%|βββββββββ | 743/840 [46:43<07:06, 4.40s/it]
Training 2/2 epoch (loss 0.0952): 89%|βββββββββ | 744/840 [46:43<06:36, 4.13s/it]
Training 2/2 epoch (loss 0.0583): 89%|βββββββββ | 744/840 [46:46<06:36, 4.13s/it]
Training 2/2 epoch (loss 0.0583): 89%|βββββββββ | 745/840 [46:46<05:57, 3.76s/it]
Training 2/2 epoch (loss 0.2227): 89%|βββββββββ | 745/840 [46:51<05:57, 3.76s/it]
Training 2/2 epoch (loss 0.2227): 89%|βββββββββ | 746/840 [46:51<06:20, 4.05s/it]
Training 2/2 epoch (loss 0.0718): 89%|βββββββββ | 746/840 [46:56<06:20, 4.05s/it]
Training 2/2 epoch (loss 0.0718): 89%|βββββββββ | 747/840 [46:56<06:54, 4.46s/it]
Training 2/2 epoch (loss 0.0410): 89%|βββββββββ | 747/840 [47:00<06:54, 4.46s/it]
Training 2/2 epoch (loss 0.0410): 89%|βββββββββ | 748/840 [47:00<06:31, 4.26s/it]
Training 2/2 epoch (loss 0.0718): 89%|βββββββββ | 748/840 [47:04<06:31, 4.26s/it]
Training 2/2 epoch (loss 0.0718): 89%|βββββββββ | 749/840 [47:04<06:02, 3.99s/it]
Training 2/2 epoch (loss 0.0325): 89%|βββββββββ | 749/840 [47:07<06:02, 3.99s/it]
Training 2/2 epoch (loss 0.0325): 89%|βββββββββ | 750/840 [47:07<05:35, 3.73s/it]
Training 2/2 epoch (loss 0.0166): 89%|βββββββββ | 750/840 [47:10<05:35, 3.73s/it]
Training 2/2 epoch (loss 0.0166): 89%|βββββββββ | 751/840 [47:10<05:16, 3.56s/it]
Training 2/2 epoch (loss 0.0781): 89%|βββββββββ | 751/840 [47:12<05:16, 3.56s/it]
Training 2/2 epoch (loss 0.0781): 90%|βββββββββ | 752/840 [47:12<04:48, 3.27s/it]
Training 2/2 epoch (loss 0.0938): 90%|βββββββββ | 752/840 [47:18<04:48, 3.27s/it]
Training 2/2 epoch (loss 0.0938): 90%|βββββββββ | 753/840 [47:18<05:41, 3.93s/it]
Training 2/2 epoch (loss 0.1172): 90%|βββββββββ | 753/840 [47:23<05:41, 3.93s/it]
Training 2/2 epoch (loss 0.1172): 90%|βββββββββ | 754/840 [47:23<06:20, 4.42s/it]
Training 2/2 epoch (loss 0.0337): 90%|βββββββββ | 754/840 [47:29<06:20, 4.42s/it]
Training 2/2 epoch (loss 0.0337): 90%|βββββββββ | 755/840 [47:29<06:43, 4.74s/it]
Training 2/2 epoch (loss 0.0327): 90%|βββββββββ | 755/840 [47:33<06:43, 4.74s/it]
Training 2/2 epoch (loss 0.0327): 90%|βββββββββ | 756/840 [47:33<06:08, 4.39s/it]
Training 2/2 epoch (loss 0.0908): 90%|βββββββββ | 756/840 [47:36<06:08, 4.39s/it]
Training 2/2 epoch (loss 0.0908): 90%|βββββββββ | 757/840 [47:36<05:47, 4.19s/it]
Training 2/2 epoch (loss 0.1011): 90%|βββββββββ | 757/840 [47:39<05:47, 4.19s/it]
Training 2/2 epoch (loss 0.1011): 90%|βββββββββ | 758/840 [47:39<05:17, 3.87s/it]
Training 2/2 epoch (loss 0.0708): 90%|βββββββββ | 758/840 [47:43<05:17, 3.87s/it]
Training 2/2 epoch (loss 0.0708): 90%|βββββββββ | 759/840 [47:43<05:00, 3.71s/it]
Training 2/2 epoch (loss 0.0574): 90%|βββββββββ | 759/840 [47:47<05:00, 3.71s/it]
Training 2/2 epoch (loss 0.0574): 90%|βββββββββ | 760/840 [47:47<05:04, 3.80s/it]
Training 2/2 epoch (loss 0.0211): 90%|βββββββββ | 760/840 [47:50<05:04, 3.80s/it]
Training 2/2 epoch (loss 0.0211): 91%|βββββββββ | 761/840 [47:50<04:44, 3.61s/it]
Training 2/2 epoch (loss 0.0322): 91%|βββββββββ | 761/840 [47:54<04:44, 3.61s/it]
Training 2/2 epoch (loss 0.0322): 91%|βββββββββ | 762/840 [47:54<04:46, 3.67s/it]
Training 2/2 epoch (loss 0.1357): 91%|βββββββββ | 762/840 [47:57<04:46, 3.67s/it]
Training 2/2 epoch (loss 0.1357): 91%|βββββββββ | 763/840 [47:57<04:27, 3.47s/it]
Training 2/2 epoch (loss 0.1289): 91%|βββββββββ | 763/840 [48:01<04:27, 3.47s/it]
Training 2/2 epoch (loss 0.1289): 91%|βββββββββ | 764/840 [48:01<04:48, 3.79s/it]
Training 2/2 epoch (loss 0.1123): 91%|βββββββββ | 764/840 [48:04<04:48, 3.79s/it]
Training 2/2 epoch (loss 0.1123): 91%|βββββββββ | 765/840 [48:04<04:18, 3.44s/it]
Training 2/2 epoch (loss 0.1367): 91%|βββββββββ | 765/840 [48:07<04:18, 3.44s/it]
Training 2/2 epoch (loss 0.1367): 91%|βββββββββ | 766/840 [48:07<03:56, 3.20s/it]
Training 2/2 epoch (loss 0.0620): 91%|βββββββββ | 766/840 [48:09<03:56, 3.20s/it]
Training 2/2 epoch (loss 0.0620): 91%|ββββββββββ| 767/840 [48:09<03:41, 3.04s/it]
Training 2/2 epoch (loss 0.0105): 91%|ββββββββββ| 767/840 [48:12<03:41, 3.04s/it]
Training 2/2 epoch (loss 0.0105): 91%|ββββββββββ| 768/840 [48:12<03:43, 3.11s/it]
Training 2/2 epoch (loss 0.0264): 91%|ββββββββββ| 768/840 [48:16<03:43, 3.11s/it]
Training 2/2 epoch (loss 0.0264): 92%|ββββββββββ| 769/840 [48:16<03:59, 3.38s/it]
Training 2/2 epoch (loss 0.0366): 92%|ββββββββββ| 769/840 [48:19<03:59, 3.38s/it]
Training 2/2 epoch (loss 0.0366): 92%|ββββββββββ| 770/840 [48:19<03:43, 3.19s/it]
Training 2/2 epoch (loss 0.0591): 92%|ββββββββββ| 770/840 [48:25<03:43, 3.19s/it]
Training 2/2 epoch (loss 0.0591): 92%|ββββββββββ| 771/840 [48:25<04:26, 3.86s/it]
Training 2/2 epoch (loss 0.1289): 92%|ββββββββββ| 771/840 [48:28<04:26, 3.86s/it]
Training 2/2 epoch (loss 0.1289): 92%|ββββββββββ| 772/840 [48:28<04:06, 3.62s/it]
Training 2/2 epoch (loss 0.1240): 92%|ββββββββββ| 772/840 [48:31<04:06, 3.62s/it]
Training 2/2 epoch (loss 0.1240): 92%|ββββββββββ| 773/840 [48:31<04:02, 3.62s/it]
Training 2/2 epoch (loss 0.0559): 92%|ββββββββββ| 773/840 [48:35<04:02, 3.62s/it]
Training 2/2 epoch (loss 0.0559): 92%|ββββββββββ| 774/840 [48:35<03:57, 3.60s/it]
Training 2/2 epoch (loss 0.0889): 92%|ββββββββββ| 774/840 [48:38<03:57, 3.60s/it]
Training 2/2 epoch (loss 0.0889): 92%|ββββββββββ| 775/840 [48:38<03:45, 3.47s/it]
Training 2/2 epoch (loss 0.0742): 92%|ββββββββββ| 775/840 [48:42<03:45, 3.47s/it]
Training 2/2 epoch (loss 0.0742): 92%|ββββββββββ| 776/840 [48:42<03:52, 3.63s/it]
Training 2/2 epoch (loss 0.0157): 92%|ββββββββββ| 776/840 [48:47<03:52, 3.63s/it]
Training 2/2 epoch (loss 0.0157): 92%|ββββββββββ| 777/840 [48:47<04:12, 4.00s/it]
Training 2/2 epoch (loss 0.1455): 92%|ββββββββββ| 777/840 [48:51<04:12, 4.00s/it]
Training 2/2 epoch (loss 0.1455): 93%|ββββββββββ| 778/840 [48:51<04:18, 4.17s/it]
Training 2/2 epoch (loss 0.0698): 93%|ββββββββββ| 778/840 [48:57<04:18, 4.17s/it]
Training 2/2 epoch (loss 0.0698): 93%|ββββββββββ| 779/840 [48:57<04:37, 4.55s/it]
Training 2/2 epoch (loss 0.0591): 93%|ββββββββββ| 779/840 [49:01<04:37, 4.55s/it]
Training 2/2 epoch (loss 0.0591): 93%|ββββββββββ| 780/840 [49:01<04:24, 4.40s/it]
Training 2/2 epoch (loss 0.0077): 93%|ββββββββββ| 780/840 [49:05<04:24, 4.40s/it]
Training 2/2 epoch (loss 0.0077): 93%|ββββββββββ| 781/840 [49:05<04:12, 4.28s/it]
Training 2/2 epoch (loss 0.1226): 93%|ββββββββββ| 781/840 [49:08<04:12, 4.28s/it]
Training 2/2 epoch (loss 0.1226): 93%|ββββββββββ| 782/840 [49:08<03:43, 3.85s/it]
Training 2/2 epoch (loss 0.0391): 93%|ββββββββββ| 782/840 [49:11<03:43, 3.85s/it]
Training 2/2 epoch (loss 0.0391): 93%|ββββββββββ| 783/840 [49:11<03:33, 3.75s/it]
Training 2/2 epoch (loss 0.0635): 93%|ββββββββββ| 783/840 [49:16<03:33, 3.75s/it]
Training 2/2 epoch (loss 0.0635): 93%|ββββββββββ| 784/840 [49:16<03:52, 4.15s/it]
Training 2/2 epoch (loss 0.1543): 93%|ββββββββββ| 784/840 [49:20<03:52, 4.15s/it]
Training 2/2 epoch (loss 0.1543): 93%|ββββββββββ| 785/840 [49:20<03:40, 4.00s/it]
Training 2/2 epoch (loss 0.0698): 93%|ββββββββββ| 785/840 [49:26<03:40, 4.00s/it]
Training 2/2 epoch (loss 0.0698): 94%|ββββββββββ| 786/840 [49:26<04:00, 4.45s/it]
Training 2/2 epoch (loss 0.0493): 94%|ββββββββββ| 786/840 [49:29<04:00, 4.45s/it]
Training 2/2 epoch (loss 0.0493): 94%|ββββββββββ| 787/840 [49:29<03:41, 4.17s/it]
Training 2/2 epoch (loss 0.0518): 94%|ββββββββββ| 787/840 [49:34<03:41, 4.17s/it]
Training 2/2 epoch (loss 0.0518): 94%|ββββββββββ| 788/840 [49:34<03:43, 4.30s/it]
Training 2/2 epoch (loss 0.2695): 94%|ββββββββββ| 788/840 [49:37<03:43, 4.30s/it]
Training 2/2 epoch (loss 0.2695): 94%|ββββββββββ| 789/840 [49:37<03:21, 3.96s/it]
Training 2/2 epoch (loss 0.4727): 94%|ββββββββββ| 789/840 [49:42<03:21, 3.96s/it]
Training 2/2 epoch (loss 0.4727): 94%|ββββββββββ| 790/840 [49:42<03:41, 4.43s/it]
Training 2/2 epoch (loss 0.0918): 94%|ββββββββββ| 790/840 [49:48<03:41, 4.43s/it]
Training 2/2 epoch (loss 0.0918): 94%|ββββββββββ| 791/840 [49:48<03:52, 4.74s/it]
Training 2/2 epoch (loss 0.0255): 94%|ββββββββββ| 791/840 [49:53<03:52, 4.74s/it]
Training 2/2 epoch (loss 0.0255): 94%|ββββββββββ| 792/840 [49:53<03:58, 4.97s/it]
Training 2/2 epoch (loss 0.0214): 94%|ββββββββββ| 792/840 [49:57<03:58, 4.97s/it]
Training 2/2 epoch (loss 0.0214): 94%|ββββββββββ| 793/840 [49:57<03:31, 4.50s/it]
Training 2/2 epoch (loss 0.0361): 94%|ββββββββββ| 793/840 [50:00<03:31, 4.50s/it]
Training 2/2 epoch (loss 0.0361): 95%|ββββββββββ| 794/840 [50:00<03:13, 4.22s/it]
Training 2/2 epoch (loss 0.0684): 95%|ββββββββββ| 794/840 [50:05<03:13, 4.22s/it]
Training 2/2 epoch (loss 0.0684): 95%|ββββββββββ| 795/840 [50:05<03:14, 4.31s/it]
Training 2/2 epoch (loss 0.0306): 95%|ββββββββββ| 795/840 [50:10<03:14, 4.31s/it]
Training 2/2 epoch (loss 0.0306): 95%|ββββββββββ| 796/840 [50:10<03:17, 4.49s/it]
Training 2/2 epoch (loss 0.0079): 95%|ββββββββββ| 796/840 [50:14<03:17, 4.49s/it]
Training 2/2 epoch (loss 0.0079): 95%|ββββββββββ| 797/840 [50:14<03:06, 4.33s/it]
Training 2/2 epoch (loss 0.0544): 95%|ββββββββββ| 797/840 [50:17<03:06, 4.33s/it]
Training 2/2 epoch (loss 0.0544): 95%|ββββββββββ| 798/840 [50:17<02:54, 4.16s/it]
Training 2/2 epoch (loss 0.0205): 95%|ββββββββββ| 798/840 [50:20<02:54, 4.16s/it]
Training 2/2 epoch (loss 0.0205): 95%|ββββββββββ| 799/840 [50:20<02:32, 3.72s/it]
Training 2/2 epoch (loss 0.0732): 95%|ββββββββββ| 799/840 [50:25<02:32, 3.72s/it]
Training 2/2 epoch (loss 0.0732): 95%|ββββββββββ| 800/840 [50:25<02:42, 4.05s/it]
Training 2/2 epoch (loss 0.0366): 95%|ββββββββββ| 800/840 [50:28<02:42, 4.05s/it]
Training 2/2 epoch (loss 0.0366): 95%|ββββββββββ| 801/840 [50:28<02:23, 3.67s/it]
Training 2/2 epoch (loss 0.0310): 95%|ββββββββββ| 801/840 [50:33<02:23, 3.67s/it]
Training 2/2 epoch (loss 0.0310): 95%|ββββββββββ| 802/840 [50:33<02:33, 4.05s/it]
Training 2/2 epoch (loss 0.0332): 95%|ββββββββββ| 802/840 [50:36<02:33, 4.05s/it]
Training 2/2 epoch (loss 0.0332): 96%|ββββββββββ| 803/840 [50:36<02:20, 3.79s/it]
Training 2/2 epoch (loss 0.2227): 96%|ββββββββββ| 803/840 [50:39<02:20, 3.79s/it]
Training 2/2 epoch (loss 0.2227): 96%|ββββββββββ| 804/840 [50:39<02:14, 3.73s/it]
Training 2/2 epoch (loss 0.0654): 96%|ββββββββββ| 804/840 [50:43<02:14, 3.73s/it]
Training 2/2 epoch (loss 0.0654): 96%|ββββββββββ| 805/840 [50:43<02:03, 3.53s/it]
Training 2/2 epoch (loss 0.0537): 96%|ββββββββββ| 805/840 [50:46<02:03, 3.53s/it]
Training 2/2 epoch (loss 0.0537): 96%|ββββββββββ| 806/840 [50:46<01:54, 3.37s/it]
Training 2/2 epoch (loss 0.0986): 96%|ββββββββββ| 806/840 [50:48<01:54, 3.37s/it]
Training 2/2 epoch (loss 0.0986): 96%|ββββββββββ| 807/840 [50:48<01:45, 3.20s/it]
Training 2/2 epoch (loss 0.1133): 96%|ββββββββββ| 807/840 [50:53<01:45, 3.20s/it]
Training 2/2 epoch (loss 0.1133): 96%|ββββββββββ| 808/840 [50:53<01:52, 3.52s/it]
Training 2/2 epoch (loss 0.0403): 96%|ββββββββββ| 808/840 [50:55<01:52, 3.52s/it]
Training 2/2 epoch (loss 0.0403): 96%|ββββββββββ| 809/840 [50:55<01:42, 3.29s/it]
Training 2/2 epoch (loss 0.0062): 96%|ββββββββββ| 809/840 [50:58<01:42, 3.29s/it]
Training 2/2 epoch (loss 0.0062): 96%|ββββββββββ| 810/840 [50:58<01:36, 3.21s/it]
Training 2/2 epoch (loss 0.0194): 96%|ββββββββββ| 810/840 [51:02<01:36, 3.21s/it]
Training 2/2 epoch (loss 0.0194): 97%|ββββββββββ| 811/840 [51:02<01:32, 3.19s/it]
Training 2/2 epoch (loss 0.0359): 97%|ββββββββββ| 811/840 [51:04<01:32, 3.19s/it]
Training 2/2 epoch (loss 0.0359): 97%|ββββββββββ| 812/840 [51:04<01:22, 2.96s/it]
Training 2/2 epoch (loss 0.0425): 97%|ββββββββββ| 812/840 [51:07<01:22, 2.96s/it]
Training 2/2 epoch (loss 0.0425): 97%|ββββββββββ| 813/840 [51:07<01:24, 3.12s/it]
Training 2/2 epoch (loss 0.0284): 97%|ββββββββββ| 813/840 [51:11<01:24, 3.12s/it]
Training 2/2 epoch (loss 0.0284): 97%|ββββββββββ| 814/840 [51:11<01:21, 3.12s/it]
Training 2/2 epoch (loss 0.0483): 97%|ββββββββββ| 814/840 [51:13<01:21, 3.12s/it]
Training 2/2 epoch (loss 0.0483): 97%|ββββββββββ| 815/840 [51:13<01:13, 2.96s/it]
Training 2/2 epoch (loss 0.0396): 97%|ββββββββββ| 815/840 [51:17<01:13, 2.96s/it]
Training 2/2 epoch (loss 0.0396): 97%|ββββββββββ| 816/840 [51:17<01:13, 3.08s/it]
Training 2/2 epoch (loss 0.0383): 97%|ββββββββββ| 816/840 [51:20<01:13, 3.08s/it]
Training 2/2 epoch (loss 0.0383): 97%|ββββββββββ| 817/840 [51:20<01:16, 3.33s/it]
Training 2/2 epoch (loss 0.0718): 97%|ββββββββββ| 817/840 [51:24<01:16, 3.33s/it]
Training 2/2 epoch (loss 0.0718): 97%|ββββββββββ| 818/840 [51:24<01:13, 3.35s/it]
Training 2/2 epoch (loss 0.1543): 97%|ββββββββββ| 818/840 [51:29<01:13, 3.35s/it]
Training 2/2 epoch (loss 0.1543): 98%|ββββββββββ| 819/840 [51:29<01:23, 3.98s/it]
Training 2/2 epoch (loss 0.0223): 98%|ββββββββββ| 819/840 [51:35<01:23, 3.98s/it]
Training 2/2 epoch (loss 0.0223): 98%|ββββββββββ| 820/840 [51:35<01:29, 4.47s/it]
Training 2/2 epoch (loss 0.0791): 98%|ββββββββββ| 820/840 [51:39<01:29, 4.47s/it]
Training 2/2 epoch (loss 0.0791): 98%|ββββββββββ| 821/840 [51:39<01:22, 4.32s/it]
Training 2/2 epoch (loss 0.0574): 98%|ββββββββββ| 821/840 [51:44<01:22, 4.32s/it]
Training 2/2 epoch (loss 0.0574): 98%|ββββββββββ| 822/840 [51:44<01:24, 4.68s/it]
Training 2/2 epoch (loss 0.0101): 98%|ββββββββββ| 822/840 [51:50<01:24, 4.68s/it]
Training 2/2 epoch (loss 0.0101): 98%|ββββββββββ| 823/840 [51:50<01:23, 4.90s/it]
Training 2/2 epoch (loss 0.0542): 98%|ββββββββββ| 823/840 [51:53<01:23, 4.90s/it]
Training 2/2 epoch (loss 0.0542): 98%|ββββββββββ| 824/840 [51:53<01:09, 4.34s/it]
Training 2/2 epoch (loss 0.0564): 98%|ββββββββββ| 824/840 [51:55<01:09, 4.34s/it]
Training 2/2 epoch (loss 0.0564): 98%|ββββββββββ| 825/840 [51:55<00:56, 3.78s/it]
Training 2/2 epoch (loss 0.1426): 98%|ββββββββββ| 825/840 [52:00<00:56, 3.78s/it]
Training 2/2 epoch (loss 0.1426): 98%|ββββββββββ| 826/840 [52:00<00:56, 4.03s/it]
Training 2/2 epoch (loss 0.0206): 98%|ββββββββββ| 826/840 [52:05<00:56, 4.03s/it]
Training 2/2 epoch (loss 0.0206): 98%|ββββββββββ| 827/840 [52:05<00:54, 4.22s/it]
Training 2/2 epoch (loss 0.1108): 98%|ββββββββββ| 827/840 [52:09<00:54, 4.22s/it]
Training 2/2 epoch (loss 0.1108): 99%|ββββββββββ| 828/840 [52:09<00:52, 4.39s/it]
Training 2/2 epoch (loss 0.0576): 99%|ββββββββββ| 828/840 [52:12<00:52, 4.39s/it]
Training 2/2 epoch (loss 0.0576): 99%|ββββββββββ| 829/840 [52:12<00:42, 3.86s/it]
Training 2/2 epoch (loss 0.0596): 99%|ββββββββββ| 829/840 [52:18<00:42, 3.86s/it]
Training 2/2 epoch (loss 0.0596): 99%|ββββββββββ| 830/840 [52:18<00:43, 4.37s/it]
Training 2/2 epoch (loss 0.0610): 99%|ββββββββββ| 830/840 [52:20<00:43, 4.37s/it]
Training 2/2 epoch (loss 0.0610): 99%|ββββββββββ| 831/840 [52:20<00:35, 3.93s/it]
Training 2/2 epoch (loss 0.0079): 99%|ββββββββββ| 831/840 [52:24<00:35, 3.93s/it]
Training 2/2 epoch (loss 0.0079): 99%|ββββββββββ| 832/840 [52:24<00:29, 3.73s/it]
Training 2/2 epoch (loss 0.0393): 99%|ββββββββββ| 832/840 [52:27<00:29, 3.73s/it]
Training 2/2 epoch (loss 0.0393): 99%|ββββββββββ| 833/840 [52:27<00:24, 3.53s/it]
Training 2/2 epoch (loss 0.0503): 99%|ββββββββββ| 833/840 [52:31<00:24, 3.53s/it]
Training 2/2 epoch (loss 0.0503): 99%|ββββββββββ| 834/840 [52:31<00:22, 3.78s/it]
Training 2/2 epoch (loss 0.0415): 99%|ββββββββββ| 834/840 [52:35<00:22, 3.78s/it]
Training 2/2 epoch (loss 0.0415): 99%|ββββββββββ| 835/840 [52:35<00:19, 3.91s/it]
Training 2/2 epoch (loss 0.0461): 99%|ββββββββββ| 835/840 [52:40<00:19, 3.91s/it]
Training 2/2 epoch (loss 0.0461): 100%|ββββββββββ| 836/840 [52:40<00:16, 4.15s/it]
Training 2/2 epoch (loss 0.0581): 100%|ββββββββββ| 836/840 [52:43<00:16, 4.15s/it]
Training 2/2 epoch (loss 0.0581): 100%|ββββββββββ| 837/840 [52:43<00:11, 3.90s/it]
Training 2/2 epoch (loss 0.0503): 100%|ββββββββββ| 837/840 [52:47<00:11, 3.90s/it]
Training 2/2 epoch (loss 0.0503): 100%|ββββββββββ| 838/840 [52:47<00:07, 3.86s/it]
Training 2/2 epoch (loss 0.1348): 100%|ββββββββββ| 838/840 [52:50<00:07, 3.86s/it]
Training 2/2 epoch (loss 0.1348): 100%|ββββββββββ| 839/840 [52:50<00:03, 3.63s/it]
Training 2/2 epoch (loss 0.0248): 100%|ββββββββββ| 839/840 [52:54<00:03, 3.63s/it]
Training 2/2 epoch (loss 0.0248): 100%|ββββββββββ| 840/840 [52:54<00:00, 3.64s/it]
Training 2/2 epoch (loss 0.0248): 100%|ββββββββββ| 840/840 [52:54<00:00, 3.78s/it] |