|
[2025-02-14 07:22:25,511][00436] Saving configuration to /content/train_dir/default_experiment/config.json... |
|
[2025-02-14 07:22:25,513][00436] Rollout worker 0 uses device cpu |
|
[2025-02-14 07:22:25,515][00436] Rollout worker 1 uses device cpu |
|
[2025-02-14 07:22:25,517][00436] Rollout worker 2 uses device cpu |
|
[2025-02-14 07:22:25,518][00436] Rollout worker 3 uses device cpu |
|
[2025-02-14 07:22:25,519][00436] Rollout worker 4 uses device cpu |
|
[2025-02-14 07:22:25,520][00436] Rollout worker 5 uses device cpu |
|
[2025-02-14 07:22:25,521][00436] Rollout worker 6 uses device cpu |
|
[2025-02-14 07:22:25,522][00436] Rollout worker 7 uses device cpu |
|
[2025-02-14 07:22:25,670][00436] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:22:25,672][00436] InferenceWorker_p0-w0: min num requests: 2 |
|
[2025-02-14 07:22:25,705][00436] Starting all processes... |
|
[2025-02-14 07:22:25,706][00436] Starting process learner_proc0 |
|
[2025-02-14 07:22:25,763][00436] Starting all processes... |
|
[2025-02-14 07:22:25,777][00436] Starting process inference_proc0-0 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc0 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc1 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc2 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc3 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc4 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc5 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc6 |
|
[2025-02-14 07:22:25,777][00436] Starting process rollout_proc7 |
|
[2025-02-14 07:22:42,193][04608] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:22:42,196][04608] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2025-02-14 07:22:42,328][04622] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:22:42,329][04622] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2025-02-14 07:22:42,336][04608] Num visible devices: 1 |
|
[2025-02-14 07:22:42,359][04622] Num visible devices: 1 |
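
The two entries above show each GPU-bound process pinning itself to physical GPU 0 by setting CUDA_VISIBLE_DEVICES before CUDA initializes, after which exactly one device is visible. A minimal sketch of that step (illustrative, not Sample Factory's code):

import os
import torch

# Must run before the first CUDA call in the process; afterwards, cuda:0
# inside this process refers to physical GPU 0 and only one device is visible.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
print("Num visible devices:", torch.cuda.device_count())  # -> 1 on a GPU machine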
|
[2025-02-14 07:22:42,368][04608] Starting seed is not provided |
|
[2025-02-14 07:22:42,368][04608] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:22:42,369][04608] Initializing actor-critic model on device cuda:0 |
|
[2025-02-14 07:22:42,370][04608] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:22:42,373][04608] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:22:42,412][04608] ConvEncoder: input_channels=3 |
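
The learner builds two RunningMeanStd normalizers: one over image observations of shape (3, 72, 128) and one over scalar returns of shape (1,). A minimal sketch of such a running mean/std tracker, assuming the Welford/Chan parallel-update scheme commonly used in RL (the class below is illustrative, not Sample Factory's implementation):

import torch

class RunningMeanStd:
    """Running per-element mean/variance used to normalize inputs.

    A sketch matching the logged shapes (3, 72, 128) and (1,);
    not Sample Factory's actual implementation.
    """

    def __init__(self, shape, eps=1e-5):
        self.mean = torch.zeros(shape)
        self.var = torch.ones(shape)
        self.count = eps  # avoids division by zero before the first update

    def update(self, batch):
        # Chan et al. parallel update: merge batch stats into running stats.
        batch_mean = batch.mean(dim=0)
        batch_var = batch.var(dim=0, unbiased=False)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.mean = self.mean + delta * batch_count / total
        self.var = (m_a + m_b + delta.pow(2) * self.count * batch_count / total) / total
        self.count = total

    def normalize(self, x):
        return (x - self.mean) / torch.sqrt(self.var + 1e-8)

obs_norm = RunningMeanStd((3, 72, 128))  # image observations
returns_norm = RunningMeanStd((1,))      # scalar returns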
|
[2025-02-14 07:22:42,426][04627] Worker 5 uses CPU cores [1] |
|
[2025-02-14 07:22:42,471][04624] Worker 2 uses CPU cores [0] |
|
[2025-02-14 07:22:42,557][04629] Worker 7 uses CPU cores [1] |
|
[2025-02-14 07:22:42,655][04623] Worker 1 uses CPU cores [1] |
|
[2025-02-14 07:22:42,771][04625] Worker 3 uses CPU cores [1] |
|
[2025-02-14 07:22:42,873][04626] Worker 4 uses CPU cores [0] |
|
[2025-02-14 07:22:42,916][04621] Worker 0 uses CPU cores [0] |
|
[2025-02-14 07:22:42,932][04608] Conv encoder output size: 512 |
|
[2025-02-14 07:22:42,932][04608] Policy head output size: 512 |
|
[2025-02-14 07:22:42,956][04628] Worker 6 uses CPU cores [0] |
|
[2025-02-14 07:22:42,998][04608] Created Actor Critic model with architecture: |
|
[2025-02-14 07:22:42,999][04608] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
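
The printout above pins down the model's shape: a three-stage Conv2d+ELU head, an MLP projecting to a 512-dim embedding, a GRU(512, 512) core, a 1-dim value head, and 5 action logits. A PyTorch sketch that reproduces that structure; the conv kernel sizes and strides are not in the printout, so the classic 8x8/stride-4, 4x4/stride-2, 3x3/stride-2 stack is assumed here:

import torch
from torch import nn

class ActorCriticSketch(nn.Module):
    """Sketch of the printed architecture; filter sizes are assumptions."""

    def __init__(self, num_actions=5, hidden=512):
        super().__init__()
        self.conv_head = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, 3, stride=2), nn.ELU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # size the MLP from a dummy (3, 72, 128) input
            conv_out = self.conv_head(torch.zeros(1, 3, 72, 128)).shape[1]
        self.mlp = nn.Sequential(nn.Linear(conv_out, hidden), nn.ELU())
        self.core = nn.GRU(hidden, hidden)                 # (core): GRU(512, 512)
        self.critic_linear = nn.Linear(hidden, 1)          # value head
        self.distribution_linear = nn.Linear(hidden, num_actions)  # 5 action logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp(self.conv_head(obs))                    # (B, 512) embedding
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)  # GRU over a length-1 sequence
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state

The dummy forward pass sizes the Linear layer automatically; the MLP projects the flattened conv features to the 512-dim encoder output the log reports.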
|
[2025-02-14 07:22:43,249][04608] Using optimizer <class 'torch.optim.adam.Adam'> |
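
The log names only the optimizer class; learning rate and betas are not shown. Constructing it for the sketch above would look like this, with placeholder hyperparameters (not taken from this run):

model = ActorCriticSketch()  # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999), eps=1e-6)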
|
[2025-02-14 07:22:45,671][00436] Heartbeat connected on InferenceWorker_p0-w0 |
|
[2025-02-14 07:22:45,679][00436] Heartbeat connected on RolloutWorker_w0 |
|
[2025-02-14 07:22:45,683][00436] Heartbeat connected on RolloutWorker_w1 |
|
[2025-02-14 07:22:45,686][00436] Heartbeat connected on RolloutWorker_w2 |
|
[2025-02-14 07:22:45,690][00436] Heartbeat connected on RolloutWorker_w3 |
|
[2025-02-14 07:22:45,693][00436] Heartbeat connected on RolloutWorker_w4 |
|
[2025-02-14 07:22:45,697][00436] Heartbeat connected on RolloutWorker_w5 |
|
[2025-02-14 07:22:45,700][00436] Heartbeat connected on RolloutWorker_w6 |
|
[2025-02-14 07:22:45,704][00436] Heartbeat connected on RolloutWorker_w7 |
|
[2025-02-14 07:22:45,973][00436] Heartbeat connected on Batcher_0 |
|
[2025-02-14 07:22:47,317][04608] No checkpoints found |
|
[2025-02-14 07:22:47,317][04608] Did not load from checkpoint, starting from scratch! |
|
[2025-02-14 07:22:47,318][04608] Initialized policy 0 weights for model version 0 |
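
The three lines above are the restore path: scan for existing checkpoints, and fall back to freshly initialized weights at model version 0 when none exist. A hedged sketch of that decision, using the checkpoint_<version>_<steps>.pth naming visible later in this log (the state-dict keys are assumptions):

import glob
import os

import torch

def load_latest_checkpoint(model, checkpoint_dir):
    # Zero-padded version numbers make lexicographic order chronological.
    paths = sorted(glob.glob(os.path.join(checkpoint_dir, "checkpoint_*.pth")))
    if not paths:
        print("No checkpoints found")
        print("Did not load from checkpoint, starting from scratch!")
        return 0  # model version 0
    state = torch.load(paths[-1], map_location="cpu")
    model.load_state_dict(state["model"])      # key name assumed
    return state.get("policy_version", 0)      # key name assumed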
|
[2025-02-14 07:22:47,322][04608] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:22:47,330][04608] LearnerWorker_p0 finished initialization! |
|
[2025-02-14 07:22:47,331][00436] Heartbeat connected on LearnerWorker_p0 |
|
[2025-02-14 07:22:47,503][04622] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:22:47,504][04622] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:22:47,515][04622] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:22:47,617][04622] Conv encoder output size: 512 |
|
[2025-02-14 07:22:47,617][04622] Policy head output size: 512 |
|
[2025-02-14 07:22:47,652][00436] Inference worker 0-0 is ready! |
|
[2025-02-14 07:22:47,654][00436] All inference workers are ready! Signal rollout workers to start! |
|
[2025-02-14 07:22:47,944][04624] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:47,954][04625] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:47,976][04628] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:47,977][04627] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:47,982][04629] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:47,980][04621] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:48,031][04623] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:22:48,115][04626] Doom resolution: 160x120, resize resolution: (128, 72) |
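
Every rollout worker renders Doom at 160x120 and resizes to (128, 72), i.e. width 128 by height 72, matching the (3, 72, 128) channels-first observations the model was initialized with. A cv2-based sketch of that preprocessing (the actual interpolation and layout details are not in the log):

import cv2
import numpy as np

def preprocess(frame):  # frame: (120, 160, 3) uint8, height x width x channels
    resized = cv2.resize(frame, (128, 72), interpolation=cv2.INTER_AREA)  # dsize is (W, H)
    return np.transpose(resized, (2, 0, 1))  # -> (3, 72, 128), channels-first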
|
[2025-02-14 07:22:49,108][04629] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:49,176][04621] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:49,175][04628] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:49,178][04624] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:49,240][00436] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
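
This is the first throughput report; all three windows read nan because no samples exist yet. The format gives FPS averaged over the last 10, 60, and 300 seconds, the cumulative frame count, per-policy sample throughput, and policy-lag statistics. A sketch of the windowed-FPS bookkeeping implied by these lines (illustrative, not Sample Factory's reporting code):

import time
from collections import deque

class FpsTracker:
    """Windowed FPS over the last 10/60/300 seconds, mirroring the log format."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.history = deque()  # (timestamp, total_frames) samples

    def record(self, total_frames):
        self.history.append((time.time(), total_frames))

    def fps(self):
        now = time.time()
        rates = {}
        for w in self.windows:
            recent = [(t, f) for t, f in self.history if now - t <= w]
            if len(recent) < 2:
                rates[w] = float("nan")  # matches the all-nan line at startup
            else:
                (t0, f0), (t1, f1) = recent[0], recent[-1]
                rates[w] = (f1 - f0) / max(t1 - t0, 1e-9)
        return rates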
|
[2025-02-14 07:22:49,583][04627] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:50,249][04624] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:50,251][04621] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:50,258][04628] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:50,268][04626] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:50,678][04629] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:51,497][04627] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:51,530][04626] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:51,821][04629] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:51,852][04621] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:51,854][04628] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:52,490][04623] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:22:52,693][04624] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:52,715][04627] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:53,247][04623] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:22:53,911][04628] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:54,152][04621] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:54,240][00436] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-02-14 07:22:54,306][04624] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:54,942][04623] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:55,369][04627] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:55,639][04626] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:22:56,649][04623] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:57,544][04629] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:22:59,080][04626] Decorrelating experience for 96 frames... |
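
Before training starts, each rollout worker steps its environment through staggered amounts of experience (0, 32, 64, 96 frames above) so the eight workers' trajectories do not run in lockstep. A gymnasium-style sketch of that staged warm-up, assuming random actions (the exact mechanism is internal to Sample Factory):

def decorrelate(env, num_stages=4, frames_per_stage=32):
    # Step through increasing amounts of experience, logging cumulative frames
    # like the worker lines above; random actions are an assumption here.
    obs, info = env.reset()
    for stage in range(num_stages):
        print(f"Decorrelating experience for {stage * frames_per_stage} frames...")
        for _ in range(frames_per_stage):
            obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
            if terminated or truncated:
                obs, info = env.reset()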
|
[2025-02-14 07:22:59,227][04608] Signal inference workers to stop experience collection... |
|
[2025-02-14 07:22:59,238][04622] InferenceWorker_p0-w0: stopping experience collection |
|
[2025-02-14 07:22:59,240][00436] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 38.2. Samples: 382. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-02-14 07:22:59,247][00436] Avg episode reward: [(0, '2.646')] |
|
[2025-02-14 07:23:01,301][04608] Signal inference workers to resume experience collection... |
|
[2025-02-14 07:23:01,302][04622] InferenceWorker_p0-w0: resuming experience collection |
|
[2025-02-14 07:23:04,240][00436] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 16384. Throughput: 0: 243.2. Samples: 3648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:23:04,242][00436] Avg episode reward: [(0, '3.394')] |
|
[2025-02-14 07:23:09,240][00436] Fps is (10 sec: 3686.4, 60 sec: 1843.2, 300 sec: 1843.2). Total num frames: 36864. Throughput: 0: 479.7. Samples: 9594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:23:09,244][00436] Avg episode reward: [(0, '3.836')] |
|
[2025-02-14 07:23:10,469][04622] Updated weights for policy 0, policy_version 10 (0.0013) |
|
[2025-02-14 07:23:14,240][00436] Fps is (10 sec: 3686.4, 60 sec: 2129.9, 300 sec: 2129.9). Total num frames: 53248. Throughput: 0: 458.2. Samples: 11456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:14,248][00436] Avg episode reward: [(0, '4.286')] |
|
[2025-02-14 07:23:19,240][00436] Fps is (10 sec: 3686.4, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 73728. Throughput: 0: 591.1. Samples: 17734. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:23:19,245][00436] Avg episode reward: [(0, '4.245')] |
|
[2025-02-14 07:23:20,481][04622] Updated weights for policy 0, policy_version 20 (0.0021) |
|
[2025-02-14 07:23:24,241][00436] Fps is (10 sec: 3686.0, 60 sec: 2574.5, 300 sec: 2574.5). Total num frames: 90112. Throughput: 0: 670.6. Samples: 23472. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:23:24,244][00436] Avg episode reward: [(0, '4.288')] |
|
[2025-02-14 07:23:29,240][00436] Fps is (10 sec: 3686.4, 60 sec: 2764.8, 300 sec: 2764.8). Total num frames: 110592. Throughput: 0: 636.9. Samples: 25476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:23:29,242][00436] Avg episode reward: [(0, '4.367')] |
|
[2025-02-14 07:23:29,252][04608] Saving new best policy, reward=4.367! |
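
From here on the learner tracks the best average episode reward seen so far and snapshots the policy whenever it improves, as the repeated "Saving new best policy" lines show. The bookkeeping amounts to a running maximum (sketch, with a hypothetical save_fn callback):

class BestPolicyTracker:
    def __init__(self):
        self.best_reward = float("-inf")

    def update(self, avg_reward, save_fn):
        # Snapshot only on strict improvement over the best reward so far.
        if avg_reward > self.best_reward:
            self.best_reward = avg_reward
            print(f"Saving new best policy, reward={avg_reward:.3f}!")
            save_fn()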
|
[2025-02-14 07:23:31,799][04622] Updated weights for policy 0, policy_version 30 (0.0018) |
|
[2025-02-14 07:23:34,240][00436] Fps is (10 sec: 4096.5, 60 sec: 2912.7, 300 sec: 2912.7). Total num frames: 131072. Throughput: 0: 714.4. Samples: 32150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:34,245][00436] Avg episode reward: [(0, '4.274')] |
|
[2025-02-14 07:23:39,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3031.0, 300 sec: 3031.0). Total num frames: 151552. Throughput: 0: 848.7. Samples: 38192. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:39,243][00436] Avg episode reward: [(0, '4.377')] |
|
[2025-02-14 07:23:39,249][04608] Saving new best policy, reward=4.377! |
|
[2025-02-14 07:23:42,735][04622] Updated weights for policy 0, policy_version 40 (0.0017) |
|
[2025-02-14 07:23:44,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3053.4, 300 sec: 3053.4). Total num frames: 167936. Throughput: 0: 886.0. Samples: 40252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:23:44,242][00436] Avg episode reward: [(0, '4.520')] |
|
[2025-02-14 07:23:44,249][04608] Saving new best policy, reward=4.520! |
|
[2025-02-14 07:23:49,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3208.5). Total num frames: 192512. Throughput: 0: 963.3. Samples: 46996. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:49,244][00436] Avg episode reward: [(0, '4.367')] |
|
[2025-02-14 07:23:51,748][04622] Updated weights for policy 0, policy_version 50 (0.0012) |
|
[2025-02-14 07:23:54,241][00436] Fps is (10 sec: 4095.7, 60 sec: 3481.5, 300 sec: 3213.7). Total num frames: 208896. Throughput: 0: 963.8. Samples: 52966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:54,244][00436] Avg episode reward: [(0, '4.445')] |
|
[2025-02-14 07:23:59,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 229376. Throughput: 0: 975.2. Samples: 55340. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:23:59,242][00436] Avg episode reward: [(0, '4.501')] |
|
[2025-02-14 07:24:02,566][04622] Updated weights for policy 0, policy_version 60 (0.0015) |
|
[2025-02-14 07:24:04,240][00436] Fps is (10 sec: 4096.4, 60 sec: 3891.2, 300 sec: 3331.4). Total num frames: 249856. Throughput: 0: 986.3. Samples: 62116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:24:04,243][00436] Avg episode reward: [(0, '4.277')] |
|
[2025-02-14 07:24:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3379.2). Total num frames: 270336. Throughput: 0: 987.7. Samples: 67916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:24:09,247][00436] Avg episode reward: [(0, '4.343')] |
|
[2025-02-14 07:24:13,251][04622] Updated weights for policy 0, policy_version 70 (0.0015) |
|
[2025-02-14 07:24:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3421.4). Total num frames: 290816. Throughput: 0: 995.9. Samples: 70290. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:24:14,244][00436] Avg episode reward: [(0, '4.337')] |
|
[2025-02-14 07:24:19,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3458.8). Total num frames: 311296. Throughput: 0: 1001.8. Samples: 77232. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:24:19,246][00436] Avg episode reward: [(0, '4.387')] |
|
[2025-02-14 07:24:19,254][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000076_311296.pth... |
|
[2025-02-14 07:24:23,200][04622] Updated weights for policy 0, policy_version 80 (0.0012) |
|
[2025-02-14 07:24:24,242][00436] Fps is (10 sec: 3685.8, 60 sec: 3959.4, 300 sec: 3449.2). Total num frames: 327680. Throughput: 0: 987.6. Samples: 82636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:24:24,244][00436] Avg episode reward: [(0, '4.382')] |
|
[2025-02-14 07:24:29,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3481.6). Total num frames: 348160. Throughput: 0: 998.9. Samples: 85204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:24:29,242][00436] Avg episode reward: [(0, '4.256')] |
|
[2025-02-14 07:24:33,177][04622] Updated weights for policy 0, policy_version 90 (0.0024) |
|
[2025-02-14 07:24:34,240][00436] Fps is (10 sec: 4506.4, 60 sec: 4027.7, 300 sec: 3549.9). Total num frames: 372736. Throughput: 0: 1000.8. Samples: 92034. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:24:34,242][00436] Avg episode reward: [(0, '4.277')] |
|
[2025-02-14 07:24:39,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3537.5). Total num frames: 389120. Throughput: 0: 989.5. Samples: 97492. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:24:39,243][00436] Avg episode reward: [(0, '4.448')] |
|
[2025-02-14 07:24:43,860][04622] Updated weights for policy 0, policy_version 100 (0.0023) |
|
[2025-02-14 07:24:44,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3561.7). Total num frames: 409600. Throughput: 0: 1000.4. Samples: 100360. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:24:44,245][00436] Avg episode reward: [(0, '4.594')] |
|
[2025-02-14 07:24:44,247][04608] Saving new best policy, reward=4.594! |
|
[2025-02-14 07:24:49,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3584.0). Total num frames: 430080. Throughput: 0: 997.3. Samples: 106996. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:24:49,246][00436] Avg episode reward: [(0, '4.487')] |
|
[2025-02-14 07:24:54,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3571.7). Total num frames: 446464. Throughput: 0: 985.6. Samples: 112270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:24:54,242][00436] Avg episode reward: [(0, '4.411')] |
|
[2025-02-14 07:24:54,735][04622] Updated weights for policy 0, policy_version 110 (0.0012) |
|
[2025-02-14 07:24:59,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3591.9). Total num frames: 466944. Throughput: 0: 999.7. Samples: 115278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:24:59,246][00436] Avg episode reward: [(0, '4.199')] |
|
[2025-02-14 07:25:03,690][04622] Updated weights for policy 0, policy_version 120 (0.0019) |
|
[2025-02-14 07:25:04,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3640.9). Total num frames: 491520. Throughput: 0: 996.9. Samples: 122092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:04,246][00436] Avg episode reward: [(0, '4.182')] |
|
[2025-02-14 07:25:09,242][00436] Fps is (10 sec: 4095.4, 60 sec: 3959.4, 300 sec: 3627.8). Total num frames: 507904. Throughput: 0: 988.2. Samples: 127106. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:25:09,244][00436] Avg episode reward: [(0, '4.300')] |
|
[2025-02-14 07:25:14,240][00436] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3644.0). Total num frames: 528384. Throughput: 0: 1002.4. Samples: 130310. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:14,243][00436] Avg episode reward: [(0, '4.578')] |
|
[2025-02-14 07:25:14,599][04622] Updated weights for policy 0, policy_version 130 (0.0013) |
|
[2025-02-14 07:25:19,241][00436] Fps is (10 sec: 4505.9, 60 sec: 4027.7, 300 sec: 3686.4). Total num frames: 552960. Throughput: 0: 1000.6. Samples: 137060. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:19,246][00436] Avg episode reward: [(0, '4.645')] |
|
[2025-02-14 07:25:19,251][04608] Saving new best policy, reward=4.645! |
|
[2025-02-14 07:25:24,240][00436] Fps is (10 sec: 3686.5, 60 sec: 3959.6, 300 sec: 3646.8). Total num frames: 565248. Throughput: 0: 986.7. Samples: 141892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:25:24,244][00436] Avg episode reward: [(0, '4.720')] |
|
[2025-02-14 07:25:24,249][04608] Saving new best policy, reward=4.720! |
|
[2025-02-14 07:25:25,492][04622] Updated weights for policy 0, policy_version 140 (0.0027) |
|
[2025-02-14 07:25:29,240][00436] Fps is (10 sec: 3686.7, 60 sec: 4027.7, 300 sec: 3686.4). Total num frames: 589824. Throughput: 0: 997.2. Samples: 145232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:29,243][00436] Avg episode reward: [(0, '4.658')] |
|
[2025-02-14 07:25:34,244][00436] Fps is (10 sec: 4504.0, 60 sec: 3959.2, 300 sec: 3698.7). Total num frames: 610304. Throughput: 0: 1001.8. Samples: 152080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:25:34,246][00436] Avg episode reward: [(0, '4.672')] |
|
[2025-02-14 07:25:34,750][04622] Updated weights for policy 0, policy_version 150 (0.0025) |
|
[2025-02-14 07:25:39,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3686.4). Total num frames: 626688. Throughput: 0: 993.2. Samples: 156964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:39,246][00436] Avg episode reward: [(0, '4.699')] |
|
[2025-02-14 07:25:44,240][00436] Fps is (10 sec: 3687.7, 60 sec: 3959.5, 300 sec: 3698.1). Total num frames: 647168. Throughput: 0: 1002.0. Samples: 160368. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:44,246][00436] Avg episode reward: [(0, '4.515')] |
|
[2025-02-14 07:25:45,218][04622] Updated weights for policy 0, policy_version 160 (0.0016) |
|
[2025-02-14 07:25:49,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3731.9). Total num frames: 671744. Throughput: 0: 999.6. Samples: 167076. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:49,249][00436] Avg episode reward: [(0, '4.365')] |
|
[2025-02-14 07:25:54,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3719.6). Total num frames: 688128. Throughput: 0: 997.6. Samples: 171998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:25:54,245][00436] Avg episode reward: [(0, '4.361')] |
|
[2025-02-14 07:25:55,918][04622] Updated weights for policy 0, policy_version 170 (0.0014) |
|
[2025-02-14 07:25:59,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3729.5). Total num frames: 708608. Throughput: 0: 1003.0. Samples: 175446. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:25:59,243][00436] Avg episode reward: [(0, '4.560')] |
|
[2025-02-14 07:26:04,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3738.9). Total num frames: 729088. Throughput: 0: 1004.5. Samples: 182262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:04,249][00436] Avg episode reward: [(0, '4.709')] |
|
[2025-02-14 07:26:06,132][04622] Updated weights for policy 0, policy_version 180 (0.0019) |
|
[2025-02-14 07:26:09,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3727.4). Total num frames: 745472. Throughput: 0: 1005.4. Samples: 187136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:09,244][00436] Avg episode reward: [(0, '4.697')] |
|
[2025-02-14 07:26:14,240][00436] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 3756.3). Total num frames: 770048. Throughput: 0: 1007.0. Samples: 190548. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:14,243][00436] Avg episode reward: [(0, '4.887')] |
|
[2025-02-14 07:26:14,246][04608] Saving new best policy, reward=4.887! |
|
[2025-02-14 07:26:15,649][04622] Updated weights for policy 0, policy_version 190 (0.0013) |
|
[2025-02-14 07:26:19,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3744.9). Total num frames: 786432. Throughput: 0: 998.9. Samples: 197026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:26:19,244][00436] Avg episode reward: [(0, '4.872')] |
|
[2025-02-14 07:26:19,255][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth... |
|
[2025-02-14 07:26:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3753.1). Total num frames: 806912. Throughput: 0: 998.2. Samples: 201882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:26:24,246][00436] Avg episode reward: [(0, '4.842')] |
|
[2025-02-14 07:26:26,561][04622] Updated weights for policy 0, policy_version 200 (0.0018) |
|
[2025-02-14 07:26:29,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3779.5). Total num frames: 831488. Throughput: 0: 999.8. Samples: 205358. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:29,242][00436] Avg episode reward: [(0, '4.864')] |
|
[2025-02-14 07:26:34,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.7, 300 sec: 3768.3). Total num frames: 847872. Throughput: 0: 996.6. Samples: 211922. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:34,246][00436] Avg episode reward: [(0, '4.791')] |
|
[2025-02-14 07:26:37,364][04622] Updated weights for policy 0, policy_version 210 (0.0014) |
|
[2025-02-14 07:26:39,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3775.4). Total num frames: 868352. Throughput: 0: 1002.2. Samples: 217098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:39,242][00436] Avg episode reward: [(0, '4.939')] |
|
[2025-02-14 07:26:39,247][04608] Saving new best policy, reward=4.939! |
|
[2025-02-14 07:26:44,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3782.3). Total num frames: 888832. Throughput: 0: 1001.7. Samples: 220524. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:44,243][00436] Avg episode reward: [(0, '5.081')] |
|
[2025-02-14 07:26:44,245][04608] Saving new best policy, reward=5.081! |
|
[2025-02-14 07:26:46,331][04622] Updated weights for policy 0, policy_version 220 (0.0020) |
|
[2025-02-14 07:26:49,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3771.7). Total num frames: 905216. Throughput: 0: 987.6. Samples: 226706. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:49,252][00436] Avg episode reward: [(0, '5.290')] |
|
[2025-02-14 07:26:49,320][04608] Saving new best policy, reward=5.290! |
|
[2025-02-14 07:26:54,242][00436] Fps is (10 sec: 3685.8, 60 sec: 3959.4, 300 sec: 3778.3). Total num frames: 925696. Throughput: 0: 995.3. Samples: 231926. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:26:54,244][00436] Avg episode reward: [(0, '5.280')] |
|
[2025-02-14 07:26:57,160][04622] Updated weights for policy 0, policy_version 230 (0.0017) |
|
[2025-02-14 07:26:59,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3801.1). Total num frames: 950272. Throughput: 0: 995.2. Samples: 235330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:26:59,242][00436] Avg episode reward: [(0, '5.437')] |
|
[2025-02-14 07:26:59,253][04608] Saving new best policy, reward=5.437! |
|
[2025-02-14 07:27:04,240][00436] Fps is (10 sec: 4096.6, 60 sec: 3959.5, 300 sec: 3790.8). Total num frames: 966656. Throughput: 0: 987.3. Samples: 241456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:27:04,244][00436] Avg episode reward: [(0, '5.557')] |
|
[2025-02-14 07:27:04,249][04608] Saving new best policy, reward=5.557! |
|
[2025-02-14 07:27:08,109][04622] Updated weights for policy 0, policy_version 240 (0.0014) |
|
[2025-02-14 07:27:09,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3796.7). Total num frames: 987136. Throughput: 0: 1000.1. Samples: 246888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:27:09,242][00436] Avg episode reward: [(0, '5.509')] |
|
[2025-02-14 07:27:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3802.3). Total num frames: 1007616. Throughput: 0: 997.3. Samples: 250238. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:27:14,242][00436] Avg episode reward: [(0, '5.790')] |
|
[2025-02-14 07:27:14,245][04608] Saving new best policy, reward=5.790! |
|
[2025-02-14 07:27:18,724][04622] Updated weights for policy 0, policy_version 250 (0.0018) |
|
[2025-02-14 07:27:19,241][00436] Fps is (10 sec: 3686.1, 60 sec: 3959.4, 300 sec: 3792.6). Total num frames: 1024000. Throughput: 0: 975.1. Samples: 255804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:27:19,249][00436] Avg episode reward: [(0, '6.032')] |
|
[2025-02-14 07:27:19,258][04608] Saving new best policy, reward=6.032! |
|
[2025-02-14 07:27:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3798.1). Total num frames: 1044480. Throughput: 0: 980.8. Samples: 261234. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:27:24,249][00436] Avg episode reward: [(0, '6.231')] |
|
[2025-02-14 07:27:24,253][04608] Saving new best policy, reward=6.231! |
|
[2025-02-14 07:27:28,638][04622] Updated weights for policy 0, policy_version 260 (0.0035) |
|
[2025-02-14 07:27:29,242][00436] Fps is (10 sec: 4095.6, 60 sec: 3891.1, 300 sec: 3803.4). Total num frames: 1064960. Throughput: 0: 978.8. Samples: 264572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:27:29,245][00436] Avg episode reward: [(0, '6.061')] |
|
[2025-02-14 07:27:34,240][00436] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3794.2). Total num frames: 1081344. Throughput: 0: 973.4. Samples: 270508. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:27:34,242][00436] Avg episode reward: [(0, '6.566')] |
|
[2025-02-14 07:27:34,248][04608] Saving new best policy, reward=6.566! |
|
[2025-02-14 07:27:39,168][04622] Updated weights for policy 0, policy_version 270 (0.0016) |
|
[2025-02-14 07:27:39,240][00436] Fps is (10 sec: 4096.7, 60 sec: 3959.5, 300 sec: 3813.5). Total num frames: 1105920. Throughput: 0: 986.1. Samples: 276300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:27:39,243][00436] Avg episode reward: [(0, '6.791')] |
|
[2025-02-14 07:27:39,253][04608] Saving new best policy, reward=6.791! |
|
[2025-02-14 07:27:44,240][00436] Fps is (10 sec: 4505.6, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 1126400. Throughput: 0: 985.7. Samples: 279688. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:27:44,246][00436] Avg episode reward: [(0, '7.181')] |
|
[2025-02-14 07:27:44,249][04608] Saving new best policy, reward=7.181! |
|
[2025-02-14 07:27:49,247][00436] Fps is (10 sec: 3683.8, 60 sec: 3959.0, 300 sec: 3873.8). Total num frames: 1142784. Throughput: 0: 971.6. Samples: 285186. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:27:49,252][00436] Avg episode reward: [(0, '7.295')] |
|
[2025-02-14 07:27:49,259][04608] Saving new best policy, reward=7.295! |
|
[2025-02-14 07:27:50,495][04622] Updated weights for policy 0, policy_version 280 (0.0024) |
|
[2025-02-14 07:27:54,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3943.3). Total num frames: 1163264. Throughput: 0: 982.0. Samples: 291078. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:27:54,249][00436] Avg episode reward: [(0, '7.602')] |
|
[2025-02-14 07:27:54,251][04608] Saving new best policy, reward=7.602! |
|
[2025-02-14 07:27:59,240][00436] Fps is (10 sec: 4098.9, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 1183744. Throughput: 0: 981.1. Samples: 294388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:27:59,247][00436] Avg episode reward: [(0, '6.568')] |
|
[2025-02-14 07:27:59,274][04622] Updated weights for policy 0, policy_version 290 (0.0014) |
|
[2025-02-14 07:28:04,240][00436] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 1200128. Throughput: 0: 979.4. Samples: 299878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:28:04,242][00436] Avg episode reward: [(0, '6.722')] |
|
[2025-02-14 07:28:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1224704. Throughput: 0: 998.8. Samples: 306182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:28:09,243][00436] Avg episode reward: [(0, '6.874')] |
|
[2025-02-14 07:28:10,147][04622] Updated weights for policy 0, policy_version 300 (0.0025) |
|
[2025-02-14 07:28:14,240][00436] Fps is (10 sec: 4505.4, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1245184. Throughput: 0: 1000.7. Samples: 309602. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:28:14,245][00436] Avg episode reward: [(0, '7.821')] |
|
[2025-02-14 07:28:14,251][04608] Saving new best policy, reward=7.821! |
|
[2025-02-14 07:28:19,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 1261568. Throughput: 0: 984.1. Samples: 314794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:28:19,247][00436] Avg episode reward: [(0, '8.107')] |
|
[2025-02-14 07:28:19,255][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000308_1261568.pth... |
|
[2025-02-14 07:28:19,373][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000076_311296.pth |
|
[2025-02-14 07:28:19,381][04608] Saving new best policy, reward=8.107! |
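
This is the first checkpoint rotation: after writing checkpoint_000000308_1261568.pth, the learner deletes the oldest rolling checkpoint so only the newest few (two here, plus best-policy snapshots) remain on disk. A sketch of that save-and-prune pattern, matching the checkpoint_<9-digit version>_<env steps>.pth naming in the log:

import glob
import os

import torch

def save_with_rotation(state, checkpoint_dir, version, env_steps, keep_last=2):
    path = os.path.join(checkpoint_dir, f"checkpoint_{version:09d}_{env_steps}.pth")
    print(f"Saving {path}...")
    torch.save(state, path)
    # Keep only the newest `keep_last` rolling checkpoints, as in the log.
    for old in sorted(glob.glob(os.path.join(checkpoint_dir, "checkpoint_*.pth")))[:-keep_last]:
        print(f"Removing {old}")
        os.remove(old)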
|
[2025-02-14 07:28:20,915][04622] Updated weights for policy 0, policy_version 310 (0.0021) |
|
[2025-02-14 07:28:24,240][00436] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1282048. Throughput: 0: 993.7. Samples: 321016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:28:24,242][00436] Avg episode reward: [(0, '8.278')] |
|
[2025-02-14 07:28:24,248][04608] Saving new best policy, reward=8.278! |
|
[2025-02-14 07:28:29,242][00436] Fps is (10 sec: 4504.9, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1306624. Throughput: 0: 993.1. Samples: 324378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:28:29,245][00436] Avg episode reward: [(0, '8.424')] |
|
[2025-02-14 07:28:29,253][04608] Saving new best policy, reward=8.424! |
|
[2025-02-14 07:28:30,726][04622] Updated weights for policy 0, policy_version 320 (0.0018) |
|
[2025-02-14 07:28:34,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1318912. Throughput: 0: 983.1. Samples: 329420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:28:34,242][00436] Avg episode reward: [(0, '8.857')] |
|
[2025-02-14 07:28:34,250][04608] Saving new best policy, reward=8.857! |
|
[2025-02-14 07:28:39,241][00436] Fps is (10 sec: 3686.7, 60 sec: 3959.4, 300 sec: 3984.9). Total num frames: 1343488. Throughput: 0: 999.2. Samples: 336042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:28:39,243][00436] Avg episode reward: [(0, '9.158')] |
|
[2025-02-14 07:28:39,253][04608] Saving new best policy, reward=9.158! |
|
[2025-02-14 07:28:40,854][04622] Updated weights for policy 0, policy_version 330 (0.0023) |
|
[2025-02-14 07:28:44,241][00436] Fps is (10 sec: 4505.3, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1363968. Throughput: 0: 1001.1. Samples: 339438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:28:44,245][00436] Avg episode reward: [(0, '9.227')] |
|
[2025-02-14 07:28:44,248][04608] Saving new best policy, reward=9.227! |
|
[2025-02-14 07:28:49,240][00436] Fps is (10 sec: 3686.7, 60 sec: 3959.9, 300 sec: 3971.0). Total num frames: 1380352. Throughput: 0: 986.3. Samples: 344260. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:28:49,242][00436] Avg episode reward: [(0, '9.319')] |
|
[2025-02-14 07:28:49,250][04608] Saving new best policy, reward=9.319! |
|
[2025-02-14 07:28:51,684][04622] Updated weights for policy 0, policy_version 340 (0.0014) |
|
[2025-02-14 07:28:54,240][00436] Fps is (10 sec: 3686.7, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1400832. Throughput: 0: 994.0. Samples: 350914. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:28:54,246][00436] Avg episode reward: [(0, '9.927')] |
|
[2025-02-14 07:28:54,248][04608] Saving new best policy, reward=9.927! |
|
[2025-02-14 07:28:59,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1421312. Throughput: 0: 991.6. Samples: 354222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:28:59,242][00436] Avg episode reward: [(0, '10.006')] |
|
[2025-02-14 07:28:59,261][04608] Saving new best policy, reward=10.006! |
|
[2025-02-14 07:29:02,746][04622] Updated weights for policy 0, policy_version 350 (0.0013) |
|
[2025-02-14 07:29:04,240][00436] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3957.1). Total num frames: 1437696. Throughput: 0: 981.3. Samples: 358954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:29:04,246][00436] Avg episode reward: [(0, '9.969')] |
|
[2025-02-14 07:29:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1462272. Throughput: 0: 996.9. Samples: 365876. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:09,244][00436] Avg episode reward: [(0, '10.037')] |
|
[2025-02-14 07:29:09,251][04608] Saving new best policy, reward=10.037! |
|
[2025-02-14 07:29:11,475][04622] Updated weights for policy 0, policy_version 360 (0.0019) |
|
[2025-02-14 07:29:14,241][00436] Fps is (10 sec: 4505.3, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1482752. Throughput: 0: 997.3. Samples: 369256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:29:14,245][00436] Avg episode reward: [(0, '10.848')] |
|
[2025-02-14 07:29:14,250][04608] Saving new best policy, reward=10.848! |
|
[2025-02-14 07:29:19,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 1499136. Throughput: 0: 987.4. Samples: 373852. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:19,243][00436] Avg episode reward: [(0, '11.494')] |
|
[2025-02-14 07:29:19,250][04608] Saving new best policy, reward=11.494! |
|
[2025-02-14 07:29:22,503][04622] Updated weights for policy 0, policy_version 370 (0.0016) |
|
[2025-02-14 07:29:24,240][00436] Fps is (10 sec: 4096.3, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1523712. Throughput: 0: 991.9. Samples: 380676. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:29:24,248][00436] Avg episode reward: [(0, '11.714')] |
|
[2025-02-14 07:29:24,250][04608] Saving new best policy, reward=11.714! |
|
[2025-02-14 07:29:29,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3957.2). Total num frames: 1540096. Throughput: 0: 992.4. Samples: 384094. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:29,243][00436] Avg episode reward: [(0, '11.695')] |
|
[2025-02-14 07:29:33,326][04622] Updated weights for policy 0, policy_version 380 (0.0012) |
|
[2025-02-14 07:29:34,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1560576. Throughput: 0: 990.9. Samples: 388852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:29:34,246][00436] Avg episode reward: [(0, '11.970')] |
|
[2025-02-14 07:29:34,248][04608] Saving new best policy, reward=11.970! |
|
[2025-02-14 07:29:39,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1581056. Throughput: 0: 996.3. Samples: 395746. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:29:39,247][00436] Avg episode reward: [(0, '12.113')] |
|
[2025-02-14 07:29:39,259][04608] Saving new best policy, reward=12.113! |
|
[2025-02-14 07:29:42,495][04622] Updated weights for policy 0, policy_version 390 (0.0012) |
|
[2025-02-14 07:29:44,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1601536. Throughput: 0: 999.4. Samples: 399196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:29:44,247][00436] Avg episode reward: [(0, '12.308')] |
|
[2025-02-14 07:29:44,255][04608] Saving new best policy, reward=12.308! |
|
[2025-02-14 07:29:49,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1617920. Throughput: 0: 997.6. Samples: 403848. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:49,246][00436] Avg episode reward: [(0, '12.246')] |
|
[2025-02-14 07:29:53,041][04622] Updated weights for policy 0, policy_version 400 (0.0019) |
|
[2025-02-14 07:29:54,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1642496. Throughput: 0: 997.4. Samples: 410760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:54,246][00436] Avg episode reward: [(0, '11.491')] |
|
[2025-02-14 07:29:59,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1658880. Throughput: 0: 996.5. Samples: 414098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:29:59,245][00436] Avg episode reward: [(0, '12.670')] |
|
[2025-02-14 07:29:59,257][04608] Saving new best policy, reward=12.670! |
|
[2025-02-14 07:30:03,836][04622] Updated weights for policy 0, policy_version 410 (0.0023) |
|
[2025-02-14 07:30:04,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.1). Total num frames: 1679360. Throughput: 0: 1003.1. Samples: 418992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:04,248][00436] Avg episode reward: [(0, '14.654')] |
|
[2025-02-14 07:30:04,251][04608] Saving new best policy, reward=14.654! |
|
[2025-02-14 07:30:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1699840. Throughput: 0: 999.4. Samples: 425650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:30:09,243][00436] Avg episode reward: [(0, '15.103')] |
|
[2025-02-14 07:30:09,249][04608] Saving new best policy, reward=15.103! |
|
[2025-02-14 07:30:14,215][04622] Updated weights for policy 0, policy_version 420 (0.0014) |
|
[2025-02-14 07:30:14,240][00436] Fps is (10 sec: 4095.9, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1720320. Throughput: 0: 992.8. Samples: 428772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:30:14,243][00436] Avg episode reward: [(0, '15.350')] |
|
[2025-02-14 07:30:14,245][04608] Saving new best policy, reward=15.350! |
|
[2025-02-14 07:30:19,240][00436] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1736704. Throughput: 0: 994.4. Samples: 433602. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:19,243][00436] Avg episode reward: [(0, '14.708')] |
|
[2025-02-14 07:30:19,250][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth... |
|
[2025-02-14 07:30:19,362][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth |
|
[2025-02-14 07:30:23,791][04622] Updated weights for policy 0, policy_version 430 (0.0024) |
|
[2025-02-14 07:30:24,240][00436] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1761280. Throughput: 0: 997.3. Samples: 440624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:24,242][00436] Avg episode reward: [(0, '14.736')] |
|
[2025-02-14 07:30:29,240][00436] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1777664. Throughput: 0: 986.9. Samples: 443606. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:29,247][00436] Avg episode reward: [(0, '16.641')] |
|
[2025-02-14 07:30:29,256][04608] Saving new best policy, reward=16.641! |
|
[2025-02-14 07:30:34,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1798144. Throughput: 0: 995.7. Samples: 448654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:30:34,246][00436] Avg episode reward: [(0, '16.194')] |
|
[2025-02-14 07:30:34,803][04622] Updated weights for policy 0, policy_version 440 (0.0021) |
|
[2025-02-14 07:30:39,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1822720. Throughput: 0: 995.7. Samples: 455568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:39,244][00436] Avg episode reward: [(0, '17.385')] |
|
[2025-02-14 07:30:39,252][04608] Saving new best policy, reward=17.385! |
|
[2025-02-14 07:30:44,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1839104. Throughput: 0: 985.5. Samples: 458446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:30:44,247][00436] Avg episode reward: [(0, '15.577')] |
|
[2025-02-14 07:30:45,592][04622] Updated weights for policy 0, policy_version 450 (0.0025) |
|
[2025-02-14 07:30:49,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1859584. Throughput: 0: 995.7. Samples: 463798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:30:49,244][00436] Avg episode reward: [(0, '14.558')] |
|
[2025-02-14 07:30:54,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1880064. Throughput: 0: 1000.8. Samples: 470688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:30:54,247][00436] Avg episode reward: [(0, '15.108')] |
|
[2025-02-14 07:30:54,380][04622] Updated weights for policy 0, policy_version 460 (0.0031) |
|
[2025-02-14 07:30:59,240][00436] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 1896448. Throughput: 0: 995.6. Samples: 473576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:30:59,247][00436] Avg episode reward: [(0, '15.371')] |
|
[2025-02-14 07:31:04,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1916928. Throughput: 0: 1008.5. Samples: 478986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:04,247][00436] Avg episode reward: [(0, '16.063')] |
|
[2025-02-14 07:31:05,082][04622] Updated weights for policy 0, policy_version 470 (0.0019) |
|
[2025-02-14 07:31:09,240][00436] Fps is (10 sec: 4505.8, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1941504. Throughput: 0: 1006.7. Samples: 485926. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:09,242][00436] Avg episode reward: [(0, '17.060')] |
|
[2025-02-14 07:31:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1957888. Throughput: 0: 1000.9. Samples: 488646. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:31:14,246][00436] Avg episode reward: [(0, '17.102')] |
|
[2025-02-14 07:31:15,818][04622] Updated weights for policy 0, policy_version 480 (0.0034) |
|
[2025-02-14 07:31:19,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 3971.0). Total num frames: 1978368. Throughput: 0: 1008.6. Samples: 494040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:31:19,247][00436] Avg episode reward: [(0, '16.683')] |
|
[2025-02-14 07:31:24,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2002944. Throughput: 0: 1010.4. Samples: 501034. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:24,245][00436] Avg episode reward: [(0, '18.929')] |
|
[2025-02-14 07:31:24,248][04608] Saving new best policy, reward=18.929! |
|
[2025-02-14 07:31:25,154][04622] Updated weights for policy 0, policy_version 490 (0.0015) |
|
[2025-02-14 07:31:29,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2015232. Throughput: 0: 1000.4. Samples: 503466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:31:29,249][00436] Avg episode reward: [(0, '18.615')] |
|
[2025-02-14 07:31:34,240][00436] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2039808. Throughput: 0: 1006.4. Samples: 509086. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:34,247][00436] Avg episode reward: [(0, '19.650')] |
|
[2025-02-14 07:31:34,251][04608] Saving new best policy, reward=19.650! |
|
[2025-02-14 07:31:35,627][04622] Updated weights for policy 0, policy_version 500 (0.0013) |
|
[2025-02-14 07:31:39,240][00436] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2064384. Throughput: 0: 1007.1. Samples: 516006. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:39,247][00436] Avg episode reward: [(0, '20.814')] |
|
[2025-02-14 07:31:39,254][04608] Saving new best policy, reward=20.814! |
|
[2025-02-14 07:31:44,240][00436] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2076672. Throughput: 0: 995.5. Samples: 518372. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:44,245][00436] Avg episode reward: [(0, '21.065')] |
|
[2025-02-14 07:31:44,247][04608] Saving new best policy, reward=21.065! |
|
[2025-02-14 07:31:46,459][04622] Updated weights for policy 0, policy_version 510 (0.0016) |
|
[2025-02-14 07:31:49,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2101248. Throughput: 0: 1001.6. Samples: 524056. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:49,242][00436] Avg episode reward: [(0, '19.696')] |
|
[2025-02-14 07:31:54,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2121728. Throughput: 0: 998.8. Samples: 530870. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:31:54,244][00436] Avg episode reward: [(0, '21.346')] |
|
[2025-02-14 07:31:54,250][04608] Saving new best policy, reward=21.346! |
|
[2025-02-14 07:31:56,859][04622] Updated weights for policy 0, policy_version 520 (0.0018) |
|
[2025-02-14 07:31:59,240][00436] Fps is (10 sec: 3686.5, 60 sec: 4027.8, 300 sec: 3971.0). Total num frames: 2138112. Throughput: 0: 984.2. Samples: 532934. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:31:59,245][00436] Avg episode reward: [(0, '22.718')] |
|
[2025-02-14 07:31:59,253][04608] Saving new best policy, reward=22.718! |
|
[2025-02-14 07:32:04,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2158592. Throughput: 0: 997.2. Samples: 538914. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:32:04,242][00436] Avg episode reward: [(0, '23.126')] |
|
[2025-02-14 07:32:04,248][04608] Saving new best policy, reward=23.126! |
|
[2025-02-14 07:32:06,464][04622] Updated weights for policy 0, policy_version 530 (0.0013) |
|
[2025-02-14 07:32:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2179072. Throughput: 0: 991.0. Samples: 545630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:32:09,244][00436] Avg episode reward: [(0, '23.739')] |
|
[2025-02-14 07:32:09,263][04608] Saving new best policy, reward=23.739! |
|
[2025-02-14 07:32:14,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2195456. Throughput: 0: 980.5. Samples: 547588. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:32:14,243][00436] Avg episode reward: [(0, '23.968')] |
|
[2025-02-14 07:32:14,248][04608] Saving new best policy, reward=23.968! |
|
[2025-02-14 07:32:17,319][04622] Updated weights for policy 0, policy_version 540 (0.0019) |
|
[2025-02-14 07:32:19,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2220032. Throughput: 0: 993.9. Samples: 553812. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:32:19,243][00436] Avg episode reward: [(0, '23.042')] |
|
[2025-02-14 07:32:19,251][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000542_2220032.pth... |
|
[2025-02-14 07:32:19,380][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000308_1261568.pth |
|
[2025-02-14 07:32:24,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3971.1). Total num frames: 2236416. Throughput: 0: 982.0. Samples: 560194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:32:24,242][00436] Avg episode reward: [(0, '21.758')] |
|
[2025-02-14 07:32:28,368][04622] Updated weights for policy 0, policy_version 550 (0.0024) |
|
[2025-02-14 07:32:29,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2256896. Throughput: 0: 974.9. Samples: 562244. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:32:29,245][00436] Avg episode reward: [(0, '20.389')] |
|
[2025-02-14 07:32:34,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2277376. Throughput: 0: 993.1. Samples: 568746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:32:34,242][00436] Avg episode reward: [(0, '19.443')] |
|
[2025-02-14 07:32:37,116][04622] Updated weights for policy 0, policy_version 560 (0.0030) |
|
[2025-02-14 07:32:39,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3971.0). Total num frames: 2297856. Throughput: 0: 986.1. Samples: 575244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:32:39,243][00436] Avg episode reward: [(0, '18.236')] |
|
[2025-02-14 07:32:44,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2314240. Throughput: 0: 985.6. Samples: 577288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:32:44,242][00436] Avg episode reward: [(0, '18.383')] |
|
[2025-02-14 07:32:47,956][04622] Updated weights for policy 0, policy_version 570 (0.0026) |
|
[2025-02-14 07:32:49,240][00436] Fps is (10 sec: 4095.9, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 2338816. Throughput: 0: 1003.3. Samples: 584064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:32:49,246][00436] Avg episode reward: [(0, '18.782')] |
|
[2025-02-14 07:32:54,245][00436] Fps is (10 sec: 4503.6, 60 sec: 3959.2, 300 sec: 3984.9). Total num frames: 2359296. Throughput: 0: 987.7. Samples: 590080. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:32:54,249][00436] Avg episode reward: [(0, '20.879')] |
|
[2025-02-14 07:32:58,587][04622] Updated weights for policy 0, policy_version 580 (0.0034) |
|
[2025-02-14 07:32:59,240][00436] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 2375680. Throughput: 0: 991.2. Samples: 592190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:32:59,246][00436] Avg episode reward: [(0, '21.103')] |
|
[2025-02-14 07:33:04,240][00436] Fps is (10 sec: 4097.8, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2400256. Throughput: 0: 1007.4. Samples: 599144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:33:04,243][00436] Avg episode reward: [(0, '21.663')] |
|
[2025-02-14 07:33:07,910][04622] Updated weights for policy 0, policy_version 590 (0.0013) |
|
[2025-02-14 07:33:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2416640. Throughput: 0: 998.9. Samples: 605144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:33:09,243][00436] Avg episode reward: [(0, '20.637')] |
|
[2025-02-14 07:33:14,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2437120. Throughput: 0: 1006.1. Samples: 607518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:33:14,246][00436] Avg episode reward: [(0, '20.703')] |
|
[2025-02-14 07:33:18,084][04622] Updated weights for policy 0, policy_version 600 (0.0017) |
|
[2025-02-14 07:33:19,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2461696. Throughput: 0: 1012.5. Samples: 614310. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:33:19,247][00436] Avg episode reward: [(0, '20.128')] |
|
[2025-02-14 07:33:24,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.1). Total num frames: 2478080. Throughput: 0: 996.6. Samples: 620090. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:33:24,242][00436] Avg episode reward: [(0, '20.834')] |
|
[2025-02-14 07:33:28,681][04622] Updated weights for policy 0, policy_version 610 (0.0020) |
|
[2025-02-14 07:33:29,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2498560. Throughput: 0: 1010.1. Samples: 622742. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:33:29,247][00436] Avg episode reward: [(0, '22.299')] |
|
[2025-02-14 07:33:34,240][00436] Fps is (10 sec: 4505.5, 60 sec: 4096.0, 300 sec: 3998.8). Total num frames: 2523136. Throughput: 0: 1013.0. Samples: 629650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:33:34,243][00436] Avg episode reward: [(0, '24.283')] |
|
[2025-02-14 07:33:34,249][04608] Saving new best policy, reward=24.283! |
|
[2025-02-14 07:33:38,669][04622] Updated weights for policy 0, policy_version 620 (0.0014) |
|
[2025-02-14 07:33:39,242][00436] Fps is (10 sec: 4095.3, 60 sec: 4027.6, 300 sec: 3984.9). Total num frames: 2539520. Throughput: 0: 1001.7. Samples: 635152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:33:39,249][00436] Avg episode reward: [(0, '23.661')] |
|
[2025-02-14 07:33:44,240][00436] Fps is (10 sec: 3686.5, 60 sec: 4096.0, 300 sec: 3998.8). Total num frames: 2560000. Throughput: 0: 1018.9. Samples: 638040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:33:44,248][00436] Avg episode reward: [(0, '24.927')] |
|
[2025-02-14 07:33:44,252][04608] Saving new best policy, reward=24.927! |
|
[2025-02-14 07:33:48,538][04622] Updated weights for policy 0, policy_version 630 (0.0014) |
|
[2025-02-14 07:33:49,240][00436] Fps is (10 sec: 4096.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2580480. Throughput: 0: 1011.2. Samples: 644650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:33:49,247][00436] Avg episode reward: [(0, '24.711')] |
|
[2025-02-14 07:33:54,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.8, 300 sec: 3984.9). Total num frames: 2596864. Throughput: 0: 991.9. Samples: 649780. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:33:54,246][00436] Avg episode reward: [(0, '24.723')] |
|
[2025-02-14 07:33:59,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2617344. Throughput: 0: 1006.0. Samples: 652788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:33:59,243][00436] Avg episode reward: [(0, '24.583')] |
|
[2025-02-14 07:33:59,514][04622] Updated weights for policy 0, policy_version 640 (0.0017) |
|
[2025-02-14 07:34:04,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2641920. Throughput: 0: 1007.7. Samples: 659658. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:34:04,242][00436] Avg episode reward: [(0, '24.606')] |
|
[2025-02-14 07:34:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2658304. Throughput: 0: 992.4. Samples: 664750. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:34:09,241][00436] Avg episode reward: [(0, '24.136')] |
|
[2025-02-14 07:34:09,943][04622] Updated weights for policy 0, policy_version 650 (0.0029) |
|
[2025-02-14 07:34:14,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2678784. Throughput: 0: 1008.1. Samples: 668106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:34:14,243][00436] Avg episode reward: [(0, '25.078')] |
|
[2025-02-14 07:34:14,245][04608] Saving new best policy, reward=25.078! |
|
[2025-02-14 07:34:19,205][04622] Updated weights for policy 0, policy_version 660 (0.0017) |
|
[2025-02-14 07:34:19,240][00436] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2703360. Throughput: 0: 1001.7. Samples: 674728. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:34:19,252][00436] Avg episode reward: [(0, '25.160')] |
|
[2025-02-14 07:34:19,263][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000660_2703360.pth... |
|
[2025-02-14 07:34:19,432][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth |
|
[2025-02-14 07:34:19,452][04608] Saving new best policy, reward=25.160! |
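Note on the checkpoint housekeeping above: filenames encode the policy version and the env-frame count, and with keep_checkpoints=2 (see the config dump below) the oldest regular checkpoint is removed each time a new one is written; "new best policy" snapshots appear to be tracked separately. With this config the frame count is exactly policy_version × batch_size (1024) × env_frameskip (4), which the filenames confirm:

```python
# Checkpoint names encode (policy_version, env_frames). Frames are counted
# with frameskip, so one policy version = 1024 samples = 4096 env frames here.
assert 660 * 1024 * 4 == 2703360   # checkpoint_000000660_2703360.pth (saved)
assert 424 * 1024 * 4 == 1736704   # checkpoint_000000424_1736704.pth (removed)
```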
|
[2025-02-14 07:34:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 2715648. Throughput: 0: 984.7. Samples: 679462. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) |
|
[2025-02-14 07:34:24,242][00436] Avg episode reward: [(0, '24.458')] |
|
[2025-02-14 07:34:29,240][00436] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 2736128. Throughput: 0: 993.7. Samples: 682756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:34:29,242][00436] Avg episode reward: [(0, '24.962')] |
|
[2025-02-14 07:34:30,103][04622] Updated weights for policy 0, policy_version 670 (0.0014) |
|
[2025-02-14 07:34:34,246][00436] Fps is (10 sec: 4503.1, 60 sec: 3959.1, 300 sec: 3998.7). Total num frames: 2760704. Throughput: 0: 1000.9. Samples: 689698. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:34:34,247][00436] Avg episode reward: [(0, '24.211')] |
|
[2025-02-14 07:34:39,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3984.9). Total num frames: 2777088. Throughput: 0: 994.9. Samples: 694550. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:34:39,243][00436] Avg episode reward: [(0, '23.678')] |
|
[2025-02-14 07:34:40,699][04622] Updated weights for policy 0, policy_version 680 (0.0030) |
|
[2025-02-14 07:34:44,240][00436] Fps is (10 sec: 3688.5, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2797568. Throughput: 0: 1004.8. Samples: 698004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:34:44,249][00436] Avg episode reward: [(0, '24.150')] |
|
[2025-02-14 07:34:49,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2822144. Throughput: 0: 1007.3. Samples: 704988. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:34:49,244][00436] Avg episode reward: [(0, '24.854')] |
|
[2025-02-14 07:34:50,372][04622] Updated weights for policy 0, policy_version 690 (0.0036) |
|
[2025-02-14 07:34:54,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2838528. Throughput: 0: 999.1. Samples: 709708. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:34:54,245][00436] Avg episode reward: [(0, '25.648')] |
|
[2025-02-14 07:34:54,248][04608] Saving new best policy, reward=25.648! |
|
[2025-02-14 07:34:59,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2859008. Throughput: 0: 1000.0. Samples: 713104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:34:59,246][00436] Avg episode reward: [(0, '26.143')] |
|
[2025-02-14 07:34:59,253][04608] Saving new best policy, reward=26.143! |
|
[2025-02-14 07:35:00,336][04622] Updated weights for policy 0, policy_version 700 (0.0022) |
|
[2025-02-14 07:35:04,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2879488. Throughput: 0: 1005.6. Samples: 719980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:35:04,242][00436] Avg episode reward: [(0, '26.911')] |
|
[2025-02-14 07:35:04,248][04608] Saving new best policy, reward=26.911! |
|
[2025-02-14 07:35:09,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 2895872. Throughput: 0: 1004.2. Samples: 724652. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:35:09,242][00436] Avg episode reward: [(0, '26.585')] |
|
[2025-02-14 07:35:11,118][04622] Updated weights for policy 0, policy_version 710 (0.0035) |
|
[2025-02-14 07:35:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2920448. Throughput: 0: 1008.4. Samples: 728134. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:35:14,242][00436] Avg episode reward: [(0, '25.920')] |
|
[2025-02-14 07:35:19,240][00436] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2940928. Throughput: 0: 1006.7. Samples: 734994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:35:19,245][00436] Avg episode reward: [(0, '25.187')] |
|
[2025-02-14 07:35:21,616][04622] Updated weights for policy 0, policy_version 720 (0.0026) |
|
[2025-02-14 07:35:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2957312. Throughput: 0: 1009.4. Samples: 739974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:35:24,245][00436] Avg episode reward: [(0, '25.072')] |
|
[2025-02-14 07:35:29,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4012.7). Total num frames: 2981888. Throughput: 0: 1009.6. Samples: 743436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:35:29,248][00436] Avg episode reward: [(0, '22.444')] |
|
[2025-02-14 07:35:30,650][04622] Updated weights for policy 0, policy_version 730 (0.0019) |
|
[2025-02-14 07:35:34,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4028.1, 300 sec: 3998.8). Total num frames: 3002368. Throughput: 0: 999.8. Samples: 749980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:35:34,247][00436] Avg episode reward: [(0, '21.739')] |
|
[2025-02-14 07:35:39,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 3018752. Throughput: 0: 1007.5. Samples: 755046. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:35:39,246][00436] Avg episode reward: [(0, '23.325')] |
|
[2025-02-14 07:35:41,428][04622] Updated weights for policy 0, policy_version 740 (0.0024) |
|
[2025-02-14 07:35:44,242][00436] Fps is (10 sec: 4095.1, 60 sec: 4095.8, 300 sec: 4012.7). Total num frames: 3043328. Throughput: 0: 1009.3. Samples: 758524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:35:44,245][00436] Avg episode reward: [(0, '23.741')] |
|
[2025-02-14 07:35:49,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3059712. Throughput: 0: 1002.4. Samples: 765090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:35:49,246][00436] Avg episode reward: [(0, '23.506')] |
|
[2025-02-14 07:35:52,142][04622] Updated weights for policy 0, policy_version 750 (0.0023) |
|
[2025-02-14 07:35:54,240][00436] Fps is (10 sec: 3687.2, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3080192. Throughput: 0: 1014.6. Samples: 770310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:35:54,248][00436] Avg episode reward: [(0, '24.373')] |
|
[2025-02-14 07:35:59,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3100672. Throughput: 0: 1013.0. Samples: 773720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:35:59,242][00436] Avg episode reward: [(0, '25.499')] |
|
[2025-02-14 07:36:00,922][04622] Updated weights for policy 0, policy_version 760 (0.0012) |
|
[2025-02-14 07:36:04,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 3121152. Throughput: 0: 1000.4. Samples: 780010. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:36:04,242][00436] Avg episode reward: [(0, '27.260')] |
|
[2025-02-14 07:36:04,245][04608] Saving new best policy, reward=27.260! |
|
[2025-02-14 07:36:09,242][00436] Fps is (10 sec: 4095.4, 60 sec: 4095.9, 300 sec: 4012.7). Total num frames: 3141632. Throughput: 0: 1012.1. Samples: 785522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:09,250][00436] Avg episode reward: [(0, '25.830')] |
|
[2025-02-14 07:36:11,672][04622] Updated weights for policy 0, policy_version 770 (0.0026) |
|
[2025-02-14 07:36:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3162112. Throughput: 0: 1011.8. Samples: 788968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:14,246][00436] Avg episode reward: [(0, '25.006')] |
|
[2025-02-14 07:36:19,249][00436] Fps is (10 sec: 4093.0, 60 sec: 4027.1, 300 sec: 3998.7). Total num frames: 3182592. Throughput: 0: 1002.6. Samples: 795108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:36:19,253][00436] Avg episode reward: [(0, '26.167')] |
|
[2025-02-14 07:36:19,263][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000777_3182592.pth... |
|
[2025-02-14 07:36:19,428][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000542_2220032.pth |
|
[2025-02-14 07:36:22,698][04622] Updated weights for policy 0, policy_version 780 (0.0025) |
|
[2025-02-14 07:36:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3198976. Throughput: 0: 1006.8. Samples: 800350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:24,248][00436] Avg episode reward: [(0, '26.335')] |
|
[2025-02-14 07:36:29,241][00436] Fps is (10 sec: 4099.4, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3223552. Throughput: 0: 1004.1. Samples: 803708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:36:29,243][00436] Avg episode reward: [(0, '26.680')] |
|
[2025-02-14 07:36:32,451][04622] Updated weights for policy 0, policy_version 790 (0.0020) |
|
[2025-02-14 07:36:34,243][00436] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3984.9). Total num frames: 3239936. Throughput: 0: 989.6. Samples: 809626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:34,248][00436] Avg episode reward: [(0, '25.282')] |
|
[2025-02-14 07:36:39,240][00436] Fps is (10 sec: 3686.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3260416. Throughput: 0: 1000.4. Samples: 815330. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:39,248][00436] Avg episode reward: [(0, '25.117')] |
|
[2025-02-14 07:36:42,421][04622] Updated weights for policy 0, policy_version 800 (0.0016) |
|
[2025-02-14 07:36:44,240][00436] Fps is (10 sec: 4506.8, 60 sec: 4027.9, 300 sec: 4012.7). Total num frames: 3284992. Throughput: 0: 1001.0. Samples: 818766. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:36:44,246][00436] Avg episode reward: [(0, '26.939')] |
|
[2025-02-14 07:36:49,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3297280. Throughput: 0: 991.4. Samples: 824624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:49,247][00436] Avg episode reward: [(0, '25.444')] |
|
[2025-02-14 07:36:53,233][04622] Updated weights for policy 0, policy_version 810 (0.0023) |
|
[2025-02-14 07:36:54,240][00436] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3317760. Throughput: 0: 998.8. Samples: 830466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:36:54,242][00436] Avg episode reward: [(0, '26.589')] |
|
[2025-02-14 07:36:59,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3342336. Throughput: 0: 1000.0. Samples: 833970. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:36:59,247][00436] Avg episode reward: [(0, '25.980')] |
|
[2025-02-14 07:37:03,068][04622] Updated weights for policy 0, policy_version 820 (0.0013) |
|
[2025-02-14 07:37:04,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3358720. Throughput: 0: 990.9. Samples: 839688. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:37:04,247][00436] Avg episode reward: [(0, '26.249')] |
|
[2025-02-14 07:37:09,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 3383296. Throughput: 0: 1005.4. Samples: 845594. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:37:09,242][00436] Avg episode reward: [(0, '25.071')] |
|
[2025-02-14 07:37:12,759][04622] Updated weights for policy 0, policy_version 830 (0.0019) |
|
[2025-02-14 07:37:14,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3403776. Throughput: 0: 1008.3. Samples: 849080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:37:14,246][00436] Avg episode reward: [(0, '25.741')] |
|
[2025-02-14 07:37:19,240][00436] Fps is (10 sec: 3686.4, 60 sec: 3960.1, 300 sec: 4012.7). Total num frames: 3420160. Throughput: 0: 1002.8. Samples: 854748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:37:19,246][00436] Avg episode reward: [(0, '24.750')] |
|
[2025-02-14 07:37:23,694][04622] Updated weights for policy 0, policy_version 840 (0.0021) |
|
[2025-02-14 07:37:24,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3440640. Throughput: 0: 1008.1. Samples: 860694. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:37:24,245][00436] Avg episode reward: [(0, '25.794')] |
|
[2025-02-14 07:37:29,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 3465216. Throughput: 0: 1009.2. Samples: 864182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:37:29,247][00436] Avg episode reward: [(0, '26.312')] |
|
[2025-02-14 07:37:34,071][04622] Updated weights for policy 0, policy_version 850 (0.0012) |
|
[2025-02-14 07:37:34,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4027.9, 300 sec: 4012.7). Total num frames: 3481600. Throughput: 0: 1002.7. Samples: 869744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:37:34,242][00436] Avg episode reward: [(0, '25.470')] |
|
[2025-02-14 07:37:39,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3502080. Throughput: 0: 1013.3. Samples: 876066. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:37:39,242][00436] Avg episode reward: [(0, '25.970')] |
|
[2025-02-14 07:37:42,909][04622] Updated weights for policy 0, policy_version 860 (0.0028) |
|
[2025-02-14 07:37:44,245][00436] Fps is (10 sec: 4503.5, 60 sec: 4027.4, 300 sec: 4026.5). Total num frames: 3526656. Throughput: 0: 1014.5. Samples: 879626. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:37:44,248][00436] Avg episode reward: [(0, '25.240')] |
|
[2025-02-14 07:37:49,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4012.8). Total num frames: 3543040. Throughput: 0: 1007.6. Samples: 885032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2025-02-14 07:37:49,246][00436] Avg episode reward: [(0, '25.148')] |
|
[2025-02-14 07:37:53,464][04622] Updated weights for policy 0, policy_version 870 (0.0024) |
|
[2025-02-14 07:37:54,240][00436] Fps is (10 sec: 3688.1, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3563520. Throughput: 0: 1019.8. Samples: 891484. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:37:54,245][00436] Avg episode reward: [(0, '24.830')] |
|
[2025-02-14 07:37:59,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3588096. Throughput: 0: 1019.9. Samples: 894976. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:37:59,244][00436] Avg episode reward: [(0, '24.278')] |
|
[2025-02-14 07:38:04,184][04622] Updated weights for policy 0, policy_version 880 (0.0012) |
|
[2025-02-14 07:38:04,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3604480. Throughput: 0: 1007.2. Samples: 900070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:38:04,242][00436] Avg episode reward: [(0, '23.579')] |
|
[2025-02-14 07:38:09,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3624960. Throughput: 0: 1021.6. Samples: 906664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:38:09,245][00436] Avg episode reward: [(0, '23.139')] |
|
[2025-02-14 07:38:13,068][04622] Updated weights for policy 0, policy_version 890 (0.0013) |
|
[2025-02-14 07:38:14,240][00436] Fps is (10 sec: 4505.5, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3649536. Throughput: 0: 1022.2. Samples: 910182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:38:14,246][00436] Avg episode reward: [(0, '23.301')] |
|
[2025-02-14 07:38:19,241][00436] Fps is (10 sec: 3685.9, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3661824. Throughput: 0: 1009.8. Samples: 915186. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:38:19,244][00436] Avg episode reward: [(0, '23.173')] |
|
[2025-02-14 07:38:19,256][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000894_3661824.pth... |
|
[2025-02-14 07:38:19,417][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000660_2703360.pth |
|
[2025-02-14 07:38:23,888][04622] Updated weights for policy 0, policy_version 900 (0.0020) |
|
[2025-02-14 07:38:24,242][00436] Fps is (10 sec: 3685.7, 60 sec: 4095.9, 300 sec: 4026.5). Total num frames: 3686400. Throughput: 0: 1017.0. Samples: 921834. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:38:24,247][00436] Avg episode reward: [(0, '23.255')] |
|
[2025-02-14 07:38:29,241][00436] Fps is (10 sec: 4505.9, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3706880. Throughput: 0: 1012.8. Samples: 925196. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:38:29,247][00436] Avg episode reward: [(0, '23.885')] |
|
[2025-02-14 07:38:34,240][00436] Fps is (10 sec: 3687.1, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3723264. Throughput: 0: 1001.6. Samples: 930102. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:38:34,247][00436] Avg episode reward: [(0, '24.474')] |
|
[2025-02-14 07:38:34,589][04622] Updated weights for policy 0, policy_version 910 (0.0017) |
|
[2025-02-14 07:38:39,241][00436] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3747840. Throughput: 0: 1012.0. Samples: 937024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:38:39,248][00436] Avg episode reward: [(0, '23.977')] |
|
[2025-02-14 07:38:43,761][04622] Updated weights for policy 0, policy_version 920 (0.0014) |
|
[2025-02-14 07:38:44,240][00436] Fps is (10 sec: 4505.7, 60 sec: 4028.0, 300 sec: 4026.6). Total num frames: 3768320. Throughput: 0: 1011.6. Samples: 940496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:38:44,242][00436] Avg episode reward: [(0, '23.977')] |
|
[2025-02-14 07:38:49,240][00436] Fps is (10 sec: 3686.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3784704. Throughput: 0: 1006.9. Samples: 945382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:38:49,247][00436] Avg episode reward: [(0, '23.865')] |
|
[2025-02-14 07:38:54,179][04622] Updated weights for policy 0, policy_version 930 (0.0018) |
|
[2025-02-14 07:38:54,240][00436] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3809280. Throughput: 0: 1010.5. Samples: 952136. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:38:54,247][00436] Avg episode reward: [(0, '25.630')] |
|
[2025-02-14 07:38:59,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 3825664. Throughput: 0: 1009.1. Samples: 955590. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-02-14 07:38:59,243][00436] Avg episode reward: [(0, '25.000')] |
|
[2025-02-14 07:39:04,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3846144. Throughput: 0: 1005.9. Samples: 960452. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:39:04,248][00436] Avg episode reward: [(0, '25.738')] |
|
[2025-02-14 07:39:04,871][04622] Updated weights for policy 0, policy_version 940 (0.0020) |
|
[2025-02-14 07:39:09,240][00436] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3866624. Throughput: 0: 1013.2. Samples: 967426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:39:09,244][00436] Avg episode reward: [(0, '26.803')] |
|
[2025-02-14 07:39:14,240][00436] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 3887104. Throughput: 0: 1015.8. Samples: 970908. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:39:14,242][00436] Avg episode reward: [(0, '26.635')] |
|
[2025-02-14 07:39:15,209][04622] Updated weights for policy 0, policy_version 950 (0.0025) |
|
[2025-02-14 07:39:19,240][00436] Fps is (10 sec: 4096.1, 60 sec: 4096.1, 300 sec: 4040.5). Total num frames: 3907584. Throughput: 0: 1018.1. Samples: 975918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:39:19,247][00436] Avg episode reward: [(0, '24.862')] |
|
[2025-02-14 07:39:24,154][04622] Updated weights for policy 0, policy_version 960 (0.0020) |
|
[2025-02-14 07:39:24,240][00436] Fps is (10 sec: 4505.6, 60 sec: 4096.1, 300 sec: 4054.3). Total num frames: 3932160. Throughput: 0: 1018.3. Samples: 982846. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) |
|
[2025-02-14 07:39:24,242][00436] Avg episode reward: [(0, '24.505')] |
|
[2025-02-14 07:39:29,240][00436] Fps is (10 sec: 4095.9, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 3948544. Throughput: 0: 1013.4. Samples: 986098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:39:29,245][00436] Avg episode reward: [(0, '23.735')] |
|
[2025-02-14 07:39:34,240][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3969024. Throughput: 0: 1017.8. Samples: 991184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:39:34,243][00436] Avg episode reward: [(0, '23.052')] |
|
[2025-02-14 07:39:34,843][04622] Updated weights for policy 0, policy_version 970 (0.0012) |
|
[2025-02-14 07:39:39,240][00436] Fps is (10 sec: 4096.1, 60 sec: 4027.8, 300 sec: 4040.5). Total num frames: 3989504. Throughput: 0: 1018.5. Samples: 997968. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-02-14 07:39:39,246][00436] Avg episode reward: [(0, '22.781')] |
|
[2025-02-14 07:39:42,959][04608] Stopping Batcher_0... |
|
[2025-02-14 07:39:42,960][04608] Loop batcher_evt_loop terminating... |
|
[2025-02-14 07:39:42,960][00436] Component Batcher_0 stopped! |
|
[2025-02-14 07:39:42,966][00436] Component RolloutWorker_w3 process died already! Don't wait for it. |
|
[2025-02-14 07:39:42,971][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-02-14 07:39:43,060][04622] Weights refcount: 2 0 |
|
[2025-02-14 07:39:43,072][04622] Stopping InferenceWorker_p0-w0... |
|
[2025-02-14 07:39:43,072][04622] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-02-14 07:39:43,072][00436] Component InferenceWorker_p0-w0 stopped! |
|
[2025-02-14 07:39:43,097][04608] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000777_3182592.pth |
|
[2025-02-14 07:39:43,119][04608] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-02-14 07:39:43,331][00436] Component LearnerWorker_p0 stopped! |
|
[2025-02-14 07:39:43,334][04608] Stopping LearnerWorker_p0... |
|
[2025-02-14 07:39:43,334][04608] Loop learner_proc0_evt_loop terminating... |
|
[2025-02-14 07:39:43,487][04623] Stopping RolloutWorker_w1... |
|
[2025-02-14 07:39:43,488][04623] Loop rollout_proc1_evt_loop terminating... |
|
[2025-02-14 07:39:43,488][00436] Component RolloutWorker_w1 stopped! |
|
[2025-02-14 07:39:43,532][00436] Component RolloutWorker_w0 stopped! |
|
[2025-02-14 07:39:43,542][04621] Stopping RolloutWorker_w0... |
|
[2025-02-14 07:39:43,542][04621] Loop rollout_proc0_evt_loop terminating... |
|
[2025-02-14 07:39:43,551][00436] Component RolloutWorker_w2 stopped! |
|
[2025-02-14 07:39:43,559][04624] Stopping RolloutWorker_w2... |
|
[2025-02-14 07:39:43,559][04624] Loop rollout_proc2_evt_loop terminating... |
|
[2025-02-14 07:39:43,562][00436] Component RolloutWorker_w6 stopped! |
|
[2025-02-14 07:39:43,567][04628] Stopping RolloutWorker_w6... |
|
[2025-02-14 07:39:43,568][04628] Loop rollout_proc6_evt_loop terminating... |
|
[2025-02-14 07:39:43,573][04627] Stopping RolloutWorker_w5... |
|
[2025-02-14 07:39:43,574][04627] Loop rollout_proc5_evt_loop terminating... |
|
[2025-02-14 07:39:43,573][00436] Component RolloutWorker_w4 stopped! |
|
[2025-02-14 07:39:43,577][00436] Component RolloutWorker_w5 stopped! |
|
[2025-02-14 07:39:43,582][04626] Stopping RolloutWorker_w4... |
|
[2025-02-14 07:39:43,583][04626] Loop rollout_proc4_evt_loop terminating... |
|
[2025-02-14 07:39:43,756][04629] Stopping RolloutWorker_w7... |
|
[2025-02-14 07:39:43,759][04629] Loop rollout_proc7_evt_loop terminating... |
|
[2025-02-14 07:39:43,756][00436] Component RolloutWorker_w7 stopped! |
|
[2025-02-14 07:39:43,763][00436] Waiting for process learner_proc0 to stop... |
|
[2025-02-14 07:39:45,451][00436] Waiting for process inference_proc0-0 to join... |
|
[2025-02-14 07:39:45,457][00436] Waiting for process rollout_proc0 to join... |
|
[2025-02-14 07:39:47,695][00436] Waiting for process rollout_proc1 to join... |
|
[2025-02-14 07:39:47,697][00436] Waiting for process rollout_proc2 to join... |
|
[2025-02-14 07:39:47,699][00436] Waiting for process rollout_proc3 to join... |
|
[2025-02-14 07:39:47,700][00436] Waiting for process rollout_proc4 to join... |
|
[2025-02-14 07:39:47,701][00436] Waiting for process rollout_proc5 to join... |
|
[2025-02-14 07:39:47,703][00436] Waiting for process rollout_proc6 to join... |
|
[2025-02-14 07:39:47,705][00436] Waiting for process rollout_proc7 to join... |
|
[2025-02-14 07:39:47,707][00436] Batcher 0 profile tree view: |
|
batching: 24.5308, releasing_batches: 0.0254 |
|
[2025-02-14 07:39:47,709][00436] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000 |
|
wait_policy_total: 403.2455 |
|
update_model: 8.3502 |
|
weight_update: 0.0022 |
|
one_step: 0.0039 |
|
handle_policy_step: 567.3101 |
|
deserialize: 13.6692, stack: 3.2107, obs_to_device_normalize: 121.5577, forward: 298.6928, send_messages: 24.2719 |
|
prepare_outputs: 81.6524 |
|
to_cpu: 51.2355 |
|
[2025-02-14 07:39:47,710][00436] Learner 0 profile tree view: |
|
misc: 0.0038, prepare_batch: 12.6112 |
|
train: 70.2627 |
|
epoch_init: 0.0045, minibatch_init: 0.0065, losses_postprocess: 0.6231, kl_divergence: 0.6273, after_optimizer: 33.0908 |
|
calculate_losses: 24.2577 |
|
losses_init: 0.0032, forward_head: 1.2739, bptt_initial: 16.1137, tail: 1.0098, advantages_returns: 0.2412, losses: 3.4637 |
|
bptt: 1.9242 |
|
bptt_forward_core: 1.8161 |
|
update: 11.0621 |
|
clip: 0.8515 |
|
[2025-02-14 07:39:47,711][00436] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.2696, enqueue_policy_requests: 90.1751, env_step: 811.9072, overhead: 12.5710, complete_rollouts: 8.2949 |
|
save_policy_outputs: 20.0645 |
|
split_output_tensors: 7.8047 |
|
[2025-02-14 07:39:47,713][00436] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.2817, enqueue_policy_requests: 125.1016, env_step: 776.3216, overhead: 11.8984, complete_rollouts: 5.8138 |
|
save_policy_outputs: 16.8293 |
|
split_output_tensors: 6.5355 |
|
[2025-02-14 07:39:47,714][00436] Loop Runner_EvtLoop terminating... |
|
[2025-02-14 07:39:47,716][00436] Runner profile tree view: |
|
main_loop: 1042.0112 |
|
[2025-02-14 07:39:47,717][00436] Collected {0: 4005888}, FPS: 3844.4 |
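End-of-run sanity check: the totals are self-consistent, since 4,005,888 env frames over the 1,042.0 s main loop gives 4,005,888 / 1,042.0 ≈ 3,844.4 FPS, matching the reported average. For reference, a sketch of how a run with these flags is started through Sample Factory's Python API; the flag values come from the command_line recorded in the config dump later in this log, and the helper names assume the sf_examples ViZDoom package is installed alongside sample-factory:

```python
# Sketch (not the verbatim script behind this log): launching the training run.
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.train import run_rl
from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults
from sf_examples.vizdoom.train_vizdoom import register_vizdoom_components

register_vizdoom_components()  # registers the doom_* envs and the vizdoom encoder factory

argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=8",
    "--num_envs_per_worker=4",
    "--train_for_env_steps=4000000",
]
parser, _ = parse_sf_args(argv=argv)
add_doom_env_args(parser)
doom_override_defaults(parser)
status = run_rl(parse_full_cfg(parser, argv))
```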
|
[2025-02-14 07:40:16,022][00436] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-02-14 07:40:16,024][00436] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-02-14 07:40:16,026][00436] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-02-14 07:40:16,028][00436] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-02-14 07:40:16,030][00436] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:40:16,032][00436] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-02-14 07:40:16,033][00436] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:40:16,035][00436] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-02-14 07:40:16,036][00436] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-02-14 07:40:16,037][00436] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-02-14 07:40:16,038][00436] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-02-14 07:40:16,039][00436] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-02-14 07:40:16,040][00436] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-02-14 07:40:16,041][00436] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-02-14 07:40:16,042][00436] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-02-14 07:40:16,080][00436] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:40:16,086][00436] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:40:16,090][00436] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:40:16,109][00436] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:40:16,231][00436] Conv encoder output size: 512 |
|
[2025-02-14 07:40:16,233][00436] Policy head output size: 512 |
|
[2025-02-14 07:40:16,411][00436] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-02-14 07:40:17,152][00436] Num frames 100... |
|
[2025-02-14 07:40:17,285][00436] Num frames 200... |
|
[2025-02-14 07:40:17,418][00436] Num frames 300... |
|
[2025-02-14 07:40:17,548][00436] Num frames 400... |
|
[2025-02-14 07:40:17,744][00436] Avg episode rewards: #0: 10.800, true rewards: #0: 4.800 |
|
[2025-02-14 07:40:17,746][00436] Avg episode reward: 10.800, avg true_objective: 4.800 |
|
[2025-02-14 07:40:17,795][00436] Num frames 500... |
|
[2025-02-14 07:40:17,976][00436] Num frames 600... |
|
[2025-02-14 07:40:18,146][00436] Num frames 700... |
|
[2025-02-14 07:40:18,334][00436] Num frames 800... |
|
[2025-02-14 07:40:18,512][00436] Num frames 900... |
|
[2025-02-14 07:40:18,692][00436] Num frames 1000... |
|
[2025-02-14 07:40:18,866][00436] Num frames 1100... |
|
[2025-02-14 07:40:19,045][00436] Num frames 1200... |
|
[2025-02-14 07:40:19,136][00436] Avg episode rewards: #0: 14.085, true rewards: #0: 6.085 |
|
[2025-02-14 07:40:19,138][00436] Avg episode reward: 14.085, avg true_objective: 6.085 |
|
[2025-02-14 07:40:19,307][00436] Num frames 1300... |
|
[2025-02-14 07:40:19,495][00436] Num frames 1400... |
|
[2025-02-14 07:40:19,677][00436] Num frames 1500... |
|
[2025-02-14 07:40:19,865][00436] Num frames 1600... |
|
[2025-02-14 07:40:20,012][00436] Num frames 1700... |
|
[2025-02-14 07:40:20,168][00436] Num frames 1800... |
|
[2025-02-14 07:40:20,325][00436] Num frames 1900... |
|
[2025-02-14 07:40:20,462][00436] Num frames 2000... |
|
[2025-02-14 07:40:20,595][00436] Num frames 2100... |
|
[2025-02-14 07:40:20,737][00436] Num frames 2200... |
|
[2025-02-14 07:40:20,865][00436] Num frames 2300... |
|
[2025-02-14 07:40:21,001][00436] Num frames 2400... |
|
[2025-02-14 07:40:21,130][00436] Num frames 2500... |
|
[2025-02-14 07:40:21,276][00436] Num frames 2600... |
|
[2025-02-14 07:40:21,387][00436] Avg episode rewards: #0: 19.810, true rewards: #0: 8.810 |
|
[2025-02-14 07:40:21,389][00436] Avg episode reward: 19.810, avg true_objective: 8.810 |
|
[2025-02-14 07:40:21,468][00436] Num frames 2700... |
|
[2025-02-14 07:40:21,599][00436] Num frames 2800... |
|
[2025-02-14 07:40:21,737][00436] Num frames 2900... |
|
[2025-02-14 07:40:21,870][00436] Num frames 3000... |
|
[2025-02-14 07:40:22,010][00436] Num frames 3100... |
|
[2025-02-14 07:40:22,156][00436] Num frames 3200... |
|
[2025-02-14 07:40:22,309][00436] Num frames 3300... |
|
[2025-02-14 07:40:22,451][00436] Num frames 3400... |
|
[2025-02-14 07:40:22,581][00436] Num frames 3500... |
|
[2025-02-14 07:40:22,714][00436] Num frames 3600... |
|
[2025-02-14 07:40:22,847][00436] Num frames 3700... |
|
[2025-02-14 07:40:22,978][00436] Num frames 3800... |
|
[2025-02-14 07:40:23,038][00436] Avg episode rewards: #0: 22.508, true rewards: #0: 9.507 |
|
[2025-02-14 07:40:23,040][00436] Avg episode reward: 22.508, avg true_objective: 9.507 |
|
[2025-02-14 07:40:23,173][00436] Num frames 3900... |
|
[2025-02-14 07:40:23,309][00436] Num frames 4000... |
|
[2025-02-14 07:40:23,451][00436] Num frames 4100... |
|
[2025-02-14 07:40:23,537][00436] Avg episode rewards: #0: 18.846, true rewards: #0: 8.246 |
|
[2025-02-14 07:40:23,539][00436] Avg episode reward: 18.846, avg true_objective: 8.246 |
|
[2025-02-14 07:40:23,646][00436] Num frames 4200... |
|
[2025-02-14 07:40:23,779][00436] Num frames 4300... |
|
[2025-02-14 07:40:23,911][00436] Num frames 4400... |
|
[2025-02-14 07:40:24,047][00436] Num frames 4500... |
|
[2025-02-14 07:40:24,188][00436] Num frames 4600... |
|
[2025-02-14 07:40:24,326][00436] Num frames 4700... |
|
[2025-02-14 07:40:24,470][00436] Num frames 4800... |
|
[2025-02-14 07:40:24,607][00436] Num frames 4900... |
|
[2025-02-14 07:40:24,739][00436] Num frames 5000... |
|
[2025-02-14 07:40:24,862][00436] Avg episode rewards: #0: 18.918, true rewards: #0: 8.418 |
|
[2025-02-14 07:40:24,864][00436] Avg episode reward: 18.918, avg true_objective: 8.418 |
|
[2025-02-14 07:40:24,934][00436] Num frames 5100... |
|
[2025-02-14 07:40:25,069][00436] Num frames 5200... |
|
[2025-02-14 07:40:25,212][00436] Num frames 5300... |
|
[2025-02-14 07:40:25,344][00436] Num frames 5400... |
|
[2025-02-14 07:40:25,488][00436] Num frames 5500... |
|
[2025-02-14 07:40:25,621][00436] Num frames 5600... |
|
[2025-02-14 07:40:25,751][00436] Num frames 5700... |
|
[2025-02-14 07:40:25,884][00436] Num frames 5800... |
|
[2025-02-14 07:40:26,018][00436] Num frames 5900... |
|
[2025-02-14 07:40:26,147][00436] Num frames 6000... |
|
[2025-02-14 07:40:26,286][00436] Num frames 6100... |
|
[2025-02-14 07:40:26,429][00436] Num frames 6200... |
|
[2025-02-14 07:40:26,562][00436] Num frames 6300... |
|
[2025-02-14 07:40:26,694][00436] Num frames 6400... |
|
[2025-02-14 07:40:26,831][00436] Num frames 6500... |
|
[2025-02-14 07:40:27,002][00436] Avg episode rewards: #0: 21.696, true rewards: #0: 9.410 |
|
[2025-02-14 07:40:27,004][00436] Avg episode reward: 21.696, avg true_objective: 9.410 |
|
[2025-02-14 07:40:27,025][00436] Num frames 6600... |
|
[2025-02-14 07:40:27,157][00436] Num frames 6700... |
|
[2025-02-14 07:40:27,298][00436] Num frames 6800... |
|
[2025-02-14 07:40:27,439][00436] Num frames 6900... |
|
[2025-02-14 07:40:27,573][00436] Num frames 7000... |
|
[2025-02-14 07:40:27,710][00436] Num frames 7100... |
|
[2025-02-14 07:40:27,840][00436] Num frames 7200... |
|
[2025-02-14 07:40:27,969][00436] Num frames 7300... |
|
[2025-02-14 07:40:28,100][00436] Num frames 7400... |
|
[2025-02-14 07:40:28,182][00436] Avg episode rewards: #0: 21.024, true rewards: #0: 9.274 |
|
[2025-02-14 07:40:28,187][00436] Avg episode reward: 21.024, avg true_objective: 9.274 |
|
[2025-02-14 07:40:28,299][00436] Num frames 7500... |
|
[2025-02-14 07:40:28,429][00436] Num frames 7600... |
|
[2025-02-14 07:40:28,566][00436] Num frames 7700... |
|
[2025-02-14 07:40:28,697][00436] Num frames 7800... |
|
[2025-02-14 07:40:28,830][00436] Num frames 7900... |
|
[2025-02-14 07:40:28,972][00436] Num frames 8000... |
|
[2025-02-14 07:40:29,111][00436] Num frames 8100... |
|
[2025-02-14 07:40:29,255][00436] Num frames 8200... |
|
[2025-02-14 07:40:29,421][00436] Avg episode rewards: #0: 20.537, true rewards: #0: 9.203 |
|
[2025-02-14 07:40:29,422][00436] Avg episode reward: 20.537, avg true_objective: 9.203 |
|
[2025-02-14 07:40:29,449][00436] Num frames 8300... |
|
[2025-02-14 07:40:29,593][00436] Num frames 8400... |
|
[2025-02-14 07:40:29,726][00436] Num frames 8500... |
|
[2025-02-14 07:40:29,863][00436] Num frames 8600... |
|
[2025-02-14 07:40:30,044][00436] Num frames 8700... |
|
[2025-02-14 07:40:30,225][00436] Num frames 8800... |
|
[2025-02-14 07:40:30,389][00436] Avg episode rewards: #0: 19.459, true rewards: #0: 8.859 |
|
[2025-02-14 07:40:30,391][00436] Avg episode reward: 19.459, avg true_objective: 8.859 |
|
[2025-02-14 07:41:25,146][00436] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
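The block above is the evaluation ("enjoy") pass: the checkpoint at policy version 978 is rolled out for 10 episodes, per-episode rewards are logged, and the rendered frames are stitched into replay.mp4. Evaluation uses frameskip 1 with render_action_repeat=4 (per the log above), so every frame is kept for the 35-fps video while the policy still acts on every 4th frame, as in training. A sketch of the equivalent invocation, under the same installation assumptions as the training sketch above:

```python
# Sketch: the evaluation pass that produced the episode stats and replay.mp4.
# Flags mirror the "Adding new argument ..." overrides logged at 07:40:16.
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.enjoy import enjoy
from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults
from sf_examples.vizdoom.train_vizdoom import register_vizdoom_components

register_vizdoom_components()

argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_episodes=10",
]
parser, _ = parse_sf_args(argv=argv, evaluation=True)
add_doom_env_args(parser)
doom_override_defaults(parser)
status = enjoy(parse_full_cfg(parser, argv))
```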
|
[2025-02-14 07:43:28,512][00436] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-02-14 07:43:28,514][00436] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-02-14 07:43:28,517][00436] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-02-14 07:43:28,520][00436] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-02-14 07:43:28,523][00436] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:43:28,526][00436] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-02-14 07:43:28,528][00436] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-02-14 07:43:28,530][00436] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-02-14 07:43:28,533][00436] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-02-14 07:43:28,534][00436] Adding new argument 'hf_repository'='gyaan/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-02-14 07:43:28,535][00436] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-02-14 07:43:28,537][00436] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-02-14 07:43:28,538][00436] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-02-14 07:43:28,540][00436] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-02-14 07:43:28,542][00436] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-02-14 07:43:28,642][00436] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:43:28,647][00436] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:43:28,721][00436] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:43:28,926][00436] Conv encoder output size: 512 |
|
[2025-02-14 07:43:28,931][00436] Policy head output size: 512 |
|
[2025-02-14 07:43:28,957][00436] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-02-14 07:43:29,841][00436] Num frames 100... |
|
[2025-02-14 07:43:30,020][00436] Num frames 200... |
|
[2025-02-14 07:43:30,212][00436] Num frames 300... |
|
[2025-02-14 07:43:30,401][00436] Num frames 400... |
|
[2025-02-14 07:43:30,537][00436] Num frames 500... |
|
[2025-02-14 07:43:30,673][00436] Num frames 600... |
|
[2025-02-14 07:43:30,822][00436] Num frames 700... |
|
[2025-02-14 07:43:30,884][00436] Avg episode rewards: #0: 13.040, true rewards: #0: 7.040 |
|
[2025-02-14 07:43:30,886][00436] Avg episode reward: 13.040, avg true_objective: 7.040 |
|
[2025-02-14 07:43:31,013][00436] Num frames 800... |
|
[2025-02-14 07:43:31,144][00436] Num frames 900... |
|
[2025-02-14 07:43:31,281][00436] Num frames 1000... |
|
[2025-02-14 07:43:31,413][00436] Num frames 1100... |
|
[2025-02-14 07:43:31,544][00436] Num frames 1200... |
|
[2025-02-14 07:43:31,675][00436] Num frames 1300... |
|
[2025-02-14 07:43:31,825][00436] Num frames 1400... |
|
[2025-02-14 07:43:31,956][00436] Num frames 1500... |
|
[2025-02-14 07:43:32,088][00436] Avg episode rewards: #0: 16.285, true rewards: #0: 7.785 |
|
[2025-02-14 07:43:32,089][00436] Avg episode reward: 16.285, avg true_objective: 7.785 |
|
[2025-02-14 07:43:32,150][00436] Num frames 1600... |
|
[2025-02-14 07:43:32,290][00436] Num frames 1700... |
|
[2025-02-14 07:43:32,421][00436] Num frames 1800... |
|
[2025-02-14 07:43:32,551][00436] Num frames 1900... |
|
[2025-02-14 07:43:32,681][00436] Num frames 2000... |
|
[2025-02-14 07:43:32,815][00436] Num frames 2100... |
|
[2025-02-14 07:43:32,948][00436] Num frames 2200... |
|
[2025-02-14 07:43:33,080][00436] Num frames 2300... |
|
[2025-02-14 07:43:33,225][00436] Num frames 2400... |
|
[2025-02-14 07:43:33,357][00436] Num frames 2500... |
|
[2025-02-14 07:43:33,487][00436] Num frames 2600... |
|
[2025-02-14 07:43:33,620][00436] Num frames 2700... |
|
[2025-02-14 07:43:33,750][00436] Num frames 2800... |
|
[2025-02-14 07:43:33,896][00436] Num frames 2900... |
|
[2025-02-14 07:43:34,026][00436] Num frames 3000... |
|
[2025-02-14 07:43:34,155][00436] Num frames 3100... |
|
[2025-02-14 07:43:34,296][00436] Num frames 3200... |
|
[2025-02-14 07:43:34,429][00436] Num frames 3300... |
|
[2025-02-14 07:43:34,561][00436] Num frames 3400... |
|
[2025-02-14 07:43:34,692][00436] Num frames 3500... |
|
[2025-02-14 07:43:34,822][00436] Num frames 3600... |
|
[2025-02-14 07:43:34,959][00436] Avg episode rewards: #0: 29.190, true rewards: #0: 12.190 |
|
[2025-02-14 07:43:34,961][00436] Avg episode reward: 29.190, avg true_objective: 12.190 |
|
[2025-02-14 07:43:35,016][00436] Num frames 3700... |
|
[2025-02-14 07:43:35,146][00436] Num frames 3800... |
|
[2025-02-14 07:43:35,286][00436] Num frames 3900... |
|
[2025-02-14 07:43:35,415][00436] Num frames 4000... |
|
[2025-02-14 07:43:35,544][00436] Num frames 4100... |
|
[2025-02-14 07:43:35,674][00436] Num frames 4200... |
|
[2025-02-14 07:43:35,803][00436] Num frames 4300... |
|
[2025-02-14 07:43:35,942][00436] Num frames 4400... |
|
[2025-02-14 07:43:36,072][00436] Num frames 4500... |
|
[2025-02-14 07:43:36,158][00436] Avg episode rewards: #0: 25.802, true rewards: #0: 11.303 |
|
[2025-02-14 07:43:36,160][00436] Avg episode reward: 25.802, avg true_objective: 11.303 |
|
[2025-02-14 07:43:36,271][00436] Num frames 4600... |
|
[2025-02-14 07:43:36,400][00436] Num frames 4700... |
|
[2025-02-14 07:43:36,535][00436] Num frames 4800... |
|
[2025-02-14 07:43:36,665][00436] Num frames 4900... |
|
[2025-02-14 07:43:36,796][00436] Num frames 5000... |
|
[2025-02-14 07:43:36,943][00436] Num frames 5100... |
|
[2025-02-14 07:43:37,074][00436] Num frames 5200... |
|
[2025-02-14 07:43:37,212][00436] Num frames 5300... |
|
[2025-02-14 07:43:37,338][00436] Avg episode rewards: #0: 24.706, true rewards: #0: 10.706 |
|
[2025-02-14 07:43:37,340][00436] Avg episode reward: 24.706, avg true_objective: 10.706 |
|
[2025-02-14 07:43:37,402][00436] Num frames 5400... |
|
[2025-02-14 07:43:37,535][00436] Num frames 5500... |
|
[2025-02-14 07:43:37,666][00436] Num frames 5600... |
|
[2025-02-14 07:43:37,796][00436] Num frames 5700... |
|
[2025-02-14 07:43:37,929][00436] Num frames 5800... |
|
[2025-02-14 07:43:38,065][00436] Num frames 5900... |
|
[2025-02-14 07:43:38,207][00436] Num frames 6000... |
|
[2025-02-14 07:43:38,381][00436] Avg episode rewards: #0: 23.315, true rewards: #0: 10.148 |
|
[2025-02-14 07:43:38,384][00436] Avg episode reward: 23.315, avg true_objective: 10.148 |
|
[2025-02-14 07:43:38,401][00436] Num frames 6100... |
|
[2025-02-14 07:43:38,532][00436] Num frames 6200... |
|
[2025-02-14 07:43:38,662][00436] Num frames 6300... |
|
[2025-02-14 07:43:38,799][00436] Num frames 6400... |
|
[2025-02-14 07:43:38,867][00436] Avg episode rewards: #0: 20.584, true rewards: #0: 9.156 |
|
[2025-02-14 07:43:38,869][00436] Avg episode reward: 20.584, avg true_objective: 9.156 |
|
[2025-02-14 07:43:38,997][00436] Num frames 6500... |
|
[2025-02-14 07:43:39,131][00436] Num frames 6600... |
|
[2025-02-14 07:43:39,267][00436] Num frames 6700... |
|
[2025-02-14 07:43:39,396][00436] Num frames 6800... |
|
[2025-02-14 07:43:39,528][00436] Num frames 6900... |
|
[2025-02-14 07:43:39,660][00436] Num frames 7000... |
|
[2025-02-14 07:43:39,821][00436] Avg episode rewards: #0: 19.976, true rewards: #0: 8.851 |
|
[2025-02-14 07:43:39,823][00436] Avg episode reward: 19.976, avg true_objective: 8.851 |
|
[2025-02-14 07:43:39,855][00436] Num frames 7100... |
|
[2025-02-14 07:43:39,992][00436] Num frames 7200... |
|
[2025-02-14 07:43:40,129][00436] Num frames 7300... |
|
[2025-02-14 07:43:40,266][00436] Num frames 7400... |
|
[2025-02-14 07:43:40,410][00436] Num frames 7500... |
|
[2025-02-14 07:43:40,591][00436] Num frames 7600... |
|
[2025-02-14 07:43:40,765][00436] Num frames 7700... |
|
[2025-02-14 07:43:40,940][00436] Num frames 7800... |
|
[2025-02-14 07:43:41,120][00436] Num frames 7900... |
|
[2025-02-14 07:43:41,308][00436] Num frames 8000... |
|
[2025-02-14 07:43:41,479][00436] Num frames 8100... |
|
[2025-02-14 07:43:41,649][00436] Num frames 8200... |
|
[2025-02-14 07:43:41,826][00436] Num frames 8300... |
|
[2025-02-14 07:43:42,004][00436] Num frames 8400... |
|
[2025-02-14 07:43:42,116][00436] Avg episode rewards: #0: 21.254, true rewards: #0: 9.366 |
|
[2025-02-14 07:43:42,118][00436] Avg episode reward: 21.254, avg true_objective: 9.366 |
|
[2025-02-14 07:43:42,260][00436] Num frames 8500... |
|
[2025-02-14 07:43:42,445][00436] Num frames 8600... |
|
[2025-02-14 07:43:42,625][00436] Num frames 8700... |
|
[2025-02-14 07:43:42,765][00436] Num frames 8800... |
|
[2025-02-14 07:43:42,898][00436] Num frames 8900... |
|
[2025-02-14 07:43:43,032][00436] Num frames 9000... |
|
[2025-02-14 07:43:43,178][00436] Num frames 9100... |
|
[2025-02-14 07:43:43,309][00436] Num frames 9200... |
|
[2025-02-14 07:43:43,449][00436] Num frames 9300... |
|
[2025-02-14 07:43:43,580][00436] Num frames 9400... |
|
[2025-02-14 07:43:43,685][00436] Avg episode rewards: #0: 21.437, true rewards: #0: 9.437 |
|
[2025-02-14 07:43:43,686][00436] Avg episode reward: 21.437, avg true_objective: 9.437 |
|
[2025-02-14 07:44:38,128][00436] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
[2025-02-14 07:44:42,755][00436] The model has been pushed to https://huggingface.co/gyaan/rl_course_vizdoom_health_gathering_supreme |
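This second enjoy pass differs from the first only in its extra flags: max_num_frames=100000 caps the evaluation, and push_to_hub/hf_repository upload the experiment directory (checkpoint, config, and the replay video) to the Hub once evaluation finishes. A sketch of the flag list, fed to the same enjoy() entry point as in the evaluation sketch above; pushing assumes prior Hugging Face authentication (e.g. huggingface-cli login):

```python
# Flags for the upload pass; passed to the same enjoy() sketch shown earlier.
eval_argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_episodes=10",
    "--max_num_frames=100000",
    "--push_to_hub",
    "--hf_repository=gyaan/rl_course_vizdoom_health_gathering_supreme",
]
```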
|
[2025-02-14 07:48:00,387][00436] Environment doom_basic already registered, overwriting... |
|
[2025-02-14 07:48:00,389][00436] Environment doom_two_colors_easy already registered, overwriting... |
|
[2025-02-14 07:48:00,391][00436] Environment doom_two_colors_hard already registered, overwriting... |
|
[2025-02-14 07:48:00,393][00436] Environment doom_dm already registered, overwriting... |
|
[2025-02-14 07:48:00,398][00436] Environment doom_dwango5 already registered, overwriting... |
|
[2025-02-14 07:48:00,399][00436] Environment doom_my_way_home_flat_actions already registered, overwriting... |
|
[2025-02-14 07:48:00,400][00436] Environment doom_defend_the_center_flat_actions already registered, overwriting... |
|
[2025-02-14 07:48:00,401][00436] Environment doom_my_way_home already registered, overwriting... |
|
[2025-02-14 07:48:00,405][00436] Environment doom_deadly_corridor already registered, overwriting... |
|
[2025-02-14 07:48:00,406][00436] Environment doom_defend_the_center already registered, overwriting... |
|
[2025-02-14 07:48:00,407][00436] Environment doom_defend_the_line already registered, overwriting... |
|
[2025-02-14 07:48:00,408][00436] Environment doom_health_gathering already registered, overwriting... |
|
[2025-02-14 07:48:00,409][00436] Environment doom_health_gathering_supreme already registered, overwriting... |
|
[2025-02-14 07:48:00,413][00436] Environment doom_battle already registered, overwriting... |
|
[2025-02-14 07:48:00,414][00436] Environment doom_battle2 already registered, overwriting... |
|
[2025-02-14 07:48:00,415][00436] Environment doom_duel_bots already registered, overwriting... |
|
[2025-02-14 07:48:00,416][00436] Environment doom_deathmatch_bots already registered, overwriting... |
|
[2025-02-14 07:48:00,417][00436] Environment doom_duel already registered, overwriting... |
|
[2025-02-14 07:48:00,417][00436] Environment doom_deathmatch_full already registered, overwriting... |
|
[2025-02-14 07:48:00,418][00436] Environment doom_benchmark already registered, overwriting... |
|
[2025-02-14 07:48:00,419][00436] register_encoder_factory: <function make_vizdoom_encoder at 0x790af59fec00> |
|
[2025-02-14 07:48:00,444][00436] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-02-14 07:48:00,453][00436] Overriding arg 'train_for_env_steps' with value 5000000 passed from command line |
|
[2025-02-14 07:48:00,465][00436] Experiment dir /content/train_dir/default_experiment already exists! |
|
[2025-02-14 07:48:00,467][00436] Resuming existing experiment from /content/train_dir/default_experiment... |
|
[2025-02-14 07:48:00,468][00436] Weights and Biases integration disabled |
|
[2025-02-14 07:48:00,473][00436] Environment var CUDA_VISIBLE_DEVICES is 0 |
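Resume note: because restart_behavior=resume, load_checkpoint_kind=latest, and the experiment dir already exists, this second run reloads config.json and the latest checkpoint and simply continues training, with only train_for_env_steps raised to 5,000,000 from the command line. (The command_line/cli_args entries in the dump below still show the original 4,000,000; they record the first invocation stored in the saved config.) A sketch of the resume invocation, under the same assumptions as the training sketch above:

```python
# Resume sketch: same parse/run_rl path as the earlier training sketch;
# only the env-step budget changes, everything else comes from config.json.
resume_argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=8",
    "--num_envs_per_worker=4",
    "--train_for_env_steps=5000000",
]
```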
|
[2025-02-14 07:48:03,692][00436] Starting experiment with the following configuration: |
|
help=False |
|
algo=APPO |
|
env=doom_health_gathering_supreme |
|
experiment=default_experiment |
|
train_dir=/content/train_dir |
|
restart_behavior=resume |
|
device=gpu |
|
seed=None |
|
num_policies=1 |
|
async_rl=True |
|
serial_mode=False |
|
batched_sampling=False |
|
num_batches_to_accumulate=2 |
|
worker_num_splits=2 |
|
policy_workers_per_policy=1 |
|
max_policy_lag=1000 |
|
num_workers=8 |
|
num_envs_per_worker=4 |
|
batch_size=1024 |
|
num_batches_per_epoch=1 |
|
num_epochs=1 |
|
rollout=32 |
|
recurrence=32 |
|
shuffle_minibatches=False |
|
gamma=0.99 |
|
reward_scale=1.0 |
|
reward_clip=1000.0 |
|
value_bootstrap=False |
|
normalize_returns=True |
|
exploration_loss_coeff=0.001 |
|
value_loss_coeff=0.5 |
|
kl_loss_coeff=0.0 |
|
exploration_loss=symmetric_kl |
|
gae_lambda=0.95 |
|
ppo_clip_ratio=0.1 |
|
ppo_clip_value=0.2 |
|
with_vtrace=False |
|
vtrace_rho=1.0 |
|
vtrace_c=1.0 |
|
optimizer=adam |
|
adam_eps=1e-06 |
|
adam_beta1=0.9 |
|
adam_beta2=0.999 |
|
max_grad_norm=4.0 |
|
learning_rate=0.0001 |
|
lr_schedule=constant |
|
lr_schedule_kl_threshold=0.008 |
|
lr_adaptive_min=1e-06 |
|
lr_adaptive_max=0.01 |
|
obs_subtract_mean=0.0 |
|
obs_scale=255.0 |
|
normalize_input=True |
|
normalize_input_keys=None |
|
decorrelate_experience_max_seconds=0 |
|
decorrelate_envs_on_one_worker=True |
|
actor_worker_gpus=[] |
|
set_workers_cpu_affinity=True |
|
force_envs_single_thread=False |
|
default_niceness=0 |
|
log_to_file=True |
|
experiment_summaries_interval=10 |
|
flush_summaries_interval=30 |
|
stats_avg=100 |
|
summaries_use_frameskip=True |
|
heartbeat_interval=20 |
|
heartbeat_reporting_interval=600 |
|
train_for_env_steps=5000000 |
|
train_for_seconds=10000000000 |
|
save_every_sec=120 |
|
keep_checkpoints=2 |
|
load_checkpoint_kind=latest |
|
save_milestones_sec=-1 |
|
save_best_every_sec=5 |
|
save_best_metric=reward |
|
save_best_after=100000 |
|
benchmark=False |
|
encoder_mlp_layers=[512, 512] |
|
encoder_conv_architecture=convnet_simple |
|
encoder_conv_mlp_layers=[512] |
|
use_rnn=True |
|
rnn_size=512 |
|
rnn_type=gru |
|
rnn_num_layers=1 |
|
decoder_mlp_layers=[] |
|
nonlinearity=elu |
|
policy_initialization=orthogonal |
|
policy_init_gain=1.0 |
|
actor_critic_share_weights=True |
|
adaptive_stddev=True |
|
continuous_tanh_scale=0.0 |
|
initial_stddev=1.0 |
|
use_env_info_cache=False |
|
env_gpu_actions=False |
|
env_gpu_observations=True |
|
env_frameskip=4 |
|
env_framestack=1 |
|
pixel_format=CHW |
|
use_record_episode_statistics=False |
|
with_wandb=False |
|
wandb_user=None |
|
wandb_project=sample_factory |
|
wandb_group=None |
|
wandb_job_type=SF |
|
wandb_tags=[] |
|
with_pbt=False |
|
pbt_mix_policies_in_one_env=True |
|
pbt_period_env_steps=5000000 |
|
pbt_start_mutation=20000000 |
|
pbt_replace_fraction=0.3 |
|
pbt_mutation_rate=0.15 |
|
pbt_replace_reward_gap=0.1 |
|
pbt_replace_reward_gap_absolute=1e-06 |
|
pbt_optimize_gamma=False |
|
pbt_target_objective=true_objective |
|
pbt_perturb_min=1.1 |
|
pbt_perturb_max=1.5 |
|
num_agents=-1 |
|
num_humans=0 |
|
num_bots=-1 |
|
start_bot_difficulty=None |
|
timelimit=None |
|
res_w=128 |
|
res_h=72 |
|
wide_aspect_ratio=False |
|
eval_env_frameskip=1 |
|
fps=35 |
|
command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000000 |
|
cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000000} |
|
git_hash=unknown |
|
git_repo_name=not a git repository |
|
[2025-02-14 07:48:03,694][00436] Saving configuration to /content/train_dir/default_experiment/config.json... |
|
[2025-02-14 07:48:03,696][00436] Rollout worker 0 uses device cpu |
|
[2025-02-14 07:48:03,699][00436] Rollout worker 1 uses device cpu |
|
[2025-02-14 07:48:03,701][00436] Rollout worker 2 uses device cpu |
|
[2025-02-14 07:48:03,702][00436] Rollout worker 3 uses device cpu |
|
[2025-02-14 07:48:03,703][00436] Rollout worker 4 uses device cpu |
|
[2025-02-14 07:48:03,710][00436] Rollout worker 5 uses device cpu |
|
[2025-02-14 07:48:03,712][00436] Rollout worker 6 uses device cpu |
|
[2025-02-14 07:48:03,713][00436] Rollout worker 7 uses device cpu |
|
[2025-02-14 07:48:03,787][00436] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:48:03,790][00436] InferenceWorker_p0-w0: min num requests: 2 |
|
[2025-02-14 07:48:03,821][00436] Starting all processes... |
|
[2025-02-14 07:48:03,822][00436] Starting process learner_proc0 |
|
[2025-02-14 07:48:03,886][00436] Starting all processes... |
|
[2025-02-14 07:48:03,898][00436] Starting process inference_proc0-0 |
|
[2025-02-14 07:48:03,898][00436] Starting process rollout_proc0 |
|
[2025-02-14 07:48:03,899][00436] Starting process rollout_proc1 |
|
[2025-02-14 07:48:03,899][00436] Starting process rollout_proc2 |
|
[2025-02-14 07:48:03,900][00436] Starting process rollout_proc3 |
|
[2025-02-14 07:48:03,900][00436] Starting process rollout_proc4 |
|
[2025-02-14 07:48:03,901][00436] Starting process rollout_proc5 |
|
[2025-02-14 07:48:03,903][00436] Starting process rollout_proc6 |
|
[2025-02-14 07:48:03,903][00436] Starting process rollout_proc7 |
|
[2025-02-14 07:48:19,194][13629] Worker 4 uses CPU cores [0] |
|
[2025-02-14 07:48:19,300][13627] Worker 2 uses CPU cores [0] |
|
[2025-02-14 07:48:19,403][13631] Worker 6 uses CPU cores [0] |
|
[2025-02-14 07:48:19,410][13630] Worker 5 uses CPU cores [1] |
|
[2025-02-14 07:48:19,419][13626] Worker 1 uses CPU cores [1] |
|
[2025-02-14 07:48:19,511][13607] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:48:19,512][13607] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2025-02-14 07:48:19,524][13632] Worker 7 uses CPU cores [1] |
|
[2025-02-14 07:48:19,540][13624] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:48:19,541][13624] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2025-02-14 07:48:19,565][13624] Num visible devices: 1 |
|
[2025-02-14 07:48:19,566][13607] Num visible devices: 1 |
|
[2025-02-14 07:48:19,575][13628] Worker 3 uses CPU cores [1] |
|
[2025-02-14 07:48:19,579][13607] Starting seed is not provided |
|
[2025-02-14 07:48:19,579][13607] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-02-14 07:48:19,579][13607] Initializing actor-critic model on device cuda:0 |
|
[2025-02-14 07:48:19,580][13607] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:48:19,581][13607] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:48:19,592][13625] Worker 0 uses CPU cores [0] |
|
[2025-02-14 07:48:19,600][13607] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:48:19,718][13607] Conv encoder output size: 512 |
|
[2025-02-14 07:48:19,718][13607] Policy head output size: 512 |
|
[2025-02-14 07:48:19,734][13607] Created Actor Critic model with architecture: |
|
[2025-02-14 07:48:19,734][13607] ActorCriticSharedWeights( |
|
(obs_normalizer): ObservationNormalizer( |
|
(running_mean_std): RunningMeanStdDictInPlace( |
|
(running_mean_std): ModuleDict( |
|
(obs): RunningMeanStdInPlace() |
|
) |
|
) |
|
) |
|
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) |
|
(encoder): VizdoomEncoder( |
|
(basic_encoder): ConvEncoder( |
|
(enc): RecursiveScriptModule( |
|
original_name=ConvEncoderImpl |
|
(conv_head): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Conv2d) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
(2): RecursiveScriptModule(original_name=Conv2d) |
|
(3): RecursiveScriptModule(original_name=ELU) |
|
(4): RecursiveScriptModule(original_name=Conv2d) |
|
(5): RecursiveScriptModule(original_name=ELU) |
|
) |
|
(mlp_layers): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Linear) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
) |
|
) |
|
) |
|
) |
|
(core): ModelCoreRNN( |
|
(core): GRU(512, 512) |
|
) |
|
(decoder): MlpDecoder( |
|
(mlp): Identity() |
|
) |
|
(critic_linear): Linear(in_features=512, out_features=1, bias=True) |
|
(action_parameterization): ActionParameterizationDefault( |
|
(distribution_linear): Linear(in_features=512, out_features=5, bias=True) |
|
) |
|
) |
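The printed ActorCriticSharedWeights module can be reproduced shape-for-shape in plain PyTorch. A sketch, assuming convnet_simple uses the filter spec [32, 8, 4], [64, 4, 2], [128, 3, 2]; the flatten size is computed from a dummy forward rather than hardcoded, so the Linear matches the printed "Conv encoder output size: 512" regardless of that assumption:

import torch
import torch.nn as nn

# Conv head mirroring the three Conv2d+ELU pairs printed above
# (filter spec is an assumption about convnet_simple).
conv = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
)
with torch.no_grad():
    # Input shape (3, 72, 128) from the RunningMeanStd lines above.
    n_flat = conv(torch.zeros(1, 3, 72, 128)).flatten(1).shape[1]
encoder_mlp = nn.Sequential(nn.Linear(n_flat, 512), nn.ELU())  # encoder_conv_mlp_layers=[512]
core = nn.GRU(512, 512)            # matches "(core): GRU(512, 512)"
critic = nn.Linear(512, 1)         # value head, matches critic_linear
action_logits = nn.Linear(512, 5)  # 5 discrete actions, matches distribution_linear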
|
[2025-02-14 07:48:19,858][13607] Using optimizer <class 'torch.optim.adam.Adam'> |
|
[2025-02-14 07:48:21,021][13607] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-02-14 07:48:21,059][13607] Loading model from checkpoint |
|
[2025-02-14 07:48:21,061][13607] Loaded experiment state at self.train_step=978, self.env_steps=4005888 |
|
[2025-02-14 07:48:21,061][13607] Initialized policy 0 weights for model version 978 |
|
[2025-02-14 07:48:21,063][13607] LearnerWorker_p0 finished initialization! |
|
[2025-02-14 07:48:21,064][13607] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
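The checkpoint filename encodes both counters reported here: checkpoint_000000978_4005888.pth is train_step 978 at env_steps 4005888, i.e. 4096 env frames per train step, consistent with 8 workers x 4 envs per worker x 32-step rollouts (Sample Factory's default rollout length) x env_frameskip=4. A quick check:

name = "checkpoint_000000978_4005888.pth"
_, train_step, env_steps = name.rsplit(".", 1)[0].split("_")
assert int(train_step) == 978 and int(env_steps) == 4005888
print(int(env_steps) // int(train_step))  # 4096 env frames per train step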
|
[2025-02-14 07:48:21,190][13624] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:48:21,192][13624] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:48:21,204][13624] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:48:21,305][13624] Conv encoder output size: 512 |
|
[2025-02-14 07:48:21,305][13624] Policy head output size: 512 |
|
[2025-02-14 07:48:21,343][00436] Inference worker 0-0 is ready! |
|
[2025-02-14 07:48:21,344][00436] All inference workers are ready! Signal rollout workers to start! |
|
[2025-02-14 07:48:21,595][13629] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,670][13631] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,668][13628] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,675][13632] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,697][13625] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,723][13630] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,740][13627] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:21,774][13626] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-02-14 07:48:23,118][13628] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,114][13632] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,593][13629] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,632][13631] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,653][13625] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,673][13627] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:23,779][00436] Heartbeat connected on Batcher_0 |
|
[2025-02-14 07:48:23,784][00436] Heartbeat connected on LearnerWorker_p0 |
|
[2025-02-14 07:48:23,814][00436] Heartbeat connected on InferenceWorker_p0-w0 |
|
[2025-02-14 07:48:24,111][13632] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:24,201][13630] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:24,527][13631] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:24,597][13627] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:24,801][13628] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:25,297][13625] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:25,301][13626] Decorrelating experience for 0 frames... |
|
[2025-02-14 07:48:25,473][00436] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-02-14 07:48:26,119][13630] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:26,149][13631] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:26,748][13628] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:26,998][13632] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:27,257][13629] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:27,730][13625] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:28,211][13630] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:28,218][13627] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:28,639][13628] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:29,022][00436] Heartbeat connected on RolloutWorker_w3 |
|
[2025-02-14 07:48:29,138][13631] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:29,793][00436] Heartbeat connected on RolloutWorker_w6 |
|
[2025-02-14 07:48:30,180][13629] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:30,291][13632] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:30,301][13626] Decorrelating experience for 32 frames... |
|
[2025-02-14 07:48:30,421][13625] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:30,473][00436] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 12.0. Samples: 60. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-02-14 07:48:30,600][00436] Heartbeat connected on RolloutWorker_w7 |
|
[2025-02-14 07:48:30,889][00436] Heartbeat connected on RolloutWorker_w0 |
|
[2025-02-14 07:48:31,876][13627] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:32,485][00436] Heartbeat connected on RolloutWorker_w2 |
|
[2025-02-14 07:48:32,744][13630] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:33,274][00436] Heartbeat connected on RolloutWorker_w5 |
|
[2025-02-14 07:48:33,899][13626] Decorrelating experience for 64 frames... |
|
[2025-02-14 07:48:34,563][13607] Signal inference workers to stop experience collection... |
|
[2025-02-14 07:48:34,588][13624] InferenceWorker_p0-w0: stopping experience collection |
|
[2025-02-14 07:48:35,005][13626] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:35,045][13629] Decorrelating experience for 96 frames... |
|
[2025-02-14 07:48:35,127][00436] Heartbeat connected on RolloutWorker_w4 |
|
[2025-02-14 07:48:35,173][00436] Heartbeat connected on RolloutWorker_w1 |
|
[2025-02-14 07:48:35,473][00436] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 179.2. Samples: 1792. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-02-14 07:48:35,485][00436] Avg episode reward: [(0, '5.519')] |
|
[2025-02-14 07:48:35,556][13607] Signal inference workers to resume experience collection... |
|
[2025-02-14 07:48:35,557][13624] InferenceWorker_p0-w0: resuming experience collection |
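The 0/32/64/96 counts in the warm-up above come from decorrelate_envs_on_one_worker=True: each of the 4 envs on a worker is stepped for env_index x 32 frames (one rollout length apart) before collection starts, so rollout boundaries stop lining up across envs. A sketch of the offsets, not Sample Factory's exact code:

rollout = 32               # frames per rollout (Sample Factory default)
num_envs_per_worker = 4    # from the config above
offsets = [i * rollout for i in range(num_envs_per_worker)]
print(offsets)  # [0, 32, 64, 96] -- the frame counts in the log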
|
[2025-02-14 07:48:40,474][00436] Fps is (10 sec: 2457.5, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 4030464. Throughput: 0: 439.6. Samples: 6594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:48:40,476][00436] Avg episode reward: [(0, '10.389')] |
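The three FPS numbers are moving averages over 10 s, 60 s, and 300 s windows, and they can be checked against the frame counters: 4030464 - 4005888 = 24576 new frames since the previous report roughly 10 s earlier, i.e. about 2457.6 FPS, matching the logged "10 sec: 2457.5" up to timer jitter.

frames_now, frames_prev = 4030464, 4005888
print((frames_now - frames_prev) / 10.0)  # ~2457.6 vs logged "10 sec: 2457.5"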
|
[2025-02-14 07:48:45,001][13624] Updated weights for policy 0, policy_version 988 (0.0027) |
|
[2025-02-14 07:48:45,473][00436] Fps is (10 sec: 4096.0, 60 sec: 2048.0, 300 sec: 2048.0). Total num frames: 4046848. Throughput: 0: 561.5. Samples: 11230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:48:45,480][00436] Avg episode reward: [(0, '14.393')] |
|
[2025-02-14 07:48:50,473][00436] Fps is (10 sec: 4096.1, 60 sec: 2621.4, 300 sec: 2621.4). Total num frames: 4071424. Throughput: 0: 590.2. Samples: 14756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:48:50,478][00436] Avg episode reward: [(0, '16.854')] |
|
[2025-02-14 07:48:54,104][13624] Updated weights for policy 0, policy_version 998 (0.0029) |
|
[2025-02-14 07:48:55,474][00436] Fps is (10 sec: 4505.4, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 4091904. Throughput: 0: 706.3. Samples: 21188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:48:55,478][00436] Avg episode reward: [(0, '17.677')] |
|
[2025-02-14 07:49:00,473][00436] Fps is (10 sec: 3276.8, 60 sec: 2808.7, 300 sec: 2808.7). Total num frames: 4104192. Throughput: 0: 737.3. Samples: 25804. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:49:00,478][00436] Avg episode reward: [(0, '19.975')] |
|
[2025-02-14 07:49:05,091][13624] Updated weights for policy 0, policy_version 1008 (0.0021) |
|
[2025-02-14 07:49:05,473][00436] Fps is (10 sec: 3686.6, 60 sec: 3072.0, 300 sec: 3072.0). Total num frames: 4128768. Throughput: 0: 731.0. Samples: 29238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:49:05,479][00436] Avg episode reward: [(0, '22.568')] |
|
[2025-02-14 07:49:10,473][00436] Fps is (10 sec: 4505.6, 60 sec: 3185.8, 300 sec: 3185.8). Total num frames: 4149248. Throughput: 0: 804.5. Samples: 36204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:49:10,477][00436] Avg episode reward: [(0, '25.940')] |
|
[2025-02-14 07:49:15,473][00436] Fps is (10 sec: 3686.4, 60 sec: 3194.9, 300 sec: 3194.9). Total num frames: 4165632. Throughput: 0: 907.2. Samples: 40886. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:49:15,480][00436] Avg episode reward: [(0, '26.159')] |
|
[2025-02-14 07:49:16,080][13624] Updated weights for policy 0, policy_version 1018 (0.0026) |
|
[2025-02-14 07:49:20,473][00436] Fps is (10 sec: 4096.0, 60 sec: 3351.3, 300 sec: 3351.3). Total num frames: 4190208. Throughput: 0: 946.0. Samples: 44364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:49:20,482][00436] Avg episode reward: [(0, '28.399')] |
|
[2025-02-14 07:49:20,489][13607] Saving new best policy, reward=28.399! |
|
[2025-02-14 07:49:24,782][13624] Updated weights for policy 0, policy_version 1028 (0.0015) |
|
[2025-02-14 07:49:25,476][00436] Fps is (10 sec: 4504.4, 60 sec: 3413.2, 300 sec: 3413.2). Total num frames: 4210688. Throughput: 0: 993.7. Samples: 51312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:49:25,484][00436] Avg episode reward: [(0, '29.158')] |
|
[2025-02-14 07:49:25,486][13607] Saving new best policy, reward=29.158! |
|
[2025-02-14 07:49:30,473][00436] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3339.8). Total num frames: 4222976. Throughput: 0: 987.0. Samples: 55646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:49:30,475][00436] Avg episode reward: [(0, '27.438')] |
|
[2025-02-14 07:49:35,473][00436] Fps is (10 sec: 3687.4, 60 sec: 4027.7, 300 sec: 3452.3). Total num frames: 4247552. Throughput: 0: 983.4. Samples: 59008. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:49:35,480][00436] Avg episode reward: [(0, '26.877')] |
|
[2025-02-14 07:49:35,970][13624] Updated weights for policy 0, policy_version 1038 (0.0012) |
|
[2025-02-14 07:49:40,474][00436] Fps is (10 sec: 4505.4, 60 sec: 3959.5, 300 sec: 3495.2). Total num frames: 4268032. Throughput: 0: 993.6. Samples: 65898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:49:40,476][00436] Avg episode reward: [(0, '25.331')] |
|
[2025-02-14 07:49:45,473][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3481.6). Total num frames: 4284416. Throughput: 0: 993.7. Samples: 70522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:49:45,476][00436] Avg episode reward: [(0, '25.188')] |
|
[2025-02-14 07:49:46,752][13624] Updated weights for policy 0, policy_version 1048 (0.0015) |
|
[2025-02-14 07:49:50,473][00436] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3565.9). Total num frames: 4308992. Throughput: 0: 996.2. Samples: 74068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:49:50,478][00436] Avg episode reward: [(0, '24.083')] |
|
[2025-02-14 07:49:55,476][00436] Fps is (10 sec: 4504.5, 60 sec: 3959.3, 300 sec: 3595.3). Total num frames: 4329472. Throughput: 0: 998.6. Samples: 81144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:49:55,480][00436] Avg episode reward: [(0, '25.341')] |
|
[2025-02-14 07:49:56,023][13624] Updated weights for policy 0, policy_version 1058 (0.0014) |
|
[2025-02-14 07:50:00,477][00436] Fps is (10 sec: 3685.1, 60 sec: 4027.5, 300 sec: 3578.5). Total num frames: 4345856. Throughput: 0: 992.5. Samples: 85552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:50:00,486][00436] Avg episode reward: [(0, '25.838')] |
|
[2025-02-14 07:50:00,498][13607] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001061_4345856.pth... |
|
[2025-02-14 07:50:00,627][13607] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000894_3661824.pth |
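This Saving/Removing pair is the rotation implied by save_every_sec=120 and keep_checkpoints=2: roughly every two minutes a new checkpoint is written and the oldest beyond the newest two is deleted. A minimal sketch of that policy (not Sample Factory's actual implementation):

import glob, os

def rotate_checkpoints(ckpt_dir: str, keep: int = 2) -> None:
    # Names sort chronologically because they embed zero-padded step counts.
    ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
    for old in ckpts[:-keep]:
        os.remove(old)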
|
[2025-02-14 07:50:05,473][00436] Fps is (10 sec: 3687.3, 60 sec: 3959.5, 300 sec: 3604.5). Total num frames: 4366336. Throughput: 0: 989.3. Samples: 88882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:50:05,480][00436] Avg episode reward: [(0, '25.862')] |
|
[2025-02-14 07:50:06,789][13624] Updated weights for policy 0, policy_version 1068 (0.0016) |
|
[2025-02-14 07:50:10,475][00436] Fps is (10 sec: 4506.4, 60 sec: 4027.6, 300 sec: 3666.8). Total num frames: 4390912. Throughput: 0: 987.1. Samples: 95732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:10,478][00436] Avg episode reward: [(0, '27.260')] |
|
[2025-02-14 07:50:15,473][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3611.9). Total num frames: 4403200. Throughput: 0: 994.1. Samples: 100382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:50:15,480][00436] Avg episode reward: [(0, '26.786')] |
|
[2025-02-14 07:50:17,519][13624] Updated weights for policy 0, policy_version 1078 (0.0023) |
|
[2025-02-14 07:50:20,473][00436] Fps is (10 sec: 3687.0, 60 sec: 3959.5, 300 sec: 3668.6). Total num frames: 4427776. Throughput: 0: 997.2. Samples: 103884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:20,478][00436] Avg episode reward: [(0, '26.185')] |
|
[2025-02-14 07:50:25,473][00436] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3686.4). Total num frames: 4448256. Throughput: 0: 996.1. Samples: 110720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:25,479][00436] Avg episode reward: [(0, '25.814')] |
|
[2025-02-14 07:50:27,557][13624] Updated weights for policy 0, policy_version 1088 (0.0013) |
|
[2025-02-14 07:50:30,473][00436] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3637.2). Total num frames: 4460544. Throughput: 0: 991.2. Samples: 115124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:50:30,476][00436] Avg episode reward: [(0, '25.918')] |
|
[2025-02-14 07:50:35,473][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3686.4). Total num frames: 4485120. Throughput: 0: 985.7. Samples: 118424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:50:35,476][00436] Avg episode reward: [(0, '24.770')] |
|
[2025-02-14 07:50:37,525][13624] Updated weights for policy 0, policy_version 1098 (0.0015) |
|
[2025-02-14 07:50:40,473][00436] Fps is (10 sec: 4915.2, 60 sec: 4027.8, 300 sec: 3731.9). Total num frames: 4509696. Throughput: 0: 986.2. Samples: 125520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:40,476][00436] Avg episode reward: [(0, '24.364')] |
|
[2025-02-14 07:50:45,474][00436] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3686.4). Total num frames: 4521984. Throughput: 0: 997.4. Samples: 130434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:45,482][00436] Avg episode reward: [(0, '24.075')] |
|
[2025-02-14 07:50:48,142][13624] Updated weights for policy 0, policy_version 1108 (0.0026) |
|
[2025-02-14 07:50:50,473][00436] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3728.8). Total num frames: 4546560. Throughput: 0: 999.6. Samples: 133866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:50:50,475][00436] Avg episode reward: [(0, '23.896')] |
|
[2025-02-14 07:50:55,473][00436] Fps is (10 sec: 4915.4, 60 sec: 4027.9, 300 sec: 3768.3). Total num frames: 4571136. Throughput: 0: 1006.4. Samples: 141018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:50:55,478][00436] Avg episode reward: [(0, '24.519')] |
|
[2025-02-14 07:50:57,741][13624] Updated weights for policy 0, policy_version 1118 (0.0015) |
|
[2025-02-14 07:51:00,475][00436] Fps is (10 sec: 3685.9, 60 sec: 3959.6, 300 sec: 3726.0). Total num frames: 4583424. Throughput: 0: 1009.6. Samples: 145816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:51:00,482][00436] Avg episode reward: [(0, '25.762')] |
|
[2025-02-14 07:51:05,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3763.2). Total num frames: 4608000. Throughput: 0: 1009.6. Samples: 149316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-02-14 07:51:05,476][00436] Avg episode reward: [(0, '26.854')] |
|
[2025-02-14 07:51:07,433][13624] Updated weights for policy 0, policy_version 1128 (0.0021) |
|
[2025-02-14 07:51:10,473][00436] Fps is (10 sec: 4915.9, 60 sec: 4027.8, 300 sec: 3798.1). Total num frames: 4632576. Throughput: 0: 1016.5. Samples: 156464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:51:10,475][00436] Avg episode reward: [(0, '26.332')] |
|
[2025-02-14 07:51:15,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3758.7). Total num frames: 4644864. Throughput: 0: 1027.6. Samples: 161368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:51:15,476][00436] Avg episode reward: [(0, '25.744')] |
|
[2025-02-14 07:51:18,084][13624] Updated weights for policy 0, policy_version 1138 (0.0020) |
|
[2025-02-14 07:51:20,474][00436] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 3791.7). Total num frames: 4669440. Throughput: 0: 1031.1. Samples: 164824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:51:20,478][00436] Avg episode reward: [(0, '26.056')] |
|
[2025-02-14 07:51:25,473][00436] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3822.9). Total num frames: 4694016. Throughput: 0: 1032.4. Samples: 171980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:51:25,475][00436] Avg episode reward: [(0, '24.355')] |
|
[2025-02-14 07:51:27,589][13624] Updated weights for policy 0, policy_version 1148 (0.0018) |
|
[2025-02-14 07:51:30,477][00436] Fps is (10 sec: 3685.2, 60 sec: 4095.8, 300 sec: 3786.0). Total num frames: 4706304. Throughput: 0: 1028.2. Samples: 176708. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:51:30,482][00436] Avg episode reward: [(0, '24.539')] |
|
[2025-02-14 07:51:35,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3815.7). Total num frames: 4730880. Throughput: 0: 1027.6. Samples: 180106. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:51:35,476][00436] Avg episode reward: [(0, '24.743')] |
|
[2025-02-14 07:51:37,471][13624] Updated weights for policy 0, policy_version 1158 (0.0018) |
|
[2025-02-14 07:51:40,474][00436] Fps is (10 sec: 4916.7, 60 sec: 4096.0, 300 sec: 3843.9). Total num frames: 4755456. Throughput: 0: 1028.2. Samples: 187286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:51:40,477][00436] Avg episode reward: [(0, '26.191')] |
|
[2025-02-14 07:51:45,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3809.3). Total num frames: 4767744. Throughput: 0: 1028.9. Samples: 192116. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:51:45,476][00436] Avg episode reward: [(0, '26.387')] |
|
[2025-02-14 07:51:48,010][13624] Updated weights for policy 0, policy_version 1168 (0.0017) |
|
[2025-02-14 07:51:50,473][00436] Fps is (10 sec: 3686.6, 60 sec: 4096.0, 300 sec: 3836.3). Total num frames: 4792320. Throughput: 0: 1029.6. Samples: 195648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:51:50,476][00436] Avg episode reward: [(0, '26.799')] |
|
[2025-02-14 07:51:55,473][00436] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3861.9). Total num frames: 4816896. Throughput: 0: 1028.6. Samples: 202750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-02-14 07:51:55,477][00436] Avg episode reward: [(0, '26.529')] |
|
[2025-02-14 07:51:57,355][13624] Updated weights for policy 0, policy_version 1178 (0.0024) |
|
[2025-02-14 07:52:00,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.1, 300 sec: 3829.3). Total num frames: 4829184. Throughput: 0: 1025.3. Samples: 207506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:52:00,478][00436] Avg episode reward: [(0, '27.132')] |
|
[2025-02-14 07:52:00,518][13607] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001180_4833280.pth... |
|
[2025-02-14 07:52:00,666][13607] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth |
|
[2025-02-14 07:52:05,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3854.0). Total num frames: 4853760. Throughput: 0: 1021.8. Samples: 210804. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-02-14 07:52:05,476][00436] Avg episode reward: [(0, '27.476')] |
|
[2025-02-14 07:52:07,449][13624] Updated weights for policy 0, policy_version 1188 (0.0027) |
|
[2025-02-14 07:52:10,473][00436] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3877.5). Total num frames: 4878336. Throughput: 0: 1019.9. Samples: 217874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:52:10,476][00436] Avg episode reward: [(0, '27.026')] |
|
[2025-02-14 07:52:15,473][00436] Fps is (10 sec: 4096.0, 60 sec: 4164.3, 300 sec: 3864.5). Total num frames: 4894720. Throughput: 0: 1020.7. Samples: 222634. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:52:15,477][00436] Avg episode reward: [(0, '26.811')] |
|
[2025-02-14 07:52:18,001][13624] Updated weights for policy 0, policy_version 1198 (0.0015) |
|
[2025-02-14 07:52:20,473][00436] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3869.4). Total num frames: 4915200. Throughput: 0: 1023.9. Samples: 226182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-02-14 07:52:20,483][00436] Avg episode reward: [(0, '28.712')] |
|
[2025-02-14 07:52:25,475][00436] Fps is (10 sec: 4504.7, 60 sec: 4095.9, 300 sec: 3891.2). Total num frames: 4939776. Throughput: 0: 1023.5. Samples: 233344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:52:25,482][00436] Avg episode reward: [(0, '29.438')] |
|
[2025-02-14 07:52:25,487][13607] Saving new best policy, reward=29.438! |
|
[2025-02-14 07:52:27,425][13624] Updated weights for policy 0, policy_version 1208 (0.0013) |
|
[2025-02-14 07:52:30,473][00436] Fps is (10 sec: 4096.0, 60 sec: 4164.5, 300 sec: 3878.7). Total num frames: 4956160. Throughput: 0: 1018.7. Samples: 237956. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-02-14 07:52:30,479][00436] Avg episode reward: [(0, '29.554')] |
|
[2025-02-14 07:52:30,492][13607] Saving new best policy, reward=29.554! |
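The "Saving new best policy" lines follow from save_best_metric=reward with save_best_every_sec=5 and save_best_after=100000: once past 100k env steps, whenever the averaged reward exceeds the best seen so far, a separate best-policy checkpoint is written. A toy sketch of the condition only:

best_reward = float("-inf")

def maybe_save_best(avg_reward: float, save_fn) -> None:
    # save_fn is a placeholder for writing a best_*.pth checkpoint.
    global best_reward
    if avg_reward > best_reward:
        best_reward = avg_reward
        save_fn()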
|
[2025-02-14 07:52:35,473][00436] Fps is (10 sec: 3687.1, 60 sec: 4096.0, 300 sec: 3883.0). Total num frames: 4976640. Throughput: 0: 1014.8. Samples: 241316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-02-14 07:52:35,480][00436] Avg episode reward: [(0, '29.219')] |
|
[2025-02-14 07:52:37,506][13624] Updated weights for policy 0, policy_version 1218 (0.0022) |
|
[2025-02-14 07:52:40,473][00436] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3903.2). Total num frames: 5001216. Throughput: 0: 1016.5. Samples: 248492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-02-14 07:52:40,476][00436] Avg episode reward: [(0, '28.054')] |
|
[2025-02-14 07:52:41,361][13607] Stopping Batcher_0... |
|
[2025-02-14 07:52:41,365][00436] Component Batcher_0 stopped! |
|
[2025-02-14 07:52:41,367][13607] Loop batcher_evt_loop terminating... |
|
[2025-02-14 07:52:41,373][13607] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-02-14 07:52:41,467][13624] Weights refcount: 2 0 |
|
[2025-02-14 07:52:41,479][00436] Component InferenceWorker_p0-w0 stopped! |
|
[2025-02-14 07:52:41,483][13624] Stopping InferenceWorker_p0-w0... |
|
[2025-02-14 07:52:41,483][13624] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-02-14 07:52:41,548][13607] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001061_4345856.pth |
|
[2025-02-14 07:52:41,575][13607] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-02-14 07:52:41,827][00436] Component LearnerWorker_p0 stopped! |
|
[2025-02-14 07:52:41,833][13607] Stopping LearnerWorker_p0... |
|
[2025-02-14 07:52:41,833][13607] Loop learner_proc0_evt_loop terminating... |
|
[2025-02-14 07:52:42,072][00436] Component RolloutWorker_w5 stopped! |
|
[2025-02-14 07:52:42,078][13630] Stopping RolloutWorker_w5... |
|
[2025-02-14 07:52:42,083][00436] Component RolloutWorker_w7 stopped! |
|
[2025-02-14 07:52:42,087][13632] Stopping RolloutWorker_w7... |
|
[2025-02-14 07:52:42,094][00436] Component RolloutWorker_w3 stopped! |
|
[2025-02-14 07:52:42,098][13628] Stopping RolloutWorker_w3... |
|
[2025-02-14 07:52:42,099][13628] Loop rollout_proc3_evt_loop terminating... |
|
[2025-02-14 07:52:42,100][13630] Loop rollout_proc5_evt_loop terminating... |
|
[2025-02-14 07:52:42,112][00436] Component RolloutWorker_w1 stopped! |
|
[2025-02-14 07:52:42,115][13626] Stopping RolloutWorker_w1... |
|
[2025-02-14 07:52:42,116][13626] Loop rollout_proc1_evt_loop terminating... |
|
[2025-02-14 07:52:42,107][13632] Loop rollout_proc7_evt_loop terminating... |
|
[2025-02-14 07:52:42,270][00436] Component RolloutWorker_w0 stopped! |
|
[2025-02-14 07:52:42,270][13625] Stopping RolloutWorker_w0... |
|
[2025-02-14 07:52:42,276][13625] Loop rollout_proc0_evt_loop terminating... |
|
[2025-02-14 07:52:42,326][00436] Component RolloutWorker_w2 stopped! |
|
[2025-02-14 07:52:42,334][13627] Stopping RolloutWorker_w2... |
|
[2025-02-14 07:52:42,335][13627] Loop rollout_proc2_evt_loop terminating... |
|
[2025-02-14 07:52:42,437][13631] Stopping RolloutWorker_w6... |
|
[2025-02-14 07:52:42,437][00436] Component RolloutWorker_w6 stopped! |
|
[2025-02-14 07:52:42,443][13629] Stopping RolloutWorker_w4... |
|
[2025-02-14 07:52:42,443][13629] Loop rollout_proc4_evt_loop terminating... |
|
[2025-02-14 07:52:42,443][00436] Component RolloutWorker_w4 stopped! |
|
[2025-02-14 07:52:42,446][00436] Waiting for process learner_proc0 to stop... |
|
[2025-02-14 07:52:42,449][13631] Loop rollout_proc6_evt_loop terminating... |
|
[2025-02-14 07:52:44,115][00436] Waiting for process inference_proc0-0 to join... |
|
[2025-02-14 07:52:44,174][00436] Waiting for process rollout_proc0 to join... |
|
[2025-02-14 07:52:46,466][00436] Waiting for process rollout_proc1 to join... |
|
[2025-02-14 07:52:46,506][00436] Waiting for process rollout_proc2 to join... |
|
[2025-02-14 07:52:46,509][00436] Waiting for process rollout_proc3 to join... |
|
[2025-02-14 07:52:46,511][00436] Waiting for process rollout_proc4 to join... |
|
[2025-02-14 07:52:46,515][00436] Waiting for process rollout_proc5 to join... |
|
[2025-02-14 07:52:46,516][00436] Waiting for process rollout_proc6 to join... |
|
[2025-02-14 07:52:46,518][00436] Waiting for process rollout_proc7 to join... |
|
[2025-02-14 07:52:46,522][00436] Batcher 0 profile tree view: |
|
batching: 6.1311, releasing_batches: 0.0063 |
|
[2025-02-14 07:52:46,523][00436] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000 |
|
wait_policy_total: 103.2326 |
|
update_model: 2.0046 |
|
weight_update: 0.0022 |
|
one_step: 0.0104 |
|
handle_policy_step: 144.4081 |
|
deserialize: 3.4548, stack: 0.7423, obs_to_device_normalize: 30.3564, forward: 74.6529, send_messages: 7.1637 |
|
prepare_outputs: 22.0848 |
|
to_cpu: 13.6550 |
|
[2025-02-14 07:52:46,524][00436] Learner 0 profile tree view: |
|
misc: 0.0009, prepare_batch: 4.2370 |
|
train: 20.0195 |
|
epoch_init: 0.0012, minibatch_init: 0.0014, losses_postprocess: 0.1660, kl_divergence: 0.1880, after_optimizer: 0.8447 |
|
calculate_losses: 6.5899 |
|
losses_init: 0.0008, forward_head: 0.6238, bptt_initial: 4.1340, tail: 0.3297, advantages_returns: 0.0705, losses: 0.8757 |
|
bptt: 0.4848 |
|
bptt_forward_core: 0.4517 |
|
update: 12.0735 |
|
clip: 0.2249 |
|
[2025-02-14 07:52:46,526][00436] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.0736, enqueue_policy_requests: 22.8663, env_step: 200.0027, overhead: 2.9398, complete_rollouts: 2.0251 |
|
save_policy_outputs: 4.4570 |
|
split_output_tensors: 1.7724 |
|
[2025-02-14 07:52:46,527][00436] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.0737, enqueue_policy_requests: 24.7789, env_step: 198.5096, overhead: 2.8235, complete_rollouts: 1.8543 |
|
save_policy_outputs: 4.2478 |
|
split_output_tensors: 1.7220 |
|
[2025-02-14 07:52:46,528][00436] Loop Runner_EvtLoop terminating... |
|
[2025-02-14 07:52:46,530][00436] Runner profile tree view: |
|
main_loop: 282.7091 |
|
[2025-02-14 07:52:46,531][00436] Collected {0: 5005312}, FPS: 3535.2 |
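The closing numbers tie out: this session collected 5005312 - 4005888 = 999424 new env frames in a 282.7091 s main loop, which is the reported 3535.2 FPS. The profiles above show where that time went: the rollout workers spend ~200 s of it in env_step, and the inference worker's 144.4 s of handle_policy_step is dominated by forward (74.65 s) and obs_to_device_normalize (30.36 s).

frames = 5005312 - 4005888
print(frames / 282.7091)  # ~3535.2, matching "FPS: 3535.2"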
|
[2025-02-14 07:54:33,274][00436] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-02-14 07:54:33,276][00436] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-02-14 07:54:33,278][00436] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-02-14 07:54:33,279][00436] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-02-14 07:54:33,281][00436] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:54:33,283][00436] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-02-14 07:54:33,284][00436] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:54:33,285][00436] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-02-14 07:54:33,288][00436] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-02-14 07:54:33,290][00436] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-02-14 07:54:33,291][00436] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-02-14 07:54:33,292][00436] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-02-14 07:54:33,293][00436] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-02-14 07:54:33,296][00436] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-02-14 07:54:33,298][00436] Using frameskip 1 and render_action_repeat=4 for evaluation |
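The argument overrides above correspond to an evaluation ("enjoy") run: load the latest checkpoint, play 10 episodes without rendering, and save a replay video. A hedged reconstruction of the command, assuming the module path sf_examples.vizdoom.enjoy_vizdoom (the flags mirror the logged overrides):

import subprocess

# Sketch only: module path is an assumption; flags come from the log above.
subprocess.run([
    "python", "-m", "sf_examples.vizdoom.enjoy_vizdoom",
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_episodes=10",
])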
|
[2025-02-14 07:54:33,335][00436] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:54:33,337][00436] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:54:33,354][00436] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:54:33,391][00436] Conv encoder output size: 512 |
|
[2025-02-14 07:54:33,392][00436] Policy head output size: 512 |
|
[2025-02-14 07:54:33,413][00436] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-02-14 07:54:33,841][00436] Num frames 100... |
|
[2025-02-14 07:54:33,977][00436] Num frames 200... |
|
[2025-02-14 07:54:34,108][00436] Num frames 300... |
|
[2025-02-14 07:54:34,257][00436] Num frames 400... |
|
[2025-02-14 07:54:34,401][00436] Num frames 500... |
|
[2025-02-14 07:54:34,559][00436] Avg episode rewards: #0: 9.760, true rewards: #0: 5.760 |
|
[2025-02-14 07:54:34,561][00436] Avg episode reward: 9.760, avg true_objective: 5.760 |
|
[2025-02-14 07:54:34,597][00436] Num frames 600... |
|
[2025-02-14 07:54:34,738][00436] Num frames 700... |
|
[2025-02-14 07:54:34,873][00436] Num frames 800... |
|
[2025-02-14 07:54:35,004][00436] Num frames 900... |
|
[2025-02-14 07:54:35,137][00436] Num frames 1000... |
|
[2025-02-14 07:54:35,277][00436] Num frames 1100... |
|
[2025-02-14 07:54:35,409][00436] Num frames 1200... |
|
[2025-02-14 07:54:35,548][00436] Num frames 1300... |
|
[2025-02-14 07:54:35,687][00436] Num frames 1400... |
|
[2025-02-14 07:54:35,825][00436] Num frames 1500... |
|
[2025-02-14 07:54:35,957][00436] Num frames 1600... |
|
[2025-02-14 07:54:36,097][00436] Num frames 1700... |
|
[2025-02-14 07:54:36,191][00436] Avg episode rewards: #0: 19.640, true rewards: #0: 8.640 |
|
[2025-02-14 07:54:36,193][00436] Avg episode reward: 19.640, avg true_objective: 8.640 |
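"Avg episode rewards" is a running mean over the episodes finished so far, so individual episode scores can be recovered from consecutive averages via x_n = n * avg_n - (n - 1) * avg_{n-1}; here episode 2 alone scored about 29.52 (true objective 11.52):

avg1, avg2 = 9.760, 19.640
print(2 * avg2 - 1 * avg1)  # 29.52 -> episode 2's own reward
print(2 * 8.640 - 5.760)    # 11.52 -> episode 2's own true objective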
|
[2025-02-14 07:54:36,299][00436] Num frames 1800... |
|
[2025-02-14 07:54:36,435][00436] Num frames 1900... |
|
[2025-02-14 07:54:36,565][00436] Num frames 2000... |
|
[2025-02-14 07:54:36,721][00436] Num frames 2100... |
|
[2025-02-14 07:54:36,915][00436] Num frames 2200... |
|
[2025-02-14 07:54:37,096][00436] Num frames 2300... |
|
[2025-02-14 07:54:37,281][00436] Num frames 2400... |
|
[2025-02-14 07:54:37,452][00436] Num frames 2500... |
|
[2025-02-14 07:54:37,630][00436] Num frames 2600... |
|
[2025-02-14 07:54:37,760][00436] Avg episode rewards: #0: 20.807, true rewards: #0: 8.807 |
|
[2025-02-14 07:54:37,762][00436] Avg episode reward: 20.807, avg true_objective: 8.807 |
|
[2025-02-14 07:54:37,870][00436] Num frames 2700... |
|
[2025-02-14 07:54:38,039][00436] Num frames 2800... |
|
[2025-02-14 07:54:38,232][00436] Num frames 2900... |
|
[2025-02-14 07:54:38,424][00436] Num frames 3000... |
|
[2025-02-14 07:54:38,609][00436] Num frames 3100... |
|
[2025-02-14 07:54:38,793][00436] Num frames 3200... |
|
[2025-02-14 07:54:38,986][00436] Num frames 3300... |
|
[2025-02-14 07:54:39,130][00436] Num frames 3400... |
|
[2025-02-14 07:54:39,268][00436] Num frames 3500... |
|
[2025-02-14 07:54:39,403][00436] Num frames 3600... |
|
[2025-02-14 07:54:39,531][00436] Num frames 3700... |
|
[2025-02-14 07:54:39,665][00436] Num frames 3800... |
|
[2025-02-14 07:54:39,766][00436] Avg episode rewards: #0: 22.330, true rewards: #0: 9.580 |
|
[2025-02-14 07:54:39,767][00436] Avg episode reward: 22.330, avg true_objective: 9.580 |
|
[2025-02-14 07:54:39,864][00436] Num frames 3900... |
|
[2025-02-14 07:54:39,993][00436] Num frames 4000... |
|
[2025-02-14 07:54:40,125][00436] Num frames 4100... |
|
[2025-02-14 07:54:40,264][00436] Num frames 4200... |
|
[2025-02-14 07:54:40,400][00436] Num frames 4300... |
|
[2025-02-14 07:54:40,530][00436] Num frames 4400... |
|
[2025-02-14 07:54:40,598][00436] Avg episode rewards: #0: 19.816, true rewards: #0: 8.816 |
|
[2025-02-14 07:54:40,599][00436] Avg episode reward: 19.816, avg true_objective: 8.816 |
|
[2025-02-14 07:54:40,727][00436] Num frames 4500... |
|
[2025-02-14 07:54:40,865][00436] Num frames 4600... |
|
[2025-02-14 07:54:41,002][00436] Num frames 4700... |
|
[2025-02-14 07:54:41,131][00436] Num frames 4800... |
|
[2025-02-14 07:54:41,269][00436] Num frames 4900... |
|
[2025-02-14 07:54:41,401][00436] Num frames 5000... |
|
[2025-02-14 07:54:41,530][00436] Num frames 5100... |
|
[2025-02-14 07:54:41,661][00436] Num frames 5200... |
|
[2025-02-14 07:54:41,793][00436] Num frames 5300... |
|
[2025-02-14 07:54:41,854][00436] Avg episode rewards: #0: 19.840, true rewards: #0: 8.840 |
|
[2025-02-14 07:54:41,856][00436] Avg episode reward: 19.840, avg true_objective: 8.840 |
|
[2025-02-14 07:54:41,991][00436] Num frames 5400... |
|
[2025-02-14 07:54:42,124][00436] Num frames 5500... |
|
[2025-02-14 07:54:42,265][00436] Num frames 5600... |
|
[2025-02-14 07:54:42,400][00436] Num frames 5700... |
|
[2025-02-14 07:54:42,535][00436] Num frames 5800... |
|
[2025-02-14 07:54:42,667][00436] Num frames 5900... |
|
[2025-02-14 07:54:42,797][00436] Num frames 6000... |
|
[2025-02-14 07:54:42,932][00436] Num frames 6100... |
|
[2025-02-14 07:54:43,073][00436] Num frames 6200... |
|
[2025-02-14 07:54:43,212][00436] Num frames 6300... |
|
[2025-02-14 07:54:43,343][00436] Num frames 6400... |
|
[2025-02-14 07:54:43,431][00436] Avg episode rewards: #0: 21.034, true rewards: #0: 9.177 |
|
[2025-02-14 07:54:43,432][00436] Avg episode reward: 21.034, avg true_objective: 9.177 |
|
[2025-02-14 07:54:43,531][00436] Num frames 6500... |
|
[2025-02-14 07:54:43,660][00436] Num frames 6600... |
|
[2025-02-14 07:54:43,788][00436] Num frames 6700... |
|
[2025-02-14 07:54:43,920][00436] Num frames 6800... |
|
[2025-02-14 07:54:44,059][00436] Num frames 6900... |
|
[2025-02-14 07:54:44,198][00436] Num frames 7000... |
|
[2025-02-14 07:54:44,329][00436] Num frames 7100... |
|
[2025-02-14 07:54:44,459][00436] Num frames 7200... |
|
[2025-02-14 07:54:44,589][00436] Num frames 7300... |
|
[2025-02-14 07:54:44,718][00436] Num frames 7400... |
|
[2025-02-14 07:54:44,848][00436] Num frames 7500... |
|
[2025-02-14 07:54:44,983][00436] Num frames 7600... |
|
[2025-02-14 07:54:45,136][00436] Avg episode rewards: #0: 21.340, true rewards: #0: 9.590 |
|
[2025-02-14 07:54:45,137][00436] Avg episode reward: 21.340, avg true_objective: 9.590 |
|
[2025-02-14 07:54:45,180][00436] Num frames 7700... |
|
[2025-02-14 07:54:45,316][00436] Num frames 7800... |
|
[2025-02-14 07:54:45,456][00436] Num frames 7900... |
|
[2025-02-14 07:54:45,593][00436] Num frames 8000... |
|
[2025-02-14 07:54:45,728][00436] Num frames 8100... |
|
[2025-02-14 07:54:45,860][00436] Num frames 8200... |
|
[2025-02-14 07:54:45,992][00436] Num frames 8300... |
|
[2025-02-14 07:54:46,130][00436] Num frames 8400... |
|
[2025-02-14 07:54:46,270][00436] Num frames 8500... |
|
[2025-02-14 07:54:46,404][00436] Num frames 8600... |
|
[2025-02-14 07:54:46,536][00436] Num frames 8700... |
|
[2025-02-14 07:54:46,672][00436] Avg episode rewards: #0: 21.622, true rewards: #0: 9.733 |
|
[2025-02-14 07:54:46,674][00436] Avg episode reward: 21.622, avg true_objective: 9.733 |
|
[2025-02-14 07:54:46,731][00436] Num frames 8800... |
|
[2025-02-14 07:54:46,865][00436] Num frames 8900... |
|
[2025-02-14 07:54:47,000][00436] Num frames 9000... |
|
[2025-02-14 07:54:47,140][00436] Num frames 9100... |
|
[2025-02-14 07:54:47,281][00436] Num frames 9200... |
|
[2025-02-14 07:54:47,421][00436] Num frames 9300... |
|
[2025-02-14 07:54:47,555][00436] Num frames 9400... |
|
[2025-02-14 07:54:47,686][00436] Num frames 9500... |
|
[2025-02-14 07:54:47,823][00436] Num frames 9600... |
|
[2025-02-14 07:54:47,959][00436] Num frames 9700... |
|
[2025-02-14 07:54:48,100][00436] Num frames 9800... |
|
[2025-02-14 07:54:48,244][00436] Num frames 9900... |
|
[2025-02-14 07:54:48,387][00436] Num frames 10000... |
|
[2025-02-14 07:54:48,527][00436] Num frames 10100... |
|
[2025-02-14 07:54:48,591][00436] Avg episode rewards: #0: 22.404, true rewards: #0: 10.104 |
|
[2025-02-14 07:54:48,594][00436] Avg episode reward: 22.404, avg true_objective: 10.104 |
|
[2025-02-14 07:55:49,275][00436] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
[2025-02-14 07:56:08,753][00436] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-02-14 07:56:08,755][00436] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-02-14 07:56:08,757][00436] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-02-14 07:56:08,758][00436] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-02-14 07:56:08,761][00436] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-02-14 07:56:08,763][00436] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-02-14 07:56:08,764][00436] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-02-14 07:56:08,768][00436] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-02-14 07:56:08,769][00436] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-02-14 07:56:08,770][00436] Adding new argument 'hf_repository'='gyaan/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-02-14 07:56:08,771][00436] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-02-14 07:56:08,775][00436] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-02-14 07:56:08,777][00436] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-02-14 07:56:08,778][00436] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-02-14 07:56:08,779][00436] Using frameskip 1 and render_action_repeat=4 for evaluation |
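This second evaluation pass repeats the run above but with push_to_hub=True and hf_repository set, so after the 10 episodes and the replay video the experiment artifacts are uploaded to the Hub repository gyaan/rl_course_vizdoom_health_gathering_supreme. A hedged reconstruction, with the same assumed module path as before and flags taken from the logged overrides:

import subprocess

# Sketch only: module path is an assumption; flags come from the log above.
subprocess.run([
    "python", "-m", "sf_examples.vizdoom.enjoy_vizdoom",
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_frames=100000",
    "--max_num_episodes=10",
    "--push_to_hub",
    "--hf_repository=gyaan/rl_course_vizdoom_health_gathering_supreme",
])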
|
[2025-02-14 07:56:08,813][00436] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-02-14 07:56:08,815][00436] RunningMeanStd input shape: (1,) |
|
[2025-02-14 07:56:08,828][00436] ConvEncoder: input_channels=3 |
|
[2025-02-14 07:56:08,864][00436] Conv encoder output size: 512 |
|
[2025-02-14 07:56:08,865][00436] Policy head output size: 512 |
|
[2025-02-14 07:56:08,883][00436] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-02-14 07:56:09,360][00436] Num frames 100... |
|
[2025-02-14 07:56:09,497][00436] Num frames 200... |
|
[2025-02-14 07:56:09,624][00436] Num frames 300... |
|
[2025-02-14 07:56:09,761][00436] Num frames 400... |
|
[2025-02-14 07:56:09,899][00436] Num frames 500... |
|
[2025-02-14 07:56:10,028][00436] Num frames 600... |
|
[2025-02-14 07:56:10,169][00436] Num frames 700... |
|
[2025-02-14 07:56:10,310][00436] Num frames 800... |
|
[2025-02-14 07:56:10,441][00436] Num frames 900... |
|
[2025-02-14 07:56:10,571][00436] Num frames 1000... |
|
[2025-02-14 07:56:10,705][00436] Num frames 1100... |
|
[2025-02-14 07:56:10,854][00436] Num frames 1200... |
|
[2025-02-14 07:56:10,986][00436] Num frames 1300... |
|
[2025-02-14 07:56:11,114][00436] Num frames 1400... |
|
[2025-02-14 07:56:11,249][00436] Num frames 1500... |
|
[2025-02-14 07:56:11,386][00436] Num frames 1600... |
|
[2025-02-14 07:56:11,439][00436] Avg episode rewards: #0: 45.000, true rewards: #0: 16.000 |
|
[2025-02-14 07:56:11,440][00436] Avg episode reward: 45.000, avg true_objective: 16.000 |
|
[2025-02-14 07:56:11,570][00436] Num frames 1700... |
|
[2025-02-14 07:56:11,699][00436] Num frames 1800... |
|
[2025-02-14 07:56:11,827][00436] Num frames 1900... |
|
[2025-02-14 07:56:11,973][00436] Num frames 2000... |
|
[2025-02-14 07:56:12,051][00436] Avg episode rewards: #0: 25.580, true rewards: #0: 10.080 |
|
[2025-02-14 07:56:12,052][00436] Avg episode reward: 25.580, avg true_objective: 10.080 |
|
[2025-02-14 07:56:12,171][00436] Num frames 2100... |
|
[2025-02-14 07:56:12,309][00436] Num frames 2200... |
|
[2025-02-14 07:56:12,439][00436] Num frames 2300... |
|
[2025-02-14 07:56:12,571][00436] Num frames 2400... |
|
[2025-02-14 07:56:12,704][00436] Num frames 2500... |
|
[2025-02-14 07:56:12,840][00436] Num frames 2600... |
|
[2025-02-14 07:56:12,979][00436] Num frames 2700... |
|
[2025-02-14 07:56:13,112][00436] Num frames 2800... |
|
[2025-02-14 07:56:13,252][00436] Num frames 2900... |
|
[2025-02-14 07:56:13,386][00436] Num frames 3000... |
|
[2025-02-14 07:56:13,518][00436] Num frames 3100... |
|
[2025-02-14 07:56:13,649][00436] Num frames 3200... |
|
[2025-02-14 07:56:13,784][00436] Num frames 3300... |
|
[2025-02-14 07:56:13,928][00436] Num frames 3400... |
|
[2025-02-14 07:56:14,061][00436] Num frames 3500... |
|
[2025-02-14 07:56:14,198][00436] Num frames 3600... |
|
[2025-02-14 07:56:14,331][00436] Num frames 3700... |
|
[2025-02-14 07:56:14,462][00436] Num frames 3800... |
|
[2025-02-14 07:56:14,596][00436] Num frames 3900... |
|
[2025-02-14 07:56:14,727][00436] Num frames 4000... |
|
[2025-02-14 07:56:14,857][00436] Num frames 4100... |
|
[2025-02-14 07:56:14,933][00436] Avg episode rewards: #0: 36.053, true rewards: #0: 13.720 |
|
[2025-02-14 07:56:14,935][00436] Avg episode reward: 36.053, avg true_objective: 13.720 |
|
[2025-02-14 07:56:15,042][00436] Num frames 4200... |
|
[2025-02-14 07:56:15,177][00436] Num frames 4300... |
|
[2025-02-14 07:56:15,315][00436] Num frames 4400... |
|
[2025-02-14 07:56:15,445][00436] Num frames 4500... |
|
[2025-02-14 07:56:15,590][00436] Num frames 4600... |
|
[2025-02-14 07:56:15,724][00436] Num frames 4700... |
|
[2025-02-14 07:56:15,854][00436] Num frames 4800... |
|
[2025-02-14 07:56:15,937][00436] Avg episode rewards: #0: 30.550, true rewards: #0: 12.050 |
|
[2025-02-14 07:56:15,939][00436] Avg episode reward: 30.550, avg true_objective: 12.050 |
|
[2025-02-14 07:56:16,077][00436] Num frames 4900... |
|
[2025-02-14 07:56:16,266][00436] Num frames 5000... |
|
[2025-02-14 07:56:16,437][00436] Num frames 5100... |
|
[2025-02-14 07:56:16,613][00436] Num frames 5200... |
|
[2025-02-14 07:56:16,781][00436] Num frames 5300... |
|
[2025-02-14 07:56:17,006][00436] Avg episode rewards: #0: 26.392, true rewards: #0: 10.792 |
|
[2025-02-14 07:56:17,008][00436] Avg episode reward: 26.392, avg true_objective: 10.792 |
|
[2025-02-14 07:56:17,021][00436] Num frames 5400... |
|
[2025-02-14 07:56:17,192][00436] Num frames 5500... |
|
[2025-02-14 07:56:17,363][00436] Num frames 5600... |
|
[2025-02-14 07:56:17,544][00436] Num frames 5700... |
|
[2025-02-14 07:56:17,733][00436] Num frames 5800... |
|
[2025-02-14 07:56:17,914][00436] Num frames 5900... |
|
[2025-02-14 07:56:18,102][00436] Num frames 6000... |
|
[2025-02-14 07:56:18,289][00436] Num frames 6100... |
|
[2025-02-14 07:56:18,424][00436] Num frames 6200... |
|
[2025-02-14 07:56:18,555][00436] Num frames 6300... |
|
[2025-02-14 07:56:18,687][00436] Num frames 6400... |
|
[2025-02-14 07:56:18,822][00436] Num frames 6500... |
|
[2025-02-14 07:56:18,955][00436] Num frames 6600... |
|
[2025-02-14 07:56:19,095][00436] Num frames 6700... |
|
[2025-02-14 07:56:19,236][00436] Num frames 6800... |
|
[2025-02-14 07:56:19,298][00436] Avg episode rewards: #0: 28.173, true rewards: #0: 11.340 |
|
[2025-02-14 07:56:19,299][00436] Avg episode reward: 28.173, avg true_objective: 11.340 |
|
[2025-02-14 07:56:19,425][00436] Num frames 6900... |
|
[2025-02-14 07:56:19,553][00436] Num frames 7000... |
|
[2025-02-14 07:56:19,684][00436] Num frames 7100... |
|
[2025-02-14 07:56:19,815][00436] Num frames 7200... |
|
[2025-02-14 07:56:19,957][00436] Num frames 7300... |
|
[2025-02-14 07:56:20,104][00436] Num frames 7400... |
|
[2025-02-14 07:56:20,243][00436] Num frames 7500... |
|
[2025-02-14 07:56:20,374][00436] Num frames 7600... |
|
[2025-02-14 07:56:20,512][00436] Num frames 7700... |
|
[2025-02-14 07:56:20,647][00436] Num frames 7800... |
|
[2025-02-14 07:56:20,780][00436] Num frames 7900... |
|
[2025-02-14 07:56:20,913][00436] Num frames 8000... |
|
[2025-02-14 07:56:20,983][00436] Avg episode rewards: #0: 27.871, true rewards: #0: 11.443 |
|
[2025-02-14 07:56:20,985][00436] Avg episode reward: 27.871, avg true_objective: 11.443 |
|
[2025-02-14 07:56:21,112][00436] Num frames 8100... |
|
[2025-02-14 07:56:21,249][00436] Num frames 8200... |
|
[2025-02-14 07:56:21,380][00436] Num frames 8300... |
|
[2025-02-14 07:56:21,515][00436] Num frames 8400... |
|
[2025-02-14 07:56:21,649][00436] Num frames 8500... |
|
[2025-02-14 07:56:21,779][00436] Num frames 8600... |
|
[2025-02-14 07:56:21,912][00436] Num frames 8700... |
|
[2025-02-14 07:56:22,046][00436] Num frames 8800... |
|
[2025-02-14 07:56:22,194][00436] Num frames 8900... |
|
[2025-02-14 07:56:22,327][00436] Num frames 9000... |
|
[2025-02-14 07:56:22,505][00436] Avg episode rewards: #0: 28.237, true rewards: #0: 11.362 |
|
[2025-02-14 07:56:22,507][00436] Avg episode reward: 28.237, avg true_objective: 11.362 |
|
[2025-02-14 07:56:22,525][00436] Num frames 9100... |
|
[2025-02-14 07:56:22,656][00436] Num frames 9200... |
|
[2025-02-14 07:56:22,788][00436] Num frames 9300... |
|
[2025-02-14 07:56:22,925][00436] Num frames 9400... |
|
[2025-02-14 07:56:23,055][00436] Num frames 9500... |
|
[2025-02-14 07:56:23,203][00436] Num frames 9600... |
|
[2025-02-14 07:56:23,335][00436] Num frames 9700... |
|
[2025-02-14 07:56:23,468][00436] Num frames 9800... |
|
[2025-02-14 07:56:23,599][00436] Num frames 9900... |
|
[2025-02-14 07:56:23,731][00436] Num frames 10000... |
|
[2025-02-14 07:56:23,862][00436] Num frames 10100... |
|
[2025-02-14 07:56:23,994][00436] Num frames 10200... |
|
[2025-02-14 07:56:24,105][00436] Avg episode rewards: #0: 27.935, true rewards: #0: 11.380 |
|
[2025-02-14 07:56:24,107][00436] Avg episode reward: 27.935, avg true_objective: 11.380 |
|
[2025-02-14 07:56:24,194][00436] Num frames 10300... |
|
[2025-02-14 07:56:24,330][00436] Num frames 10400... |
|
[2025-02-14 07:56:24,462][00436] Num frames 10500... |
|
[2025-02-14 07:56:24,591][00436] Num frames 10600... |
|
[2025-02-14 07:56:24,722][00436] Num frames 10700... |
|
[2025-02-14 07:56:24,857][00436] Num frames 10800... |
|
[2025-02-14 07:56:24,990][00436] Num frames 10900... |
|
[2025-02-14 07:56:25,126][00436] Num frames 11000... |
|
[2025-02-14 07:56:25,279][00436] Num frames 11100... |
|
[2025-02-14 07:56:25,413][00436] Num frames 11200... |
|
[2025-02-14 07:56:25,548][00436] Num frames 11300... |
|
[2025-02-14 07:56:25,685][00436] Num frames 11400... |
|
[2025-02-14 07:56:25,816][00436] Num frames 11500... |
|
[2025-02-14 07:56:25,949][00436] Num frames 11600... |
|
[2025-02-14 07:56:26,076][00436] Num frames 11700... |
|
[2025-02-14 07:56:26,216][00436] Num frames 11800... |
|
[2025-02-14 07:56:26,357][00436] Num frames 11900... |
|
[2025-02-14 07:56:26,463][00436] Avg episode rewards: #0: 29.138, true rewards: #0: 11.938 |
|
[2025-02-14 07:56:26,466][00436] Avg episode reward: 29.138, avg true_objective: 11.938 |
|
[2025-02-14 07:57:38,965][00436] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
|