diff --git "a/sf_log.txt" "b/sf_log.txt"
--- "a/sf_log.txt"
+++ "b/sf_log.txt"
@@ -1,47 +1,47 @@
-[2023-02-26 11:53:32,334][00001] Saving configuration to ./runs/default_experiment/config.json...
-[2023-02-26 11:53:32,335][00001] Rollout worker 0 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 1 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 2 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 3 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 4 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 5 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 6 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 7 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 8 uses device cpu
-[2023-02-26 11:53:32,335][00001] Rollout worker 9 uses device cpu
-[2023-02-26 11:53:32,336][00001] Rollout worker 10 uses device cpu
-[2023-02-26 11:53:32,336][00001] Rollout worker 11 uses device cpu
-[2023-02-26 11:53:32,419][00001] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-26 11:53:32,419][00001] InferenceWorker_p0-w0: min num requests: 4
-[2023-02-26 11:53:32,442][00001] Starting all processes...
-[2023-02-26 11:53:32,442][00001] Starting process learner_proc0
-[2023-02-26 11:53:33,166][00001] Starting all processes...
-[2023-02-26 11:53:33,169][00001] Starting process inference_proc0-0
-[2023-02-26 11:53:33,169][00001] Starting process rollout_proc0
-[2023-02-26 11:53:33,169][00001] Starting process rollout_proc1
-[2023-02-26 11:53:33,170][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-26 11:53:33,170][00141] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
-[2023-02-26 11:53:33,170][00001] Starting process rollout_proc2
-[2023-02-26 11:53:33,170][00001] Starting process rollout_proc3
-[2023-02-26 11:53:33,170][00001] Starting process rollout_proc4
-[2023-02-26 11:53:33,170][00001] Starting process rollout_proc5
-[2023-02-26 11:53:33,171][00001] Starting process rollout_proc6
-[2023-02-26 11:53:33,179][00141] Num visible devices: 1
-[2023-02-26 11:53:33,172][00001] Starting process rollout_proc7
-[2023-02-26 11:53:33,172][00001] Starting process rollout_proc8
-[2023-02-26 11:53:33,172][00001] Starting process rollout_proc9
-[2023-02-26 11:53:33,173][00001] Starting process rollout_proc10
-[2023-02-26 11:53:33,173][00001] Starting process rollout_proc11
-[2023-02-26 11:53:33,213][00141] Starting seed is not provided
-[2023-02-26 11:53:33,214][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-26 11:53:33,214][00141] Initializing actor-critic model on device cuda:0
-[2023-02-26 11:53:33,214][00141] RunningMeanStd input shape: (3, 72, 128)
-[2023-02-26 11:53:33,215][00141] RunningMeanStd input shape: (1,)
-[2023-02-26 11:53:33,230][00141] ConvEncoder: input_channels=3
-[2023-02-26 11:53:33,330][00141] Conv encoder output size: 512
-[2023-02-26 11:53:33,331][00141] Policy head output size: 512
-[2023-02-26 11:53:33,344][00141] Created Actor Critic model with architecture:
-[2023-02-26 11:53:33,344][00141] ActorCriticSharedWeights(
+[2023-02-26 11:54:55,181][00001] Saving configuration to ./runs/default_experiment/config.json...
+[2023-02-26 11:54:55,182][00001] Rollout worker 0 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 1 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 2 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 3 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 4 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 5 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 6 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 7 uses device cpu
+[2023-02-26 11:54:55,182][00001] Rollout worker 8 uses device cpu
+[2023-02-26 11:54:55,183][00001] Rollout worker 9 uses device cpu
+[2023-02-26 11:54:55,183][00001] Rollout worker 10 uses device cpu
+[2023-02-26 11:54:55,183][00001] Rollout worker 11 uses device cpu
+[2023-02-26 11:54:55,265][00001] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-26 11:54:55,266][00001] InferenceWorker_p0-w0: min num requests: 4
+[2023-02-26 11:54:55,288][00001] Starting all processes...
+[2023-02-26 11:54:55,288][00001] Starting process learner_proc0
+[2023-02-26 11:54:56,013][00001] Starting all processes...
+[2023-02-26 11:54:56,016][00001] Starting process inference_proc0-0
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc0
+[2023-02-26 11:54:56,016][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-26 11:54:56,016][00141] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc1
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc2
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc3
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc4
+[2023-02-26 11:54:56,016][00001] Starting process rollout_proc5
+[2023-02-26 11:54:56,017][00001] Starting process rollout_proc6
+[2023-02-26 11:54:56,025][00141] Num visible devices: 1
+[2023-02-26 11:54:56,017][00001] Starting process rollout_proc7
+[2023-02-26 11:54:56,018][00001] Starting process rollout_proc8
+[2023-02-26 11:54:56,018][00001] Starting process rollout_proc9
+[2023-02-26 11:54:56,018][00001] Starting process rollout_proc10
+[2023-02-26 11:54:56,020][00001] Starting process rollout_proc11
+[2023-02-26 11:54:56,056][00141] Starting seed is not provided
+[2023-02-26 11:54:56,056][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-26 11:54:56,057][00141] Initializing actor-critic model on device cuda:0
+[2023-02-26 11:54:56,057][00141] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-26 11:54:56,057][00141] RunningMeanStd input shape: (1,)
+[2023-02-26 11:54:56,067][00141] ConvEncoder: input_channels=3
+[2023-02-26 11:54:56,162][00141] Conv encoder output size: 512
+[2023-02-26 11:54:56,163][00141] Policy head output size: 512
+[2023-02-26 11:54:56,174][00141] Created Actor Critic model with architecture:
+[2023-02-26 11:54:56,175][00141] ActorCriticSharedWeights(
 (obs_normalizer): ObservationNormalizer(
 (running_mean_std): RunningMeanStdDictInPlace(
 (running_mean_std): ModuleDict(
@@ -82,325 +82,3730 @@
 (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
 )
 )
-[2023-02-26 11:53:34,210][00201] Worker 7 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,236][00191] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-26 11:53:34,237][00191] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
-[2023-02-26 11:53:34,238][00194] Worker 4 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,247][00191] Num visible devices: 1
-[2023-02-26 11:53:34,247][00195] Worker 5 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,251][00192] Worker 2 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,260][00190] Worker 1 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,261][00189] Worker 0 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,262][00196] Worker 6 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,266][00197] Worker 8 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,275][00199] Worker 10 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,276][00193] Worker 3 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,280][00198] Worker 9 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,288][00200] Worker 11 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
-[2023-02-26 11:53:34,937][00141] Using optimizer
-[2023-02-26 11:53:34,937][00141] No checkpoints found
-[2023-02-26 11:53:34,937][00141] Did not load from checkpoint, starting from scratch!
-[2023-02-26 11:53:34,938][00141] Initialized policy 0 weights for model version 0
-[2023-02-26 11:53:34,939][00141] LearnerWorker_p0 finished initialization!
-[2023-02-26 11:53:34,939][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-26 11:53:34,988][00191] RunningMeanStd input shape: (3, 72, 128)
-[2023-02-26 11:53:34,989][00191] RunningMeanStd input shape: (1,)
-[2023-02-26 11:53:34,996][00191] ConvEncoder: input_channels=3
-[2023-02-26 11:53:35,058][00191] Conv encoder output size: 512
-[2023-02-26 11:53:35,058][00191] Policy head output size: 512
-[2023-02-26 11:53:35,721][00001] Inference worker 0-0 is ready!
-[2023-02-26 11:53:35,721][00001] All inference workers are ready! Signal rollout workers to start!
-[2023-02-26 11:53:35,742][00192] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,745][00199] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,746][00189] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,747][00197] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,748][00196] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,749][00200] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,749][00198] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,749][00201] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,755][00195] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,755][00194] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,756][00193] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,756][00190] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:35,823][00001] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
-[2023-02-26 11:53:35,976][00192] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,014][00197] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,015][00189] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,019][00200] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,019][00198] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,020][00194] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,020][00193] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,020][00190] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,110][00201] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,164][00189] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,181][00200] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,181][00198] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,184][00194] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,185][00193] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,195][00190] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,250][00199] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,325][00192] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,336][00200] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,336][00198] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,347][00190] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,349][00201] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,359][00196] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,422][00195] Decorrelating experience for 0 frames...
-[2023-02-26 11:53:36,465][00189] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,486][00199] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,514][00196] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,515][00197] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,515][00201] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,515][00198] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,518][00190] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,556][00195] Decorrelating experience for 32 frames...
-[2023-02-26 11:53:36,638][00194] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,641][00193] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,668][00197] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,684][00192] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,684][00189] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,700][00199] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,710][00195] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:36,718][00198] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:36,784][00201] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,807][00194] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,833][00197] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,860][00193] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,865][00190] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:36,865][00199] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,951][00192] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,963][00200] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:36,974][00195] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:37,000][00194] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,024][00197] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,057][00193] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,059][00190] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,118][00196] Decorrelating experience for 64 frames...
-[2023-02-26 11:53:37,126][00201] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,164][00200] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,177][00199] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,206][00192] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,221][00197] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,298][00196] Decorrelating experience for 96 frames...
-[2023-02-26 11:53:37,319][00198] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,325][00193] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,327][00201] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,374][00200] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,376][00190] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,478][00199] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,483][00189] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,495][00196] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,522][00195] Decorrelating experience for 128 frames...
-[2023-02-26 11:53:37,537][00193] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,599][00200] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,634][00198] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,643][00201] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,700][00197] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,700][00196] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,701][00199] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,711][00190] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:37,736][00195] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,784][00189] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,805][00194] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,805][00192] Decorrelating experience for 160 frames...
-[2023-02-26 11:53:37,884][00193] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:37,902][00201] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:37,925][00196] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:37,928][00200] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:37,949][00199] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:37,952][00195] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:38,045][00189] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:38,065][00194] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:38,139][00192] Decorrelating experience for 192 frames...
-[2023-02-26 11:53:38,169][00196] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,184][00195] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,223][00198] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,289][00189] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,339][00194] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,383][00192] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:38,489][00141] Signal inference workers to stop experience collection...
-[2023-02-26 11:53:38,491][00191] InferenceWorker_p0-w0: stopping experience collection
-[2023-02-26 11:53:38,557][00197] Decorrelating experience for 224 frames...
-[2023-02-26 11:53:39,094][00141] Signal inference workers to resume experience collection...
-[2023-02-26 11:53:39,095][00191] InferenceWorker_p0-w0: resuming experience collection
-[2023-02-26 11:53:39,763][00001] Component Batcher_0 stopped!
-[2023-02-26 11:53:39,763][00141] Stopping Batcher_0...
-[2023-02-26 11:53:39,764][00141] Loop batcher_evt_loop terminating...
-[2023-02-26 11:53:39,764][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000004_16384.pth...
-[2023-02-26 11:53:39,779][00201] Stopping RolloutWorker_w7...
-[2023-02-26 11:53:39,779][00200] Stopping RolloutWorker_w11...
-[2023-02-26 11:53:39,779][00190] Stopping RolloutWorker_w1...
-[2023-02-26 11:53:39,779][00198] Stopping RolloutWorker_w9...
-[2023-02-26 11:53:39,779][00001] Component RolloutWorker_w11 stopped!
-[2023-02-26 11:53:39,779][00199] Stopping RolloutWorker_w10...
-[2023-02-26 11:53:39,779][00195] Stopping RolloutWorker_w5...
-[2023-02-26 11:53:39,779][00201] Loop rollout_proc7_evt_loop terminating...
-[2023-02-26 11:53:39,779][00001] Component RolloutWorker_w7 stopped!
-[2023-02-26 11:53:39,779][00192] Stopping RolloutWorker_w2...
-[2023-02-26 11:53:39,779][00190] Loop rollout_proc1_evt_loop terminating...
-[2023-02-26 11:53:39,779][00198] Loop rollout_proc9_evt_loop terminating...
-[2023-02-26 11:53:39,779][00199] Loop rollout_proc10_evt_loop terminating...
-[2023-02-26 11:53:39,779][00001] Component RolloutWorker_w1 stopped!
-[2023-02-26 11:53:39,779][00200] Loop rollout_proc11_evt_loop terminating...
-[2023-02-26 11:53:39,779][00001] Component RolloutWorker_w9 stopped!
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w5 stopped!
-[2023-02-26 11:53:39,779][00195] Loop rollout_proc5_evt_loop terminating...
-[2023-02-26 11:53:39,779][00192] Loop rollout_proc2_evt_loop terminating...
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w10 stopped!
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w2 stopped!
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w8 stopped!
-[2023-02-26 11:53:39,780][00197] Stopping RolloutWorker_w8...
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w6 stopped!
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w3 stopped!
-[2023-02-26 11:53:39,780][00001] Component RolloutWorker_w4 stopped!
-[2023-02-26 11:53:39,780][00196] Stopping RolloutWorker_w6...
-[2023-02-26 11:53:39,780][00197] Loop rollout_proc8_evt_loop terminating...
-[2023-02-26 11:53:39,780][00193] Stopping RolloutWorker_w3...
-[2023-02-26 11:53:39,780][00194] Stopping RolloutWorker_w4...
-[2023-02-26 11:53:39,780][00196] Loop rollout_proc6_evt_loop terminating...
-[2023-02-26 11:53:39,780][00193] Loop rollout_proc3_evt_loop terminating...
-[2023-02-26 11:53:39,780][00194] Loop rollout_proc4_evt_loop terminating...
-[2023-02-26 11:53:39,781][00001] Component RolloutWorker_w0 stopped!
-[2023-02-26 11:53:39,781][00189] Stopping RolloutWorker_w0...
-[2023-02-26 11:53:39,781][00189] Loop rollout_proc0_evt_loop terminating...
-[2023-02-26 11:53:39,789][00191] Weights refcount: 2 0
-[2023-02-26 11:53:39,790][00191] Stopping InferenceWorker_p0-w0...
-[2023-02-26 11:53:39,790][00001] Component InferenceWorker_p0-w0 stopped!
-[2023-02-26 11:53:39,791][00191] Loop inference_proc0-0_evt_loop terminating...
-[2023-02-26 11:53:39,811][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000004_16384.pth...
-[2023-02-26 11:53:39,869][00141] Stopping LearnerWorker_p0...
-[2023-02-26 11:53:39,869][00141] Loop learner_proc0_evt_loop terminating...
-[2023-02-26 11:53:39,869][00001] Component LearnerWorker_p0 stopped!
-[2023-02-26 11:53:39,870][00001] Waiting for process learner_proc0 to stop...
-[2023-02-26 11:53:40,573][00001] Waiting for process inference_proc0-0 to join...
-[2023-02-26 11:53:40,573][00001] Waiting for process rollout_proc0 to join...
-[2023-02-26 11:53:40,577][00001] Waiting for process rollout_proc1 to join...
-[2023-02-26 11:53:40,577][00001] Waiting for process rollout_proc2 to join...
-[2023-02-26 11:53:40,577][00001] Waiting for process rollout_proc3 to join...
-[2023-02-26 11:53:40,577][00001] Waiting for process rollout_proc4 to join...
-[2023-02-26 11:53:40,577][00001] Waiting for process rollout_proc5 to join...
-[2023-02-26 11:53:40,578][00001] Waiting for process rollout_proc6 to join...
-[2023-02-26 11:53:40,578][00001] Waiting for process rollout_proc7 to join...
-[2023-02-26 11:53:40,578][00001] Waiting for process rollout_proc8 to join...
-[2023-02-26 11:53:40,578][00001] Waiting for process rollout_proc9 to join...
-[2023-02-26 11:53:40,578][00001] Waiting for process rollout_proc10 to join...
-[2023-02-26 11:53:40,579][00001] Waiting for process rollout_proc11 to join...
-[2023-02-26 11:53:40,579][00001] Batcher 0 profile tree view:
-batching: 0.0546, releasing_batches: 0.0008
-[2023-02-26 11:53:40,579][00001] InferenceWorker_p0-w0 profile tree view:
-wait_policy: 0.0000
- wait_policy_total: 2.2468
-update_model: 0.2048
- weight_update: 0.0503
-one_step: 0.0076
- handle_policy_step: 0.7121
- deserialize: 0.0288, stack: 0.0024, obs_to_device_normalize: 0.1026, forward: 0.4772, send_messages: 0.0251
- prepare_outputs: 0.0543
- to_cpu: 0.0308
-[2023-02-26 11:53:40,579][00001] Learner 0 profile tree view:
-misc: 0.0000, prepare_batch: 1.1456
-train: 0.2287
- epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0003, kl_divergence: 0.0004, after_optimizer: 0.0060
- calculate_losses: 0.0418
- losses_init: 0.0000, forward_head: 0.0260, bptt_initial: 0.0099, tail: 0.0013, advantages_returns: 0.0005, losses: 0.0018
- bptt: 0.0020
- bptt_forward_core: 0.0019
- update: 0.1793
- clip: 0.0028
-[2023-02-26 11:53:40,580][00001] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.0004, enqueue_policy_requests: 0.0104, env_step: 0.1969, overhead: 0.0129, complete_rollouts: 0.0002
-save_policy_outputs: 0.0131
- split_output_tensors: 0.0063
-[2023-02-26 11:53:40,580][00001] RolloutWorker_w11 profile tree view:
-wait_for_trajectories: 0.0007, enqueue_policy_requests: 0.0232, env_step: 0.4797, overhead: 0.0301, complete_rollouts: 0.0004
-save_policy_outputs: 0.0298
- split_output_tensors: 0.0143
-[2023-02-26 11:53:40,580][00001] Loop Runner_EvtLoop terminating...
-[2023-02-26 11:53:40,580][00001] Runner profile tree view:
-main_loop: 8.1387
-[2023-02-26 11:53:40,580][00001] Collected {0: 16384}, FPS: 2013.1
-[2023-02-26 11:53:40,590][00001] Loading existing experiment configuration from ./runs/default_experiment/config.json
-[2023-02-26 11:53:40,590][00001] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-02-26 11:53:40,590][00001] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-02-26 11:53:40,590][00001] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-02-26 11:53:40,590][00001] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'push_to_hub'=True that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'hf_repository'='chavicoski/vizdoom_health_gathering_supreme' that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Adding new argument 'enjoy_script'=None that is not in the saved config file!
-[2023-02-26 11:53:40,591][00001] Using frameskip 1 and render_action_repeat=4 for evaluation
-[2023-02-26 11:53:40,597][00001] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-02-26 11:53:40,598][00001] RunningMeanStd input shape: (3, 72, 128)
-[2023-02-26 11:53:40,598][00001] RunningMeanStd input shape: (1,)
-[2023-02-26 11:53:40,611][00001] ConvEncoder: input_channels=3
-[2023-02-26 11:53:40,696][00001] Conv encoder output size: 512
-[2023-02-26 11:53:40,696][00001] Policy head output size: 512
-[2023-02-26 11:53:41,916][00001] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000000004_16384.pth...
-[2023-02-26 11:53:42,521][00001] Num frames 100...
-[2023-02-26 11:53:42,623][00001] Num frames 200...
-[2023-02-26 11:53:42,725][00001] Num frames 300...
-[2023-02-26 11:53:42,827][00001] Num frames 400...
-[2023-02-26 11:53:42,928][00001] Avg episode rewards: #0: 5.480, true rewards: #0: 4.480
-[2023-02-26 11:53:42,929][00001] Avg episode reward: 5.480, avg true_objective: 4.480
-[2023-02-26 11:53:42,998][00001] Num frames 500...
-[2023-02-26 11:53:43,102][00001] Num frames 600...
-[2023-02-26 11:53:43,203][00001] Num frames 700...
-[2023-02-26 11:53:43,305][00001] Num frames 800...
-[2023-02-26 11:53:43,391][00001] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160
-[2023-02-26 11:53:43,391][00001] Avg episode reward: 4.660, avg true_objective: 4.160
-[2023-02-26 11:53:43,478][00001] Num frames 900...
-[2023-02-26 11:53:43,578][00001] Num frames 1000...
-[2023-02-26 11:53:43,679][00001] Num frames 1100...
-[2023-02-26 11:53:43,780][00001] Num frames 1200...
-[2023-02-26 11:53:43,849][00001] Avg episode rewards: #0: 4.387, true rewards: #0: 4.053
-[2023-02-26 11:53:43,849][00001] Avg episode reward: 4.387, avg true_objective: 4.053
-[2023-02-26 11:53:43,958][00001] Num frames 1300...
-[2023-02-26 11:53:44,058][00001] Num frames 1400...
-[2023-02-26 11:53:44,159][00001] Num frames 1500...
-[2023-02-26 11:53:44,260][00001] Num frames 1600...
-[2023-02-26 11:53:44,377][00001] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160
-[2023-02-26 11:53:44,377][00001] Avg episode reward: 4.660, avg true_objective: 4.160
-[2023-02-26 11:53:44,427][00001] Num frames 1700...
-[2023-02-26 11:53:44,532][00001] Num frames 1800...
-[2023-02-26 11:53:44,633][00001] Num frames 1900...
-[2023-02-26 11:53:44,733][00001] Num frames 2000...
-[2023-02-26 11:53:44,834][00001] Num frames 2100...
-[2023-02-26 11:53:44,899][00001] Avg episode rewards: #0: 4.824, true rewards: #0: 4.224
-[2023-02-26 11:53:44,899][00001] Avg episode reward: 4.824, avg true_objective: 4.224
-[2023-02-26 11:53:45,012][00001] Num frames 2200...
-[2023-02-26 11:53:45,113][00001] Num frames 2300...
-[2023-02-26 11:53:45,214][00001] Num frames 2400...
-[2023-02-26 11:53:45,315][00001] Num frames 2500...
-[2023-02-26 11:53:45,429][00001] Avg episode rewards: #0: 4.933, true rewards: #0: 4.267
-[2023-02-26 11:53:45,430][00001] Avg episode reward: 4.933, avg true_objective: 4.267
-[2023-02-26 11:53:45,485][00001] Num frames 2600...
-[2023-02-26 11:53:45,594][00001] Num frames 2700...
-[2023-02-26 11:53:45,696][00001] Num frames 2800...
-[2023-02-26 11:53:45,798][00001] Num frames 2900...
-[2023-02-26 11:53:45,895][00001] Avg episode rewards: #0: 4.777, true rewards: #0: 4.206
-[2023-02-26 11:53:45,896][00001] Avg episode reward: 4.777, avg true_objective: 4.206
-[2023-02-26 11:53:45,975][00001] Num frames 3000...
-[2023-02-26 11:53:46,080][00001] Num frames 3100...
-[2023-02-26 11:53:46,181][00001] Num frames 3200...
-[2023-02-26 11:53:46,283][00001] Num frames 3300...
-[2023-02-26 11:53:46,397][00001] Avg episode rewards: #0: 4.825, true rewards: #0: 4.200
-[2023-02-26 11:53:46,397][00001] Avg episode reward: 4.825, avg true_objective: 4.200
-[2023-02-26 11:53:46,451][00001] Num frames 3400...
-[2023-02-26 11:53:46,559][00001] Num frames 3500...
-[2023-02-26 11:53:46,660][00001] Num frames 3600...
-[2023-02-26 11:53:46,761][00001] Num frames 3700...
-[2023-02-26 11:53:46,862][00001] Num frames 3800...
-[2023-02-26 11:53:46,964][00001] Num frames 3900...
-[2023-02-26 11:53:47,021][00001] Avg episode rewards: #0: 5.116, true rewards: #0: 4.338
-[2023-02-26 11:53:47,022][00001] Avg episode reward: 5.116, avg true_objective: 4.338
-[2023-02-26 11:53:47,140][00001] Num frames 4000...
-[2023-02-26 11:53:47,238][00001] Num frames 4100...
-[2023-02-26 11:53:47,339][00001] Num frames 4200...
-[2023-02-26 11:53:47,480][00001] Avg episode rewards: #0: 4.988, true rewards: #0: 4.288
-[2023-02-26 11:53:47,480][00001] Avg episode reward: 4.988, avg true_objective: 4.288
-[2023-02-26 11:53:51,609][00001] Replay video saved to ./runs/default_experiment/replay.mp4!
+[2023-02-26 11:54:57,073][00197] Worker 8 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,085][00191] Worker 1 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,085][00199] Worker 10 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,085][00192] Worker 5 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,086][00195] Worker 2 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,088][00194] Worker 4 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,088][00200] Worker 11 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,092][00189] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-26 11:54:57,092][00189] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2023-02-26 11:54:57,094][00193] Worker 3 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,095][00201] Worker 9 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,099][00190] Worker 0 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,102][00189] Num visible devices: 1
+[2023-02-26 11:54:57,108][00196] Worker 6 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,111][00198] Worker 7 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
+[2023-02-26 11:54:57,770][00141] Using optimizer
+[2023-02-26 11:54:57,770][00141] No checkpoints found
+[2023-02-26 11:54:57,771][00141] Did not load from checkpoint, starting from scratch!
+[2023-02-26 11:54:57,771][00141] Initialized policy 0 weights for model version 0
+[2023-02-26 11:54:57,772][00141] LearnerWorker_p0 finished initialization!
+[2023-02-26 11:54:57,772][00141] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-26 11:54:57,824][00189] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-26 11:54:57,824][00189] RunningMeanStd input shape: (1,)
+[2023-02-26 11:54:57,832][00189] ConvEncoder: input_channels=3
+[2023-02-26 11:54:57,890][00189] Conv encoder output size: 512
+[2023-02-26 11:54:57,890][00189] Policy head output size: 512
+[2023-02-26 11:54:58,527][00001] Inference worker 0-0 is ready!
+[2023-02-26 11:54:58,527][00001] All inference workers are ready! Signal rollout workers to start!
+[2023-02-26 11:54:58,544][00194] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,544][00191] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,545][00201] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,545][00195] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,557][00196] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,559][00198] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,569][00200] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,569][00199] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,569][00193] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,569][00197] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,570][00190] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,570][00192] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 11:54:58,656][00001] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-02-26 11:54:58,741][00195] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,741][00191] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,741][00194] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,766][00196] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,767][00198] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,776][00197] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,777][00192] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,880][00195] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:58,892][00201] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,893][00191] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:58,914][00197] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:58,920][00192] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:58,931][00193] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,935][00199] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:58,936][00200] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:59,026][00201] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,076][00190] Decorrelating experience for 0 frames...
+[2023-02-26 11:54:59,079][00193] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,080][00197] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,081][00199] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,081][00200] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,105][00196] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,180][00195] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,226][00192] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,230][00201] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,239][00193] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,249][00191] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,261][00198] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,267][00196] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,333][00200] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,343][00195] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,388][00194] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,402][00193] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,420][00191] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,422][00190] Decorrelating experience for 32 frames...
+[2023-02-26 11:54:59,441][00192] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,514][00198] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,547][00194] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,551][00195] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,555][00199] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,572][00190] Decorrelating experience for 64 frames...
+[2023-02-26 11:54:59,602][00200] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,691][00196] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,706][00197] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,707][00198] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,714][00191] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,717][00194] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,728][00199] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,848][00201] Decorrelating experience for 96 frames...
+[2023-02-26 11:54:59,890][00196] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,904][00192] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,914][00193] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,917][00197] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,918][00200] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,920][00194] Decorrelating experience for 128 frames...
+[2023-02-26 11:54:59,994][00190] Decorrelating experience for 96 frames...
+[2023-02-26 11:55:00,047][00201] Decorrelating experience for 128 frames...
+[2023-02-26 11:55:00,058][00198] Decorrelating experience for 128 frames...
+[2023-02-26 11:55:00,061][00191] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,076][00199] Decorrelating experience for 128 frames...
+[2023-02-26 11:55:00,122][00193] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,123][00197] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,133][00200] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,191][00196] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,256][00195] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,279][00194] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,281][00198] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,282][00199] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,353][00197] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,357][00200] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,357][00201] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,358][00193] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,414][00192] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,456][00190] Decorrelating experience for 128 frames...
+[2023-02-26 11:55:00,492][00198] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,588][00191] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,594][00197] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:00,599][00193] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:00,627][00201] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,632][00200] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:00,656][00190] Decorrelating experience for 160 frames...
+[2023-02-26 11:55:00,752][00198] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:00,772][00195] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,789][00192] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,820][00191] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:00,892][00190] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,905][00196] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:00,960][00201] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,005][00195] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,046][00194] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:01,149][00196] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,189][00190] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,241][00199] Decorrelating experience for 192 frames...
+[2023-02-26 11:55:01,341][00141] Signal inference workers to stop experience collection...
+[2023-02-26 11:55:01,343][00189] InferenceWorker_p0-w0: stopping experience collection
+[2023-02-26 11:55:01,369][00194] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,433][00192] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,486][00199] Decorrelating experience for 224 frames...
+[2023-02-26 11:55:01,974][00141] Signal inference workers to resume experience collection...
+[2023-02-26 11:55:01,974][00189] InferenceWorker_p0-w0: resuming experience collection
+[2023-02-26 11:55:03,163][00189] Updated weights for policy 0, policy_version 10 (0.0007)
+[2023-02-26 11:55:03,656][00001] Fps is (10 sec: 14745.6, 60 sec: 14745.6, 300 sec: 14745.6). Total num frames: 73728. Throughput: 0: 714.4. Samples: 3572. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-02-26 11:55:03,656][00001] Avg episode reward: [(0, '4.178')]
+[2023-02-26 11:55:03,805][00189] Updated weights for policy 0, policy_version 20 (0.0008)
+[2023-02-26 11:55:04,450][00189] Updated weights for policy 0, policy_version 30 (0.0007)
+[2023-02-26 11:55:05,194][00189] Updated weights for policy 0, policy_version 40 (0.0007)
+[2023-02-26 11:55:05,827][00189] Updated weights for policy 0, policy_version 50 (0.0006)
+[2023-02-26 11:55:06,513][00189] Updated weights for policy 0, policy_version 60 (0.0006)
+[2023-02-26 11:55:07,225][00189] Updated weights for policy 0, policy_version 70 (0.0006)
+[2023-02-26 11:55:07,912][00189] Updated weights for policy 0, policy_version 80 (0.0006)
+[2023-02-26 11:55:08,609][00189] Updated weights for policy 0, policy_version 90 (0.0006)
+[2023-02-26 11:55:08,656][00001] Fps is (10 sec: 36864.2, 60 sec: 36864.2, 300 sec: 36864.2). Total num frames: 368640. Throughput: 0: 7765.6. Samples: 77656. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 11:55:08,656][00001] Avg episode reward: [(0, '4.585')]
+[2023-02-26 11:55:08,656][00141] Saving new best policy, reward=4.585!
+[2023-02-26 11:55:09,318][00189] Updated weights for policy 0, policy_version 100 (0.0007)
+[2023-02-26 11:55:09,943][00189] Updated weights for policy 0, policy_version 110 (0.0007)
+[2023-02-26 11:55:10,642][00189] Updated weights for policy 0, policy_version 120 (0.0007)
+[2023-02-26 11:55:11,363][00189] Updated weights for policy 0, policy_version 130 (0.0008)
+[2023-02-26 11:55:11,998][00189] Updated weights for policy 0, policy_version 140 (0.0006)
+[2023-02-26 11:55:12,701][00189] Updated weights for policy 0, policy_version 150 (0.0006)
+[2023-02-26 11:55:13,401][00189] Updated weights for policy 0, policy_version 160 (0.0007)
+[2023-02-26 11:55:13,656][00001] Fps is (10 sec: 59801.7, 60 sec: 44783.0, 300 sec: 44783.0). Total num frames: 671744. Throughput: 0: 11186.7. Samples: 167800. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 11:55:13,656][00001] Avg episode reward: [(0, '4.981')]
+[2023-02-26 11:55:13,658][00141] Saving new best policy, reward=4.981!
+[2023-02-26 11:55:14,038][00189] Updated weights for policy 0, policy_version 170 (0.0006)
+[2023-02-26 11:55:14,791][00189] Updated weights for policy 0, policy_version 180 (0.0006)
+[2023-02-26 11:55:15,261][00001] Heartbeat connected on Batcher_0
+[2023-02-26 11:55:15,263][00001] Heartbeat connected on LearnerWorker_p0
+[2023-02-26 11:55:15,268][00001] Heartbeat connected on InferenceWorker_p0-w0
+[2023-02-26 11:55:15,272][00001] Heartbeat connected on RolloutWorker_w0
+[2023-02-26 11:55:15,275][00001] Heartbeat connected on RolloutWorker_w3
+[2023-02-26 11:55:15,275][00001] Heartbeat connected on RolloutWorker_w2
+[2023-02-26 11:55:15,276][00001] Heartbeat connected on RolloutWorker_w1
+[2023-02-26 11:55:15,277][00001] Heartbeat connected on RolloutWorker_w5
+[2023-02-26 11:55:15,277][00001] Heartbeat connected on RolloutWorker_w4
+[2023-02-26 11:55:15,284][00001] Heartbeat connected on RolloutWorker_w6
+[2023-02-26 11:55:15,284][00001] Heartbeat connected on RolloutWorker_w9
+[2023-02-26 11:55:15,284][00001] Heartbeat connected on RolloutWorker_w7
+[2023-02-26 11:55:15,287][00001] Heartbeat connected on RolloutWorker_w8
+[2023-02-26 11:55:15,292][00001] Heartbeat connected on RolloutWorker_w10
+[2023-02-26 11:55:15,292][00001] Heartbeat connected on RolloutWorker_w11
+[2023-02-26 11:55:15,454][00189] Updated weights for policy 0, policy_version 190 (0.0007)
+[2023-02-26 11:55:16,124][00189] Updated weights for policy 0, policy_version 200 (0.0006)
+[2023-02-26 11:55:16,846][00189] Updated weights for policy 0, policy_version 210 (0.0006)
+[2023-02-26 11:55:17,516][00189] Updated weights for policy 0, policy_version 220 (0.0006)
+[2023-02-26 11:55:18,153][00189] Updated weights for policy 0, policy_version 230 (0.0006)
+[2023-02-26 11:55:18,656][00001] Fps is (10 sec: 59801.1, 60 sec: 48332.7, 300 sec: 48332.7). Total num frames: 966656. Throughput: 0: 10621.2. Samples: 212424. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:55:18,656][00001] Avg episode reward: [(0, '4.709')]
+[2023-02-26 11:55:18,890][00189] Updated weights for policy 0, policy_version 240 (0.0006)
+[2023-02-26 11:55:19,553][00189] Updated weights for policy 0, policy_version 250 (0.0006)
+[2023-02-26 11:55:20,235][00189] Updated weights for policy 0, policy_version 260 (0.0006)
+[2023-02-26 11:55:20,902][00189] Updated weights for policy 0, policy_version 270 (0.0006)
+[2023-02-26 11:55:21,585][00189] Updated weights for policy 0, policy_version 280 (0.0006)
+[2023-02-26 11:55:22,281][00189] Updated weights for policy 0, policy_version 290 (0.0006)
+[2023-02-26 11:55:22,954][00189] Updated weights for policy 0, policy_version 300 (0.0006)
+[2023-02-26 11:55:23,619][00189] Updated weights for policy 0, policy_version 310 (0.0006)
+[2023-02-26 11:55:23,656][00001] Fps is (10 sec: 59801.6, 60 sec: 50790.5, 300 sec: 50790.5). Total num frames: 1269760. Throughput: 0: 12104.2. Samples: 302604. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 11:55:23,656][00001] Avg episode reward: [(0, '4.611')]
+[2023-02-26 11:55:24,349][00189] Updated weights for policy 0, policy_version 320 (0.0006)
+[2023-02-26 11:55:24,997][00189] Updated weights for policy 0, policy_version 330 (0.0006)
+[2023-02-26 11:55:25,661][00189] Updated weights for policy 0, policy_version 340 (0.0006)
+[2023-02-26 11:55:26,396][00189] Updated weights for policy 0, policy_version 350 (0.0006)
+[2023-02-26 11:55:27,054][00189] Updated weights for policy 0, policy_version 360 (0.0006)
+[2023-02-26 11:55:27,750][00189] Updated weights for policy 0, policy_version 370 (0.0006)
+[2023-02-26 11:55:28,455][00189] Updated weights for policy 0, policy_version 380 (0.0006)
+[2023-02-26 11:55:28,656][00001] Fps is (10 sec: 59801.8, 60 sec: 52155.7, 300 sec: 52155.7). Total num frames: 1564672. Throughput: 0: 13078.7. Samples: 392360. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 11:55:28,656][00001] Avg episode reward: [(0, '4.580')]
+[2023-02-26 11:55:29,113][00189] Updated weights for policy 0, policy_version 390 (0.0006)
+[2023-02-26 11:55:29,799][00189] Updated weights for policy 0, policy_version 400 (0.0006)
+[2023-02-26 11:55:30,506][00189] Updated weights for policy 0, policy_version 410 (0.0006)
+[2023-02-26 11:55:31,166][00189] Updated weights for policy 0, policy_version 420 (0.0006)
+[2023-02-26 11:55:31,878][00189] Updated weights for policy 0, policy_version 430 (0.0006)
+[2023-02-26 11:55:32,543][00189] Updated weights for policy 0, policy_version 440 (0.0006)
+[2023-02-26 11:55:33,220][00189] Updated weights for policy 0, policy_version 450 (0.0006)
+[2023-02-26 11:55:33,656][00001] Fps is (10 sec: 59801.2, 60 sec: 53365.0, 300 sec: 53365.0). Total num frames: 1867776. Throughput: 0: 12492.6. Samples: 437240. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0)
+[2023-02-26 11:55:33,656][00001] Avg episode reward: [(0, '4.682')]
+[2023-02-26 11:55:33,967][00189] Updated weights for policy 0, policy_version 460 (0.0007)
+[2023-02-26 11:55:34,628][00189] Updated weights for policy 0, policy_version 470 (0.0006)
+[2023-02-26 11:55:35,285][00189] Updated weights for policy 0, policy_version 480 (0.0006)
+[2023-02-26 11:55:36,004][00189] Updated weights for policy 0, policy_version 490 (0.0006)
+[2023-02-26 11:55:36,654][00189] Updated weights for policy 0, policy_version 500 (0.0006)
+[2023-02-26 11:55:37,346][00189] Updated weights for policy 0, policy_version 510 (0.0006)
+[2023-02-26 11:55:38,048][00189] Updated weights for policy 0, policy_version 520 (0.0006)
+[2023-02-26 11:55:38,656][00001] Fps is (10 sec: 59801.6, 60 sec: 54067.2, 300 sec: 54067.2). Total num frames: 2162688. Throughput: 0: 13169.0. Samples: 526760. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 11:55:38,656][00001] Avg episode reward: [(0, '4.863')]
+[2023-02-26 11:55:38,716][00189] Updated weights for policy 0, policy_version 530 (0.0006)
+[2023-02-26 11:55:39,410][00189] Updated weights for policy 0, policy_version 540 (0.0006)
+[2023-02-26 11:55:40,125][00189] Updated weights for policy 0, policy_version 550 (0.0007)
+[2023-02-26 11:55:40,794][00189] Updated weights for policy 0, policy_version 560 (0.0006)
+[2023-02-26 11:55:41,470][00189] Updated weights for policy 0, policy_version 570 (0.0006)
+[2023-02-26 11:55:42,179][00189] Updated weights for policy 0, policy_version 580 (0.0006)
+[2023-02-26 11:55:42,858][00189] Updated weights for policy 0, policy_version 590 (0.0007)
+[2023-02-26 11:55:43,541][00189] Updated weights for policy 0, policy_version 600 (0.0006)
+[2023-02-26 11:55:43,656][00001] Fps is (10 sec: 59392.0, 60 sec: 54704.3, 300 sec: 54704.3). Total num frames: 2461696. Throughput: 0: 13689.1. Samples: 616012. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:55:43,656][00001] Avg episode reward: [(0, '4.636')]
+[2023-02-26 11:55:44,237][00189] Updated weights for policy 0, policy_version 610 (0.0006)
+[2023-02-26 11:55:44,915][00189] Updated weights for policy 0, policy_version 620 (0.0006)
+[2023-02-26 11:55:45,592][00189] Updated weights for policy 0, policy_version 630 (0.0006)
+[2023-02-26 11:55:46,289][00189] Updated weights for policy 0, policy_version 640 (0.0006)
+[2023-02-26 11:55:46,957][00189] Updated weights for policy 0, policy_version 650 (0.0006)
+[2023-02-26 11:55:47,671][00189] Updated weights for policy 0, policy_version 660 (0.0006)
+[2023-02-26 11:55:48,366][00189] Updated weights for policy 0, policy_version 670 (0.0007)
+[2023-02-26 11:55:48,656][00001] Fps is (10 sec: 59801.9, 60 sec: 55214.2, 300 sec: 55214.2). Total num frames: 2760704. Throughput: 0: 14605.3. Samples: 660808. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 11:55:48,656][00001] Avg episode reward: [(0, '4.817')]
+[2023-02-26 11:55:49,015][00189] Updated weights for policy 0, policy_version 680 (0.0007)
+[2023-02-26 11:55:49,730][00189] Updated weights for policy 0, policy_version 690 (0.0006)
+[2023-02-26 11:55:50,381][00189] Updated weights for policy 0, policy_version 700 (0.0006)
+[2023-02-26 11:55:51,054][00189] Updated weights for policy 0, policy_version 710 (0.0006)
+[2023-02-26 11:55:51,747][00189] Updated weights for policy 0, policy_version 720 (0.0007)
+[2023-02-26 11:55:52,429][00189] Updated weights for policy 0, policy_version 730 (0.0006)
+[2023-02-26 11:55:53,111][00189] Updated weights for policy 0, policy_version 740 (0.0006)
+[2023-02-26 11:55:53,656][00001] Fps is (10 sec: 59801.9, 60 sec: 55631.2, 300 sec: 55631.2). Total num frames: 3059712. Throughput: 0: 14960.0. Samples: 750856. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 11:55:53,656][00001] Avg episode reward: [(0, '5.138')]
+[2023-02-26 11:55:53,670][00141] Saving new best policy, reward=5.138!
+[2023-02-26 11:55:53,785][00189] Updated weights for policy 0, policy_version 750 (0.0006)
+[2023-02-26 11:55:54,464][00189] Updated weights for policy 0, policy_version 760 (0.0006)
+[2023-02-26 11:55:55,141][00189] Updated weights for policy 0, policy_version 770 (0.0006)
+[2023-02-26 11:55:55,844][00189] Updated weights for policy 0, policy_version 780 (0.0006)
+[2023-02-26 11:55:56,481][00189] Updated weights for policy 0, policy_version 790 (0.0006)
+[2023-02-26 11:55:57,145][00189] Updated weights for policy 0, policy_version 800 (0.0006)
+[2023-02-26 11:55:57,865][00189] Updated weights for policy 0, policy_version 810 (0.0006)
+[2023-02-26 11:55:58,068][00141] Signal inference workers to stop experience collection... (50 times)
+[2023-02-26 11:55:58,069][00141] Signal inference workers to resume experience collection... (50 times)
+[2023-02-26 11:55:58,072][00189] InferenceWorker_p0-w0: stopping experience collection (50 times)
+[2023-02-26 11:55:58,072][00189] InferenceWorker_p0-w0: resuming experience collection (50 times)
+[2023-02-26 11:55:58,496][00189] Updated weights for policy 0, policy_version 820 (0.0006)
+[2023-02-26 11:55:58,656][00001] Fps is (10 sec: 60210.9, 60 sec: 56047.0, 300 sec: 56047.0). Total num frames: 3362816. Throughput: 0: 14977.8. Samples: 841800. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 11:55:58,656][00001] Avg episode reward: [(0, '5.408')]
+[2023-02-26 11:55:58,658][00141] Saving new best policy, reward=5.408!
+[2023-02-26 11:55:59,177][00189] Updated weights for policy 0, policy_version 830 (0.0006)
+[2023-02-26 11:55:59,861][00189] Updated weights for policy 0, policy_version 840 (0.0006)
+[2023-02-26 11:56:00,534][00189] Updated weights for policy 0, policy_version 850 (0.0006)
+[2023-02-26 11:56:01,224][00189] Updated weights for policy 0, policy_version 860 (0.0006)
+[2023-02-26 11:56:01,879][00189] Updated weights for policy 0, policy_version 870 (0.0007)
+[2023-02-26 11:56:02,558][00189] Updated weights for policy 0, policy_version 880 (0.0006)
+[2023-02-26 11:56:03,275][00189] Updated weights for policy 0, policy_version 890 (0.0006)
+[2023-02-26 11:56:03,656][00001] Fps is (10 sec: 60621.0, 60 sec: 59869.9, 300 sec: 56398.8). Total num frames: 3665920. Throughput: 0: 15000.5. Samples: 887444. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:56:03,656][00001] Avg episode reward: [(0, '5.101')]
+[2023-02-26 11:56:03,929][00189] Updated weights for policy 0, policy_version 900 (0.0006)
+[2023-02-26 11:56:04,608][00189] Updated weights for policy 0, policy_version 910 (0.0006)
+[2023-02-26 11:56:05,322][00189] Updated weights for policy 0, policy_version 920 (0.0006)
+[2023-02-26 11:56:05,962][00189] Updated weights for policy 0, policy_version 930 (0.0006)
+[2023-02-26 11:56:06,678][00189] Updated weights for policy 0, policy_version 940 (0.0006)
+[2023-02-26 11:56:07,362][00189] Updated weights for policy 0, policy_version 950 (0.0007)
+[2023-02-26 11:56:08,062][00189] Updated weights for policy 0, policy_version 960 (0.0006)
+[2023-02-26 11:56:08,656][00001] Fps is (10 sec: 60210.8, 60 sec: 59938.0, 300 sec: 56641.8). Total num frames: 3964928. Throughput: 0: 14991.7. Samples: 977232. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:56:08,656][00001] Avg episode reward: [(0, '5.097')]
+[2023-02-26 11:56:08,718][00189] Updated weights for policy 0, policy_version 970 (0.0006)
+[2023-02-26 11:56:09,392][00189] Updated weights for policy 0, policy_version 980 (0.0006)
+[2023-02-26 11:56:10,116][00189] Updated weights for policy 0, policy_version 990 (0.0006)
+[2023-02-26 11:56:10,773][00189] Updated weights for policy 0, policy_version 1000 (0.0007)
+[2023-02-26 11:56:11,442][00189] Updated weights for policy 0, policy_version 1010 (0.0007)
+[2023-02-26 11:56:12,164][00189] Updated weights for policy 0, policy_version 1020 (0.0007)
+[2023-02-26 11:56:12,838][00189] Updated weights for policy 0, policy_version 1030 (0.0006)
+[2023-02-26 11:56:13,510][00189] Updated weights for policy 0, policy_version 1040 (0.0006)
+[2023-02-26 11:56:13,656][00001] Fps is (10 sec: 60211.2, 60 sec: 59938.2, 300 sec: 56907.1). Total num frames: 4268032. Throughput: 0: 14987.4. Samples: 1066792. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 11:56:13,656][00001] Avg episode reward: [(0, '5.985')]
+[2023-02-26 11:56:13,659][00141] Saving new best policy, reward=5.985!
+[2023-02-26 11:56:14,228][00189] Updated weights for policy 0, policy_version 1050 (0.0007)
+[2023-02-26 11:56:14,898][00189] Updated weights for policy 0, policy_version 1060 (0.0006)
+[2023-02-26 11:56:15,592][00189] Updated weights for policy 0, policy_version 1070 (0.0006)
+[2023-02-26 11:56:16,276][00189] Updated weights for policy 0, policy_version 1080 (0.0006)
+[2023-02-26 11:56:17,015][00189] Updated weights for policy 0, policy_version 1090 (0.0007)
+[2023-02-26 11:56:17,682][00189] Updated weights for policy 0, policy_version 1100 (0.0006)
+[2023-02-26 11:56:18,388][00189] Updated weights for policy 0, policy_version 1110 (0.0007)
+[2023-02-26 11:56:18,656][00001] Fps is (10 sec: 59802.5, 60 sec: 59938.3, 300 sec: 57036.9). Total num frames: 4562944. Throughput: 0: 14976.5. Samples: 1111180. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 11:56:18,656][00001] Avg episode reward: [(0, '6.718')]
+[2023-02-26 11:56:18,656][00141] Saving new best policy, reward=6.718!
+[2023-02-26 11:56:19,072][00189] Updated weights for policy 0, policy_version 1120 (0.0006)
+[2023-02-26 11:56:19,775][00189] Updated weights for policy 0, policy_version 1130 (0.0006)
+[2023-02-26 11:56:20,456][00189] Updated weights for policy 0, policy_version 1140 (0.0007)
+[2023-02-26 11:56:21,144][00189] Updated weights for policy 0, policy_version 1150 (0.0006)
+[2023-02-26 11:56:21,798][00189] Updated weights for policy 0, policy_version 1160 (0.0006)
+[2023-02-26 11:56:22,493][00189] Updated weights for policy 0, policy_version 1170 (0.0007)
+[2023-02-26 11:56:23,178][00189] Updated weights for policy 0, policy_version 1180 (0.0006)
+[2023-02-26 11:56:23,656][00001] Fps is (10 sec: 59391.8, 60 sec: 59869.9, 300 sec: 57199.5). Total num frames: 4861952. Throughput: 0: 14968.4. Samples: 1200340. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0)
+[2023-02-26 11:56:23,656][00001] Avg episode reward: [(0, '7.311')]
+[2023-02-26 11:56:23,659][00141] Saving new best policy, reward=7.311!
+[2023-02-26 11:56:23,837][00189] Updated weights for policy 0, policy_version 1190 (0.0006)
+[2023-02-26 11:56:24,558][00189] Updated weights for policy 0, policy_version 1200 (0.0006)
+[2023-02-26 11:56:25,217][00189] Updated weights for policy 0, policy_version 1210 (0.0006)
+[2023-02-26 11:56:25,935][00189] Updated weights for policy 0, policy_version 1220 (0.0006)
+[2023-02-26 11:56:26,633][00189] Updated weights for policy 0, policy_version 1230 (0.0006)
+[2023-02-26 11:56:27,272][00189] Updated weights for policy 0, policy_version 1240 (0.0006)
+[2023-02-26 11:56:27,982][00189] Updated weights for policy 0, policy_version 1250 (0.0006)
+[2023-02-26 11:56:28,656][00001] Fps is (10 sec: 59391.0, 60 sec: 59869.8, 300 sec: 57298.4). Total num frames: 5156864. Throughput: 0: 14979.5. Samples: 1290092. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:56:28,656][00001] Avg episode reward: [(0, '8.923')]
+[2023-02-26 11:56:28,656][00141] Saving new best policy, reward=8.923!
+[2023-02-26 11:56:28,718][00189] Updated weights for policy 0, policy_version 1260 (0.0006)
+[2023-02-26 11:56:29,339][00189] Updated weights for policy 0, policy_version 1270 (0.0006)
+[2023-02-26 11:56:29,996][00189] Updated weights for policy 0, policy_version 1280 (0.0006)
+[2023-02-26 11:56:30,708][00189] Updated weights for policy 0, policy_version 1290 (0.0006)
+[2023-02-26 11:56:31,363][00189] Updated weights for policy 0, policy_version 1300 (0.0006)
+[2023-02-26 11:56:32,020][00189] Updated weights for policy 0, policy_version 1310 (0.0006)
+[2023-02-26 11:56:32,718][00189] Updated weights for policy 0, policy_version 1320 (0.0006)
+[2023-02-26 11:56:33,375][00189] Updated weights for policy 0, policy_version 1330 (0.0006)
+[2023-02-26 11:56:33,656][00001] Fps is (10 sec: 60210.7, 60 sec: 59938.1, 300 sec: 57516.4). Total num frames: 5464064. Throughput: 0: 14992.7. Samples: 1335480. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 11:56:33,656][00001] Avg episode reward: [(0, '13.573')]
+[2023-02-26 11:56:33,659][00141] Saving new best policy, reward=13.573!
+[2023-02-26 11:56:34,028][00189] Updated weights for policy 0, policy_version 1340 (0.0007)
+[2023-02-26 11:56:34,712][00189] Updated weights for policy 0, policy_version 1350 (0.0006)
+[2023-02-26 11:56:35,395][00189] Updated weights for policy 0, policy_version 1360 (0.0006)
+[2023-02-26 11:56:36,052][00189] Updated weights for policy 0, policy_version 1370 (0.0006)
+[2023-02-26 11:56:36,757][00189] Updated weights for policy 0, policy_version 1380 (0.0007)
+[2023-02-26 11:56:37,446][00189] Updated weights for policy 0, policy_version 1390 (0.0006)
+[2023-02-26 11:56:38,118][00189] Updated weights for policy 0, policy_version 1400 (0.0007)
+[2023-02-26 11:56:38,656][00001] Fps is (10 sec: 60621.1, 60 sec: 60006.4, 300 sec: 57630.7). Total num frames: 5763072. Throughput: 0: 15022.3. Samples: 1426860. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 11:56:38,656][00001] Avg episode reward: [(0, '15.285')]
+[2023-02-26 11:56:38,656][00141] Saving new best policy, reward=15.285!
+[2023-02-26 11:56:38,800][00189] Updated weights for policy 0, policy_version 1410 (0.0006)
+[2023-02-26 11:56:39,510][00189] Updated weights for policy 0, policy_version 1420 (0.0006)
+[2023-02-26 11:56:40,164][00189] Updated weights for policy 0, policy_version 1430 (0.0006)
+[2023-02-26 11:56:40,815][00189] Updated weights for policy 0, policy_version 1440 (0.0006)
+[2023-02-26 11:56:41,532][00189] Updated weights for policy 0, policy_version 1450 (0.0007)
+[2023-02-26 11:56:42,205][00189] Updated weights for policy 0, policy_version 1460 (0.0006)
+[2023-02-26 11:56:42,853][00189] Updated weights for policy 0, policy_version 1470 (0.0007)
+[2023-02-26 11:56:43,617][00189] Updated weights for policy 0, policy_version 1480 (0.0007)
+[2023-02-26 11:56:43,656][00001] Fps is (10 sec: 60211.5, 60 sec: 60074.7, 300 sec: 57773.1). Total num frames: 6066176. Throughput: 0: 14994.1. Samples: 1516536. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 11:56:43,656][00001] Avg episode reward: [(0, '16.013')]
+[2023-02-26 11:56:43,659][00141] Saving new best policy, reward=16.013!
+[2023-02-26 11:56:44,261][00189] Updated weights for policy 0, policy_version 1490 (0.0006) +[2023-02-26 11:56:44,928][00189] Updated weights for policy 0, policy_version 1500 (0.0006) +[2023-02-26 11:56:45,677][00189] Updated weights for policy 0, policy_version 1510 (0.0007) +[2023-02-26 11:56:46,304][00189] Updated weights for policy 0, policy_version 1520 (0.0006) +[2023-02-26 11:56:46,976][00189] Updated weights for policy 0, policy_version 1530 (0.0006) +[2023-02-26 11:56:47,699][00189] Updated weights for policy 0, policy_version 1540 (0.0006) +[2023-02-26 11:56:48,354][00189] Updated weights for policy 0, policy_version 1550 (0.0006) +[2023-02-26 11:56:48,656][00001] Fps is (10 sec: 60211.7, 60 sec: 60074.7, 300 sec: 57865.3). Total num frames: 6365184. Throughput: 0: 14979.5. Samples: 1561520. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 11:56:48,656][00001] Avg episode reward: [(0, '17.834')] +[2023-02-26 11:56:48,656][00141] Saving new best policy, reward=17.834! +[2023-02-26 11:56:48,991][00189] Updated weights for policy 0, policy_version 1560 (0.0006) +[2023-02-26 11:56:49,716][00189] Updated weights for policy 0, policy_version 1570 (0.0006) +[2023-02-26 11:56:50,375][00189] Updated weights for policy 0, policy_version 1580 (0.0006) +[2023-02-26 11:56:51,038][00189] Updated weights for policy 0, policy_version 1590 (0.0006) +[2023-02-26 11:56:51,708][00189] Updated weights for policy 0, policy_version 1600 (0.0006) +[2023-02-26 11:56:52,418][00189] Updated weights for policy 0, policy_version 1610 (0.0006) +[2023-02-26 11:56:53,073][00189] Updated weights for policy 0, policy_version 1620 (0.0006) +[2023-02-26 11:56:53,656][00001] Fps is (10 sec: 60211.1, 60 sec: 60142.9, 300 sec: 57985.1). Total num frames: 6668288. Throughput: 0: 15000.9. Samples: 1652272. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 11:56:53,656][00001] Avg episode reward: [(0, '22.030')] +[2023-02-26 11:56:53,660][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth... +[2023-02-26 11:56:53,697][00141] Saving new best policy, reward=22.030! +[2023-02-26 11:56:53,769][00189] Updated weights for policy 0, policy_version 1630 (0.0006) +[2023-02-26 11:56:54,432][00189] Updated weights for policy 0, policy_version 1640 (0.0006) +[2023-02-26 11:56:55,101][00189] Updated weights for policy 0, policy_version 1650 (0.0006) +[2023-02-26 11:56:55,796][00189] Updated weights for policy 0, policy_version 1660 (0.0006) +[2023-02-26 11:56:56,474][00189] Updated weights for policy 0, policy_version 1670 (0.0006) +[2023-02-26 11:56:57,121][00189] Updated weights for policy 0, policy_version 1680 (0.0006) +[2023-02-26 11:56:57,805][00189] Updated weights for policy 0, policy_version 1690 (0.0006) +[2023-02-26 11:56:58,511][00189] Updated weights for policy 0, policy_version 1700 (0.0006) +[2023-02-26 11:56:58,656][00001] Fps is (10 sec: 60620.6, 60 sec: 60142.9, 300 sec: 58094.9). Total num frames: 6971392. Throughput: 0: 15028.0. Samples: 1743052. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) +[2023-02-26 11:56:58,656][00001] Avg episode reward: [(0, '18.488')] +[2023-02-26 11:56:59,153][00189] Updated weights for policy 0, policy_version 1710 (0.0006) +[2023-02-26 11:56:59,866][00189] Updated weights for policy 0, policy_version 1720 (0.0007) +[2023-02-26 11:57:00,536][00189] Updated weights for policy 0, policy_version 1730 (0.0006) +[2023-02-26 11:57:01,188][00189] Updated weights for policy 0, policy_version 1740 (0.0006) +[2023-02-26 11:57:01,912][00189] Updated weights for policy 0, policy_version 1750 (0.0007) +[2023-02-26 11:57:02,570][00189] Updated weights for policy 0, policy_version 1760 (0.0006) +[2023-02-26 11:57:03,212][00189] Updated weights for policy 0, policy_version 1770 (0.0007) +[2023-02-26 11:57:03,656][00001] Fps is (10 sec: 60620.8, 60 sec: 60142.8, 300 sec: 58196.0). Total num frames: 7274496. Throughput: 0: 15050.7. Samples: 1788464. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 11:57:03,656][00001] Avg episode reward: [(0, '21.850')] +[2023-02-26 11:57:03,966][00189] Updated weights for policy 0, policy_version 1780 (0.0006) +[2023-02-26 11:57:04,613][00189] Updated weights for policy 0, policy_version 1790 (0.0006) +[2023-02-26 11:57:05,267][00189] Updated weights for policy 0, policy_version 1800 (0.0006) +[2023-02-26 11:57:06,004][00189] Updated weights for policy 0, policy_version 1810 (0.0007) +[2023-02-26 11:57:06,650][00189] Updated weights for policy 0, policy_version 1820 (0.0006) +[2023-02-26 11:57:07,314][00189] Updated weights for policy 0, policy_version 1830 (0.0006) +[2023-02-26 11:57:08,020][00189] Updated weights for policy 0, policy_version 1840 (0.0007) +[2023-02-26 11:57:08,656][00001] Fps is (10 sec: 60211.3, 60 sec: 60143.0, 300 sec: 58257.7). Total num frames: 7573504. Throughput: 0: 15078.4. Samples: 1878868. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) +[2023-02-26 11:57:08,656][00001] Avg episode reward: [(0, '18.236')] +[2023-02-26 11:57:08,684][00189] Updated weights for policy 0, policy_version 1850 (0.0006) +[2023-02-26 11:57:09,346][00189] Updated weights for policy 0, policy_version 1860 (0.0006) +[2023-02-26 11:57:10,053][00189] Updated weights for policy 0, policy_version 1870 (0.0006) +[2023-02-26 11:57:10,712][00189] Updated weights for policy 0, policy_version 1880 (0.0006) +[2023-02-26 11:57:11,400][00189] Updated weights for policy 0, policy_version 1890 (0.0006) +[2023-02-26 11:57:12,095][00189] Updated weights for policy 0, policy_version 1900 (0.0006) +[2023-02-26 11:57:12,757][00189] Updated weights for policy 0, policy_version 1910 (0.0006) +[2023-02-26 11:57:13,444][00189] Updated weights for policy 0, policy_version 1920 (0.0006) +[2023-02-26 11:57:13,656][00001] Fps is (10 sec: 60211.2, 60 sec: 60142.8, 300 sec: 58345.2). Total num frames: 7876608. Throughput: 0: 15090.8. Samples: 1969176. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 11:57:13,656][00001] Avg episode reward: [(0, '21.892')] +[2023-02-26 11:57:14,128][00189] Updated weights for policy 0, policy_version 1930 (0.0006) +[2023-02-26 11:57:14,783][00189] Updated weights for policy 0, policy_version 1940 (0.0006) +[2023-02-26 11:57:15,497][00189] Updated weights for policy 0, policy_version 1950 (0.0006) +[2023-02-26 11:57:16,161][00189] Updated weights for policy 0, policy_version 1960 (0.0006) +[2023-02-26 11:57:16,848][00189] Updated weights for policy 0, policy_version 1970 (0.0006) +[2023-02-26 11:57:17,558][00189] Updated weights for policy 0, policy_version 1980 (0.0007) +[2023-02-26 11:57:18,264][00189] Updated weights for policy 0, policy_version 1990 (0.0006) +[2023-02-26 11:57:18,656][00001] Fps is (10 sec: 59800.9, 60 sec: 60142.7, 300 sec: 58368.0). Total num frames: 8171520. Throughput: 0: 15082.9. Samples: 2014212. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 3.0) +[2023-02-26 11:57:18,656][00001] Avg episode reward: [(0, '19.686')] +[2023-02-26 11:57:18,953][00189] Updated weights for policy 0, policy_version 2000 (0.0006) +[2023-02-26 11:57:19,635][00189] Updated weights for policy 0, policy_version 2010 (0.0006) +[2023-02-26 11:57:20,292][00189] Updated weights for policy 0, policy_version 2020 (0.0006) +[2023-02-26 11:57:20,993][00189] Updated weights for policy 0, policy_version 2030 (0.0006) +[2023-02-26 11:57:21,669][00189] Updated weights for policy 0, policy_version 2040 (0.0006) +[2023-02-26 11:57:22,345][00189] Updated weights for policy 0, policy_version 2050 (0.0006) +[2023-02-26 11:57:23,042][00189] Updated weights for policy 0, policy_version 2060 (0.0006) +[2023-02-26 11:57:23,656][00001] Fps is (10 sec: 59392.0, 60 sec: 60142.9, 300 sec: 58417.4). Total num frames: 8470528. Throughput: 0: 15037.8. Samples: 2103560. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 11:57:23,656][00001] Avg episode reward: [(0, '22.933')] +[2023-02-26 11:57:23,662][00141] Saving new best policy, reward=22.933! +[2023-02-26 11:57:23,718][00189] Updated weights for policy 0, policy_version 2070 (0.0006) +[2023-02-26 11:57:24,396][00189] Updated weights for policy 0, policy_version 2080 (0.0006) +[2023-02-26 11:57:25,092][00189] Updated weights for policy 0, policy_version 2090 (0.0006) +[2023-02-26 11:57:25,769][00189] Updated weights for policy 0, policy_version 2100 (0.0006) +[2023-02-26 11:57:26,445][00189] Updated weights for policy 0, policy_version 2110 (0.0006) +[2023-02-26 11:57:27,113][00189] Updated weights for policy 0, policy_version 2120 (0.0006) +[2023-02-26 11:57:27,796][00189] Updated weights for policy 0, policy_version 2130 (0.0006) +[2023-02-26 11:57:28,485][00189] Updated weights for policy 0, policy_version 2140 (0.0006) +[2023-02-26 11:57:28,656][00001] Fps is (10 sec: 60211.8, 60 sec: 60279.6, 300 sec: 58490.9). Total num frames: 8773632. 
Throughput: 0: 15052.5. Samples: 2193896. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 11:57:28,656][00001] Avg episode reward: [(0, '24.080')] +[2023-02-26 11:57:28,656][00141] Saving new best policy, reward=24.080! +[2023-02-26 11:57:29,156][00189] Updated weights for policy 0, policy_version 2150 (0.0006) +[2023-02-26 11:57:29,846][00189] Updated weights for policy 0, policy_version 2160 (0.0006) +[2023-02-26 11:57:30,534][00189] Updated weights for policy 0, policy_version 2170 (0.0006) +[2023-02-26 11:57:31,187][00189] Updated weights for policy 0, policy_version 2180 (0.0006) +[2023-02-26 11:57:31,880][00189] Updated weights for policy 0, policy_version 2190 (0.0006) +[2023-02-26 11:57:32,575][00189] Updated weights for policy 0, policy_version 2200 (0.0007) +[2023-02-26 11:57:33,207][00189] Updated weights for policy 0, policy_version 2210 (0.0006) +[2023-02-26 11:57:33,656][00001] Fps is (10 sec: 60621.0, 60 sec: 60211.3, 300 sec: 58559.6). Total num frames: 9076736. Throughput: 0: 15057.0. Samples: 2239088. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) +[2023-02-26 11:57:33,656][00001] Avg episode reward: [(0, '23.904')] +[2023-02-26 11:57:33,911][00189] Updated weights for policy 0, policy_version 2220 (0.0006) +[2023-02-26 11:57:34,608][00189] Updated weights for policy 0, policy_version 2230 (0.0006) +[2023-02-26 11:57:35,264][00189] Updated weights for policy 0, policy_version 2240 (0.0006) +[2023-02-26 11:57:35,930][00189] Updated weights for policy 0, policy_version 2250 (0.0007) +[2023-02-26 11:57:36,637][00189] Updated weights for policy 0, policy_version 2260 (0.0006) +[2023-02-26 11:57:37,321][00189] Updated weights for policy 0, policy_version 2270 (0.0007) +[2023-02-26 11:57:37,973][00189] Updated weights for policy 0, policy_version 2280 (0.0006) +[2023-02-26 11:57:38,648][00189] Updated weights for policy 0, policy_version 2290 (0.0006) +[2023-02-26 11:57:38,656][00001] Fps is (10 sec: 60620.8, 60 sec: 60279.5, 300 sec: 58624.0). Total num frames: 9379840. Throughput: 0: 15054.2. Samples: 2329708. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 11:57:38,656][00001] Avg episode reward: [(0, '23.661')] +[2023-02-26 11:57:39,339][00189] Updated weights for policy 0, policy_version 2300 (0.0006) +[2023-02-26 11:57:40,043][00189] Updated weights for policy 0, policy_version 2310 (0.0006) +[2023-02-26 11:57:40,703][00189] Updated weights for policy 0, policy_version 2320 (0.0006) +[2023-02-26 11:57:41,408][00189] Updated weights for policy 0, policy_version 2330 (0.0006) +[2023-02-26 11:57:42,094][00189] Updated weights for policy 0, policy_version 2340 (0.0006) +[2023-02-26 11:57:42,799][00189] Updated weights for policy 0, policy_version 2350 (0.0006) +[2023-02-26 11:57:43,456][00189] Updated weights for policy 0, policy_version 2360 (0.0006) +[2023-02-26 11:57:43,656][00001] Fps is (10 sec: 59801.8, 60 sec: 60143.0, 300 sec: 58634.9). Total num frames: 9674752. Throughput: 0: 15027.8. Samples: 2419304. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 11:57:43,656][00001] Avg episode reward: [(0, '23.961')] +[2023-02-26 11:57:44,160][00189] Updated weights for policy 0, policy_version 2370 (0.0006) +[2023-02-26 11:57:44,850][00189] Updated weights for policy 0, policy_version 2380 (0.0006) +[2023-02-26 11:57:45,524][00189] Updated weights for policy 0, policy_version 2390 (0.0006) +[2023-02-26 11:57:46,214][00189] Updated weights for policy 0, policy_version 2400 (0.0006) +[2023-02-26 11:57:46,868][00189] Updated weights for policy 0, policy_version 2410 (0.0007) +[2023-02-26 11:57:47,578][00189] Updated weights for policy 0, policy_version 2420 (0.0006) +[2023-02-26 11:57:48,267][00189] Updated weights for policy 0, policy_version 2430 (0.0006) +[2023-02-26 11:57:48,656][00001] Fps is (10 sec: 59392.2, 60 sec: 60142.9, 300 sec: 58669.2). Total num frames: 9973760. Throughput: 0: 15018.6. Samples: 2464300. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 11:57:48,656][00001] Avg episode reward: [(0, '22.931')] +[2023-02-26 11:57:48,924][00189] Updated weights for policy 0, policy_version 2440 (0.0006) +[2023-02-26 11:57:49,617][00189] Updated weights for policy 0, policy_version 2450 (0.0006) +[2023-02-26 11:57:50,315][00189] Updated weights for policy 0, policy_version 2460 (0.0007) +[2023-02-26 11:57:50,985][00189] Updated weights for policy 0, policy_version 2470 (0.0006) +[2023-02-26 11:57:51,689][00189] Updated weights for policy 0, policy_version 2480 (0.0006) +[2023-02-26 11:57:52,365][00189] Updated weights for policy 0, policy_version 2490 (0.0006) +[2023-02-26 11:57:53,043][00189] Updated weights for policy 0, policy_version 2500 (0.0006) +[2023-02-26 11:57:53,656][00001] Fps is (10 sec: 60211.3, 60 sec: 60143.0, 300 sec: 58725.0). Total num frames: 10276864. Throughput: 0: 15006.0. Samples: 2554136. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 11:57:53,656][00001] Avg episode reward: [(0, '25.663')] +[2023-02-26 11:57:53,659][00141] Saving new best policy, reward=25.663! +[2023-02-26 11:57:53,718][00189] Updated weights for policy 0, policy_version 2510 (0.0006) +[2023-02-26 11:57:54,364][00189] Updated weights for policy 0, policy_version 2520 (0.0006) +[2023-02-26 11:57:55,062][00189] Updated weights for policy 0, policy_version 2530 (0.0006) +[2023-02-26 11:57:55,752][00189] Updated weights for policy 0, policy_version 2540 (0.0007) +[2023-02-26 11:57:56,430][00189] Updated weights for policy 0, policy_version 2550 (0.0007) +[2023-02-26 11:57:57,176][00189] Updated weights for policy 0, policy_version 2560 (0.0006) +[2023-02-26 11:57:57,808][00189] Updated weights for policy 0, policy_version 2570 (0.0006) +[2023-02-26 11:57:58,502][00189] Updated weights for policy 0, policy_version 2580 (0.0006) +[2023-02-26 11:57:58,656][00001] Fps is (10 sec: 60211.0, 60 sec: 60074.7, 300 sec: 58754.9). Total num frames: 10575872. Throughput: 0: 14996.9. Samples: 2644036. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0) +[2023-02-26 11:57:58,656][00001] Avg episode reward: [(0, '22.832')] +[2023-02-26 11:57:59,183][00189] Updated weights for policy 0, policy_version 2590 (0.0006) +[2023-02-26 11:57:59,853][00189] Updated weights for policy 0, policy_version 2600 (0.0006) +[2023-02-26 11:58:00,527][00189] Updated weights for policy 0, policy_version 2610 (0.0006) +[2023-02-26 11:58:01,211][00189] Updated weights for policy 0, policy_version 2620 (0.0006) +[2023-02-26 11:58:01,864][00189] Updated weights for policy 0, policy_version 2630 (0.0006) +[2023-02-26 11:58:02,560][00189] Updated weights for policy 0, policy_version 2640 (0.0007) +[2023-02-26 11:58:03,226][00189] Updated weights for policy 0, policy_version 2650 (0.0006) +[2023-02-26 11:58:03,656][00001] Fps is (10 sec: 60210.8, 60 sec: 60074.7, 300 sec: 58805.3). Total num frames: 10878976. 
Throughput: 0: 15014.3. Samples: 2689856. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 11:58:03,656][00001] Avg episode reward: [(0, '22.750')] +[2023-02-26 11:58:03,908][00189] Updated weights for policy 0, policy_version 2660 (0.0006) +[2023-02-26 11:58:04,577][00189] Updated weights for policy 0, policy_version 2670 (0.0006) +[2023-02-26 11:58:05,263][00189] Updated weights for policy 0, policy_version 2680 (0.0007) +[2023-02-26 11:58:05,948][00189] Updated weights for policy 0, policy_version 2690 (0.0006) +[2023-02-26 11:58:06,584][00189] Updated weights for policy 0, policy_version 2700 (0.0006) +[2023-02-26 11:58:07,304][00189] Updated weights for policy 0, policy_version 2710 (0.0006) +[2023-02-26 11:58:07,973][00189] Updated weights for policy 0, policy_version 2720 (0.0007) +[2023-02-26 11:58:08,656][00001] Fps is (10 sec: 60211.3, 60 sec: 60074.7, 300 sec: 58831.5). Total num frames: 11177984. Throughput: 0: 15040.6. Samples: 2780388. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) +[2023-02-26 11:58:08,656][00001] Avg episode reward: [(0, '21.309')] +[2023-02-26 11:58:08,665][00189] Updated weights for policy 0, policy_version 2730 (0.0006) +[2023-02-26 11:58:09,346][00189] Updated weights for policy 0, policy_version 2740 (0.0006) +[2023-02-26 11:58:09,990][00189] Updated weights for policy 0, policy_version 2750 (0.0006) +[2023-02-26 11:58:10,705][00189] Updated weights for policy 0, policy_version 2760 (0.0006) +[2023-02-26 11:58:11,399][00189] Updated weights for policy 0, policy_version 2770 (0.0006) +[2023-02-26 11:58:12,048][00189] Updated weights for policy 0, policy_version 2780 (0.0006) +[2023-02-26 11:58:12,732][00189] Updated weights for policy 0, policy_version 2790 (0.0006) +[2023-02-26 11:58:13,417][00189] Updated weights for policy 0, policy_version 2800 (0.0006) +[2023-02-26 11:58:13,656][00001] Fps is (10 sec: 60210.4, 60 sec: 60074.5, 300 sec: 58877.3). Total num frames: 11481088. Throughput: 0: 15042.4. Samples: 2870808. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 11:58:13,656][00001] Avg episode reward: [(0, '24.992')] +[2023-02-26 11:58:14,094][00189] Updated weights for policy 0, policy_version 2810 (0.0006) +[2023-02-26 11:58:14,759][00189] Updated weights for policy 0, policy_version 2820 (0.0006) +[2023-02-26 11:58:15,439][00189] Updated weights for policy 0, policy_version 2830 (0.0006) +[2023-02-26 11:58:16,132][00189] Updated weights for policy 0, policy_version 2840 (0.0006) +[2023-02-26 11:58:16,783][00189] Updated weights for policy 0, policy_version 2850 (0.0007) +[2023-02-26 11:58:17,454][00189] Updated weights for policy 0, policy_version 2860 (0.0006) +[2023-02-26 11:58:18,184][00189] Updated weights for policy 0, policy_version 2870 (0.0006) +[2023-02-26 11:58:18,656][00001] Fps is (10 sec: 60621.0, 60 sec: 60211.4, 300 sec: 58921.0). Total num frames: 11784192. Throughput: 0: 15046.8. Samples: 2916192. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 11:58:18,656][00001] Avg episode reward: [(0, '25.115')] +[2023-02-26 11:58:18,839][00189] Updated weights for policy 0, policy_version 2880 (0.0006) +[2023-02-26 11:58:19,478][00189] Updated weights for policy 0, policy_version 2890 (0.0006) +[2023-02-26 11:58:20,218][00189] Updated weights for policy 0, policy_version 2900 (0.0006) +[2023-02-26 11:58:20,884][00189] Updated weights for policy 0, policy_version 2910 (0.0007) +[2023-02-26 11:58:21,554][00189] Updated weights for policy 0, policy_version 2920 (0.0007) +[2023-02-26 11:58:22,241][00189] Updated weights for policy 0, policy_version 2930 (0.0006) +[2023-02-26 11:58:22,936][00189] Updated weights for policy 0, policy_version 2940 (0.0006) +[2023-02-26 11:58:23,614][00189] Updated weights for policy 0, policy_version 2950 (0.0006) +[2023-02-26 11:58:23,656][00001] Fps is (10 sec: 60212.1, 60 sec: 60211.2, 300 sec: 58942.4). Total num frames: 12083200. Throughput: 0: 15037.1. Samples: 3006376. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 11:58:23,656][00001] Avg episode reward: [(0, '27.914')] +[2023-02-26 11:58:23,659][00141] Saving new best policy, reward=27.914! +[2023-02-26 11:58:24,302][00189] Updated weights for policy 0, policy_version 2960 (0.0007) +[2023-02-26 11:58:25,000][00189] Updated weights for policy 0, policy_version 2970 (0.0006) +[2023-02-26 11:58:25,651][00189] Updated weights for policy 0, policy_version 2980 (0.0006) +[2023-02-26 11:58:26,331][00189] Updated weights for policy 0, policy_version 2990 (0.0006) +[2023-02-26 11:58:27,018][00189] Updated weights for policy 0, policy_version 3000 (0.0006) +[2023-02-26 11:58:27,696][00189] Updated weights for policy 0, policy_version 3010 (0.0006) +[2023-02-26 11:58:28,372][00189] Updated weights for policy 0, policy_version 3020 (0.0006) +[2023-02-26 11:58:28,656][00001] Fps is (10 sec: 60211.1, 60 sec: 60211.2, 300 sec: 58982.4). Total num frames: 12386304. Throughput: 0: 15051.9. Samples: 3096640. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 11:58:28,656][00001] Avg episode reward: [(0, '25.761')] +[2023-02-26 11:58:29,087][00189] Updated weights for policy 0, policy_version 3030 (0.0006) +[2023-02-26 11:58:29,732][00189] Updated weights for policy 0, policy_version 3040 (0.0006) +[2023-02-26 11:58:30,397][00189] Updated weights for policy 0, policy_version 3050 (0.0007) +[2023-02-26 11:58:31,106][00189] Updated weights for policy 0, policy_version 3060 (0.0007) +[2023-02-26 11:58:31,791][00189] Updated weights for policy 0, policy_version 3070 (0.0006) +[2023-02-26 11:58:32,473][00189] Updated weights for policy 0, policy_version 3080 (0.0006) +[2023-02-26 11:58:33,150][00189] Updated weights for policy 0, policy_version 3090 (0.0006) +[2023-02-26 11:58:33,656][00001] Fps is (10 sec: 60211.1, 60 sec: 60142.9, 300 sec: 59001.4). Total num frames: 12685312. Throughput: 0: 15053.2. Samples: 3141696. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 11:58:33,656][00001] Avg episode reward: [(0, '24.097')] +[2023-02-26 11:58:33,801][00189] Updated weights for policy 0, policy_version 3100 (0.0006) +[2023-02-26 11:58:34,516][00189] Updated weights for policy 0, policy_version 3110 (0.0006) +[2023-02-26 11:58:35,180][00189] Updated weights for policy 0, policy_version 3120 (0.0006) +[2023-02-26 11:58:35,876][00189] Updated weights for policy 0, policy_version 3130 (0.0006) +[2023-02-26 11:58:36,560][00189] Updated weights for policy 0, policy_version 3140 (0.0006) +[2023-02-26 11:58:37,230][00189] Updated weights for policy 0, policy_version 3150 (0.0006) +[2023-02-26 11:58:37,909][00189] Updated weights for policy 0, policy_version 3160 (0.0006) +[2023-02-26 11:58:38,601][00189] Updated weights for policy 0, policy_version 3170 (0.0006) +[2023-02-26 11:58:38,656][00001] Fps is (10 sec: 59801.2, 60 sec: 60074.6, 300 sec: 59019.6). Total num frames: 12984320. Throughput: 0: 15058.1. Samples: 3231752. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 11:58:38,656][00001] Avg episode reward: [(0, '23.588')] +[2023-02-26 11:58:39,308][00189] Updated weights for policy 0, policy_version 3180 (0.0006) +[2023-02-26 11:58:39,995][00189] Updated weights for policy 0, policy_version 3190 (0.0006) +[2023-02-26 11:58:40,699][00189] Updated weights for policy 0, policy_version 3200 (0.0006) +[2023-02-26 11:58:41,364][00189] Updated weights for policy 0, policy_version 3210 (0.0006) +[2023-02-26 11:58:42,089][00189] Updated weights for policy 0, policy_version 3220 (0.0007) +[2023-02-26 11:58:42,831][00189] Updated weights for policy 0, policy_version 3230 (0.0006) +[2023-02-26 11:58:43,530][00189] Updated weights for policy 0, policy_version 3240 (0.0007) +[2023-02-26 11:58:43,656][00001] Fps is (10 sec: 58982.5, 60 sec: 60006.4, 300 sec: 59000.6). Total num frames: 13275136. Throughput: 0: 15015.5. Samples: 3319736. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) +[2023-02-26 11:58:43,656][00001] Avg episode reward: [(0, '26.011')] +[2023-02-26 11:58:44,241][00189] Updated weights for policy 0, policy_version 3250 (0.0007) +[2023-02-26 11:58:44,981][00189] Updated weights for policy 0, policy_version 3260 (0.0006) +[2023-02-26 11:58:45,624][00189] Updated weights for policy 0, policy_version 3270 (0.0006) +[2023-02-26 11:58:46,300][00189] Updated weights for policy 0, policy_version 3280 (0.0007) +[2023-02-26 11:58:47,006][00189] Updated weights for policy 0, policy_version 3290 (0.0006) +[2023-02-26 11:58:47,675][00189] Updated weights for policy 0, policy_version 3300 (0.0007) +[2023-02-26 11:58:48,343][00189] Updated weights for policy 0, policy_version 3310 (0.0007) +[2023-02-26 11:58:48,656][00001] Fps is (10 sec: 58573.2, 60 sec: 59938.1, 300 sec: 59000.2). Total num frames: 13570048. Throughput: 0: 14978.3. Samples: 3363880. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) +[2023-02-26 11:58:48,656][00001] Avg episode reward: [(0, '27.406')] +[2023-02-26 11:58:49,081][00189] Updated weights for policy 0, policy_version 3320 (0.0007) +[2023-02-26 11:58:49,777][00189] Updated weights for policy 0, policy_version 3330 (0.0007) +[2023-02-26 11:58:50,433][00189] Updated weights for policy 0, policy_version 3340 (0.0006) +[2023-02-26 11:58:51,151][00189] Updated weights for policy 0, policy_version 3350 (0.0006) +[2023-02-26 11:58:51,882][00189] Updated weights for policy 0, policy_version 3360 (0.0007) +[2023-02-26 11:58:52,610][00189] Updated weights for policy 0, policy_version 3370 (0.0007) +[2023-02-26 11:58:53,331][00189] Updated weights for policy 0, policy_version 3380 (0.0007) +[2023-02-26 11:58:53,656][00001] Fps is (10 sec: 58572.6, 60 sec: 59733.2, 300 sec: 58982.4). Total num frames: 13860864. Throughput: 0: 14912.2. Samples: 3451440. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 11:58:53,656][00001] Avg episode reward: [(0, '25.216')] +[2023-02-26 11:58:53,664][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000003385_13864960.pth... +[2023-02-26 11:58:54,075][00189] Updated weights for policy 0, policy_version 3390 (0.0006) +[2023-02-26 11:58:54,753][00189] Updated weights for policy 0, policy_version 3400 (0.0006) +[2023-02-26 11:58:55,482][00189] Updated weights for policy 0, policy_version 3410 (0.0006) +[2023-02-26 11:58:56,180][00189] Updated weights for policy 0, policy_version 3420 (0.0006) +[2023-02-26 11:58:56,845][00189] Updated weights for policy 0, policy_version 3430 (0.0006) +[2023-02-26 11:58:57,547][00189] Updated weights for policy 0, policy_version 3440 (0.0006) +[2023-02-26 11:58:58,278][00189] Updated weights for policy 0, policy_version 3450 (0.0007) +[2023-02-26 11:58:58,656][00001] Fps is (10 sec: 58162.8, 60 sec: 59596.8, 300 sec: 58965.3). Total num frames: 14151680. Throughput: 0: 14843.0. Samples: 3538740. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 11:58:58,656][00001] Avg episode reward: [(0, '27.614')] +[2023-02-26 11:58:58,934][00189] Updated weights for policy 0, policy_version 3460 (0.0006) +[2023-02-26 11:58:59,620][00189] Updated weights for policy 0, policy_version 3470 (0.0006) +[2023-02-26 11:59:00,349][00189] Updated weights for policy 0, policy_version 3480 (0.0006) +[2023-02-26 11:59:01,016][00189] Updated weights for policy 0, policy_version 3490 (0.0006) +[2023-02-26 11:59:01,705][00189] Updated weights for policy 0, policy_version 3500 (0.0006) +[2023-02-26 11:59:02,417][00189] Updated weights for policy 0, policy_version 3510 (0.0006) +[2023-02-26 11:59:03,107][00189] Updated weights for policy 0, policy_version 3520 (0.0007) +[2023-02-26 11:59:03,656][00001] Fps is (10 sec: 58982.6, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 14450688. Throughput: 0: 14818.3. Samples: 3583016. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 11:59:03,656][00001] Avg episode reward: [(0, '24.845')] +[2023-02-26 11:59:03,803][00189] Updated weights for policy 0, policy_version 3530 (0.0006) +[2023-02-26 11:59:04,481][00189] Updated weights for policy 0, policy_version 3540 (0.0006) +[2023-02-26 11:59:05,189][00189] Updated weights for policy 0, policy_version 3550 (0.0007) +[2023-02-26 11:59:05,885][00189] Updated weights for policy 0, policy_version 3560 (0.0006) +[2023-02-26 11:59:06,558][00189] Updated weights for policy 0, policy_version 3570 (0.0006) +[2023-02-26 11:59:07,304][00189] Updated weights for policy 0, policy_version 3580 (0.0007) +[2023-02-26 11:59:07,971][00189] Updated weights for policy 0, policy_version 3590 (0.0006) +[2023-02-26 11:59:08,630][00189] Updated weights for policy 0, policy_version 3600 (0.0006) +[2023-02-26 11:59:08,656][00001] Fps is (10 sec: 59391.9, 60 sec: 59460.2, 300 sec: 58982.4). Total num frames: 14745600. Throughput: 0: 14779.3. Samples: 3671444. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 11:59:08,656][00001] Avg episode reward: [(0, '25.454')] +[2023-02-26 11:59:09,371][00189] Updated weights for policy 0, policy_version 3610 (0.0007) +[2023-02-26 11:59:10,039][00189] Updated weights for policy 0, policy_version 3620 (0.0006) +[2023-02-26 11:59:10,731][00189] Updated weights for policy 0, policy_version 3630 (0.0006) +[2023-02-26 11:59:11,438][00189] Updated weights for policy 0, policy_version 3640 (0.0007) +[2023-02-26 11:59:12,090][00189] Updated weights for policy 0, policy_version 3650 (0.0006) +[2023-02-26 11:59:12,810][00189] Updated weights for policy 0, policy_version 3660 (0.0007) +[2023-02-26 11:59:13,521][00189] Updated weights for policy 0, policy_version 3670 (0.0006) +[2023-02-26 11:59:13,656][00001] Fps is (10 sec: 58572.7, 60 sec: 59255.6, 300 sec: 58966.3). Total num frames: 15036416. Throughput: 0: 14745.8. Samples: 3760200. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 11:59:13,656][00001] Avg episode reward: [(0, '24.650')]
+[2023-02-26 11:59:14,180][00189] Updated weights for policy 0, policy_version 3680 (0.0006)
+[2023-02-26 11:59:14,927][00189] Updated weights for policy 0, policy_version 3690 (0.0006)
+[2023-02-26 11:59:15,578][00189] Updated weights for policy 0, policy_version 3700 (0.0007)
+[2023-02-26 11:59:16,287][00189] Updated weights for policy 0, policy_version 3710 (0.0006)
+[2023-02-26 11:59:16,988][00189] Updated weights for policy 0, policy_version 3720 (0.0006)
+[2023-02-26 11:59:17,657][00189] Updated weights for policy 0, policy_version 3730 (0.0007)
+[2023-02-26 11:59:18,373][00189] Updated weights for policy 0, policy_version 3740 (0.0006)
+[2023-02-26 11:59:18,656][00001] Fps is (10 sec: 58982.5, 60 sec: 59187.1, 300 sec: 58982.4). Total num frames: 15335424. Throughput: 0: 14732.8. Samples: 3804672. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 11:59:18,656][00001] Avg episode reward: [(0, '22.615')]
+[2023-02-26 11:59:19,042][00189] Updated weights for policy 0, policy_version 3750 (0.0006)
+[2023-02-26 11:59:19,730][00189] Updated weights for policy 0, policy_version 3760 (0.0006)
+[2023-02-26 11:59:20,441][00189] Updated weights for policy 0, policy_version 3770 (0.0006)
+[2023-02-26 11:59:21,129][00189] Updated weights for policy 0, policy_version 3780 (0.0006)
+[2023-02-26 11:59:21,811][00189] Updated weights for policy 0, policy_version 3790 (0.0006)
+[2023-02-26 11:59:22,475][00189] Updated weights for policy 0, policy_version 3800 (0.0006)
+[2023-02-26 11:59:23,205][00189] Updated weights for policy 0, policy_version 3810 (0.0006)
+[2023-02-26 11:59:23,656][00001] Fps is (10 sec: 59391.7, 60 sec: 59118.9, 300 sec: 58982.4). Total num frames: 15630336. Throughput: 0: 14701.6. Samples: 3893324. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 11:59:23,656][00001] Avg episode reward: [(0, '23.473')]
+[2023-02-26 11:59:23,693][00141] Signal inference workers to stop experience collection... (100 times)
+[2023-02-26 11:59:23,694][00141] Signal inference workers to resume experience collection... (100 times)
+[2023-02-26 11:59:23,697][00189] InferenceWorker_p0-w0: stopping experience collection (100 times)
+[2023-02-26 11:59:23,701][00189] InferenceWorker_p0-w0: resuming experience collection (100 times)
+[2023-02-26 11:59:23,891][00189] Updated weights for policy 0, policy_version 3820 (0.0006)
+[2023-02-26 11:59:24,548][00189] Updated weights for policy 0, policy_version 3830 (0.0007)
+[2023-02-26 11:59:25,301][00189] Updated weights for policy 0, policy_version 3840 (0.0007)
+[2023-02-26 11:59:25,985][00189] Updated weights for policy 0, policy_version 3850 (0.0006)
+[2023-02-26 11:59:26,647][00189] Updated weights for policy 0, policy_version 3860 (0.0007)
+[2023-02-26 11:59:27,386][00189] Updated weights for policy 0, policy_version 3870 (0.0007)
+[2023-02-26 11:59:28,063][00189] Updated weights for policy 0, policy_version 3880 (0.0007)
+[2023-02-26 11:59:28,656][00001] Fps is (10 sec: 58981.8, 60 sec: 58982.2, 300 sec: 58982.4). Total num frames: 15925248. Throughput: 0: 14714.2. Samples: 3981876. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 11:59:28,656][00001] Avg episode reward: [(0, '26.868')]
+[2023-02-26 11:59:28,733][00189] Updated weights for policy 0, policy_version 3890 (0.0006)
+[2023-02-26 11:59:29,456][00189] Updated weights for policy 0, policy_version 3900 (0.0006)
+[2023-02-26 11:59:30,139][00189] Updated weights for policy 0, policy_version 3910 (0.0007)
+[2023-02-26 11:59:30,820][00189] Updated weights for policy 0, policy_version 3920 (0.0006)
+[2023-02-26 11:59:31,541][00189] Updated weights for policy 0, policy_version 3930 (0.0007)
+[2023-02-26 11:59:32,208][00189] Updated weights for policy 0, policy_version 3940 (0.0006)
+[2023-02-26 11:59:32,891][00189] Updated weights for policy 0, policy_version 3950 (0.0006)
+[2023-02-26 11:59:33,616][00189] Updated weights for policy 0, policy_version 3960 (0.0006)
+[2023-02-26 11:59:33,656][00001] Fps is (10 sec: 58982.9, 60 sec: 58914.2, 300 sec: 58982.4). Total num frames: 16220160. Throughput: 0: 14720.4. Samples: 4026300. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
+[2023-02-26 11:59:33,656][00001] Avg episode reward: [(0, '28.720')]
+[2023-02-26 11:59:33,659][00141] Saving new best policy, reward=28.720!
+[2023-02-26 11:59:34,289][00189] Updated weights for policy 0, policy_version 3970 (0.0006)
+[2023-02-26 11:59:34,987][00189] Updated weights for policy 0, policy_version 3980 (0.0007)
+[2023-02-26 11:59:35,688][00189] Updated weights for policy 0, policy_version 3990 (0.0006)
+[2023-02-26 11:59:36,388][00189] Updated weights for policy 0, policy_version 4000 (0.0006)
+[2023-02-26 11:59:37,059][00189] Updated weights for policy 0, policy_version 4010 (0.0006)
+[2023-02-26 11:59:37,755][00189] Updated weights for policy 0, policy_version 4020 (0.0006)
+[2023-02-26 11:59:38,486][00189] Updated weights for policy 0, policy_version 4030 (0.0007)
+[2023-02-26 11:59:38,656][00001] Fps is (10 sec: 58983.2, 60 sec: 58845.9, 300 sec: 58982.4). Total num frames: 16515072. Throughput: 0: 14743.0. Samples: 4114872. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 11:59:38,656][00001] Avg episode reward: [(0, '24.766')]
+[2023-02-26 11:59:39,144][00189] Updated weights for policy 0, policy_version 4040 (0.0006)
+[2023-02-26 11:59:39,800][00189] Updated weights for policy 0, policy_version 4050 (0.0007)
+[2023-02-26 11:59:40,541][00189] Updated weights for policy 0, policy_version 4060 (0.0006)
+[2023-02-26 11:59:41,232][00189] Updated weights for policy 0, policy_version 4070 (0.0006)
+[2023-02-26 11:59:41,885][00189] Updated weights for policy 0, policy_version 4080 (0.0006)
+[2023-02-26 11:59:42,654][00189] Updated weights for policy 0, policy_version 4090 (0.0006)
+[2023-02-26 11:59:43,299][00189] Updated weights for policy 0, policy_version 4100 (0.0006)
+[2023-02-26 11:59:43,656][00001] Fps is (10 sec: 58981.9, 60 sec: 58914.1, 300 sec: 58982.4). Total num frames: 16809984. Throughput: 0: 14774.8. Samples: 4203608. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 11:59:43,656][00001] Avg episode reward: [(0, '26.614')]
+[2023-02-26 11:59:43,963][00189] Updated weights for policy 0, policy_version 4110 (0.0006)
+[2023-02-26 11:59:44,703][00189] Updated weights for policy 0, policy_version 4120 (0.0006)
+[2023-02-26 11:59:45,384][00189] Updated weights for policy 0, policy_version 4130 (0.0006)
+[2023-02-26 11:59:46,063][00189] Updated weights for policy 0, policy_version 4140 (0.0007)
+[2023-02-26 11:59:46,756][00189] Updated weights for policy 0, policy_version 4150 (0.0006)
+[2023-02-26 11:59:47,477][00189] Updated weights for policy 0, policy_version 4160 (0.0006)
+[2023-02-26 11:59:48,135][00189] Updated weights for policy 0, policy_version 4170 (0.0006)
+[2023-02-26 11:59:48,656][00001] Fps is (10 sec: 59392.1, 60 sec: 58982.4, 300 sec: 58996.5). Total num frames: 17108992. Throughput: 0: 14774.2. Samples: 4247852. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0)
+[2023-02-26 11:59:48,656][00001] Avg episode reward: [(0, '28.235')]
+[2023-02-26 11:59:48,848][00189] Updated weights for policy 0, policy_version 4180 (0.0006)
+[2023-02-26 11:59:49,542][00189] Updated weights for policy 0, policy_version 4190 (0.0006)
+[2023-02-26 11:59:50,244][00189] Updated weights for policy 0, policy_version 4200 (0.0007)
+[2023-02-26 11:59:50,941][00189] Updated weights for policy 0, policy_version 4210 (0.0006)
+[2023-02-26 11:59:51,656][00189] Updated weights for policy 0, policy_version 4220 (0.0007)
+[2023-02-26 11:59:52,326][00189] Updated weights for policy 0, policy_version 4230 (0.0006)
+[2023-02-26 11:59:53,018][00189] Updated weights for policy 0, policy_version 4240 (0.0007)
+[2023-02-26 11:59:53,656][00001] Fps is (10 sec: 59392.2, 60 sec: 59050.7, 300 sec: 58996.3). Total num frames: 17403904. Throughput: 0: 14773.1. Samples: 4336232. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 11:59:53,656][00001] Avg episode reward: [(0, '27.546')]
+[2023-02-26 11:59:53,761][00189] Updated weights for policy 0, policy_version 4250 (0.0007)
+[2023-02-26 11:59:54,405][00189] Updated weights for policy 0, policy_version 4260 (0.0006)
+[2023-02-26 11:59:55,063][00189] Updated weights for policy 0, policy_version 4270 (0.0007)
+[2023-02-26 11:59:55,825][00189] Updated weights for policy 0, policy_version 4280 (0.0007)
+[2023-02-26 11:59:56,489][00189] Updated weights for policy 0, policy_version 4290 (0.0006)
+[2023-02-26 11:59:57,150][00189] Updated weights for policy 0, policy_version 4300 (0.0007)
+[2023-02-26 11:59:57,892][00189] Updated weights for policy 0, policy_version 4310 (0.0006)
+[2023-02-26 11:59:58,574][00189] Updated weights for policy 0, policy_version 4320 (0.0007)
+[2023-02-26 11:59:58,656][00001] Fps is (10 sec: 58982.0, 60 sec: 59118.9, 300 sec: 59746.1). Total num frames: 17698816. Throughput: 0: 14774.3. Samples: 4425044. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 11:59:58,656][00001] Avg episode reward: [(0, '24.978')]
+[2023-02-26 11:59:59,233][00189] Updated weights for policy 0, policy_version 4330 (0.0006)
+[2023-02-26 11:59:59,962][00189] Updated weights for policy 0, policy_version 4340 (0.0006)
+[2023-02-26 12:00:00,677][00189] Updated weights for policy 0, policy_version 4350 (0.0006)
+[2023-02-26 12:00:01,328][00189] Updated weights for policy 0, policy_version 4360 (0.0007)
+[2023-02-26 12:00:02,015][00189] Updated weights for policy 0, policy_version 4370 (0.0007)
+[2023-02-26 12:00:02,737][00189] Updated weights for policy 0, policy_version 4380 (0.0006)
+[2023-02-26 12:00:03,443][00189] Updated weights for policy 0, policy_version 4390 (0.0007)
+[2023-02-26 12:00:03,656][00001] Fps is (10 sec: 58983.1, 60 sec: 59050.7, 300 sec: 59746.1). Total num frames: 17993728. Throughput: 0: 14766.6. Samples: 4469168. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:00:03,656][00001] Avg episode reward: [(0, '26.497')]
+[2023-02-26 12:00:04,091][00189] Updated weights for policy 0, policy_version 4400 (0.0006)
+[2023-02-26 12:00:04,819][00189] Updated weights for policy 0, policy_version 4410 (0.0007)
+[2023-02-26 12:00:05,534][00189] Updated weights for policy 0, policy_version 4420 (0.0007)
+[2023-02-26 12:00:06,157][00141] Signal inference workers to stop experience collection... (150 times)
+[2023-02-26 12:00:06,158][00141] Signal inference workers to resume experience collection... (150 times)
+[2023-02-26 12:00:06,165][00189] InferenceWorker_p0-w0: stopping experience collection (150 times)
+[2023-02-26 12:00:06,165][00189] InferenceWorker_p0-w0: resuming experience collection (150 times)
+[2023-02-26 12:00:06,176][00189] Updated weights for policy 0, policy_version 4430 (0.0007)
+[2023-02-26 12:00:06,907][00189] Updated weights for policy 0, policy_version 4440 (0.0006)
+[2023-02-26 12:00:07,619][00189] Updated weights for policy 0, policy_version 4450 (0.0006)
+[2023-02-26 12:00:08,275][00189] Updated weights for policy 0, policy_version 4460 (0.0007)
+[2023-02-26 12:00:08,656][00001] Fps is (10 sec: 58572.7, 60 sec: 58982.4, 300 sec: 59704.4). Total num frames: 18284544. Throughput: 0: 14760.4. Samples: 4557540. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:00:08,656][00001] Avg episode reward: [(0, '24.744')]
+[2023-02-26 12:00:09,023][00189] Updated weights for policy 0, policy_version 4470 (0.0006)
+[2023-02-26 12:00:09,699][00189] Updated weights for policy 0, policy_version 4480 (0.0007)
+[2023-02-26 12:00:10,373][00189] Updated weights for policy 0, policy_version 4490 (0.0006)
+[2023-02-26 12:00:11,120][00189] Updated weights for policy 0, policy_version 4500 (0.0007)
+[2023-02-26 12:00:11,768][00189] Updated weights for policy 0, policy_version 4510 (0.0006)
+[2023-02-26 12:00:12,454][00189] Updated weights for policy 0, policy_version 4520 (0.0007)
+[2023-02-26 12:00:13,195][00189] Updated weights for policy 0, policy_version 4530 (0.0006)
+[2023-02-26 12:00:13,656][00001] Fps is (10 sec: 58981.9, 60 sec: 59118.9, 300 sec: 59718.3). Total num frames: 18583552. Throughput: 0: 14756.8. Samples: 4645932. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0)
+[2023-02-26 12:00:13,656][00001] Avg episode reward: [(0, '26.036')]
+[2023-02-26 12:00:13,851][00189] Updated weights for policy 0, policy_version 4540 (0.0006)
+[2023-02-26 12:00:14,551][00189] Updated weights for policy 0, policy_version 4550 (0.0007)
+[2023-02-26 12:00:15,242][00189] Updated weights for policy 0, policy_version 4560 (0.0006)
+[2023-02-26 12:00:15,941][00189] Updated weights for policy 0, policy_version 4570 (0.0006)
+[2023-02-26 12:00:16,630][00189] Updated weights for policy 0, policy_version 4580 (0.0007)
+[2023-02-26 12:00:17,288][00189] Updated weights for policy 0, policy_version 4590 (0.0007)
+[2023-02-26 12:00:18,038][00189] Updated weights for policy 0, policy_version 4600 (0.0007)
+[2023-02-26 12:00:18,656][00001] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 59676.6). Total num frames: 18874368. Throughput: 0: 14758.9. Samples: 4690452. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:00:18,656][00001] Avg episode reward: [(0, '26.827')]
+[2023-02-26 12:00:18,717][00189] Updated weights for policy 0, policy_version 4610 (0.0007)
+[2023-02-26 12:00:19,375][00189] Updated weights for policy 0, policy_version 4620 (0.0006)
+[2023-02-26 12:00:20,121][00189] Updated weights for policy 0, policy_version 4630 (0.0007)
+[2023-02-26 12:00:20,794][00189] Updated weights for policy 0, policy_version 4640 (0.0007)
+[2023-02-26 12:00:21,474][00189] Updated weights for policy 0, policy_version 4650 (0.0007)
+[2023-02-26 12:00:22,192][00189] Updated weights for policy 0, policy_version 4660 (0.0006)
+[2023-02-26 12:00:22,886][00189] Updated weights for policy 0, policy_version 4670 (0.0006)
+[2023-02-26 12:00:23,537][00189] Updated weights for policy 0, policy_version 4680 (0.0006)
+[2023-02-26 12:00:23,656][00001] Fps is (10 sec: 58982.9, 60 sec: 59050.8, 300 sec: 59690.5). Total num frames: 19173376. Throughput: 0: 14754.4. Samples: 4778820. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:00:23,656][00001] Avg episode reward: [(0, '28.208')]
+[2023-02-26 12:00:24,283][00189] Updated weights for policy 0, policy_version 4690 (0.0007)
+[2023-02-26 12:00:24,962][00189] Updated weights for policy 0, policy_version 4700 (0.0007)
+[2023-02-26 12:00:25,615][00189] Updated weights for policy 0, policy_version 4710 (0.0006)
+[2023-02-26 12:00:26,376][00189] Updated weights for policy 0, policy_version 4720 (0.0006)
+[2023-02-26 12:00:27,052][00189] Updated weights for policy 0, policy_version 4730 (0.0007)
+[2023-02-26 12:00:27,688][00189] Updated weights for policy 0, policy_version 4740 (0.0007)
+[2023-02-26 12:00:28,424][00189] Updated weights for policy 0, policy_version 4750 (0.0006)
+[2023-02-26 12:00:28,656][00001] Fps is (10 sec: 59391.6, 60 sec: 59050.7, 300 sec: 59662.8). Total num frames: 19468288. Throughput: 0: 14752.3. Samples: 4867460. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:00:28,656][00001] Avg episode reward: [(0, '26.990')]
+[2023-02-26 12:00:29,104][00189] Updated weights for policy 0, policy_version 4760 (0.0006)
+[2023-02-26 12:00:29,781][00189] Updated weights for policy 0, policy_version 4770 (0.0007)
+[2023-02-26 12:00:30,515][00189] Updated weights for policy 0, policy_version 4780 (0.0007)
+[2023-02-26 12:00:31,196][00189] Updated weights for policy 0, policy_version 4790 (0.0006)
+[2023-02-26 12:00:31,889][00189] Updated weights for policy 0, policy_version 4800 (0.0006)
+[2023-02-26 12:00:32,582][00189] Updated weights for policy 0, policy_version 4810 (0.0006)
+[2023-02-26 12:00:33,243][00189] Updated weights for policy 0, policy_version 4820 (0.0006)
+[2023-02-26 12:00:33,656][00001] Fps is (10 sec: 59391.3, 60 sec: 59118.9, 300 sec: 59676.6). Total num frames: 19767296. Throughput: 0: 14755.9. Samples: 4911868. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:00:33,656][00001] Avg episode reward: [(0, '27.637')]
+[2023-02-26 12:00:33,963][00189] Updated weights for policy 0, policy_version 4830 (0.0006)
+[2023-02-26 12:00:34,643][00189] Updated weights for policy 0, policy_version 4840 (0.0006)
+[2023-02-26 12:00:35,305][00189] Updated weights for policy 0, policy_version 4850 (0.0006)
+[2023-02-26 12:00:36,046][00189] Updated weights for policy 0, policy_version 4860 (0.0006)
+[2023-02-26 12:00:36,698][00189] Updated weights for policy 0, policy_version 4870 (0.0007)
+[2023-02-26 12:00:37,376][00189] Updated weights for policy 0, policy_version 4880 (0.0006)
+[2023-02-26 12:00:38,103][00189] Updated weights for policy 0, policy_version 4890 (0.0006)
+[2023-02-26 12:00:38,656][00001] Fps is (10 sec: 59392.4, 60 sec: 59118.9, 300 sec: 59662.8). Total num frames: 20062208. Throughput: 0: 14767.9. Samples: 5000784. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0)
+[2023-02-26 12:00:38,656][00001] Avg episode reward: [(0, '24.093')]
+[2023-02-26 12:00:38,757][00189] Updated weights for policy 0, policy_version 4900 (0.0006)
+[2023-02-26 12:00:39,472][00189] Updated weights for policy 0, policy_version 4910 (0.0006)
+[2023-02-26 12:00:40,178][00189] Updated weights for policy 0, policy_version 4920 (0.0007)
+[2023-02-26 12:00:40,851][00189] Updated weights for policy 0, policy_version 4930 (0.0006)
+[2023-02-26 12:00:41,559][00189] Updated weights for policy 0, policy_version 4940 (0.0006)
+[2023-02-26 12:00:42,252][00189] Updated weights for policy 0, policy_version 4950 (0.0006)
+[2023-02-26 12:00:42,929][00189] Updated weights for policy 0, policy_version 4960 (0.0006)
+[2023-02-26 12:00:43,618][00189] Updated weights for policy 0, policy_version 4970 (0.0006)
+[2023-02-26 12:00:43,656][00001] Fps is (10 sec: 58982.7, 60 sec: 59119.0, 300 sec: 59648.9). Total num frames: 20357120. Throughput: 0: 14769.5. Samples: 5089672. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:00:43,656][00001] Avg episode reward: [(0, '28.764')]
+[2023-02-26 12:00:43,659][00141] Saving new best policy, reward=28.764!
+[2023-02-26 12:00:44,321][00189] Updated weights for policy 0, policy_version 4980 (0.0006)
+[2023-02-26 12:00:45,003][00189] Updated weights for policy 0, policy_version 4990 (0.0007)
+[2023-02-26 12:00:45,704][00189] Updated weights for policy 0, policy_version 5000 (0.0006)
+[2023-02-26 12:00:46,408][00189] Updated weights for policy 0, policy_version 5010 (0.0007)
+[2023-02-26 12:00:47,095][00189] Updated weights for policy 0, policy_version 5020 (0.0006)
+[2023-02-26 12:00:47,792][00189] Updated weights for policy 0, policy_version 5030 (0.0006)
+[2023-02-26 12:00:48,482][00189] Updated weights for policy 0, policy_version 5040 (0.0006)
+[2023-02-26 12:00:48,656][00001] Fps is (10 sec: 58982.2, 60 sec: 59050.6, 300 sec: 59635.0). Total num frames: 20652032. Throughput: 0: 14773.0. Samples: 5133952. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:00:48,656][00001] Avg episode reward: [(0, '26.614')]
+[2023-02-26 12:00:49,189][00189] Updated weights for policy 0, policy_version 5050 (0.0006)
+[2023-02-26 12:00:49,868][00189] Updated weights for policy 0, policy_version 5060 (0.0006)
+[2023-02-26 12:00:50,545][00189] Updated weights for policy 0, policy_version 5070 (0.0006)
+[2023-02-26 12:00:51,265][00189] Updated weights for policy 0, policy_version 5080 (0.0006)
+[2023-02-26 12:00:51,945][00189] Updated weights for policy 0, policy_version 5090 (0.0007)
+[2023-02-26 12:00:52,646][00189] Updated weights for policy 0, policy_version 5100 (0.0006)
+[2023-02-26 12:00:53,322][00189] Updated weights for policy 0, policy_version 5110 (0.0006)
+[2023-02-26 12:00:53,656][00001] Fps is (10 sec: 58982.0, 60 sec: 59050.7, 300 sec: 59607.2). Total num frames: 20946944. Throughput: 0: 14779.6. Samples: 5222624. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:00:53,656][00001] Avg episode reward: [(0, '26.080')]
+[2023-02-26 12:00:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000005114_20946944.pth...
+[2023-02-26 12:00:53,699][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth
+[2023-02-26 12:00:54,024][00189] Updated weights for policy 0, policy_version 5120 (0.0007)
+[2023-02-26 12:00:54,737][00189] Updated weights for policy 0, policy_version 5130 (0.0006)
+[2023-02-26 12:00:55,425][00189] Updated weights for policy 0, policy_version 5140 (0.0006)
+[2023-02-26 12:00:56,114][00189] Updated weights for policy 0, policy_version 5150 (0.0006)
+[2023-02-26 12:00:56,825][00189] Updated weights for policy 0, policy_version 5160 (0.0007)
+[2023-02-26 12:00:57,521][00189] Updated weights for policy 0, policy_version 5170 (0.0006)
+[2023-02-26 12:00:58,174][00189] Updated weights for policy 0, policy_version 5180 (0.0007)
+[2023-02-26 12:00:58,656][00001] Fps is (10 sec: 58982.4, 60 sec: 59050.7, 300 sec: 59579.4). Total num frames: 21241856. Throughput: 0: 14778.5. Samples: 5310964. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:00:58,656][00001] Avg episode reward: [(0, '25.736')]
+[2023-02-26 12:00:58,912][00189] Updated weights for policy 0, policy_version 5190 (0.0007)
+[2023-02-26 12:00:59,578][00189] Updated weights for policy 0, policy_version 5200 (0.0006)
+[2023-02-26 12:01:00,265][00189] Updated weights for policy 0, policy_version 5210 (0.0007)
+[2023-02-26 12:01:00,983][00189] Updated weights for policy 0, policy_version 5220 (0.0007)
+[2023-02-26 12:01:01,659][00189] Updated weights for policy 0, policy_version 5230 (0.0006)
+[2023-02-26 12:01:02,339][00189] Updated weights for policy 0, policy_version 5240 (0.0006)
+[2023-02-26 12:01:03,064][00189] Updated weights for policy 0, policy_version 5250 (0.0006)
+[2023-02-26 12:01:03,656][00001] Fps is (10 sec: 58982.5, 60 sec: 59050.5, 300 sec: 59565.6). Total num frames: 21536768. Throughput: 0: 14775.2. Samples: 5355336. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:01:03,656][00001] Avg episode reward: [(0, '27.365')]
+[2023-02-26 12:01:03,726][00189] Updated weights for policy 0, policy_version 5260 (0.0006)
+[2023-02-26 12:01:04,422][00189] Updated weights for policy 0, policy_version 5270 (0.0006)
+[2023-02-26 12:01:05,130][00189] Updated weights for policy 0, policy_version 5280 (0.0007)
+[2023-02-26 12:01:05,812][00189] Updated weights for policy 0, policy_version 5290 (0.0006)
+[2023-02-26 12:01:06,542][00189] Updated weights for policy 0, policy_version 5300 (0.0006)
+[2023-02-26 12:01:07,218][00189] Updated weights for policy 0, policy_version 5310 (0.0006)
+[2023-02-26 12:01:07,913][00189] Updated weights for policy 0, policy_version 5320 (0.0007)
+[2023-02-26 12:01:08,622][00189] Updated weights for policy 0, policy_version 5330 (0.0006)
+[2023-02-26 12:01:08,656][00001] Fps is (10 sec: 58982.4, 60 sec: 59119.0, 300 sec: 59537.8). Total num frames: 21831680. Throughput: 0: 14775.4. Samples: 5443712. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:01:08,656][00001] Avg episode reward: [(0, '25.806')]
+[2023-02-26 12:01:09,313][00189] Updated weights for policy 0, policy_version 5340 (0.0006)
+[2023-02-26 12:01:09,989][00189] Updated weights for policy 0, policy_version 5350 (0.0006)
+[2023-02-26 12:01:10,703][00189] Updated weights for policy 0, policy_version 5360 (0.0006)
+[2023-02-26 12:01:11,391][00189] Updated weights for policy 0, policy_version 5370 (0.0006)
+[2023-02-26 12:01:12,095][00189] Updated weights for policy 0, policy_version 5380 (0.0007)
+[2023-02-26 12:01:12,792][00189] Updated weights for policy 0, policy_version 5390 (0.0006)
+[2023-02-26 12:01:13,477][00189] Updated weights for policy 0, policy_version 5400 (0.0007)
+[2023-02-26 12:01:13,656][00001] Fps is (10 sec: 58982.1, 60 sec: 59050.6, 300 sec: 59537.7). Total num frames: 22126592. Throughput: 0: 14768.8. Samples: 5532056. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:01:13,656][00001] Avg episode reward: [(0, '30.324')]
+[2023-02-26 12:01:13,660][00141] Saving new best policy, reward=30.324!
+[2023-02-26 12:01:14,176][00189] Updated weights for policy 0, policy_version 5410 (0.0006)
+[2023-02-26 12:01:14,888][00189] Updated weights for policy 0, policy_version 5420 (0.0007)
+[2023-02-26 12:01:15,557][00189] Updated weights for policy 0, policy_version 5430 (0.0006)
+[2023-02-26 12:01:16,249][00189] Updated weights for policy 0, policy_version 5440 (0.0006)
+[2023-02-26 12:01:16,972][00189] Updated weights for policy 0, policy_version 5450 (0.0006)
+[2023-02-26 12:01:17,636][00189] Updated weights for policy 0, policy_version 5460 (0.0007)
+[2023-02-26 12:01:18,329][00189] Updated weights for policy 0, policy_version 5470 (0.0006)
+[2023-02-26 12:01:18,656][00001] Fps is (10 sec: 58982.3, 60 sec: 59118.9, 300 sec: 59523.9). Total num frames: 22421504. Throughput: 0: 14766.7. Samples: 5576368. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0)
+[2023-02-26 12:01:18,656][00001] Avg episode reward: [(0, '30.047')]
+[2023-02-26 12:01:19,040][00189] Updated weights for policy 0, policy_version 5480 (0.0006)
+[2023-02-26 12:01:19,712][00189] Updated weights for policy 0, policy_version 5490 (0.0006)
+[2023-02-26 12:01:20,433][00189] Updated weights for policy 0, policy_version 5500 (0.0007)
+[2023-02-26 12:01:21,106][00189] Updated weights for policy 0, policy_version 5510 (0.0006)
+[2023-02-26 12:01:21,798][00189] Updated weights for policy 0, policy_version 5520 (0.0006)
+[2023-02-26 12:01:22,509][00189] Updated weights for policy 0, policy_version 5530 (0.0006)
+[2023-02-26 12:01:23,182][00189] Updated weights for policy 0, policy_version 5540 (0.0006)
+[2023-02-26 12:01:23,656][00001] Fps is (10 sec: 59392.7, 60 sec: 59118.9, 300 sec: 59537.8). Total num frames: 22720512. Throughput: 0: 14764.3. Samples: 5665176. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:01:23,656][00001] Avg episode reward: [(0, '28.359')]
+[2023-02-26 12:01:23,860][00189] Updated weights for policy 0, policy_version 5550 (0.0006)
+[2023-02-26 12:01:24,579][00189] Updated weights for policy 0, policy_version 5560 (0.0006)
+[2023-02-26 12:01:25,246][00189] Updated weights for policy 0, policy_version 5570 (0.0006)
+[2023-02-26 12:01:25,981][00189] Updated weights for policy 0, policy_version 5580 (0.0006)
+[2023-02-26 12:01:26,650][00189] Updated weights for policy 0, policy_version 5590 (0.0007)
+[2023-02-26 12:01:27,363][00189] Updated weights for policy 0, policy_version 5600 (0.0007)
+[2023-02-26 12:01:28,117][00189] Updated weights for policy 0, policy_version 5610 (0.0007)
+[2023-02-26 12:01:28,656][00001] Fps is (10 sec: 58982.8, 60 sec: 59050.7, 300 sec: 59482.3). Total num frames: 23011328. Throughput: 0: 14746.3. Samples: 5753256. Policy #0 lag: (min: 1.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:01:28,656][00001] Avg episode reward: [(0, '25.541')]
+[2023-02-26 12:01:28,773][00189] Updated weights for policy 0, policy_version 5620 (0.0006)
+[2023-02-26 12:01:29,449][00189] Updated weights for policy 0, policy_version 5630 (0.0006)
+[2023-02-26 12:01:30,171][00189] Updated weights for policy 0, policy_version 5640 (0.0006)
+[2023-02-26 12:01:30,843][00189] Updated weights for policy 0, policy_version 5650 (0.0007)
+[2023-02-26 12:01:31,521][00189] Updated weights for policy 0, policy_version 5660 (0.0007)
+[2023-02-26 12:01:32,260][00189] Updated weights for policy 0, policy_version 5670 (0.0006)
+[2023-02-26 12:01:32,911][00189] Updated weights for policy 0, policy_version 5680 (0.0006)
+[2023-02-26 12:01:33,561][00189] Updated weights for policy 0, policy_version 5690 (0.0006)
+[2023-02-26 12:01:33,656][00001] Fps is (10 sec: 58572.1, 60 sec: 58982.3, 300 sec: 59468.4). Total num frames: 23306240. Throughput: 0: 14745.5. Samples: 5797500. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:01:33,656][00001] Avg episode reward: [(0, '26.566')]
+[2023-02-26 12:01:33,936][00141] Signal inference workers to stop experience collection... (200 times)
+[2023-02-26 12:01:33,937][00141] Signal inference workers to resume experience collection... (200 times)
+[2023-02-26 12:01:33,944][00189] InferenceWorker_p0-w0: stopping experience collection (200 times)
+[2023-02-26 12:01:33,944][00189] InferenceWorker_p0-w0: resuming experience collection (200 times)
+[2023-02-26 12:01:34,353][00189] Updated weights for policy 0, policy_version 5700 (0.0008)
+[2023-02-26 12:01:34,994][00189] Updated weights for policy 0, policy_version 5710 (0.0006)
+[2023-02-26 12:01:35,652][00189] Updated weights for policy 0, policy_version 5720 (0.0006)
+[2023-02-26 12:01:36,398][00189] Updated weights for policy 0, policy_version 5730 (0.0007)
+[2023-02-26 12:01:37,071][00189] Updated weights for policy 0, policy_version 5740 (0.0006)
+[2023-02-26 12:01:37,765][00189] Updated weights for policy 0, policy_version 5750 (0.0007)
+[2023-02-26 12:01:38,435][00189] Updated weights for policy 0, policy_version 5760 (0.0007)
+[2023-02-26 12:01:38,656][00001] Fps is (10 sec: 59392.1, 60 sec: 59050.7, 300 sec: 59454.5). Total num frames: 23605248. Throughput: 0: 14751.9. Samples: 5886456. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:01:38,656][00001] Avg episode reward: [(0, '25.795')]
+[2023-02-26 12:01:39,161][00189] Updated weights for policy 0, policy_version 5770 (0.0007)
+[2023-02-26 12:01:39,863][00189] Updated weights for policy 0, policy_version 5780 (0.0007)
+[2023-02-26 12:01:40,516][00189] Updated weights for policy 0, policy_version 5790 (0.0007)
+[2023-02-26 12:01:41,252][00189] Updated weights for policy 0, policy_version 5800 (0.0007)
+[2023-02-26 12:01:41,955][00189] Updated weights for policy 0, policy_version 5810 (0.0006)
+[2023-02-26 12:01:42,608][00189] Updated weights for policy 0, policy_version 5820 (0.0007)
+[2023-02-26 12:01:43,361][00189] Updated weights for policy 0, policy_version 5830 (0.0006)
+[2023-02-26 12:01:43,656][00001] Fps is (10 sec: 59392.8, 60 sec: 59050.7, 300 sec: 59440.6). Total num frames: 23900160. Throughput: 0: 14754.9. Samples: 5974932. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:01:43,656][00001] Avg episode reward: [(0, '25.402')]
+[2023-02-26 12:01:44,008][00189] Updated weights for policy 0, policy_version 5840 (0.0007)
+[2023-02-26 12:01:44,692][00189] Updated weights for policy 0, policy_version 5850 (0.0006)
+[2023-02-26 12:01:45,402][00189] Updated weights for policy 0, policy_version 5860 (0.0007)
+[2023-02-26 12:01:46,123][00189] Updated weights for policy 0, policy_version 5870 (0.0007)
+[2023-02-26 12:01:46,769][00189] Updated weights for policy 0, policy_version 5880 (0.0007)
+[2023-02-26 12:01:47,464][00189] Updated weights for policy 0, policy_version 5890 (0.0006)
+[2023-02-26 12:01:48,209][00189] Updated weights for policy 0, policy_version 5900 (0.0006)
+[2023-02-26 12:01:48,656][00001] Fps is (10 sec: 58572.7, 60 sec: 58982.4, 300 sec: 59399.0). Total num frames: 24190976. Throughput: 0: 14755.7. Samples: 6019340. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:01:48,656][00001] Avg episode reward: [(0, '29.136')]
+[2023-02-26 12:01:48,873][00189] Updated weights for policy 0, policy_version 5910 (0.0007)
+[2023-02-26 12:01:49,547][00189] Updated weights for policy 0, policy_version 5920 (0.0006)
+[2023-02-26 12:01:50,296][00189] Updated weights for policy 0, policy_version 5930 (0.0006)
+[2023-02-26 12:01:50,961][00189] Updated weights for policy 0, policy_version 5940 (0.0006)
+[2023-02-26 12:01:51,625][00189] Updated weights for policy 0, policy_version 5950 (0.0006)
+[2023-02-26 12:01:52,369][00189] Updated weights for policy 0, policy_version 5960 (0.0006)
+[2023-02-26 12:01:53,060][00189] Updated weights for policy 0, policy_version 5970 (0.0006)
+[2023-02-26 12:01:53,656][00001] Fps is (10 sec: 58572.6, 60 sec: 58982.5, 300 sec: 59371.2). Total num frames: 24485888. Throughput: 0: 14751.4. Samples: 6107524. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:01:53,656][00001] Avg episode reward: [(0, '30.319')]
+[2023-02-26 12:01:53,719][00189] Updated weights for policy 0, policy_version 5980 (0.0006)
+[2023-02-26 12:01:54,409][00189] Updated weights for policy 0, policy_version 5990 (0.0006)
+[2023-02-26 12:01:55,145][00189] Updated weights for policy 0, policy_version 6000 (0.0006)
+[2023-02-26 12:01:55,811][00189] Updated weights for policy 0, policy_version 6010 (0.0006)
+[2023-02-26 12:01:56,500][00189] Updated weights for policy 0, policy_version 6020 (0.0007)
+[2023-02-26 12:01:57,214][00189] Updated weights for policy 0, policy_version 6030 (0.0006)
+[2023-02-26 12:01:57,853][00189] Updated weights for policy 0, policy_version 6040 (0.0006)
+[2023-02-26 12:01:58,589][00189] Updated weights for policy 0, policy_version 6050 (0.0006)
+[2023-02-26 12:01:58,656][00001] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59343.4). Total num frames: 24780800. Throughput: 0: 14757.7. Samples: 6196148. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:01:58,656][00001] Avg episode reward: [(0, '26.447')]
+[2023-02-26 12:01:59,290][00189] Updated weights for policy 0, policy_version 6060 (0.0007)
+[2023-02-26 12:01:59,950][00189] Updated weights for policy 0, policy_version 6070 (0.0006)
+[2023-02-26 12:02:00,647][00189] Updated weights for policy 0, policy_version 6080 (0.0006)
+[2023-02-26 12:02:01,383][00189] Updated weights for policy 0, policy_version 6090 (0.0006)
+[2023-02-26 12:02:02,046][00189] Updated weights for policy 0, policy_version 6100 (0.0007)
+[2023-02-26 12:02:02,729][00189] Updated weights for policy 0, policy_version 6110 (0.0006)
+[2023-02-26 12:02:03,468][00189] Updated weights for policy 0, policy_version 6120 (0.0007)
+[2023-02-26 12:02:03,656][00001] Fps is (10 sec: 58981.5, 60 sec: 58982.3, 300 sec: 59329.5). Total num frames: 25075712. Throughput: 0: 14759.3. Samples: 6240536. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:02:03,656][00001] Avg episode reward: [(0, '27.478')]
+[2023-02-26 12:02:04,128][00189] Updated weights for policy 0, policy_version 6130 (0.0006)
+[2023-02-26 12:02:04,788][00189] Updated weights for policy 0, policy_version 6140 (0.0007)
+[2023-02-26 12:02:05,548][00189] Updated weights for policy 0, policy_version 6150 (0.0006)
+[2023-02-26 12:02:06,207][00189] Updated weights for policy 0, policy_version 6160 (0.0006)
+[2023-02-26 12:02:06,890][00189] Updated weights for policy 0, policy_version 6170 (0.0006)
+[2023-02-26 12:02:07,594][00189] Updated weights for policy 0, policy_version 6180 (0.0007)
+[2023-02-26 12:02:08,266][00189] Updated weights for policy 0, policy_version 6190 (0.0006)
+[2023-02-26 12:02:08,656][00001] Fps is (10 sec: 59391.8, 60 sec: 59050.7, 300 sec: 59315.6). Total num frames: 25374720. Throughput: 0: 14758.3. Samples: 6329300. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:02:08,656][00001] Avg episode reward: [(0, '25.317')]
+[2023-02-26 12:02:08,995][00189] Updated weights for policy 0, policy_version 6200 (0.0007)
+[2023-02-26 12:02:09,686][00189] Updated weights for policy 0, policy_version 6210 (0.0006)
+[2023-02-26 12:02:10,356][00189] Updated weights for policy 0, policy_version 6220 (0.0007)
+[2023-02-26 12:02:11,048][00189] Updated weights for policy 0, policy_version 6230 (0.0006)
+[2023-02-26 12:02:11,789][00189] Updated weights for policy 0, policy_version 6240 (0.0006)
+[2023-02-26 12:02:12,451][00189] Updated weights for policy 0, policy_version 6250 (0.0006)
+[2023-02-26 12:02:13,138][00189] Updated weights for policy 0, policy_version 6260 (0.0006)
+[2023-02-26 12:02:13,656][00001] Fps is (10 sec: 58982.0, 60 sec: 58982.3, 300 sec: 59301.7). Total num frames: 25665536. Throughput: 0: 14761.7. Samples: 6417536. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:02:13,656][00001] Avg episode reward: [(0, '29.324')]
+[2023-02-26 12:02:13,888][00189] Updated weights for policy 0, policy_version 6270 (0.0006)
+[2023-02-26 12:02:14,548][00189] Updated weights for policy 0, policy_version 6280 (0.0007)
+[2023-02-26 12:02:15,213][00189] Updated weights for policy 0, policy_version 6290 (0.0006)
+[2023-02-26 12:02:15,952][00189] Updated weights for policy 0, policy_version 6300 (0.0007)
+[2023-02-26 12:02:16,627][00189] Updated weights for policy 0, policy_version 6310 (0.0007)
+[2023-02-26 12:02:17,300][00189] Updated weights for policy 0, policy_version 6320 (0.0006)
+[2023-02-26 12:02:17,989][00189] Updated weights for policy 0, policy_version 6330 (0.0007)
+[2023-02-26 12:02:18,656][00001] Fps is (10 sec: 58982.4, 60 sec: 59050.7, 300 sec: 59301.8). Total num frames: 25964544. Throughput: 0: 14760.1. Samples: 6461704. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:02:18,656][00001] Avg episode reward: [(0, '27.804')]
+[2023-02-26 12:02:18,694][00189] Updated weights for policy 0, policy_version 6340 (0.0006)
+[2023-02-26 12:02:19,383][00189] Updated weights for policy 0, policy_version 6350 (0.0006)
+[2023-02-26 12:02:20,044][00189] Updated weights for policy 0, policy_version 6360 (0.0007)
+[2023-02-26 12:02:20,750][00189] Updated weights for policy 0, policy_version 6370 (0.0006)
+[2023-02-26 12:02:21,440][00189] Updated weights for policy 0, policy_version 6380 (0.0007)
+[2023-02-26 12:02:22,122][00189] Updated weights for policy 0, policy_version 6390 (0.0006)
+[2023-02-26 12:02:22,799][00189] Updated weights for policy 0, policy_version 6400 (0.0006)
+[2023-02-26 12:02:23,452][00189] Updated weights for policy 0, policy_version 6410 (0.0006)
+[2023-02-26 12:02:23,656][00001] Fps is (10 sec: 59802.4, 60 sec: 59050.6, 300 sec: 59287.8). Total num frames: 26263552. Throughput: 0: 14775.6. Samples: 6551360. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:02:23,656][00001] Avg episode reward: [(0, '27.371')]
+[2023-02-26 12:02:24,170][00189] Updated weights for policy 0, policy_version 6420 (0.0006)
+[2023-02-26 12:02:24,806][00189] Updated weights for policy 0, policy_version 6430 (0.0006)
+[2023-02-26 12:02:25,453][00189] Updated weights for policy 0, policy_version 6440 (0.0007)
+[2023-02-26 12:02:26,199][00189] Updated weights for policy 0, policy_version 6450 (0.0006)
+[2023-02-26 12:02:26,842][00189] Updated weights for policy 0, policy_version 6460 (0.0006)
+[2023-02-26 12:02:27,481][00189] Updated weights for policy 0, policy_version 6470 (0.0006)
+[2023-02-26 12:02:28,244][00189] Updated weights for policy 0, policy_version 6480 (0.0006)
+[2023-02-26 12:02:28,656][00001] Fps is (10 sec: 60211.4, 60 sec: 59255.5, 300 sec: 59287.9). Total num frames: 26566656. Throughput: 0: 14827.1. Samples: 6642152. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:02:28,656][00001] Avg episode reward: [(0, '27.992')]
+[2023-02-26 12:02:28,886][00189] Updated weights for policy 0, policy_version 6490 (0.0007)
+[2023-02-26 12:02:29,529][00189] Updated weights for policy 0, policy_version 6500 (0.0007)
+[2023-02-26 12:02:30,258][00189] Updated weights for policy 0, policy_version 6510 (0.0006)
+[2023-02-26 12:02:30,939][00189] Updated weights for policy 0, policy_version 6520 (0.0006)
+[2023-02-26 12:02:31,615][00189] Updated weights for policy 0, policy_version 6530 (0.0006)
+[2023-02-26 12:02:32,309][00189] Updated weights for policy 0, policy_version 6540 (0.0007)
+[2023-02-26 12:02:33,078][00189] Updated weights for policy 0, policy_version 6550 (0.0007)
+[2023-02-26 12:02:33,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 26861568. Throughput: 0: 14843.3. Samples: 6687288.
Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:02:33,656][00001] Avg episode reward: [(0, '28.803')] +[2023-02-26 12:02:33,768][00189] Updated weights for policy 0, policy_version 6560 (0.0006) +[2023-02-26 12:02:34,502][00189] Updated weights for policy 0, policy_version 6570 (0.0006) +[2023-02-26 12:02:35,175][00189] Updated weights for policy 0, policy_version 6580 (0.0007) +[2023-02-26 12:02:35,861][00189] Updated weights for policy 0, policy_version 6590 (0.0007) +[2023-02-26 12:02:36,604][00189] Updated weights for policy 0, policy_version 6600 (0.0006) +[2023-02-26 12:02:37,261][00189] Updated weights for policy 0, policy_version 6610 (0.0006) +[2023-02-26 12:02:37,930][00189] Updated weights for policy 0, policy_version 6620 (0.0006) +[2023-02-26 12:02:38,082][00141] Signal inference workers to stop experience collection... (250 times) +[2023-02-26 12:02:38,082][00141] Signal inference workers to resume experience collection... (250 times) +[2023-02-26 12:02:38,086][00189] InferenceWorker_p0-w0: stopping experience collection (250 times) +[2023-02-26 12:02:38,086][00189] InferenceWorker_p0-w0: resuming experience collection (250 times) +[2023-02-26 12:02:38,656][00001] Fps is (10 sec: 58572.5, 60 sec: 59118.9, 300 sec: 59246.2). Total num frames: 27152384. Throughput: 0: 14813.2. Samples: 6774116. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:02:38,656][00001] Avg episode reward: [(0, '26.270')] +[2023-02-26 12:02:38,697][00189] Updated weights for policy 0, policy_version 6630 (0.0006) +[2023-02-26 12:02:39,357][00189] Updated weights for policy 0, policy_version 6640 (0.0006) +[2023-02-26 12:02:39,997][00189] Updated weights for policy 0, policy_version 6650 (0.0007) +[2023-02-26 12:02:40,727][00189] Updated weights for policy 0, policy_version 6660 (0.0006) +[2023-02-26 12:02:41,398][00189] Updated weights for policy 0, policy_version 6670 (0.0006) +[2023-02-26 12:02:42,076][00189] Updated weights for policy 0, policy_version 6680 (0.0006) +[2023-02-26 12:02:42,807][00189] Updated weights for policy 0, policy_version 6690 (0.0006) +[2023-02-26 12:02:43,482][00189] Updated weights for policy 0, policy_version 6700 (0.0006) +[2023-02-26 12:02:43,656][00001] Fps is (10 sec: 58982.3, 60 sec: 59187.1, 300 sec: 59246.2). Total num frames: 27451392. Throughput: 0: 14827.8. Samples: 6863400. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) +[2023-02-26 12:02:43,656][00001] Avg episode reward: [(0, '28.766')] +[2023-02-26 12:02:44,132][00189] Updated weights for policy 0, policy_version 6710 (0.0006) +[2023-02-26 12:02:44,879][00189] Updated weights for policy 0, policy_version 6720 (0.0006) +[2023-02-26 12:02:45,515][00189] Updated weights for policy 0, policy_version 6730 (0.0006) +[2023-02-26 12:02:46,182][00189] Updated weights for policy 0, policy_version 6740 (0.0006) +[2023-02-26 12:02:46,933][00189] Updated weights for policy 0, policy_version 6750 (0.0006) +[2023-02-26 12:02:47,576][00189] Updated weights for policy 0, policy_version 6760 (0.0006) +[2023-02-26 12:02:48,230][00189] Updated weights for policy 0, policy_version 6770 (0.0007) +[2023-02-26 12:02:48,656][00001] Fps is (10 sec: 60211.5, 60 sec: 59392.0, 300 sec: 59246.2). Total num frames: 27754496. Throughput: 0: 14838.6. Samples: 6908268. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) +[2023-02-26 12:02:48,656][00001] Avg episode reward: [(0, '26.388')] +[2023-02-26 12:02:48,986][00189] Updated weights for policy 0, policy_version 6780 (0.0006) +[2023-02-26 12:02:49,633][00189] Updated weights for policy 0, policy_version 6790 (0.0006) +[2023-02-26 12:02:50,291][00189] Updated weights for policy 0, policy_version 6800 (0.0006) +[2023-02-26 12:02:51,021][00189] Updated weights for policy 0, policy_version 6810 (0.0006) +[2023-02-26 12:02:51,686][00189] Updated weights for policy 0, policy_version 6820 (0.0007) +[2023-02-26 12:02:52,350][00189] Updated weights for policy 0, policy_version 6830 (0.0006) +[2023-02-26 12:02:53,066][00189] Updated weights for policy 0, policy_version 6840 (0.0006) +[2023-02-26 12:02:53,656][00001] Fps is (10 sec: 59801.2, 60 sec: 59391.9, 300 sec: 59232.3). Total num frames: 28049408. Throughput: 0: 14858.5. Samples: 6997936. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:02:53,656][00001] Avg episode reward: [(0, '30.230')] +[2023-02-26 12:02:53,660][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000006848_28049408.pth... 
+[2023-02-26 12:02:53,697][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000003385_13864960.pth
+[2023-02-26 12:02:53,744][00189] Updated weights for policy 0, policy_version 6850 (0.0006)
+[2023-02-26 12:02:54,401][00189] Updated weights for policy 0, policy_version 6860 (0.0006)
+[2023-02-26 12:02:55,165][00189] Updated weights for policy 0, policy_version 6870 (0.0007)
+[2023-02-26 12:02:55,822][00189] Updated weights for policy 0, policy_version 6880 (0.0007)
+[2023-02-26 12:02:56,490][00189] Updated weights for policy 0, policy_version 6890 (0.0006)
+[2023-02-26 12:02:57,228][00189] Updated weights for policy 0, policy_version 6900 (0.0006)
+[2023-02-26 12:02:57,871][00189] Updated weights for policy 0, policy_version 6910 (0.0006)
+[2023-02-26 12:02:58,557][00189] Updated weights for policy 0, policy_version 6920 (0.0007)
+[2023-02-26 12:02:58,656][00001] Fps is (10 sec: 59392.0, 60 sec: 59460.3, 300 sec: 59218.5). Total num frames: 28348416. Throughput: 0: 14884.2. Samples: 7087320. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:02:58,656][00001] Avg episode reward: [(0, '29.476')]
+[2023-02-26 12:02:59,265][00189] Updated weights for policy 0, policy_version 6930 (0.0006)
+[2023-02-26 12:02:59,943][00189] Updated weights for policy 0, policy_version 6940 (0.0006)
+[2023-02-26 12:03:00,655][00189] Updated weights for policy 0, policy_version 6950 (0.0006)
+[2023-02-26 12:03:01,319][00189] Updated weights for policy 0, policy_version 6960 (0.0006)
+[2023-02-26 12:03:02,075][00189] Updated weights for policy 0, policy_version 6970 (0.0006)
+[2023-02-26 12:03:02,741][00189] Updated weights for policy 0, policy_version 6980 (0.0006)
+[2023-02-26 12:03:03,382][00189] Updated weights for policy 0, policy_version 6990 (0.0007)
+[2023-02-26 12:03:03,656][00001] Fps is (10 sec: 59392.8, 60 sec: 59460.4, 300 sec: 59204.6). Total num frames: 28643328. Throughput: 0: 14889.0. Samples: 7131708. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:03:03,656][00001] Avg episode reward: [(0, '30.323')]
+[2023-02-26 12:03:04,129][00189] Updated weights for policy 0, policy_version 7000 (0.0006)
+[2023-02-26 12:03:04,797][00189] Updated weights for policy 0, policy_version 7010 (0.0006)
+[2023-02-26 12:03:05,469][00189] Updated weights for policy 0, policy_version 7020 (0.0006)
+[2023-02-26 12:03:06,206][00189] Updated weights for policy 0, policy_version 7030 (0.0007)
+[2023-02-26 12:03:06,856][00189] Updated weights for policy 0, policy_version 7040 (0.0006)
+[2023-02-26 12:03:07,529][00189] Updated weights for policy 0, policy_version 7050 (0.0006)
+[2023-02-26 12:03:08,231][00189] Updated weights for policy 0, policy_version 7060 (0.0006)
+[2023-02-26 12:03:08,656][00001] Fps is (10 sec: 59391.5, 60 sec: 59460.2, 300 sec: 59190.7). Total num frames: 28942336. Throughput: 0: 14877.9. Samples: 7220864. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:03:08,656][00001] Avg episode reward: [(0, '31.943')]
+[2023-02-26 12:03:08,656][00141] Saving new best policy, reward=31.943!
+[2023-02-26 12:03:08,912][00189] Updated weights for policy 0, policy_version 7070 (0.0006)
+[2023-02-26 12:03:09,624][00189] Updated weights for policy 0, policy_version 7080 (0.0007)
+[2023-02-26 12:03:10,277][00189] Updated weights for policy 0, policy_version 7090 (0.0006)
+[2023-02-26 12:03:10,963][00189] Updated weights for policy 0, policy_version 7100 (0.0006)
+[2023-02-26 12:03:11,693][00189] Updated weights for policy 0, policy_version 7110 (0.0007)
+[2023-02-26 12:03:12,351][00189] Updated weights for policy 0, policy_version 7120 (0.0007)
+[2023-02-26 12:03:13,042][00189] Updated weights for policy 0, policy_version 7130 (0.0007)
+[2023-02-26 12:03:13,656][00001] Fps is (10 sec: 59391.9, 60 sec: 59528.7, 300 sec: 59162.9). Total num frames: 29237248. Throughput: 0: 14844.1. Samples: 7310136. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:03:13,656][00001] Avg episode reward: [(0, '29.375')]
+[2023-02-26 12:03:13,758][00189] Updated weights for policy 0, policy_version 7140 (0.0006)
+[2023-02-26 12:03:14,408][00189] Updated weights for policy 0, policy_version 7150 (0.0006)
+[2023-02-26 12:03:15,093][00189] Updated weights for policy 0, policy_version 7160 (0.0006)
+[2023-02-26 12:03:15,801][00189] Updated weights for policy 0, policy_version 7170 (0.0006)
+[2023-02-26 12:03:16,494][00189] Updated weights for policy 0, policy_version 7180 (0.0007)
+[2023-02-26 12:03:17,162][00189] Updated weights for policy 0, policy_version 7190 (0.0006)
+[2023-02-26 12:03:17,896][00189] Updated weights for policy 0, policy_version 7200 (0.0006)
+[2023-02-26 12:03:18,568][00189] Updated weights for policy 0, policy_version 7210 (0.0006)
+[2023-02-26 12:03:18,656][00001] Fps is (10 sec: 59392.4, 60 sec: 59528.5, 300 sec: 59162.9). Total num frames: 29536256. Throughput: 0: 14830.3. Samples: 7354652. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:03:18,656][00001] Avg episode reward: [(0, '30.997')]
+[2023-02-26 12:03:19,249][00189] Updated weights for policy 0, policy_version 7220 (0.0006)
+[2023-02-26 12:03:19,950][00189] Updated weights for policy 0, policy_version 7230 (0.0006)
+[2023-02-26 12:03:20,640][00189] Updated weights for policy 0, policy_version 7240 (0.0007)
+[2023-02-26 12:03:21,307][00189] Updated weights for policy 0, policy_version 7250 (0.0006)
+[2023-02-26 12:03:21,998][00189] Updated weights for policy 0, policy_version 7260 (0.0007)
+[2023-02-26 12:03:22,715][00189] Updated weights for policy 0, policy_version 7270 (0.0006)
+[2023-02-26 12:03:23,397][00189] Updated weights for policy 0, policy_version 7280 (0.0007)
+[2023-02-26 12:03:23,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 29835264. Throughput: 0: 14883.0. Samples: 7443852. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:03:23,656][00001] Avg episode reward: [(0, '28.848')]
+[2023-02-26 12:03:24,049][00189] Updated weights for policy 0, policy_version 7290 (0.0006)
+[2023-02-26 12:03:24,755][00189] Updated weights for policy 0, policy_version 7300 (0.0006)
+[2023-02-26 12:03:25,445][00189] Updated weights for policy 0, policy_version 7310 (0.0006)
+[2023-02-26 12:03:26,112][00189] Updated weights for policy 0, policy_version 7320 (0.0007)
+[2023-02-26 12:03:26,814][00189] Updated weights for policy 0, policy_version 7330 (0.0007)
+[2023-02-26 12:03:27,507][00189] Updated weights for policy 0, policy_version 7340 (0.0006)
+[2023-02-26 12:03:28,191][00189] Updated weights for policy 0, policy_version 7350 (0.0006)
+[2023-02-26 12:03:28,656][00001] Fps is (10 sec: 59392.0, 60 sec: 59392.0, 300 sec: 59135.1). Total num frames: 30130176. Throughput: 0: 14885.4. Samples: 7533240. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:03:28,656][00001] Avg episode reward: [(0, '28.198')]
+[2023-02-26 12:03:28,876][00189] Updated weights for policy 0, policy_version 7360 (0.0007)
+[2023-02-26 12:03:29,570][00189] Updated weights for policy 0, policy_version 7370 (0.0007)
+[2023-02-26 12:03:30,249][00189] Updated weights for policy 0, policy_version 7380 (0.0006)
+[2023-02-26 12:03:30,943][00189] Updated weights for policy 0, policy_version 7390 (0.0006)
+[2023-02-26 12:03:31,631][00189] Updated weights for policy 0, policy_version 7400 (0.0007)
+[2023-02-26 12:03:32,312][00189] Updated weights for policy 0, policy_version 7410 (0.0006)
+[2023-02-26 12:03:33,013][00189] Updated weights for policy 0, policy_version 7420 (0.0006)
+[2023-02-26 12:03:33,656][00001] Fps is (10 sec: 59392.0, 60 sec: 59460.3, 300 sec: 59135.1). Total num frames: 30429184. Throughput: 0: 14881.9. Samples: 7577956. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:03:33,656][00001] Avg episode reward: [(0, '30.039')]
+[2023-02-26 12:03:33,684][00189] Updated weights for policy 0, policy_version 7430 (0.0006)
+[2023-02-26 12:03:34,363][00189] Updated weights for policy 0, policy_version 7440 (0.0006)
+[2023-02-26 12:03:35,104][00189] Updated weights for policy 0, policy_version 7450 (0.0007)
+[2023-02-26 12:03:35,766][00189] Updated weights for policy 0, policy_version 7460 (0.0006)
+[2023-02-26 12:03:36,511][00189] Updated weights for policy 0, policy_version 7470 (0.0006)
+[2023-02-26 12:03:37,214][00189] Updated weights for policy 0, policy_version 7480 (0.0006)
+[2023-02-26 12:03:37,898][00189] Updated weights for policy 0, policy_version 7490 (0.0007)
+[2023-02-26 12:03:38,618][00189] Updated weights for policy 0, policy_version 7500 (0.0007)
+[2023-02-26 12:03:38,656][00001] Fps is (10 sec: 58982.0, 60 sec: 59460.2, 300 sec: 59135.1). Total num frames: 30720000. Throughput: 0: 14847.8. Samples: 7666088. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:03:38,656][00001] Avg episode reward: [(0, '32.616')]
+[2023-02-26 12:03:38,667][00141] Saving new best policy, reward=32.616!
+[2023-02-26 12:03:39,330][00189] Updated weights for policy 0, policy_version 7510 (0.0007)
+[2023-02-26 12:03:40,046][00189] Updated weights for policy 0, policy_version 7520 (0.0006)
+[2023-02-26 12:03:40,735][00189] Updated weights for policy 0, policy_version 7530 (0.0007)
+[2023-02-26 12:03:41,417][00189] Updated weights for policy 0, policy_version 7540 (0.0006)
+[2023-02-26 12:03:42,165][00189] Updated weights for policy 0, policy_version 7550 (0.0007)
+[2023-02-26 12:03:42,856][00189] Updated weights for policy 0, policy_version 7560 (0.0007)
+[2023-02-26 12:03:43,564][00189] Updated weights for policy 0, policy_version 7570 (0.0006)
+[2023-02-26 12:03:43,656][00001] Fps is (10 sec: 58163.4, 60 sec: 59323.8, 300 sec: 59121.2). Total num frames: 31010816. Throughput: 0: 14791.2. Samples: 7752924. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:03:43,656][00001] Avg episode reward: [(0, '30.521')]
+[2023-02-26 12:03:44,227][00189] Updated weights for policy 0, policy_version 7580 (0.0006)
+[2023-02-26 12:03:44,931][00189] Updated weights for policy 0, policy_version 7590 (0.0007)
+[2023-02-26 12:03:45,635][00189] Updated weights for policy 0, policy_version 7600 (0.0006)
+[2023-02-26 12:03:46,308][00189] Updated weights for policy 0, policy_version 7610 (0.0006)
+[2023-02-26 12:03:46,999][00189] Updated weights for policy 0, policy_version 7620 (0.0006)
+[2023-02-26 12:03:47,715][00189] Updated weights for policy 0, policy_version 7630 (0.0007)
+[2023-02-26 12:03:48,414][00189] Updated weights for policy 0, policy_version 7640 (0.0006)
+[2023-02-26 12:03:48,656][00001] Fps is (10 sec: 58573.1, 60 sec: 59187.2, 300 sec: 59135.1). Total num frames: 31305728. Throughput: 0: 14802.8. Samples: 7797836. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:03:48,656][00001] Avg episode reward: [(0, '30.850')]
+[2023-02-26 12:03:49,100][00189] Updated weights for policy 0, policy_version 7650 (0.0006)
+[2023-02-26 12:03:49,843][00189] Updated weights for policy 0, policy_version 7660 (0.0006)
+[2023-02-26 12:03:50,539][00189] Updated weights for policy 0, policy_version 7670 (0.0007)
+[2023-02-26 12:03:51,241][00189] Updated weights for policy 0, policy_version 7680 (0.0007)
+[2023-02-26 12:03:51,961][00189] Updated weights for policy 0, policy_version 7690 (0.0007)
+[2023-02-26 12:03:52,663][00189] Updated weights for policy 0, policy_version 7700 (0.0007)
+[2023-02-26 12:03:53,375][00189] Updated weights for policy 0, policy_version 7710 (0.0006)
+[2023-02-26 12:03:53,656][00001] Fps is (10 sec: 58572.5, 60 sec: 59119.0, 300 sec: 59135.1). Total num frames: 31596544. Throughput: 0: 14751.1. Samples: 7884664. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:03:53,656][00001] Avg episode reward: [(0, '32.728')]
+[2023-02-26 12:03:53,678][00141] Saving new best policy, reward=32.728!
+[2023-02-26 12:03:54,062][00189] Updated weights for policy 0, policy_version 7720 (0.0006)
+[2023-02-26 12:03:54,761][00189] Updated weights for policy 0, policy_version 7730 (0.0007)
+[2023-02-26 12:03:55,510][00189] Updated weights for policy 0, policy_version 7740 (0.0007)
+[2023-02-26 12:03:56,207][00189] Updated weights for policy 0, policy_version 7750 (0.0007)
+[2023-02-26 12:03:56,897][00189] Updated weights for policy 0, policy_version 7760 (0.0007)
+[2023-02-26 12:03:57,638][00189] Updated weights for policy 0, policy_version 7770 (0.0006)
+[2023-02-26 12:03:58,322][00189] Updated weights for policy 0, policy_version 7780 (0.0006)
+[2023-02-26 12:03:58,656][00001] Fps is (10 sec: 57752.6, 60 sec: 58913.9, 300 sec: 59093.5). Total num frames: 31883264. Throughput: 0: 14699.8. Samples: 7971628. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:03:58,656][00001] Avg episode reward: [(0, '33.604')]
+[2023-02-26 12:03:58,657][00141] Saving new best policy, reward=33.604!
+[2023-02-26 12:03:58,993][00189] Updated weights for policy 0, policy_version 7790 (0.0007)
+[2023-02-26 12:03:59,712][00189] Updated weights for policy 0, policy_version 7800 (0.0006)
+[2023-02-26 12:04:00,412][00189] Updated weights for policy 0, policy_version 7810 (0.0006)
+[2023-02-26 12:04:01,087][00189] Updated weights for policy 0, policy_version 7820 (0.0007)
+[2023-02-26 12:04:01,755][00189] Updated weights for policy 0, policy_version 7830 (0.0006)
+[2023-02-26 12:04:02,536][00189] Updated weights for policy 0, policy_version 7840 (0.0007)
+[2023-02-26 12:04:03,201][00189] Updated weights for policy 0, policy_version 7850 (0.0007)
+[2023-02-26 12:04:03,656][00001] Fps is (10 sec: 58163.1, 60 sec: 58914.1, 300 sec: 59093.5). Total num frames: 32178176. Throughput: 0: 14693.9. Samples: 8015880. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:04:03,656][00001] Avg episode reward: [(0, '35.549')]
+[2023-02-26 12:04:03,677][00141] Saving new best policy, reward=35.549!
+[2023-02-26 12:04:03,890][00189] Updated weights for policy 0, policy_version 7860 (0.0007)
+[2023-02-26 12:04:04,643][00189] Updated weights for policy 0, policy_version 7870 (0.0006)
+[2023-02-26 12:04:05,354][00189] Updated weights for policy 0, policy_version 7880 (0.0007)
+[2023-02-26 12:04:06,016][00189] Updated weights for policy 0, policy_version 7890 (0.0006)
+[2023-02-26 12:04:06,729][00189] Updated weights for policy 0, policy_version 7900 (0.0006)
+[2023-02-26 12:04:07,413][00189] Updated weights for policy 0, policy_version 7910 (0.0006)
+[2023-02-26 12:04:08,147][00189] Updated weights for policy 0, policy_version 7920 (0.0006)
+[2023-02-26 12:04:08,656][00001] Fps is (10 sec: 58573.6, 60 sec: 58777.6, 300 sec: 59093.5). Total num frames: 32468992. Throughput: 0: 14649.8. Samples: 8103092. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:04:08,656][00001] Avg episode reward: [(0, '33.852')]
+[2023-02-26 12:04:08,830][00189] Updated weights for policy 0, policy_version 7930 (0.0006)
+[2023-02-26 12:04:09,512][00189] Updated weights for policy 0, policy_version 7940 (0.0006)
+[2023-02-26 12:04:10,207][00189] Updated weights for policy 0, policy_version 7950 (0.0006)
+[2023-02-26 12:04:10,917][00189] Updated weights for policy 0, policy_version 7960 (0.0006)
+[2023-02-26 12:04:11,616][00189] Updated weights for policy 0, policy_version 7970 (0.0006)
+[2023-02-26 12:04:12,312][00189] Updated weights for policy 0, policy_version 7980 (0.0006)
+[2023-02-26 12:04:13,022][00189] Updated weights for policy 0, policy_version 7990 (0.0007)
+[2023-02-26 12:04:13,656][00001] Fps is (10 sec: 58162.7, 60 sec: 58709.2, 300 sec: 59065.7). Total num frames: 32759808. Throughput: 0: 14613.2. Samples: 8190836. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:04:13,656][00001] Avg episode reward: [(0, '30.885')]
+[2023-02-26 12:04:13,764][00189] Updated weights for policy 0, policy_version 8000 (0.0006)
+[2023-02-26 12:04:14,447][00189] Updated weights for policy 0, policy_version 8010 (0.0007)
+[2023-02-26 12:04:15,152][00189] Updated weights for policy 0, policy_version 8020 (0.0006)
+[2023-02-26 12:04:15,896][00189] Updated weights for policy 0, policy_version 8030 (0.0006)
+[2023-02-26 12:04:16,595][00189] Updated weights for policy 0, policy_version 8040 (0.0007)
+[2023-02-26 12:04:17,273][00189] Updated weights for policy 0, policy_version 8050 (0.0007)
+[2023-02-26 12:04:18,033][00189] Updated weights for policy 0, policy_version 8060 (0.0006)
+[2023-02-26 12:04:18,656][00001] Fps is (10 sec: 57753.4, 60 sec: 58504.5, 300 sec: 59037.9). Total num frames: 33046528. Throughput: 0: 14579.0. Samples: 8234012. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:04:18,656][00001] Avg episode reward: [(0, '33.977')]
+[2023-02-26 12:04:18,727][00189] Updated weights for policy 0, policy_version 8070 (0.0006)
+[2023-02-26 12:04:19,434][00189] Updated weights for policy 0, policy_version 8080 (0.0007)
+[2023-02-26 12:04:20,168][00189] Updated weights for policy 0, policy_version 8090 (0.0006)
+[2023-02-26 12:04:20,870][00189] Updated weights for policy 0, policy_version 8100 (0.0006)
+[2023-02-26 12:04:21,574][00189] Updated weights for policy 0, policy_version 8110 (0.0007)
+[2023-02-26 12:04:22,278][00189] Updated weights for policy 0, policy_version 8120 (0.0006)
+[2023-02-26 12:04:22,989][00189] Updated weights for policy 0, policy_version 8130 (0.0006)
+[2023-02-26 12:04:23,656][00001] Fps is (10 sec: 57754.0, 60 sec: 58367.9, 300 sec: 59024.1). Total num frames: 33337344. Throughput: 0: 14541.3. Samples: 8320448. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:04:23,656][00001] Avg episode reward: [(0, '31.923')]
+[2023-02-26 12:04:23,700][00189] Updated weights for policy 0, policy_version 8140 (0.0006)
+[2023-02-26 12:04:24,409][00189] Updated weights for policy 0, policy_version 8150 (0.0007)
+[2023-02-26 12:04:25,096][00189] Updated weights for policy 0, policy_version 8160 (0.0006)
+[2023-02-26 12:04:25,775][00189] Updated weights for policy 0, policy_version 8170 (0.0007)
+[2023-02-26 12:04:26,518][00189] Updated weights for policy 0, policy_version 8180 (0.0006)
+[2023-02-26 12:04:27,185][00189] Updated weights for policy 0, policy_version 8190 (0.0007)
+[2023-02-26 12:04:27,919][00189] Updated weights for policy 0, policy_version 8200 (0.0006)
+[2023-02-26 12:04:28,612][00189] Updated weights for policy 0, policy_version 8210 (0.0007)
+[2023-02-26 12:04:28,656][00001] Fps is (10 sec: 58162.7, 60 sec: 58299.6, 300 sec: 59010.1). Total num frames: 33628160. Throughput: 0: 14547.8. Samples: 8407576. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:04:28,656][00001] Avg episode reward: [(0, '33.207')]
+[2023-02-26 12:04:29,299][00189] Updated weights for policy 0, policy_version 8220 (0.0007)
+[2023-02-26 12:04:29,637][00141] Signal inference workers to stop experience collection... (300 times)
+[2023-02-26 12:04:29,640][00189] InferenceWorker_p0-w0: stopping experience collection (300 times)
+[2023-02-26 12:04:29,646][00141] Signal inference workers to resume experience collection... (300 times)
+[2023-02-26 12:04:29,646][00189] InferenceWorker_p0-w0: resuming experience collection (300 times)
+[2023-02-26 12:04:30,072][00189] Updated weights for policy 0, policy_version 8230 (0.0007)
+[2023-02-26 12:04:30,732][00189] Updated weights for policy 0, policy_version 8240 (0.0007)
+[2023-02-26 12:04:31,464][00189] Updated weights for policy 0, policy_version 8250 (0.0007)
+[2023-02-26 12:04:32,188][00189] Updated weights for policy 0, policy_version 8260 (0.0007)
+[2023-02-26 12:04:32,861][00189] Updated weights for policy 0, policy_version 8270 (0.0007)
+[2023-02-26 12:04:33,590][00189] Updated weights for policy 0, policy_version 8280 (0.0006)
+[2023-02-26 12:04:33,656][00001] Fps is (10 sec: 58163.6, 60 sec: 58163.2, 300 sec: 58996.3). Total num frames: 33918976. Throughput: 0: 14514.5. Samples: 8450988. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:04:33,656][00001] Avg episode reward: [(0, '31.784')]
+[2023-02-26 12:04:34,284][00189] Updated weights for policy 0, policy_version 8290 (0.0006)
+[2023-02-26 12:04:34,947][00189] Updated weights for policy 0, policy_version 8300 (0.0006)
+[2023-02-26 12:04:35,675][00189] Updated weights for policy 0, policy_version 8310 (0.0006)
+[2023-02-26 12:04:36,380][00189] Updated weights for policy 0, policy_version 8320 (0.0007)
+[2023-02-26 12:04:37,056][00189] Updated weights for policy 0, policy_version 8330 (0.0006)
+[2023-02-26 12:04:37,783][00189] Updated weights for policy 0, policy_version 8340 (0.0007)
+[2023-02-26 12:04:38,488][00189] Updated weights for policy 0, policy_version 8350 (0.0007)
+[2023-02-26 12:04:38,656][00001] Fps is (10 sec: 58163.7, 60 sec: 58163.2, 300 sec: 58982.4). Total num frames: 34209792. Throughput: 0: 14526.8. Samples: 8538372. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:04:38,656][00001] Avg episode reward: [(0, '32.358')]
+[2023-02-26 12:04:39,159][00189] Updated weights for policy 0, policy_version 8360 (0.0006)
+[2023-02-26 12:04:39,873][00189] Updated weights for policy 0, policy_version 8370 (0.0007)
+[2023-02-26 12:04:40,577][00189] Updated weights for policy 0, policy_version 8380 (0.0007)
+[2023-02-26 12:04:41,299][00189] Updated weights for policy 0, policy_version 8390 (0.0006)
+[2023-02-26 12:04:42,006][00189] Updated weights for policy 0, policy_version 8400 (0.0007)
+[2023-02-26 12:04:42,760][00189] Updated weights for policy 0, policy_version 8410 (0.0006)
+[2023-02-26 12:04:43,449][00189] Updated weights for policy 0, policy_version 8420 (0.0006)
+[2023-02-26 12:04:43,656][00001] Fps is (10 sec: 57752.5, 60 sec: 58094.7, 300 sec: 58940.7). Total num frames: 34496512. Throughput: 0: 14519.1. Samples: 8624988. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:04:43,656][00001] Avg episode reward: [(0, '37.514')]
+[2023-02-26 12:04:43,671][00141] Saving new best policy, reward=37.514!
+[2023-02-26 12:04:44,161][00189] Updated weights for policy 0, policy_version 8430 (0.0007)
+[2023-02-26 12:04:44,867][00189] Updated weights for policy 0, policy_version 8440 (0.0006)
+[2023-02-26 12:04:45,585][00189] Updated weights for policy 0, policy_version 8450 (0.0006)
+[2023-02-26 12:04:46,279][00189] Updated weights for policy 0, policy_version 8460 (0.0006)
+[2023-02-26 12:04:47,015][00189] Updated weights for policy 0, policy_version 8470 (0.0006)
+[2023-02-26 12:04:47,674][00189] Updated weights for policy 0, policy_version 8480 (0.0006)
+[2023-02-26 12:04:48,409][00189] Updated weights for policy 0, policy_version 8490 (0.0006)
+[2023-02-26 12:04:48,656][00001] Fps is (10 sec: 57753.6, 60 sec: 58026.6, 300 sec: 58926.9). Total num frames: 34787328. Throughput: 0: 14493.6. Samples: 8668092. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:04:48,656][00001] Avg episode reward: [(0, '35.923')]
+[2023-02-26 12:04:49,115][00189] Updated weights for policy 0, policy_version 8500 (0.0006)
+[2023-02-26 12:04:49,818][00189] Updated weights for policy 0, policy_version 8510 (0.0006)
+[2023-02-26 12:04:50,518][00189] Updated weights for policy 0, policy_version 8520 (0.0006)
+[2023-02-26 12:04:51,240][00189] Updated weights for policy 0, policy_version 8530 (0.0007)
+[2023-02-26 12:04:51,945][00189] Updated weights for policy 0, policy_version 8540 (0.0007)
+[2023-02-26 12:04:52,660][00189] Updated weights for policy 0, policy_version 8550 (0.0007)
+[2023-02-26 12:04:53,346][00189] Updated weights for policy 0, policy_version 8560 (0.0006)
+[2023-02-26 12:04:53,656][00001] Fps is (10 sec: 58163.7, 60 sec: 58026.6, 300 sec: 58913.0). Total num frames: 35078144. Throughput: 0: 14489.8. Samples: 8755136. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:04:53,656][00001] Avg episode reward: [(0, '31.571')]
+[2023-02-26 12:04:53,660][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000008564_35078144.pth...
+[2023-02-26 12:04:53,698][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000005114_20946944.pth +[2023-02-26 12:04:54,021][00189] Updated weights for policy 0, policy_version 8570 (0.0006) +[2023-02-26 12:04:54,708][00189] Updated weights for policy 0, policy_version 8580 (0.0007) +[2023-02-26 12:04:55,404][00189] Updated weights for policy 0, policy_version 8590 (0.0007) +[2023-02-26 12:04:56,098][00189] Updated weights for policy 0, policy_version 8600 (0.0006) +[2023-02-26 12:04:56,764][00189] Updated weights for policy 0, policy_version 8610 (0.0006) +[2023-02-26 12:04:57,458][00189] Updated weights for policy 0, policy_version 8620 (0.0006) +[2023-02-26 12:04:58,146][00189] Updated weights for policy 0, policy_version 8630 (0.0006) +[2023-02-26 12:04:58,656][00001] Fps is (10 sec: 58982.6, 60 sec: 58231.6, 300 sec: 58926.8). Total num frames: 35377152. Throughput: 0: 14532.5. Samples: 8844796. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:04:58,656][00001] Avg episode reward: [(0, '35.879')] +[2023-02-26 12:04:58,829][00189] Updated weights for policy 0, policy_version 8640 (0.0006) +[2023-02-26 12:04:59,496][00189] Updated weights for policy 0, policy_version 8650 (0.0006) +[2023-02-26 12:05:00,208][00189] Updated weights for policy 0, policy_version 8660 (0.0006) +[2023-02-26 12:05:00,852][00189] Updated weights for policy 0, policy_version 8670 (0.0006) +[2023-02-26 12:05:01,577][00189] Updated weights for policy 0, policy_version 8680 (0.0006) +[2023-02-26 12:05:02,252][00189] Updated weights for policy 0, policy_version 8690 (0.0006) +[2023-02-26 12:05:02,964][00189] Updated weights for policy 0, policy_version 8700 (0.0006) +[2023-02-26 12:05:03,656][00001] Fps is (10 sec: 58982.5, 60 sec: 58163.2, 300 sec: 58926.9). Total num frames: 35667968. Throughput: 0: 14572.4. Samples: 8889772. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 12:05:03,656][00001] Avg episode reward: [(0, '32.741')] +[2023-02-26 12:05:03,719][00189] Updated weights for policy 0, policy_version 8710 (0.0007) +[2023-02-26 12:05:04,449][00189] Updated weights for policy 0, policy_version 8720 (0.0006) +[2023-02-26 12:05:05,197][00189] Updated weights for policy 0, policy_version 8730 (0.0007) +[2023-02-26 12:05:05,918][00189] Updated weights for policy 0, policy_version 8740 (0.0007) +[2023-02-26 12:05:06,619][00189] Updated weights for policy 0, policy_version 8750 (0.0006) +[2023-02-26 12:05:07,375][00189] Updated weights for policy 0, policy_version 8760 (0.0007) +[2023-02-26 12:05:08,032][00189] Updated weights for policy 0, policy_version 8770 (0.0006) +[2023-02-26 12:05:08,656][00001] Fps is (10 sec: 57753.8, 60 sec: 58095.0, 300 sec: 58885.2). Total num frames: 35954688. Throughput: 0: 14543.1. Samples: 8974884. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 12:05:08,656][00001] Avg episode reward: [(0, '32.726')] +[2023-02-26 12:05:08,741][00189] Updated weights for policy 0, policy_version 8780 (0.0006) +[2023-02-26 12:05:09,477][00189] Updated weights for policy 0, policy_version 8790 (0.0006) +[2023-02-26 12:05:10,152][00189] Updated weights for policy 0, policy_version 8800 (0.0006) +[2023-02-26 12:05:10,865][00189] Updated weights for policy 0, policy_version 8810 (0.0007) +[2023-02-26 12:05:11,585][00189] Updated weights for policy 0, policy_version 8820 (0.0006) +[2023-02-26 12:05:12,255][00189] Updated weights for policy 0, policy_version 8830 (0.0006) +[2023-02-26 12:05:12,945][00189] Updated weights for policy 0, policy_version 8840 (0.0007) +[2023-02-26 12:05:13,656][00001] Fps is (10 sec: 57753.5, 60 sec: 58095.0, 300 sec: 58885.2). Total num frames: 36245504. Throughput: 0: 14539.7. Samples: 9061860. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:05:13,656][00001] Avg episode reward: [(0, '34.237')] +[2023-02-26 12:05:13,707][00189] Updated weights for policy 0, policy_version 8850 (0.0006) +[2023-02-26 12:05:14,386][00189] Updated weights for policy 0, policy_version 8860 (0.0007) +[2023-02-26 12:05:15,084][00189] Updated weights for policy 0, policy_version 8870 (0.0006) +[2023-02-26 12:05:15,814][00189] Updated weights for policy 0, policy_version 8880 (0.0006) +[2023-02-26 12:05:16,527][00189] Updated weights for policy 0, policy_version 8890 (0.0006) +[2023-02-26 12:05:17,186][00189] Updated weights for policy 0, policy_version 8900 (0.0007) +[2023-02-26 12:05:17,948][00189] Updated weights for policy 0, policy_version 8910 (0.0007) +[2023-02-26 12:05:18,629][00189] Updated weights for policy 0, policy_version 8920 (0.0006) +[2023-02-26 12:05:18,656][00001] Fps is (10 sec: 58163.0, 60 sec: 58163.2, 300 sec: 58857.4). Total num frames: 36536320. Throughput: 0: 14544.8. Samples: 9105504. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) +[2023-02-26 12:05:18,656][00001] Avg episode reward: [(0, '33.340')] +[2023-02-26 12:05:19,324][00189] Updated weights for policy 0, policy_version 8930 (0.0006) +[2023-02-26 12:05:20,051][00189] Updated weights for policy 0, policy_version 8940 (0.0006) +[2023-02-26 12:05:20,736][00189] Updated weights for policy 0, policy_version 8950 (0.0006) +[2023-02-26 12:05:21,447][00189] Updated weights for policy 0, policy_version 8960 (0.0007) +[2023-02-26 12:05:22,163][00189] Updated weights for policy 0, policy_version 8970 (0.0006) +[2023-02-26 12:05:22,834][00189] Updated weights for policy 0, policy_version 8980 (0.0006) +[2023-02-26 12:05:23,554][00189] Updated weights for policy 0, policy_version 8990 (0.0006) +[2023-02-26 12:05:23,656][00001] Fps is (10 sec: 58163.6, 60 sec: 58163.2, 300 sec: 58843.6). Total num frames: 36827136. Throughput: 0: 14539.7. Samples: 9192660. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) +[2023-02-26 12:05:23,656][00001] Avg episode reward: [(0, '33.412')] +[2023-02-26 12:05:24,262][00189] Updated weights for policy 0, policy_version 9000 (0.0007) +[2023-02-26 12:05:24,960][00189] Updated weights for policy 0, policy_version 9010 (0.0006) +[2023-02-26 12:05:25,647][00189] Updated weights for policy 0, policy_version 9020 (0.0006) +[2023-02-26 12:05:26,383][00189] Updated weights for policy 0, policy_version 9030 (0.0006) +[2023-02-26 12:05:27,077][00189] Updated weights for policy 0, policy_version 9040 (0.0006) +[2023-02-26 12:05:27,788][00189] Updated weights for policy 0, policy_version 9050 (0.0006) +[2023-02-26 12:05:28,485][00189] Updated weights for policy 0, policy_version 9060 (0.0006) +[2023-02-26 12:05:28,656][00001] Fps is (10 sec: 58162.9, 60 sec: 58163.3, 300 sec: 58815.8). Total num frames: 37117952. Throughput: 0: 14556.9. Samples: 9280048. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:05:28,656][00001] Avg episode reward: [(0, '35.347')] +[2023-02-26 12:05:29,163][00189] Updated weights for policy 0, policy_version 9070 (0.0007) +[2023-02-26 12:05:29,883][00189] Updated weights for policy 0, policy_version 9080 (0.0007) +[2023-02-26 12:05:30,593][00189] Updated weights for policy 0, policy_version 9090 (0.0006) +[2023-02-26 12:05:31,315][00189] Updated weights for policy 0, policy_version 9100 (0.0007) +[2023-02-26 12:05:32,032][00189] Updated weights for policy 0, policy_version 9110 (0.0007) +[2023-02-26 12:05:32,710][00189] Updated weights for policy 0, policy_version 9120 (0.0007) +[2023-02-26 12:05:33,397][00189] Updated weights for policy 0, policy_version 9130 (0.0006) +[2023-02-26 12:05:33,656][00001] Fps is (10 sec: 58163.2, 60 sec: 58163.2, 300 sec: 58801.9). Total num frames: 37408768. Throughput: 0: 14559.5. Samples: 9323268. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:05:33,656][00001] Avg episode reward: [(0, '32.719')] +[2023-02-26 12:05:34,141][00189] Updated weights for policy 0, policy_version 9140 (0.0006) +[2023-02-26 12:05:34,842][00189] Updated weights for policy 0, policy_version 9150 (0.0006) +[2023-02-26 12:05:35,518][00189] Updated weights for policy 0, policy_version 9160 (0.0007) +[2023-02-26 12:05:36,278][00189] Updated weights for policy 0, policy_version 9170 (0.0006) +[2023-02-26 12:05:36,941][00189] Updated weights for policy 0, policy_version 9180 (0.0007) +[2023-02-26 12:05:37,621][00189] Updated weights for policy 0, policy_version 9190 (0.0007) +[2023-02-26 12:05:38,353][00189] Updated weights for policy 0, policy_version 9200 (0.0006) +[2023-02-26 12:05:38,656][00001] Fps is (10 sec: 58163.6, 60 sec: 58163.3, 300 sec: 58788.0). Total num frames: 37699584. Throughput: 0: 14564.9. Samples: 9410556. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:05:38,656][00001] Avg episode reward: [(0, '35.588')] +[2023-02-26 12:05:39,047][00189] Updated weights for policy 0, policy_version 9210 (0.0006) +[2023-02-26 12:05:39,754][00189] Updated weights for policy 0, policy_version 9220 (0.0006) +[2023-02-26 12:05:40,480][00189] Updated weights for policy 0, policy_version 9230 (0.0006) +[2023-02-26 12:05:41,195][00189] Updated weights for policy 0, policy_version 9240 (0.0006) +[2023-02-26 12:05:41,870][00189] Updated weights for policy 0, policy_version 9250 (0.0006) +[2023-02-26 12:05:42,599][00189] Updated weights for policy 0, policy_version 9260 (0.0007) +[2023-02-26 12:05:43,307][00189] Updated weights for policy 0, policy_version 9270 (0.0007) +[2023-02-26 12:05:43,656][00001] Fps is (10 sec: 58163.7, 60 sec: 58231.7, 300 sec: 58774.1). Total num frames: 37990400. Throughput: 0: 14507.7. Samples: 9497640. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:05:43,656][00001] Avg episode reward: [(0, '35.085')] +[2023-02-26 12:05:43,968][00189] Updated weights for policy 0, policy_version 9280 (0.0007) +[2023-02-26 12:05:44,733][00189] Updated weights for policy 0, policy_version 9290 (0.0007) +[2023-02-26 12:05:45,411][00189] Updated weights for policy 0, policy_version 9300 (0.0006) +[2023-02-26 12:05:46,090][00189] Updated weights for policy 0, policy_version 9310 (0.0006) +[2023-02-26 12:05:46,837][00189] Updated weights for policy 0, policy_version 9320 (0.0006) +[2023-02-26 12:05:47,513][00189] Updated weights for policy 0, policy_version 9330 (0.0006) +[2023-02-26 12:05:48,214][00189] Updated weights for policy 0, policy_version 9340 (0.0006) +[2023-02-26 12:05:48,656][00001] Fps is (10 sec: 58163.2, 60 sec: 58231.5, 300 sec: 58760.3). Total num frames: 38281216. Throughput: 0: 14478.0. Samples: 9541280. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0) +[2023-02-26 12:05:48,656][00001] Avg episode reward: [(0, '38.555')] +[2023-02-26 12:05:48,656][00141] Saving new best policy, reward=38.555! +[2023-02-26 12:05:48,961][00189] Updated weights for policy 0, policy_version 9350 (0.0006) +[2023-02-26 12:05:49,623][00189] Updated weights for policy 0, policy_version 9360 (0.0006) +[2023-02-26 12:05:50,351][00189] Updated weights for policy 0, policy_version 9370 (0.0007) +[2023-02-26 12:05:51,058][00189] Updated weights for policy 0, policy_version 9380 (0.0006) +[2023-02-26 12:05:51,716][00189] Updated weights for policy 0, policy_version 9390 (0.0007) +[2023-02-26 12:05:52,456][00189] Updated weights for policy 0, policy_version 9400 (0.0006) +[2023-02-26 12:05:53,156][00189] Updated weights for policy 0, policy_version 9410 (0.0006) +[2023-02-26 12:05:53,656][00001] Fps is (10 sec: 58163.0, 60 sec: 58231.6, 300 sec: 58746.4). Total num frames: 38572032. Throughput: 0: 14527.7. Samples: 9628632. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0) +[2023-02-26 12:05:53,656][00001] Avg episode reward: [(0, '36.005')] +[2023-02-26 12:05:53,854][00189] Updated weights for policy 0, policy_version 9420 (0.0006) +[2023-02-26 12:05:54,590][00189] Updated weights for policy 0, policy_version 9430 (0.0007) +[2023-02-26 12:05:55,277][00189] Updated weights for policy 0, policy_version 9440 (0.0006) +[2023-02-26 12:05:55,958][00189] Updated weights for policy 0, policy_version 9450 (0.0006) +[2023-02-26 12:05:56,684][00189] Updated weights for policy 0, policy_version 9460 (0.0006) +[2023-02-26 12:05:57,386][00189] Updated weights for policy 0, policy_version 9470 (0.0007) +[2023-02-26 12:05:58,058][00189] Updated weights for policy 0, policy_version 9480 (0.0006) +[2023-02-26 12:05:58,656][00001] Fps is (10 sec: 58163.0, 60 sec: 58094.9, 300 sec: 58732.5). Total num frames: 38862848. Throughput: 0: 14533.3. Samples: 9715856. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:05:58,656][00001] Avg episode reward: [(0, '35.588')] +[2023-02-26 12:05:58,772][00189] Updated weights for policy 0, policy_version 9490 (0.0006) +[2023-02-26 12:05:59,507][00189] Updated weights for policy 0, policy_version 9500 (0.0006) +[2023-02-26 12:06:00,197][00189] Updated weights for policy 0, policy_version 9510 (0.0006) +[2023-02-26 12:06:00,899][00189] Updated weights for policy 0, policy_version 9520 (0.0006) +[2023-02-26 12:06:01,618][00189] Updated weights for policy 0, policy_version 9530 (0.0007) +[2023-02-26 12:06:02,342][00189] Updated weights for policy 0, policy_version 9540 (0.0007) +[2023-02-26 12:06:02,982][00189] Updated weights for policy 0, policy_version 9550 (0.0006) +[2023-02-26 12:06:03,656][00001] Fps is (10 sec: 58162.9, 60 sec: 58095.0, 300 sec: 58718.6). Total num frames: 39153664. Throughput: 0: 14526.3. Samples: 9759188. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 12:06:03,656][00001] Avg episode reward: [(0, '33.455')] +[2023-02-26 12:06:03,729][00189] Updated weights for policy 0, policy_version 9560 (0.0006) +[2023-02-26 12:06:04,431][00189] Updated weights for policy 0, policy_version 9570 (0.0006) +[2023-02-26 12:06:05,148][00189] Updated weights for policy 0, policy_version 9580 (0.0006) +[2023-02-26 12:06:05,860][00189] Updated weights for policy 0, policy_version 9590 (0.0007) +[2023-02-26 12:06:06,560][00189] Updated weights for policy 0, policy_version 9600 (0.0007) +[2023-02-26 12:06:07,265][00189] Updated weights for policy 0, policy_version 9610 (0.0006) +[2023-02-26 12:06:07,981][00189] Updated weights for policy 0, policy_version 9620 (0.0006) +[2023-02-26 12:06:08,638][00189] Updated weights for policy 0, policy_version 9630 (0.0006) +[2023-02-26 12:06:08,656][00001] Fps is (10 sec: 58163.1, 60 sec: 58163.2, 300 sec: 58704.7). Total num frames: 39444480. Throughput: 0: 14530.6. Samples: 9846536. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 12:06:08,656][00001] Avg episode reward: [(0, '39.330')] +[2023-02-26 12:06:08,660][00141] Saving new best policy, reward=39.330! +[2023-02-26 12:06:09,367][00189] Updated weights for policy 0, policy_version 9640 (0.0007) +[2023-02-26 12:06:10,075][00189] Updated weights for policy 0, policy_version 9650 (0.0006) +[2023-02-26 12:06:10,794][00189] Updated weights for policy 0, policy_version 9660 (0.0006) +[2023-02-26 12:06:11,507][00189] Updated weights for policy 0, policy_version 9670 (0.0006) +[2023-02-26 12:06:12,230][00189] Updated weights for policy 0, policy_version 9680 (0.0006) +[2023-02-26 12:06:12,883][00189] Updated weights for policy 0, policy_version 9690 (0.0006) +[2023-02-26 12:06:13,575][00189] Updated weights for policy 0, policy_version 9700 (0.0007) +[2023-02-26 12:06:13,656][00001] Fps is (10 sec: 57753.4, 60 sec: 58095.0, 300 sec: 58676.9). Total num frames: 39731200. 
Throughput: 0: 14523.2. Samples: 9933592. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:06:13,656][00001] Avg episode reward: [(0, '34.817')] +[2023-02-26 12:06:14,320][00189] Updated weights for policy 0, policy_version 9710 (0.0006) +[2023-02-26 12:06:14,999][00189] Updated weights for policy 0, policy_version 9720 (0.0007) +[2023-02-26 12:06:15,710][00189] Updated weights for policy 0, policy_version 9730 (0.0006) +[2023-02-26 12:06:16,403][00189] Updated weights for policy 0, policy_version 9740 (0.0006) +[2023-02-26 12:06:17,141][00189] Updated weights for policy 0, policy_version 9750 (0.0006) +[2023-02-26 12:06:17,831][00189] Updated weights for policy 0, policy_version 9760 (0.0006) +[2023-02-26 12:06:18,519][00189] Updated weights for policy 0, policy_version 9770 (0.0006) +[2023-02-26 12:06:18,656][00001] Fps is (10 sec: 57753.8, 60 sec: 58094.9, 300 sec: 58649.2). Total num frames: 40022016. Throughput: 0: 14533.2. Samples: 9977260. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:06:18,656][00001] Avg episode reward: [(0, '34.264')] +[2023-02-26 12:06:19,228][00189] Updated weights for policy 0, policy_version 9780 (0.0007) +[2023-02-26 12:06:19,928][00189] Updated weights for policy 0, policy_version 9790 (0.0006) +[2023-02-26 12:06:20,626][00189] Updated weights for policy 0, policy_version 9800 (0.0006) +[2023-02-26 12:06:21,336][00189] Updated weights for policy 0, policy_version 9810 (0.0007) +[2023-02-26 12:06:22,052][00189] Updated weights for policy 0, policy_version 9820 (0.0007) +[2023-02-26 12:06:22,760][00189] Updated weights for policy 0, policy_version 9830 (0.0007) +[2023-02-26 12:06:23,477][00189] Updated weights for policy 0, policy_version 9840 (0.0006) +[2023-02-26 12:06:23,656][00001] Fps is (10 sec: 58162.9, 60 sec: 58094.9, 300 sec: 58649.1). Total num frames: 40312832. Throughput: 0: 14531.3. Samples: 10064464. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:06:23,656][00001] Avg episode reward: [(0, '33.739')] +[2023-02-26 12:06:24,171][00189] Updated weights for policy 0, policy_version 9850 (0.0007) +[2023-02-26 12:06:24,894][00189] Updated weights for policy 0, policy_version 9860 (0.0007) +[2023-02-26 12:06:25,563][00189] Updated weights for policy 0, policy_version 9870 (0.0006) +[2023-02-26 12:06:26,285][00189] Updated weights for policy 0, policy_version 9880 (0.0007) +[2023-02-26 12:06:26,993][00189] Updated weights for policy 0, policy_version 9890 (0.0007) +[2023-02-26 12:06:27,685][00189] Updated weights for policy 0, policy_version 9900 (0.0007) +[2023-02-26 12:06:28,372][00189] Updated weights for policy 0, policy_version 9910 (0.0006) +[2023-02-26 12:06:28,656][00001] Fps is (10 sec: 58163.0, 60 sec: 58095.0, 300 sec: 58635.3). Total num frames: 40603648. Throughput: 0: 14533.4. Samples: 10151644. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:06:28,656][00001] Avg episode reward: [(0, '37.038')] +[2023-02-26 12:06:29,111][00189] Updated weights for policy 0, policy_version 9920 (0.0006) +[2023-02-26 12:06:29,799][00189] Updated weights for policy 0, policy_version 9930 (0.0006) +[2023-02-26 12:06:30,505][00189] Updated weights for policy 0, policy_version 9940 (0.0006) +[2023-02-26 12:06:31,227][00189] Updated weights for policy 0, policy_version 9950 (0.0006) +[2023-02-26 12:06:31,894][00189] Updated weights for policy 0, policy_version 9960 (0.0006) +[2023-02-26 12:06:32,631][00189] Updated weights for policy 0, policy_version 9970 (0.0006) +[2023-02-26 12:06:33,309][00189] Updated weights for policy 0, policy_version 9980 (0.0006) +[2023-02-26 12:06:33,656][00001] Fps is (10 sec: 58163.5, 60 sec: 58094.9, 300 sec: 58607.5). Total num frames: 40894464. Throughput: 0: 14531.6. Samples: 10195204. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:06:33,656][00001] Avg episode reward: [(0, '32.422')] +[2023-02-26 12:06:34,044][00189] Updated weights for policy 0, policy_version 9990 (0.0006) +[2023-02-26 12:06:34,730][00189] Updated weights for policy 0, policy_version 10000 (0.0006) +[2023-02-26 12:06:35,434][00189] Updated weights for policy 0, policy_version 10010 (0.0006) +[2023-02-26 12:06:36,155][00189] Updated weights for policy 0, policy_version 10020 (0.0006) +[2023-02-26 12:06:36,849][00189] Updated weights for policy 0, policy_version 10030 (0.0006) +[2023-02-26 12:06:37,565][00189] Updated weights for policy 0, policy_version 10040 (0.0007) +[2023-02-26 12:06:38,265][00189] Updated weights for policy 0, policy_version 10050 (0.0007) +[2023-02-26 12:06:38,656][00001] Fps is (10 sec: 58163.3, 60 sec: 58094.9, 300 sec: 58593.6). Total num frames: 41185280. Throughput: 0: 14529.5. Samples: 10282460. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:06:38,656][00001] Avg episode reward: [(0, '36.184')] +[2023-02-26 12:06:38,947][00189] Updated weights for policy 0, policy_version 10060 (0.0007) +[2023-02-26 12:06:39,667][00189] Updated weights for policy 0, policy_version 10070 (0.0006) +[2023-02-26 12:06:40,387][00189] Updated weights for policy 0, policy_version 10080 (0.0006) +[2023-02-26 12:06:41,054][00189] Updated weights for policy 0, policy_version 10090 (0.0006) +[2023-02-26 12:06:41,768][00189] Updated weights for policy 0, policy_version 10100 (0.0007) +[2023-02-26 12:06:42,475][00189] Updated weights for policy 0, policy_version 10110 (0.0006) +[2023-02-26 12:06:43,197][00189] Updated weights for policy 0, policy_version 10120 (0.0006) +[2023-02-26 12:06:43,656][00001] Fps is (10 sec: 58163.1, 60 sec: 58094.8, 300 sec: 58593.6). Total num frames: 41476096. Throughput: 0: 14534.0. Samples: 10369888. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:06:43,656][00001] Avg episode reward: [(0, '37.505')] +[2023-02-26 12:06:43,875][00189] Updated weights for policy 0, policy_version 10130 (0.0006) +[2023-02-26 12:06:44,575][00189] Updated weights for policy 0, policy_version 10140 (0.0006) +[2023-02-26 12:06:45,279][00189] Updated weights for policy 0, policy_version 10150 (0.0006) +[2023-02-26 12:06:46,002][00189] Updated weights for policy 0, policy_version 10160 (0.0006) +[2023-02-26 12:06:46,682][00189] Updated weights for policy 0, policy_version 10170 (0.0006) +[2023-02-26 12:06:47,413][00189] Updated weights for policy 0, policy_version 10180 (0.0007) +[2023-02-26 12:06:48,117][00189] Updated weights for policy 0, policy_version 10190 (0.0006) +[2023-02-26 12:06:48,656][00001] Fps is (10 sec: 58162.9, 60 sec: 58094.9, 300 sec: 58579.7). Total num frames: 41766912. Throughput: 0: 14536.6. Samples: 10413336. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:06:48,656][00001] Avg episode reward: [(0, '35.867')] +[2023-02-26 12:06:48,808][00189] Updated weights for policy 0, policy_version 10200 (0.0006) +[2023-02-26 12:06:49,509][00189] Updated weights for policy 0, policy_version 10210 (0.0006) +[2023-02-26 12:06:50,217][00189] Updated weights for policy 0, policy_version 10220 (0.0006) +[2023-02-26 12:06:50,945][00189] Updated weights for policy 0, policy_version 10230 (0.0006) +[2023-02-26 12:06:51,622][00189] Updated weights for policy 0, policy_version 10240 (0.0006) +[2023-02-26 12:06:52,336][00189] Updated weights for policy 0, policy_version 10250 (0.0006) +[2023-02-26 12:06:53,065][00189] Updated weights for policy 0, policy_version 10260 (0.0006) +[2023-02-26 12:06:53,656][00001] Fps is (10 sec: 58163.6, 60 sec: 58094.9, 300 sec: 58565.9). Total num frames: 42057728. Throughput: 0: 14531.3. Samples: 10500444. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:06:53,656][00001] Avg episode reward: [(0, '34.506')] +[2023-02-26 12:06:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000010268_42057728.pth... +[2023-02-26 12:06:53,702][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000006848_28049408.pth +[2023-02-26 12:06:53,760][00189] Updated weights for policy 0, policy_version 10270 (0.0006) +[2023-02-26 12:06:54,448][00189] Updated weights for policy 0, policy_version 10280 (0.0006) +[2023-02-26 12:06:55,181][00189] Updated weights for policy 0, policy_version 10290 (0.0006) +[2023-02-26 12:06:55,865][00189] Updated weights for policy 0, policy_version 10300 (0.0006) +[2023-02-26 12:06:56,577][00189] Updated weights for policy 0, policy_version 10310 (0.0006) +[2023-02-26 12:06:57,270][00189] Updated weights for policy 0, policy_version 10320 (0.0006) +[2023-02-26 12:06:57,973][00189] Updated weights for policy 0, policy_version 10330 (0.0006) +[2023-02-26 12:06:58,656][00001] Fps is (10 sec: 58162.7, 60 sec: 58094.8, 300 sec: 58552.0). Total num frames: 42348544. Throughput: 0: 14535.6. Samples: 10587696. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:06:58,656][00001] Avg episode reward: [(0, '36.611')] +[2023-02-26 12:06:58,695][00189] Updated weights for policy 0, policy_version 10340 (0.0006) +[2023-02-26 12:06:59,399][00189] Updated weights for policy 0, policy_version 10350 (0.0006) +[2023-02-26 12:07:00,101][00189] Updated weights for policy 0, policy_version 10360 (0.0006) +[2023-02-26 12:07:00,808][00189] Updated weights for policy 0, policy_version 10370 (0.0006) +[2023-02-26 12:07:01,488][00189] Updated weights for policy 0, policy_version 10380 (0.0007) +[2023-02-26 12:07:02,228][00189] Updated weights for policy 0, policy_version 10390 (0.0006) +[2023-02-26 12:07:02,892][00189] Updated weights for policy 0, policy_version 10400 (0.0006) +[2023-02-26 12:07:03,605][00189] Updated weights for policy 0, policy_version 10410 (0.0006) +[2023-02-26 12:07:03,656][00001] Fps is (10 sec: 58162.5, 60 sec: 58094.8, 300 sec: 58524.2). Total num frames: 42639360. Throughput: 0: 14532.4. Samples: 10631220. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:07:03,656][00001] Avg episode reward: [(0, '33.355')] +[2023-02-26 12:07:04,333][00189] Updated weights for policy 0, policy_version 10420 (0.0006) +[2023-02-26 12:07:04,995][00189] Updated weights for policy 0, policy_version 10430 (0.0007) +[2023-02-26 12:07:05,717][00189] Updated weights for policy 0, policy_version 10440 (0.0007) +[2023-02-26 12:07:06,418][00189] Updated weights for policy 0, policy_version 10450 (0.0007) +[2023-02-26 12:07:07,116][00189] Updated weights for policy 0, policy_version 10460 (0.0007) +[2023-02-26 12:07:07,827][00189] Updated weights for policy 0, policy_version 10470 (0.0006) +[2023-02-26 12:07:08,540][00189] Updated weights for policy 0, policy_version 10480 (0.0006) +[2023-02-26 12:07:08,656][00001] Fps is (10 sec: 58163.6, 60 sec: 58094.9, 300 sec: 58524.2). Total num frames: 42930176. Throughput: 0: 14539.7. Samples: 10718752. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 12:07:08,656][00001] Avg episode reward: [(0, '34.988')] +[2023-02-26 12:07:09,209][00189] Updated weights for policy 0, policy_version 10490 (0.0007) +[2023-02-26 12:07:09,942][00189] Updated weights for policy 0, policy_version 10500 (0.0006) +[2023-02-26 12:07:10,646][00189] Updated weights for policy 0, policy_version 10510 (0.0006) +[2023-02-26 12:07:11,325][00189] Updated weights for policy 0, policy_version 10520 (0.0006) +[2023-02-26 12:07:12,063][00189] Updated weights for policy 0, policy_version 10530 (0.0006) +[2023-02-26 12:07:12,755][00189] Updated weights for policy 0, policy_version 10540 (0.0006) +[2023-02-26 12:07:13,430][00189] Updated weights for policy 0, policy_version 10550 (0.0006) +[2023-02-26 12:07:13,656][00001] Fps is (10 sec: 58573.4, 60 sec: 58231.5, 300 sec: 58510.3). Total num frames: 43225088. Throughput: 0: 14545.3. Samples: 10806180. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) +[2023-02-26 12:07:13,656][00001] Avg episode reward: [(0, '38.448')] +[2023-02-26 12:07:14,182][00189] Updated weights for policy 0, policy_version 10560 (0.0007) +[2023-02-26 12:07:14,866][00189] Updated weights for policy 0, policy_version 10570 (0.0006) +[2023-02-26 12:07:15,539][00189] Updated weights for policy 0, policy_version 10580 (0.0006) +[2023-02-26 12:07:16,296][00189] Updated weights for policy 0, policy_version 10590 (0.0006) +[2023-02-26 12:07:16,973][00189] Updated weights for policy 0, policy_version 10600 (0.0007) +[2023-02-26 12:07:17,641][00189] Updated weights for policy 0, policy_version 10610 (0.0006) +[2023-02-26 12:07:18,413][00189] Updated weights for policy 0, policy_version 10620 (0.0007) +[2023-02-26 12:07:18,481][00141] Signal inference workers to stop experience collection... (350 times) +[2023-02-26 12:07:18,481][00141] Signal inference workers to resume experience collection... 
(350 times) +[2023-02-26 12:07:18,488][00189] InferenceWorker_p0-w0: stopping experience collection (350 times) +[2023-02-26 12:07:18,488][00189] InferenceWorker_p0-w0: resuming experience collection (350 times) +[2023-02-26 12:07:18,656][00001] Fps is (10 sec: 58163.5, 60 sec: 58163.2, 300 sec: 58468.7). Total num frames: 43511808. Throughput: 0: 14546.0. Samples: 10849772. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) +[2023-02-26 12:07:18,656][00001] Avg episode reward: [(0, '36.997')] +[2023-02-26 12:07:19,134][00189] Updated weights for policy 0, policy_version 10630 (0.0007) +[2023-02-26 12:07:19,816][00189] Updated weights for policy 0, policy_version 10640 (0.0007) +[2023-02-26 12:07:20,592][00189] Updated weights for policy 0, policy_version 10650 (0.0007) +[2023-02-26 12:07:21,283][00189] Updated weights for policy 0, policy_version 10660 (0.0006) +[2023-02-26 12:07:21,988][00189] Updated weights for policy 0, policy_version 10670 (0.0007) +[2023-02-26 12:07:22,734][00189] Updated weights for policy 0, policy_version 10680 (0.0007) +[2023-02-26 12:07:23,427][00189] Updated weights for policy 0, policy_version 10690 (0.0006) +[2023-02-26 12:07:23,656][00001] Fps is (10 sec: 57343.6, 60 sec: 58095.0, 300 sec: 58413.1). Total num frames: 43798528. Throughput: 0: 14512.0. Samples: 10935500. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:07:23,656][00001] Avg episode reward: [(0, '36.982')]
+[2023-02-26 12:07:24,108][00189] Updated weights for policy 0, policy_version 10700 (0.0007)
+[2023-02-26 12:07:24,861][00189] Updated weights for policy 0, policy_version 10710 (0.0007)
+[2023-02-26 12:07:25,522][00189] Updated weights for policy 0, policy_version 10720 (0.0007)
+[2023-02-26 12:07:26,211][00189] Updated weights for policy 0, policy_version 10730 (0.0006)
+[2023-02-26 12:07:26,958][00189] Updated weights for policy 0, policy_version 10740 (0.0007)
+[2023-02-26 12:07:27,621][00189] Updated weights for policy 0, policy_version 10750 (0.0007)
+[2023-02-26 12:07:28,330][00189] Updated weights for policy 0, policy_version 10760 (0.0007)
+[2023-02-26 12:07:28,656][00001] Fps is (10 sec: 57753.2, 60 sec: 58094.9, 300 sec: 58399.2). Total num frames: 44089344. Throughput: 0: 14513.6. Samples: 11023000. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:07:28,656][00001] Avg episode reward: [(0, '35.610')]
+[2023-02-26 12:07:29,075][00189] Updated weights for policy 0, policy_version 10770 (0.0007)
+[2023-02-26 12:07:29,764][00189] Updated weights for policy 0, policy_version 10780 (0.0008)
+[2023-02-26 12:07:30,563][00189] Updated weights for policy 0, policy_version 10790 (0.0007)
+[2023-02-26 12:07:31,300][00189] Updated weights for policy 0, policy_version 10800 (0.0006)
+[2023-02-26 12:07:32,013][00189] Updated weights for policy 0, policy_version 10810 (0.0006)
+[2023-02-26 12:07:32,751][00189] Updated weights for policy 0, policy_version 10820 (0.0007)
+[2023-02-26 12:07:33,498][00189] Updated weights for policy 0, policy_version 10830 (0.0006)
+[2023-02-26 12:07:33,656][00001] Fps is (10 sec: 56933.4, 60 sec: 57889.9, 300 sec: 58357.5). Total num frames: 44367872. Throughput: 0: 14464.3. Samples: 11064232. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:07:33,657][00001] Avg episode reward: [(0, '36.953')]
+[2023-02-26 12:07:34,193][00189] Updated weights for policy 0, policy_version 10840 (0.0007)
+[2023-02-26 12:07:34,902][00189] Updated weights for policy 0, policy_version 10850 (0.0007)
+[2023-02-26 12:07:35,593][00189] Updated weights for policy 0, policy_version 10860 (0.0007)
+[2023-02-26 12:07:36,319][00189] Updated weights for policy 0, policy_version 10870 (0.0007)
+[2023-02-26 12:07:37,012][00189] Updated weights for policy 0, policy_version 10880 (0.0007)
+[2023-02-26 12:07:37,687][00189] Updated weights for policy 0, policy_version 10890 (0.0006)
+[2023-02-26 12:07:38,439][00189] Updated weights for policy 0, policy_version 10900 (0.0006)
+[2023-02-26 12:07:38,656][00001] Fps is (10 sec: 56934.4, 60 sec: 57890.0, 300 sec: 58329.8). Total num frames: 44658688. Throughput: 0: 14448.4. Samples: 11150624. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:07:38,656][00001] Avg episode reward: [(0, '38.854')]
+[2023-02-26 12:07:39,103][00189] Updated weights for policy 0, policy_version 10910 (0.0006)
+[2023-02-26 12:07:39,804][00189] Updated weights for policy 0, policy_version 10920 (0.0006)
+[2023-02-26 12:07:40,555][00189] Updated weights for policy 0, policy_version 10930 (0.0007)
+[2023-02-26 12:07:41,212][00189] Updated weights for policy 0, policy_version 10940 (0.0007)
+[2023-02-26 12:07:41,946][00189] Updated weights for policy 0, policy_version 10950 (0.0006)
+[2023-02-26 12:07:42,701][00189] Updated weights for policy 0, policy_version 10960 (0.0006)
+[2023-02-26 12:07:43,448][00189] Updated weights for policy 0, policy_version 10970 (0.0007)
+[2023-02-26 12:07:43,656][00001] Fps is (10 sec: 57344.3, 60 sec: 57753.5, 300 sec: 58260.3). Total num frames: 44941312. Throughput: 0: 14423.0. Samples: 11236732. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:07:43,657][00001] Avg episode reward: [(0, '35.979')]
+[2023-02-26 12:07:44,117][00189] Updated weights for policy 0, policy_version 10980 (0.0007)
+[2023-02-26 12:07:44,854][00189] Updated weights for policy 0, policy_version 10990 (0.0007)
+[2023-02-26 12:07:45,582][00189] Updated weights for policy 0, policy_version 11000 (0.0006)
+[2023-02-26 12:07:46,330][00189] Updated weights for policy 0, policy_version 11010 (0.0007)
+[2023-02-26 12:07:47,005][00189] Updated weights for policy 0, policy_version 11020 (0.0007)
+[2023-02-26 12:07:47,708][00189] Updated weights for policy 0, policy_version 11030 (0.0007)
+[2023-02-26 12:07:48,497][00189] Updated weights for policy 0, policy_version 11040 (0.0007)
+[2023-02-26 12:07:48,656][00001] Fps is (10 sec: 56934.9, 60 sec: 57685.4, 300 sec: 58232.6). Total num frames: 45228032. Throughput: 0: 14402.6. Samples: 11279336. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:07:48,656][00001] Avg episode reward: [(0, '40.002')]
+[2023-02-26 12:07:48,656][00141] Saving new best policy, reward=40.002!
+[2023-02-26 12:07:49,181][00189] Updated weights for policy 0, policy_version 11050 (0.0006)
+[2023-02-26 12:07:49,881][00189] Updated weights for policy 0, policy_version 11060 (0.0006)
+[2023-02-26 12:07:50,681][00189] Updated weights for policy 0, policy_version 11070 (0.0007)
+[2023-02-26 12:07:51,369][00189] Updated weights for policy 0, policy_version 11080 (0.0006)
+[2023-02-26 12:07:52,044][00189] Updated weights for policy 0, policy_version 11090 (0.0006)
+[2023-02-26 12:07:52,816][00189] Updated weights for policy 0, policy_version 11100 (0.0007)
+[2023-02-26 12:07:53,486][00189] Updated weights for policy 0, policy_version 11110 (0.0006)
+[2023-02-26 12:07:53,656][00001] Fps is (10 sec: 57344.9, 60 sec: 57617.0, 300 sec: 58191.0). Total num frames: 45514752. Throughput: 0: 14351.2. Samples: 11364556. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:07:53,656][00001] Avg episode reward: [(0, '37.077')]
+[2023-02-26 12:07:54,185][00189] Updated weights for policy 0, policy_version 11120 (0.0007)
+[2023-02-26 12:07:54,908][00189] Updated weights for policy 0, policy_version 11130 (0.0007)
+[2023-02-26 12:07:55,617][00189] Updated weights for policy 0, policy_version 11140 (0.0006)
+[2023-02-26 12:07:56,323][00189] Updated weights for policy 0, policy_version 11150 (0.0007)
+[2023-02-26 12:07:57,001][00189] Updated weights for policy 0, policy_version 11160 (0.0006)
+[2023-02-26 12:07:57,738][00189] Updated weights for policy 0, policy_version 11170 (0.0006)
+[2023-02-26 12:07:58,438][00189] Updated weights for policy 0, policy_version 11180 (0.0006)
+[2023-02-26 12:07:58,656][00001] Fps is (10 sec: 57343.9, 60 sec: 57548.9, 300 sec: 58163.2). Total num frames: 45801472. Throughput: 0: 14337.7. Samples: 11451376. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:07:58,656][00001] Avg episode reward: [(0, '36.628')]
+[2023-02-26 12:07:59,157][00189] Updated weights for policy 0, policy_version 11190 (0.0006)
+[2023-02-26 12:07:59,863][00189] Updated weights for policy 0, policy_version 11200 (0.0006)
+[2023-02-26 12:08:00,627][00189] Updated weights for policy 0, policy_version 11210 (0.0006)
+[2023-02-26 12:08:01,318][00189] Updated weights for policy 0, policy_version 11220 (0.0006)
+[2023-02-26 12:08:02,014][00189] Updated weights for policy 0, policy_version 11230 (0.0006)
+[2023-02-26 12:08:02,753][00189] Updated weights for policy 0, policy_version 11240 (0.0007)
+[2023-02-26 12:08:03,451][00189] Updated weights for policy 0, policy_version 11250 (0.0006)
+[2023-02-26 12:08:03,656][00001] Fps is (10 sec: 57344.0, 60 sec: 57480.6, 300 sec: 58121.5). Total num frames: 46088192. Throughput: 0: 14317.3. Samples: 11494052. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:03,656][00001] Avg episode reward: [(0, '34.886')]
+[2023-02-26 12:08:04,146][00189] Updated weights for policy 0, policy_version 11260 (0.0006)
+[2023-02-26 12:08:04,880][00189] Updated weights for policy 0, policy_version 11270 (0.0006)
+[2023-02-26 12:08:05,563][00189] Updated weights for policy 0, policy_version 11280 (0.0007)
+[2023-02-26 12:08:06,271][00189] Updated weights for policy 0, policy_version 11290 (0.0007)
+[2023-02-26 12:08:07,019][00189] Updated weights for policy 0, policy_version 11300 (0.0007)
+[2023-02-26 12:08:07,690][00189] Updated weights for policy 0, policy_version 11310 (0.0007)
+[2023-02-26 12:08:08,394][00189] Updated weights for policy 0, policy_version 11320 (0.0006)
+[2023-02-26 12:08:08,656][00001] Fps is (10 sec: 57753.8, 60 sec: 57480.6, 300 sec: 58107.7). Total num frames: 46379008. Throughput: 0: 14338.5. Samples: 11580732. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:08,656][00001] Avg episode reward: [(0, '39.368')]
+[2023-02-26 12:08:09,134][00189] Updated weights for policy 0, policy_version 11330 (0.0006)
+[2023-02-26 12:08:09,851][00189] Updated weights for policy 0, policy_version 11340 (0.0007)
+[2023-02-26 12:08:10,500][00189] Updated weights for policy 0, policy_version 11350 (0.0007)
+[2023-02-26 12:08:11,256][00189] Updated weights for policy 0, policy_version 11360 (0.0007)
+[2023-02-26 12:08:11,975][00189] Updated weights for policy 0, policy_version 11370 (0.0006)
+[2023-02-26 12:08:12,654][00189] Updated weights for policy 0, policy_version 11380 (0.0006)
+[2023-02-26 12:08:13,360][00189] Updated weights for policy 0, policy_version 11390 (0.0007)
+[2023-02-26 12:08:13,656][00001] Fps is (10 sec: 57753.7, 60 sec: 57344.0, 300 sec: 58066.0). Total num frames: 46665728. Throughput: 0: 14317.7. Samples: 11667296. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:08:13,656][00001] Avg episode reward: [(0, '39.479')]
+[2023-02-26 12:08:14,106][00189] Updated weights for policy 0, policy_version 11400 (0.0007)
+[2023-02-26 12:08:14,800][00189] Updated weights for policy 0, policy_version 11410 (0.0006)
+[2023-02-26 12:08:15,508][00189] Updated weights for policy 0, policy_version 11420 (0.0006)
+[2023-02-26 12:08:16,209][00189] Updated weights for policy 0, policy_version 11430 (0.0006)
+[2023-02-26 12:08:16,937][00189] Updated weights for policy 0, policy_version 11440 (0.0007)
+[2023-02-26 12:08:17,618][00189] Updated weights for policy 0, policy_version 11450 (0.0007)
+[2023-02-26 12:08:18,342][00189] Updated weights for policy 0, policy_version 11460 (0.0006)
+[2023-02-26 12:08:18,656][00001] Fps is (10 sec: 57752.3, 60 sec: 57412.1, 300 sec: 58038.2). Total num frames: 46956544. Throughput: 0: 14364.2. Samples: 11710620. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:08:18,657][00001] Avg episode reward: [(0, '40.938')]
+[2023-02-26 12:08:18,657][00141] Saving new best policy, reward=40.938!
+[2023-02-26 12:08:19,039][00189] Updated weights for policy 0, policy_version 11470 (0.0006)
+[2023-02-26 12:08:19,745][00189] Updated weights for policy 0, policy_version 11480 (0.0007)
+[2023-02-26 12:08:20,431][00189] Updated weights for policy 0, policy_version 11490 (0.0006)
+[2023-02-26 12:08:21,144][00189] Updated weights for policy 0, policy_version 11500 (0.0007)
+[2023-02-26 12:08:21,842][00189] Updated weights for policy 0, policy_version 11510 (0.0006)
+[2023-02-26 12:08:22,536][00189] Updated weights for policy 0, policy_version 11520 (0.0006)
+[2023-02-26 12:08:23,280][00189] Updated weights for policy 0, policy_version 11530 (0.0006)
+[2023-02-26 12:08:23,656][00001] Fps is (10 sec: 58162.1, 60 sec: 57480.4, 300 sec: 58024.3). Total num frames: 47247360. Throughput: 0: 14386.4. Samples: 11798012. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:23,656][00001] Avg episode reward: [(0, '38.859')]
+[2023-02-26 12:08:23,994][00189] Updated weights for policy 0, policy_version 11540 (0.0006)
+[2023-02-26 12:08:24,721][00189] Updated weights for policy 0, policy_version 11550 (0.0006)
+[2023-02-26 12:08:25,462][00189] Updated weights for policy 0, policy_version 11560 (0.0007)
+[2023-02-26 12:08:26,185][00189] Updated weights for policy 0, policy_version 11570 (0.0007)
+[2023-02-26 12:08:26,932][00189] Updated weights for policy 0, policy_version 11580 (0.0006)
+[2023-02-26 12:08:27,647][00189] Updated weights for policy 0, policy_version 11590 (0.0006)
+[2023-02-26 12:08:28,391][00189] Updated weights for policy 0, policy_version 11600 (0.0006)
+[2023-02-26 12:08:28,656][00001] Fps is (10 sec: 56935.5, 60 sec: 57275.8, 300 sec: 57954.9). Total num frames: 47525888. Throughput: 0: 14350.6. Samples: 11882508. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:28,656][00001] Avg episode reward: [(0, '38.736')]
+[2023-02-26 12:08:29,093][00189] Updated weights for policy 0, policy_version 11610 (0.0006)
+[2023-02-26 12:08:29,790][00189] Updated weights for policy 0, policy_version 11620 (0.0006)
+[2023-02-26 12:08:30,526][00189] Updated weights for policy 0, policy_version 11630 (0.0007)
+[2023-02-26 12:08:31,234][00189] Updated weights for policy 0, policy_version 11640 (0.0006)
+[2023-02-26 12:08:31,962][00189] Updated weights for policy 0, policy_version 11650 (0.0006)
+[2023-02-26 12:08:32,685][00189] Updated weights for policy 0, policy_version 11660 (0.0007)
+[2023-02-26 12:08:33,382][00189] Updated weights for policy 0, policy_version 11670 (0.0006)
+[2023-02-26 12:08:33,656][00001] Fps is (10 sec: 56525.0, 60 sec: 57412.4, 300 sec: 57941.0). Total num frames: 47812608. Throughput: 0: 14354.9. Samples: 11925308. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:08:33,656][00001] Avg episode reward: [(0, '39.487')]
+[2023-02-26 12:08:34,110][00189] Updated weights for policy 0, policy_version 11680 (0.0007)
+[2023-02-26 12:08:34,845][00189] Updated weights for policy 0, policy_version 11690 (0.0007)
+[2023-02-26 12:08:35,553][00189] Updated weights for policy 0, policy_version 11700 (0.0007)
+[2023-02-26 12:08:36,308][00189] Updated weights for policy 0, policy_version 11710 (0.0006)
+[2023-02-26 12:08:37,033][00189] Updated weights for policy 0, policy_version 11720 (0.0007)
+[2023-02-26 12:08:37,760][00189] Updated weights for policy 0, policy_version 11730 (0.0006)
+[2023-02-26 12:08:38,481][00189] Updated weights for policy 0, policy_version 11740 (0.0006)
+[2023-02-26 12:08:38,656][00001] Fps is (10 sec: 56934.6, 60 sec: 57275.8, 300 sec: 57913.3). Total num frames: 48095232. Throughput: 0: 14350.9. Samples: 12010344. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:08:38,656][00001] Avg episode reward: [(0, '38.368')]
+[2023-02-26 12:08:39,205][00189] Updated weights for policy 0, policy_version 11750 (0.0006)
+[2023-02-26 12:08:39,896][00189] Updated weights for policy 0, policy_version 11760 (0.0006)
+[2023-02-26 12:08:40,619][00189] Updated weights for policy 0, policy_version 11770 (0.0006)
+[2023-02-26 12:08:41,346][00189] Updated weights for policy 0, policy_version 11780 (0.0007)
+[2023-02-26 12:08:42,067][00189] Updated weights for policy 0, policy_version 11790 (0.0006)
+[2023-02-26 12:08:42,798][00189] Updated weights for policy 0, policy_version 11800 (0.0006)
+[2023-02-26 12:08:43,498][00189] Updated weights for policy 0, policy_version 11810 (0.0006)
+[2023-02-26 12:08:43,656][00001] Fps is (10 sec: 56525.6, 60 sec: 57275.9, 300 sec: 57871.6). Total num frames: 48377856. Throughput: 0: 14314.6. Samples: 12095532. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:08:43,656][00001] Avg episode reward: [(0, '40.154')]
+[2023-02-26 12:08:44,243][00189] Updated weights for policy 0, policy_version 11820 (0.0007)
+[2023-02-26 12:08:44,955][00189] Updated weights for policy 0, policy_version 11830 (0.0007)
+[2023-02-26 12:08:45,670][00189] Updated weights for policy 0, policy_version 11840 (0.0007)
+[2023-02-26 12:08:46,409][00189] Updated weights for policy 0, policy_version 11850 (0.0006)
+[2023-02-26 12:08:47,149][00189] Updated weights for policy 0, policy_version 11860 (0.0006)
+[2023-02-26 12:08:47,824][00189] Updated weights for policy 0, policy_version 11870 (0.0007)
+[2023-02-26 12:08:48,602][00189] Updated weights for policy 0, policy_version 11880 (0.0006)
+[2023-02-26 12:08:48,656][00001] Fps is (10 sec: 56934.4, 60 sec: 57275.8, 300 sec: 57857.7). Total num frames: 48664576. Throughput: 0: 14311.1. Samples: 12138052. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:48,656][00001] Avg episode reward: [(0, '39.596')]
+[2023-02-26 12:08:49,312][00189] Updated weights for policy 0, policy_version 11890 (0.0007)
+[2023-02-26 12:08:50,008][00189] Updated weights for policy 0, policy_version 11900 (0.0007)
+[2023-02-26 12:08:50,765][00189] Updated weights for policy 0, policy_version 11910 (0.0006)
+[2023-02-26 12:08:51,471][00189] Updated weights for policy 0, policy_version 11920 (0.0006)
+[2023-02-26 12:08:52,159][00189] Updated weights for policy 0, policy_version 11930 (0.0006)
+[2023-02-26 12:08:52,919][00189] Updated weights for policy 0, policy_version 11940 (0.0006)
+[2023-02-26 12:08:53,610][00189] Updated weights for policy 0, policy_version 11950 (0.0007)
+[2023-02-26 12:08:53,656][00001] Fps is (10 sec: 56934.3, 60 sec: 57207.5, 300 sec: 57843.9). Total num frames: 48947200. Throughput: 0: 14275.4. Samples: 12223124. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:08:53,656][00001] Avg episode reward: [(0, '37.167')]
+[2023-02-26 12:08:53,663][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000011951_48951296.pth...
+[2023-02-26 12:08:53,700][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000008564_35078144.pth
+[2023-02-26 12:08:54,329][00189] Updated weights for policy 0, policy_version 11960 (0.0007)
+[2023-02-26 12:08:55,108][00189] Updated weights for policy 0, policy_version 11970 (0.0007)
+[2023-02-26 12:08:55,806][00189] Updated weights for policy 0, policy_version 11980 (0.0007)
+[2023-02-26 12:08:56,496][00189] Updated weights for policy 0, policy_version 11990 (0.0006)
+[2023-02-26 12:08:57,218][00189] Updated weights for policy 0, policy_version 12000 (0.0006)
+[2023-02-26 12:08:57,950][00189] Updated weights for policy 0, policy_version 12010 (0.0006)
+[2023-02-26 12:08:58,656][00001] Fps is (10 sec: 56524.5, 60 sec: 57139.2, 300 sec: 57802.2). Total num frames: 49229824. Throughput: 0: 14244.8. Samples: 12308312. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:08:58,656][00001] Avg episode reward: [(0, '39.180')]
+[2023-02-26 12:08:58,667][00189] Updated weights for policy 0, policy_version 12020 (0.0006)
+[2023-02-26 12:08:59,379][00189] Updated weights for policy 0, policy_version 12030 (0.0007)
+[2023-02-26 12:09:00,138][00189] Updated weights for policy 0, policy_version 12040 (0.0007)
+[2023-02-26 12:09:00,813][00189] Updated weights for policy 0, policy_version 12050 (0.0006)
+[2023-02-26 12:09:01,496][00189] Updated weights for policy 0, policy_version 12060 (0.0006)
+[2023-02-26 12:09:02,216][00189] Updated weights for policy 0, policy_version 12070 (0.0006)
+[2023-02-26 12:09:02,940][00189] Updated weights for policy 0, policy_version 12080 (0.0006)
+[2023-02-26 12:09:03,638][00189] Updated weights for policy 0, policy_version 12090 (0.0006)
+[2023-02-26 12:09:03,656][00001] Fps is (10 sec: 57343.1, 60 sec: 57207.3, 300 sec: 57802.2). Total num frames: 49520640. Throughput: 0: 14238.1. Samples: 12351336. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:09:03,656][00001] Avg episode reward: [(0, '37.763')]
+[2023-02-26 12:09:04,356][00189] Updated weights for policy 0, policy_version 12100 (0.0006)
+[2023-02-26 12:09:05,082][00189] Updated weights for policy 0, policy_version 12110 (0.0006)
+[2023-02-26 12:09:05,799][00189] Updated weights for policy 0, policy_version 12120 (0.0007)
+[2023-02-26 12:09:06,544][00189] Updated weights for policy 0, policy_version 12130 (0.0006)
+[2023-02-26 12:09:07,256][00189] Updated weights for policy 0, policy_version 12140 (0.0007)
+[2023-02-26 12:09:07,961][00189] Updated weights for policy 0, policy_version 12150 (0.0007)
+[2023-02-26 12:09:08,656][00001] Fps is (10 sec: 57344.0, 60 sec: 57070.9, 300 sec: 57774.4). Total num frames: 49803264. Throughput: 0: 14201.2. Samples: 12437064. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:09:08,656][00001] Avg episode reward: [(0, '41.519')]
+[2023-02-26 12:09:08,656][00141] Saving new best policy, reward=41.519!
+[2023-02-26 12:09:08,730][00189] Updated weights for policy 0, policy_version 12160 (0.0007)
+[2023-02-26 12:09:09,397][00189] Updated weights for policy 0, policy_version 12170 (0.0006)
+[2023-02-26 12:09:10,180][00189] Updated weights for policy 0, policy_version 12180 (0.0007)
+[2023-02-26 12:09:10,865][00189] Updated weights for policy 0, policy_version 12190 (0.0006)
+[2023-02-26 12:09:11,586][00189] Updated weights for policy 0, policy_version 12200 (0.0006)
+[2023-02-26 12:09:12,367][00189] Updated weights for policy 0, policy_version 12210 (0.0007)
+[2023-02-26 12:09:13,067][00189] Updated weights for policy 0, policy_version 12220 (0.0006)
+[2023-02-26 12:09:13,656][00001] Fps is (10 sec: 56115.8, 60 sec: 56934.4, 300 sec: 57746.7). Total num frames: 50081792. Throughput: 0: 14201.6. Samples: 12521580. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:09:13,656][00001] Avg episode reward: [(0, '40.339')]
+[2023-02-26 12:09:13,797][00189] Updated weights for policy 0, policy_version 12230 (0.0006)
+[2023-02-26 12:09:14,544][00189] Updated weights for policy 0, policy_version 12240 (0.0007)
+[2023-02-26 12:09:15,230][00189] Updated weights for policy 0, policy_version 12250 (0.0006)
+[2023-02-26 12:09:15,963][00189] Updated weights for policy 0, policy_version 12260 (0.0006)
+[2023-02-26 12:09:16,704][00189] Updated weights for policy 0, policy_version 12270 (0.0007)
+[2023-02-26 12:09:17,398][00189] Updated weights for policy 0, policy_version 12280 (0.0007)
+[2023-02-26 12:09:18,143][00189] Updated weights for policy 0, policy_version 12290 (0.0006)
+[2023-02-26 12:09:18,656][00001] Fps is (10 sec: 56524.9, 60 sec: 56866.3, 300 sec: 57732.8). Total num frames: 50368512. Throughput: 0: 14191.3. Samples: 12563916. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:09:18,656][00001] Avg episode reward: [(0, '39.115')]
+[2023-02-26 12:09:18,863][00189] Updated weights for policy 0, policy_version 12300 (0.0006)
+[2023-02-26 12:09:19,531][00189] Updated weights for policy 0, policy_version 12310 (0.0006)
+[2023-02-26 12:09:20,264][00189] Updated weights for policy 0, policy_version 12320 (0.0007)
+[2023-02-26 12:09:21,015][00189] Updated weights for policy 0, policy_version 12330 (0.0007)
+[2023-02-26 12:09:21,701][00189] Updated weights for policy 0, policy_version 12340 (0.0006)
+[2023-02-26 12:09:22,443][00189] Updated weights for policy 0, policy_version 12350 (0.0006)
+[2023-02-26 12:09:23,184][00189] Updated weights for policy 0, policy_version 12360 (0.0007)
+[2023-02-26 12:09:23,656][00001] Fps is (10 sec: 57344.3, 60 sec: 56798.0, 300 sec: 57718.9). Total num frames: 50655232. Throughput: 0: 14199.8. Samples: 12649336. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:09:23,656][00001] Avg episode reward: [(0, '38.054')]
+[2023-02-26 12:09:23,867][00189] Updated weights for policy 0, policy_version 12370 (0.0007)
+[2023-02-26 12:09:24,636][00189] Updated weights for policy 0, policy_version 12380 (0.0007)
+[2023-02-26 12:09:25,355][00189] Updated weights for policy 0, policy_version 12390 (0.0006)
+[2023-02-26 12:09:26,041][00189] Updated weights for policy 0, policy_version 12400 (0.0006)
+[2023-02-26 12:09:26,811][00189] Updated weights for policy 0, policy_version 12410 (0.0007)
+[2023-02-26 12:09:27,514][00189] Updated weights for policy 0, policy_version 12420 (0.0006)
+[2023-02-26 12:09:28,223][00189] Updated weights for policy 0, policy_version 12430 (0.0006)
+[2023-02-26 12:09:28,656][00001] Fps is (10 sec: 56524.7, 60 sec: 56797.8, 300 sec: 57677.2). Total num frames: 50933760. Throughput: 0: 14197.2. Samples: 12734408. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:09:28,656][00001] Avg episode reward: [(0, '40.782')]
+[2023-02-26 12:09:28,968][00189] Updated weights for policy 0, policy_version 12440 (0.0007)
+[2023-02-26 12:09:29,693][00189] Updated weights for policy 0, policy_version 12450 (0.0006)
+[2023-02-26 12:09:30,417][00189] Updated weights for policy 0, policy_version 12460 (0.0007)
+[2023-02-26 12:09:31,124][00189] Updated weights for policy 0, policy_version 12470 (0.0007)
+[2023-02-26 12:09:31,866][00189] Updated weights for policy 0, policy_version 12480 (0.0006)
+[2023-02-26 12:09:32,580][00189] Updated weights for policy 0, policy_version 12490 (0.0007)
+[2023-02-26 12:09:33,285][00189] Updated weights for policy 0, policy_version 12500 (0.0006)
+[2023-02-26 12:09:33,656][00001] Fps is (10 sec: 56114.9, 60 sec: 56729.7, 300 sec: 57649.5). Total num frames: 51216384. Throughput: 0: 14192.8. Samples: 12776728. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:09:33,656][00001] Avg episode reward: [(0, '39.739')]
+[2023-02-26 12:09:34,035][00189] Updated weights for policy 0, policy_version 12510 (0.0007)
+[2023-02-26 12:09:34,754][00189] Updated weights for policy 0, policy_version 12520 (0.0006)
+[2023-02-26 12:09:35,467][00189] Updated weights for policy 0, policy_version 12530 (0.0006)
+[2023-02-26 12:09:36,180][00189] Updated weights for policy 0, policy_version 12540 (0.0006)
+[2023-02-26 12:09:36,801][00141] Signal inference workers to stop experience collection... (400 times)
+[2023-02-26 12:09:36,801][00141] Signal inference workers to resume experience collection... (400 times)
+[2023-02-26 12:09:36,811][00189] InferenceWorker_p0-w0: stopping experience collection (400 times)
+[2023-02-26 12:09:36,811][00189] InferenceWorker_p0-w0: resuming experience collection (400 times)
+[2023-02-26 12:09:36,912][00189] Updated weights for policy 0, policy_version 12550 (0.0007)
+[2023-02-26 12:09:37,667][00189] Updated weights for policy 0, policy_version 12560 (0.0007)
+[2023-02-26 12:09:38,332][00189] Updated weights for policy 0, policy_version 12570 (0.0008)
+[2023-02-26 12:09:38,656][00001] Fps is (10 sec: 56934.1, 60 sec: 56797.8, 300 sec: 57649.5). Total num frames: 51503104. Throughput: 0: 14193.2. Samples: 12861820. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:09:38,656][00001] Avg episode reward: [(0, '40.749')]
+[2023-02-26 12:09:39,061][00189] Updated weights for policy 0, policy_version 12580 (0.0006)
+[2023-02-26 12:09:39,817][00189] Updated weights for policy 0, policy_version 12590 (0.0006)
+[2023-02-26 12:09:40,534][00189] Updated weights for policy 0, policy_version 12600 (0.0007)
+[2023-02-26 12:09:41,271][00189] Updated weights for policy 0, policy_version 12610 (0.0007)
+[2023-02-26 12:09:42,021][00189] Updated weights for policy 0, policy_version 12620 (0.0007)
+[2023-02-26 12:09:42,717][00189] Updated weights for policy 0, policy_version 12630 (0.0007)
+[2023-02-26 12:09:43,467][00189] Updated weights for policy 0, policy_version 12640 (0.0006)
+[2023-02-26 12:09:43,656][00001] Fps is (10 sec: 56524.2, 60 sec: 56729.5, 300 sec: 57607.8). Total num frames: 51781632. Throughput: 0: 14181.7. Samples: 12946492. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0)
+[2023-02-26 12:09:43,656][00001] Avg episode reward: [(0, '41.704')]
+[2023-02-26 12:09:43,677][00141] Saving new best policy, reward=41.704!
+[2023-02-26 12:09:44,189][00189] Updated weights for policy 0, policy_version 12650 (0.0007)
+[2023-02-26 12:09:44,889][00189] Updated weights for policy 0, policy_version 12660 (0.0006)
+[2023-02-26 12:09:45,594][00189] Updated weights for policy 0, policy_version 12670 (0.0007)
+[2023-02-26 12:09:46,334][00189] Updated weights for policy 0, policy_version 12680 (0.0007)
+[2023-02-26 12:09:47,012][00189] Updated weights for policy 0, policy_version 12690 (0.0007)
+[2023-02-26 12:09:47,714][00189] Updated weights for policy 0, policy_version 12700 (0.0006)
+[2023-02-26 12:09:48,376][00189] Updated weights for policy 0, policy_version 12710 (0.0006)
+[2023-02-26 12:09:48,656][00001] Fps is (10 sec: 56934.8, 60 sec: 56797.8, 300 sec: 57607.8). Total num frames: 52072448. Throughput: 0: 14176.0. Samples: 12989252. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0)
+[2023-02-26 12:09:48,656][00001] Avg episode reward: [(0, '42.635')]
+[2023-02-26 12:09:48,656][00141] Saving new best policy, reward=42.635!
+[2023-02-26 12:09:49,099][00189] Updated weights for policy 0, policy_version 12720 (0.0006)
+[2023-02-26 12:09:49,772][00189] Updated weights for policy 0, policy_version 12730 (0.0006)
+[2023-02-26 12:09:50,428][00189] Updated weights for policy 0, policy_version 12740 (0.0006)
+[2023-02-26 12:09:51,152][00189] Updated weights for policy 0, policy_version 12750 (0.0006)
+[2023-02-26 12:09:51,807][00189] Updated weights for policy 0, policy_version 12760 (0.0006)
+[2023-02-26 12:09:52,489][00189] Updated weights for policy 0, policy_version 12770 (0.0006)
+[2023-02-26 12:09:53,197][00189] Updated weights for policy 0, policy_version 12780 (0.0006)
+[2023-02-26 12:09:53,656][00001] Fps is (10 sec: 59392.7, 60 sec: 57139.2, 300 sec: 57621.7). Total num frames: 52375552. Throughput: 0: 14260.7. Samples: 13078796. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:09:53,656][00001] Avg episode reward: [(0, '40.347')]
+[2023-02-26 12:09:53,847][00189] Updated weights for policy 0, policy_version 12790 (0.0006)
+[2023-02-26 12:09:54,547][00189] Updated weights for policy 0, policy_version 12800 (0.0006)
+[2023-02-26 12:09:55,271][00189] Updated weights for policy 0, policy_version 12810 (0.0007)
+[2023-02-26 12:09:55,934][00189] Updated weights for policy 0, policy_version 12820 (0.0007)
+[2023-02-26 12:09:56,680][00189] Updated weights for policy 0, policy_version 12830 (0.0007)
+[2023-02-26 12:09:57,455][00189] Updated weights for policy 0, policy_version 12840 (0.0008)
+[2023-02-26 12:09:58,127][00189] Updated weights for policy 0, policy_version 12850 (0.0007)
+[2023-02-26 12:09:58,656][00001] Fps is (10 sec: 58982.7, 60 sec: 57207.5, 300 sec: 57607.8). Total num frames: 52662272. Throughput: 0: 14313.8. Samples: 13165700. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:09:58,656][00001] Avg episode reward: [(0, '42.108')]
+[2023-02-26 12:09:58,829][00189] Updated weights for policy 0, policy_version 12860 (0.0007)
+[2023-02-26 12:09:59,565][00189] Updated weights for policy 0, policy_version 12870 (0.0006)
+[2023-02-26 12:10:00,252][00189] Updated weights for policy 0, policy_version 12880 (0.0006)
+[2023-02-26 12:10:00,940][00189] Updated weights for policy 0, policy_version 12890 (0.0006)
+[2023-02-26 12:10:01,662][00189] Updated weights for policy 0, policy_version 12900 (0.0006)
+[2023-02-26 12:10:02,334][00189] Updated weights for policy 0, policy_version 12910 (0.0006)
+[2023-02-26 12:10:03,023][00189] Updated weights for policy 0, policy_version 12920 (0.0006)
+[2023-02-26 12:10:03,656][00001] Fps is (10 sec: 58163.2, 60 sec: 57275.9, 300 sec: 57635.6). Total num frames: 52957184. Throughput: 0: 14350.3. Samples: 13209680. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:10:03,656][00001] Avg episode reward: [(0, '36.763')]
+[2023-02-26 12:10:03,754][00189] Updated weights for policy 0, policy_version 12930 (0.0006)
+[2023-02-26 12:10:04,419][00189] Updated weights for policy 0, policy_version 12940 (0.0007)
+[2023-02-26 12:10:05,125][00189] Updated weights for policy 0, policy_version 12950 (0.0006)
+[2023-02-26 12:10:05,830][00189] Updated weights for policy 0, policy_version 12960 (0.0006)
+[2023-02-26 12:10:06,486][00189] Updated weights for policy 0, policy_version 12970 (0.0006)
+[2023-02-26 12:10:07,194][00189] Updated weights for policy 0, policy_version 12980 (0.0006)
+[2023-02-26 12:10:07,890][00189] Updated weights for policy 0, policy_version 12990 (0.0006)
+[2023-02-26 12:10:08,618][00189] Updated weights for policy 0, policy_version 13000 (0.0007)
+[2023-02-26 12:10:08,656][00001] Fps is (10 sec: 58982.1, 60 sec: 57480.5, 300 sec: 57649.5). Total num frames: 53252096. Throughput: 0: 14420.4. Samples: 13298252. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:10:08,656][00001] Avg episode reward: [(0, '39.574')]
+[2023-02-26 12:10:09,287][00189] Updated weights for policy 0, policy_version 13010 (0.0007)
+[2023-02-26 12:10:09,964][00189] Updated weights for policy 0, policy_version 13020 (0.0006)
+[2023-02-26 12:10:10,704][00189] Updated weights for policy 0, policy_version 13030 (0.0006)
+[2023-02-26 12:10:11,381][00189] Updated weights for policy 0, policy_version 13040 (0.0006)
+[2023-02-26 12:10:12,033][00189] Updated weights for policy 0, policy_version 13050 (0.0006)
+[2023-02-26 12:10:12,784][00189] Updated weights for policy 0, policy_version 13060 (0.0007)
+[2023-02-26 12:10:13,444][00189] Updated weights for policy 0, policy_version 13070 (0.0006)
+[2023-02-26 12:10:13,656][00001] Fps is (10 sec: 58572.6, 60 sec: 57685.3, 300 sec: 57649.5). Total num frames: 53542912. Throughput: 0: 14493.9. Samples: 13386632. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:10:13,656][00001] Avg episode reward: [(0, '42.275')]
+[2023-02-26 12:10:14,124][00189] Updated weights for policy 0, policy_version 13080 (0.0006)
+[2023-02-26 12:10:14,859][00189] Updated weights for policy 0, policy_version 13090 (0.0006)
+[2023-02-26 12:10:15,542][00189] Updated weights for policy 0, policy_version 13100 (0.0006)
+[2023-02-26 12:10:16,206][00189] Updated weights for policy 0, policy_version 13110 (0.0006)
+[2023-02-26 12:10:16,909][00189] Updated weights for policy 0, policy_version 13120 (0.0006)
+[2023-02-26 12:10:17,629][00189] Updated weights for policy 0, policy_version 13130 (0.0007)
+[2023-02-26 12:10:18,369][00189] Updated weights for policy 0, policy_version 13140 (0.0006)
+[2023-02-26 12:10:18,656][00001] Fps is (10 sec: 58572.7, 60 sec: 57821.9, 300 sec: 57663.3). Total num frames: 53837824. Throughput: 0: 14539.6. Samples: 13431008. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:10:18,656][00001] Avg episode reward: [(0, '39.753')]
+[2023-02-26 12:10:19,038][00189] Updated weights for policy 0, policy_version 13150 (0.0007)
+[2023-02-26 12:10:19,750][00189] Updated weights for policy 0, policy_version 13160 (0.0007)
+[2023-02-26 12:10:20,484][00189] Updated weights for policy 0, policy_version 13170 (0.0007)
+[2023-02-26 12:10:21,155][00189] Updated weights for policy 0, policy_version 13180 (0.0006)
+[2023-02-26 12:10:21,897][00189] Updated weights for policy 0, policy_version 13190 (0.0006)
+[2023-02-26 12:10:22,634][00189] Updated weights for policy 0, policy_version 13200 (0.0007)
+[2023-02-26 12:10:23,349][00189] Updated weights for policy 0, policy_version 13210 (0.0006)
+[2023-02-26 12:10:23,656][00001] Fps is (10 sec: 58163.3, 60 sec: 57821.8, 300 sec: 57649.5). Total num frames: 54124544. Throughput: 0: 14563.2. Samples: 13517164. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:10:23,656][00001] Avg episode reward: [(0, '41.442')]
+[2023-02-26 12:10:24,067][00189] Updated weights for policy 0, policy_version 13220 (0.0007)
+[2023-02-26 12:10:24,751][00189] Updated weights for policy 0, policy_version 13230 (0.0006)
+[2023-02-26 12:10:25,512][00189] Updated weights for policy 0, policy_version 13240 (0.0007)
+[2023-02-26 12:10:26,199][00189] Updated weights for policy 0, policy_version 13250 (0.0007)
+[2023-02-26 12:10:26,944][00189] Updated weights for policy 0, policy_version 13260 (0.0007)
+[2023-02-26 12:10:27,645][00189] Updated weights for policy 0, policy_version 13270 (0.0007)
+[2023-02-26 12:10:28,357][00189] Updated weights for policy 0, policy_version 13280 (0.0006)
+[2023-02-26 12:10:28,656][00001] Fps is (10 sec: 57344.2, 60 sec: 57958.4, 300 sec: 57635.6). Total num frames: 54411264. Throughput: 0: 14591.3. Samples: 13603100. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:10:28,656][00001] Avg episode reward: [(0, '42.073')]
+[2023-02-26 12:10:29,076][00189] Updated weights for policy 0, policy_version 13290 (0.0006)
+[2023-02-26 12:10:29,780][00189] Updated weights for policy 0, policy_version 13300 (0.0006)
+[2023-02-26 12:10:30,499][00189] Updated weights for policy 0, policy_version 13310 (0.0006)
+[2023-02-26 12:10:31,222][00189] Updated weights for policy 0, policy_version 13320 (0.0007)
+[2023-02-26 12:10:31,898][00189] Updated weights for policy 0, policy_version 13330 (0.0006)
+[2023-02-26 12:10:32,605][00189] Updated weights for policy 0, policy_version 13340 (0.0007)
+[2023-02-26 12:10:33,348][00189] Updated weights for policy 0, policy_version 13350 (0.0007)
+[2023-02-26 12:10:33,656][00001] Fps is (10 sec: 57344.2, 60 sec: 58026.7, 300 sec: 57621.7). Total num frames: 54697984. Throughput: 0: 14601.6. Samples: 13646324. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:10:33,656][00001] Avg episode reward: [(0, '41.583')]
+[2023-02-26 12:10:34,017][00189] Updated weights for policy 0, policy_version 13360 (0.0006)
+[2023-02-26 12:10:34,724][00189] Updated weights for policy 0, policy_version 13370 (0.0007)
+[2023-02-26 12:10:35,459][00189] Updated weights for policy 0, policy_version 13380 (0.0006)
+[2023-02-26 12:10:36,128][00189] Updated weights for policy 0, policy_version 13390 (0.0006)
+[2023-02-26 12:10:36,846][00189] Updated weights for policy 0, policy_version 13400 (0.0006)
+[2023-02-26 12:10:37,555][00189] Updated weights for policy 0, policy_version 13410 (0.0006)
+[2023-02-26 12:10:38,250][00189] Updated weights for policy 0, policy_version 13420 (0.0006)
+[2023-02-26 12:10:38,656][00001] Fps is (10 sec: 57753.6, 60 sec: 58095.0, 300 sec: 57621.7). Total num frames: 54988800. Throughput: 0: 14546.4. Samples: 13733384. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:10:38,656][00001] Avg episode reward: [(0, '40.602')]
+[2023-02-26 12:10:38,968][00189] Updated weights for policy 0, policy_version 13430 (0.0007)
+[2023-02-26 12:10:39,652][00189] Updated weights for policy 0, policy_version 13440 (0.0007)
+[2023-02-26 12:10:40,401][00189] Updated weights for policy 0, policy_version 13450 (0.0007)
+[2023-02-26 12:10:41,094][00189] Updated weights for policy 0, policy_version 13460 (0.0006)
+[2023-02-26 12:10:41,765][00189] Updated weights for policy 0, policy_version 13470 (0.0006)
+[2023-02-26 12:10:42,520][00189] Updated weights for policy 0, policy_version 13480 (0.0006)
+[2023-02-26 12:10:43,217][00189] Updated weights for policy 0, policy_version 13490 (0.0006)
+[2023-02-26 12:10:43,656][00001] Fps is (10 sec: 58163.0, 60 sec: 58299.8, 300 sec: 57621.7). Total num frames: 55279616. Throughput: 0: 14544.6. Samples: 13820208. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:10:43,656][00001] Avg episode reward: [(0, '39.148')]
+[2023-02-26 12:10:43,905][00189] Updated weights for policy 0, policy_version 13500 (0.0007)
+[2023-02-26 12:10:44,628][00189] Updated weights for policy 0, policy_version 13510 (0.0007)
+[2023-02-26 12:10:45,352][00189] Updated weights for policy 0, policy_version 13520 (0.0006)
+[2023-02-26 12:10:46,023][00189] Updated weights for policy 0, policy_version 13530 (0.0006)
+[2023-02-26 12:10:46,775][00189] Updated weights for policy 0, policy_version 13540 (0.0006)
+[2023-02-26 12:10:47,486][00189] Updated weights for policy 0, policy_version 13550 (0.0006)
+[2023-02-26 12:10:48,170][00189] Updated weights for policy 0, policy_version 13560 (0.0007)
+[2023-02-26 12:10:48,656][00001] Fps is (10 sec: 57753.8, 60 sec: 58231.5, 300 sec: 57607.8). Total num frames: 55566336. Throughput: 0: 14526.7. Samples: 13863380. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:10:48,656][00001] Avg episode reward: [(0, '36.735')]
+[2023-02-26 12:10:48,895][00189] Updated weights for policy 0, policy_version 13570 (0.0007)
+[2023-02-26 12:10:49,616][00189] Updated weights for policy 0, policy_version 13580 (0.0006)
+[2023-02-26 12:10:50,288][00189] Updated weights for policy 0, policy_version 13590 (0.0007)
+[2023-02-26 12:10:51,036][00189] Updated weights for policy 0, policy_version 13600 (0.0006)
+[2023-02-26 12:10:51,732][00189] Updated weights for policy 0, policy_version 13610 (0.0006)
+[2023-02-26 12:10:52,418][00189] Updated weights for policy 0, policy_version 13620 (0.0007)
+[2023-02-26 12:10:53,161][00189] Updated weights for policy 0, policy_version 13630 (0.0006)
+[2023-02-26 12:10:53,656][00001] Fps is (10 sec: 57753.8, 60 sec: 58026.7, 300 sec: 57607.8). Total num frames: 55857152. Throughput: 0: 14486.0. Samples: 13950124.
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:10:53,656][00001] Avg episode reward: [(0, '41.682')] +[2023-02-26 12:10:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000013637_55857152.pth... +[2023-02-26 12:10:53,699][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000010268_42057728.pth +[2023-02-26 12:10:53,847][00189] Updated weights for policy 0, policy_version 13640 (0.0006) +[2023-02-26 12:10:54,553][00189] Updated weights for policy 0, policy_version 13650 (0.0007) +[2023-02-26 12:10:55,253][00189] Updated weights for policy 0, policy_version 13660 (0.0006) +[2023-02-26 12:10:55,972][00189] Updated weights for policy 0, policy_version 13670 (0.0006) +[2023-02-26 12:10:56,681][00189] Updated weights for policy 0, policy_version 13680 (0.0006) +[2023-02-26 12:10:57,378][00189] Updated weights for policy 0, policy_version 13690 (0.0006) +[2023-02-26 12:10:58,091][00189] Updated weights for policy 0, policy_version 13700 (0.0006) +[2023-02-26 12:10:58,656][00001] Fps is (10 sec: 57752.9, 60 sec: 58026.5, 300 sec: 57593.9). Total num frames: 56143872. Throughput: 0: 14452.6. Samples: 14037000. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:10:58,656][00001] Avg episode reward: [(0, '40.380')] +[2023-02-26 12:10:58,822][00189] Updated weights for policy 0, policy_version 13710 (0.0007) +[2023-02-26 12:10:59,492][00189] Updated weights for policy 0, policy_version 13720 (0.0006) +[2023-02-26 12:11:00,188][00189] Updated weights for policy 0, policy_version 13730 (0.0006) +[2023-02-26 12:11:00,878][00189] Updated weights for policy 0, policy_version 13740 (0.0006) +[2023-02-26 12:11:01,569][00189] Updated weights for policy 0, policy_version 13750 (0.0006) +[2023-02-26 12:11:02,260][00189] Updated weights for policy 0, policy_version 13760 (0.0006) +[2023-02-26 12:11:02,933][00189] Updated weights for policy 0, policy_version 13770 (0.0006) +[2023-02-26 12:11:03,608][00189] Updated weights for policy 0, policy_version 13780 (0.0006) +[2023-02-26 12:11:03,656][00001] Fps is (10 sec: 58572.5, 60 sec: 58094.9, 300 sec: 57621.7). Total num frames: 56442880. Throughput: 0: 14449.5. Samples: 14081236. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:11:03,656][00001] Avg episode reward: [(0, '42.890')] +[2023-02-26 12:11:03,659][00141] Saving new best policy, reward=42.890! +[2023-02-26 12:11:04,328][00189] Updated weights for policy 0, policy_version 13790 (0.0006) +[2023-02-26 12:11:04,998][00189] Updated weights for policy 0, policy_version 13800 (0.0006) +[2023-02-26 12:11:05,663][00189] Updated weights for policy 0, policy_version 13810 (0.0006) +[2023-02-26 12:11:06,362][00189] Updated weights for policy 0, policy_version 13820 (0.0006) +[2023-02-26 12:11:07,029][00189] Updated weights for policy 0, policy_version 13830 (0.0006) +[2023-02-26 12:11:07,727][00189] Updated weights for policy 0, policy_version 13840 (0.0006) +[2023-02-26 12:11:08,426][00189] Updated weights for policy 0, policy_version 13850 (0.0007) +[2023-02-26 12:11:08,656][00001] Fps is (10 sec: 59802.2, 60 sec: 58163.2, 300 sec: 57663.4). 
Total num frames: 56741888. Throughput: 0: 14526.9. Samples: 14170872. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:11:08,656][00001] Avg episode reward: [(0, '42.531')] +[2023-02-26 12:11:09,092][00189] Updated weights for policy 0, policy_version 13860 (0.0006) +[2023-02-26 12:11:09,793][00189] Updated weights for policy 0, policy_version 13870 (0.0006) +[2023-02-26 12:11:10,493][00189] Updated weights for policy 0, policy_version 13880 (0.0007) +[2023-02-26 12:11:11,184][00189] Updated weights for policy 0, policy_version 13890 (0.0006) +[2023-02-26 12:11:11,846][00189] Updated weights for policy 0, policy_version 13900 (0.0007) +[2023-02-26 12:11:12,532][00189] Updated weights for policy 0, policy_version 13910 (0.0006) +[2023-02-26 12:11:13,235][00189] Updated weights for policy 0, policy_version 13920 (0.0006) +[2023-02-26 12:11:13,656][00001] Fps is (10 sec: 59801.7, 60 sec: 58299.8, 300 sec: 57691.1). Total num frames: 57040896. Throughput: 0: 14610.0. Samples: 14260552. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:11:13,656][00001] Avg episode reward: [(0, '39.781')] +[2023-02-26 12:11:13,915][00189] Updated weights for policy 0, policy_version 13930 (0.0006) +[2023-02-26 12:11:14,554][00189] Updated weights for policy 0, policy_version 13940 (0.0007) +[2023-02-26 12:11:15,273][00189] Updated weights for policy 0, policy_version 13950 (0.0006) +[2023-02-26 12:11:15,946][00189] Updated weights for policy 0, policy_version 13960 (0.0006) +[2023-02-26 12:11:16,654][00189] Updated weights for policy 0, policy_version 13970 (0.0006) +[2023-02-26 12:11:17,351][00189] Updated weights for policy 0, policy_version 13980 (0.0006) +[2023-02-26 12:11:18,005][00189] Updated weights for policy 0, policy_version 13990 (0.0006) +[2023-02-26 12:11:18,656][00001] Fps is (10 sec: 59801.9, 60 sec: 58368.1, 300 sec: 57718.9). Total num frames: 57339904. Throughput: 0: 14645.4. Samples: 14305368. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:11:18,656][00001] Avg episode reward: [(0, '39.441')] +[2023-02-26 12:11:18,721][00189] Updated weights for policy 0, policy_version 14000 (0.0006) +[2023-02-26 12:11:19,380][00189] Updated weights for policy 0, policy_version 14010 (0.0007) +[2023-02-26 12:11:20,071][00189] Updated weights for policy 0, policy_version 14020 (0.0007) +[2023-02-26 12:11:20,780][00189] Updated weights for policy 0, policy_version 14030 (0.0006) +[2023-02-26 12:11:21,457][00189] Updated weights for policy 0, policy_version 14040 (0.0006) +[2023-02-26 12:11:22,143][00189] Updated weights for policy 0, policy_version 14050 (0.0006) +[2023-02-26 12:11:22,858][00189] Updated weights for policy 0, policy_version 14060 (0.0006) +[2023-02-26 12:11:23,505][00189] Updated weights for policy 0, policy_version 14070 (0.0006) +[2023-02-26 12:11:23,656][00001] Fps is (10 sec: 59392.0, 60 sec: 58504.5, 300 sec: 57732.8). Total num frames: 57634816. Throughput: 0: 14694.4. Samples: 14394632. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:11:23,656][00001] Avg episode reward: [(0, '41.978')] +[2023-02-26 12:11:24,217][00189] Updated weights for policy 0, policy_version 14080 (0.0006) +[2023-02-26 12:11:24,934][00189] Updated weights for policy 0, policy_version 14090 (0.0006) +[2023-02-26 12:11:25,612][00189] Updated weights for policy 0, policy_version 14100 (0.0006) +[2023-02-26 12:11:26,292][00189] Updated weights for policy 0, policy_version 14110 (0.0006) +[2023-02-26 12:11:27,019][00189] Updated weights for policy 0, policy_version 14120 (0.0006) +[2023-02-26 12:11:27,700][00189] Updated weights for policy 0, policy_version 14130 (0.0006) +[2023-02-26 12:11:28,369][00189] Updated weights for policy 0, policy_version 14140 (0.0006) +[2023-02-26 12:11:28,656][00001] Fps is (10 sec: 59392.0, 60 sec: 58709.4, 300 sec: 57760.6). Total num frames: 57933824. Throughput: 0: 14736.1. Samples: 14483332. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:11:28,656][00001] Avg episode reward: [(0, '42.284')] +[2023-02-26 12:11:29,064][00189] Updated weights for policy 0, policy_version 14150 (0.0007) +[2023-02-26 12:11:29,759][00189] Updated weights for policy 0, policy_version 14160 (0.0006) +[2023-02-26 12:11:30,488][00189] Updated weights for policy 0, policy_version 14170 (0.0006) +[2023-02-26 12:11:31,150][00189] Updated weights for policy 0, policy_version 14180 (0.0007) +[2023-02-26 12:11:31,834][00189] Updated weights for policy 0, policy_version 14190 (0.0006) +[2023-02-26 12:11:32,559][00189] Updated weights for policy 0, policy_version 14200 (0.0007) +[2023-02-26 12:11:33,212][00189] Updated weights for policy 0, policy_version 14210 (0.0006) +[2023-02-26 12:11:33,656][00001] Fps is (10 sec: 59392.4, 60 sec: 58845.9, 300 sec: 57774.4). Total num frames: 58228736. Throughput: 0: 14761.0. Samples: 14527624. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:11:33,656][00001] Avg episode reward: [(0, '45.017')] +[2023-02-26 12:11:33,659][00141] Saving new best policy, reward=45.017! +[2023-02-26 12:11:33,910][00189] Updated weights for policy 0, policy_version 14220 (0.0006) +[2023-02-26 12:11:34,639][00189] Updated weights for policy 0, policy_version 14230 (0.0007) +[2023-02-26 12:11:35,305][00189] Updated weights for policy 0, policy_version 14240 (0.0006) +[2023-02-26 12:11:36,004][00189] Updated weights for policy 0, policy_version 14250 (0.0007) +[2023-02-26 12:11:36,712][00189] Updated weights for policy 0, policy_version 14260 (0.0007) +[2023-02-26 12:11:37,362][00189] Updated weights for policy 0, policy_version 14270 (0.0006) +[2023-02-26 12:11:38,099][00189] Updated weights for policy 0, policy_version 14280 (0.0006) +[2023-02-26 12:11:38,656][00001] Fps is (10 sec: 58982.5, 60 sec: 58914.2, 300 sec: 57788.3). Total num frames: 58523648. Throughput: 0: 14802.8. Samples: 14616248. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:11:38,656][00001] Avg episode reward: [(0, '43.675')] +[2023-02-26 12:11:38,785][00189] Updated weights for policy 0, policy_version 14290 (0.0007) +[2023-02-26 12:11:39,464][00189] Updated weights for policy 0, policy_version 14300 (0.0006) +[2023-02-26 12:11:40,217][00189] Updated weights for policy 0, policy_version 14310 (0.0007) +[2023-02-26 12:11:40,944][00189] Updated weights for policy 0, policy_version 14320 (0.0007) +[2023-02-26 12:11:41,700][00189] Updated weights for policy 0, policy_version 14330 (0.0006) +[2023-02-26 12:11:42,382][00189] Updated weights for policy 0, policy_version 14340 (0.0007) +[2023-02-26 12:11:43,107][00189] Updated weights for policy 0, policy_version 14350 (0.0006) +[2023-02-26 12:11:43,656][00001] Fps is (10 sec: 57753.5, 60 sec: 58777.6, 300 sec: 57760.6). Total num frames: 58806272. Throughput: 0: 14779.3. Samples: 14702068. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:11:43,656][00001] Avg episode reward: [(0, '42.797')] +[2023-02-26 12:11:43,862][00189] Updated weights for policy 0, policy_version 14360 (0.0007) +[2023-02-26 12:11:44,540][00189] Updated weights for policy 0, policy_version 14370 (0.0006) +[2023-02-26 12:11:45,235][00189] Updated weights for policy 0, policy_version 14380 (0.0006) +[2023-02-26 12:11:45,964][00189] Updated weights for policy 0, policy_version 14390 (0.0006) +[2023-02-26 12:11:46,652][00189] Updated weights for policy 0, policy_version 14400 (0.0007) +[2023-02-26 12:11:47,326][00189] Updated weights for policy 0, policy_version 14410 (0.0006) +[2023-02-26 12:11:48,019][00189] Updated weights for policy 0, policy_version 14420 (0.0007) +[2023-02-26 12:11:48,656][00001] Fps is (10 sec: 57753.4, 60 sec: 58914.1, 300 sec: 57774.4). Total num frames: 59101184. Throughput: 0: 14768.6. Samples: 14745820. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) +[2023-02-26 12:11:48,656][00001] Avg episode reward: [(0, '40.587')] +[2023-02-26 12:11:48,741][00189] Updated weights for policy 0, policy_version 14430 (0.0006) +[2023-02-26 12:11:49,425][00189] Updated weights for policy 0, policy_version 14440 (0.0006) +[2023-02-26 12:11:50,088][00189] Updated weights for policy 0, policy_version 14450 (0.0006) +[2023-02-26 12:11:50,834][00189] Updated weights for policy 0, policy_version 14460 (0.0007) +[2023-02-26 12:11:51,543][00189] Updated weights for policy 0, policy_version 14470 (0.0006) +[2023-02-26 12:11:52,200][00189] Updated weights for policy 0, policy_version 14480 (0.0007) +[2023-02-26 12:11:52,896][00189] Updated weights for policy 0, policy_version 14490 (0.0006) +[2023-02-26 12:11:53,631][00189] Updated weights for policy 0, policy_version 14500 (0.0007) +[2023-02-26 12:11:53,656][00001] Fps is (10 sec: 58572.6, 60 sec: 58914.1, 300 sec: 57774.4). Total num frames: 59392000. Throughput: 0: 14733.2. Samples: 14833868. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:11:53,656][00001] Avg episode reward: [(0, '39.882')] +[2023-02-26 12:11:54,323][00189] Updated weights for policy 0, policy_version 14510 (0.0006) +[2023-02-26 12:11:54,986][00189] Updated weights for policy 0, policy_version 14520 (0.0007) +[2023-02-26 12:11:55,720][00189] Updated weights for policy 0, policy_version 14530 (0.0007) +[2023-02-26 12:11:56,389][00189] Updated weights for policy 0, policy_version 14540 (0.0007) +[2023-02-26 12:11:57,099][00189] Updated weights for policy 0, policy_version 14550 (0.0007) +[2023-02-26 12:11:57,594][00141] Signal inference workers to stop experience collection... (450 times) +[2023-02-26 12:11:57,599][00141] Signal inference workers to resume experience collection... 
(450 times) +[2023-02-26 12:11:57,599][00189] InferenceWorker_p0-w0: stopping experience collection (450 times) +[2023-02-26 12:11:57,604][00189] InferenceWorker_p0-w0: resuming experience collection (450 times) +[2023-02-26 12:11:57,803][00189] Updated weights for policy 0, policy_version 14560 (0.0006) +[2023-02-26 12:11:58,455][00189] Updated weights for policy 0, policy_version 14570 (0.0006) +[2023-02-26 12:11:58,656][00001] Fps is (10 sec: 58572.4, 60 sec: 59050.7, 300 sec: 57788.3). Total num frames: 59686912. Throughput: 0: 14707.9. Samples: 14922408. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:11:58,656][00001] Avg episode reward: [(0, '40.343')] +[2023-02-26 12:11:59,211][00189] Updated weights for policy 0, policy_version 14580 (0.0006) +[2023-02-26 12:11:59,895][00189] Updated weights for policy 0, policy_version 14590 (0.0006) +[2023-02-26 12:12:00,561][00189] Updated weights for policy 0, policy_version 14600 (0.0006) +[2023-02-26 12:12:01,274][00189] Updated weights for policy 0, policy_version 14610 (0.0006) +[2023-02-26 12:12:01,991][00189] Updated weights for policy 0, policy_version 14620 (0.0006) +[2023-02-26 12:12:02,675][00189] Updated weights for policy 0, policy_version 14630 (0.0007) +[2023-02-26 12:12:03,349][00189] Updated weights for policy 0, policy_version 14640 (0.0006) +[2023-02-26 12:12:03,656][00001] Fps is (10 sec: 58572.3, 60 sec: 58914.1, 300 sec: 57788.3). Total num frames: 59977728. Throughput: 0: 14685.9. Samples: 14966236. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:12:03,656][00001] Avg episode reward: [(0, '40.808')] +[2023-02-26 12:12:04,075][00189] Updated weights for policy 0, policy_version 14650 (0.0006) +[2023-02-26 12:12:04,766][00189] Updated weights for policy 0, policy_version 14660 (0.0007) +[2023-02-26 12:12:05,444][00189] Updated weights for policy 0, policy_version 14670 (0.0006) +[2023-02-26 12:12:06,175][00189] Updated weights for policy 0, policy_version 14680 (0.0006) +[2023-02-26 12:12:06,864][00189] Updated weights for policy 0, policy_version 14690 (0.0007) +[2023-02-26 12:12:07,598][00189] Updated weights for policy 0, policy_version 14700 (0.0006) +[2023-02-26 12:12:08,302][00189] Updated weights for policy 0, policy_version 14710 (0.0006) +[2023-02-26 12:12:08,656][00001] Fps is (10 sec: 58573.1, 60 sec: 58845.9, 300 sec: 57788.3). Total num frames: 60272640. Throughput: 0: 14647.2. Samples: 15053756. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:12:08,656][00001] Avg episode reward: [(0, '41.458')] +[2023-02-26 12:12:08,989][00189] Updated weights for policy 0, policy_version 14720 (0.0006) +[2023-02-26 12:12:09,697][00189] Updated weights for policy 0, policy_version 14730 (0.0006) +[2023-02-26 12:12:10,419][00189] Updated weights for policy 0, policy_version 14740 (0.0006) +[2023-02-26 12:12:11,108][00189] Updated weights for policy 0, policy_version 14750 (0.0006) +[2023-02-26 12:12:11,808][00189] Updated weights for policy 0, policy_version 14760 (0.0007) +[2023-02-26 12:12:12,502][00189] Updated weights for policy 0, policy_version 14770 (0.0007) +[2023-02-26 12:12:13,233][00189] Updated weights for policy 0, policy_version 14780 (0.0006) +[2023-02-26 12:12:13,656][00001] Fps is (10 sec: 58163.4, 60 sec: 58641.0, 300 sec: 57788.3). Total num frames: 60559360. Throughput: 0: 14620.1. Samples: 15141240. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 12:12:13,656][00001] Avg episode reward: [(0, '44.961')] +[2023-02-26 12:12:13,924][00189] Updated weights for policy 0, policy_version 14790 (0.0006) +[2023-02-26 12:12:14,587][00189] Updated weights for policy 0, policy_version 14800 (0.0006) +[2023-02-26 12:12:15,313][00189] Updated weights for policy 0, policy_version 14810 (0.0006) +[2023-02-26 12:12:16,014][00189] Updated weights for policy 0, policy_version 14820 (0.0006) +[2023-02-26 12:12:16,672][00189] Updated weights for policy 0, policy_version 14830 (0.0007) +[2023-02-26 12:12:17,424][00189] Updated weights for policy 0, policy_version 14840 (0.0006) +[2023-02-26 12:12:18,118][00189] Updated weights for policy 0, policy_version 14850 (0.0006) +[2023-02-26 12:12:18,656][00001] Fps is (10 sec: 58163.2, 60 sec: 58572.8, 300 sec: 57816.1). Total num frames: 60854272. Throughput: 0: 14611.8. Samples: 15185156. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) +[2023-02-26 12:12:18,656][00001] Avg episode reward: [(0, '42.974')] +[2023-02-26 12:12:18,774][00189] Updated weights for policy 0, policy_version 14860 (0.0006) +[2023-02-26 12:12:19,527][00189] Updated weights for policy 0, policy_version 14870 (0.0006) +[2023-02-26 12:12:20,199][00189] Updated weights for policy 0, policy_version 14880 (0.0006) +[2023-02-26 12:12:20,867][00189] Updated weights for policy 0, policy_version 14890 (0.0006) +[2023-02-26 12:12:21,634][00189] Updated weights for policy 0, policy_version 14900 (0.0006) +[2023-02-26 12:12:22,298][00189] Updated weights for policy 0, policy_version 14910 (0.0007) +[2023-02-26 12:12:22,969][00189] Updated weights for policy 0, policy_version 14920 (0.0007) +[2023-02-26 12:12:23,656][00001] Fps is (10 sec: 58982.4, 60 sec: 58572.8, 300 sec: 57830.0). Total num frames: 61149184. Throughput: 0: 14598.4. Samples: 15273180. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 3.0) +[2023-02-26 12:12:23,656][00001] Avg episode reward: [(0, '42.728')] +[2023-02-26 12:12:23,731][00189] Updated weights for policy 0, policy_version 14930 (0.0007) +[2023-02-26 12:12:24,399][00189] Updated weights for policy 0, policy_version 14940 (0.0006) +[2023-02-26 12:12:25,086][00189] Updated weights for policy 0, policy_version 14950 (0.0006) +[2023-02-26 12:12:25,845][00189] Updated weights for policy 0, policy_version 14960 (0.0006) +[2023-02-26 12:12:26,514][00189] Updated weights for policy 0, policy_version 14970 (0.0006) +[2023-02-26 12:12:27,159][00189] Updated weights for policy 0, policy_version 14980 (0.0006) +[2023-02-26 12:12:27,929][00189] Updated weights for policy 0, policy_version 14990 (0.0006) +[2023-02-26 12:12:28,607][00189] Updated weights for policy 0, policy_version 15000 (0.0007) +[2023-02-26 12:12:28,656][00001] Fps is (10 sec: 58982.2, 60 sec: 58504.5, 300 sec: 57885.6). Total num frames: 61444096. Throughput: 0: 14640.8. Samples: 15360904. Policy #0 lag: (min: 0.0, avg: 1.9, max: 3.0) +[2023-02-26 12:12:28,656][00001] Avg episode reward: [(0, '41.081')] +[2023-02-26 12:12:29,265][00189] Updated weights for policy 0, policy_version 15010 (0.0007) +[2023-02-26 12:12:29,991][00189] Updated weights for policy 0, policy_version 15020 (0.0007) +[2023-02-26 12:12:30,689][00189] Updated weights for policy 0, policy_version 15030 (0.0007) +[2023-02-26 12:12:31,346][00189] Updated weights for policy 0, policy_version 15040 (0.0006) +[2023-02-26 12:12:32,080][00189] Updated weights for policy 0, policy_version 15050 (0.0006) +[2023-02-26 12:12:32,781][00189] Updated weights for policy 0, policy_version 15060 (0.0006) +[2023-02-26 12:12:33,448][00189] Updated weights for policy 0, policy_version 15070 (0.0006) +[2023-02-26 12:12:33,656][00001] Fps is (10 sec: 58573.1, 60 sec: 58436.2, 300 sec: 57885.5). Total num frames: 61734912. Throughput: 0: 14649.9. Samples: 15405068. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 3.0) +[2023-02-26 12:12:33,656][00001] Avg episode reward: [(0, '45.441')] +[2023-02-26 12:12:33,659][00141] Saving new best policy, reward=45.441! +[2023-02-26 12:12:34,159][00189] Updated weights for policy 0, policy_version 15080 (0.0007) +[2023-02-26 12:12:34,844][00189] Updated weights for policy 0, policy_version 15090 (0.0006) +[2023-02-26 12:12:35,566][00189] Updated weights for policy 0, policy_version 15100 (0.0007) +[2023-02-26 12:12:36,254][00189] Updated weights for policy 0, policy_version 15110 (0.0007) +[2023-02-26 12:12:36,952][00189] Updated weights for policy 0, policy_version 15120 (0.0007) +[2023-02-26 12:12:37,650][00189] Updated weights for policy 0, policy_version 15130 (0.0006) +[2023-02-26 12:12:38,352][00189] Updated weights for policy 0, policy_version 15140 (0.0006) +[2023-02-26 12:12:38,656][00001] Fps is (10 sec: 58572.9, 60 sec: 58436.2, 300 sec: 57927.2). Total num frames: 62029824. Throughput: 0: 14649.7. Samples: 15493104. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:12:38,656][00001] Avg episode reward: [(0, '42.558')] +[2023-02-26 12:12:39,066][00189] Updated weights for policy 0, policy_version 15150 (0.0006) +[2023-02-26 12:12:39,743][00189] Updated weights for policy 0, policy_version 15160 (0.0006) +[2023-02-26 12:12:40,435][00189] Updated weights for policy 0, policy_version 15170 (0.0007) +[2023-02-26 12:12:41,144][00189] Updated weights for policy 0, policy_version 15180 (0.0006) +[2023-02-26 12:12:41,823][00189] Updated weights for policy 0, policy_version 15190 (0.0007) +[2023-02-26 12:12:42,527][00189] Updated weights for policy 0, policy_version 15200 (0.0007) +[2023-02-26 12:12:43,219][00189] Updated weights for policy 0, policy_version 15210 (0.0006) +[2023-02-26 12:12:43,656][00001] Fps is (10 sec: 58982.3, 60 sec: 58641.0, 300 sec: 57954.9). Total num frames: 62324736. Throughput: 0: 14644.9. Samples: 15581428. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:12:43,656][00001] Avg episode reward: [(0, '39.765')] +[2023-02-26 12:12:43,884][00189] Updated weights for policy 0, policy_version 15220 (0.0007) +[2023-02-26 12:12:44,624][00189] Updated weights for policy 0, policy_version 15230 (0.0006) +[2023-02-26 12:12:45,312][00189] Updated weights for policy 0, policy_version 15240 (0.0007) +[2023-02-26 12:12:45,983][00189] Updated weights for policy 0, policy_version 15250 (0.0007) +[2023-02-26 12:12:46,726][00189] Updated weights for policy 0, policy_version 15260 (0.0006) +[2023-02-26 12:12:47,387][00189] Updated weights for policy 0, policy_version 15270 (0.0007) +[2023-02-26 12:12:48,136][00189] Updated weights for policy 0, policy_version 15280 (0.0006) +[2023-02-26 12:12:48,656][00001] Fps is (10 sec: 58982.6, 60 sec: 58641.1, 300 sec: 57982.7). Total num frames: 62619648. Throughput: 0: 14648.5. Samples: 15625416. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 12:12:48,656][00001] Avg episode reward: [(0, '40.771')] +[2023-02-26 12:12:48,840][00189] Updated weights for policy 0, policy_version 15290 (0.0006) +[2023-02-26 12:12:49,509][00189] Updated weights for policy 0, policy_version 15300 (0.0007) +[2023-02-26 12:12:50,254][00189] Updated weights for policy 0, policy_version 15310 (0.0007) +[2023-02-26 12:12:50,969][00189] Updated weights for policy 0, policy_version 15320 (0.0007) +[2023-02-26 12:12:51,662][00189] Updated weights for policy 0, policy_version 15330 (0.0007) +[2023-02-26 12:12:52,418][00189] Updated weights for policy 0, policy_version 15340 (0.0006) +[2023-02-26 12:12:53,130][00189] Updated weights for policy 0, policy_version 15350 (0.0007) +[2023-02-26 12:12:53,656][00001] Fps is (10 sec: 57753.1, 60 sec: 58504.4, 300 sec: 57968.8). Total num frames: 62902272. Throughput: 0: 14628.0. Samples: 15712020. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-26 12:12:53,656][00001] Avg episode reward: [(0, '41.659')] +[2023-02-26 12:12:53,660][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000015357_62902272.pth... +[2023-02-26 12:12:53,700][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000011951_48951296.pth +[2023-02-26 12:12:53,803][00189] Updated weights for policy 0, policy_version 15360 (0.0007) +[2023-02-26 12:12:54,557][00189] Updated weights for policy 0, policy_version 15370 (0.0007) +[2023-02-26 12:12:55,230][00189] Updated weights for policy 0, policy_version 15380 (0.0006) +[2023-02-26 12:12:55,954][00189] Updated weights for policy 0, policy_version 15390 (0.0007) +[2023-02-26 12:12:56,647][00189] Updated weights for policy 0, policy_version 15400 (0.0007) +[2023-02-26 12:12:57,379][00189] Updated weights for policy 0, policy_version 15410 (0.0007) +[2023-02-26 12:12:58,109][00189] Updated weights for policy 0, policy_version 15420 (0.0007) +[2023-02-26 12:12:58,656][00001] Fps is (10 sec: 57343.7, 60 sec: 58436.3, 300 sec: 57982.7). Total num frames: 63193088. Throughput: 0: 14602.0. Samples: 15798328. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:12:58,656][00001] Avg episode reward: [(0, '43.462')]
+[2023-02-26 12:12:58,779][00189] Updated weights for policy 0, policy_version 15430 (0.0006)
+[2023-02-26 12:12:59,517][00189] Updated weights for policy 0, policy_version 15440 (0.0007)
+[2023-02-26 12:13:00,225][00189] Updated weights for policy 0, policy_version 15450 (0.0006)
+[2023-02-26 12:13:00,921][00189] Updated weights for policy 0, policy_version 15460 (0.0006)
+[2023-02-26 12:13:01,652][00189] Updated weights for policy 0, policy_version 15470 (0.0007)
+[2023-02-26 12:13:02,355][00189] Updated weights for policy 0, policy_version 15480 (0.0006)
+[2023-02-26 12:13:03,048][00189] Updated weights for policy 0, policy_version 15490 (0.0006)
+[2023-02-26 12:13:03,656][00001] Fps is (10 sec: 57753.5, 60 sec: 58368.0, 300 sec: 57968.8). Total num frames: 63479808. Throughput: 0: 14584.5. Samples: 15841460. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:13:03,656][00001] Avg episode reward: [(0, '41.309')]
+[2023-02-26 12:13:03,727][00189] Updated weights for policy 0, policy_version 15500 (0.0006)
+[2023-02-26 12:13:04,452][00189] Updated weights for policy 0, policy_version 15510 (0.0006)
+[2023-02-26 12:13:05,172][00189] Updated weights for policy 0, policy_version 15520 (0.0006)
+[2023-02-26 12:13:05,846][00189] Updated weights for policy 0, policy_version 15530 (0.0006)
+[2023-02-26 12:13:06,560][00189] Updated weights for policy 0, policy_version 15540 (0.0007)
+[2023-02-26 12:13:07,316][00189] Updated weights for policy 0, policy_version 15550 (0.0006)
+[2023-02-26 12:13:07,996][00189] Updated weights for policy 0, policy_version 15560 (0.0006)
+[2023-02-26 12:13:08,656][00001] Fps is (10 sec: 57753.0, 60 sec: 58299.6, 300 sec: 57982.7). Total num frames: 63770624. Throughput: 0: 14560.7. Samples: 15928412. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:13:08,656][00001] Avg episode reward: [(0, '41.417')]
+[2023-02-26 12:13:08,693][00189] Updated weights for policy 0, policy_version 15570 (0.0006)
+[2023-02-26 12:13:09,402][00189] Updated weights for policy 0, policy_version 15580 (0.0006)
+[2023-02-26 12:13:10,063][00189] Updated weights for policy 0, policy_version 15590 (0.0006)
+[2023-02-26 12:13:10,770][00189] Updated weights for policy 0, policy_version 15600 (0.0007)
+[2023-02-26 12:13:11,458][00189] Updated weights for policy 0, policy_version 15610 (0.0006)
+[2023-02-26 12:13:12,103][00189] Updated weights for policy 0, policy_version 15620 (0.0006)
+[2023-02-26 12:13:12,843][00189] Updated weights for policy 0, policy_version 15630 (0.0006)
+[2023-02-26 12:13:13,493][00189] Updated weights for policy 0, policy_version 15640 (0.0007)
+[2023-02-26 12:13:13,656][00001] Fps is (10 sec: 58983.2, 60 sec: 58504.6, 300 sec: 58010.5). Total num frames: 64069632. Throughput: 0: 14596.9. Samples: 16017764. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:13:13,656][00001] Avg episode reward: [(0, '45.620')]
+[2023-02-26 12:13:13,659][00141] Saving new best policy, reward=45.620!
+[2023-02-26 12:13:14,172][00189] Updated weights for policy 0, policy_version 15650 (0.0006)
+[2023-02-26 12:13:14,898][00189] Updated weights for policy 0, policy_version 15660 (0.0006)
+[2023-02-26 12:13:15,543][00189] Updated weights for policy 0, policy_version 15670 (0.0006)
+[2023-02-26 12:13:16,241][00189] Updated weights for policy 0, policy_version 15680 (0.0006)
+[2023-02-26 12:13:16,921][00189] Updated weights for policy 0, policy_version 15690 (0.0006)
+[2023-02-26 12:13:17,584][00189] Updated weights for policy 0, policy_version 15700 (0.0006)
+[2023-02-26 12:13:18,283][00189] Updated weights for policy 0, policy_version 15710 (0.0006)
+[2023-02-26 12:13:18,656][00001] Fps is (10 sec: 59802.3, 60 sec: 58572.8, 300 sec: 58038.3). Total num frames: 64368640. Throughput: 0: 14612.5. Samples: 16062632. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0)
+[2023-02-26 12:13:18,656][00001] Avg episode reward: [(0, '43.791')]
+[2023-02-26 12:13:18,973][00189] Updated weights for policy 0, policy_version 15720 (0.0006)
+[2023-02-26 12:13:19,642][00189] Updated weights for policy 0, policy_version 15730 (0.0006)
+[2023-02-26 12:13:20,323][00189] Updated weights for policy 0, policy_version 15740 (0.0006)
+[2023-02-26 12:13:21,014][00189] Updated weights for policy 0, policy_version 15750 (0.0006)
+[2023-02-26 12:13:21,699][00189] Updated weights for policy 0, policy_version 15760 (0.0006)
+[2023-02-26 12:13:22,378][00189] Updated weights for policy 0, policy_version 15770 (0.0007)
+[2023-02-26 12:13:23,057][00189] Updated weights for policy 0, policy_version 15780 (0.0006)
+[2023-02-26 12:13:23,656][00001] Fps is (10 sec: 59801.7, 60 sec: 58641.1, 300 sec: 58107.7). Total num frames: 64667648. Throughput: 0: 14654.6. Samples: 16152560. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0)
+[2023-02-26 12:13:23,656][00001] Avg episode reward: [(0, '42.207')]
+[2023-02-26 12:13:23,727][00189] Updated weights for policy 0, policy_version 15790 (0.0006)
+[2023-02-26 12:13:24,423][00189] Updated weights for policy 0, policy_version 15800 (0.0006)
+[2023-02-26 12:13:25,105][00189] Updated weights for policy 0, policy_version 15810 (0.0006)
+[2023-02-26 12:13:25,789][00189] Updated weights for policy 0, policy_version 15820 (0.0006)
+[2023-02-26 12:13:26,495][00189] Updated weights for policy 0, policy_version 15830 (0.0007)
+[2023-02-26 12:13:27,148][00189] Updated weights for policy 0, policy_version 15840 (0.0006)
+[2023-02-26 12:13:27,848][00189] Updated weights for policy 0, policy_version 15850 (0.0007)
+[2023-02-26 12:13:28,554][00189] Updated weights for policy 0, policy_version 15860 (0.0007)
+[2023-02-26 12:13:28,656][00001] Fps is (10 sec: 59801.6, 60 sec: 58709.4, 300 sec: 58149.3). Total num frames: 64966656. Throughput: 0: 14687.4. Samples: 16242360. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0)
+[2023-02-26 12:13:28,656][00001] Avg episode reward: [(0, '41.511')]
+[2023-02-26 12:13:29,210][00189] Updated weights for policy 0, policy_version 15870 (0.0006)
+[2023-02-26 12:13:29,901][00189] Updated weights for policy 0, policy_version 15880 (0.0006)
+[2023-02-26 12:13:30,585][00189] Updated weights for policy 0, policy_version 15890 (0.0007)
+[2023-02-26 12:13:31,279][00189] Updated weights for policy 0, policy_version 15900 (0.0007)
+[2023-02-26 12:13:31,944][00189] Updated weights for policy 0, policy_version 15910 (0.0006)
+[2023-02-26 12:13:32,632][00189] Updated weights for policy 0, policy_version 15920 (0.0006)
+[2023-02-26 12:13:33,327][00189] Updated weights for policy 0, policy_version 15930 (0.0006)
+[2023-02-26 12:13:33,656][00001] Fps is (10 sec: 60211.3, 60 sec: 58914.2, 300 sec: 58218.7). Total num frames: 65269760. Throughput: 0: 14709.7. Samples: 16287352. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:13:33,656][00001] Avg episode reward: [(0, '44.893')]
+[2023-02-26 12:13:33,985][00189] Updated weights for policy 0, policy_version 15940 (0.0006)
+[2023-02-26 12:13:34,656][00189] Updated weights for policy 0, policy_version 15950 (0.0006)
+[2023-02-26 12:13:35,366][00189] Updated weights for policy 0, policy_version 15960 (0.0006)
+[2023-02-26 12:13:36,039][00189] Updated weights for policy 0, policy_version 15970 (0.0006)
+[2023-02-26 12:13:36,729][00189] Updated weights for policy 0, policy_version 15980 (0.0006)
+[2023-02-26 12:13:37,434][00189] Updated weights for policy 0, policy_version 15990 (0.0006)
+[2023-02-26 12:13:38,093][00189] Updated weights for policy 0, policy_version 16000 (0.0006)
+[2023-02-26 12:13:38,656][00001] Fps is (10 sec: 60211.3, 60 sec: 58982.4, 300 sec: 58274.3). Total num frames: 65568768. Throughput: 0: 14784.4. Samples: 16377316. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:13:38,656][00001] Avg episode reward: [(0, '43.612')]
+[2023-02-26 12:13:38,770][00189] Updated weights for policy 0, policy_version 16010 (0.0006)
+[2023-02-26 12:13:39,470][00189] Updated weights for policy 0, policy_version 16020 (0.0007)
+[2023-02-26 12:13:40,163][00189] Updated weights for policy 0, policy_version 16030 (0.0006)
+[2023-02-26 12:13:40,825][00189] Updated weights for policy 0, policy_version 16040 (0.0006)
+[2023-02-26 12:13:41,528][00189] Updated weights for policy 0, policy_version 16050 (0.0006)
+[2023-02-26 12:13:42,200][00189] Updated weights for policy 0, policy_version 16060 (0.0006)
+[2023-02-26 12:13:42,953][00189] Updated weights for policy 0, policy_version 16070 (0.0006)
+[2023-02-26 12:13:43,631][00189] Updated weights for policy 0, policy_version 16080 (0.0007)
+[2023-02-26 12:13:43,656][00001] Fps is (10 sec: 59391.8, 60 sec: 58982.4, 300 sec: 58302.0). Total num frames: 65863680. Throughput: 0: 14841.2. Samples: 16466180. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:13:43,656][00001] Avg episode reward: [(0, '46.689')]
+[2023-02-26 12:13:43,659][00141] Saving new best policy, reward=46.689!
+[2023-02-26 12:13:44,332][00189] Updated weights for policy 0, policy_version 16090 (0.0006)
+[2023-02-26 12:13:45,038][00189] Updated weights for policy 0, policy_version 16100 (0.0006)
+[2023-02-26 12:13:45,711][00189] Updated weights for policy 0, policy_version 16110 (0.0006)
+[2023-02-26 12:13:46,411][00189] Updated weights for policy 0, policy_version 16120 (0.0006)
+[2023-02-26 12:13:47,105][00189] Updated weights for policy 0, policy_version 16130 (0.0006)
+[2023-02-26 12:13:47,774][00189] Updated weights for policy 0, policy_version 16140 (0.0006)
+[2023-02-26 12:13:48,493][00189] Updated weights for policy 0, policy_version 16150 (0.0007)
+[2023-02-26 12:13:48,656][00001] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 58343.7). Total num frames: 66158592. Throughput: 0: 14865.5. Samples: 16510404. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:13:48,656][00001] Avg episode reward: [(0, '44.586')]
+[2023-02-26 12:13:49,138][00189] Updated weights for policy 0, policy_version 16160 (0.0006)
+[2023-02-26 12:13:49,840][00189] Updated weights for policy 0, policy_version 16170 (0.0006)
+[2023-02-26 12:13:50,540][00189] Updated weights for policy 0, policy_version 16180 (0.0006)
+[2023-02-26 12:13:51,216][00189] Updated weights for policy 0, policy_version 16190 (0.0006)
+[2023-02-26 12:13:51,899][00189] Updated weights for policy 0, policy_version 16200 (0.0007)
+[2023-02-26 12:13:52,563][00189] Updated weights for policy 0, policy_version 16210 (0.0006)
+[2023-02-26 12:13:53,254][00189] Updated weights for policy 0, policy_version 16220 (0.0006)
+[2023-02-26 12:13:53,656][00001] Fps is (10 sec: 59392.2, 60 sec: 59255.6, 300 sec: 58399.2). Total num frames: 66457600. Throughput: 0: 14926.3. Samples: 16600092. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:13:53,656][00001] Avg episode reward: [(0, '45.020')]
+[2023-02-26 12:13:53,939][00189] Updated weights for policy 0, policy_version 16230 (0.0006)
+[2023-02-26 12:13:54,609][00189] Updated weights for policy 0, policy_version 16240 (0.0007)
+[2023-02-26 12:13:55,304][00189] Updated weights for policy 0, policy_version 16250 (0.0006)
+[2023-02-26 12:13:56,000][00189] Updated weights for policy 0, policy_version 16260 (0.0007)
+[2023-02-26 12:13:56,687][00189] Updated weights for policy 0, policy_version 16270 (0.0006)
+[2023-02-26 12:13:57,367][00189] Updated weights for policy 0, policy_version 16280 (0.0006)
+[2023-02-26 12:13:58,036][00189] Updated weights for policy 0, policy_version 16290 (0.0006)
+[2023-02-26 12:13:58,656][00001] Fps is (10 sec: 60211.2, 60 sec: 59460.3, 300 sec: 58440.9). Total num frames: 66760704. Throughput: 0: 14941.5. Samples: 16690132. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:13:58,656][00001] Avg episode reward: [(0, '41.846')]
+[2023-02-26 12:13:58,698][00189] Updated weights for policy 0, policy_version 16300 (0.0006)
+[2023-02-26 12:13:59,402][00189] Updated weights for policy 0, policy_version 16310 (0.0006)
+[2023-02-26 12:14:00,107][00189] Updated weights for policy 0, policy_version 16320 (0.0006)
+[2023-02-26 12:14:00,748][00189] Updated weights for policy 0, policy_version 16330 (0.0006)
+[2023-02-26 12:14:01,441][00189] Updated weights for policy 0, policy_version 16340 (0.0006)
+[2023-02-26 12:14:02,128][00189] Updated weights for policy 0, policy_version 16350 (0.0006)
+[2023-02-26 12:14:02,810][00189] Updated weights for policy 0, policy_version 16360 (0.0006)
+[2023-02-26 12:14:03,473][00189] Updated weights for policy 0, policy_version 16370 (0.0006)
+[2023-02-26 12:14:03,656][00001] Fps is (10 sec: 60211.0, 60 sec: 59665.2, 300 sec: 58496.4). Total num frames: 67059712. Throughput: 0: 14949.7. Samples: 16735368. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:03,656][00001] Avg episode reward: [(0, '39.493')]
+[2023-02-26 12:14:04,170][00189] Updated weights for policy 0, policy_version 16380 (0.0006)
+[2023-02-26 12:14:04,852][00189] Updated weights for policy 0, policy_version 16390 (0.0006)
+[2023-02-26 12:14:05,506][00189] Updated weights for policy 0, policy_version 16400 (0.0006)
+[2023-02-26 12:14:06,192][00189] Updated weights for policy 0, policy_version 16410 (0.0006)
+[2023-02-26 12:14:06,882][00189] Updated weights for policy 0, policy_version 16420 (0.0006)
+[2023-02-26 12:14:07,539][00189] Updated weights for policy 0, policy_version 16430 (0.0006)
+[2023-02-26 12:14:08,243][00189] Updated weights for policy 0, policy_version 16440 (0.0006)
+[2023-02-26 12:14:08,656][00001] Fps is (10 sec: 60211.2, 60 sec: 59870.0, 300 sec: 58579.8). Total num frames: 67362816. Throughput: 0: 14959.1. Samples: 16825720. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:08,656][00001] Avg episode reward: [(0, '42.658')]
+[2023-02-26 12:14:08,922][00189] Updated weights for policy 0, policy_version 16450 (0.0006)
+[2023-02-26 12:14:09,616][00189] Updated weights for policy 0, policy_version 16460 (0.0007)
+[2023-02-26 12:14:10,280][00189] Updated weights for policy 0, policy_version 16470 (0.0006)
+[2023-02-26 12:14:10,957][00189] Updated weights for policy 0, policy_version 16480 (0.0006)
+[2023-02-26 12:14:11,648][00189] Updated weights for policy 0, policy_version 16490 (0.0006)
+[2023-02-26 12:14:12,362][00189] Updated weights for policy 0, policy_version 16500 (0.0006)
+[2023-02-26 12:14:13,016][00189] Updated weights for policy 0, policy_version 16510 (0.0007)
+[2023-02-26 12:14:13,656][00001] Fps is (10 sec: 60211.5, 60 sec: 59869.9, 300 sec: 58621.4). Total num frames: 67661824. Throughput: 0: 14964.6. Samples: 16915768. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:13,656][00001] Avg episode reward: [(0, '43.196')]
+[2023-02-26 12:14:13,697][00189] Updated weights for policy 0, policy_version 16520 (0.0006)
+[2023-02-26 12:14:14,401][00189] Updated weights for policy 0, policy_version 16530 (0.0006)
+[2023-02-26 12:14:15,070][00189] Updated weights for policy 0, policy_version 16540 (0.0006)
+[2023-02-26 12:14:15,734][00189] Updated weights for policy 0, policy_version 16550 (0.0006)
+[2023-02-26 12:14:16,447][00189] Updated weights for policy 0, policy_version 16560 (0.0006)
+[2023-02-26 12:14:17,154][00189] Updated weights for policy 0, policy_version 16570 (0.0006)
+[2023-02-26 12:14:17,780][00189] Updated weights for policy 0, policy_version 16580 (0.0006)
+[2023-02-26 12:14:17,966][00141] Signal inference workers to stop experience collection... (500 times)
+[2023-02-26 12:14:17,967][00141] Signal inference workers to resume experience collection... (500 times)
+[2023-02-26 12:14:17,971][00189] InferenceWorker_p0-w0: stopping experience collection (500 times)
+[2023-02-26 12:14:17,975][00189] InferenceWorker_p0-w0: resuming experience collection (500 times)
+[2023-02-26 12:14:18,486][00189] Updated weights for policy 0, policy_version 16590 (0.0006)
+[2023-02-26 12:14:18,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59869.8, 300 sec: 58663.0). Total num frames: 67960832. Throughput: 0: 14961.9. Samples: 16960640. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:14:18,656][00001] Avg episode reward: [(0, '43.388')]
+[2023-02-26 12:14:19,181][00189] Updated weights for policy 0, policy_version 16600 (0.0006)
+[2023-02-26 12:14:19,844][00189] Updated weights for policy 0, policy_version 16610 (0.0006)
+[2023-02-26 12:14:20,506][00189] Updated weights for policy 0, policy_version 16620 (0.0006)
+[2023-02-26 12:14:21,229][00189] Updated weights for policy 0, policy_version 16630 (0.0006)
+[2023-02-26 12:14:21,884][00189] Updated weights for policy 0, policy_version 16640 (0.0006)
+[2023-02-26 12:14:22,564][00189] Updated weights for policy 0, policy_version 16650 (0.0006)
+[2023-02-26 12:14:23,268][00189] Updated weights for policy 0, policy_version 16660 (0.0006)
+[2023-02-26 12:14:23,656][00001] Fps is (10 sec: 59801.6, 60 sec: 59869.9, 300 sec: 58732.5). Total num frames: 68259840. Throughput: 0: 14963.8. Samples: 17050688. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0)
+[2023-02-26 12:14:23,656][00001] Avg episode reward: [(0, '45.110')]
+[2023-02-26 12:14:23,945][00189] Updated weights for policy 0, policy_version 16670 (0.0006)
+[2023-02-26 12:14:24,637][00189] Updated weights for policy 0, policy_version 16680 (0.0006)
+[2023-02-26 12:14:25,300][00189] Updated weights for policy 0, policy_version 16690 (0.0006)
+[2023-02-26 12:14:26,004][00189] Updated weights for policy 0, policy_version 16700 (0.0006)
+[2023-02-26 12:14:26,697][00189] Updated weights for policy 0, policy_version 16710 (0.0006)
+[2023-02-26 12:14:27,353][00189] Updated weights for policy 0, policy_version 16720 (0.0006)
+[2023-02-26 12:14:28,029][00189] Updated weights for policy 0, policy_version 16730 (0.0006)
+[2023-02-26 12:14:28,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59869.8, 300 sec: 58788.0). Total num frames: 68558848. Throughput: 0: 14987.6. Samples: 17140624. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:28,656][00001] Avg episode reward: [(0, '44.058')]
+[2023-02-26 12:14:28,739][00189] Updated weights for policy 0, policy_version 16740 (0.0006)
+[2023-02-26 12:14:29,409][00189] Updated weights for policy 0, policy_version 16750 (0.0006)
+[2023-02-26 12:14:30,103][00189] Updated weights for policy 0, policy_version 16760 (0.0006)
+[2023-02-26 12:14:30,790][00189] Updated weights for policy 0, policy_version 16770 (0.0006)
+[2023-02-26 12:14:31,463][00189] Updated weights for policy 0, policy_version 16780 (0.0006)
+[2023-02-26 12:14:32,134][00189] Updated weights for policy 0, policy_version 16790 (0.0006)
+[2023-02-26 12:14:32,808][00189] Updated weights for policy 0, policy_version 16800 (0.0006)
+[2023-02-26 12:14:33,544][00189] Updated weights for policy 0, policy_version 16810 (0.0007)
+[2023-02-26 12:14:33,656][00001] Fps is (10 sec: 60211.0, 60 sec: 59869.9, 300 sec: 58843.6). Total num frames: 68861952. Throughput: 0: 15003.7. Samples: 17185572. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:33,656][00001] Avg episode reward: [(0, '43.327')]
+[2023-02-26 12:14:34,183][00189] Updated weights for policy 0, policy_version 16820 (0.0006)
+[2023-02-26 12:14:34,843][00189] Updated weights for policy 0, policy_version 16830 (0.0006)
+[2023-02-26 12:14:35,553][00189] Updated weights for policy 0, policy_version 16840 (0.0006)
+[2023-02-26 12:14:36,243][00189] Updated weights for policy 0, policy_version 16850 (0.0006)
+[2023-02-26 12:14:36,890][00189] Updated weights for policy 0, policy_version 16860 (0.0006)
+[2023-02-26 12:14:37,611][00189] Updated weights for policy 0, policy_version 16870 (0.0006)
+[2023-02-26 12:14:38,254][00189] Updated weights for policy 0, policy_version 16880 (0.0006)
+[2023-02-26 12:14:38,656][00001] Fps is (10 sec: 60211.6, 60 sec: 59869.8, 300 sec: 58913.0). Total num frames: 69160960. Throughput: 0: 15014.8. Samples: 17275760. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:38,656][00001] Avg episode reward: [(0, '41.714')]
+[2023-02-26 12:14:38,927][00189] Updated weights for policy 0, policy_version 16890 (0.0006)
+[2023-02-26 12:14:39,657][00189] Updated weights for policy 0, policy_version 16900 (0.0006)
+[2023-02-26 12:14:40,303][00189] Updated weights for policy 0, policy_version 16910 (0.0006)
+[2023-02-26 12:14:40,988][00189] Updated weights for policy 0, policy_version 16920 (0.0006)
+[2023-02-26 12:14:41,707][00189] Updated weights for policy 0, policy_version 16930 (0.0007)
+[2023-02-26 12:14:42,363][00189] Updated weights for policy 0, policy_version 16940 (0.0007)
+[2023-02-26 12:14:43,049][00189] Updated weights for policy 0, policy_version 16950 (0.0006)
+[2023-02-26 12:14:43,656][00001] Fps is (10 sec: 60211.2, 60 sec: 60006.4, 300 sec: 58954.6). Total num frames: 69464064. Throughput: 0: 15017.9. Samples: 17365936. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:43,656][00001] Avg episode reward: [(0, '42.822')]
+[2023-02-26 12:14:43,730][00189] Updated weights for policy 0, policy_version 16960 (0.0006)
+[2023-02-26 12:14:44,396][00189] Updated weights for policy 0, policy_version 16970 (0.0006)
+[2023-02-26 12:14:45,087][00189] Updated weights for policy 0, policy_version 16980 (0.0006)
+[2023-02-26 12:14:45,760][00189] Updated weights for policy 0, policy_version 16990 (0.0006)
+[2023-02-26 12:14:46,464][00189] Updated weights for policy 0, policy_version 17000 (0.0006)
+[2023-02-26 12:14:47,143][00189] Updated weights for policy 0, policy_version 17010 (0.0006)
+[2023-02-26 12:14:47,813][00189] Updated weights for policy 0, policy_version 17020 (0.0006)
+[2023-02-26 12:14:48,505][00189] Updated weights for policy 0, policy_version 17030 (0.0006)
+[2023-02-26 12:14:48,656][00001] Fps is (10 sec: 60211.3, 60 sec: 60074.7, 300 sec: 58940.8). Total num frames: 69763072. Throughput: 0: 15015.5. Samples: 17411064. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:14:48,656][00001] Avg episode reward: [(0, '49.510')]
+[2023-02-26 12:14:48,656][00141] Saving new best policy, reward=49.510!
+[2023-02-26 12:14:49,192][00189] Updated weights for policy 0, policy_version 17040 (0.0006)
+[2023-02-26 12:14:49,859][00189] Updated weights for policy 0, policy_version 17050 (0.0006)
+[2023-02-26 12:14:50,526][00189] Updated weights for policy 0, policy_version 17060 (0.0006)
+[2023-02-26 12:14:51,250][00189] Updated weights for policy 0, policy_version 17070 (0.0006)
+[2023-02-26 12:14:51,902][00189] Updated weights for policy 0, policy_version 17080 (0.0006)
+[2023-02-26 12:14:52,608][00189] Updated weights for policy 0, policy_version 17090 (0.0006)
+[2023-02-26 12:14:53,292][00189] Updated weights for policy 0, policy_version 17100 (0.0006)
+[2023-02-26 12:14:53,656][00001] Fps is (10 sec: 59801.6, 60 sec: 60074.7, 300 sec: 58982.4). Total num frames: 70062080. Throughput: 0: 15004.1. Samples: 17500904. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:14:53,656][00001] Avg episode reward: [(0, '43.983')]
+[2023-02-26 12:14:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000017105_70062080.pth...
+[2023-02-26 12:14:53,698][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000013637_55857152.pth
+[2023-02-26 12:14:53,953][00189] Updated weights for policy 0, policy_version 17110 (0.0006)
+[2023-02-26 12:14:54,661][00189] Updated weights for policy 0, policy_version 17120 (0.0006)
+[2023-02-26 12:14:55,312][00189] Updated weights for policy 0, policy_version 17130 (0.0006)
+[2023-02-26 12:14:56,026][00189] Updated weights for policy 0, policy_version 17140 (0.0007)
+[2023-02-26 12:14:56,700][00189] Updated weights for policy 0, policy_version 17150 (0.0006)
+[2023-02-26 12:14:57,383][00189] Updated weights for policy 0, policy_version 17160 (0.0006)
+[2023-02-26 12:14:58,099][00189] Updated weights for policy 0, policy_version 17170 (0.0007)
+[2023-02-26 12:14:58,656][00001] Fps is (10 sec: 59801.8, 60 sec: 60006.4, 300 sec: 58996.3). Total num frames: 70361088. Throughput: 0: 15000.4. Samples: 17590784. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0)
+[2023-02-26 12:14:58,656][00001] Avg episode reward: [(0, '42.822')]
+[2023-02-26 12:14:58,735][00189] Updated weights for policy 0, policy_version 17180 (0.0006)
+[2023-02-26 12:14:59,452][00189] Updated weights for policy 0, policy_version 17190 (0.0006)
+[2023-02-26 12:15:00,115][00189] Updated weights for policy 0, policy_version 17200 (0.0006)
+[2023-02-26 12:15:00,789][00189] Updated weights for policy 0, policy_version 17210 (0.0006)
+[2023-02-26 12:15:01,483][00189] Updated weights for policy 0, policy_version 17220 (0.0006)
+[2023-02-26 12:15:02,163][00189] Updated weights for policy 0, policy_version 17230 (0.0006)
+[2023-02-26 12:15:02,823][00189] Updated weights for policy 0, policy_version 17240 (0.0006)
+[2023-02-26 12:15:03,558][00189] Updated weights for policy 0, policy_version 17250 (0.0007)
+[2023-02-26 12:15:03,656][00001] Fps is (10 sec: 60211.1, 60 sec: 60074.7, 300 sec: 59024.1). Total num frames: 70664192. Throughput: 0: 15003.3. Samples: 17635788. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:15:03,656][00001] Avg episode reward: [(0, '43.399')]
+[2023-02-26 12:15:04,214][00189] Updated weights for policy 0, policy_version 17260 (0.0006)
+[2023-02-26 12:15:04,900][00189] Updated weights for policy 0, policy_version 17270 (0.0006)
+[2023-02-26 12:15:05,597][00189] Updated weights for policy 0, policy_version 17280 (0.0006)
+[2023-02-26 12:15:06,246][00189] Updated weights for policy 0, policy_version 17290 (0.0006)
+[2023-02-26 12:15:06,946][00189] Updated weights for policy 0, policy_version 17300 (0.0006)
+[2023-02-26 12:15:07,610][00189] Updated weights for policy 0, policy_version 17310 (0.0006)
+[2023-02-26 12:15:08,323][00189] Updated weights for policy 0, policy_version 17320 (0.0007)
+[2023-02-26 12:15:08,656][00001] Fps is (10 sec: 60211.2, 60 sec: 60006.4, 300 sec: 59051.8). Total num frames: 70963200. Throughput: 0: 15003.7. Samples: 17725856. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:15:08,656][00001] Avg episode reward: [(0, '43.590')]
+[2023-02-26 12:15:09,004][00189] Updated weights for policy 0, policy_version 17330 (0.0006)
+[2023-02-26 12:15:09,658][00189] Updated weights for policy 0, policy_version 17340 (0.0006)
+[2023-02-26 12:15:10,364][00189] Updated weights for policy 0, policy_version 17350 (0.0006)
+[2023-02-26 12:15:11,049][00189] Updated weights for policy 0, policy_version 17360 (0.0006)
+[2023-02-26 12:15:11,712][00189] Updated weights for policy 0, policy_version 17370 (0.0006)
+[2023-02-26 12:15:12,430][00189] Updated weights for policy 0, policy_version 17380 (0.0006)
+[2023-02-26 12:15:13,074][00189] Updated weights for policy 0, policy_version 17390 (0.0006)
+[2023-02-26 12:15:13,656][00001] Fps is (10 sec: 59801.7, 60 sec: 60006.4, 300 sec: 59065.7). Total num frames: 71262208. Throughput: 0: 15001.1. Samples: 17815672. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:15:13,656][00001] Avg episode reward: [(0, '42.378')]
+[2023-02-26 12:15:13,783][00189] Updated weights for policy 0, policy_version 17400 (0.0006)
+[2023-02-26 12:15:14,472][00189] Updated weights for policy 0, policy_version 17410 (0.0006)
+[2023-02-26 12:15:15,109][00189] Updated weights for policy 0, policy_version 17420 (0.0007)
+[2023-02-26 12:15:15,862][00189] Updated weights for policy 0, policy_version 17430 (0.0006)
+[2023-02-26 12:15:16,502][00189] Updated weights for policy 0, policy_version 17440 (0.0007)
+[2023-02-26 12:15:17,192][00189] Updated weights for policy 0, policy_version 17450 (0.0006)
+[2023-02-26 12:15:17,920][00189] Updated weights for policy 0, policy_version 17460 (0.0007)
+[2023-02-26 12:15:18,553][00189] Updated weights for policy 0, policy_version 17470 (0.0006)
+[2023-02-26 12:15:18,656][00001] Fps is (10 sec: 59801.2, 60 sec: 60006.4, 300 sec: 59107.4). Total num frames: 71561216. Throughput: 0: 14999.3. Samples: 17860540. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 12:15:18,656][00001] Avg episode reward: [(0, '43.996')]
+[2023-02-26 12:15:19,266][00189] Updated weights for policy 0, policy_version 17480 (0.0007)
+[2023-02-26 12:15:19,937][00189] Updated weights for policy 0, policy_version 17490 (0.0007)
+[2023-02-26 12:15:20,626][00189] Updated weights for policy 0, policy_version 17500 (0.0006)
+[2023-02-26 12:15:21,290][00189] Updated weights for policy 0, policy_version 17510 (0.0006)
+[2023-02-26 12:15:21,970][00189] Updated weights for policy 0, policy_version 17520 (0.0007)
+[2023-02-26 12:15:22,681][00189] Updated weights for policy 0, policy_version 17530 (0.0006)
+[2023-02-26 12:15:23,364][00189] Updated weights for policy 0, policy_version 17540 (0.0006)
+[2023-02-26 12:15:23,656][00001] Fps is (10 sec: 59801.6, 60 sec: 60006.4, 300 sec: 59149.0). Total num frames: 71860224. Throughput: 0: 14991.7. Samples: 17950388. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 12:15:23,656][00001] Avg episode reward: [(0, '44.769')]
+[2023-02-26 12:15:24,024][00189] Updated weights for policy 0, policy_version 17550 (0.0006)
+[2023-02-26 12:15:24,750][00189] Updated weights for policy 0, policy_version 17560 (0.0006)
+[2023-02-26 12:15:25,416][00189] Updated weights for policy 0, policy_version 17570 (0.0006)
+[2023-02-26 12:15:26,087][00189] Updated weights for policy 0, policy_version 17580 (0.0006)
+[2023-02-26 12:15:26,783][00189] Updated weights for policy 0, policy_version 17590 (0.0006)
+[2023-02-26 12:15:27,440][00189] Updated weights for policy 0, policy_version 17600 (0.0006)
+[2023-02-26 12:15:28,150][00189] Updated weights for policy 0, policy_version 17610 (0.0006)
+[2023-02-26 12:15:28,656][00001] Fps is (10 sec: 59801.8, 60 sec: 60006.5, 300 sec: 59190.7). Total num frames: 72159232. Throughput: 0: 14984.4. Samples: 18040232. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:15:28,656][00001] Avg episode reward: [(0, '40.103')]
+[2023-02-26 12:15:28,825][00189] Updated weights for policy 0, policy_version 17620 (0.0006)
+[2023-02-26 12:15:29,476][00189] Updated weights for policy 0, policy_version 17630 (0.0006)
+[2023-02-26 12:15:30,232][00189] Updated weights for policy 0, policy_version 17640 (0.0006)
+[2023-02-26 12:15:30,878][00189] Updated weights for policy 0, policy_version 17650 (0.0006)
+[2023-02-26 12:15:31,539][00189] Updated weights for policy 0, policy_version 17660 (0.0006)
+[2023-02-26 12:15:32,283][00189] Updated weights for policy 0, policy_version 17670 (0.0006)
+[2023-02-26 12:15:32,939][00189] Updated weights for policy 0, policy_version 17680 (0.0007)
+[2023-02-26 12:15:33,596][00189] Updated weights for policy 0, policy_version 17690 (0.0007)
+[2023-02-26 12:15:33,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59938.1, 300 sec: 59218.4). Total num frames: 72458240. Throughput: 0: 14972.7. Samples: 18084836. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:15:33,656][00001] Avg episode reward: [(0, '42.533')]
+[2023-02-26 12:15:34,336][00189] Updated weights for policy 0, policy_version 17700 (0.0006)
+[2023-02-26 12:15:34,981][00189] Updated weights for policy 0, policy_version 17710 (0.0006)
+[2023-02-26 12:15:35,627][00189] Updated weights for policy 0, policy_version 17720 (0.0006)
+[2023-02-26 12:15:36,376][00189] Updated weights for policy 0, policy_version 17730 (0.0007)
+[2023-02-26 12:15:37,020][00189] Updated weights for policy 0, policy_version 17740 (0.0006)
+[2023-02-26 12:15:37,703][00189] Updated weights for policy 0, policy_version 17750 (0.0006)
+[2023-02-26 12:15:38,406][00189] Updated weights for policy 0, policy_version 17760 (0.0006)
+[2023-02-26 12:15:38,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59938.2, 300 sec: 59246.2). Total num frames: 72757248. Throughput: 0: 14978.7. Samples: 18174944. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0)
+[2023-02-26 12:15:38,656][00001] Avg episode reward: [(0, '43.489')]
+[2023-02-26 12:15:39,070][00189] Updated weights for policy 0, policy_version 17770 (0.0006)
+[2023-02-26 12:15:39,779][00189] Updated weights for policy 0, policy_version 17780 (0.0006)
+[2023-02-26 12:15:40,422][00189] Updated weights for policy 0, policy_version 17790 (0.0006)
+[2023-02-26 12:15:41,139][00189] Updated weights for policy 0, policy_version 17800 (0.0006)
+[2023-02-26 12:15:41,802][00189] Updated weights for policy 0, policy_version 17810 (0.0006)
+[2023-02-26 12:15:42,470][00189] Updated weights for policy 0, policy_version 17820 (0.0006)
+[2023-02-26 12:15:43,206][00189] Updated weights for policy 0, policy_version 17830 (0.0006)
+[2023-02-26 12:15:43,656][00001] Fps is (10 sec: 59801.6, 60 sec: 59869.9, 300 sec: 59287.9). Total num frames: 73056256. Throughput: 0: 14977.1. Samples: 18264756. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:15:43,656][00001] Avg episode reward: [(0, '44.333')]
+[2023-02-26 12:15:43,892][00189] Updated weights for policy 0, policy_version 17840 (0.0006)
+[2023-02-26 12:15:44,587][00189] Updated weights for policy 0, policy_version 17850 (0.0007)
+[2023-02-26 12:15:45,337][00189] Updated weights for policy 0, policy_version 17860 (0.0008)
+[2023-02-26 12:15:46,067][00189] Updated weights for policy 0, policy_version 17870 (0.0007)
+[2023-02-26 12:15:46,744][00189] Updated weights for policy 0, policy_version 17880 (0.0007)
+[2023-02-26 12:15:47,540][00189] Updated weights for policy 0, policy_version 17890 (0.0007)
+[2023-02-26 12:15:48,176][00189] Updated weights for policy 0, policy_version 17900 (0.0007)
+[2023-02-26 12:15:48,656][00001] Fps is (10 sec: 58572.2, 60 sec: 59665.0, 300 sec: 59274.0). Total num frames: 73342976. Throughput: 0: 14917.6. Samples: 18307080. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:15:48,656][00001] Avg episode reward: [(0, '47.095')]
+[2023-02-26 12:15:48,891][00189] Updated weights for policy 0, policy_version 17910 (0.0006)
+[2023-02-26 12:15:49,602][00189] Updated weights for policy 0, policy_version 17920 (0.0006)
+[2023-02-26 12:15:50,288][00189] Updated weights for policy 0, policy_version 17930 (0.0006)
+[2023-02-26 12:15:50,989][00189] Updated weights for policy 0, policy_version 17940 (0.0007)
+[2023-02-26 12:15:51,725][00189] Updated weights for policy 0, policy_version 17950 (0.0006)
+[2023-02-26 12:15:52,375][00189] Updated weights for policy 0, policy_version 17960 (0.0006)
+[2023-02-26 12:15:53,066][00189] Updated weights for policy 0, policy_version 17970 (0.0007)
+[2023-02-26 12:15:53,656][00001] Fps is (10 sec: 58163.0, 60 sec: 59596.8, 300 sec: 59301.8). Total num frames: 73637888. Throughput: 0: 14865.9. Samples: 18394824. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
+[2023-02-26 12:15:53,656][00001] Avg episode reward: [(0, '43.293')]
+[2023-02-26 12:15:53,797][00189] Updated weights for policy 0, policy_version 17980 (0.0006)
+[2023-02-26 12:15:54,002][00141] Signal inference workers to stop experience collection... (550 times)
+[2023-02-26 12:15:54,003][00141] Signal inference workers to resume experience collection... (550 times)
+[2023-02-26 12:15:54,006][00189] InferenceWorker_p0-w0: stopping experience collection (550 times)
+[2023-02-26 12:15:54,006][00189] InferenceWorker_p0-w0: resuming experience collection (550 times)
+[2023-02-26 12:15:54,427][00189] Updated weights for policy 0, policy_version 17990 (0.0006)
+[2023-02-26 12:15:55,151][00189] Updated weights for policy 0, policy_version 18000 (0.0007)
+[2023-02-26 12:15:55,919][00189] Updated weights for policy 0, policy_version 18010 (0.0007)
+[2023-02-26 12:15:56,611][00189] Updated weights for policy 0, policy_version 18020 (0.0007)
+[2023-02-26 12:15:57,288][00189] Updated weights for policy 0, policy_version 18030 (0.0006)
+[2023-02-26 12:15:58,002][00189] Updated weights for policy 0, policy_version 18040 (0.0006)
+[2023-02-26 12:15:58,656][00001] Fps is (10 sec: 58573.1, 60 sec: 59460.2, 300 sec: 59274.0). Total num frames: 73928704. Throughput: 0: 14818.2. Samples: 18482492. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
+[2023-02-26 12:15:58,656][00001] Avg episode reward: [(0, '41.542')]
+[2023-02-26 12:15:58,693][00189] Updated weights for policy 0, policy_version 18050 (0.0006)
+[2023-02-26 12:15:59,390][00189] Updated weights for policy 0, policy_version 18060 (0.0007)
+[2023-02-26 12:16:00,081][00189] Updated weights for policy 0, policy_version 18070 (0.0006)
+[2023-02-26 12:16:00,794][00189] Updated weights for policy 0, policy_version 18080 (0.0006)
+[2023-02-26 12:16:01,479][00189] Updated weights for policy 0, policy_version 18090 (0.0006)
+[2023-02-26 12:16:02,194][00189] Updated weights for policy 0, policy_version 18100 (0.0006)
+[2023-02-26 12:16:02,879][00189] Updated weights for policy 0, policy_version 18110 (0.0006)
+[2023-02-26 12:16:03,528][00189] Updated weights for policy 0, policy_version 18120 (0.0006)
+[2023-02-26 12:16:03,656][00001] Fps is (10 sec: 58573.1, 60 sec: 59323.8, 300 sec: 59260.1). Total num frames: 74223616. Throughput: 0: 14802.4. Samples: 18526648. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
+[2023-02-26 12:16:03,656][00001] Avg episode reward: [(0, '45.168')]
+[2023-02-26 12:16:04,245][00189] Updated weights for policy 0, policy_version 18130 (0.0006)
+[2023-02-26 12:16:04,975][00189] Updated weights for policy 0, policy_version 18140 (0.0007)
+[2023-02-26 12:16:05,692][00189] Updated weights for policy 0, policy_version 18150 (0.0007)
+[2023-02-26 12:16:06,453][00189] Updated weights for policy 0, policy_version 18160 (0.0007)
+[2023-02-26 12:16:07,152][00189] Updated weights for policy 0, policy_version 18170 (0.0006)
+[2023-02-26 12:16:07,841][00189] Updated weights for policy 0, policy_version 18180 (0.0006)
+[2023-02-26 12:16:08,548][00189] Updated weights for policy 0, policy_version 18190 (0.0006)
+[2023-02-26 12:16:08,656][00001] Fps is (10 sec: 58162.5, 60 sec: 59118.8, 300 sec: 59218.4). Total num frames: 74510336. Throughput: 0: 14731.6. Samples: 18613312. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:16:08,656][00001] Avg episode reward: [(0, '47.889')]
+[2023-02-26 12:16:09,290][00189] Updated weights for policy 0, policy_version 18200 (0.0006)
+[2023-02-26 12:16:09,963][00189] Updated weights for policy 0, policy_version 18210 (0.0006)
+[2023-02-26 12:16:10,686][00189] Updated weights for policy 0, policy_version 18220 (0.0006)
+[2023-02-26 12:16:11,376][00189] Updated weights for policy 0, policy_version 18230 (0.0006)
+[2023-02-26 12:16:12,100][00189] Updated weights for policy 0, policy_version 18240 (0.0006)
+[2023-02-26 12:16:12,782][00189] Updated weights for policy 0, policy_version 18250 (0.0006)
+[2023-02-26 12:16:13,465][00189] Updated weights for policy 0, policy_version 18260 (0.0006)
+[2023-02-26 12:16:13,656][00001] Fps is (10 sec: 57753.2, 60 sec: 58982.3, 300 sec: 59190.7). Total num frames: 74801152. Throughput: 0: 14684.0. Samples: 18701012. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:16:13,656][00001] Avg episode reward: [(0, '42.604')]
+[2023-02-26 12:16:14,167][00189] Updated weights for policy 0, policy_version 18270 (0.0006)
+[2023-02-26 12:16:14,846][00189] Updated weights for policy 0, policy_version 18280 (0.0006)
+[2023-02-26 12:16:15,539][00189] Updated weights for policy 0, policy_version 18290 (0.0006)
+[2023-02-26 12:16:16,276][00189] Updated weights for policy 0, policy_version 18300 (0.0007)
+[2023-02-26 12:16:16,935][00189] Updated weights for policy 0, policy_version 18310 (0.0006)
+[2023-02-26 12:16:17,580][00189] Updated weights for policy 0, policy_version 18320 (0.0006)
+[2023-02-26 12:16:18,308][00189] Updated weights for policy 0, policy_version 18330 (0.0006)
+[2023-02-26 12:16:18,656][00001] Fps is (10 sec: 58983.0, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 75100160. Throughput: 0: 14677.7. Samples: 18745332. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0)
+[2023-02-26 12:16:18,656][00001] Avg episode reward: [(0, '46.397')]
+[2023-02-26 12:16:18,986][00189] Updated weights for policy 0, policy_version 18340 (0.0006)
+[2023-02-26 12:16:19,641][00189] Updated weights for policy 0, policy_version 18350 (0.0006)
+[2023-02-26 12:16:20,351][00189] Updated weights for policy 0, policy_version 18360 (0.0006)
+[2023-02-26 12:16:21,056][00189] Updated weights for policy 0, policy_version 18370 (0.0007)
+[2023-02-26 12:16:21,697][00189] Updated weights for policy 0, policy_version 18380 (0.0007)
+[2023-02-26 12:16:22,375][00189] Updated weights for policy 0, policy_version 18390 (0.0006)
+[2023-02-26 12:16:23,071][00189] Updated weights for policy 0, policy_version 18400 (0.0006)
+[2023-02-26 12:16:23,656][00001] Fps is (10 sec: 59801.4, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 75399168. Throughput: 0: 14675.2. Samples: 18835328.
Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:16:23,656][00001] Avg episode reward: [(0, '45.779')] +[2023-02-26 12:16:23,752][00189] Updated weights for policy 0, policy_version 18410 (0.0006) +[2023-02-26 12:16:24,403][00189] Updated weights for policy 0, policy_version 18420 (0.0006) +[2023-02-26 12:16:25,136][00189] Updated weights for policy 0, policy_version 18430 (0.0007) +[2023-02-26 12:16:25,805][00189] Updated weights for policy 0, policy_version 18440 (0.0006) +[2023-02-26 12:16:26,467][00189] Updated weights for policy 0, policy_version 18450 (0.0006) +[2023-02-26 12:16:27,148][00189] Updated weights for policy 0, policy_version 18460 (0.0007) +[2023-02-26 12:16:27,878][00189] Updated weights for policy 0, policy_version 18470 (0.0006) +[2023-02-26 12:16:28,533][00189] Updated weights for policy 0, policy_version 18480 (0.0006) +[2023-02-26 12:16:28,656][00001] Fps is (10 sec: 59801.7, 60 sec: 58982.4, 300 sec: 59218.4). Total num frames: 75698176. Throughput: 0: 14675.8. Samples: 18925168. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:16:28,656][00001] Avg episode reward: [(0, '42.461')] +[2023-02-26 12:16:29,192][00189] Updated weights for policy 0, policy_version 18490 (0.0006) +[2023-02-26 12:16:29,927][00189] Updated weights for policy 0, policy_version 18500 (0.0006) +[2023-02-26 12:16:30,581][00189] Updated weights for policy 0, policy_version 18510 (0.0006) +[2023-02-26 12:16:31,266][00189] Updated weights for policy 0, policy_version 18520 (0.0006) +[2023-02-26 12:16:31,955][00189] Updated weights for policy 0, policy_version 18530 (0.0007) +[2023-02-26 12:16:32,618][00189] Updated weights for policy 0, policy_version 18540 (0.0006) +[2023-02-26 12:16:33,336][00189] Updated weights for policy 0, policy_version 18550 (0.0006) +[2023-02-26 12:16:33,656][00001] Fps is (10 sec: 59802.0, 60 sec: 58982.4, 300 sec: 59232.3). Total num frames: 75997184. Throughput: 0: 14734.8. Samples: 18970144. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:16:33,656][00001] Avg episode reward: [(0, '43.872')] +[2023-02-26 12:16:34,016][00189] Updated weights for policy 0, policy_version 18560 (0.0006) +[2023-02-26 12:16:34,681][00189] Updated weights for policy 0, policy_version 18570 (0.0006) +[2023-02-26 12:16:35,381][00189] Updated weights for policy 0, policy_version 18580 (0.0006) +[2023-02-26 12:16:36,076][00189] Updated weights for policy 0, policy_version 18590 (0.0006) +[2023-02-26 12:16:36,753][00189] Updated weights for policy 0, policy_version 18600 (0.0006) +[2023-02-26 12:16:37,450][00189] Updated weights for policy 0, policy_version 18610 (0.0006) +[2023-02-26 12:16:38,134][00189] Updated weights for policy 0, policy_version 18620 (0.0007) +[2023-02-26 12:16:38,656][00001] Fps is (10 sec: 59801.4, 60 sec: 58982.3, 300 sec: 59287.9). Total num frames: 76296192. Throughput: 0: 14774.7. Samples: 19059688. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:16:38,656][00001] Avg episode reward: [(0, '42.100')] +[2023-02-26 12:16:38,809][00189] Updated weights for policy 0, policy_version 18630 (0.0006) +[2023-02-26 12:16:39,501][00189] Updated weights for policy 0, policy_version 18640 (0.0006) +[2023-02-26 12:16:40,181][00189] Updated weights for policy 0, policy_version 18650 (0.0006) +[2023-02-26 12:16:40,887][00189] Updated weights for policy 0, policy_version 18660 (0.0006) +[2023-02-26 12:16:41,578][00189] Updated weights for policy 0, policy_version 18670 (0.0006) +[2023-02-26 12:16:42,244][00189] Updated weights for policy 0, policy_version 18680 (0.0006) +[2023-02-26 12:16:42,935][00189] Updated weights for policy 0, policy_version 18690 (0.0006) +[2023-02-26 12:16:43,644][00189] Updated weights for policy 0, policy_version 18700 (0.0006) +[2023-02-26 12:16:43,656][00001] Fps is (10 sec: 59801.4, 60 sec: 58982.4, 300 sec: 59301.7). Total num frames: 76595200. Throughput: 0: 14816.3. Samples: 19149224. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:16:43,656][00001] Avg episode reward: [(0, '43.919')] +[2023-02-26 12:16:44,303][00189] Updated weights for policy 0, policy_version 18710 (0.0006) +[2023-02-26 12:16:44,992][00189] Updated weights for policy 0, policy_version 18720 (0.0006) +[2023-02-26 12:16:45,689][00189] Updated weights for policy 0, policy_version 18730 (0.0006) +[2023-02-26 12:16:46,369][00189] Updated weights for policy 0, policy_version 18740 (0.0006) +[2023-02-26 12:16:47,067][00189] Updated weights for policy 0, policy_version 18750 (0.0007) +[2023-02-26 12:16:47,749][00189] Updated weights for policy 0, policy_version 18760 (0.0006) +[2023-02-26 12:16:48,415][00189] Updated weights for policy 0, policy_version 18770 (0.0006) +[2023-02-26 12:16:48,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59187.2, 300 sec: 59329.5). Total num frames: 76894208. Throughput: 0: 14828.6. Samples: 19193936. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:16:48,656][00001] Avg episode reward: [(0, '44.370')] +[2023-02-26 12:16:49,127][00189] Updated weights for policy 0, policy_version 18780 (0.0006) +[2023-02-26 12:16:49,810][00189] Updated weights for policy 0, policy_version 18790 (0.0006) +[2023-02-26 12:16:50,474][00189] Updated weights for policy 0, policy_version 18800 (0.0006) +[2023-02-26 12:16:51,183][00189] Updated weights for policy 0, policy_version 18810 (0.0006) +[2023-02-26 12:16:51,842][00189] Updated weights for policy 0, policy_version 18820 (0.0007) +[2023-02-26 12:16:52,532][00189] Updated weights for policy 0, policy_version 18830 (0.0006) +[2023-02-26 12:16:53,247][00189] Updated weights for policy 0, policy_version 18840 (0.0006) +[2023-02-26 12:16:53,656][00001] Fps is (10 sec: 59392.3, 60 sec: 59187.3, 300 sec: 59329.5). Total num frames: 77189120. Throughput: 0: 14897.7. Samples: 19283708. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:16:53,656][00001] Avg episode reward: [(0, '43.919')] +[2023-02-26 12:16:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000018846_77193216.pth... +[2023-02-26 12:16:53,699][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000015357_62902272.pth +[2023-02-26 12:16:53,913][00189] Updated weights for policy 0, policy_version 18850 (0.0006) +[2023-02-26 12:16:54,606][00189] Updated weights for policy 0, policy_version 18860 (0.0006) +[2023-02-26 12:16:55,264][00189] Updated weights for policy 0, policy_version 18870 (0.0006) +[2023-02-26 12:16:55,987][00189] Updated weights for policy 0, policy_version 18880 (0.0006) +[2023-02-26 12:16:56,676][00189] Updated weights for policy 0, policy_version 18890 (0.0006) +[2023-02-26 12:16:57,334][00189] Updated weights for policy 0, policy_version 18900 (0.0006) +[2023-02-26 12:16:58,039][00189] Updated weights for policy 0, policy_version 18910 (0.0006) +[2023-02-26 12:16:58,656][00001] Fps is (10 sec: 59802.2, 60 sec: 59392.1, 300 sec: 59371.2). Total num frames: 77492224. Throughput: 0: 14937.5. Samples: 19373200. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:16:58,656][00001] Avg episode reward: [(0, '45.981')] +[2023-02-26 12:16:58,710][00189] Updated weights for policy 0, policy_version 18920 (0.0006) +[2023-02-26 12:16:59,373][00189] Updated weights for policy 0, policy_version 18930 (0.0006) +[2023-02-26 12:17:00,068][00189] Updated weights for policy 0, policy_version 18940 (0.0007) +[2023-02-26 12:17:00,768][00189] Updated weights for policy 0, policy_version 18950 (0.0007) +[2023-02-26 12:17:01,402][00189] Updated weights for policy 0, policy_version 18960 (0.0006) +[2023-02-26 12:17:02,100][00189] Updated weights for policy 0, policy_version 18970 (0.0006) +[2023-02-26 12:17:02,793][00189] Updated weights for policy 0, policy_version 18980 (0.0007) +[2023-02-26 12:17:03,475][00189] Updated weights for policy 0, policy_version 18990 (0.0006) +[2023-02-26 12:17:03,656][00001] Fps is (10 sec: 60210.4, 60 sec: 59460.1, 300 sec: 59385.0). Total num frames: 77791232. Throughput: 0: 14955.3. Samples: 19418324. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:17:03,656][00001] Avg episode reward: [(0, '41.867')] +[2023-02-26 12:17:04,173][00189] Updated weights for policy 0, policy_version 19000 (0.0006) +[2023-02-26 12:17:04,850][00189] Updated weights for policy 0, policy_version 19010 (0.0006) +[2023-02-26 12:17:05,540][00189] Updated weights for policy 0, policy_version 19020 (0.0006) +[2023-02-26 12:17:06,201][00189] Updated weights for policy 0, policy_version 19030 (0.0006) +[2023-02-26 12:17:06,904][00189] Updated weights for policy 0, policy_version 19040 (0.0007) +[2023-02-26 12:17:07,590][00189] Updated weights for policy 0, policy_version 19050 (0.0006) +[2023-02-26 12:17:08,266][00189] Updated weights for policy 0, policy_version 19060 (0.0006) +[2023-02-26 12:17:08,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59665.2, 300 sec: 59426.7). Total num frames: 78090240. Throughput: 0: 14952.2. Samples: 19508176. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:17:08,656][00001] Avg episode reward: [(0, '42.432')] +[2023-02-26 12:17:08,930][00189] Updated weights for policy 0, policy_version 19070 (0.0006) +[2023-02-26 12:17:09,595][00189] Updated weights for policy 0, policy_version 19080 (0.0006) +[2023-02-26 12:17:10,312][00189] Updated weights for policy 0, policy_version 19090 (0.0007) +[2023-02-26 12:17:11,011][00189] Updated weights for policy 0, policy_version 19100 (0.0006) +[2023-02-26 12:17:11,661][00189] Updated weights for policy 0, policy_version 19110 (0.0006) +[2023-02-26 12:17:12,339][00189] Updated weights for policy 0, policy_version 19120 (0.0006) +[2023-02-26 12:17:13,024][00189] Updated weights for policy 0, policy_version 19130 (0.0006) +[2023-02-26 12:17:13,656][00001] Fps is (10 sec: 60212.1, 60 sec: 59869.9, 300 sec: 59454.5). Total num frames: 78393344. Throughput: 0: 14962.1. Samples: 19598460. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:13,656][00001] Avg episode reward: [(0, '47.773')] +[2023-02-26 12:17:13,715][00189] Updated weights for policy 0, policy_version 19140 (0.0006) +[2023-02-26 12:17:14,393][00189] Updated weights for policy 0, policy_version 19150 (0.0006) +[2023-02-26 12:17:15,082][00189] Updated weights for policy 0, policy_version 19160 (0.0006) +[2023-02-26 12:17:15,771][00189] Updated weights for policy 0, policy_version 19170 (0.0007) +[2023-02-26 12:17:16,435][00189] Updated weights for policy 0, policy_version 19180 (0.0007) +[2023-02-26 12:17:17,131][00189] Updated weights for policy 0, policy_version 19190 (0.0006) +[2023-02-26 12:17:17,801][00189] Updated weights for policy 0, policy_version 19200 (0.0006) +[2023-02-26 12:17:18,470][00189] Updated weights for policy 0, policy_version 19210 (0.0006) +[2023-02-26 12:17:18,656][00001] Fps is (10 sec: 60210.7, 60 sec: 59869.8, 300 sec: 59468.4). Total num frames: 78692352. Throughput: 0: 14960.7. Samples: 19643376. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:18,656][00001] Avg episode reward: [(0, '44.620')] +[2023-02-26 12:17:19,188][00189] Updated weights for policy 0, policy_version 19220 (0.0006) +[2023-02-26 12:17:19,830][00189] Updated weights for policy 0, policy_version 19230 (0.0006) +[2023-02-26 12:17:20,545][00189] Updated weights for policy 0, policy_version 19240 (0.0007) +[2023-02-26 12:17:21,230][00189] Updated weights for policy 0, policy_version 19250 (0.0007) +[2023-02-26 12:17:21,904][00189] Updated weights for policy 0, policy_version 19260 (0.0006) +[2023-02-26 12:17:22,584][00189] Updated weights for policy 0, policy_version 19270 (0.0007) +[2023-02-26 12:17:23,278][00189] Updated weights for policy 0, policy_version 19280 (0.0006) +[2023-02-26 12:17:23,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59869.9, 300 sec: 59482.3). Total num frames: 78991360. Throughput: 0: 14971.2. Samples: 19733392. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:17:23,656][00001] Avg episode reward: [(0, '44.717')] +[2023-02-26 12:17:23,936][00189] Updated weights for policy 0, policy_version 19290 (0.0006) +[2023-02-26 12:17:24,640][00189] Updated weights for policy 0, policy_version 19300 (0.0006) +[2023-02-26 12:17:25,324][00189] Updated weights for policy 0, policy_version 19310 (0.0006) +[2023-02-26 12:17:26,006][00189] Updated weights for policy 0, policy_version 19320 (0.0006) +[2023-02-26 12:17:26,706][00189] Updated weights for policy 0, policy_version 19330 (0.0006) +[2023-02-26 12:17:27,379][00189] Updated weights for policy 0, policy_version 19340 (0.0006) +[2023-02-26 12:17:28,081][00189] Updated weights for policy 0, policy_version 19350 (0.0006) +[2023-02-26 12:17:28,656][00001] Fps is (10 sec: 59802.1, 60 sec: 59869.9, 300 sec: 59510.0). Total num frames: 79290368. Throughput: 0: 14976.7. Samples: 19823176. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:17:28,656][00001] Avg episode reward: [(0, '45.425')] +[2023-02-26 12:17:28,765][00189] Updated weights for policy 0, policy_version 19360 (0.0006) +[2023-02-26 12:17:29,423][00189] Updated weights for policy 0, policy_version 19370 (0.0006) +[2023-02-26 12:17:30,135][00189] Updated weights for policy 0, policy_version 19380 (0.0006) +[2023-02-26 12:17:30,816][00189] Updated weights for policy 0, policy_version 19390 (0.0006) +[2023-02-26 12:17:31,470][00189] Updated weights for policy 0, policy_version 19400 (0.0006) +[2023-02-26 12:17:32,186][00189] Updated weights for policy 0, policy_version 19410 (0.0006) +[2023-02-26 12:17:32,855][00189] Updated weights for policy 0, policy_version 19420 (0.0006) +[2023-02-26 12:17:33,506][00189] Updated weights for policy 0, policy_version 19430 (0.0006) +[2023-02-26 12:17:33,656][00001] Fps is (10 sec: 59801.3, 60 sec: 59869.8, 300 sec: 59523.9). Total num frames: 79589376. Throughput: 0: 14981.2. Samples: 19868092. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:17:33,656][00001] Avg episode reward: [(0, '45.473')] +[2023-02-26 12:17:34,231][00189] Updated weights for policy 0, policy_version 19440 (0.0006) +[2023-02-26 12:17:34,910][00189] Updated weights for policy 0, policy_version 19450 (0.0006) +[2023-02-26 12:17:35,586][00189] Updated weights for policy 0, policy_version 19460 (0.0006) +[2023-02-26 12:17:36,264][00189] Updated weights for policy 0, policy_version 19470 (0.0006) +[2023-02-26 12:17:36,935][00189] Updated weights for policy 0, policy_version 19480 (0.0006) +[2023-02-26 12:17:37,651][00189] Updated weights for policy 0, policy_version 19490 (0.0006) +[2023-02-26 12:17:38,316][00189] Updated weights for policy 0, policy_version 19500 (0.0006) +[2023-02-26 12:17:38,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59869.9, 300 sec: 59537.8). Total num frames: 79888384. Throughput: 0: 14984.0. Samples: 19957988. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:38,656][00001] Avg episode reward: [(0, '46.784')] +[2023-02-26 12:17:38,985][00189] Updated weights for policy 0, policy_version 19510 (0.0006) +[2023-02-26 12:17:39,698][00189] Updated weights for policy 0, policy_version 19520 (0.0007) +[2023-02-26 12:17:40,377][00189] Updated weights for policy 0, policy_version 19530 (0.0007) +[2023-02-26 12:17:41,054][00189] Updated weights for policy 0, policy_version 19540 (0.0006) +[2023-02-26 12:17:41,732][00189] Updated weights for policy 0, policy_version 19550 (0.0006) +[2023-02-26 12:17:42,479][00189] Updated weights for policy 0, policy_version 19560 (0.0006) +[2023-02-26 12:17:43,137][00189] Updated weights for policy 0, policy_version 19570 (0.0006) +[2023-02-26 12:17:43,656][00001] Fps is (10 sec: 59801.9, 60 sec: 59869.9, 300 sec: 59551.7). Total num frames: 80187392. Throughput: 0: 14985.0. Samples: 20047524. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:43,656][00001] Avg episode reward: [(0, '45.283')] +[2023-02-26 12:17:43,786][00189] Updated weights for policy 0, policy_version 19580 (0.0006) +[2023-02-26 12:17:44,496][00189] Updated weights for policy 0, policy_version 19590 (0.0006) +[2023-02-26 12:17:45,172][00189] Updated weights for policy 0, policy_version 19600 (0.0006) +[2023-02-26 12:17:45,845][00189] Updated weights for policy 0, policy_version 19610 (0.0006) +[2023-02-26 12:17:46,554][00189] Updated weights for policy 0, policy_version 19620 (0.0007) +[2023-02-26 12:17:47,232][00189] Updated weights for policy 0, policy_version 19630 (0.0006) +[2023-02-26 12:17:47,938][00189] Updated weights for policy 0, policy_version 19640 (0.0006) +[2023-02-26 12:17:48,588][00189] Updated weights for policy 0, policy_version 19650 (0.0007) +[2023-02-26 12:17:48,656][00001] Fps is (10 sec: 60211.6, 60 sec: 59938.2, 300 sec: 59621.1). Total num frames: 80490496. Throughput: 0: 14980.1. Samples: 20092424. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:48,656][00001] Avg episode reward: [(0, '43.466')] +[2023-02-26 12:17:49,270][00189] Updated weights for policy 0, policy_version 19660 (0.0007) +[2023-02-26 12:17:49,970][00189] Updated weights for policy 0, policy_version 19670 (0.0006) +[2023-02-26 12:17:50,651][00189] Updated weights for policy 0, policy_version 19680 (0.0006) +[2023-02-26 12:17:51,299][00189] Updated weights for policy 0, policy_version 19690 (0.0006) +[2023-02-26 12:17:52,035][00189] Updated weights for policy 0, policy_version 19700 (0.0006) +[2023-02-26 12:17:52,690][00189] Updated weights for policy 0, policy_version 19710 (0.0006) +[2023-02-26 12:17:53,362][00189] Updated weights for policy 0, policy_version 19720 (0.0006) +[2023-02-26 12:17:53,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59938.1, 300 sec: 59635.0). Total num frames: 80785408. Throughput: 0: 14981.6. Samples: 20182348. Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:53,656][00001] Avg episode reward: [(0, '44.939')] +[2023-02-26 12:17:54,063][00189] Updated weights for policy 0, policy_version 19730 (0.0006) +[2023-02-26 12:17:54,737][00189] Updated weights for policy 0, policy_version 19740 (0.0006) +[2023-02-26 12:17:55,399][00189] Updated weights for policy 0, policy_version 19750 (0.0006) +[2023-02-26 12:17:56,109][00189] Updated weights for policy 0, policy_version 19760 (0.0006) +[2023-02-26 12:17:56,767][00189] Updated weights for policy 0, policy_version 19770 (0.0006) +[2023-02-26 12:17:57,448][00189] Updated weights for policy 0, policy_version 19780 (0.0007) +[2023-02-26 12:17:58,161][00189] Updated weights for policy 0, policy_version 19790 (0.0006) +[2023-02-26 12:17:58,656][00001] Fps is (10 sec: 59801.0, 60 sec: 59938.0, 300 sec: 59690.5). Total num frames: 81088512. Throughput: 0: 14977.8. Samples: 20272464. 
Policy #0 lag: (min: 0.0, avg: 1.6, max: 3.0) +[2023-02-26 12:17:58,656][00001] Avg episode reward: [(0, '43.974')] +[2023-02-26 12:17:58,839][00189] Updated weights for policy 0, policy_version 19800 (0.0006) +[2023-02-26 12:17:59,509][00189] Updated weights for policy 0, policy_version 19810 (0.0006) +[2023-02-26 12:18:00,205][00189] Updated weights for policy 0, policy_version 19820 (0.0007) +[2023-02-26 12:18:00,890][00189] Updated weights for policy 0, policy_version 19830 (0.0006) +[2023-02-26 12:18:01,562][00189] Updated weights for policy 0, policy_version 19840 (0.0006) +[2023-02-26 12:18:02,222][00189] Updated weights for policy 0, policy_version 19850 (0.0006) +[2023-02-26 12:18:02,941][00189] Updated weights for policy 0, policy_version 19860 (0.0006) +[2023-02-26 12:18:03,656][00001] Fps is (10 sec: 59802.0, 60 sec: 59870.0, 300 sec: 59704.4). Total num frames: 81383424. Throughput: 0: 14980.2. Samples: 20317484. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:18:03,656][00001] Avg episode reward: [(0, '48.422')] +[2023-02-26 12:18:03,660][00189] Updated weights for policy 0, policy_version 19870 (0.0007) +[2023-02-26 12:18:04,302][00189] Updated weights for policy 0, policy_version 19880 (0.0006) +[2023-02-26 12:18:05,032][00189] Updated weights for policy 0, policy_version 19890 (0.0006) +[2023-02-26 12:18:05,697][00189] Updated weights for policy 0, policy_version 19900 (0.0006) +[2023-02-26 12:18:06,387][00189] Updated weights for policy 0, policy_version 19910 (0.0006) +[2023-02-26 12:18:07,064][00189] Updated weights for policy 0, policy_version 19920 (0.0006) +[2023-02-26 12:18:07,726][00189] Updated weights for policy 0, policy_version 19930 (0.0006) +[2023-02-26 12:18:08,438][00189] Updated weights for policy 0, policy_version 19940 (0.0006) +[2023-02-26 12:18:08,656][00001] Fps is (10 sec: 59802.2, 60 sec: 59938.2, 300 sec: 59718.3). Total num frames: 81686528. Throughput: 0: 14965.7. Samples: 20406848. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:18:08,656][00001] Avg episode reward: [(0, '45.076')] +[2023-02-26 12:18:09,122][00189] Updated weights for policy 0, policy_version 19950 (0.0006) +[2023-02-26 12:18:09,811][00189] Updated weights for policy 0, policy_version 19960 (0.0006) +[2023-02-26 12:18:10,517][00189] Updated weights for policy 0, policy_version 19970 (0.0006) +[2023-02-26 12:18:11,170][00189] Updated weights for policy 0, policy_version 19980 (0.0006) +[2023-02-26 12:18:11,870][00189] Updated weights for policy 0, policy_version 19990 (0.0006) +[2023-02-26 12:18:12,568][00189] Updated weights for policy 0, policy_version 20000 (0.0006) +[2023-02-26 12:18:13,226][00189] Updated weights for policy 0, policy_version 20010 (0.0006) +[2023-02-26 12:18:13,656][00001] Fps is (10 sec: 60211.0, 60 sec: 59869.8, 300 sec: 59718.3). Total num frames: 81985536. Throughput: 0: 14963.2. Samples: 20496520. Policy #0 lag: (min: 0.0, avg: 1.8, max: 3.0) +[2023-02-26 12:18:13,656][00001] Avg episode reward: [(0, '44.643')] +[2023-02-26 12:18:13,926][00189] Updated weights for policy 0, policy_version 20020 (0.0006) +[2023-02-26 12:18:14,621][00189] Updated weights for policy 0, policy_version 20030 (0.0007) +[2023-02-26 12:18:15,288][00189] Updated weights for policy 0, policy_version 20040 (0.0006) +[2023-02-26 12:18:15,979][00189] Updated weights for policy 0, policy_version 20050 (0.0006) +[2023-02-26 12:18:16,665][00189] Updated weights for policy 0, policy_version 20060 (0.0007) +[2023-02-26 12:18:17,361][00189] Updated weights for policy 0, policy_version 20070 (0.0007) +[2023-02-26 12:18:18,017][00189] Updated weights for policy 0, policy_version 20080 (0.0007) +[2023-02-26 12:18:18,656][00001] Fps is (10 sec: 59801.3, 60 sec: 59869.9, 300 sec: 59718.3). Total num frames: 82284544. Throughput: 0: 14958.2. Samples: 20541208. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:18:18,656][00001] Avg episode reward: [(0, '44.297')] +[2023-02-26 12:18:18,682][00189] Updated weights for policy 0, policy_version 20090 (0.0006) +[2023-02-26 12:18:19,421][00189] Updated weights for policy 0, policy_version 20100 (0.0006) +[2023-02-26 12:18:20,091][00189] Updated weights for policy 0, policy_version 20110 (0.0007) +[2023-02-26 12:18:20,760][00189] Updated weights for policy 0, policy_version 20120 (0.0006) +[2023-02-26 12:18:21,490][00189] Updated weights for policy 0, policy_version 20130 (0.0006) +[2023-02-26 12:18:22,154][00189] Updated weights for policy 0, policy_version 20140 (0.0007) +[2023-02-26 12:18:22,819][00189] Updated weights for policy 0, policy_version 20150 (0.0006) +[2023-02-26 12:18:23,537][00189] Updated weights for policy 0, policy_version 20160 (0.0006) +[2023-02-26 12:18:23,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59869.9, 300 sec: 59718.3). Total num frames: 82583552. Throughput: 0: 14954.9. Samples: 20630960. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:18:23,656][00001] Avg episode reward: [(0, '47.631')] +[2023-02-26 12:18:24,185][00189] Updated weights for policy 0, policy_version 20170 (0.0006) +[2023-02-26 12:18:24,875][00189] Updated weights for policy 0, policy_version 20180 (0.0007) +[2023-02-26 12:18:25,197][00141] Signal inference workers to stop experience collection... (600 times) +[2023-02-26 12:18:25,198][00141] Signal inference workers to resume experience collection... 
(600 times) +[2023-02-26 12:18:25,201][00189] InferenceWorker_p0-w0: stopping experience collection (600 times) +[2023-02-26 12:18:25,202][00189] InferenceWorker_p0-w0: resuming experience collection (600 times) +[2023-02-26 12:18:25,620][00189] Updated weights for policy 0, policy_version 20190 (0.0007) +[2023-02-26 12:18:26,246][00189] Updated weights for policy 0, policy_version 20200 (0.0007) +[2023-02-26 12:18:26,931][00189] Updated weights for policy 0, policy_version 20210 (0.0007) +[2023-02-26 12:18:27,636][00189] Updated weights for policy 0, policy_version 20220 (0.0006) +[2023-02-26 12:18:28,272][00189] Updated weights for policy 0, policy_version 20230 (0.0006) +[2023-02-26 12:18:28,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59869.9, 300 sec: 59704.4). Total num frames: 82882560. Throughput: 0: 14962.6. Samples: 20720840. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) +[2023-02-26 12:18:28,656][00001] Avg episode reward: [(0, '42.411')] +[2023-02-26 12:18:28,977][00189] Updated weights for policy 0, policy_version 20240 (0.0007) +[2023-02-26 12:18:29,668][00189] Updated weights for policy 0, policy_version 20250 (0.0006) +[2023-02-26 12:18:30,367][00189] Updated weights for policy 0, policy_version 20260 (0.0007) +[2023-02-26 12:18:31,054][00189] Updated weights for policy 0, policy_version 20270 (0.0006) +[2023-02-26 12:18:31,707][00189] Updated weights for policy 0, policy_version 20280 (0.0006) +[2023-02-26 12:18:32,417][00189] Updated weights for policy 0, policy_version 20290 (0.0006) +[2023-02-26 12:18:33,108][00189] Updated weights for policy 0, policy_version 20300 (0.0007) +[2023-02-26 12:18:33,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59870.0, 300 sec: 59704.4). Total num frames: 83181568. Throughput: 0: 14959.6. Samples: 20765608. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:18:33,656][00001] Avg episode reward: [(0, '46.920')]
+[2023-02-26 12:18:33,779][00189] Updated weights for policy 0, policy_version 20310 (0.0006)
+[2023-02-26 12:18:34,461][00189] Updated weights for policy 0, policy_version 20320 (0.0006)
+[2023-02-26 12:18:35,140][00189] Updated weights for policy 0, policy_version 20330 (0.0007)
+[2023-02-26 12:18:35,824][00189] Updated weights for policy 0, policy_version 20340 (0.0006)
+[2023-02-26 12:18:36,493][00189] Updated weights for policy 0, policy_version 20350 (0.0006)
+[2023-02-26 12:18:37,166][00189] Updated weights for policy 0, policy_version 20360 (0.0006)
+[2023-02-26 12:18:37,867][00189] Updated weights for policy 0, policy_version 20370 (0.0006)
+[2023-02-26 12:18:38,529][00189] Updated weights for policy 0, policy_version 20380 (0.0007)
+[2023-02-26 12:18:38,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59869.9, 300 sec: 59718.3). Total num frames: 83480576. Throughput: 0: 14961.6. Samples: 20855620. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:18:38,656][00001] Avg episode reward: [(0, '46.751')]
+[2023-02-26 12:18:39,244][00189] Updated weights for policy 0, policy_version 20390 (0.0006)
+[2023-02-26 12:18:39,900][00189] Updated weights for policy 0, policy_version 20400 (0.0006)
+[2023-02-26 12:18:40,577][00189] Updated weights for policy 0, policy_version 20410 (0.0006)
+[2023-02-26 12:18:41,317][00189] Updated weights for policy 0, policy_version 20420 (0.0007)
+[2023-02-26 12:18:41,962][00189] Updated weights for policy 0, policy_version 20430 (0.0006)
+[2023-02-26 12:18:42,642][00189] Updated weights for policy 0, policy_version 20440 (0.0006)
+[2023-02-26 12:18:43,342][00189] Updated weights for policy 0, policy_version 20450 (0.0006)
+[2023-02-26 12:18:43,656][00001] Fps is (10 sec: 59801.1, 60 sec: 59869.9, 300 sec: 59732.2). Total num frames: 83779584. Throughput: 0: 14960.8. Samples: 20945700. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:18:43,656][00001] Avg episode reward: [(0, '44.356')]
+[2023-02-26 12:18:44,000][00189] Updated weights for policy 0, policy_version 20460 (0.0006)
+[2023-02-26 12:18:44,704][00189] Updated weights for policy 0, policy_version 20470 (0.0006)
+[2023-02-26 12:18:45,392][00189] Updated weights for policy 0, policy_version 20480 (0.0006)
+[2023-02-26 12:18:46,069][00189] Updated weights for policy 0, policy_version 20490 (0.0006)
+[2023-02-26 12:18:46,726][00189] Updated weights for policy 0, policy_version 20500 (0.0006)
+[2023-02-26 12:18:47,414][00189] Updated weights for policy 0, policy_version 20510 (0.0006)
+[2023-02-26 12:18:48,117][00189] Updated weights for policy 0, policy_version 20520 (0.0006)
+[2023-02-26 12:18:48,656][00001] Fps is (10 sec: 59800.5, 60 sec: 59801.4, 300 sec: 59732.1). Total num frames: 84078592. Throughput: 0: 14959.7. Samples: 20990672. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:18:48,657][00001] Avg episode reward: [(0, '43.946')]
+[2023-02-26 12:18:48,803][00189] Updated weights for policy 0, policy_version 20530 (0.0006)
+[2023-02-26 12:18:49,555][00189] Updated weights for policy 0, policy_version 20540 (0.0007)
+[2023-02-26 12:18:50,232][00189] Updated weights for policy 0, policy_version 20550 (0.0006)
+[2023-02-26 12:18:50,934][00189] Updated weights for policy 0, policy_version 20560 (0.0006)
+[2023-02-26 12:18:51,623][00189] Updated weights for policy 0, policy_version 20570 (0.0006)
+[2023-02-26 12:18:52,317][00189] Updated weights for policy 0, policy_version 20580 (0.0006)
+[2023-02-26 12:18:52,991][00189] Updated weights for policy 0, policy_version 20590 (0.0006)
+[2023-02-26 12:18:53,656][00001] Fps is (10 sec: 59392.2, 60 sec: 59801.7, 300 sec: 59704.4). Total num frames: 84373504. Throughput: 0: 14934.6. Samples: 21078904. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:18:53,656][00001] Avg episode reward: [(0, '44.460')]
+[2023-02-26 12:18:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000020599_84373504.pth...
+[2023-02-26 12:18:53,695][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000017105_70062080.pth
+[2023-02-26 12:18:53,722][00189] Updated weights for policy 0, policy_version 20600 (0.0007)
+[2023-02-26 12:18:54,435][00189] Updated weights for policy 0, policy_version 20610 (0.0006)
+[2023-02-26 12:18:55,150][00189] Updated weights for policy 0, policy_version 20620 (0.0006)
+[2023-02-26 12:18:55,865][00189] Updated weights for policy 0, policy_version 20630 (0.0007)
+[2023-02-26 12:18:56,568][00189] Updated weights for policy 0, policy_version 20640 (0.0006)
+[2023-02-26 12:18:57,283][00189] Updated weights for policy 0, policy_version 20650 (0.0007)
+[2023-02-26 12:18:57,980][00189] Updated weights for policy 0, policy_version 20660 (0.0007)
+[2023-02-26 12:18:58,620][00189] Updated weights for policy 0, policy_version 20670 (0.0006)
+[2023-02-26 12:18:58,656][00001] Fps is (10 sec: 58573.8, 60 sec: 59596.9, 300 sec: 59676.6). Total num frames: 84664320. Throughput: 0: 14882.3. Samples: 21166224. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:18:58,656][00001] Avg episode reward: [(0, '44.443')]
+[2023-02-26 12:18:59,346][00189] Updated weights for policy 0, policy_version 20680 (0.0006)
+[2023-02-26 12:19:00,030][00189] Updated weights for policy 0, policy_version 20690 (0.0006)
+[2023-02-26 12:19:00,713][00189] Updated weights for policy 0, policy_version 20700 (0.0007)
+[2023-02-26 12:19:01,380][00189] Updated weights for policy 0, policy_version 20710 (0.0006)
+[2023-02-26 12:19:02,068][00189] Updated weights for policy 0, policy_version 20720 (0.0006)
+[2023-02-26 12:19:02,772][00189] Updated weights for policy 0, policy_version 20730 (0.0006)
+[2023-02-26 12:19:03,444][00189] Updated weights for policy 0, policy_version 20740 (0.0006)
+[2023-02-26 12:19:03,656][00001] Fps is (10 sec: 58982.7, 60 sec: 59665.1, 300 sec: 59662.8). Total num frames: 84963328. Throughput: 0: 14882.2. Samples: 21210908. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:19:03,656][00001] Avg episode reward: [(0, '46.988')]
+[2023-02-26 12:19:04,107][00189] Updated weights for policy 0, policy_version 20750 (0.0006)
+[2023-02-26 12:19:04,840][00189] Updated weights for policy 0, policy_version 20760 (0.0006)
+[2023-02-26 12:19:05,509][00189] Updated weights for policy 0, policy_version 20770 (0.0006)
+[2023-02-26 12:19:06,182][00189] Updated weights for policy 0, policy_version 20780 (0.0006)
+[2023-02-26 12:19:06,891][00189] Updated weights for policy 0, policy_version 20790 (0.0007)
+[2023-02-26 12:19:07,553][00189] Updated weights for policy 0, policy_version 20800 (0.0006)
+[2023-02-26 12:19:08,252][00189] Updated weights for policy 0, policy_version 20810 (0.0006)
+[2023-02-26 12:19:08,656][00001] Fps is (10 sec: 59391.7, 60 sec: 59528.4, 300 sec: 59648.9). Total num frames: 85258240. Throughput: 0: 14876.7. Samples: 21300412. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:19:08,656][00001] Avg episode reward: [(0, '42.742')]
+[2023-02-26 12:19:08,943][00189] Updated weights for policy 0, policy_version 20820 (0.0006)
+[2023-02-26 12:19:09,580][00189] Updated weights for policy 0, policy_version 20830 (0.0006)
+[2023-02-26 12:19:10,323][00189] Updated weights for policy 0, policy_version 20840 (0.0006)
+[2023-02-26 12:19:10,989][00189] Updated weights for policy 0, policy_version 20850 (0.0006)
+[2023-02-26 12:19:11,623][00189] Updated weights for policy 0, policy_version 20860 (0.0006)
+[2023-02-26 12:19:12,376][00189] Updated weights for policy 0, policy_version 20870 (0.0006)
+[2023-02-26 12:19:13,038][00189] Updated weights for policy 0, policy_version 20880 (0.0006)
+[2023-02-26 12:19:13,656][00001] Fps is (10 sec: 59391.6, 60 sec: 59528.5, 300 sec: 59648.9). Total num frames: 85557248. Throughput: 0: 14872.6. Samples: 21390108. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:19:13,656][00001] Avg episode reward: [(0, '44.812')]
+[2023-02-26 12:19:13,695][00189] Updated weights for policy 0, policy_version 20890 (0.0006)
+[2023-02-26 12:19:14,448][00189] Updated weights for policy 0, policy_version 20900 (0.0006)
+[2023-02-26 12:19:15,114][00189] Updated weights for policy 0, policy_version 20910 (0.0006)
+[2023-02-26 12:19:15,780][00189] Updated weights for policy 0, policy_version 20920 (0.0006)
+[2023-02-26 12:19:16,525][00189] Updated weights for policy 0, policy_version 20930 (0.0006)
+[2023-02-26 12:19:17,195][00189] Updated weights for policy 0, policy_version 20940 (0.0006)
+[2023-02-26 12:19:17,872][00189] Updated weights for policy 0, policy_version 20950 (0.0006)
+[2023-02-26 12:19:18,591][00189] Updated weights for policy 0, policy_version 20960 (0.0006)
+[2023-02-26 12:19:18,656][00001] Fps is (10 sec: 59801.9, 60 sec: 59528.5, 300 sec: 59648.9). Total num frames: 85856256. Throughput: 0: 14862.2. Samples: 21434408. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:19:18,656][00001] Avg episode reward: [(0, '46.272')]
+[2023-02-26 12:19:19,279][00189] Updated weights for policy 0, policy_version 20970 (0.0006)
+[2023-02-26 12:19:19,984][00189] Updated weights for policy 0, policy_version 20980 (0.0006)
+[2023-02-26 12:19:20,687][00189] Updated weights for policy 0, policy_version 20990 (0.0006)
+[2023-02-26 12:19:21,354][00189] Updated weights for policy 0, policy_version 21000 (0.0006)
+[2023-02-26 12:19:22,084][00189] Updated weights for policy 0, policy_version 21010 (0.0006)
+[2023-02-26 12:19:22,740][00189] Updated weights for policy 0, policy_version 21020 (0.0006)
+[2023-02-26 12:19:23,428][00189] Updated weights for policy 0, policy_version 21030 (0.0006)
+[2023-02-26 12:19:23,656][00001] Fps is (10 sec: 59392.4, 60 sec: 59460.3, 300 sec: 59635.0). Total num frames: 86151168. Throughput: 0: 14832.0. Samples: 21523060. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:19:23,656][00001] Avg episode reward: [(0, '45.603')]
+[2023-02-26 12:19:24,148][00189] Updated weights for policy 0, policy_version 21040 (0.0006)
+[2023-02-26 12:19:24,807][00189] Updated weights for policy 0, policy_version 21050 (0.0006)
+[2023-02-26 12:19:25,553][00189] Updated weights for policy 0, policy_version 21060 (0.0006)
+[2023-02-26 12:19:26,264][00189] Updated weights for policy 0, policy_version 21070 (0.0006)
+[2023-02-26 12:19:26,930][00189] Updated weights for policy 0, policy_version 21080 (0.0006)
+[2023-02-26 12:19:27,637][00189] Updated weights for policy 0, policy_version 21090 (0.0006)
+[2023-02-26 12:19:28,310][00189] Updated weights for policy 0, policy_version 21100 (0.0006)
+[2023-02-26 12:19:28,656][00001] Fps is (10 sec: 58572.8, 60 sec: 59323.7, 300 sec: 59593.3). Total num frames: 86441984. Throughput: 0: 14789.4. Samples: 21611224. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:19:28,656][00001] Avg episode reward: [(0, '45.913')]
+[2023-02-26 12:19:29,000][00189] Updated weights for policy 0, policy_version 21110 (0.0006)
+[2023-02-26 12:19:29,714][00189] Updated weights for policy 0, policy_version 21120 (0.0006)
+[2023-02-26 12:19:30,400][00189] Updated weights for policy 0, policy_version 21130 (0.0006)
+[2023-02-26 12:19:31,062][00189] Updated weights for policy 0, policy_version 21140 (0.0006)
+[2023-02-26 12:19:31,776][00189] Updated weights for policy 0, policy_version 21150 (0.0006)
+[2023-02-26 12:19:32,440][00189] Updated weights for policy 0, policy_version 21160 (0.0006)
+[2023-02-26 12:19:33,155][00189] Updated weights for policy 0, policy_version 21170 (0.0007)
+[2023-02-26 12:19:33,656][00001] Fps is (10 sec: 58982.0, 60 sec: 59323.7, 300 sec: 59593.3). Total num frames: 86740992. Throughput: 0: 14784.0. Samples: 21655952. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:19:33,656][00001] Avg episode reward: [(0, '47.987')]
+[2023-02-26 12:19:33,836][00189] Updated weights for policy 0, policy_version 21180 (0.0006)
+[2023-02-26 12:19:34,506][00189] Updated weights for policy 0, policy_version 21190 (0.0006)
+[2023-02-26 12:19:35,204][00189] Updated weights for policy 0, policy_version 21200 (0.0006)
+[2023-02-26 12:19:35,893][00189] Updated weights for policy 0, policy_version 21210 (0.0007)
+[2023-02-26 12:19:36,572][00189] Updated weights for policy 0, policy_version 21220 (0.0006)
+[2023-02-26 12:19:37,242][00189] Updated weights for policy 0, policy_version 21230 (0.0006)
+[2023-02-26 12:19:37,934][00189] Updated weights for policy 0, policy_version 21240 (0.0006)
+[2023-02-26 12:19:38,656][00001] Fps is (10 sec: 59391.8, 60 sec: 59255.4, 300 sec: 59565.6). Total num frames: 87035904. Throughput: 0: 14812.0. Samples: 21745444. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:19:38,656][00001] Avg episode reward: [(0, '46.907')]
+[2023-02-26 12:19:38,697][00189] Updated weights for policy 0, policy_version 21250 (0.0007)
+[2023-02-26 12:19:39,350][00189] Updated weights for policy 0, policy_version 21260 (0.0006)
+[2023-02-26 12:19:40,061][00189] Updated weights for policy 0, policy_version 21270 (0.0006)
+[2023-02-26 12:19:40,765][00189] Updated weights for policy 0, policy_version 21280 (0.0006)
+[2023-02-26 12:19:41,447][00189] Updated weights for policy 0, policy_version 21290 (0.0006)
+[2023-02-26 12:19:42,191][00189] Updated weights for policy 0, policy_version 21300 (0.0006)
+[2023-02-26 12:19:42,898][00189] Updated weights for policy 0, policy_version 21310 (0.0007)
+[2023-02-26 12:19:43,656][00001] Fps is (10 sec: 58572.7, 60 sec: 59118.9, 300 sec: 59537.8). Total num frames: 87326720. Throughput: 0: 14784.7. Samples: 21831536. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:19:43,656][00001] Avg episode reward: [(0, '44.810')]
+[2023-02-26 12:19:43,657][00189] Updated weights for policy 0, policy_version 21320 (0.0007)
+[2023-02-26 12:19:44,432][00189] Updated weights for policy 0, policy_version 21330 (0.0007)
+[2023-02-26 12:19:45,119][00189] Updated weights for policy 0, policy_version 21340 (0.0006)
+[2023-02-26 12:19:45,801][00189] Updated weights for policy 0, policy_version 21350 (0.0006)
+[2023-02-26 12:19:46,505][00189] Updated weights for policy 0, policy_version 21360 (0.0006)
+[2023-02-26 12:19:47,208][00189] Updated weights for policy 0, policy_version 21370 (0.0006)
+[2023-02-26 12:19:47,897][00189] Updated weights for policy 0, policy_version 21380 (0.0006)
+[2023-02-26 12:19:48,605][00189] Updated weights for policy 0, policy_version 21390 (0.0006)
+[2023-02-26 12:19:48,656][00001] Fps is (10 sec: 57753.6, 60 sec: 58914.3, 300 sec: 59496.1). Total num frames: 87613440. Throughput: 0: 14754.6. Samples: 21874864. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:19:48,656][00001] Avg episode reward: [(0, '48.255')]
+[2023-02-26 12:19:49,296][00189] Updated weights for policy 0, policy_version 21400 (0.0006)
+[2023-02-26 12:19:49,963][00189] Updated weights for policy 0, policy_version 21410 (0.0006)
+[2023-02-26 12:19:50,693][00189] Updated weights for policy 0, policy_version 21420 (0.0006)
+[2023-02-26 12:19:51,350][00189] Updated weights for policy 0, policy_version 21430 (0.0006)
+[2023-02-26 12:19:52,043][00189] Updated weights for policy 0, policy_version 21440 (0.0006)
+[2023-02-26 12:19:52,753][00189] Updated weights for policy 0, policy_version 21450 (0.0007)
+[2023-02-26 12:19:53,432][00189] Updated weights for policy 0, policy_version 21460 (0.0006)
+[2023-02-26 12:19:53,656][00001] Fps is (10 sec: 58572.7, 60 sec: 58982.3, 300 sec: 59496.1). Total num frames: 87912448. Throughput: 0: 14736.8. Samples: 21963568. Policy #0 lag: (min: 0.0, avg: 1.7, max: 3.0)
+[2023-02-26 12:19:53,656][00001] Avg episode reward: [(0, '46.109')]
+[2023-02-26 12:19:54,111][00189] Updated weights for policy 0, policy_version 21470 (0.0006)
+[2023-02-26 12:19:54,828][00189] Updated weights for policy 0, policy_version 21480 (0.0007)
+[2023-02-26 12:19:55,507][00189] Updated weights for policy 0, policy_version 21490 (0.0006)
+[2023-02-26 12:19:56,201][00189] Updated weights for policy 0, policy_version 21500 (0.0006)
+[2023-02-26 12:19:56,893][00189] Updated weights for policy 0, policy_version 21510 (0.0007)
+[2023-02-26 12:19:57,565][00189] Updated weights for policy 0, policy_version 21520 (0.0006)
+[2023-02-26 12:19:58,256][00189] Updated weights for policy 0, policy_version 21530 (0.0006)
+[2023-02-26 12:19:58,656][00001] Fps is (10 sec: 59392.2, 60 sec: 59050.7, 300 sec: 59468.4). Total num frames: 88207360. Throughput: 0: 14724.1. Samples: 22052692. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:19:58,656][00001] Avg episode reward: [(0, '44.765')]
+[2023-02-26 12:19:58,926][00189] Updated weights for policy 0, policy_version 21540 (0.0006)
+[2023-02-26 12:19:59,611][00189] Updated weights for policy 0, policy_version 21550 (0.0006)
+[2023-02-26 12:20:00,299][00189] Updated weights for policy 0, policy_version 21560 (0.0006)
+[2023-02-26 12:20:00,988][00189] Updated weights for policy 0, policy_version 21570 (0.0006)
+[2023-02-26 12:20:01,700][00189] Updated weights for policy 0, policy_version 21580 (0.0006)
+[2023-02-26 12:20:02,354][00189] Updated weights for policy 0, policy_version 21590 (0.0006)
+[2023-02-26 12:20:03,036][00189] Updated weights for policy 0, policy_version 21600 (0.0006)
+[2023-02-26 12:20:03,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59118.8, 300 sec: 59482.2). Total num frames: 88510464. Throughput: 0: 14738.0. Samples: 22097620. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:20:03,656][00001] Avg episode reward: [(0, '43.679')]
+[2023-02-26 12:20:03,726][00189] Updated weights for policy 0, policy_version 21610 (0.0006)
+[2023-02-26 12:20:04,412][00189] Updated weights for policy 0, policy_version 21620 (0.0006)
+[2023-02-26 12:20:05,054][00189] Updated weights for policy 0, policy_version 21630 (0.0006)
+[2023-02-26 12:20:05,760][00189] Updated weights for policy 0, policy_version 21640 (0.0006)
+[2023-02-26 12:20:06,464][00189] Updated weights for policy 0, policy_version 21650 (0.0007)
+[2023-02-26 12:20:07,144][00189] Updated weights for policy 0, policy_version 21660 (0.0007)
+[2023-02-26 12:20:07,826][00189] Updated weights for policy 0, policy_version 21670 (0.0006)
+[2023-02-26 12:20:08,513][00189] Updated weights for policy 0, policy_version 21680 (0.0006)
+[2023-02-26 12:20:08,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59119.0, 300 sec: 59468.4). Total num frames: 88805376. Throughput: 0: 14763.3. Samples: 22187408. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:20:08,656][00001] Avg episode reward: [(0, '49.072')]
+[2023-02-26 12:20:09,223][00189] Updated weights for policy 0, policy_version 21690 (0.0006)
+[2023-02-26 12:20:09,876][00189] Updated weights for policy 0, policy_version 21700 (0.0006)
+[2023-02-26 12:20:10,529][00189] Updated weights for policy 0, policy_version 21710 (0.0006)
+[2023-02-26 12:20:11,264][00189] Updated weights for policy 0, policy_version 21720 (0.0006)
+[2023-02-26 12:20:11,934][00189] Updated weights for policy 0, policy_version 21730 (0.0006)
+[2023-02-26 12:20:12,594][00189] Updated weights for policy 0, policy_version 21740 (0.0006)
+[2023-02-26 12:20:13,329][00189] Updated weights for policy 0, policy_version 21750 (0.0006)
+[2023-02-26 12:20:13,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59187.2, 300 sec: 59482.3). Total num frames: 89108480. Throughput: 0: 14803.6. Samples: 22277388. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:20:13,656][00001] Avg episode reward: [(0, '48.745')]
+[2023-02-26 12:20:13,978][00189] Updated weights for policy 0, policy_version 21760 (0.0006)
+[2023-02-26 12:20:14,663][00189] Updated weights for policy 0, policy_version 21770 (0.0006)
+[2023-02-26 12:20:15,395][00189] Updated weights for policy 0, policy_version 21780 (0.0006)
+[2023-02-26 12:20:16,066][00189] Updated weights for policy 0, policy_version 21790 (0.0007)
+[2023-02-26 12:20:16,737][00189] Updated weights for policy 0, policy_version 21800 (0.0007)
+[2023-02-26 12:20:17,437][00189] Updated weights for policy 0, policy_version 21810 (0.0006)
+[2023-02-26 12:20:18,091][00189] Updated weights for policy 0, policy_version 21820 (0.0006)
+[2023-02-26 12:20:18,656][00001] Fps is (10 sec: 60211.4, 60 sec: 59187.2, 300 sec: 59482.3). Total num frames: 89407488. Throughput: 0: 14797.0. Samples: 22321816. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:20:18,656][00001] Avg episode reward: [(0, '44.591')]
+[2023-02-26 12:20:18,836][00189] Updated weights for policy 0, policy_version 21830 (0.0006)
+[2023-02-26 12:20:19,481][00189] Updated weights for policy 0, policy_version 21840 (0.0006)
+[2023-02-26 12:20:20,178][00189] Updated weights for policy 0, policy_version 21850 (0.0006)
+[2023-02-26 12:20:20,868][00189] Updated weights for policy 0, policy_version 21860 (0.0006)
+[2023-02-26 12:20:21,538][00189] Updated weights for policy 0, policy_version 21870 (0.0006)
+[2023-02-26 12:20:22,210][00189] Updated weights for policy 0, policy_version 21880 (0.0006)
+[2023-02-26 12:20:22,943][00189] Updated weights for policy 0, policy_version 21890 (0.0006)
+[2023-02-26 12:20:23,586][00189] Updated weights for policy 0, policy_version 21900 (0.0006)
+[2023-02-26 12:20:23,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59255.4, 300 sec: 59482.3). Total num frames: 89706496. Throughput: 0: 14799.1. Samples: 22411404. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0)
+[2023-02-26 12:20:23,656][00001] Avg episode reward: [(0, '45.732')]
+[2023-02-26 12:20:24,269][00189] Updated weights for policy 0, policy_version 21910 (0.0007)
+[2023-02-26 12:20:25,009][00189] Updated weights for policy 0, policy_version 21920 (0.0006)
+[2023-02-26 12:20:25,669][00189] Updated weights for policy 0, policy_version 21930 (0.0007)
+[2023-02-26 12:20:26,315][00189] Updated weights for policy 0, policy_version 21940 (0.0006)
+[2023-02-26 12:20:27,042][00189] Updated weights for policy 0, policy_version 21950 (0.0006)
+[2023-02-26 12:20:27,728][00189] Updated weights for policy 0, policy_version 21960 (0.0006)
+[2023-02-26 12:20:28,401][00189] Updated weights for policy 0, policy_version 21970 (0.0007)
+[2023-02-26 12:20:28,656][00001] Fps is (10 sec: 59391.7, 60 sec: 59323.7, 300 sec: 59468.4). Total num frames: 90001408. Throughput: 0: 14871.4. Samples: 22500748. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 12:20:28,656][00001] Avg episode reward: [(0, '46.963')]
+[2023-02-26 12:20:29,116][00189] Updated weights for policy 0, policy_version 21980 (0.0006)
+[2023-02-26 12:20:29,770][00189] Updated weights for policy 0, policy_version 21990 (0.0006)
+[2023-02-26 12:20:30,472][00189] Updated weights for policy 0, policy_version 22000 (0.0006)
+[2023-02-26 12:20:31,182][00189] Updated weights for policy 0, policy_version 22010 (0.0006)
+[2023-02-26 12:20:31,833][00189] Updated weights for policy 0, policy_version 22020 (0.0006)
+[2023-02-26 12:20:32,554][00189] Updated weights for policy 0, policy_version 22030 (0.0006)
+[2023-02-26 12:20:33,214][00189] Updated weights for policy 0, policy_version 22040 (0.0006)
+[2023-02-26 12:20:33,656][00001] Fps is (10 sec: 59391.8, 60 sec: 59323.7, 300 sec: 59468.4). Total num frames: 90300416. Throughput: 0: 14902.1. Samples: 22545460. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 12:20:33,656][00001] Avg episode reward: [(0, '46.972')]
+[2023-02-26 12:20:33,930][00189] Updated weights for policy 0, policy_version 22050 (0.0006)
+[2023-02-26 12:20:34,623][00189] Updated weights for policy 0, policy_version 22060 (0.0006)
+[2023-02-26 12:20:35,260][00189] Updated weights for policy 0, policy_version 22070 (0.0007)
+[2023-02-26 12:20:35,953][00189] Updated weights for policy 0, policy_version 22080 (0.0006)
+[2023-02-26 12:20:36,673][00189] Updated weights for policy 0, policy_version 22090 (0.0006)
+[2023-02-26 12:20:37,333][00189] Updated weights for policy 0, policy_version 22100 (0.0007)
+[2023-02-26 12:20:38,012][00189] Updated weights for policy 0, policy_version 22110 (0.0006)
+[2023-02-26 12:20:38,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59392.1, 300 sec: 59468.4). Total num frames: 90599424. Throughput: 0: 14921.9. Samples: 22635052. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
+[2023-02-26 12:20:38,656][00001] Avg episode reward: [(0, '46.910')]
+[2023-02-26 12:20:38,705][00189] Updated weights for policy 0, policy_version 22120 (0.0006)
+[2023-02-26 12:20:39,387][00189] Updated weights for policy 0, policy_version 22130 (0.0007)
+[2023-02-26 12:20:40,082][00189] Updated weights for policy 0, policy_version 22140 (0.0007)
+[2023-02-26 12:20:40,747][00189] Updated weights for policy 0, policy_version 22150 (0.0006)
+[2023-02-26 12:20:41,448][00189] Updated weights for policy 0, policy_version 22160 (0.0006)
+[2023-02-26 12:20:42,131][00189] Updated weights for policy 0, policy_version 22170 (0.0006)
+[2023-02-26 12:20:42,790][00189] Updated weights for policy 0, policy_version 22180 (0.0006)
+[2023-02-26 12:20:43,497][00189] Updated weights for policy 0, policy_version 22190 (0.0006)
+[2023-02-26 12:20:43,656][00001] Fps is (10 sec: 59801.6, 60 sec: 59528.5, 300 sec: 59510.0). Total num frames: 90898432. Throughput: 0: 14937.9. Samples: 22724900. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:20:43,656][00001] Avg episode reward: [(0, '46.419')]
+[2023-02-26 12:20:44,158][00189] Updated weights for policy 0, policy_version 22200 (0.0006)
+[2023-02-26 12:20:44,840][00189] Updated weights for policy 0, policy_version 22210 (0.0006)
+[2023-02-26 12:20:45,534][00189] Updated weights for policy 0, policy_version 22220 (0.0006)
+[2023-02-26 12:20:46,235][00189] Updated weights for policy 0, policy_version 22230 (0.0006)
+[2023-02-26 12:20:46,906][00189] Updated weights for policy 0, policy_version 22240 (0.0006)
+[2023-02-26 12:20:47,603][00189] Updated weights for policy 0, policy_version 22250 (0.0006)
+[2023-02-26 12:20:48,289][00189] Updated weights for policy 0, policy_version 22260 (0.0006)
+[2023-02-26 12:20:48,656][00001] Fps is (10 sec: 59801.4, 60 sec: 59733.4, 300 sec: 59523.9). Total num frames: 91197440. Throughput: 0: 14936.5. Samples: 22769764. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:20:48,656][00001] Avg episode reward: [(0, '45.555')]
+[2023-02-26 12:20:48,947][00189] Updated weights for policy 0, policy_version 22270 (0.0006)
+[2023-02-26 12:20:49,666][00189] Updated weights for policy 0, policy_version 22280 (0.0007)
+[2023-02-26 12:20:50,333][00189] Updated weights for policy 0, policy_version 22290 (0.0007)
+[2023-02-26 12:20:51,020][00189] Updated weights for policy 0, policy_version 22300 (0.0007)
+[2023-02-26 12:20:51,677][00189] Updated weights for policy 0, policy_version 22310 (0.0006)
+[2023-02-26 12:20:52,376][00189] Updated weights for policy 0, policy_version 22320 (0.0006)
+[2023-02-26 12:20:53,082][00189] Updated weights for policy 0, policy_version 22330 (0.0007)
+[2023-02-26 12:20:53,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59733.4, 300 sec: 59551.7). Total num frames: 91496448. Throughput: 0: 14936.9. Samples: 22859568. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2023-02-26 12:20:53,656][00001] Avg episode reward: [(0, '44.887')]
+[2023-02-26 12:20:53,659][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000022338_91496448.pth...
+[2023-02-26 12:20:53,693][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000018846_77193216.pth
+[2023-02-26 12:20:53,789][00189] Updated weights for policy 0, policy_version 22340 (0.0006)
+[2023-02-26 12:20:54,437][00189] Updated weights for policy 0, policy_version 22350 (0.0006)
+[2023-02-26 12:20:55,122][00189] Updated weights for policy 0, policy_version 22360 (0.0006)
+[2023-02-26 12:20:55,831][00189] Updated weights for policy 0, policy_version 22370 (0.0006)
+[2023-02-26 12:20:56,512][00189] Updated weights for policy 0, policy_version 22380 (0.0006)
+[2023-02-26 12:20:57,203][00189] Updated weights for policy 0, policy_version 22390 (0.0006)
+[2023-02-26 12:20:57,887][00189] Updated weights for policy 0, policy_version 22400 (0.0007)
+[2023-02-26 12:20:58,566][00189] Updated weights for policy 0, policy_version 22410 (0.0006)
+[2023-02-26 12:20:58,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59801.6, 300 sec: 59565.6). Total num frames: 91795456. Throughput: 0: 14926.9. Samples: 22949096. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:20:58,656][00001] Avg episode reward: [(0, '45.159')]
+[2023-02-26 12:20:59,247][00189] Updated weights for policy 0, policy_version 22420 (0.0006)
+[2023-02-26 12:20:59,925][00189] Updated weights for policy 0, policy_version 22430 (0.0006)
+[2023-02-26 12:21:00,616][00189] Updated weights for policy 0, policy_version 22440 (0.0006)
+[2023-02-26 12:21:01,318][00189] Updated weights for policy 0, policy_version 22450 (0.0006)
+[2023-02-26 12:21:01,987][00189] Updated weights for policy 0, policy_version 22460 (0.0006)
+[2023-02-26 12:21:02,681][00189] Updated weights for policy 0, policy_version 22470 (0.0006)
+[2023-02-26 12:21:03,380][00189] Updated weights for policy 0, policy_version 22480 (0.0006)
+[2023-02-26 12:21:03,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59733.4, 300 sec: 59607.3). Total num frames: 92094464. Throughput: 0: 14934.2. Samples: 22993856. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:21:03,656][00001] Avg episode reward: [(0, '47.235')]
+[2023-02-26 12:21:04,047][00189] Updated weights for policy 0, policy_version 22490 (0.0006)
+[2023-02-26 12:21:04,735][00189] Updated weights for policy 0, policy_version 22500 (0.0006)
+[2023-02-26 12:21:05,447][00189] Updated weights for policy 0, policy_version 22510 (0.0006)
+[2023-02-26 12:21:06,139][00189] Updated weights for policy 0, policy_version 22520 (0.0007)
+[2023-02-26 12:21:06,788][00189] Updated weights for policy 0, policy_version 22530 (0.0006)
+[2023-02-26 12:21:07,494][00189] Updated weights for policy 0, policy_version 22540 (0.0006)
+[2023-02-26 12:21:08,182][00189] Updated weights for policy 0, policy_version 22550 (0.0006)
+[2023-02-26 12:21:08,656][00001] Fps is (10 sec: 59391.5, 60 sec: 59733.2, 300 sec: 59621.1). Total num frames: 92389376. Throughput: 0: 14928.1. Samples: 23083168. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0)
+[2023-02-26 12:21:08,656][00001] Avg episode reward: [(0, '44.605')]
+[2023-02-26 12:21:08,897][00189] Updated weights for policy 0, policy_version 22560 (0.0006)
+[2023-02-26 12:21:09,550][00189] Updated weights for policy 0, policy_version 22570 (0.0006)
+[2023-02-26 12:21:10,234][00189] Updated weights for policy 0, policy_version 22580 (0.0007)
+[2023-02-26 12:21:10,962][00189] Updated weights for policy 0, policy_version 22590 (0.0006)
+[2023-02-26 12:21:11,633][00189] Updated weights for policy 0, policy_version 22600 (0.0006)
+[2023-02-26 12:21:12,322][00189] Updated weights for policy 0, policy_version 22610 (0.0007)
+[2023-02-26 12:21:13,024][00189] Updated weights for policy 0, policy_version 22620 (0.0006)
+[2023-02-26 12:21:13,656][00001] Fps is (10 sec: 59391.5, 60 sec: 59665.1, 300 sec: 59621.1). Total num frames: 92688384. Throughput: 0: 14928.0. Samples: 23172508. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:21:13,656][00001] Avg episode reward: [(0, '44.960')]
+[2023-02-26 12:21:13,670][00189] Updated weights for policy 0, policy_version 22630 (0.0006)
+[2023-02-26 12:21:14,264][00141] Signal inference workers to stop experience collection... (650 times)
+[2023-02-26 12:21:14,264][00141] Signal inference workers to resume experience collection... (650 times)
+[2023-02-26 12:21:14,268][00189] InferenceWorker_p0-w0: stopping experience collection (650 times)
+[2023-02-26 12:21:14,268][00189] InferenceWorker_p0-w0: resuming experience collection (650 times)
+[2023-02-26 12:21:14,370][00189] Updated weights for policy 0, policy_version 22640 (0.0006)
+[2023-02-26 12:21:15,059][00189] Updated weights for policy 0, policy_version 22650 (0.0006)
+[2023-02-26 12:21:15,724][00189] Updated weights for policy 0, policy_version 22660 (0.0007)
+[2023-02-26 12:21:16,428][00189] Updated weights for policy 0, policy_version 22670 (0.0006)
+[2023-02-26 12:21:17,111][00189] Updated weights for policy 0, policy_version 22680 (0.0006)
+[2023-02-26 12:21:17,800][00189] Updated weights for policy 0, policy_version 22690 (0.0006)
+[2023-02-26 12:21:18,479][00189] Updated weights for policy 0, policy_version 22700 (0.0006)
+[2023-02-26 12:21:18,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59665.0, 300 sec: 59621.1). Total num frames: 92987392. Throughput: 0: 14935.2. Samples: 23217544. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:21:18,656][00001] Avg episode reward: [(0, '46.129')]
+[2023-02-26 12:21:19,173][00189] Updated weights for policy 0, policy_version 22710 (0.0006)
+[2023-02-26 12:21:19,845][00189] Updated weights for policy 0, policy_version 22720 (0.0006)
+[2023-02-26 12:21:20,559][00189] Updated weights for policy 0, policy_version 22730 (0.0006)
+[2023-02-26 12:21:21,218][00189] Updated weights for policy 0, policy_version 22740 (0.0006)
+[2023-02-26 12:21:21,886][00189] Updated weights for policy 0, policy_version 22750 (0.0006)
+[2023-02-26 12:21:22,600][00189] Updated weights for policy 0, policy_version 22760 (0.0006)
+[2023-02-26 12:21:23,264][00189] Updated weights for policy 0, policy_version 22770 (0.0006)
+[2023-02-26 12:21:23,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59665.0, 300 sec: 59621.1). Total num frames: 93286400. Throughput: 0: 14937.9. Samples: 23307260. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0)
+[2023-02-26 12:21:23,656][00001] Avg episode reward: [(0, '48.264')]
+[2023-02-26 12:21:23,920][00189] Updated weights for policy 0, policy_version 22780 (0.0006)
+[2023-02-26 12:21:24,662][00189] Updated weights for policy 0, policy_version 22790 (0.0006)
+[2023-02-26 12:21:25,320][00189] Updated weights for policy 0, policy_version 22800 (0.0006)
+[2023-02-26 12:21:26,001][00189] Updated weights for policy 0, policy_version 22810 (0.0006)
+[2023-02-26 12:21:26,708][00189] Updated weights for policy 0, policy_version 22820 (0.0006)
+[2023-02-26 12:21:27,351][00189] Updated weights for policy 0, policy_version 22830 (0.0006)
+[2023-02-26 12:21:28,089][00189] Updated weights for policy 0, policy_version 22840 (0.0006)
+[2023-02-26 12:21:28,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59733.3, 300 sec: 59621.1). Total num frames: 93585408. Throughput: 0: 14928.8. Samples: 23396696. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:21:28,656][00001] Avg episode reward: [(0, '45.280')]
+[2023-02-26 12:21:28,760][00189] Updated weights for policy 0, policy_version 22850 (0.0006)
+[2023-02-26 12:21:29,448][00189] Updated weights for policy 0, policy_version 22860 (0.0007)
+[2023-02-26 12:21:30,171][00189] Updated weights for policy 0, policy_version 22870 (0.0007)
+[2023-02-26 12:21:30,799][00189] Updated weights for policy 0, policy_version 22880 (0.0006)
+[2023-02-26 12:21:31,492][00189] Updated weights for policy 0, policy_version 22890 (0.0006)
+[2023-02-26 12:21:32,204][00189] Updated weights for policy 0, policy_version 22900 (0.0006)
+[2023-02-26 12:21:32,893][00189] Updated weights for policy 0, policy_version 22910 (0.0006)
+[2023-02-26 12:21:33,568][00189] Updated weights for policy 0, policy_version 22920 (0.0006)
+[2023-02-26 12:21:33,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59733.3, 300 sec: 59621.1). Total num frames: 93884416. Throughput: 0: 14923.5. Samples: 23441324. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:21:33,656][00001] Avg episode reward: [(0, '48.016')]
+[2023-02-26 12:21:34,246][00189] Updated weights for policy 0, policy_version 22930 (0.0006)
+[2023-02-26 12:21:34,969][00189] Updated weights for policy 0, policy_version 22940 (0.0006)
+[2023-02-26 12:21:35,620][00189] Updated weights for policy 0, policy_version 22950 (0.0007)
+[2023-02-26 12:21:36,310][00189] Updated weights for policy 0, policy_version 22960 (0.0006)
+[2023-02-26 12:21:37,046][00189] Updated weights for policy 0, policy_version 22970 (0.0006)
+[2023-02-26 12:21:37,710][00189] Updated weights for policy 0, policy_version 22980 (0.0006)
+[2023-02-26 12:21:38,408][00189] Updated weights for policy 0, policy_version 22990 (0.0006)
+[2023-02-26 12:21:38,656][00001] Fps is (10 sec: 59392.1, 60 sec: 59665.0, 300 sec: 59607.2). Total num frames: 94179328. Throughput: 0: 14905.6. Samples: 23530320. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0)
+[2023-02-26 12:21:38,656][00001] Avg episode reward: [(0, '47.180')]
+[2023-02-26 12:21:39,114][00189] Updated weights for policy 0, policy_version 23000 (0.0006)
+[2023-02-26 12:21:39,785][00189] Updated weights for policy 0, policy_version 23010 (0.0006)
+[2023-02-26 12:21:40,467][00189] Updated weights for policy 0, policy_version 23020 (0.0006)
+[2023-02-26 12:21:41,186][00189] Updated weights for policy 0, policy_version 23030 (0.0007)
+[2023-02-26 12:21:41,877][00189] Updated weights for policy 0, policy_version 23040 (0.0006)
+[2023-02-26 12:21:42,536][00189] Updated weights for policy 0, policy_version 23050 (0.0006)
+[2023-02-26 12:21:43,248][00189] Updated weights for policy 0, policy_version 23060 (0.0006)
+[2023-02-26 12:21:43,656][00001] Fps is (10 sec: 58982.5, 60 sec: 59596.8, 300 sec: 59593.3). Total num frames: 94474240. Throughput: 0: 14901.1. Samples: 23619648. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:21:43,656][00001] Avg episode reward: [(0, '46.085')]
+[2023-02-26 12:21:43,923][00189] Updated weights for policy 0, policy_version 23070 (0.0006)
+[2023-02-26 12:21:44,602][00189] Updated weights for policy 0, policy_version 23080 (0.0006)
+[2023-02-26 12:21:45,285][00189] Updated weights for policy 0, policy_version 23090 (0.0006)
+[2023-02-26 12:21:45,988][00189] Updated weights for policy 0, policy_version 23100 (0.0006)
+[2023-02-26 12:21:46,687][00189] Updated weights for policy 0, policy_version 23110 (0.0006)
+[2023-02-26 12:21:47,326][00189] Updated weights for policy 0, policy_version 23120 (0.0006)
+[2023-02-26 12:21:48,051][00189] Updated weights for policy 0, policy_version 23130 (0.0006)
+[2023-02-26 12:21:48,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59665.1, 300 sec: 59621.1). Total num frames: 94777344. Throughput: 0: 14899.5. Samples: 23664336. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:21:48,656][00001] Avg episode reward: [(0, '45.348')]
+[2023-02-26 12:21:48,728][00189] Updated weights for policy 0, policy_version 23140 (0.0006)
+[2023-02-26 12:21:49,369][00189] Updated weights for policy 0, policy_version 23150 (0.0006)
+[2023-02-26 12:21:50,080][00189] Updated weights for policy 0, policy_version 23160 (0.0007)
+[2023-02-26 12:21:50,790][00189] Updated weights for policy 0, policy_version 23170 (0.0006)
+[2023-02-26 12:21:51,427][00189] Updated weights for policy 0, policy_version 23180 (0.0006)
+[2023-02-26 12:21:52,157][00189] Updated weights for policy 0, policy_version 23190 (0.0006)
+[2023-02-26 12:21:52,830][00189] Updated weights for policy 0, policy_version 23200 (0.0006)
+[2023-02-26 12:21:53,516][00189] Updated weights for policy 0, policy_version 23210 (0.0007)
+[2023-02-26 12:21:53,656][00001] Fps is (10 sec: 60211.1, 60 sec: 59665.1, 300 sec: 59607.2). Total num frames: 95076352. Throughput: 0: 14907.7. Samples: 23754012. Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0)
+[2023-02-26 12:21:53,656][00001] Avg episode reward: [(0, '49.902')]
+[2023-02-26 12:21:53,659][00141] Saving new best policy, reward=49.902!
+[2023-02-26 12:21:54,222][00189] Updated weights for policy 0, policy_version 23220 (0.0006)
+[2023-02-26 12:21:54,864][00189] Updated weights for policy 0, policy_version 23230 (0.0006)
+[2023-02-26 12:21:55,598][00189] Updated weights for policy 0, policy_version 23240 (0.0007)
+[2023-02-26 12:21:56,287][00189] Updated weights for policy 0, policy_version 23250 (0.0006)
+[2023-02-26 12:21:56,938][00189] Updated weights for policy 0, policy_version 23260 (0.0006)
+[2023-02-26 12:21:57,659][00189] Updated weights for policy 0, policy_version 23270 (0.0007)
+[2023-02-26 12:21:58,356][00189] Updated weights for policy 0, policy_version 23280 (0.0006)
+[2023-02-26 12:21:58,656][00001] Fps is (10 sec: 59392.1, 60 sec: 59596.8, 300 sec: 59593.4).
Total num frames: 95371264. Throughput: 0: 14906.7. Samples: 23843308. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) +[2023-02-26 12:21:58,656][00001] Avg episode reward: [(0, '47.622')] +[2023-02-26 12:21:59,015][00189] Updated weights for policy 0, policy_version 23290 (0.0006) +[2023-02-26 12:21:59,735][00189] Updated weights for policy 0, policy_version 23300 (0.0006) +[2023-02-26 12:22:00,399][00189] Updated weights for policy 0, policy_version 23310 (0.0006) +[2023-02-26 12:22:01,092][00189] Updated weights for policy 0, policy_version 23320 (0.0007) +[2023-02-26 12:22:01,794][00189] Updated weights for policy 0, policy_version 23330 (0.0007) +[2023-02-26 12:22:02,463][00189] Updated weights for policy 0, policy_version 23340 (0.0006) +[2023-02-26 12:22:03,135][00189] Updated weights for policy 0, policy_version 23350 (0.0006) +[2023-02-26 12:22:03,656][00001] Fps is (10 sec: 59391.7, 60 sec: 59596.7, 300 sec: 59593.3). Total num frames: 95670272. Throughput: 0: 14898.9. Samples: 23887996. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) +[2023-02-26 12:22:03,656][00001] Avg episode reward: [(0, '47.624')] +[2023-02-26 12:22:03,832][00189] Updated weights for policy 0, policy_version 23360 (0.0006) +[2023-02-26 12:22:04,530][00189] Updated weights for policy 0, policy_version 23370 (0.0007) +[2023-02-26 12:22:05,226][00189] Updated weights for policy 0, policy_version 23380 (0.0006) +[2023-02-26 12:22:05,884][00189] Updated weights for policy 0, policy_version 23390 (0.0006) +[2023-02-26 12:22:06,582][00189] Updated weights for policy 0, policy_version 23400 (0.0006) +[2023-02-26 12:22:07,256][00189] Updated weights for policy 0, policy_version 23410 (0.0006) +[2023-02-26 12:22:07,949][00189] Updated weights for policy 0, policy_version 23420 (0.0006) +[2023-02-26 12:22:08,634][00189] Updated weights for policy 0, policy_version 23430 (0.0007) +[2023-02-26 12:22:08,656][00001] Fps is (10 sec: 59801.3, 60 sec: 59665.1, 300 sec: 59579.4). 
Total num frames: 95969280. Throughput: 0: 14899.5. Samples: 23977736. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) +[2023-02-26 12:22:08,656][00001] Avg episode reward: [(0, '44.496')] +[2023-02-26 12:22:09,316][00189] Updated weights for policy 0, policy_version 23440 (0.0006) +[2023-02-26 12:22:10,015][00189] Updated weights for policy 0, policy_version 23450 (0.0006) +[2023-02-26 12:22:10,708][00189] Updated weights for policy 0, policy_version 23460 (0.0006) +[2023-02-26 12:22:11,389][00189] Updated weights for policy 0, policy_version 23470 (0.0006) +[2023-02-26 12:22:12,098][00189] Updated weights for policy 0, policy_version 23480 (0.0007) +[2023-02-26 12:22:12,768][00189] Updated weights for policy 0, policy_version 23490 (0.0006) +[2023-02-26 12:22:13,442][00189] Updated weights for policy 0, policy_version 23500 (0.0006) +[2023-02-26 12:22:13,656][00001] Fps is (10 sec: 59801.8, 60 sec: 59665.1, 300 sec: 59579.5). Total num frames: 96268288. Throughput: 0: 14894.7. Samples: 24066956. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:22:13,656][00001] Avg episode reward: [(0, '46.684')] +[2023-02-26 12:22:14,142][00189] Updated weights for policy 0, policy_version 23510 (0.0006) +[2023-02-26 12:22:14,832][00189] Updated weights for policy 0, policy_version 23520 (0.0006) +[2023-02-26 12:22:15,487][00189] Updated weights for policy 0, policy_version 23530 (0.0006) +[2023-02-26 12:22:16,212][00189] Updated weights for policy 0, policy_version 23540 (0.0006) +[2023-02-26 12:22:16,888][00189] Updated weights for policy 0, policy_version 23550 (0.0006) +[2023-02-26 12:22:17,535][00189] Updated weights for policy 0, policy_version 23560 (0.0006) +[2023-02-26 12:22:18,254][00189] Updated weights for policy 0, policy_version 23570 (0.0006) +[2023-02-26 12:22:18,656][00001] Fps is (10 sec: 59392.2, 60 sec: 59596.8, 300 sec: 59565.6). Total num frames: 96563200. Throughput: 0: 14901.6. Samples: 24111896. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:22:18,656][00001] Avg episode reward: [(0, '46.694')] +[2023-02-26 12:22:18,926][00189] Updated weights for policy 0, policy_version 23580 (0.0006) +[2023-02-26 12:22:19,588][00189] Updated weights for policy 0, policy_version 23590 (0.0006) +[2023-02-26 12:22:20,299][00189] Updated weights for policy 0, policy_version 23600 (0.0006) +[2023-02-26 12:22:20,999][00189] Updated weights for policy 0, policy_version 23610 (0.0006) +[2023-02-26 12:22:21,653][00189] Updated weights for policy 0, policy_version 23620 (0.0006) +[2023-02-26 12:22:22,333][00189] Updated weights for policy 0, policy_version 23630 (0.0006) +[2023-02-26 12:22:23,046][00189] Updated weights for policy 0, policy_version 23640 (0.0006) +[2023-02-26 12:22:23,656][00001] Fps is (10 sec: 59392.0, 60 sec: 59596.8, 300 sec: 59565.6). Total num frames: 96862208. Throughput: 0: 14919.5. Samples: 24201696. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) +[2023-02-26 12:22:23,656][00001] Avg episode reward: [(0, '46.793')] +[2023-02-26 12:22:23,705][00189] Updated weights for policy 0, policy_version 23650 (0.0006) +[2023-02-26 12:22:24,393][00189] Updated weights for policy 0, policy_version 23660 (0.0006) +[2023-02-26 12:22:25,088][00189] Updated weights for policy 0, policy_version 23670 (0.0006) +[2023-02-26 12:22:25,771][00189] Updated weights for policy 0, policy_version 23680 (0.0006) +[2023-02-26 12:22:26,476][00189] Updated weights for policy 0, policy_version 23690 (0.0006) +[2023-02-26 12:22:27,141][00189] Updated weights for policy 0, policy_version 23700 (0.0006) +[2023-02-26 12:22:27,841][00189] Updated weights for policy 0, policy_version 23710 (0.0006) +[2023-02-26 12:22:28,511][00189] Updated weights for policy 0, policy_version 23720 (0.0006) +[2023-02-26 12:22:28,656][00001] Fps is (10 sec: 59801.5, 60 sec: 59596.8, 300 sec: 59565.6). Total num frames: 97161216. Throughput: 0: 14925.5. Samples: 24291296. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:22:28,656][00001] Avg episode reward: [(0, '48.173')] +[2023-02-26 12:22:29,173][00189] Updated weights for policy 0, policy_version 23730 (0.0006) +[2023-02-26 12:22:29,903][00189] Updated weights for policy 0, policy_version 23740 (0.0006) +[2023-02-26 12:22:30,559][00189] Updated weights for policy 0, policy_version 23750 (0.0006) +[2023-02-26 12:22:31,245][00189] Updated weights for policy 0, policy_version 23760 (0.0006) +[2023-02-26 12:22:31,966][00189] Updated weights for policy 0, policy_version 23770 (0.0006) +[2023-02-26 12:22:32,627][00189] Updated weights for policy 0, policy_version 23780 (0.0006) +[2023-02-26 12:22:33,313][00189] Updated weights for policy 0, policy_version 23790 (0.0006) +[2023-02-26 12:22:33,656][00001] Fps is (10 sec: 60211.6, 60 sec: 59665.1, 300 sec: 59579.5). Total num frames: 97464320. Throughput: 0: 14929.3. Samples: 24336156. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:22:33,656][00001] Avg episode reward: [(0, '47.939')] +[2023-02-26 12:22:34,032][00189] Updated weights for policy 0, policy_version 23800 (0.0007) +[2023-02-26 12:22:34,692][00189] Updated weights for policy 0, policy_version 23810 (0.0007) +[2023-02-26 12:22:35,349][00189] Updated weights for policy 0, policy_version 23820 (0.0006) +[2023-02-26 12:22:36,088][00189] Updated weights for policy 0, policy_version 23830 (0.0006) +[2023-02-26 12:22:36,767][00189] Updated weights for policy 0, policy_version 23840 (0.0006) +[2023-02-26 12:22:37,450][00189] Updated weights for policy 0, policy_version 23850 (0.0006) +[2023-02-26 12:22:38,148][00189] Updated weights for policy 0, policy_version 23860 (0.0006) +[2023-02-26 12:22:38,656][00001] Fps is (10 sec: 59801.7, 60 sec: 59665.1, 300 sec: 59565.6). Total num frames: 97759232. Throughput: 0: 14915.2. Samples: 24425196. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) +[2023-02-26 12:22:38,656][00001] Avg episode reward: [(0, '50.090')] +[2023-02-26 12:22:38,656][00141] Saving new best policy, reward=50.090! +[2023-02-26 12:22:38,849][00189] Updated weights for policy 0, policy_version 23870 (0.0006) +[2023-02-26 12:22:39,526][00189] Updated weights for policy 0, policy_version 23880 (0.0006) +[2023-02-26 12:22:40,187][00189] Updated weights for policy 0, policy_version 23890 (0.0006) +[2023-02-26 12:22:40,907][00189] Updated weights for policy 0, policy_version 23900 (0.0007) +[2023-02-26 12:22:41,572][00189] Updated weights for policy 0, policy_version 23910 (0.0006) +[2023-02-26 12:22:42,237][00189] Updated weights for policy 0, policy_version 23920 (0.0006) +[2023-02-26 12:22:42,975][00189] Updated weights for policy 0, policy_version 23930 (0.0007) +[2023-02-26 12:22:43,633][00189] Updated weights for policy 0, policy_version 23940 (0.0006) +[2023-02-26 12:22:43,656][00001] Fps is (10 sec: 59391.9, 60 sec: 59733.3, 300 sec: 59551.7). Total num frames: 98058240. Throughput: 0: 14922.0. Samples: 24514796. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:22:43,656][00001] Avg episode reward: [(0, '48.316')] +[2023-02-26 12:22:44,296][00189] Updated weights for policy 0, policy_version 23950 (0.0006) +[2023-02-26 12:22:45,045][00189] Updated weights for policy 0, policy_version 23960 (0.0007) +[2023-02-26 12:22:45,707][00189] Updated weights for policy 0, policy_version 23970 (0.0007) +[2023-02-26 12:22:46,349][00189] Updated weights for policy 0, policy_version 23980 (0.0006) +[2023-02-26 12:22:47,095][00189] Updated weights for policy 0, policy_version 23990 (0.0007) +[2023-02-26 12:22:47,785][00189] Updated weights for policy 0, policy_version 24000 (0.0006) +[2023-02-26 12:22:48,416][00189] Updated weights for policy 0, policy_version 24010 (0.0006) +[2023-02-26 12:22:48,596][00141] Signal inference workers to stop experience collection... 
(700 times) +[2023-02-26 12:22:48,599][00189] InferenceWorker_p0-w0: stopping experience collection (700 times) +[2023-02-26 12:22:48,605][00141] Signal inference workers to resume experience collection... (700 times) +[2023-02-26 12:22:48,605][00189] InferenceWorker_p0-w0: resuming experience collection (700 times) +[2023-02-26 12:22:48,656][00001] Fps is (10 sec: 59801.6, 60 sec: 59665.1, 300 sec: 59565.6). Total num frames: 98357248. Throughput: 0: 14922.2. Samples: 24559492. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:22:48,656][00001] Avg episode reward: [(0, '46.399')] +[2023-02-26 12:22:49,175][00189] Updated weights for policy 0, policy_version 24020 (0.0007) +[2023-02-26 12:22:49,835][00189] Updated weights for policy 0, policy_version 24030 (0.0007) +[2023-02-26 12:22:50,474][00189] Updated weights for policy 0, policy_version 24040 (0.0006) +[2023-02-26 12:22:51,221][00189] Updated weights for policy 0, policy_version 24050 (0.0006) +[2023-02-26 12:22:51,883][00189] Updated weights for policy 0, policy_version 24060 (0.0006) +[2023-02-26 12:22:52,530][00189] Updated weights for policy 0, policy_version 24070 (0.0006) +[2023-02-26 12:22:53,285][00189] Updated weights for policy 0, policy_version 24080 (0.0007) +[2023-02-26 12:22:53,656][00001] Fps is (10 sec: 59391.7, 60 sec: 59596.8, 300 sec: 59537.8). Total num frames: 98652160. Throughput: 0: 14913.9. Samples: 24648860. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) +[2023-02-26 12:22:53,656][00001] Avg episode reward: [(0, '45.185')] +[2023-02-26 12:22:53,660][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000024085_98652160.pth... 
+[2023-02-26 12:22:53,705][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000020599_84373504.pth
+[2023-02-26 12:22:53,978][00189] Updated weights for policy 0, policy_version 24090 (0.0006)
+[2023-02-26 12:22:54,657][00189] Updated weights for policy 0, policy_version 24100 (0.0006)
+[2023-02-26 12:22:55,343][00189] Updated weights for policy 0, policy_version 24110 (0.0007)
+[2023-02-26 12:22:56,000][00189] Updated weights for policy 0, policy_version 24120 (0.0006)
+[2023-02-26 12:22:56,727][00189] Updated weights for policy 0, policy_version 24130 (0.0006)
+[2023-02-26 12:22:57,399][00189] Updated weights for policy 0, policy_version 24140 (0.0006)
+[2023-02-26 12:22:58,076][00189] Updated weights for policy 0, policy_version 24150 (0.0006)
+[2023-02-26 12:22:58,656][00001] Fps is (10 sec: 59392.1, 60 sec: 59665.1, 300 sec: 59551.7). Total num frames: 98951168. Throughput: 0: 14914.9. Samples: 24738128. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0)
+[2023-02-26 12:22:58,656][00001] Avg episode reward: [(0, '46.965')]
+[2023-02-26 12:22:58,777][00189] Updated weights for policy 0, policy_version 24160 (0.0006)
+[2023-02-26 12:22:59,461][00189] Updated weights for policy 0, policy_version 24170 (0.0006)
+[2023-02-26 12:23:00,117][00189] Updated weights for policy 0, policy_version 24180 (0.0006)
+[2023-02-26 12:23:00,832][00189] Updated weights for policy 0, policy_version 24190 (0.0006)
+[2023-02-26 12:23:01,549][00189] Updated weights for policy 0, policy_version 24200 (0.0006)
+[2023-02-26 12:23:02,215][00189] Updated weights for policy 0, policy_version 24210 (0.0006)
+[2023-02-26 12:23:02,897][00189] Updated weights for policy 0, policy_version 24220 (0.0007)
+[2023-02-26 12:23:03,610][00189] Updated weights for policy 0, policy_version 24230 (0.0006)
+[2023-02-26 12:23:03,656][00001] Fps is (10 sec: 59801.9, 60 sec: 59665.1, 300 sec: 59537.8). Total num frames: 99250176. Throughput: 0: 14907.3. Samples: 24782724. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:23:03,656][00001] Avg episode reward: [(0, '46.402')]
+[2023-02-26 12:23:04,266][00189] Updated weights for policy 0, policy_version 24240 (0.0007)
+[2023-02-26 12:23:04,955][00189] Updated weights for policy 0, policy_version 24250 (0.0006)
+[2023-02-26 12:23:05,691][00189] Updated weights for policy 0, policy_version 24260 (0.0007)
+[2023-02-26 12:23:06,347][00189] Updated weights for policy 0, policy_version 24270 (0.0006)
+[2023-02-26 12:23:07,002][00189] Updated weights for policy 0, policy_version 24280 (0.0006)
+[2023-02-26 12:23:07,759][00189] Updated weights for policy 0, policy_version 24290 (0.0006)
+[2023-02-26 12:23:08,413][00189] Updated weights for policy 0, policy_version 24300 (0.0007)
+[2023-02-26 12:23:08,656][00001] Fps is (10 sec: 59392.0, 60 sec: 59596.8, 300 sec: 59523.9). Total num frames: 99545088. Throughput: 0: 14893.3. Samples: 24871892. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:23:08,656][00001] Avg episode reward: [(0, '48.544')]
+[2023-02-26 12:23:09,103][00189] Updated weights for policy 0, policy_version 24310 (0.0007)
+[2023-02-26 12:23:09,829][00189] Updated weights for policy 0, policy_version 24320 (0.0006)
+[2023-02-26 12:23:10,480][00189] Updated weights for policy 0, policy_version 24330 (0.0006)
+[2023-02-26 12:23:11,185][00189] Updated weights for policy 0, policy_version 24340 (0.0006)
+[2023-02-26 12:23:11,878][00189] Updated weights for policy 0, policy_version 24350 (0.0006)
+[2023-02-26 12:23:12,552][00189] Updated weights for policy 0, policy_version 24360 (0.0006)
+[2023-02-26 12:23:13,250][00189] Updated weights for policy 0, policy_version 24370 (0.0006)
+[2023-02-26 12:23:13,656][00001] Fps is (10 sec: 58982.3, 60 sec: 59528.6, 300 sec: 59510.0). Total num frames: 99840000. Throughput: 0: 14882.3. Samples: 24961000. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0)
+[2023-02-26 12:23:13,656][00001] Avg episode reward: [(0, '46.878')]
+[2023-02-26 12:23:13,901][00189] Updated weights for policy 0, policy_version 24380 (0.0007)
+[2023-02-26 12:23:14,595][00189] Updated weights for policy 0, policy_version 24390 (0.0007)
+[2023-02-26 12:23:15,307][00189] Updated weights for policy 0, policy_version 24400 (0.0006)
+[2023-02-26 12:23:15,978][00189] Updated weights for policy 0, policy_version 24410 (0.0007)
+[2023-02-26 12:23:16,403][00141] Stopping Batcher_0...
+[2023-02-26 12:23:16,403][00001] Component Batcher_0 stopped!
+[2023-02-26 12:23:16,403][00141] Loop batcher_evt_loop terminating...
+[2023-02-26 12:23:16,404][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth...
+[2023-02-26 12:23:16,416][00189] Weights refcount: 2 0
+[2023-02-26 12:23:16,416][00189] Stopping InferenceWorker_p0-w0...
+[2023-02-26 12:23:16,417][00189] Loop inference_proc0-0_evt_loop terminating...
+[2023-02-26 12:23:16,417][00001] Component InferenceWorker_p0-w0 stopped!
+[2023-02-26 12:23:16,427][00201] Stopping RolloutWorker_w9...
+[2023-02-26 12:23:16,427][00001] Component RolloutWorker_w9 stopped!
+[2023-02-26 12:23:16,428][00201] Loop rollout_proc9_evt_loop terminating...
+[2023-02-26 12:23:16,428][00001] Component RolloutWorker_w0 stopped!
+[2023-02-26 12:23:16,428][00190] Stopping RolloutWorker_w0...
+[2023-02-26 12:23:16,428][00190] Loop rollout_proc0_evt_loop terminating...
+[2023-02-26 12:23:16,429][00001] Component RolloutWorker_w8 stopped!
+[2023-02-26 12:23:16,429][00001] Component RolloutWorker_w7 stopped!
+[2023-02-26 12:23:16,429][00198] Stopping RolloutWorker_w7...
+[2023-02-26 12:23:16,429][00197] Stopping RolloutWorker_w8...
+[2023-02-26 12:23:16,430][00198] Loop rollout_proc7_evt_loop terminating...
+[2023-02-26 12:23:16,430][00001] Component RolloutWorker_w1 stopped!
+[2023-02-26 12:23:16,430][00001] Component RolloutWorker_w11 stopped!
+[2023-02-26 12:23:16,430][00197] Loop rollout_proc8_evt_loop terminating...
+[2023-02-26 12:23:16,430][00200] Stopping RolloutWorker_w11...
+[2023-02-26 12:23:16,430][00194] Stopping RolloutWorker_w4...
+[2023-02-26 12:23:16,430][00191] Stopping RolloutWorker_w1...
+[2023-02-26 12:23:16,430][00001] Component RolloutWorker_w4 stopped!
+[2023-02-26 12:23:16,430][00200] Loop rollout_proc11_evt_loop terminating...
+[2023-02-26 12:23:16,430][00194] Loop rollout_proc4_evt_loop terminating...
+[2023-02-26 12:23:16,430][00191] Loop rollout_proc1_evt_loop terminating...
+[2023-02-26 12:23:16,431][00001] Component RolloutWorker_w2 stopped!
+[2023-02-26 12:23:16,431][00195] Stopping RolloutWorker_w2...
+[2023-02-26 12:23:16,431][00195] Loop rollout_proc2_evt_loop terminating...
+[2023-02-26 12:23:16,431][00001] Component RolloutWorker_w6 stopped!
+[2023-02-26 12:23:16,431][00001] Component RolloutWorker_w3 stopped!
+[2023-02-26 12:23:16,431][00196] Stopping RolloutWorker_w6...
+[2023-02-26 12:23:16,431][00193] Stopping RolloutWorker_w3...
+[2023-02-26 12:23:16,432][00196] Loop rollout_proc6_evt_loop terminating...
+[2023-02-26 12:23:16,432][00193] Loop rollout_proc3_evt_loop terminating...
+[2023-02-26 12:23:16,432][00001] Component RolloutWorker_w5 stopped!
+[2023-02-26 12:23:16,432][00192] Stopping RolloutWorker_w5...
+[2023-02-26 12:23:16,432][00001] Component RolloutWorker_w10 stopped!
+[2023-02-26 12:23:16,432][00199] Stopping RolloutWorker_w10...
+[2023-02-26 12:23:16,433][00192] Loop rollout_proc5_evt_loop terminating...
+[2023-02-26 12:23:16,433][00199] Loop rollout_proc10_evt_loop terminating...
+[2023-02-26 12:23:16,448][00141] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000022338_91496448.pth
+[2023-02-26 12:23:16,456][00141] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth...
+[2023-02-26 12:23:16,518][00141] Stopping LearnerWorker_p0...
+[2023-02-26 12:23:16,519][00141] Loop learner_proc0_evt_loop terminating...
+[2023-02-26 12:23:16,519][00001] Component LearnerWorker_p0 stopped!
+[2023-02-26 12:23:16,519][00001] Waiting for process learner_proc0 to stop...
+[2023-02-26 12:23:17,255][00001] Waiting for process inference_proc0-0 to join...
+[2023-02-26 12:23:17,255][00001] Waiting for process rollout_proc0 to join...
+[2023-02-26 12:23:17,255][00001] Waiting for process rollout_proc1 to join...
+[2023-02-26 12:23:17,255][00001] Waiting for process rollout_proc2 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc3 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc4 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc5 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc6 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc7 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc8 to join...
+[2023-02-26 12:23:17,256][00001] Waiting for process rollout_proc9 to join...
+[2023-02-26 12:23:17,257][00001] Waiting for process rollout_proc10 to join...
+[2023-02-26 12:23:17,257][00001] Waiting for process rollout_proc11 to join...
+[2023-02-26 12:23:17,257][00001] Batcher 0 profile tree view:
+batching: 191.5725, releasing_batches: 0.3781
+[2023-02-26 12:23:17,257][00001] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0001
+  wait_policy_total: 116.5801
+update_model: 26.9890
+  weight_update: 0.0006
+one_step: 0.0012
+  handle_policy_step: 1464.8074
+    deserialize: 94.8114, stack: 7.2804, obs_to_device_normalize: 392.9161, forward: 535.6107, send_messages: 91.3920
+    prepare_outputs: 275.4483
+      to_cpu: 196.4430
+[2023-02-26 12:23:17,257][00001] Learner 0 profile tree view:
+misc: 0.0953, prepare_batch: 95.2645
+train: 306.9296
+  epoch_init: 0.0986, minibatch_init: 0.0856, losses_postprocess: 3.8987, kl_divergence: 4.7444, after_optimizer: 33.8736
+  calculate_losses: 136.2350
+    losses_init: 0.0601, forward_head: 13.9870, bptt_initial: 81.9990, tail: 7.3573, advantages_returns: 1.9754, losses: 13.2568
+    bptt: 15.2232
+      bptt_forward_core: 14.6556
+  update: 122.8192
+    clip: 16.9164
+[2023-02-26 12:23:17,257][00001] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 1.2228, enqueue_policy_requests: 50.3874, env_step: 1304.0774, overhead: 78.7667, complete_rollouts: 1.1384
+save_policy_outputs: 78.4617
+  split_output_tensors: 37.5235
+[2023-02-26 12:23:17,257][00001] RolloutWorker_w11 profile tree view:
+wait_for_trajectories: 1.2185, enqueue_policy_requests: 50.6367, env_step: 1305.7803, overhead: 78.4997, complete_rollouts: 1.0931
+save_policy_outputs: 78.6178
+  split_output_tensors: 37.6594
+[2023-02-26 12:23:17,258][00001] Loop Runner_EvtLoop terminating...
+[2023-02-26 12:23:17,258][00001] Runner profile tree view:
+main_loop: 1701.9700
+[2023-02-26 12:23:17,258][00001] Collected {0: 100007936}, FPS: 58760.1
+[2023-02-26 12:23:17,267][00001] Loading existing experiment configuration from ./runs/default_experiment/config.json
+[2023-02-26 12:23:17,268][00001] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-02-26 12:23:17,268][00001] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'hf_repository'='chavicoski/vizdoom_health_gathering_supreme' that is not in the saved config file!
+[2023-02-26 12:23:17,268][00001] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-02-26 12:23:17,269][00001] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-02-26 12:23:17,269][00001] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-02-26 12:23:17,269][00001] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-02-26 12:23:17,269][00001] Using frameskip 1 and render_action_repeat=4 for evaluation
+[2023-02-26 12:23:17,277][00001] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-02-26 12:23:17,278][00001] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-26 12:23:17,278][00001] RunningMeanStd input shape: (1,)
+[2023-02-26 12:23:17,294][00001] ConvEncoder: input_channels=3
+[2023-02-26 12:23:17,380][00001] Conv encoder output size: 512
+[2023-02-26 12:23:17,381][00001] Policy head output size: 512
+[2023-02-26 12:23:18,648][00001] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth...
+[2023-02-26 12:23:19,270][00001] Num frames 100...
+[2023-02-26 12:23:19,372][00001] Num frames 200...
+[2023-02-26 12:23:19,472][00001] Num frames 300...
+[2023-02-26 12:23:19,572][00001] Num frames 400...
+[2023-02-26 12:23:19,667][00001] Num frames 500...
+[2023-02-26 12:23:19,764][00001] Num frames 600...
+[2023-02-26 12:23:19,857][00001] Num frames 700...
+[2023-02-26 12:23:19,953][00001] Num frames 800...
+[2023-02-26 12:23:20,051][00001] Num frames 900...
+[2023-02-26 12:23:20,153][00001] Num frames 1000...
+[2023-02-26 12:23:20,254][00001] Num frames 1100...
+[2023-02-26 12:23:20,354][00001] Num frames 1200...
+[2023-02-26 12:23:20,454][00001] Num frames 1300...
+[2023-02-26 12:23:20,555][00001] Num frames 1400...
+[2023-02-26 12:23:20,658][00001] Num frames 1500...
+[2023-02-26 12:23:20,755][00001] Num frames 1600...
+[2023-02-26 12:23:20,850][00001] Num frames 1700...
+[2023-02-26 12:23:20,948][00001] Num frames 1800...
+[2023-02-26 12:23:21,088][00001] Avg episode rewards: #0: 49.879, true rewards: #0: 18.880
+[2023-02-26 12:23:21,088][00001] Avg episode reward: 49.879, avg true_objective: 18.880
+[2023-02-26 12:23:21,105][00001] Num frames 1900...
+[2023-02-26 12:23:21,220][00001] Num frames 2000...
+[2023-02-26 12:23:21,315][00001] Num frames 2100...
+[2023-02-26 12:23:21,410][00001] Num frames 2200...
+[2023-02-26 12:23:21,504][00001] Num frames 2300...
+[2023-02-26 12:23:21,598][00001] Num frames 2400...
+[2023-02-26 12:23:21,693][00001] Num frames 2500...
+[2023-02-26 12:23:21,787][00001] Num frames 2600...
+[2023-02-26 12:23:21,882][00001] Num frames 2700...
+[2023-02-26 12:23:21,977][00001] Num frames 2800...
+[2023-02-26 12:23:22,071][00001] Num frames 2900...
+[2023-02-26 12:23:22,165][00001] Num frames 3000...
+[2023-02-26 12:23:22,260][00001] Num frames 3100...
+[2023-02-26 12:23:22,355][00001] Num frames 3200...
+[2023-02-26 12:23:22,451][00001] Num frames 3300...
+[2023-02-26 12:23:22,548][00001] Num frames 3400...
+[2023-02-26 12:23:22,644][00001] Num frames 3500...
+[2023-02-26 12:23:22,704][00001] Avg episode rewards: #0: 45.529, true rewards: #0: 17.530
+[2023-02-26 12:23:22,704][00001] Avg episode reward: 45.529, avg true_objective: 17.530
+[2023-02-26 12:23:22,810][00001] Num frames 3600...
+[2023-02-26 12:23:22,903][00001] Num frames 3700...
+[2023-02-26 12:23:22,997][00001] Num frames 3800...
+[2023-02-26 12:23:23,091][00001] Num frames 3900...
+[2023-02-26 12:23:23,183][00001] Num frames 4000...
+[2023-02-26 12:23:23,273][00001] Num frames 4100...
+[2023-02-26 12:23:23,363][00001] Num frames 4200...
+[2023-02-26 12:23:23,455][00001] Num frames 4300...
+[2023-02-26 12:23:23,549][00001] Num frames 4400...
+[2023-02-26 12:23:23,643][00001] Num frames 4500...
+[2023-02-26 12:23:23,736][00001] Num frames 4600...
+[2023-02-26 12:23:23,829][00001] Num frames 4700...
+[2023-02-26 12:23:23,924][00001] Num frames 4800...
+[2023-02-26 12:23:24,018][00001] Num frames 4900...
+[2023-02-26 12:23:24,111][00001] Num frames 5000...
+[2023-02-26 12:23:24,204][00001] Num frames 5100...
+[2023-02-26 12:23:24,299][00001] Num frames 5200...
+[2023-02-26 12:23:24,391][00001] Num frames 5300...
+[2023-02-26 12:23:24,486][00001] Num frames 5400...
+[2023-02-26 12:23:24,581][00001] Num frames 5500...
+[2023-02-26 12:23:24,676][00001] Num frames 5600...
+[2023-02-26 12:23:24,734][00001] Avg episode rewards: #0: 49.686, true rewards: #0: 18.687
+[2023-02-26 12:23:24,734][00001] Avg episode reward: 49.686, avg true_objective: 18.687
+[2023-02-26 12:23:24,836][00001] Num frames 5700...
+[2023-02-26 12:23:24,929][00001] Num frames 5800...
+[2023-02-26 12:23:25,021][00001] Num frames 5900...
+[2023-02-26 12:23:25,114][00001] Num frames 6000...
+[2023-02-26 12:23:25,207][00001] Num frames 6100...
+[2023-02-26 12:23:25,301][00001] Num frames 6200...
+[2023-02-26 12:23:25,395][00001] Num frames 6300...
+[2023-02-26 12:23:25,501][00001] Avg episode rewards: #0: 40.889, true rewards: #0: 15.890
+[2023-02-26 12:23:25,501][00001] Avg episode reward: 40.889, avg true_objective: 15.890
+[2023-02-26 12:23:25,556][00001] Num frames 6400...
+[2023-02-26 12:23:25,651][00001] Num frames 6500...
+[2023-02-26 12:23:25,744][00001] Num frames 6600...
+[2023-02-26 12:23:25,843][00001] Num frames 6700...
+[2023-02-26 12:23:25,942][00001] Num frames 6800...
+[2023-02-26 12:23:26,041][00001] Num frames 6900...
+[2023-02-26 12:23:26,141][00001] Num frames 7000...
+[2023-02-26 12:23:26,226][00001] Num frames 7100...
+[2023-02-26 12:23:26,318][00001] Num frames 7200...
+[2023-02-26 12:23:26,412][00001] Num frames 7300...
+[2023-02-26 12:23:26,513][00001] Num frames 7400...
+[2023-02-26 12:23:26,613][00001] Num frames 7500...
+[2023-02-26 12:23:26,714][00001] Num frames 7600...
+[2023-02-26 12:23:26,815][00001] Num frames 7700...
+[2023-02-26 12:23:26,915][00001] Num frames 7800...
+[2023-02-26 12:23:27,013][00001] Num frames 7900...
+[2023-02-26 12:23:27,108][00001] Num frames 8000...
+[2023-02-26 12:23:27,202][00001] Num frames 8100...
+[2023-02-26 12:23:27,298][00001] Num frames 8200...
+[2023-02-26 12:23:27,393][00001] Num frames 8300...
+[2023-02-26 12:23:27,487][00001] Num frames 8400...
+[2023-02-26 12:23:27,593][00001] Avg episode rewards: #0: 43.711, true rewards: #0: 16.912
+[2023-02-26 12:23:27,594][00001] Avg episode reward: 43.711, avg true_objective: 16.912
+[2023-02-26 12:23:27,652][00001] Num frames 8500...
+[2023-02-26 12:23:27,748][00001] Num frames 8600...
+[2023-02-26 12:23:27,840][00001] Num frames 8700...
+[2023-02-26 12:23:27,933][00001] Num frames 8800...
+[2023-02-26 12:23:28,016][00001] Num frames 8900...
+[2023-02-26 12:23:28,108][00001] Num frames 9000...
+[2023-02-26 12:23:28,201][00001] Num frames 9100...
+[2023-02-26 12:23:28,297][00001] Num frames 9200...
+[2023-02-26 12:23:28,393][00001] Num frames 9300...
+[2023-02-26 12:23:28,490][00001] Num frames 9400...
+[2023-02-26 12:23:28,586][00001] Num frames 9500...
+[2023-02-26 12:23:28,679][00001] Num frames 9600...
+[2023-02-26 12:23:28,773][00001] Num frames 9700...
+[2023-02-26 12:23:28,868][00001] Num frames 9800...
+[2023-02-26 12:23:28,962][00001] Num frames 9900...
+[2023-02-26 12:23:29,057][00001] Num frames 10000...
+[2023-02-26 12:23:29,145][00001] Num frames 10100...
+[2023-02-26 12:23:29,240][00001] Num frames 10200...
+[2023-02-26 12:23:29,335][00001] Num frames 10300...
+[2023-02-26 12:23:29,429][00001] Num frames 10400...
+[2023-02-26 12:23:29,524][00001] Num frames 10500...
+[2023-02-26 12:23:29,630][00001] Avg episode rewards: #0: 46.259, true rewards: #0: 17.593
+[2023-02-26 12:23:29,630][00001] Avg episode reward: 46.259, avg true_objective: 17.593
+[2023-02-26 12:23:29,686][00001] Num frames 10600...
+[2023-02-26 12:23:29,782][00001] Num frames 10700...
+[2023-02-26 12:23:29,877][00001] Num frames 10800...
+[2023-02-26 12:23:29,971][00001] Num frames 10900...
+[2023-02-26 12:23:30,066][00001] Num frames 11000...
+[2023-02-26 12:23:30,160][00001] Num frames 11100...
+[2023-02-26 12:23:30,254][00001] Num frames 11200...
+[2023-02-26 12:23:30,348][00001] Num frames 11300...
+[2023-02-26 12:23:30,443][00001] Num frames 11400...
+[2023-02-26 12:23:30,537][00001] Num frames 11500...
+[2023-02-26 12:23:30,632][00001] Num frames 11600...
+[2023-02-26 12:23:30,728][00001] Num frames 11700...
+[2023-02-26 12:23:30,822][00001] Num frames 11800...
+[2023-02-26 12:23:30,918][00001] Num frames 11900...
+[2023-02-26 12:23:31,007][00001] Num frames 12000...
+[2023-02-26 12:23:31,101][00001] Num frames 12100...
+[2023-02-26 12:23:31,196][00001] Num frames 12200...
+[2023-02-26 12:23:31,290][00001] Num frames 12300...
+[2023-02-26 12:23:31,386][00001] Num frames 12400...
+[2023-02-26 12:23:31,482][00001] Num frames 12500...
+[2023-02-26 12:23:31,580][00001] Num frames 12600...
+[2023-02-26 12:23:31,687][00001] Avg episode rewards: #0: 47.936, true rewards: #0: 18.080
+[2023-02-26 12:23:31,688][00001] Avg episode reward: 47.936, avg true_objective: 18.080
+[2023-02-26 12:23:31,747][00001] Num frames 12700...
+[2023-02-26 12:23:31,844][00001] Num frames 12800...
+[2023-02-26 12:23:31,938][00001] Num frames 12900...
+[2023-02-26 12:23:32,032][00001] Num frames 13000...
+[2023-02-26 12:23:32,124][00001] Num frames 13100...
+[2023-02-26 12:23:32,218][00001] Num frames 13200...
+[2023-02-26 12:23:32,312][00001] Num frames 13300...
+[2023-02-26 12:23:32,409][00001] Num frames 13400...
+[2023-02-26 12:23:32,500][00001] Num frames 13500...
+[2023-02-26 12:23:32,583][00001] Num frames 13600...
+[2023-02-26 12:23:32,663][00001] Num frames 13700...
+[2023-02-26 12:23:32,746][00001] Num frames 13800...
+[2023-02-26 12:23:32,827][00001] Num frames 13900...
+[2023-02-26 12:23:32,912][00001] Num frames 14000...
+[2023-02-26 12:23:32,997][00001] Num frames 14100...
+[2023-02-26 12:23:33,080][00001] Num frames 14200...
+[2023-02-26 12:23:33,164][00001] Num frames 14300...
+[2023-02-26 12:23:33,248][00001] Num frames 14400...
+[2023-02-26 12:23:33,331][00001] Num frames 14500...
+[2023-02-26 12:23:33,415][00001] Num frames 14600...
+[2023-02-26 12:23:33,500][00001] Num frames 14700...
+[2023-02-26 12:23:33,556][00001] Avg episode rewards: #0: 47.754, true rewards: #0: 18.380
+[2023-02-26 12:23:33,557][00001] Avg episode reward: 47.754, avg true_objective: 18.380
+[2023-02-26 12:23:33,645][00001] Num frames 14800...
+[2023-02-26 12:23:33,728][00001] Num frames 14900...
+[2023-02-26 12:23:33,811][00001] Num frames 15000...
+[2023-02-26 12:23:33,894][00001] Num frames 15100...
+[2023-02-26 12:23:33,978][00001] Num frames 15200...
+[2023-02-26 12:23:34,061][00001] Num frames 15300...
+[2023-02-26 12:23:34,144][00001] Num frames 15400...
+[2023-02-26 12:23:34,228][00001] Num frames 15500...
+[2023-02-26 12:23:34,312][00001] Num frames 15600...
+[2023-02-26 12:23:34,396][00001] Num frames 15700...
+[2023-02-26 12:23:34,480][00001] Num frames 15800...
+[2023-02-26 12:23:34,563][00001] Num frames 15900...
+[2023-02-26 12:23:34,646][00001] Num frames 16000...
+[2023-02-26 12:23:34,730][00001] Num frames 16100...
+[2023-02-26 12:23:34,816][00001] Num frames 16200...
+[2023-02-26 12:23:34,901][00001] Num frames 16300...
+[2023-02-26 12:23:34,987][00001] Num frames 16400...
+[2023-02-26 12:23:35,074][00001] Num frames 16500...
+[2023-02-26 12:23:35,160][00001] Num frames 16600...
+[2023-02-26 12:23:35,246][00001] Num frames 16700...
+[2023-02-26 12:23:35,332][00001] Num frames 16800...
+[2023-02-26 12:23:35,388][00001] Avg episode rewards: #0: 49.226, true rewards: #0: 18.671
+[2023-02-26 12:23:35,388][00001] Avg episode reward: 49.226, avg true_objective: 18.671
+[2023-02-26 12:23:35,470][00001] Num frames 16900...
+[2023-02-26 12:23:35,556][00001] Num frames 17000...
+[2023-02-26 12:23:35,641][00001] Num frames 17100...
+[2023-02-26 12:23:35,727][00001] Num frames 17200...
+[2023-02-26 12:23:35,812][00001] Num frames 17300...
+[2023-02-26 12:23:35,897][00001] Num frames 17400...
+[2023-02-26 12:23:35,982][00001] Num frames 17500...
+[2023-02-26 12:23:36,068][00001] Num frames 17600...
+[2023-02-26 12:23:36,154][00001] Num frames 17700...
+[2023-02-26 12:23:36,239][00001] Num frames 17800...
+[2023-02-26 12:23:36,322][00001] Num frames 17900...
+[2023-02-26 12:23:36,408][00001] Num frames 18000...
+[2023-02-26 12:23:36,494][00001] Num frames 18100...
+[2023-02-26 12:23:36,579][00001] Num frames 18200...
+[2023-02-26 12:23:36,665][00001] Num frames 18300...
+[2023-02-26 12:23:36,751][00001] Num frames 18400...
+[2023-02-26 12:23:36,836][00001] Num frames 18500...
+[2023-02-26 12:23:36,923][00001] Num frames 18600...
+[2023-02-26 12:23:37,009][00001] Num frames 18700...
+[2023-02-26 12:23:37,092][00001] Num frames 18800...
+[2023-02-26 12:23:37,178][00001] Num frames 18900...
+[2023-02-26 12:23:37,235][00001] Avg episode rewards: #0: 50.403, true rewards: #0: 18.904
+[2023-02-26 12:23:37,235][00001] Avg episode reward: 50.403, avg true_objective: 18.904
+[2023-02-26 12:23:56,598][00001] Replay video saved to ./runs/default_experiment/replay.mp4!