[2023-06-29 08:39:32,418][00488] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-06-29 08:39:32,420][00488] Rollout worker 0 uses device cpu
[2023-06-29 08:39:32,422][00488] Rollout worker 1 uses device cpu
[2023-06-29 08:39:32,423][00488] Rollout worker 2 uses device cpu
[2023-06-29 08:39:32,424][00488] Rollout worker 3 uses device cpu
[2023-06-29 08:39:32,425][00488] Rollout worker 4 uses device cpu
[2023-06-29 08:39:32,426][00488] Rollout worker 5 uses device cpu
[2023-06-29 08:39:32,427][00488] Rollout worker 6 uses device cpu
[2023-06-29 08:39:32,428][00488] Rollout worker 7 uses device cpu
[2023-06-29 08:39:32,430][00488] Rollout worker 8 uses device cpu
[2023-06-29 08:39:32,431][00488] Rollout worker 9 uses device cpu
[2023-06-29 08:39:32,433][00488] Rollout worker 10 uses device cpu
[2023-06-29 08:39:32,434][00488] Rollout worker 11 uses device cpu
[2023-06-29 08:39:32,435][00488] Rollout worker 12 uses device cpu
[2023-06-29 08:39:32,436][00488] Rollout worker 13 uses device cpu
[2023-06-29 08:39:32,437][00488] Rollout worker 14 uses device cpu
[2023-06-29 08:39:32,446][00488] Rollout worker 15 uses device cpu
[2023-06-29 08:39:32,901][00488] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-06-29 08:39:32,906][00488] InferenceWorker_p0-w0: min num requests: 5
[2023-06-29 08:39:32,989][00488] Starting all processes...
[2023-06-29 08:39:32,994][00488] Starting process learner_proc0
[2023-06-29 08:39:33,067][00488] Starting all processes...
[2023-06-29 08:39:33,078][00488] Starting process inference_proc0-0
[2023-06-29 08:39:33,078][00488] Starting process rollout_proc0
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc1
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc2
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc3
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc4
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc5
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc6
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc7
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc8
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc9
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc10
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc11
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc12
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc13
[2023-06-29 08:39:33,081][00488] Starting process rollout_proc14
[2023-06-29 08:39:33,699][00488] Starting process rollout_proc15
[2023-06-29 08:40:02,826][12618] Worker 7 uses CPU cores [1]
[2023-06-29 08:40:03,027][12590] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-06-29 08:40:03,032][12590] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-06-29 08:40:03,057][00488] Heartbeat connected on RolloutWorker_w7
[2023-06-29 08:40:03,072][12610] Worker 0 uses CPU cores [0]
[2023-06-29 08:40:03,109][12590] Num visible devices: 1
[2023-06-29 08:40:03,143][00488] Heartbeat connected on Batcher_0
[2023-06-29 08:40:03,152][12619] Worker 8 uses CPU cores [0]
[2023-06-29 08:40:03,152][12590] Starting seed is not provided
[2023-06-29 08:40:03,153][12590] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-06-29 08:40:03,154][12590] Initializing actor-critic model on device cuda:0
[2023-06-29 08:40:03,155][12590] RunningMeanStd input shape: (3, 72, 128)
[2023-06-29 08:40:03,157][12590] RunningMeanStd input shape: (1,)
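The two RunningMeanStd shapes above are the observation normalizer, fed (3, 72, 128) frames, and the returns normalizer, fed scalars. A minimal sketch of such a tracker in the style of the classic Baselines RunningMeanStd; Sample Factory's RunningMeanStdInPlace differs in implementation details but maintains the same running statistics.

    import numpy as np

    class RunningMeanStd:
        """Running mean/variance over a stream of batches (parallel-merge update)."""
        def __init__(self, shape, eps=1e-4):
            self.mean = np.zeros(shape, dtype=np.float64)
            self.var = np.ones(shape, dtype=np.float64)
            self.count = eps  # avoids division by zero before the first update

        def update(self, batch):
            b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
            delta = b_mean - self.mean
            tot = self.count + b_count
            self.mean = self.mean + delta * b_count / tot
            m2 = self.var * self.count + b_var * b_count \
                + delta ** 2 * self.count * b_count / tot
            self.var = m2 / tot
            self.count = tot

        def normalize(self, x):
            return (x - self.mean) / np.sqrt(self.var + 1e-8)

    obs_norm = RunningMeanStd((3, 72, 128))  # observation shape logged above
    ret_norm = RunningMeanStd((1,))          # returns normalizer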
[2023-06-29 08:40:03,207][12627] Worker 12 uses CPU cores [0]
[2023-06-29 08:40:03,309][12617] Worker 6 uses CPU cores [0]
[2023-06-29 08:40:03,297][12590] ConvEncoder: input_channels=3
[2023-06-29 08:40:03,347][12626] Worker 11 uses CPU cores [1]
[2023-06-29 08:40:03,367][12614] Worker 4 uses CPU cores [0]
[2023-06-29 08:40:03,380][00488] Heartbeat connected on RolloutWorker_w0
[2023-06-29 08:40:03,413][00488] Heartbeat connected on RolloutWorker_w8
[2023-06-29 08:40:03,445][12613] Worker 2 uses CPU cores [0]
[2023-06-29 08:40:03,448][00488] Heartbeat connected on RolloutWorker_w12
[2023-06-29 08:40:03,476][12629] Worker 14 uses CPU cores [0]
[2023-06-29 08:40:03,482][00488] Heartbeat connected on RolloutWorker_w6
[2023-06-29 08:40:03,604][00488] Heartbeat connected on RolloutWorker_w4
[2023-06-29 08:40:03,636][00488] Heartbeat connected on RolloutWorker_w2
[2023-06-29 08:40:03,646][12616] Worker 5 uses CPU cores [1]
[2023-06-29 08:40:03,712][00488] Heartbeat connected on RolloutWorker_w11
[2023-06-29 08:40:03,715][00488] Heartbeat connected on RolloutWorker_w14
[2023-06-29 08:40:03,767][12615] Worker 3 uses CPU cores [1]
[2023-06-29 08:40:03,768][12611] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-06-29 08:40:03,774][12611] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-06-29 08:40:03,783][12625] Worker 10 uses CPU cores [0]
[2023-06-29 08:40:03,793][12624] Worker 9 uses CPU cores [1]
[2023-06-29 08:40:03,826][12632] Worker 15 uses CPU cores [1]
[2023-06-29 08:40:03,844][00488] Heartbeat connected on RolloutWorker_w10
[2023-06-29 08:40:03,877][12611] Num visible devices: 1
[2023-06-29 08:40:03,915][00488] Heartbeat connected on InferenceWorker_p0-w0
[2023-06-29 08:40:03,923][00488] Heartbeat connected on RolloutWorker_w5
[2023-06-29 08:40:03,954][12612] Worker 1 uses CPU cores [1]
[2023-06-29 08:40:03,985][00488] Heartbeat connected on RolloutWorker_w3
[2023-06-29 08:40:04,056][00488] Heartbeat connected on RolloutWorker_w9
[2023-06-29 08:40:04,065][12631] Worker 13 uses CPU cores [1]
[2023-06-29 08:40:04,061][00488] Heartbeat connected on RolloutWorker_w15
[2023-06-29 08:40:04,087][00488] Heartbeat connected on RolloutWorker_w1
[2023-06-29 08:40:04,153][00488] Heartbeat connected on RolloutWorker_w13
[2023-06-29 08:40:04,335][12590] Conv encoder output size: 512
[2023-06-29 08:40:04,337][12590] Policy head output size: 512
[2023-06-29 08:40:04,423][12590] Created Actor Critic model with architecture:
[2023-06-29 08:40:04,424][12590] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
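A minimal PyTorch sketch of the actor-critic printed above. The layer types and sizes (three Conv2d+ELU blocks, a Linear+ELU projection to 512 units, a GRU(512, 512) core, a 1-unit value head and a 5-action policy head) are taken from the log; the conv kernel sizes and strides are assumptions patterned on Sample Factory's default encoder, since the printout does not record them.

    import torch
    import torch.nn as nn

    class ActorCriticSketch(nn.Module):
        def __init__(self, num_actions=5, hidden=512):
            super().__init__()
            # conv_head: Conv2d/ELU x3 (kernel/stride values are assumptions)
            self.conv_head = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
            )
            # 128 x 3 x 6 = 2304 features for (3, 72, 128) inputs with these strides
            self.mlp_layers = nn.Sequential(nn.Linear(128 * 3 * 6, hidden), nn.ELU())
            self.core = nn.GRU(hidden, hidden)         # ModelCoreRNN
            self.critic_linear = nn.Linear(hidden, 1)  # value head
            self.distribution_linear = nn.Linear(hidden, num_actions)  # policy logits

        def forward(self, obs, rnn_state=None):
            x = self.mlp_layers(self.conv_head(obs).flatten(1))
            x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
            x = x.squeeze(0)
            return self.distribution_linear(x), self.critic_linear(x), rnn_state

    logits, value, h = ActorCriticSketch()(torch.zeros(4, 3, 72, 128))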
[2023-06-29 08:40:14,381][12590] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-06-29 08:40:14,382][12590] No checkpoints found
[2023-06-29 08:40:14,383][12590] Did not load from checkpoint, starting from scratch!
[2023-06-29 08:40:14,383][12590] Initialized policy 0 weights for model version 0
[2023-06-29 08:40:14,387][12590] LearnerWorker_p0 finished initialization!
[2023-06-29 08:40:14,392][12590] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-06-29 08:40:14,389][00488] Heartbeat connected on LearnerWorker_p0
[2023-06-29 08:40:14,595][12611] RunningMeanStd input shape: (3, 72, 128)
[2023-06-29 08:40:14,598][12611] RunningMeanStd input shape: (1,)
[2023-06-29 08:40:14,614][12611] ConvEncoder: input_channels=3
[2023-06-29 08:40:14,716][12611] Conv encoder output size: 512
[2023-06-29 08:40:14,717][12611] Policy head output size: 512
[2023-06-29 08:40:14,812][00488] Inference worker 0-0 is ready!
[2023-06-29 08:40:14,814][00488] All inference workers are ready! Signal rollout workers to start!
[2023-06-29 08:40:15,063][12632] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,068][12612] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,073][12615] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,072][12624] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,077][12618] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,076][12626] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,099][12631] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,115][12629] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,116][12613] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,101][12616] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,133][12627] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,142][12625] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,147][12614] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,158][12617] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,191][12610] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:40:15,186][12619] Doom resolution: 160x120, resize resolution: (128, 72)
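Each of the 16 rollout workers renders Doom at its native 160x120 and resizes frames to 128x72, matching the (3, 72, 128) encoder input shape logged earlier. A minimal sketch of that preprocessing step; using OpenCV for the resize is an assumption, as the log does not name the backend.

    import cv2
    import numpy as np

    frame = np.zeros((120, 160, 3), dtype=np.uint8)  # native H x W x C Doom frame
    resized = cv2.resize(frame, (128, 72), interpolation=cv2.INTER_AREA)  # dsize is (W, H)
    obs = np.transpose(resized, (2, 0, 1))           # HWC -> CHW, i.e. (3, 72, 128)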
[2023-06-29 08:40:16,254][12627] Decorrelating experience for 0 frames...
[2023-06-29 08:40:16,989][00488] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:17,057][12627] Decorrelating experience for 32 frames...
[2023-06-29 08:40:17,077][12614] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,511][12614] Decorrelating experience for 32 frames...
[2023-06-29 08:40:17,837][12616] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,840][12618] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,842][12632] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,846][12612] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,849][12624] Decorrelating experience for 0 frames...
[2023-06-29 08:40:17,851][12615] Decorrelating experience for 0 frames...
[2023-06-29 08:40:19,426][12619] Decorrelating experience for 0 frames...
[2023-06-29 08:40:19,469][12618] Decorrelating experience for 32 frames...
[2023-06-29 08:40:19,475][12612] Decorrelating experience for 32 frames...
[2023-06-29 08:40:19,483][12616] Decorrelating experience for 32 frames...
[2023-06-29 08:40:19,489][12615] Decorrelating experience for 32 frames...
[2023-06-29 08:40:19,492][12613] Decorrelating experience for 0 frames...
[2023-06-29 08:40:19,494][12629] Decorrelating experience for 0 frames...
[2023-06-29 08:40:19,538][12625] Decorrelating experience for 0 frames...
[2023-06-29 08:40:21,035][12632] Decorrelating experience for 32 frames...
[2023-06-29 08:40:21,182][12618] Decorrelating experience for 64 frames...
[2023-06-29 08:40:21,849][12614] Decorrelating experience for 64 frames...
[2023-06-29 08:40:21,898][12613] Decorrelating experience for 32 frames...
[2023-06-29 08:40:21,912][12617] Decorrelating experience for 0 frames...
[2023-06-29 08:40:21,914][12610] Decorrelating experience for 0 frames...
[2023-06-29 08:40:21,945][12625] Decorrelating experience for 32 frames...
[2023-06-29 08:40:21,989][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:23,976][12632] Decorrelating experience for 64 frames...
[2023-06-29 08:40:24,336][12618] Decorrelating experience for 96 frames...
[2023-06-29 08:40:24,365][12629] Decorrelating experience for 32 frames...
[2023-06-29 08:40:24,477][12617] Decorrelating experience for 32 frames...
[2023-06-29 08:40:24,617][12613] Decorrelating experience for 64 frames...
[2023-06-29 08:40:24,975][12615] Decorrelating experience for 64 frames...
[2023-06-29 08:40:24,986][12616] Decorrelating experience for 64 frames...
[2023-06-29 08:40:24,991][12631] Decorrelating experience for 0 frames...
[2023-06-29 08:40:26,999][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:27,048][12632] Decorrelating experience for 96 frames...
[2023-06-29 08:40:27,303][12610] Decorrelating experience for 32 frames...
[2023-06-29 08:40:27,381][12612] Decorrelating experience for 64 frames...
[2023-06-29 08:40:27,730][12619] Decorrelating experience for 32 frames...
[2023-06-29 08:40:27,745][12614] Decorrelating experience for 96 frames...
[2023-06-29 08:40:27,880][12629] Decorrelating experience for 64 frames...
[2023-06-29 08:40:28,159][12627] Decorrelating experience for 64 frames...
[2023-06-29 08:40:28,172][12613] Decorrelating experience for 96 frames...
[2023-06-29 08:40:29,937][12631] Decorrelating experience for 32 frames...
[2023-06-29 08:40:30,055][12615] Decorrelating experience for 96 frames...
[2023-06-29 08:40:30,146][12610] Decorrelating experience for 64 frames...
[2023-06-29 08:40:30,345][12625] Decorrelating experience for 64 frames...
[2023-06-29 08:40:30,510][12618] Decorrelating experience for 128 frames...
[2023-06-29 08:40:30,887][12627] Decorrelating experience for 96 frames...
[2023-06-29 08:40:30,920][12626] Decorrelating experience for 0 frames...
[2023-06-29 08:40:31,130][12614] Decorrelating experience for 128 frames...
[2023-06-29 08:40:31,989][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:32,141][12612] Decorrelating experience for 96 frames...
[2023-06-29 08:40:32,345][12616] Decorrelating experience for 96 frames...
[2023-06-29 08:40:32,600][12615] Decorrelating experience for 128 frames...
[2023-06-29 08:40:32,693][12619] Decorrelating experience for 64 frames...
[2023-06-29 08:40:32,821][12629] Decorrelating experience for 96 frames...
[2023-06-29 08:40:33,115][12610] Decorrelating experience for 96 frames...
[2023-06-29 08:40:33,361][12616] Decorrelating experience for 128 frames...
[2023-06-29 08:40:33,698][12617] Decorrelating experience for 64 frames...
[2023-06-29 08:40:34,118][12614] Decorrelating experience for 160 frames...
[2023-06-29 08:40:34,142][12618] Decorrelating experience for 160 frames...
[2023-06-29 08:40:34,503][12625] Decorrelating experience for 96 frames...
[2023-06-29 08:40:34,767][12616] Decorrelating experience for 160 frames...
[2023-06-29 08:40:34,951][12627] Decorrelating experience for 128 frames...
[2023-06-29 08:40:35,065][12626] Decorrelating experience for 32 frames...
[2023-06-29 08:40:35,581][12617] Decorrelating experience for 96 frames...
[2023-06-29 08:40:35,586][12632] Decorrelating experience for 128 frames...
[2023-06-29 08:40:35,851][12619] Decorrelating experience for 96 frames...
[2023-06-29 08:40:36,228][12618] Decorrelating experience for 192 frames...
[2023-06-29 08:40:36,431][12614] Decorrelating experience for 192 frames...
[2023-06-29 08:40:36,989][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:37,059][12610] Decorrelating experience for 128 frames...
[2023-06-29 08:40:37,393][12626] Decorrelating experience for 64 frames...
[2023-06-29 08:40:37,659][12625] Decorrelating experience for 128 frames...
[2023-06-29 08:40:37,806][12632] Decorrelating experience for 160 frames...
[2023-06-29 08:40:37,822][12612] Decorrelating experience for 128 frames...
[2023-06-29 08:40:37,960][12614] Decorrelating experience for 224 frames...
[2023-06-29 08:40:38,424][12624] Decorrelating experience for 32 frames...
[2023-06-29 08:40:39,052][12618] Decorrelating experience for 224 frames...
[2023-06-29 08:40:39,550][12615] Decorrelating experience for 160 frames...
[2023-06-29 08:40:39,683][12610] Decorrelating experience for 160 frames...
[2023-06-29 08:40:39,685][12617] Decorrelating experience for 128 frames...
[2023-06-29 08:40:39,790][12625] Decorrelating experience for 160 frames...
[2023-06-29 08:40:40,407][12626] Decorrelating experience for 96 frames...
[2023-06-29 08:40:40,568][12613] Decorrelating experience for 128 frames...
[2023-06-29 08:40:41,520][12629] Decorrelating experience for 128 frames...
[2023-06-29 08:40:41,989][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 26.7. Samples: 668. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:41,995][00488] Avg episode reward: [(0, '1.457')]
[2023-06-29 08:40:42,389][12632] Decorrelating experience for 192 frames...
[2023-06-29 08:40:43,339][12624] Decorrelating experience for 64 frames...
[2023-06-29 08:40:43,344][12612] Decorrelating experience for 160 frames...
[2023-06-29 08:40:43,870][12615] Decorrelating experience for 192 frames...
[2023-06-29 08:40:43,923][12613] Decorrelating experience for 160 frames...
[2023-06-29 08:40:44,183][12631] Decorrelating experience for 64 frames...
[2023-06-29 08:40:45,007][12626] Decorrelating experience for 128 frames...
[2023-06-29 08:40:46,337][12619] Decorrelating experience for 128 frames...
[2023-06-29 08:40:46,990][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 44.9. Samples: 1348. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:47,003][00488] Avg episode reward: [(0, '3.219')]
[2023-06-29 08:40:47,192][12625] Decorrelating experience for 192 frames...
[2023-06-29 08:40:47,208][12610] Decorrelating experience for 192 frames...
[2023-06-29 08:40:47,593][12629] Decorrelating experience for 160 frames...
[2023-06-29 08:40:48,115][12612] Decorrelating experience for 192 frames...
[2023-06-29 08:40:49,249][12613] Decorrelating experience for 192 frames...
[2023-06-29 08:40:49,560][12632] Decorrelating experience for 224 frames...
[2023-06-29 08:40:51,832][12631] Decorrelating experience for 96 frames...
[2023-06-29 08:40:51,842][12615] Decorrelating experience for 224 frames...
[2023-06-29 08:40:51,989][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 77.4. Samples: 2708. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:51,992][00488] Avg episode reward: [(0, '3.555')]
[2023-06-29 08:40:53,372][12619] Decorrelating experience for 160 frames...
[2023-06-29 08:40:53,860][12617] Decorrelating experience for 160 frames...
[2023-06-29 08:40:55,277][12627] Another process currently holds the lock /tmp/sf2_root/doom_002.lockfile, attempt: 1
[2023-06-29 08:40:55,346][12625] Decorrelating experience for 224 frames...
[2023-06-29 08:40:55,417][12610] Decorrelating experience for 224 frames...
[2023-06-29 08:40:56,993][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 99.4. Samples: 3976. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-06-29 08:40:56,996][00488] Avg episode reward: [(0, '3.769')]
[2023-06-29 08:40:58,823][12590] Signal inference workers to stop experience collection...
[2023-06-29 08:40:58,863][12611] InferenceWorker_p0-w0: stopping experience collection
[2023-06-29 08:40:59,450][12590] Signal inference workers to resume experience collection...
[2023-06-29 08:40:59,450][12611] InferenceWorker_p0-w0: resuming experience collection
[2023-06-29 08:40:59,487][12624] Decorrelating experience for 96 frames...
[2023-06-29 08:41:01,035][12612] Decorrelating experience for 224 frames...
[2023-06-29 08:41:01,612][12616] Decorrelating experience for 192 frames...
[2023-06-29 08:41:01,644][12631] Decorrelating experience for 128 frames...
[2023-06-29 08:41:01,990][00488] Fps is (10 sec: 819.2, 60 sec: 182.0, 300 sec: 182.0). Total num frames: 8192. Throughput: 0: 102.3. Samples: 4604. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
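First nonzero throughput report: the three FPS columns are rolling averages over 10, 60, and 300 second windows (8192 frames in the last 10 s gives 819.2; the same 8192 frames over the ~45 s since the first report gives 182.0 for the longer windows). A minimal sketch of such a windowed tracker, as an illustration rather than Sample Factory's actual implementation:

    from collections import deque

    class FpsTracker:
        def __init__(self, window_sec):
            self.window_sec = window_sec
            self.samples = deque()  # (timestamp, total_frames) at each report

        def record(self, now, total_frames):
            self.samples.append((now, total_frames))
            # drop samples that fell out of the averaging window
            while now - self.samples[0][0] > self.window_sec:
                self.samples.popleft()
            t0, f0 = self.samples[0]
            return (total_frames - f0) / max(now - t0, 1e-6)

    tracker = FpsTracker(window_sec=10.0)
    tracker.record(0.0, 0)
    print(tracker.record(10.0, 8192))  # -> 819.2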
[2023-06-29 08:41:01,995][00488] Avg episode reward: [(0, '3.676')]
[2023-06-29 08:41:02,279][12619] Decorrelating experience for 192 frames...
[2023-06-29 08:41:02,390][12629] Decorrelating experience for 192 frames...
[2023-06-29 08:41:04,391][12613] Decorrelating experience for 224 frames...
[2023-06-29 08:41:06,583][12626] Decorrelating experience for 160 frames...
[2023-06-29 08:41:06,989][00488] Fps is (10 sec: 2458.5, 60 sec: 491.5, 300 sec: 491.5). Total num frames: 24576. Throughput: 0: 172.0. Samples: 7740. Policy #0 lag: (min: 2.0, avg: 6.6, max: 10.0)
[2023-06-29 08:41:06,998][00488] Avg episode reward: [(0, '3.418')]
[2023-06-29 08:41:07,314][12631] Decorrelating experience for 160 frames...
[2023-06-29 08:41:07,679][12616] Decorrelating experience for 224 frames...
[2023-06-29 08:41:07,852][12619] Decorrelating experience for 224 frames...
[2023-06-29 08:41:07,866][12611] Updated weights for policy 0, policy_version 13 (0.0946)
[2023-06-29 08:41:07,913][12629] Decorrelating experience for 224 frames...
[2023-06-29 08:41:10,260][12627] Decorrelating experience for 160 frames...
[2023-06-29 08:41:11,989][00488] Fps is (10 sec: 3276.8, 60 sec: 744.7, 300 sec: 744.7). Total num frames: 40960. Throughput: 0: 296.0. Samples: 13316. Policy #0 lag: (min: 3.0, avg: 5.3, max: 8.0)
[2023-06-29 08:41:11,991][00488] Avg episode reward: [(0, '3.542')]
[2023-06-29 08:41:12,025][12626] Decorrelating experience for 192 frames...
[2023-06-29 08:41:12,351][12611] Updated weights for policy 0, policy_version 23 (0.0011)
[2023-06-29 08:41:15,262][12617] Another process currently holds the lock /tmp/sf2_root/doom_002.lockfile, attempt: 1
[2023-06-29 08:41:15,789][12624] Decorrelating experience for 128 frames...
[2023-06-29 08:41:16,989][00488] Fps is (10 sec: 3276.8, 60 sec: 955.7, 300 sec: 955.7). Total num frames: 57344. Throughput: 0: 345.6. Samples: 15552. Policy #0 lag: (min: 1.0, avg: 2.9, max: 5.0)
[2023-06-29 08:41:16,991][00488] Avg episode reward: [(0, '3.623')]
[2023-06-29 08:41:18,895][12627] Decorrelating experience for 192 frames...
[2023-06-29 08:41:19,971][12626] Decorrelating experience for 224 frames...
[2023-06-29 08:41:20,764][12611] Updated weights for policy 0, policy_version 34 (0.0011)
[2023-06-29 08:41:21,990][00488] Fps is (10 sec: 3276.7, 60 sec: 1228.8, 300 sec: 1134.3). Total num frames: 73728. Throughput: 0: 438.4. Samples: 19728. Policy #0 lag: (min: 3.0, avg: 4.7, max: 7.0)
[2023-06-29 08:41:21,994][00488] Avg episode reward: [(0, '3.980')]
[2023-06-29 08:41:23,325][12617] Decorrelating experience for 192 frames...
[2023-06-29 08:41:24,612][12624] Decorrelating experience for 160 frames...
[2023-06-29 08:41:26,990][00488] Fps is (10 sec: 2457.4, 60 sec: 1365.5, 300 sec: 1170.3). Total num frames: 81920. Throughput: 0: 517.1. Samples: 23940. Policy #0 lag: (min: 3.0, avg: 4.3, max: 7.0)
[2023-06-29 08:41:26,993][00488] Avg episode reward: [(0, '4.158')]
[2023-06-29 08:41:27,006][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000040_81920.pth...
[2023-06-29 08:41:27,580][12611] Updated weights for policy 0, policy_version 44 (0.0018)
[2023-06-29 08:41:28,572][12627] Decorrelating experience for 224 frames...
[2023-06-29 08:41:31,458][12631] Decorrelating experience for 192 frames...
[2023-06-29 08:41:31,989][00488] Fps is (10 sec: 2457.7, 60 sec: 1638.4, 300 sec: 1310.7). Total num frames: 98304. Throughput: 0: 546.6. Samples: 25944. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0)
[2023-06-29 08:41:31,996][00488] Avg episode reward: [(0, '4.402')]
[2023-06-29 08:41:34,531][12617] Decorrelating experience for 224 frames...
[2023-06-29 08:41:35,573][12611] Updated weights for policy 0, policy_version 54 (0.0015)
[2023-06-29 08:41:35,559][12624] Decorrelating experience for 192 frames...
[2023-06-29 08:41:36,989][00488] Fps is (10 sec: 3277.1, 60 sec: 1911.5, 300 sec: 1433.6). Total num frames: 114688. Throughput: 0: 614.1. Samples: 30344. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0)
[2023-06-29 08:41:36,995][00488] Avg episode reward: [(0, '4.190')]
[2023-06-29 08:41:37,003][12590] Saving new best policy, reward=4.190!
[2023-06-29 08:41:39,198][12631] Decorrelating experience for 224 frames...
[2023-06-29 08:41:40,134][12611] Updated weights for policy 0, policy_version 64 (0.0012)
[2023-06-29 08:41:41,989][00488] Fps is (10 sec: 4096.0, 60 sec: 2321.1, 300 sec: 1638.4). Total num frames: 139264. Throughput: 0: 738.3. Samples: 37196. Policy #0 lag: (min: 3.0, avg: 5.3, max: 11.0)
[2023-06-29 08:41:41,996][00488] Avg episode reward: [(0, '3.992')]
[2023-06-29 08:41:44,609][12611] Updated weights for policy 0, policy_version 74 (0.0012)
[2023-06-29 08:41:45,049][12624] Decorrelating experience for 224 frames...
[2023-06-29 08:41:46,989][00488] Fps is (10 sec: 4096.0, 60 sec: 2594.1, 300 sec: 1729.4). Total num frames: 155648. Throughput: 0: 802.0. Samples: 40692. Policy #0 lag: (min: 3.0, avg: 5.6, max: 11.0)
[2023-06-29 08:41:46,993][00488] Avg episode reward: [(0, '4.153')]
[2023-06-29 08:41:50,984][12611] Updated weights for policy 0, policy_version 84 (0.0012)
[2023-06-29 08:41:51,989][00488] Fps is (10 sec: 3276.8, 60 sec: 2867.2, 300 sec: 1810.9). Total num frames: 172032. Throughput: 0: 828.5. Samples: 45024. Policy #0 lag: (min: 3.0, avg: 7.0, max: 15.0)
[2023-06-29 08:41:51,993][00488] Avg episode reward: [(0, '4.225')]
[2023-06-29 08:41:51,997][12590] Saving new best policy, reward=4.225!
[2023-06-29 08:41:56,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3140.5, 300 sec: 1884.2). Total num frames: 188416. Throughput: 0: 810.9. Samples: 49808. Policy #0 lag: (min: 2.0, avg: 6.6, max: 14.0)
[2023-06-29 08:41:56,992][00488] Avg episode reward: [(0, '4.310')]
[2023-06-29 08:41:57,012][12590] Saving new best policy, reward=4.310!
[2023-06-29 08:41:58,388][12611] Updated weights for policy 0, policy_version 94 (0.0031)
[2023-06-29 08:42:01,992][00488] Fps is (10 sec: 3276.0, 60 sec: 3276.7, 300 sec: 1950.4). Total num frames: 204800. Throughput: 0: 814.7. Samples: 52216. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:42:01,995][00488] Avg episode reward: [(0, '4.474')]
[2023-06-29 08:42:02,001][12590] Saving new best policy, reward=4.474!
[2023-06-29 08:42:04,165][12611] Updated weights for policy 0, policy_version 104 (0.0022)
[2023-06-29 08:42:06,990][00488] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 1936.3). Total num frames: 212992. Throughput: 0: 811.6. Samples: 56252. Policy #0 lag: (min: 3.0, avg: 6.7, max: 11.0)
[2023-06-29 08:42:06,993][00488] Avg episode reward: [(0, '4.579')]
[2023-06-29 08:42:07,009][12590] Saving new best policy, reward=4.579!
[2023-06-29 08:42:11,989][00488] Fps is (10 sec: 2458.2, 60 sec: 3140.3, 300 sec: 1994.6). Total num frames: 229376. Throughput: 0: 805.6. Samples: 60192. Policy #0 lag: (min: 0.0, avg: 4.9, max: 12.0)
[2023-06-29 08:42:11,993][00488] Avg episode reward: [(0, '4.536')]
[2023-06-29 08:42:13,447][12611] Updated weights for policy 0, policy_version 114 (0.0023)
[2023-06-29 08:42:16,989][00488] Fps is (10 sec: 3276.9, 60 sec: 3140.3, 300 sec: 2048.0). Total num frames: 245760. Throughput: 0: 817.0. Samples: 62708. Policy #0 lag: (min: 1.0, avg: 3.4, max: 9.0)
[2023-06-29 08:42:16,991][00488] Avg episode reward: [(0, '4.682')]
[2023-06-29 08:42:17,001][12590] Saving new best policy, reward=4.682!
[2023-06-29 08:42:18,172][12611] Updated weights for policy 0, policy_version 124 (0.0016)
[2023-06-29 08:42:21,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 2097.2). Total num frames: 262144. Throughput: 0: 831.1. Samples: 67744. Policy #0 lag: (min: 3.0, avg: 6.1, max: 15.0)
[2023-06-29 08:42:21,992][00488] Avg episode reward: [(0, '4.707')]
[2023-06-29 08:42:22,001][12590] Saving new best policy, reward=4.707!
[2023-06-29 08:42:26,377][12611] Updated weights for policy 0, policy_version 134 (0.0013)
[2023-06-29 08:42:26,990][00488] Fps is (10 sec: 3276.6, 60 sec: 3276.8, 300 sec: 2142.5). Total num frames: 278528. Throughput: 0: 766.4. Samples: 71684. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:42:26,994][00488] Avg episode reward: [(0, '4.710')]
[2023-06-29 08:42:27,005][12590] Saving new best policy, reward=4.710!
[2023-06-29 08:42:31,989][00488] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 2123.9). Total num frames: 286720. Throughput: 0: 734.5. Samples: 73744. Policy #0 lag: (min: 3.0, avg: 5.7, max: 11.0)
[2023-06-29 08:42:31,997][00488] Avg episode reward: [(0, '4.577')]
[2023-06-29 08:42:32,645][12611] Updated weights for policy 0, policy_version 144 (0.0026)
[2023-06-29 08:42:36,998][00488] Fps is (10 sec: 2455.6, 60 sec: 3139.8, 300 sec: 2164.9). Total num frames: 303104. Throughput: 0: 750.5. Samples: 78804. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:42:37,005][00488] Avg episode reward: [(0, '4.425')]
[2023-06-29 08:42:39,347][12611] Updated weights for policy 0, policy_version 154 (0.0030)
[2023-06-29 08:42:41,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3003.7, 300 sec: 2203.4). Total num frames: 319488. Throughput: 0: 760.4. Samples: 84028. Policy #0 lag: (min: 3.0, avg: 5.7, max: 15.0)
[2023-06-29 08:42:41,994][00488] Avg episode reward: [(0, '4.732')]
[2023-06-29 08:42:42,215][12590] Saving new best policy, reward=4.732!
[2023-06-29 08:42:44,478][12611] Updated weights for policy 0, policy_version 164 (0.0047)
[2023-06-29 08:42:46,989][00488] Fps is (10 sec: 4099.5, 60 sec: 3140.3, 300 sec: 2293.8). Total num frames: 344064. Throughput: 0: 758.7. Samples: 86356. Policy #0 lag: (min: 3.0, avg: 5.2, max: 11.0)
[2023-06-29 08:42:46,992][00488] Avg episode reward: [(0, '4.737')]
[2023-06-29 08:42:47,001][12590] Saving new best policy, reward=4.737!
[2023-06-29 08:42:50,054][12611] Updated weights for policy 0, policy_version 174 (0.0011)
[2023-06-29 08:42:51,990][00488] Fps is (10 sec: 4915.1, 60 sec: 3276.8, 300 sec: 2378.3). Total num frames: 368640. Throughput: 0: 822.9. Samples: 93284. Policy #0 lag: (min: 1.0, avg: 3.6, max: 9.0)
[2023-06-29 08:42:52,000][00488] Avg episode reward: [(0, '4.680')]
[2023-06-29 08:42:53,238][12611] Updated weights for policy 0, policy_version 184 (0.0012)
[2023-06-29 08:42:56,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3413.3, 300 sec: 2457.6). Total num frames: 393216. Throughput: 0: 898.6. Samples: 100628. Policy #0 lag: (min: 3.0, avg: 5.6, max: 11.0)
[2023-06-29 08:42:56,997][00488] Avg episode reward: [(0, '4.815')]
[2023-06-29 08:42:57,008][12590] Saving new best policy, reward=4.815!
[2023-06-29 08:42:59,141][12611] Updated weights for policy 0, policy_version 194 (0.0023)
[2023-06-29 08:43:01,989][00488] Fps is (10 sec: 4096.1, 60 sec: 3413.5, 300 sec: 2482.4). Total num frames: 409600. Throughput: 0: 897.6. Samples: 103100. Policy #0 lag: (min: 1.0, avg: 3.8, max: 9.0)
[2023-06-29 08:43:01,992][00488] Avg episode reward: [(0, '4.935')]
[2023-06-29 08:43:01,994][12590] Saving new best policy, reward=4.935!
[2023-06-29 08:43:03,688][12611] Updated weights for policy 0, policy_version 204 (0.0029)
[2023-06-29 08:43:06,991][00488] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 2505.8). Total num frames: 425984. Throughput: 0: 897.5. Samples: 108132. Policy #0 lag: (min: 3.0, avg: 5.8, max: 11.0)
[2023-06-29 08:43:06,993][00488] Avg episode reward: [(0, '5.162')]
[2023-06-29 08:43:07,008][12590] Saving new best policy, reward=5.162!
[2023-06-29 08:43:10,726][12611] Updated weights for policy 0, policy_version 214 (0.0025)
[2023-06-29 08:43:11,991][00488] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 2527.8). Total num frames: 442368. Throughput: 0: 923.0. Samples: 113220. Policy #0 lag: (min: 3.0, avg: 4.7, max: 11.0)
[2023-06-29 08:43:11,997][00488] Avg episode reward: [(0, '4.962')]
[2023-06-29 08:43:16,068][12611] Updated weights for policy 0, policy_version 224 (0.0017)
[2023-06-29 08:43:16,989][00488] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 2548.6). Total num frames: 458752. Throughput: 0: 932.9. Samples: 115724. Policy #0 lag: (min: 3.0, avg: 5.1, max: 11.0)
[2023-06-29 08:43:16,997][00488] Avg episode reward: [(0, '4.987')]
[2023-06-29 08:43:21,989][00488] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 2568.3). Total num frames: 475136. Throughput: 0: 942.9. Samples: 121228. Policy #0 lag: (min: 3.0, avg: 5.3, max: 11.0)
[2023-06-29 08:43:21,992][00488] Avg episode reward: [(0, '4.728')]
[2023-06-29 08:43:22,454][12611] Updated weights for policy 0, policy_version 234 (0.0023)
[2023-06-29 08:43:25,887][12611] Updated weights for policy 0, policy_version 244 (0.0011)
[2023-06-29 08:43:26,990][00488] Fps is (10 sec: 4915.2, 60 sec: 3823.0, 300 sec: 2673.2). Total num frames: 507904. Throughput: 0: 995.6. Samples: 128832. Policy #0 lag: (min: 3.0, avg: 5.3, max: 11.0)
[2023-06-29 08:43:26,991][00488] Avg episode reward: [(0, '4.547')]
[2023-06-29 08:43:27,010][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000248_507904.pth...
[2023-06-29 08:43:30,413][12611] Updated weights for policy 0, policy_version 254 (0.0014)
[2023-06-29 08:43:31,990][00488] Fps is (10 sec: 4915.1, 60 sec: 3959.4, 300 sec: 2688.7). Total num frames: 524288. Throughput: 0: 1022.8. Samples: 132384. Policy #0 lag: (min: 3.0, avg: 7.1, max: 12.0)
[2023-06-29 08:43:31,995][00488] Avg episode reward: [(0, '4.978')]
[2023-06-29 08:43:35,507][12611] Updated weights for policy 0, policy_version 264 (0.0015)
[2023-06-29 08:43:36,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3960.0, 300 sec: 2703.4). Total num frames: 540672. Throughput: 0: 981.2. Samples: 137436. Policy #0 lag: (min: 3.0, avg: 5.5, max: 11.0)
[2023-06-29 08:43:36,991][00488] Avg episode reward: [(0, '5.159')]
[2023-06-29 08:43:41,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3959.4, 300 sec: 2717.3). Total num frames: 557056. Throughput: 0: 933.7. Samples: 142644. Policy #0 lag: (min: 1.0, avg: 3.8, max: 13.0)
[2023-06-29 08:43:41,992][00488] Avg episode reward: [(0, '5.243')]
[2023-06-29 08:43:41,994][12590] Saving new best policy, reward=5.243!
[2023-06-29 08:43:43,427][12611] Updated weights for policy 0, policy_version 274 (0.0014)
[2023-06-29 08:43:46,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 2730.7). Total num frames: 573440. Throughput: 0: 931.7. Samples: 145028. Policy #0 lag: (min: 2.0, avg: 7.6, max: 11.0)
[2023-06-29 08:43:46,992][00488] Avg episode reward: [(0, '4.992')]
[2023-06-29 08:43:49,967][12611] Updated weights for policy 0, policy_version 285 (0.0011)
[2023-06-29 08:43:51,990][00488] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 2743.4). Total num frames: 589824. Throughput: 0: 930.9. Samples: 150020. Policy #0 lag: (min: 2.0, avg: 7.7, max: 14.0)
[2023-06-29 08:43:51,996][00488] Avg episode reward: [(0, '5.131')]
[2023-06-29 08:43:54,525][12611] Updated weights for policy 0, policy_version 295 (0.0018)
[2023-06-29 08:43:56,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 2792.7). Total num frames: 614400. Throughput: 0: 970.8. Samples: 156904. Policy #0 lag: (min: 1.0, avg: 4.8, max: 9.0)
[2023-06-29 08:43:56,995][00488] Avg episode reward: [(0, '5.229')]
[2023-06-29 08:43:58,802][12611] Updated weights for policy 0, policy_version 306 (0.0012)
[2023-06-29 08:44:01,989][00488] Fps is (10 sec: 4915.4, 60 sec: 3822.9, 300 sec: 2839.9). Total num frames: 638976. Throughput: 0: 998.3. Samples: 160648. Policy #0 lag: (min: 3.0, avg: 7.1, max: 11.0)
[2023-06-29 08:44:01,995][00488] Avg episode reward: [(0, '5.497')]
[2023-06-29 08:44:02,140][12590] Saving new best policy, reward=5.497!
[2023-06-29 08:44:02,159][12611] Updated weights for policy 0, policy_version 316 (0.0019)
[2023-06-29 08:44:06,997][00488] Fps is (10 sec: 4911.5, 60 sec: 3959.1, 300 sec: 2884.9). Total num frames: 663552. Throughput: 0: 1009.1. Samples: 166644. Policy #0 lag: (min: 3.0, avg: 4.8, max: 11.0)
[2023-06-29 08:44:06,999][00488] Avg episode reward: [(0, '5.216')]
[2023-06-29 08:44:09,941][12611] Updated weights for policy 0, policy_version 326 (0.0011)
[2023-06-29 08:44:11,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 2893.3). Total num frames: 679936. Throughput: 0: 954.8. Samples: 171800. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:44:11,992][00488] Avg episode reward: [(0, '5.206')]
[2023-06-29 08:44:14,886][12611] Updated weights for policy 0, policy_version 336 (0.0058)
[2023-06-29 08:44:16,989][00488] Fps is (10 sec: 2459.4, 60 sec: 3822.9, 300 sec: 2867.2). Total num frames: 688128. Throughput: 0: 933.1. Samples: 174372. Policy #0 lag: (min: 3.0, avg: 5.2, max: 11.0)
[2023-06-29 08:44:16,997][00488] Avg episode reward: [(0, '5.147')]
[2023-06-29 08:44:21,989][00488] Fps is (10 sec: 2457.6, 60 sec: 3822.9, 300 sec: 2875.6). Total num frames: 704512. Throughput: 0: 934.0. Samples: 179468. Policy #0 lag: (min: 3.0, avg: 6.2, max: 11.0)
[2023-06-29 08:44:21,994][00488] Avg episode reward: [(0, '5.308')]
[2023-06-29 08:44:22,486][12611] Updated weights for policy 0, policy_version 346 (0.0012)
[2023-06-29 08:44:26,806][12611] Updated weights for policy 0, policy_version 356 (0.0035)
[2023-06-29 08:44:26,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 2916.4). Total num frames: 729088. Throughput: 0: 938.2. Samples: 184864. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:44:26,999][00488] Avg episode reward: [(0, '5.471')]
[2023-06-29 08:44:30,892][12611] Updated weights for policy 0, policy_version 366 (0.0012)
[2023-06-29 08:44:31,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3823.0, 300 sec: 2955.5). Total num frames: 753664. Throughput: 0: 972.7. Samples: 188800. Policy #0 lag: (min: 3.0, avg: 5.6, max: 11.0)
[2023-06-29 08:44:31,992][00488] Avg episode reward: [(0, '5.572')]
[2023-06-29 08:44:31,998][12590] Saving new best policy, reward=5.572!
[2023-06-29 08:44:34,350][12611] Updated weights for policy 0, policy_version 376 (0.0011)
[2023-06-29 08:44:36,994][00488] Fps is (10 sec: 4094.1, 60 sec: 3822.6, 300 sec: 2961.7). Total num frames: 770048. Throughput: 0: 1025.6. Samples: 196176. Policy #0 lag: (min: 3.0, avg: 5.3, max: 11.0)
[2023-06-29 08:44:36,996][00488] Avg episode reward: [(0, '5.569')]
[2023-06-29 08:44:41,102][12611] Updated weights for policy 0, policy_version 386 (0.0018)
[2023-06-29 08:44:41,992][00488] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 2998.6). Total num frames: 794624. Throughput: 0: 985.5. Samples: 201256. Policy #0 lag: (min: 3.0, avg: 5.8, max: 11.0)
[2023-06-29 08:44:41,997][00488] Avg episode reward: [(0, '5.761')]
[2023-06-29 08:44:42,003][12590] Saving new best policy, reward=5.761!
[2023-06-29 08:44:45,775][12611] Updated weights for policy 0, policy_version 396 (0.0012)
[2023-06-29 08:44:46,990][00488] Fps is (10 sec: 4097.8, 60 sec: 3959.5, 300 sec: 3003.7). Total num frames: 811008. Throughput: 0: 956.4. Samples: 203688. Policy #0 lag: (min: 1.0, avg: 3.4, max: 9.0)
[2023-06-29 08:44:46,994][00488] Avg episode reward: [(0, '6.072')]
[2023-06-29 08:44:47,000][12590] Saving new best policy, reward=6.072!
[2023-06-29 08:44:51,993][00488] Fps is (10 sec: 3276.5, 60 sec: 3959.3, 300 sec: 3008.7). Total num frames: 827392. Throughput: 0: 937.2. Samples: 208812. Policy #0 lag: (min: 1.0, avg: 3.4, max: 9.0)
[2023-06-29 08:44:51,998][00488] Avg episode reward: [(0, '6.263')]
[2023-06-29 08:44:52,000][12590] Saving new best policy, reward=6.263!
[2023-06-29 08:44:53,455][12611] Updated weights for policy 0, policy_version 406 (0.0031)
[2023-06-29 08:44:56,995][00488] Fps is (10 sec: 3275.1, 60 sec: 3822.6, 300 sec: 3013.4). Total num frames: 843776. Throughput: 0: 933.4. Samples: 213808. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:44:56,998][00488] Avg episode reward: [(0, '6.064')]
[2023-06-29 08:44:58,043][12611] Updated weights for policy 0, policy_version 416 (0.0016)
[2023-06-29 08:45:01,989][00488] Fps is (10 sec: 4097.4, 60 sec: 3822.9, 300 sec: 3046.8). Total num frames: 868352. Throughput: 0: 943.0. Samples: 216808. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:45:01,997][00488] Avg episode reward: [(0, '5.739')]
[2023-06-29 08:45:03,977][12611] Updated weights for policy 0, policy_version 426 (0.0038)
[2023-06-29 08:45:06,996][00488] Fps is (10 sec: 4095.5, 60 sec: 3686.5, 300 sec: 3050.7). Total num frames: 884736. Throughput: 0: 994.8. Samples: 224240. Policy #0 lag: (min: 2.0, avg: 8.0, max: 17.0)
[2023-06-29 08:45:07,001][00488] Avg episode reward: [(0, '5.840')]
[2023-06-29 08:45:07,331][12611] Updated weights for policy 0, policy_version 436 (0.0012)
[2023-06-29 08:45:11,992][00488] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3082.4). Total num frames: 909312. Throughput: 0: 1008.7. Samples: 230260. Policy #0 lag: (min: 1.0, avg: 4.2, max: 13.0)
[2023-06-29 08:45:11,995][00488] Avg episode reward: [(0, '6.204')]
[2023-06-29 08:45:13,900][12611] Updated weights for policy 0, policy_version 447 (0.0012)
[2023-06-29 08:45:16,989][00488] Fps is (10 sec: 4098.7, 60 sec: 3959.5, 300 sec: 3138.0). Total num frames: 925696. Throughput: 0: 977.6. Samples: 232792. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:45:16,994][00488] Avg episode reward: [(0, '6.764')]
[2023-06-29 08:45:17,005][12590] Saving new best policy, reward=6.764!
[2023-06-29 08:45:20,513][12611] Updated weights for policy 0, policy_version 457 (0.0012)
[2023-06-29 08:45:21,989][00488] Fps is (10 sec: 3277.6, 60 sec: 3959.5, 300 sec: 3193.6). Total num frames: 942080. Throughput: 0: 923.6. Samples: 237732. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:45:21,992][00488] Avg episode reward: [(0, '6.830')]
[2023-06-29 08:45:22,002][12590] Saving new best policy, reward=6.830!
[2023-06-29 08:45:25,251][12611] Updated weights for policy 0, policy_version 467 (0.0011)
[2023-06-29 08:45:26,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3249.0). Total num frames: 958464. Throughput: 0: 924.0. Samples: 242836. Policy #0 lag: (min: 3.0, avg: 5.6, max: 15.0)
[2023-06-29 08:45:26,994][00488] Avg episode reward: [(0, '6.693')]
[2023-06-29 08:45:27,006][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000468_958464.pth...
[2023-06-29 08:45:27,245][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000040_81920.pth
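The learner rotates checkpoints: each new save is paired with removal of the oldest file, and the filename encodes the policy version and total environment frames (here version 468 at 958464 frames, matching the "Total num frames" at save time). A hedged sketch of inspecting such a file offline; torch.load works on these .pth files, but the layout of the saved dict is an assumption not confirmed by this log.

    import torch

    ckpt_path = ("/content/train_dir/default_experiment/checkpoint_p0/"
                 "checkpoint_000000468_958464.pth")
    ckpt = torch.load(ckpt_path, map_location="cpu")
    print(sorted(ckpt.keys()))  # expected to include model weights and step counters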
[2023-06-29 08:45:31,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 924.7. Samples: 245300. Policy #0 lag: (min: 0.0, avg: 4.8, max: 12.0)
[2023-06-29 08:45:31,991][00488] Avg episode reward: [(0, '6.738')]
[2023-06-29 08:45:32,418][12611] Updated weights for policy 0, policy_version 477 (0.0022)
[2023-06-29 08:45:35,538][12611] Updated weights for policy 0, policy_version 487 (0.0018)
[2023-06-29 08:45:36,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3823.2, 300 sec: 3387.9). Total num frames: 999424. Throughput: 0: 957.3. Samples: 251888. Policy #0 lag: (min: 3.0, avg: 6.3, max: 12.0)
[2023-06-29 08:45:36,996][00488] Avg episode reward: [(0, '7.639')]
[2023-06-29 08:45:37,002][12590] Saving new best policy, reward=7.639!
[2023-06-29 08:45:41,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.6, 300 sec: 3443.4). Total num frames: 1015808. Throughput: 0: 972.3. Samples: 257556. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:45:41,992][00488] Avg episode reward: [(0, '8.516')]
[2023-06-29 08:45:41,994][12590] Saving new best policy, reward=8.516!
[2023-06-29 08:45:43,235][12611] Updated weights for policy 0, policy_version 497 (0.0019)
[2023-06-29 08:45:46,992][00488] Fps is (10 sec: 3276.0, 60 sec: 3686.3, 300 sec: 3498.9). Total num frames: 1032192. Throughput: 0: 947.2. Samples: 259436. Policy #0 lag: (min: 3.0, avg: 5.8, max: 15.0)
[2023-06-29 08:45:46,999][00488] Avg episode reward: [(0, '8.438')]
[2023-06-29 08:45:50,091][12611] Updated weights for policy 0, policy_version 507 (0.0012)
[2023-06-29 08:45:51,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.6, 300 sec: 3554.5). Total num frames: 1048576. Throughput: 0: 870.3. Samples: 263396. Policy #0 lag: (min: 2.0, avg: 7.2, max: 14.0)
[2023-06-29 08:45:51,995][00488] Avg episode reward: [(0, '8.568')]
[2023-06-29 08:45:52,002][12590] Saving new best policy, reward=8.568!
[2023-06-29 08:45:56,990][00488] Fps is (10 sec: 2458.1, 60 sec: 3550.2, 300 sec: 3554.5). Total num frames: 1056768. Throughput: 0: 822.0. Samples: 267248. Policy #0 lag: (min: 3.0, avg: 5.0, max: 11.0)
[2023-06-29 08:45:56,992][00488] Avg episode reward: [(0, '7.928')]
[2023-06-29 08:45:57,605][12611] Updated weights for policy 0, policy_version 517 (0.0014)
[2023-06-29 08:46:01,989][00488] Fps is (10 sec: 1638.4, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 1064960. Throughput: 0: 808.1. Samples: 269156. Policy #0 lag: (min: 3.0, avg: 5.0, max: 11.0)
[2023-06-29 08:46:01,993][00488] Avg episode reward: [(0, '7.865')]
[2023-06-29 08:46:04,335][12611] Updated weights for policy 0, policy_version 527 (0.0016)
[2023-06-29 08:46:06,989][00488] Fps is (10 sec: 2457.7, 60 sec: 3277.2, 300 sec: 3526.7). Total num frames: 1081344. Throughput: 0: 781.7. Samples: 272908. Policy #0 lag: (min: 3.0, avg: 5.1, max: 11.0)
[2023-06-29 08:46:06,992][00488] Avg episode reward: [(0, '8.365')]
[2023-06-29 08:46:11,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3140.4, 300 sec: 3526.7). Total num frames: 1097728. Throughput: 0: 776.1. Samples: 277760. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:46:11,992][00488] Avg episode reward: [(0, '8.882')]
[2023-06-29 08:46:11,994][12590] Saving new best policy, reward=8.882!
[2023-06-29 08:46:13,165][12611] Updated weights for policy 0, policy_version 537 (0.0026)
[2023-06-29 08:46:16,556][12611] Updated weights for policy 0, policy_version 547 (0.0012)
[2023-06-29 08:46:16,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3554.5). Total num frames: 1122304. Throughput: 0: 799.5. Samples: 281276. Policy #0 lag: (min: 3.0, avg: 5.7, max: 11.0)
[2023-06-29 08:46:16,992][00488] Avg episode reward: [(0, '7.876')]
[2023-06-29 08:46:20,299][12611] Updated weights for policy 0, policy_version 557 (0.0011)
[2023-06-29 08:46:21,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1146880. Throughput: 0: 822.5. Samples: 288900. Policy #0 lag: (min: 3.0, avg: 5.5, max: 11.0)
[2023-06-29 08:46:21,992][00488] Avg episode reward: [(0, '7.267')]
[2023-06-29 08:46:24,186][12611] Updated weights for policy 0, policy_version 567 (0.0020)
[2023-06-29 08:46:26,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1163264. Throughput: 0: 826.8. Samples: 294764. Policy #0 lag: (min: 3.0, avg: 5.1, max: 11.0)
[2023-06-29 08:46:26,996][00488] Avg episode reward: [(0, '7.475')]
[2023-06-29 08:46:31,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1179648. Throughput: 0: 839.7. Samples: 297220. Policy #0 lag: (min: 1.0, avg: 2.8, max: 9.0)
[2023-06-29 08:46:32,003][00488] Avg episode reward: [(0, '8.063')]
[2023-06-29 08:46:32,399][12611] Updated weights for policy 0, policy_version 577 (0.0024)
[2023-06-29 08:46:36,990][00488] Fps is (10 sec: 3276.6, 60 sec: 3276.8, 300 sec: 3582.3). Total num frames: 1196032. Throughput: 0: 868.8. Samples: 302492. Policy #0 lag: (min: 3.0, avg: 5.0, max: 11.0)
[2023-06-29 08:46:36,998][00488] Avg episode reward: [(0, '8.351')]
[2023-06-29 08:46:37,393][12611] Updated weights for policy 0, policy_version 587 (0.0013)
[2023-06-29 08:46:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3582.3). Total num frames: 1212416. Throughput: 0: 898.4. Samples: 307676. Policy #0 lag: (min: 3.0, avg: 5.8, max: 11.0)
[2023-06-29 08:46:41,993][00488] Avg episode reward: [(0, '8.308')]
[2023-06-29 08:46:43,269][12611] Updated weights for policy 0, policy_version 597 (0.0034)
[2023-06-29 08:46:46,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3610.0). Total num frames: 1236992. Throughput: 0: 911.2. Samples: 310160. Policy #0 lag: (min: 3.0, avg: 5.1, max: 11.0)
[2023-06-29 08:46:46,997][00488] Avg episode reward: [(0, '8.526')]
[2023-06-29 08:46:47,172][12611] Updated weights for policy 0, policy_version 607 (0.0024)
[2023-06-29 08:46:51,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 1261568. Throughput: 0: 990.9. Samples: 317500. Policy #0 lag: (min: 3.0, avg: 5.5, max: 15.0)
[2023-06-29 08:46:51,992][00488] Avg episode reward: [(0, '8.849')]
[2023-06-29 08:46:52,642][12611] Updated weights for policy 0, policy_version 617 (0.0011)
[2023-06-29 08:46:56,154][12611] Updated weights for policy 0, policy_version 627 (0.0012)
[2023-06-29 08:46:56,989][00488] Fps is (10 sec: 4915.4, 60 sec: 3823.0, 300 sec: 3665.6). Total num frames: 1286144. Throughput: 0: 1036.4. Samples: 324396. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:46:56,997][00488] Avg episode reward: [(0, '9.937')]
[2023-06-29 08:46:57,005][12590] Saving new best policy, reward=9.937!
[2023-06-29 08:47:01,993][00488] Fps is (10 sec: 4094.4, 60 sec: 3959.2, 300 sec: 3693.3). Total num frames: 1302528. Throughput: 0: 1013.7. Samples: 326896. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:47:01,996][00488] Avg episode reward: [(0, '10.489')]
[2023-06-29 08:47:02,004][12590] Saving new best policy, reward=10.489!
[2023-06-29 08:47:02,918][12611] Updated weights for policy 0, policy_version 637 (0.0011)
[2023-06-29 08:47:06,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 1318912. Throughput: 0: 959.3. Samples: 332068. Policy #0 lag: (min: 3.0, avg: 5.4, max: 11.0)
[2023-06-29 08:47:06,994][00488] Avg episode reward: [(0, '10.634')]
[2023-06-29 08:47:07,004][12590] Saving new best policy, reward=10.634!
[2023-06-29 08:47:07,570][12611] Updated weights for policy 0, policy_version 647 (0.0011)
[2023-06-29 08:47:11,989][00488] Fps is (10 sec: 3278.1, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 1335296. Throughput: 0: 938.1. Samples: 336980. Policy #0 lag: (min: 3.0, avg: 5.7, max: 15.0)
[2023-06-29 08:47:11,993][00488] Avg episode reward: [(0, '11.191')]
[2023-06-29 08:47:11,999][12590] Saving new best policy, reward=11.191!
[2023-06-29 08:47:15,958][12611] Updated weights for policy 0, policy_version 657 (0.0040)
[2023-06-29 08:47:16,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 1351680. Throughput: 0: 938.0. Samples: 339432. Policy #0 lag: (min: 2.0, avg: 7.3, max: 14.0)
[2023-06-29 08:47:16,995][00488] Avg episode reward: [(0, '11.341')]
[2023-06-29 08:47:17,006][12590] Saving new best policy, reward=11.341!
[2023-06-29 08:47:19,575][12611] Updated weights for policy 0, policy_version 667 (0.0027)
[2023-06-29 08:47:21,990][00488] Fps is (10 sec: 4095.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 1376256. Throughput: 0: 958.5. Samples: 345624. Policy #0 lag: (min: 2.0, avg: 8.4, max: 14.0)
[2023-06-29 08:47:21,998][00488] Avg episode reward: [(0, '10.698')]
[2023-06-29 08:47:24,760][12611] Updated weights for policy 0, policy_version 677 (0.0012)
[2023-06-29 08:47:26,990][00488] Fps is (10 sec: 4915.3, 60 sec: 3959.4, 300 sec: 3776.6). Total num frames: 1400832. Throughput: 0: 1012.6. Samples: 353244. Policy #0 lag: (min: 0.0, avg: 5.3, max: 12.0)
[2023-06-29 08:47:26,991][00488] Avg episode reward: [(0, '11.114')]
[2023-06-29 08:47:27,001][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000684_1400832.pth...
[2023-06-29 08:47:27,247][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000248_507904.pth
[2023-06-29 08:47:28,314][12611] Updated weights for policy 0, policy_version 687 (0.0012)
[2023-06-29 08:47:31,989][00488] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3776.8). Total num frames: 1417216. Throughput: 0: 1025.3. Samples: 356296. Policy #0 lag: (min: 3.0, avg: 7.1, max: 11.0)
[2023-06-29 08:47:31,996][00488] Avg episode reward: [(0, '12.016')]
[2023-06-29 08:47:32,000][12590] Saving new best policy, reward=12.016!
[2023-06-29 08:47:34,716][12611] Updated weights for policy 0, policy_version 697 (0.0011)
[2023-06-29 08:47:36,993][00488] Fps is (10 sec: 3275.8, 60 sec: 3959.3, 300 sec: 3776.6). Total num frames: 1433600. Throughput: 0: 973.2. Samples: 361296. Policy #0 lag: (min: 0.0, avg: 5.8, max: 12.0)
[2023-06-29 08:47:36,995][00488] Avg episode reward: [(0, '12.829')]
[2023-06-29 08:47:37,014][12590] Saving new best policy, reward=12.829!
[2023-06-29 08:47:40,113][12611] Updated weights for policy 0, policy_version 708 (0.0039)
[2023-06-29 08:47:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 1449984. Throughput: 0: 932.4. Samples: 366356. Policy #0 lag: (min: 2.0, avg: 6.9, max: 10.0)
[2023-06-29 08:47:41,992][00488] Avg episode reward: [(0, '12.455')]
[2023-06-29 08:47:46,989][00488] Fps is (10 sec: 3277.9, 60 sec: 3823.0, 300 sec: 3721.1). Total num frames: 1466368. Throughput: 0: 932.2. Samples: 368840. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:47:46,997][00488] Avg episode reward: [(0, '12.834')]
[2023-06-29 08:47:47,008][12590] Saving new best policy, reward=12.834!
[2023-06-29 08:47:47,482][12611] Updated weights for policy 0, policy_version 718 (0.0027)
[2023-06-29 08:47:51,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1482752. Throughput: 0: 928.5. Samples: 373852. Policy #0 lag: (min: 3.0, avg: 5.5, max: 11.0)
[2023-06-29 08:47:51,998][00488] Avg episode reward: [(0, '12.167')]
[2023-06-29 08:47:52,180][12611] Updated weights for policy 0, policy_version 728 (0.0014)
[2023-06-29 08:47:56,331][12611] Updated weights for policy 0, policy_version 738 (0.0013)
[2023-06-29 08:47:56,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1515520. Throughput: 0: 981.8. Samples: 381160. Policy #0 lag: (min: 1.0, avg: 3.2, max: 9.0)
[2023-06-29 08:47:56,995][00488] Avg episode reward: [(0, '12.361')]
[2023-06-29 08:47:59,659][12611] Updated weights for policy 0, policy_version 748 (0.0021)
[2023-06-29 08:48:01,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3823.2, 300 sec: 3748.9). Total num frames: 1531904. Throughput: 0: 1009.6. Samples: 384864. Policy #0 lag: (min: 3.0, avg: 6.8, max: 11.0)
[2023-06-29 08:48:01,996][00488] Avg episode reward: [(0, '13.340')]
[2023-06-29 08:48:02,179][12590] Saving new best policy, reward=13.340!
[2023-06-29 08:48:06,703][12611] Updated weights for policy 0, policy_version 758 (0.0036)
[2023-06-29 08:48:06,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1556480. Throughput: 0: 998.6. Samples: 390560. Policy #0 lag: (min: 3.0, avg: 5.6, max: 11.0)
[2023-06-29 08:48:06,992][00488] Avg episode reward: [(0, '13.187')]
[2023-06-29 08:48:11,973][12611] Updated weights for policy 0, policy_version 768 (0.0013)
[2023-06-29 08:48:11,997][00488] Fps is (10 sec: 4093.0, 60 sec: 3959.0, 300 sec: 3776.6). Total num frames: 1572864. Throughput: 0: 938.7. Samples: 395492. Policy #0 lag: (min: 2.0, avg: 7.3, max: 14.0)
[2023-06-29 08:48:11,999][00488] Avg episode reward: [(0, '13.353')]
[2023-06-29 08:48:12,001][12590] Saving new best policy, reward=13.353!
[2023-06-29 08:48:16,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1589248. Throughput: 0: 926.8. Samples: 398004. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:48:16,994][00488] Avg episode reward: [(0, '12.863')]
[2023-06-29 08:48:19,327][12611] Updated weights for policy 0, policy_version 778 (0.0011)
[2023-06-29 08:48:21,990][00488] Fps is (10 sec: 3278.9, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 1605632. Throughput: 0: 930.1. Samples: 403148. Policy #0 lag: (min: 1.0, avg: 5.1, max: 13.0)
[2023-06-29 08:48:21,998][00488] Avg episode reward: [(0, '13.339')]
[2023-06-29 08:48:24,062][12611] Updated weights for policy 0, policy_version 788 (0.0053)
[2023-06-29 08:48:26,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 1622016. Throughput: 0: 953.1. Samples: 409244. Policy #0 lag: (min: 3.0, avg: 5.5, max: 11.0)
[2023-06-29 08:48:26,996][00488] Avg episode reward: [(0, '13.560')]
[2023-06-29 08:48:27,007][12590] Saving new best policy, reward=13.560!
[2023-06-29 08:48:28,458][12611] Updated weights for policy 0, policy_version 798 (0.0012)
[2023-06-29 08:48:31,950][12611] Updated weights for policy 0, policy_version 808 (0.0017)
[2023-06-29 08:48:31,989][00488] Fps is (10 sec: 4915.7, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1654784. Throughput: 0: 980.9. Samples: 412980. Policy #0 lag: (min: 2.0, avg: 6.6, max: 10.0)
[2023-06-29 08:48:31,995][00488] Avg episode reward: [(0, '12.935')]
[2023-06-29 08:48:36,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.7, 300 sec: 3776.7). Total num frames: 1671168. Throughput: 0: 1023.7. Samples: 419920. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0)
[2023-06-29 08:48:36,996][00488] Avg episode reward: [(0, '12.899')]
[2023-06-29 08:48:38,227][12611] Updated weights for policy 0, policy_version 819 (0.0012)
[2023-06-29 08:48:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1687552. Throughput: 0: 974.4. Samples: 425008. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:48:41,993][00488] Avg episode reward: [(0, '13.202')]
[2023-06-29 08:48:45,075][12611] Updated weights for policy 0, policy_version 829 (0.0017)
[2023-06-29 08:48:46,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3959.4, 300 sec: 3776.6). Total num frames: 1703936. Throughput: 0: 949.9. Samples: 427612. Policy #0 lag: (min: 3.0, avg: 6.8, max: 15.0)
[2023-06-29 08:48:46,992][00488] Avg episode reward: [(0, '14.215')]
[2023-06-29 08:48:47,006][12590] Saving new best policy, reward=14.215!
[2023-06-29 08:48:50,474][12611] Updated weights for policy 0, policy_version 839 (0.0033)
[2023-06-29 08:48:51,991][00488] Fps is (10 sec: 3276.3, 60 sec: 3959.4, 300 sec: 3748.9). Total num frames: 1720320. Throughput: 0: 932.2. Samples: 432512. Policy #0 lag: (min: 0.0, avg: 4.4, max: 12.0)
[2023-06-29 08:48:51,996][00488] Avg episode reward: [(0, '14.693')]
[2023-06-29 08:48:52,000][12590] Saving new best policy, reward=14.693!
[2023-06-29 08:48:56,989][00488] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 1736704. Throughput: 0: 935.2. Samples: 437568. Policy #0 lag: (min: 3.0, avg: 7.1, max: 15.0)
[2023-06-29 08:48:56,992][00488] Avg episode reward: [(0, '15.477')]
[2023-06-29 08:48:57,000][12590] Saving new best policy, reward=15.477!
[2023-06-29 08:48:57,622][12611] Updated weights for policy 0, policy_version 849 (0.0017)
[2023-06-29 08:49:01,002][12611] Updated weights for policy 0, policy_version 859 (0.0011)
[2023-06-29 08:49:01,989][00488] Fps is (10 sec: 4096.6, 60 sec: 3822.9, 300 sec: 3721.2). Total num frames: 1761280. Throughput: 0: 957.2. Samples: 441076. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:49:01,992][00488] Avg episode reward: [(0, '17.223')]
[2023-06-29 08:49:01,996][12590] Saving new best policy, reward=17.223!
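
"Saving new best policy, reward=…!" fires whenever the running average episode reward beats the best value seen so far, so a best-policy snapshot is kept alongside the periodic checkpoints. A minimal sketch of that keep-the-best pattern, assuming a PyTorch model and hypothetical names:

    import os
    import torch

    def maybe_save_best(model, avg_reward, state, checkpoint_dir):
        """Persist the policy only when avg_reward improves on the best so far."""
        if avg_reward > state.get("best_reward", float("-inf")):
            state["best_reward"] = avg_reward
            path = os.path.join(checkpoint_dir, f"best_{avg_reward:.3f}.pth")
            torch.save({"model": model.state_dict(), "reward": avg_reward}, path)
            print(f"Saving new best policy, reward={avg_reward:.3f}!")
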
[2023-06-29 08:49:05,726][12611] Updated weights for policy 0, policy_version 869 (0.0017)
[2023-06-29 08:49:06,990][00488] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1785856. Throughput: 0: 1009.1. Samples: 448556. Policy #0 lag: (min: 2.0, avg: 6.7, max: 14.0)
[2023-06-29 08:49:06,997][00488] Avg episode reward: [(0, '16.577')]
[2023-06-29 08:49:10,043][12611] Updated weights for policy 0, policy_version 880 (0.0019)
[2023-06-29 08:49:11,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3823.4, 300 sec: 3776.7). Total num frames: 1802240. Throughput: 0: 990.9. Samples: 453836. Policy #0 lag: (min: 3.0, avg: 6.9, max: 13.0)
[2023-06-29 08:49:11,992][00488] Avg episode reward: [(0, '16.184')]
[2023-06-29 08:49:16,991][00488] Fps is (10 sec: 3276.4, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 1818624. Throughput: 0: 951.1. Samples: 455780. Policy #0 lag: (min: 3.0, avg: 6.2, max: 11.0)
[2023-06-29 08:49:16,996][00488] Avg episode reward: [(0, '16.341')]
[2023-06-29 08:49:19,726][12611] Updated weights for policy 0, policy_version 890 (0.0027)
[2023-06-29 08:49:21,990][00488] Fps is (10 sec: 2457.5, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 1826816. Throughput: 0: 883.1. Samples: 459660. Policy #0 lag: (min: 3.0, avg: 6.2, max: 11.0)
[2023-06-29 08:49:21,992][00488] Avg episode reward: [(0, '15.547')]
[2023-06-29 08:49:26,184][12611] Updated weights for policy 0, policy_version 900 (0.0019)
[2023-06-29 08:49:26,989][00488] Fps is (10 sec: 2457.9, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1843200. Throughput: 0: 856.4. Samples: 463544. Policy #0 lag: (min: 2.0, avg: 6.7, max: 14.0)
[2023-06-29 08:49:26,997][00488] Avg episode reward: [(0, '14.813')]
[2023-06-29 08:49:27,010][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000900_1843200.pth...
[2023-06-29 08:49:27,300][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000468_958464.pth
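
The checkpoint filenames encode a zero-padded policy version and the total environment frames at save time; in this run the two are locked together at 2048 frames per policy version (900 × 2048 = 1,843,200). Each periodic save is followed by deleting the oldest periodic checkpoint, so only a small rolling set survives. A sketch of that rotation using the naming pattern from the log (a hypothetical helper, not sample-factory's actual code):

    import os
    import re

    CKPT_RE = re.compile(r"checkpoint_(\d{9})_(\d+)\.pth")

    def prune_checkpoints(checkpoint_dir, keep=2):
        """Keep only the `keep` newest checkpoint_<version>_<frames>.pth files."""
        ckpts = sorted((int(m.group(1)), name)
                       for name in os.listdir(checkpoint_dir)
                       if (m := CKPT_RE.fullmatch(name)))
        for _, name in ckpts[:-keep]:
            print(f"Removing {os.path.join(checkpoint_dir, name)}")
            os.remove(os.path.join(checkpoint_dir, name))
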
[2023-06-29 08:49:31,989][00488] Fps is (10 sec: 2457.7, 60 sec: 3276.8, 300 sec: 3665.6). Total num frames: 1851392. Throughput: 0: 839.7. Samples: 465396. Policy #0 lag: (min: 3.0, avg: 7.5, max: 14.0)
[2023-06-29 08:49:31,994][00488] Avg episode reward: [(0, '14.700')]
[2023-06-29 08:49:35,457][12611] Updated weights for policy 0, policy_version 910 (0.0012)
[2023-06-29 08:49:36,996][00488] Fps is (10 sec: 2456.0, 60 sec: 3276.4, 300 sec: 3637.8). Total num frames: 1867776. Throughput: 0: 817.1. Samples: 469284. Policy #0 lag: (min: 2.0, avg: 7.3, max: 14.0)
[2023-06-29 08:49:36,998][00488] Avg episode reward: [(0, '14.178')]
[2023-06-29 08:49:41,600][12611] Updated weights for policy 0, policy_version 920 (0.0018)
[2023-06-29 08:49:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3637.8). Total num frames: 1884160. Throughput: 0: 807.1. Samples: 473888. Policy #0 lag: (min: 3.0, avg: 5.7, max: 11.0)
[2023-06-29 08:49:41,991][00488] Avg episode reward: [(0, '15.522')]
[2023-06-29 08:49:46,220][12611] Updated weights for policy 0, policy_version 930 (0.0012)
[2023-06-29 08:49:46,989][00488] Fps is (10 sec: 4098.6, 60 sec: 3413.4, 300 sec: 3665.6). Total num frames: 1908736. Throughput: 0: 814.6. Samples: 477732. Policy #0 lag: (min: 3.0, avg: 6.5, max: 15.0)
[2023-06-29 08:49:46,992][00488] Avg episode reward: [(0, '16.283')]
[2023-06-29 08:49:50,080][12611] Updated weights for policy 0, policy_version 940 (0.0011)
[2023-06-29 08:49:51,999][00488] Fps is (10 sec: 4092.1, 60 sec: 3412.9, 300 sec: 3665.5). Total num frames: 1925120. Throughput: 0: 796.8. Samples: 484420. Policy #0 lag: (min: 3.0, avg: 7.4, max: 12.0)
[2023-06-29 08:49:52,003][00488] Avg episode reward: [(0, '16.572')]
[2023-06-29 08:49:56,994][00488] Fps is (10 sec: 3275.3, 60 sec: 3413.1, 300 sec: 3637.7). Total num frames: 1941504. Throughput: 0: 793.9. Samples: 489564. Policy #0 lag: (min: 2.0, avg: 8.1, max: 14.0)
[2023-06-29 08:49:57,000][00488] Avg episode reward: [(0, '16.990')]
[2023-06-29 08:49:57,108][12611] Updated weights for policy 0, policy_version 950 (0.0020)
[2023-06-29 08:50:01,669][12611] Updated weights for policy 0, policy_version 960 (0.0025)
[2023-06-29 08:50:01,989][00488] Fps is (10 sec: 4099.9, 60 sec: 3413.3, 300 sec: 3665.7). Total num frames: 1966080. Throughput: 0: 804.6. Samples: 491988. Policy #0 lag: (min: 2.0, avg: 7.1, max: 14.0)
[2023-06-29 08:50:01,995][00488] Avg episode reward: [(0, '17.438')]
[2023-06-29 08:50:02,005][12590] Saving new best policy, reward=17.438!
[2023-06-29 08:50:06,993][00488] Fps is (10 sec: 4096.5, 60 sec: 3276.6, 300 sec: 3637.8). Total num frames: 1982464. Throughput: 0: 833.3. Samples: 497160. Policy #0 lag: (min: 2.0, avg: 5.5, max: 11.0)
[2023-06-29 08:50:07,001][00488] Avg episode reward: [(0, '16.298')]
[2023-06-29 08:50:08,505][12611] Updated weights for policy 0, policy_version 970 (0.0013)
[2023-06-29 08:50:11,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3637.8). Total num frames: 1998848. Throughput: 0: 854.7. Samples: 502004. Policy #0 lag: (min: 1.0, avg: 3.6, max: 9.0)
[2023-06-29 08:50:11,992][00488] Avg episode reward: [(0, '15.868')]
[2023-06-29 08:50:13,307][12611] Updated weights for policy 0, policy_version 980 (0.0027)
[2023-06-29 08:50:16,989][00488] Fps is (10 sec: 4097.4, 60 sec: 3413.4, 300 sec: 3665.6). Total num frames: 2023424. Throughput: 0: 895.7. Samples: 505704. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:50:16,994][00488] Avg episode reward: [(0, '17.116')]
[2023-06-29 08:50:18,146][12611] Updated weights for policy 0, policy_version 990 (0.0027)
[2023-06-29 08:50:21,727][12611] Updated weights for policy 0, policy_version 1000 (0.0012)
[2023-06-29 08:50:21,990][00488] Fps is (10 sec: 4915.0, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 2048000. Throughput: 0: 979.5. Samples: 513356. Policy #0 lag: (min: 3.0, avg: 5.5, max: 11.0)
[2023-06-29 08:50:22,001][00488] Avg episode reward: [(0, '17.330')]
[2023-06-29 08:50:26,991][00488] Fps is (10 sec: 4095.6, 60 sec: 3686.3, 300 sec: 3693.3). Total num frames: 2064384. Throughput: 0: 996.0. Samples: 518708. Policy #0 lag: (min: 1.0, avg: 3.6, max: 9.0)
[2023-06-29 08:50:26,994][00488] Avg episode reward: [(0, '17.169')]
[2023-06-29 08:50:28,683][12611] Updated weights for policy 0, policy_version 1010 (0.0011)
[2023-06-29 08:50:31,990][00488] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 2080768. Throughput: 0: 966.9. Samples: 521244. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0)
[2023-06-29 08:50:31,994][00488] Avg episode reward: [(0, '17.182')]
[2023-06-29 08:50:33,495][12611] Updated weights for policy 0, policy_version 1020 (0.0018)
[2023-06-29 08:50:36,989][00488] Fps is (10 sec: 3277.2, 60 sec: 3823.3, 300 sec: 3665.6). Total num frames: 2097152. Throughput: 0: 929.1. Samples: 526220. Policy #0 lag: (min: 2.0, avg: 5.3, max: 10.0)
[2023-06-29 08:50:36,994][00488] Avg episode reward: [(0, '18.223')]
[2023-06-29 08:50:37,010][12590] Saving new best policy, reward=18.223!
[2023-06-29 08:50:40,515][12611] Updated weights for policy 0, policy_version 1030 (0.0022)
[2023-06-29 08:50:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 2113536. Throughput: 0: 927.2. Samples: 531284. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:50:41,993][00488] Avg episode reward: [(0, '17.256')]
[2023-06-29 08:50:45,584][12611] Updated weights for policy 0, policy_version 1040 (0.0019)
[2023-06-29 08:50:46,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 2138112. Throughput: 0: 931.6. Samples: 533912. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:50:46,992][00488] Avg episode reward: [(0, '17.560')]
[2023-06-29 08:50:50,049][12611] Updated weights for policy 0, policy_version 1050 (0.0018)
[2023-06-29 08:50:51,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3823.5, 300 sec: 3721.1). Total num frames: 2154496. Throughput: 0: 983.1. Samples: 541396. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0)
[2023-06-29 08:50:51,992][00488] Avg episode reward: [(0, '19.564')]
[2023-06-29 08:50:52,041][12590] Saving new best policy, reward=19.564!
[2023-06-29 08:50:53,567][12611] Updated weights for policy 0, policy_version 1060 (0.0012)
[2023-06-29 08:50:56,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.8, 300 sec: 3776.7). Total num frames: 2179072. Throughput: 0: 1026.5. Samples: 548196. Policy #0 lag: (min: 2.0, avg: 5.3, max: 10.0)
[2023-06-29 08:50:56,995][00488] Avg episode reward: [(0, '19.320')]
[2023-06-29 08:50:59,987][12611] Updated weights for policy 0, policy_version 1070 (0.0014)
[2023-06-29 08:51:01,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2195456. Throughput: 0: 1001.3. Samples: 550764. Policy #0 lag: (min: 3.0, avg: 5.6, max: 11.0)
[2023-06-29 08:51:01,992][00488] Avg episode reward: [(0, '18.734')]
[2023-06-29 08:51:05,207][12611] Updated weights for policy 0, policy_version 1080 (0.0034)
[2023-06-29 08:51:06,990][00488] Fps is (10 sec: 4095.9, 60 sec: 3959.7, 300 sec: 3804.4). Total num frames: 2220032. Throughput: 0: 945.2. Samples: 555888. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:51:06,999][00488] Avg episode reward: [(0, '19.160')]
[2023-06-29 08:51:11,781][12611] Updated weights for policy 0, policy_version 1090 (0.0011)
[2023-06-29 08:51:11,994][00488] Fps is (10 sec: 4094.2, 60 sec: 3959.2, 300 sec: 3776.6). Total num frames: 2236416. Throughput: 0: 938.5. Samples: 560944. Policy #0 lag: (min: 3.0, avg: 5.3, max: 11.0)
[2023-06-29 08:51:12,003][00488] Avg episode reward: [(0, '17.787')]
[2023-06-29 08:51:16,703][12611] Updated weights for policy 0, policy_version 1100 (0.0015)
[2023-06-29 08:51:16,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2252800. Throughput: 0: 938.7. Samples: 563484. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:51:16,999][00488] Avg episode reward: [(0, '17.658')]
[2023-06-29 08:51:21,989][00488] Fps is (10 sec: 3278.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 2269184. Throughput: 0: 965.2. Samples: 569656. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0)
[2023-06-29 08:51:21,993][00488] Avg episode reward: [(0, '18.944')]
[2023-06-29 08:51:22,468][12611] Updated weights for policy 0, policy_version 1110 (0.0012)
[2023-06-29 08:51:25,883][12611] Updated weights for policy 0, policy_version 1120 (0.0026)
[2023-06-29 08:51:26,989][00488] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2301952. Throughput: 0: 1020.4. Samples: 577200. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:51:26,992][00488] Avg episode reward: [(0, '19.834')]
[2023-06-29 08:51:26,998][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001124_2301952.pth...
[2023-06-29 08:51:27,155][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000684_1400832.pth
[2023-06-29 08:51:27,173][12590] Saving new best policy, reward=19.834!
[2023-06-29 08:51:31,124][12611] Updated weights for policy 0, policy_version 1130 (0.0027)
[2023-06-29 08:51:31,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2318336. Throughput: 0: 1025.5. Samples: 580060. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:51:31,996][00488] Avg episode reward: [(0, '21.061')]
[2023-06-29 08:51:32,002][12590] Saving new best policy, reward=21.061!
[2023-06-29 08:51:36,814][12611] Updated weights for policy 0, policy_version 1140 (0.0012)
[2023-06-29 08:51:36,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2334720. Throughput: 0: 971.6. Samples: 585120. Policy #0 lag: (min: 3.0, avg: 5.2, max: 11.0)
[2023-06-29 08:51:36,998][00488] Avg episode reward: [(0, '22.252')]
[2023-06-29 08:51:37,009][12590] Saving new best policy, reward=22.252!
[2023-06-29 08:51:41,990][00488] Fps is (10 sec: 3276.5, 60 sec: 3959.4, 300 sec: 3776.6). Total num frames: 2351104. Throughput: 0: 929.5. Samples: 590024. Policy #0 lag: (min: 3.0, avg: 5.8, max: 11.0)
[2023-06-29 08:51:41,993][00488] Avg episode reward: [(0, '22.033')]
[2023-06-29 08:51:44,238][12611] Updated weights for policy 0, policy_version 1150 (0.0013)
[2023-06-29 08:51:46,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2367488. Throughput: 0: 929.8. Samples: 592604. Policy #0 lag: (min: 3.0, avg: 6.4, max: 15.0)
[2023-06-29 08:51:46,995][00488] Avg episode reward: [(0, '20.425')]
[2023-06-29 08:51:49,361][12611] Updated weights for policy 0, policy_version 1160 (0.0057)
[2023-06-29 08:51:51,990][00488] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2383872. Throughput: 0: 929.4. Samples: 597712. Policy #0 lag: (min: 3.0, avg: 6.0, max: 15.0)
[2023-06-29 08:51:51,996][00488] Avg episode reward: [(0, '18.579')]
[2023-06-29 08:51:54,459][12611] Updated weights for policy 0, policy_version 1170 (0.0017)
[2023-06-29 08:51:56,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2408448. Throughput: 0: 984.8. Samples: 605256. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0)
[2023-06-29 08:51:56,992][00488] Avg episode reward: [(0, '18.387')]
[2023-06-29 08:51:57,601][12611] Updated weights for policy 0, policy_version 1180 (0.0018)
[2023-06-29 08:52:01,990][00488] Fps is (10 sec: 4915.3, 60 sec: 3959.4, 300 sec: 3776.6). Total num frames: 2433024. Throughput: 0: 1010.8. Samples: 608972. Policy #0 lag: (min: 2.0, avg: 6.8, max: 13.0)
[2023-06-29 08:52:02,001][00488] Avg episode reward: [(0, '16.755')]
[2023-06-29 08:52:03,299][12611] Updated weights for policy 0, policy_version 1190 (0.0012)
[2023-06-29 08:52:06,990][00488] Fps is (10 sec: 4095.8, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2449408. Throughput: 0: 997.5. Samples: 614544. Policy #0 lag: (min: 3.0, avg: 6.2, max: 12.0)
[2023-06-29 08:52:06,999][00488] Avg episode reward: [(0, '18.084')]
[2023-06-29 08:52:08,351][12611] Updated weights for policy 0, policy_version 1200 (0.0011)
[2023-06-29 08:52:11,989][00488] Fps is (10 sec: 3277.0, 60 sec: 3823.2, 300 sec: 3776.7). Total num frames: 2465792. Throughput: 0: 940.7. Samples: 619532. Policy #0 lag: (min: 3.0, avg: 6.9, max: 15.0)
[2023-06-29 08:52:11,991][00488] Avg episode reward: [(0, '18.948')]
[2023-06-29 08:52:15,332][12611] Updated weights for policy 0, policy_version 1210 (0.0014)
[2023-06-29 08:52:16,989][00488] Fps is (10 sec: 3277.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2482176. Throughput: 0: 934.3. Samples: 622104. Policy #0 lag: (min: 3.0, avg: 6.5, max: 11.0)
[2023-06-29 08:52:16,994][00488] Avg episode reward: [(0, '18.304')]
[2023-06-29 08:52:20,328][12611] Updated weights for policy 0, policy_version 1220 (0.0020)
[2023-06-29 08:52:21,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2498560. Throughput: 0: 933.2. Samples: 627112. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0)
[2023-06-29 08:52:21,992][00488] Avg episode reward: [(0, '18.791')]
[2023-06-29 08:52:26,405][12611] Updated weights for policy 0, policy_version 1230 (0.0012)
[2023-06-29 08:52:26,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 2523136. Throughput: 0: 967.9. Samples: 633580. Policy #0 lag: (min: 3.0, avg: 6.2, max: 15.0)
[2023-06-29 08:52:26,992][00488] Avg episode reward: [(0, '18.959')]
[2023-06-29 08:52:29,768][12611] Updated weights for policy 0, policy_version 1240 (0.0014)
[2023-06-29 08:52:31,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2547712. Throughput: 0: 993.8. Samples: 637324. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0)
[2023-06-29 08:52:31,995][00488] Avg episode reward: [(0, '21.752')]
[2023-06-29 08:52:34,591][12611] Updated weights for policy 0, policy_version 1250 (0.0012)
[2023-06-29 08:52:36,991][00488] Fps is (10 sec: 4095.5, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2564096. Throughput: 0: 1029.9. Samples: 644056. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:52:36,997][00488] Avg episode reward: [(0, '22.057')]
[2023-06-29 08:52:39,580][12611] Updated weights for policy 0, policy_version 1260 (0.0019)
[2023-06-29 08:52:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3776.7). Total num frames: 2580480. Throughput: 0: 977.4. Samples: 649240. Policy #0 lag: (min: 0.0, avg: 5.2, max: 11.0)
[2023-06-29 08:52:41,993][00488] Avg episode reward: [(0, '22.769')]
[2023-06-29 08:52:42,362][12590] Saving new best policy, reward=22.769!
[2023-06-29 08:52:46,990][00488] Fps is (10 sec: 3277.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2596864. Throughput: 0: 936.3. Samples: 651104. Policy #0 lag: (min: 3.0, avg: 6.5, max: 11.0)
[2023-06-29 08:52:46,997][00488] Avg episode reward: [(0, '22.923')]
[2023-06-29 08:52:47,013][12590] Saving new best policy, reward=22.923!
[2023-06-29 08:52:48,730][12611] Updated weights for policy 0, policy_version 1270 (0.0026)
[2023-06-29 08:52:51,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2613248. Throughput: 0: 896.2. Samples: 654872. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:52:51,993][00488] Avg episode reward: [(0, '22.421')]
[2023-06-29 08:52:55,340][12611] Updated weights for policy 0, policy_version 1280 (0.0018)
[2023-06-29 08:52:56,989][00488] Fps is (10 sec: 2457.7, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 2621440. Throughput: 0: 868.8. Samples: 658628. Policy #0 lag: (min: 3.0, avg: 6.7, max: 11.0)
[2023-06-29 08:52:57,001][00488] Avg episode reward: [(0, '21.320')]
[2023-06-29 08:53:01,994][00488] Fps is (10 sec: 2456.6, 60 sec: 3413.1, 300 sec: 3665.5). Total num frames: 2637824. Throughput: 0: 856.2. Samples: 660636. Policy #0 lag: (min: 1.0, avg: 4.4, max: 13.0)
[2023-06-29 08:53:01,996][00488] Avg episode reward: [(0, '21.395')]
[2023-06-29 08:53:03,838][12611] Updated weights for policy 0, policy_version 1290 (0.0046)
[2023-06-29 08:53:06,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3665.7). Total num frames: 2654208. Throughput: 0: 849.0. Samples: 665316. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:53:06,992][00488] Avg episode reward: [(0, '19.191')]
[2023-06-29 08:53:11,093][12611] Updated weights for policy 0, policy_version 1301 (0.0027)
[2023-06-29 08:53:11,989][00488] Fps is (10 sec: 3278.3, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2670592. Throughput: 0: 824.6. Samples: 670688. Policy #0 lag: (min: 3.0, avg: 6.0, max: 11.0)
[2023-06-29 08:53:11,992][00488] Avg episode reward: [(0, '19.299')]
[2023-06-29 08:53:15,432][12611] Updated weights for policy 0, policy_version 1311 (0.0012)
[2023-06-29 08:53:16,990][00488] Fps is (10 sec: 3276.5, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2686976. Throughput: 0: 806.6. Samples: 673620. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:53:16,996][00488] Avg episode reward: [(0, '19.452')]
[2023-06-29 08:53:21,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2703360. Throughput: 0: 769.5. Samples: 678684. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0)
[2023-06-29 08:53:21,997][00488] Avg episode reward: [(0, '19.865')]
[2023-06-29 08:53:22,114][12611] Updated weights for policy 0, policy_version 1321 (0.0020)
[2023-06-29 08:53:26,913][12611] Updated weights for policy 0, policy_version 1331 (0.0020)
[2023-06-29 08:53:26,990][00488] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 2727936. Throughput: 0: 767.2. Samples: 683764. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0)
[2023-06-29 08:53:26,992][00488] Avg episode reward: [(0, '19.288')]
[2023-06-29 08:53:27,006][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001332_2727936.pth...
[2023-06-29 08:53:27,226][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000900_1843200.pth
[2023-06-29 08:53:31,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3637.8). Total num frames: 2744320. Throughput: 0: 783.7. Samples: 686372. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:53:31,995][00488] Avg episode reward: [(0, '18.847')]
[2023-06-29 08:53:34,663][12611] Updated weights for policy 0, policy_version 1341 (0.0015)
[2023-06-29 08:53:36,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3637.8). Total num frames: 2760704. Throughput: 0: 813.2. Samples: 691464. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:53:36,995][00488] Avg episode reward: [(0, '18.921')]
[2023-06-29 08:53:38,590][12611] Updated weights for policy 0, policy_version 1351 (0.0012)
[2023-06-29 08:53:41,990][00488] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2785280. Throughput: 0: 896.7. Samples: 698980. Policy #0 lag: (min: 2.0, avg: 7.5, max: 14.0)
[2023-06-29 08:53:41,997][00488] Avg episode reward: [(0, '18.716')]
[2023-06-29 08:53:42,864][12611] Updated weights for policy 0, policy_version 1361 (0.0011)
[2023-06-29 08:53:46,706][12611] Updated weights for policy 0, policy_version 1371 (0.0019)
[2023-06-29 08:53:46,990][00488] Fps is (10 sec: 4915.0, 60 sec: 3549.9, 300 sec: 3693.4). Total num frames: 2809856. Throughput: 0: 934.7. Samples: 702692. Policy #0 lag: (min: 2.0, avg: 6.5, max: 10.0)
[2023-06-29 08:53:46,994][00488] Avg episode reward: [(0, '19.449')]
[2023-06-29 08:53:51,989][00488] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 2826240. Throughput: 0: 955.3. Samples: 708304. Policy #0 lag: (min: 1.0, avg: 5.3, max: 9.0)
[2023-06-29 08:53:51,998][00488] Avg episode reward: [(0, '19.802')]
[2023-06-29 08:53:53,996][12611] Updated weights for policy 0, policy_version 1381 (0.0011)
[2023-06-29 08:53:56,989][00488] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2842624. Throughput: 0: 945.4. Samples: 713232. Policy #0 lag: (min: 3.0, avg: 6.7, max: 15.0)
[2023-06-29 08:53:56,994][00488] Avg episode reward: [(0, '21.573')]
[2023-06-29 08:53:58,933][12611] Updated weights for policy 0, policy_version 1391 (0.0017)
[2023-06-29 08:54:01,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.7, 300 sec: 3637.8). Total num frames: 2859008. Throughput: 0: 938.3. Samples: 715844. Policy #0 lag: (min: 1.0, avg: 5.3, max: 9.0)
[2023-06-29 08:54:01,997][00488] Avg episode reward: [(0, '22.132')]
[2023-06-29 08:54:06,171][12611] Updated weights for policy 0, policy_version 1401 (0.0012)
[2023-06-29 08:54:06,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2875392. Throughput: 0: 938.8. Samples: 720928. Policy #0 lag: (min: 0.0, avg: 5.8, max: 12.0)
[2023-06-29 08:54:06,992][00488] Avg episode reward: [(0, '22.445')]
[2023-06-29 08:54:10,282][12611] Updated weights for policy 0, policy_version 1411 (0.0045)
[2023-06-29 08:54:11,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2891776. Throughput: 0: 966.5. Samples: 727256. Policy #0 lag: (min: 3.0, avg: 6.6, max: 11.0)
[2023-06-29 08:54:11,992][00488] Avg episode reward: [(0, '21.577')]
[2023-06-29 08:54:15,011][12611] Updated weights for policy 0, policy_version 1421 (0.0011)
[2023-06-29 08:54:16,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 2924544. Throughput: 0: 990.8. Samples: 730960. Policy #0 lag: (min: 2.0, avg: 6.8, max: 10.0)
[2023-06-29 08:54:16,995][00488] Avg episode reward: [(0, '21.518')]
[2023-06-29 08:54:18,410][12611] Updated weights for policy 0, policy_version 1431 (0.0015)
[2023-06-29 08:54:21,992][00488] Fps is (10 sec: 4913.8, 60 sec: 3959.3, 300 sec: 3721.1). Total num frames: 2940928. Throughput: 0: 1024.2. Samples: 737556. Policy #0 lag: (min: 0.0, avg: 5.1, max: 8.0)
[2023-06-29 08:54:21,995][00488] Avg episode reward: [(0, '19.405')]
[2023-06-29 08:54:25,554][12611] Updated weights for policy 0, policy_version 1441 (0.0022)
[2023-06-29 08:54:26,991][00488] Fps is (10 sec: 3276.4, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2957312. Throughput: 0: 970.4. Samples: 742648. Policy #0 lag: (min: 2.0, avg: 6.6, max: 14.0)
[2023-06-29 08:54:26,993][00488] Avg episode reward: [(0, '20.672')]
[2023-06-29 08:54:30,169][12611] Updated weights for policy 0, policy_version 1451 (0.0012)
[2023-06-29 08:54:31,989][00488] Fps is (10 sec: 3277.7, 60 sec: 3822.9, 300 sec: 3749.0). Total num frames: 2973696. Throughput: 0: 944.9. Samples: 745212. Policy #0 lag: (min: 0.0, avg: 5.1, max: 12.0)
[2023-06-29 08:54:31,993][00488] Avg episode reward: [(0, '19.462')]
[2023-06-29 08:54:36,989][00488] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2990080. Throughput: 0: 932.5. Samples: 750268. Policy #0 lag: (min: 3.0, avg: 6.2, max: 15.0)
[2023-06-29 08:54:36,991][00488] Avg episode reward: [(0, '20.478')]
[2023-06-29 08:54:37,315][12611] Updated weights for policy 0, policy_version 1461 (0.0011)
[2023-06-29 08:54:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 3006464. Throughput: 0: 935.6. Samples: 755336. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0)
[2023-06-29 08:54:41,996][00488] Avg episode reward: [(0, '20.773')]
[2023-06-29 08:54:42,342][12611] Updated weights for policy 0, policy_version 1471 (0.0033)
[2023-06-29 08:54:46,876][12611] Updated weights for policy 0, policy_version 1481 (0.0015)
[2023-06-29 08:54:46,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3749.0). Total num frames: 3031040. Throughput: 0: 962.4. Samples: 759152. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0)
[2023-06-29 08:54:46,995][00488] Avg episode reward: [(0, '20.953')]
[2023-06-29 08:54:50,239][12611] Updated weights for policy 0, policy_version 1491 (0.0016)
[2023-06-29 08:54:51,990][00488] Fps is (10 sec: 4915.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3055616. Throughput: 0: 1014.2. Samples: 766568. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:54:52,001][00488] Avg episode reward: [(0, '20.357')]
[2023-06-29 08:54:56,857][12611] Updated weights for policy 0, policy_version 1501 (0.0012)
[2023-06-29 08:54:56,991][00488] Fps is (10 sec: 4095.5, 60 sec: 3822.8, 300 sec: 3748.9). Total num frames: 3072000. Throughput: 0: 995.3. Samples: 772044. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:54:56,993][00488] Avg episode reward: [(0, '20.469')]
[2023-06-29 08:55:01,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3088384. Throughput: 0: 967.5. Samples: 774500. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:55:01,994][00488] Avg episode reward: [(0, '19.789')]
[2023-06-29 08:55:02,039][12611] Updated weights for policy 0, policy_version 1511 (0.0013)
[2023-06-29 08:55:06,989][00488] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3104768. Throughput: 0: 934.5. Samples: 779604. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:55:06,991][00488] Avg episode reward: [(0, '20.588')]
[2023-06-29 08:55:09,144][12611] Updated weights for policy 0, policy_version 1521 (0.0019)
[2023-06-29 08:55:11,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.4, 300 sec: 3748.9). Total num frames: 3129344. Throughput: 0: 933.8. Samples: 784668. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0)
[2023-06-29 08:55:11,994][00488] Avg episode reward: [(0, '20.860')]
[2023-06-29 08:55:14,019][12611] Updated weights for policy 0, policy_version 1531 (0.0015)
[2023-06-29 08:55:16,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 3145728. Throughput: 0: 935.0. Samples: 787288. Policy #0 lag: (min: 3.0, avg: 5.7, max: 11.0)
[2023-06-29 08:55:16,998][00488] Avg episode reward: [(0, '20.215')]
[2023-06-29 08:55:19,130][12611] Updated weights for policy 0, policy_version 1541 (0.0021)
[2023-06-29 08:55:21,989][00488] Fps is (10 sec: 4096.1, 60 sec: 3823.1, 300 sec: 3748.9). Total num frames: 3170304. Throughput: 0: 990.5. Samples: 794840. Policy #0 lag: (min: 1.0, avg: 3.9, max: 9.0)
[2023-06-29 08:55:21,997][00488] Avg episode reward: [(0, '22.108')]
[2023-06-29 08:55:22,364][12611] Updated weights for policy 0, policy_version 1551 (0.0021)
[2023-06-29 08:55:26,992][00488] Fps is (10 sec: 4913.8, 60 sec: 3959.4, 300 sec: 3776.6). Total num frames: 3194880. Throughput: 0: 1026.1. Samples: 801512. Policy #0 lag: (min: 3.0, avg: 6.3, max: 15.0)
[2023-06-29 08:55:26,994][00488] Avg episode reward: [(0, '24.426')]
[2023-06-29 08:55:27,015][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001560_3194880.pth...
[2023-06-29 08:55:27,269][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001124_2301952.pth
[2023-06-29 08:55:27,293][12590] Saving new best policy, reward=24.426!
[2023-06-29 08:55:28,737][12611] Updated weights for policy 0, policy_version 1561 (0.0013)
[2023-06-29 08:55:31,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3211264. Throughput: 0: 994.7. Samples: 803912. Policy #0 lag: (min: 3.0, avg: 5.7, max: 11.0)
[2023-06-29 08:55:31,995][00488] Avg episode reward: [(0, '23.458')]
[2023-06-29 08:55:33,379][12611] Updated weights for policy 0, policy_version 1571 (0.0024)
[2023-06-29 08:55:36,991][00488] Fps is (10 sec: 3277.7, 60 sec: 3959.5, 300 sec: 3776.6). Total num frames: 3227648. Throughput: 0: 944.4. Samples: 809068. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:55:36,993][00488] Avg episode reward: [(0, '23.267')]
[2023-06-29 08:55:40,557][12611] Updated weights for policy 0, policy_version 1581 (0.0022)
[2023-06-29 08:55:41,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3244032. Throughput: 0: 934.3. Samples: 814088. Policy #0 lag: (min: 3.0, avg: 8.0, max: 11.0)
[2023-06-29 08:55:41,994][00488] Avg episode reward: [(0, '23.560')]
[2023-06-29 08:55:45,487][12611] Updated weights for policy 0, policy_version 1591 (0.0015)
[2023-06-29 08:55:46,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3260416. Throughput: 0: 938.1. Samples: 816716. Policy #0 lag: (min: 3.0, avg: 5.8, max: 11.0)
[2023-06-29 08:55:46,993][00488] Avg episode reward: [(0, '24.163')]
[2023-06-29 08:55:51,057][12611] Updated weights for policy 0, policy_version 1601 (0.0015)
[2023-06-29 08:55:51,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 3284992. Throughput: 0: 962.7. Samples: 822924. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0)
[2023-06-29 08:55:51,996][00488] Avg episode reward: [(0, '24.342')]
[2023-06-29 08:55:54,486][12611] Updated weights for policy 0, policy_version 1611 (0.0012)
[2023-06-29 08:55:56,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.6, 300 sec: 3776.7). Total num frames: 3309568. Throughput: 0: 1016.5. Samples: 830408. Policy #0 lag: (min: 3.0, avg: 7.2, max: 11.0)
[2023-06-29 08:55:56,994][00488] Avg episode reward: [(0, '24.006')]
[2023-06-29 08:56:00,528][12611] Updated weights for policy 0, policy_version 1621 (0.0016)
[2023-06-29 08:56:01,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3325952. Throughput: 0: 1025.2. Samples: 833424. Policy #0 lag: (min: 3.0, avg: 6.6, max: 15.0)
[2023-06-29 08:56:01,994][00488] Avg episode reward: [(0, '24.253')]
[2023-06-29 08:56:05,418][12611] Updated weights for policy 0, policy_version 1631 (0.0012)
[2023-06-29 08:56:06,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3342336. Throughput: 0: 967.6. Samples: 838384. Policy #0 lag: (min: 2.0, avg: 7.5, max: 14.0)
[2023-06-29 08:56:06,994][00488] Avg episode reward: [(0, '23.978')]
[2023-06-29 08:56:11,742][12611] Updated weights for policy 0, policy_version 1641 (0.0021)
[2023-06-29 08:56:11,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3366912. Throughput: 0: 933.4. Samples: 843512. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:56:12,001][00488] Avg episode reward: [(0, '22.848')]
[2023-06-29 08:56:16,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3375104. Throughput: 0: 936.6. Samples: 846060. Policy #0 lag: (min: 3.0, avg: 6.1, max: 15.0)
[2023-06-29 08:56:16,999][00488] Avg episode reward: [(0, '22.392')]
[2023-06-29 08:56:17,810][12611] Updated weights for policy 0, policy_version 1651 (0.0011)
[2023-06-29 08:56:21,990][00488] Fps is (10 sec: 2457.5, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 3391488. Throughput: 0: 909.3. Samples: 849988. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:56:21,992][00488] Avg episode reward: [(0, '22.840')]
[2023-06-29 08:56:26,296][12611] Updated weights for policy 0, policy_version 1661 (0.0021)
[2023-06-29 08:56:26,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3693.3). Total num frames: 3407872. Throughput: 0: 900.9. Samples: 854628. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0)
[2023-06-29 08:56:26,992][00488] Avg episode reward: [(0, '22.899')]
[2023-06-29 08:56:31,440][12611] Updated weights for policy 0, policy_version 1671 (0.0029)
[2023-06-29 08:56:31,989][00488] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 3424256. Throughput: 0: 900.4. Samples: 857232. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:56:31,992][00488] Avg episode reward: [(0, '21.785')]
[2023-06-29 08:56:36,993][00488] Fps is (10 sec: 3275.7, 60 sec: 3549.7, 300 sec: 3693.3). Total num frames: 3440640. Throughput: 0: 864.6. Samples: 861832. Policy #0 lag: (min: 1.0, avg: 3.9, max: 9.0)
[2023-06-29 08:56:37,000][00488] Avg episode reward: [(0, '21.873')]
[2023-06-29 08:56:39,891][12611] Updated weights for policy 0, policy_version 1681 (0.0022)
[2023-06-29 08:56:41,989][00488] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 3448832. Throughput: 0: 785.4. Samples: 865752. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0)
[2023-06-29 08:56:41,992][00488] Avg episode reward: [(0, '21.137')]
[2023-06-29 08:56:45,082][12611] Updated weights for policy 0, policy_version 1691 (0.0031)
[2023-06-29 08:56:46,989][00488] Fps is (10 sec: 2458.5, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 3465216. Throughput: 0: 772.1. Samples: 868168. Policy #0 lag: (min: 2.0, avg: 7.6, max: 14.0)
[2023-06-29 08:56:46,997][00488] Avg episode reward: [(0, '22.595')]
[2023-06-29 08:56:51,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3637.8). Total num frames: 3481600. Throughput: 0: 773.5. Samples: 873192. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0)
[2023-06-29 08:56:51,996][00488] Avg episode reward: [(0, '24.028')]
[2023-06-29 08:56:52,546][12611] Updated weights for policy 0, policy_version 1701 (0.0012)
[2023-06-29 08:56:56,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3610.0). Total num frames: 3497984. Throughput: 0: 773.2. Samples: 878308. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:56:56,995][00488] Avg episode reward: [(0, '24.915')]
[2023-06-29 08:56:57,007][12590] Saving new best policy, reward=24.915!
[2023-06-29 08:56:57,761][12611] Updated weights for policy 0, policy_version 1711 (0.0015)
[2023-06-29 08:57:01,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3637.8). Total num frames: 3522560. Throughput: 0: 776.2. Samples: 880988. Policy #0 lag: (min: 0.0, avg: 5.5, max: 12.0)
[2023-06-29 08:57:01,995][00488] Avg episode reward: [(0, '25.195')]
[2023-06-29 08:57:02,000][12590] Saving new best policy, reward=25.195!
[2023-06-29 08:57:02,987][12611] Updated weights for policy 0, policy_version 1721 (0.0013)
[2023-06-29 08:57:06,324][12611] Updated weights for policy 0, policy_version 1731 (0.0023)
[2023-06-29 08:57:06,989][00488] Fps is (10 sec: 4915.1, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 3547136. Throughput: 0: 853.9. Samples: 888412. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0)
[2023-06-29 08:57:06,993][00488] Avg episode reward: [(0, '25.498')]
[2023-06-29 08:57:07,015][12590] Saving new best policy, reward=25.498!
[2023-06-29 08:57:11,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3665.6). Total num frames: 3563520. Throughput: 0: 889.9. Samples: 894672. Policy #0 lag: (min: 3.0, avg: 6.4, max: 11.0)
[2023-06-29 08:57:11,994][00488] Avg episode reward: [(0, '26.025')]
[2023-06-29 08:57:12,086][12611] Updated weights for policy 0, policy_version 1741 (0.0012)
[2023-06-29 08:57:12,267][12590] Saving new best policy, reward=26.025!
[2023-06-29 08:57:16,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 3579904. Throughput: 0: 886.7. Samples: 897132. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:57:16,991][00488] Avg episode reward: [(0, '24.343')]
[2023-06-29 08:57:17,555][12611] Updated weights for policy 0, policy_version 1751 (0.0019)
[2023-06-29 08:57:21,990][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 3596288. Throughput: 0: 894.4. Samples: 902076. Policy #0 lag: (min: 3.0, avg: 6.9, max: 11.0)
[2023-06-29 08:57:21,996][00488] Avg episode reward: [(0, '23.956')]
[2023-06-29 08:57:24,480][12611] Updated weights for policy 0, policy_version 1761 (0.0012)
[2023-06-29 08:57:26,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 3612672. Throughput: 0: 921.3. Samples: 907212. Policy #0 lag: (min: 3.0, avg: 6.7, max: 11.0)
[2023-06-29 08:57:26,996][00488] Avg episode reward: [(0, '23.050')]
[2023-06-29 08:57:27,010][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001764_3612672.pth...
[2023-06-29 08:57:27,232][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001332_2727936.pth
[2023-06-29 08:57:29,513][12611] Updated weights for policy 0, policy_version 1771 (0.0026)
[2023-06-29 08:57:31,990][00488] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 3629056. Throughput: 0: 919.5. Samples: 909548. Policy #0 lag: (min: 1.0, avg: 4.8, max: 13.0)
[2023-06-29 08:57:31,999][00488] Avg episode reward: [(0, '23.151')]
[2023-06-29 08:57:35,328][12611] Updated weights for policy 0, policy_version 1781 (0.0031)
[2023-06-29 08:57:36,989][00488] Fps is (10 sec: 4096.1, 60 sec: 3550.1, 300 sec: 3637.8). Total num frames: 3653632. Throughput: 0: 951.8. Samples: 916024. Policy #0 lag: (min: 1.0, avg: 4.5, max: 13.0)
[2023-06-29 08:57:36,994][00488] Avg episode reward: [(0, '22.569')]
[2023-06-29 08:57:38,611][12611] Updated weights for policy 0, policy_version 1791 (0.0024)
[2023-06-29 08:57:41,989][00488] Fps is (10 sec: 5734.6, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 3686400. Throughput: 0: 1009.4. Samples: 923732. Policy #0 lag: (min: 3.0, avg: 7.2, max: 15.0)
[2023-06-29 08:57:41,994][00488] Avg episode reward: [(0, '23.955')]
[2023-06-29 08:57:44,544][12611] Updated weights for policy 0, policy_version 1801 (0.0014)
[2023-06-29 08:57:46,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 3702784. Throughput: 0: 1011.9. Samples: 926524. Policy #0 lag: (min: 3.0, avg: 7.5, max: 15.0)
[2023-06-29 08:57:46,994][00488] Avg episode reward: [(0, '23.980')]
[2023-06-29 08:57:49,394][12611] Updated weights for policy 0, policy_version 1811 (0.0014)
[2023-06-29 08:57:51,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 3719168. Throughput: 0: 960.9. Samples: 931652. Policy #0 lag: (min: 3.0, avg: 6.8, max: 15.0)
[2023-06-29 08:57:52,003][00488] Avg episode reward: [(0, '23.462')]
[2023-06-29 08:57:55,593][12611] Updated weights for policy 0, policy_version 1821 (0.0011)
[2023-06-29 08:57:56,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3721.2). Total num frames: 3735552. Throughput: 0: 937.1. Samples: 936840. Policy #0 lag: (min: 3.0, avg: 6.3, max: 11.0)
[2023-06-29 08:57:56,997][00488] Avg episode reward: [(0, '24.494')]
[2023-06-29 08:58:00,776][12611] Updated weights for policy 0, policy_version 1831 (0.0012)
[2023-06-29 08:58:01,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 3751936. Throughput: 0: 939.4. Samples: 939404. Policy #0 lag: (min: 3.0, avg: 7.2, max: 11.0)
[2023-06-29 08:58:02,002][00488] Avg episode reward: [(0, '23.554')]
[2023-06-29 08:58:06,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 3768320. Throughput: 0: 943.9. Samples: 944552. Policy #0 lag: (min: 0.0, avg: 5.7, max: 12.0)
[2023-06-29 08:58:06,999][00488] Avg episode reward: [(0, '22.625')]
[2023-06-29 08:58:07,075][12611] Updated weights for policy 0, policy_version 1841 (0.0045)
[2023-06-29 08:58:10,352][12611] Updated weights for policy 0, policy_version 1851 (0.0011)
[2023-06-29 08:58:11,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3792896. Throughput: 0: 1001.4. Samples: 952276. Policy #0 lag: (min: 2.0, avg: 5.9, max: 13.0)
[2023-06-29 08:58:11,994][00488] Avg episode reward: [(0, '21.894')]
[2023-06-29 08:58:15,436][12611] Updated weights for policy 0, policy_version 1861 (0.0012)
[2023-06-29 08:58:16,989][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3817472. Throughput: 0: 1033.2. Samples: 956040. Policy #0 lag: (min: 3.0, avg: 6.1, max: 11.0)
[2023-06-29 08:58:16,999][00488] Avg episode reward: [(0, '23.387')]
[2023-06-29 08:58:20,230][12611] Updated weights for policy 0, policy_version 1871 (0.0028)
[2023-06-29 08:58:21,990][00488] Fps is (10 sec: 4095.9, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3833856. Throughput: 0: 1008.8. Samples: 961420. Policy #0 lag: (min: 3.0, avg: 6.7, max: 11.0)
[2023-06-29 08:58:21,994][00488] Avg episode reward: [(0, '23.107')]
[2023-06-29 08:58:26,857][12611] Updated weights for policy 0, policy_version 1881 (0.0015)
[2023-06-29 08:58:26,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3850240. Throughput: 0: 954.2. Samples: 966672. Policy #0 lag: (min: 3.0, avg: 6.6, max: 11.0)
[2023-06-29 08:58:26,997][00488] Avg episode reward: [(0, '23.969')]
[2023-06-29 08:58:31,941][12611] Updated weights for policy 0, policy_version 1891 (0.0013)
[2023-06-29 08:58:31,991][00488] Fps is (10 sec: 4095.4, 60 sec: 4095.9, 300 sec: 3776.6). Total num frames: 3874816. Throughput: 0: 947.1. Samples: 969144. Policy #0 lag: (min: 3.0, avg: 6.2, max: 11.0)
[2023-06-29 08:58:31,999][00488] Avg episode reward: [(0, '23.796')]
[2023-06-29 08:58:36,990][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3891200. Throughput: 0: 949.4. Samples: 974376. Policy #0 lag: (min: 3.0, avg: 7.1, max: 11.0)
[2023-06-29 08:58:36,997][00488] Avg episode reward: [(0, '24.146')]
[2023-06-29 08:58:39,124][12611] Updated weights for policy 0, policy_version 1901 (0.0047)
[2023-06-29 08:58:41,989][00488] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 3907584. Throughput: 0: 977.1. Samples: 980808. Policy #0 lag: (min: 3.0, avg: 7.4, max: 12.0)
[2023-06-29 08:58:41,994][00488] Avg episode reward: [(0, '22.536')]
[2023-06-29 08:58:42,338][12611] Updated weights for policy 0, policy_version 1911 (0.0017)
[2023-06-29 08:58:46,845][12611] Updated weights for policy 0, policy_version 1921 (0.0022)
[2023-06-29 08:58:46,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3932160. Throughput: 0: 1004.1. Samples: 984588. Policy #0 lag: (min: 3.0, avg: 5.9, max: 11.0)
[2023-06-29 08:58:46,991][00488] Avg episode reward: [(0, '21.587')]
[2023-06-29 08:58:50,981][12611] Updated weights for policy 0, policy_version 1931 (0.0021)
[2023-06-29 08:58:51,990][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3956736. Throughput: 0: 1038.2. Samples: 991272. Policy #0 lag: (min: 3.0, avg: 6.9, max: 13.0)
[2023-06-29 08:58:51,992][00488] Avg episode reward: [(0, '21.860')]
[2023-06-29 08:58:56,989][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3973120. Throughput: 0: 981.2. Samples: 996428. Policy #0 lag: (min: 3.0, avg: 6.9, max: 11.0)
[2023-06-29 08:58:56,992][00488] Avg episode reward: [(0, '22.287')]
[2023-06-29 08:58:57,938][12611] Updated weights for policy 0, policy_version 1941 (0.0011)
[2023-06-29 08:59:01,989][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3989504. Throughput: 0: 954.5. Samples: 998992. Policy #0 lag: (min: 3.0, avg: 6.2, max: 11.0)
[2023-06-29 08:59:01,998][00488] Avg episode reward: [(0, '22.115')]
[2023-06-29 08:59:02,734][12611] Updated weights for policy 0, policy_version 1951 (0.0031)
[2023-06-29 08:59:06,997][00488] Fps is (10 sec: 3274.3, 60 sec: 3959.0, 300 sec: 3776.6). Total num frames: 4005888. Throughput: 0: 948.4. Samples: 1004104. Policy #0 lag: (min: 3.0, avg: 6.5, max: 11.0)
[2023-06-29 08:59:06,999][00488] Avg episode reward: [(0, '22.915')]
[2023-06-29 08:59:07,702][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001960_4014080.pth...
[2023-06-29 08:59:07,719][00488] Component Batcher_0 stopped!
[2023-06-29 08:59:07,716][12590] Stopping Batcher_0...
[2023-06-29 08:59:07,803][12590] Loop batcher_evt_loop terminating...
[2023-06-29 08:59:07,828][12611] Weights refcount: 2 0
[2023-06-29 08:59:07,843][00488] Component InferenceWorker_p0-w0 stopped!
[2023-06-29 08:59:07,846][12611] Stopping InferenceWorker_p0-w0...
[2023-06-29 08:59:07,856][12611] Loop inference_proc0-0_evt_loop terminating...
[2023-06-29 08:59:07,947][00488] Component RolloutWorker_w13 stopped!
[2023-06-29 08:59:07,949][12631] Stopping RolloutWorker_w13...
[2023-06-29 08:59:07,950][12631] Loop rollout_proc13_evt_loop terminating...
[2023-06-29 08:59:08,017][00488] Component RolloutWorker_w1 stopped!
[2023-06-29 08:59:08,018][12612] Stopping RolloutWorker_w1...
[2023-06-29 08:59:08,023][00488] Component RolloutWorker_w3 stopped!
[2023-06-29 08:59:08,024][12615] Stopping RolloutWorker_w3...
[2023-06-29 08:59:08,030][12590] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001560_3194880.pth
[2023-06-29 08:59:08,020][12612] Loop rollout_proc1_evt_loop terminating...
[2023-06-29 08:59:08,035][12615] Loop rollout_proc3_evt_loop terminating...
[2023-06-29 08:59:08,067][00488] Component RolloutWorker_w11 stopped!
[2023-06-29 08:59:08,069][12626] Stopping RolloutWorker_w11...
[2023-06-29 08:59:08,065][12590] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001960_4014080.pth...
[2023-06-29 08:59:08,076][00488] Component RolloutWorker_w9 stopped!
[2023-06-29 08:59:08,078][12624] Stopping RolloutWorker_w9...
[2023-06-29 08:59:08,070][12626] Loop rollout_proc11_evt_loop terminating...
[2023-06-29 08:59:08,078][12624] Loop rollout_proc9_evt_loop terminating...
[2023-06-29 08:59:08,097][00488] Component RolloutWorker_w15 stopped!
[2023-06-29 08:59:08,099][12632] Stopping RolloutWorker_w15...
[2023-06-29 08:59:08,101][12632] Loop rollout_proc15_evt_loop terminating...
[2023-06-29 08:59:08,171][12616] Stopping RolloutWorker_w5...
[2023-06-29 08:59:08,169][00488] Component RolloutWorker_w5 stopped!
[2023-06-29 08:59:08,183][00488] Component RolloutWorker_w7 stopped!
[2023-06-29 08:59:08,184][12618] Stopping RolloutWorker_w7...
[2023-06-29 08:59:08,191][12618] Loop rollout_proc7_evt_loop terminating...
[2023-06-29 08:59:08,175][12616] Loop rollout_proc5_evt_loop terminating...
[2023-06-29 08:59:08,239][12617] Stopping RolloutWorker_w6...
[2023-06-29 08:59:08,239][00488] Component RolloutWorker_w6 stopped!
[2023-06-29 08:59:08,267][12617] Loop rollout_proc6_evt_loop terminating...
[2023-06-29 08:59:08,365][00488] Component LearnerWorker_p0 stopped!
[2023-06-29 08:59:08,367][12590] Stopping LearnerWorker_p0...
[2023-06-29 08:59:08,368][12590] Loop learner_proc0_evt_loop terminating...
[2023-06-29 08:59:08,385][12613] Stopping RolloutWorker_w2...
[2023-06-29 08:59:08,385][00488] Component RolloutWorker_w2 stopped!
[2023-06-29 08:59:08,390][12614] Stopping RolloutWorker_w4...
[2023-06-29 08:59:08,391][00488] Component RolloutWorker_w4 stopped!
[2023-06-29 08:59:08,396][12614] Loop rollout_proc4_evt_loop terminating...
[2023-06-29 08:59:08,406][12613] Loop rollout_proc2_evt_loop terminating...
[2023-06-29 08:59:08,509][12610] Stopping RolloutWorker_w0...
[2023-06-29 08:59:08,509][12610] Loop rollout_proc0_evt_loop terminating...
[2023-06-29 08:59:08,508][00488] Component RolloutWorker_w0 stopped!
[2023-06-29 08:59:08,533][12625] Stopping RolloutWorker_w10...
[2023-06-29 08:59:08,534][12625] Loop rollout_proc10_evt_loop terminating...
[2023-06-29 08:59:08,533][00488] Component RolloutWorker_w10 stopped!
[2023-06-29 08:59:08,572][12629] Stopping RolloutWorker_w14...
[2023-06-29 08:59:08,573][12629] Loop rollout_proc14_evt_loop terminating...
[2023-06-29 08:59:08,572][00488] Component RolloutWorker_w14 stopped!
[2023-06-29 08:59:08,641][12619] Stopping RolloutWorker_w8...
[2023-06-29 08:59:08,641][00488] Component RolloutWorker_w8 stopped!
[2023-06-29 08:59:08,646][12619] Loop rollout_proc8_evt_loop terminating...
[2023-06-29 08:59:08,738][00488] Component RolloutWorker_w12 stopped!
[2023-06-29 08:59:08,742][12627] Stopping RolloutWorker_w12...
[2023-06-29 08:59:08,742][12627] Loop rollout_proc12_evt_loop terminating...
[2023-06-29 08:59:08,740][00488] Waiting for process learner_proc0 to stop...
[2023-06-29 08:59:11,907][00488] Waiting for process inference_proc0-0 to join...
[2023-06-29 08:59:11,912][00488] Waiting for process rollout_proc0 to join...
[2023-06-29 08:59:15,901][00488] Waiting for process rollout_proc1 to join...
[2023-06-29 08:59:15,903][00488] Waiting for process rollout_proc2 to join...
[2023-06-29 08:59:15,905][00488] Waiting for process rollout_proc3 to join...
[2023-06-29 08:59:15,906][00488] Waiting for process rollout_proc4 to join...
[2023-06-29 08:59:15,907][00488] Waiting for process rollout_proc5 to join...
[2023-06-29 08:59:15,909][00488] Waiting for process rollout_proc6 to join...
[2023-06-29 08:59:15,910][00488] Waiting for process rollout_proc7 to join...
[2023-06-29 08:59:15,912][00488] Waiting for process rollout_proc8 to join...
[2023-06-29 08:59:15,914][00488] Waiting for process rollout_proc9 to join...
[2023-06-29 08:59:15,915][00488] Waiting for process rollout_proc10 to join...
[2023-06-29 08:59:15,916][00488] Waiting for process rollout_proc11 to join...
[2023-06-29 08:59:15,919][00488] Waiting for process rollout_proc12 to join...
[2023-06-29 08:59:15,922][00488] Waiting for process rollout_proc13 to join...
[2023-06-29 08:59:15,925][00488] Waiting for process rollout_proc14 to join...
[2023-06-29 08:59:15,931][00488] Waiting for process rollout_proc15 to join...
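
Shutdown above follows a stop-then-join protocol: every component is signalled to stop, each event loop terminates on its own, and only then does the runner join the worker processes one by one. A generic sketch of that pattern (not sample-factory's actual runner code):

    def shutdown(processes, stop_events, timeout=5.0):
        """processes: multiprocessing.Process list; stop_events: multiprocessing.Event list."""
        for ev in stop_events:
            ev.set()  # each worker's event loop exits once it sees the flag
        for proc in processes:
            proc.join(timeout)  # wait for a clean exit
            if proc.is_alive():
                proc.terminate()  # last resort for a stuck worker
                proc.join()
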
[2023-06-29 08:59:15,932][00488] Batcher 0 profile tree view:
batching: 31.9681, releasing_batches: 0.0106
[2023-06-29 08:59:15,933][00488] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0051
wait_policy_total: 871.1978
update_model: 7.0960
weight_update: 0.0012
one_step: 0.0025
handle_policy_step: 233.4504
deserialize: 8.6941, stack: 1.1215, obs_to_device_normalize: 47.5840, forward: 121.4371, send_messages: 12.7788
prepare_outputs: 32.8200
to_cpu: 19.2093
[2023-06-29 08:59:15,935][00488] Learner 0 profile tree view:
misc: 0.0026, prepare_batch: 22.2164
train: 137.4185
epoch_init: 0.0405, minibatch_init: 0.5078, losses_postprocess: 1.3120, kl_divergence: 1.3457, after_optimizer: 4.0000
calculate_losses: 56.4946
losses_init: 0.0081, forward_head: 3.0549, bptt_initial: 37.5015, tail: 2.3895, advantages_returns: 0.6700, losses: 7.5653
bptt: 4.4084
bptt_forward_core: 4.1702
update: 72.4424
clip: 48.8101
[2023-06-29 08:59:15,937][00488] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3438, enqueue_policy_requests: 28.1441, env_step: 1004.4834, overhead: 17.6515, complete_rollouts: 0.9550
save_policy_outputs: 17.3504
split_output_tensors: 8.5931
[2023-06-29 08:59:15,938][00488] RolloutWorker_w15 profile tree view:
wait_for_trajectories: 0.4299, enqueue_policy_requests: 28.7308, env_step: 1009.6872, overhead: 18.1837, complete_rollouts: 0.9640
save_policy_outputs: 16.6776
split_output_tensors: 7.3804
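
The profile trees report cumulative seconds per stage over the whole run. Environment stepping dominates the rollout workers (about 1004 of the ~1183 s of wall clock), while the inference worker's active time goes mostly to the forward pass and observation normalization. Note the stages run concurrently in separate processes, so their shares are per-process and do not sum to 100%. A quick share computation from the numbers above:

    main_loop = 1182.9571  # runner wall-clock seconds, from the log
    stages = {
        "rollout w0 env_step": 1004.4834,
        "inference forward": 121.4371,
        "inference obs_to_device_normalize": 47.5840,
        "learner train": 137.4185,
    }
    for name, seconds in stages.items():
        print(f"{name}: {seconds:8.1f} s ({100 * seconds / main_loop:4.1f}% of wall clock)")
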
[2023-06-29 08:59:15,943][00488] Loop Runner_EvtLoop terminating...
[2023-06-29 08:59:15,945][00488] Runner profile tree view:
main_loop: 1182.9571
[2023-06-29 08:59:15,947][00488] Collected {0: 4014080}, FPS: 3393.3
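
The final figure is consistent with the profile: overall FPS is simply total collected frames divided by runner wall time.

    total_frames, wall_seconds = 4_014_080, 1182.9571  # both from the log above
    print(f"overall FPS = {total_frames / wall_seconds:.1f}")  # -> 3393.3
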
[2023-06-29 08:59:16,141][00488] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-06-29 08:59:16,143][00488] Overriding arg 'num_workers' with value 1 passed from command line
[2023-06-29 08:59:16,145][00488] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-06-29 08:59:16,150][00488] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-06-29 08:59:16,152][00488] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-06-29 08:59:16,155][00488] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-06-29 08:59:16,158][00488] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-06-29 08:59:16,160][00488] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-06-29 08:59:16,162][00488] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-06-29 08:59:16,164][00488] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-06-29 08:59:16,166][00488] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-06-29 08:59:16,167][00488] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-06-29 08:59:16,169][00488] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-06-29 08:59:16,171][00488] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-06-29 08:59:16,173][00488] Using frameskip 1 and render_action_repeat=4 for evaluation
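The block above is the enjoy (evaluation) entry point reloading the saved training config and layering command-line overrides on top of it. A minimal sketch of how such a run is typically launched with sample-factory 2.x, assuming the VizDoom example helpers: parse_vizdoom_cfg is a hypothetical wrapper mirroring the Deep RL course notebook, the flag names are copied verbatim from the "Adding new argument" lines above, and the env name is an assumption inferred from the repository id used later in this log.

```python
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.enjoy import enjoy
from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults

def parse_vizdoom_cfg(argv=None, evaluation=False):
    """Hypothetical wrapper: build a Sample Factory config with Doom args/defaults."""
    parser, _ = parse_sf_args(argv=argv, evaluation=evaluation)
    add_doom_env_args(parser)       # register Doom-specific CLI arguments
    doom_override_defaults(parser)  # apply Doom-friendly default values
    return parse_full_cfg(parser, argv)

cfg = parse_vizdoom_cfg(
    argv=[
        "--env=doom_health_gathering_supreme",  # assumption: inferred from the repo id below
        "--num_workers=1",
        "--no_render",
        "--save_video",
        "--max_num_episodes=10",
    ],
    evaluation=True,
)
status = enjoy(cfg)  # replays the checkpoint, producing the "Num frames ..." lines below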
[2023-06-29 08:59:16,194][00488] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-06-29 08:59:16,198][00488] RunningMeanStd input shape: (3, 72, 128)
[2023-06-29 08:59:16,201][00488] RunningMeanStd input shape: (1,)
[2023-06-29 08:59:16,215][00488] ConvEncoder: input_channels=3
[2023-06-29 08:59:16,345][00488] Conv encoder output size: 512
[2023-06-29 08:59:16,350][00488] Policy head output size: 512
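These four lines pin down the shape flow: a (3, 72, 128) observation enters a conv encoder that emits a 512-dim feature, which the 512-dim policy head consumes. A hedged sketch that reproduces those sizes; the three-layer filter spec is an assumption in the spirit of Sample Factory's default encoder, not its exact definition:

```python
import torch
from torch import nn

# Assumed filter spec [[32,8,4],[64,4,2],[128,3,2]]: with a (3, 72, 128) input
# this yields a 128 x 3 x 6 = 2304 feature map, projected to the logged 512 dims.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
    nn.Flatten(),
    nn.Linear(128 * 3 * 6, 512),  # "Conv encoder output size: 512"
)

obs = torch.zeros(1, 3, 72, 128)  # matches "RunningMeanStd input shape: (3, 72, 128)"
print(encoder(obs).shape)         # torch.Size([1, 512])
```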
[2023-06-29 08:59:18,941][00488] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001960_4014080.pth...
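The checkpoint filename encodes both the training iteration and the cumulative env-frame count: checkpoint_000001960_4014080.pth is iteration 1960 at 4,014,080 frames, i.e. 2048 frames per iteration (1960 × 2048 = 4,014,080). A small parsing sketch; the helper name is hypothetical:

```python
import re

def parse_checkpoint_name(path: str) -> tuple[int, int]:
    """Hypothetical helper: extract (train_iteration, env_frames) from a
    checkpoint filename like checkpoint_000001960_4014080.pth."""
    m = re.search(r"checkpoint_(\d+)_(\d+)\.pth$", path)
    if m is None:
        raise ValueError(f"unrecognized checkpoint name: {path}")
    return int(m.group(1)), int(m.group(2))

it, frames = parse_checkpoint_name("checkpoint_000001960_4014080.pth")
print(it, frames, frames // it)  # 1960 4014080 2048 (frames per iteration)
```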
[2023-06-29 08:59:20,237][00488] Num frames 100...
[2023-06-29 08:59:20,352][00488] Num frames 200...
[2023-06-29 08:59:20,475][00488] Num frames 300...
[2023-06-29 08:59:20,597][00488] Num frames 400...
[2023-06-29 08:59:20,718][00488] Num frames 500...
[2023-06-29 08:59:20,839][00488] Num frames 600...
[2023-06-29 08:59:20,968][00488] Num frames 700...
[2023-06-29 08:59:21,096][00488] Num frames 800...
[2023-06-29 08:59:21,191][00488] Avg episode rewards: #0: 15.320, true rewards: #0: 8.320
[2023-06-29 08:59:21,194][00488] Avg episode reward: 15.320, avg true_objective: 8.320
[2023-06-29 08:59:21,274][00488] Num frames 900...
[2023-06-29 08:59:21,393][00488] Num frames 1000...
[2023-06-29 08:59:21,518][00488] Num frames 1100...
[2023-06-29 08:59:21,638][00488] Num frames 1200...
[2023-06-29 08:59:21,761][00488] Num frames 1300...
[2023-06-29 08:59:21,879][00488] Num frames 1400...
[2023-06-29 08:59:22,007][00488] Num frames 1500...
[2023-06-29 08:59:22,076][00488] Avg episode rewards: #0: 15.045, true rewards: #0: 7.545
[2023-06-29 08:59:22,078][00488] Avg episode reward: 15.045, avg true_objective: 7.545
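The "Avg episode rewards" lines are cumulative means over the episodes finished so far (the "true rewards" column presumably tracks the env's raw objective, while the first column includes the shaping used for training). Per-episode values can therefore be recovered by differencing: a mean of 15.320 after one episode and 15.045 after two implies episode 2 scored 2 × 15.045 − 15.320 = 14.770. A minimal sketch of that bookkeeping:

```python
def episode_reward(prev_avg: float, new_avg: float, n_episodes: int) -> float:
    """Reward of episode n_episodes, given the running means before and after it."""
    return n_episodes * new_avg - (n_episodes - 1) * prev_avg

# Values copied from the two "Avg episode rewards" lines above.
print(episode_reward(15.320, 15.045, 2))  # 14.77: episode 2's shaped reward
```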
[2023-06-29 08:59:22,227][00488] Num frames 1600...
[2023-06-29 08:59:22,406][00488] Num frames 1700...
[2023-06-29 08:59:22,580][00488] Num frames 1800...
[2023-06-29 08:59:22,757][00488] Num frames 1900...
[2023-06-29 08:59:22,936][00488] Num frames 2000...
[2023-06-29 08:59:23,130][00488] Num frames 2100...
[2023-06-29 08:59:23,310][00488] Num frames 2200...
[2023-06-29 08:59:23,489][00488] Num frames 2300...
[2023-06-29 08:59:23,693][00488] Avg episode rewards: #0: 16.280, true rewards: #0: 7.947
[2023-06-29 08:59:23,696][00488] Avg episode reward: 16.280, avg true_objective: 7.947
[2023-06-29 08:59:23,729][00488] Num frames 2400...
[2023-06-29 08:59:23,901][00488] Num frames 2500...
[2023-06-29 08:59:24,083][00488] Num frames 2600...
[2023-06-29 08:59:24,260][00488] Num frames 2700...
[2023-06-29 08:59:24,435][00488] Num frames 2800...
[2023-06-29 08:59:24,615][00488] Num frames 2900...
[2023-06-29 08:59:24,793][00488] Num frames 3000...
[2023-06-29 08:59:24,979][00488] Num frames 3100...
[2023-06-29 08:59:25,170][00488] Num frames 3200...
[2023-06-29 08:59:25,351][00488] Num frames 3300...
[2023-06-29 08:59:25,533][00488] Num frames 3400...
[2023-06-29 08:59:25,707][00488] Num frames 3500...
[2023-06-29 08:59:25,884][00488] Num frames 3600...
[2023-06-29 08:59:25,939][00488] Avg episode rewards: #0: 19.000, true rewards: #0: 9.000
[2023-06-29 08:59:25,941][00488] Avg episode reward: 19.000, avg true_objective: 9.000
[2023-06-29 08:59:26,062][00488] Num frames 3700...
[2023-06-29 08:59:26,189][00488] Num frames 3800...
[2023-06-29 08:59:26,309][00488] Num frames 3900...
[2023-06-29 08:59:26,426][00488] Num frames 4000...
[2023-06-29 08:59:26,544][00488] Num frames 4100...
[2023-06-29 08:59:26,671][00488] Num frames 4200...
[2023-06-29 08:59:26,790][00488] Num frames 4300...
[2023-06-29 08:59:26,911][00488] Num frames 4400...
[2023-06-29 08:59:27,036][00488] Num frames 4500...
[2023-06-29 08:59:27,173][00488] Num frames 4600...
[2023-06-29 08:59:27,291][00488] Num frames 4700...
[2023-06-29 08:59:27,409][00488] Num frames 4800...
[2023-06-29 08:59:27,532][00488] Num frames 4900...
[2023-06-29 08:59:27,652][00488] Num frames 5000...
[2023-06-29 08:59:27,782][00488] Avg episode rewards: #0: 22.726, true rewards: #0: 10.126
[2023-06-29 08:59:27,783][00488] Avg episode reward: 22.726, avg true_objective: 10.126
[2023-06-29 08:59:27,830][00488] Num frames 5100...
[2023-06-29 08:59:27,951][00488] Num frames 5200...
[2023-06-29 08:59:28,069][00488] Num frames 5300...
[2023-06-29 08:59:28,226][00488] Avg episode rewards: #0: 19.627, true rewards: #0: 8.960
[2023-06-29 08:59:28,227][00488] Avg episode reward: 19.627, avg true_objective: 8.960
[2023-06-29 08:59:28,259][00488] Num frames 5400...
[2023-06-29 08:59:28,377][00488] Num frames 5500...
[2023-06-29 08:59:28,495][00488] Num frames 5600...
[2023-06-29 08:59:28,611][00488] Num frames 5700...
[2023-06-29 08:59:28,729][00488] Num frames 5800...
[2023-06-29 08:59:28,845][00488] Num frames 5900...
[2023-06-29 08:59:28,924][00488] Avg episode rewards: #0: 18.029, true rewards: #0: 8.457
[2023-06-29 08:59:28,926][00488] Avg episode reward: 18.029, avg true_objective: 8.457
[2023-06-29 08:59:29,028][00488] Num frames 6000...
[2023-06-29 08:59:29,148][00488] Num frames 6100...
[2023-06-29 08:59:29,276][00488] Num frames 6200...
[2023-06-29 08:59:29,392][00488] Num frames 6300...
[2023-06-29 08:59:29,521][00488] Num frames 6400...
[2023-06-29 08:59:29,640][00488] Num frames 6500...
[2023-06-29 08:59:29,759][00488] Num frames 6600...
[2023-06-29 08:59:29,879][00488] Num frames 6700...
[2023-06-29 08:59:30,006][00488] Num frames 6800...
[2023-06-29 08:59:30,127][00488] Num frames 6900...
[2023-06-29 08:59:30,253][00488] Num frames 7000...
[2023-06-29 08:59:30,373][00488] Num frames 7100...
[2023-06-29 08:59:30,495][00488] Num frames 7200...
[2023-06-29 08:59:30,617][00488] Num frames 7300...
[2023-06-29 08:59:30,739][00488] Num frames 7400...
[2023-06-29 08:59:30,857][00488] Num frames 7500...
[2023-06-29 08:59:30,980][00488] Num frames 7600...
[2023-06-29 08:59:31,131][00488] Avg episode rewards: #0: 20.600, true rewards: #0: 9.600
[2023-06-29 08:59:31,132][00488] Avg episode reward: 20.600, avg true_objective: 9.600
[2023-06-29 08:59:31,160][00488] Num frames 7700...
[2023-06-29 08:59:31,291][00488] Num frames 7800...
[2023-06-29 08:59:31,410][00488] Num frames 7900...
[2023-06-29 08:59:31,526][00488] Num frames 8000...
[2023-06-29 08:59:31,648][00488] Num frames 8100...
[2023-06-29 08:59:31,779][00488] Num frames 8200...
[2023-06-29 08:59:31,906][00488] Num frames 8300...
[2023-06-29 08:59:32,041][00488] Num frames 8400...
[2023-06-29 08:59:32,170][00488] Num frames 8500...
[2023-06-29 08:59:32,300][00488] Num frames 8600...
[2023-06-29 08:59:32,421][00488] Num frames 8700...
[2023-06-29 08:59:32,482][00488] Avg episode rewards: #0: 21.116, true rewards: #0: 9.671
[2023-06-29 08:59:32,483][00488] Avg episode reward: 21.116, avg true_objective: 9.671
[2023-06-29 08:59:32,602][00488] Num frames 8800...
[2023-06-29 08:59:32,723][00488] Num frames 8900...
[2023-06-29 08:59:32,842][00488] Num frames 9000...
[2023-06-29 08:59:32,974][00488] Num frames 9100...
[2023-06-29 08:59:33,107][00488] Num frames 9200...
[2023-06-29 08:59:33,231][00488] Num frames 9300...
[2023-06-29 08:59:33,356][00488] Num frames 9400...
[2023-06-29 08:59:33,483][00488] Num frames 9500...
[2023-06-29 08:59:33,601][00488] Num frames 9600...
[2023-06-29 08:59:33,722][00488] Num frames 9700...
[2023-06-29 08:59:33,845][00488] Num frames 9800...
[2023-06-29 08:59:33,971][00488] Num frames 9900...
[2023-06-29 08:59:34,092][00488] Num frames 10000...
[2023-06-29 08:59:34,221][00488] Avg episode rewards: #0: 22.161, true rewards: #0: 10.061
[2023-06-29 08:59:34,222][00488] Avg episode reward: 22.161, avg true_objective: 10.061
[2023-06-29 09:00:36,018][00488] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
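The /content paths indicate a Colab session. One common way to preview the saved replay inline in such a notebook; a sketch using stock IPython utilities, not part of Sample Factory:

```python
from base64 import b64encode
from IPython.display import HTML

# Embed the replay the log just reported as a data URL so it plays in the notebook.
mp4 = open("/content/train_dir/default_experiment/replay.mp4", "rb").read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f'<video width=640 controls><source src="{data_url}" type="video/mp4"></video>')
```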
[2023-06-29 09:01:40,507][00488] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-06-29 09:01:40,508][00488] Overriding arg 'num_workers' with value 1 passed from command line
[2023-06-29 09:01:40,512][00488] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-06-29 09:01:40,516][00488] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-06-29 09:01:40,519][00488] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-06-29 09:01:40,521][00488] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-06-29 09:01:40,524][00488] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-06-29 09:01:40,525][00488] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-06-29 09:01:40,526][00488] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-06-29 09:01:40,527][00488] Adding new argument 'hf_repository'='nomad-ai/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-06-29 09:01:40,531][00488] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-06-29 09:01:40,532][00488] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-06-29 09:01:40,533][00488] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-06-29 09:01:40,535][00488] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-06-29 09:01:40,536][00488] Using frameskip 1 and render_action_repeat=4 for evaluation
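This second enjoy run repeats the evaluation with push_to_hub=True, a 100000-frame cap, and a target repository, so the checkpoint, config, and replay are uploaded once the episodes finish. A self-contained sketch of the equivalent invocation, under the same sample-factory 2.x assumptions as the earlier sketch (flag names copied verbatim from the log); the upload additionally requires an authenticated huggingface_hub session:

```python
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.enjoy import enjoy
from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults

# Run `huggingface-cli login` (or huggingface_hub.login()) first so the
# upload at the end of enjoy() is authorized.
argv = [
    "--env=doom_health_gathering_supreme",  # assumption, as in the earlier sketch
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_frames=100000",
    "--max_num_episodes=10",
    "--push_to_hub",
    "--hf_repository=nomad-ai/rl_course_vizdoom_health_gathering_supreme",
]
parser, _ = parse_sf_args(argv=argv, evaluation=True)
add_doom_env_args(parser)
doom_override_defaults(parser)
cfg = parse_full_cfg(parser, argv)
status = enjoy(cfg)  # evaluates, saves replay.mp4, then pushes to the Hub
```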
[2023-06-29 09:01:40,560][00488] RunningMeanStd input shape: (3, 72, 128)
[2023-06-29 09:01:40,563][00488] RunningMeanStd input shape: (1,)
[2023-06-29 09:01:40,580][00488] ConvEncoder: input_channels=3
[2023-06-29 09:01:40,635][00488] Conv encoder output size: 512
[2023-06-29 09:01:40,637][00488] Policy head output size: 512
[2023-06-29 09:01:40,664][00488] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001960_4014080.pth...
[2023-06-29 09:01:41,377][00488] Num frames 100...
[2023-06-29 09:01:41,558][00488] Num frames 200...
[2023-06-29 09:01:41,695][00488] Num frames 300...
[2023-06-29 09:01:41,826][00488] Num frames 400...
[2023-06-29 09:01:41,945][00488] Num frames 500...
[2023-06-29 09:01:42,090][00488] Avg episode rewards: #0: 10.760, true rewards: #0: 5.760
[2023-06-29 09:01:42,092][00488] Avg episode reward: 10.760, avg true_objective: 5.760
[2023-06-29 09:01:42,124][00488] Num frames 600...
[2023-06-29 09:01:42,247][00488] Num frames 700...
[2023-06-29 09:01:42,376][00488] Num frames 800...
[2023-06-29 09:01:42,497][00488] Num frames 900...
[2023-06-29 09:01:42,612][00488] Num frames 1000...
[2023-06-29 09:01:42,734][00488] Num frames 1100...
[2023-06-29 09:01:42,857][00488] Num frames 1200...
[2023-06-29 09:01:42,931][00488] Avg episode rewards: #0: 12.580, true rewards: #0: 6.080
[2023-06-29 09:01:42,933][00488] Avg episode reward: 12.580, avg true_objective: 6.080
[2023-06-29 09:01:43,044][00488] Num frames 1300...
[2023-06-29 09:01:43,169][00488] Num frames 1400...
[2023-06-29 09:01:43,289][00488] Num frames 1500...
[2023-06-29 09:01:43,410][00488] Num frames 1600...
[2023-06-29 09:01:43,542][00488] Num frames 1700...
[2023-06-29 09:01:43,661][00488] Num frames 1800...
[2023-06-29 09:01:43,804][00488] Num frames 1900...
[2023-06-29 09:01:43,936][00488] Num frames 2000...
[2023-06-29 09:01:44,057][00488] Num frames 2100...
[2023-06-29 09:01:44,184][00488] Num frames 2200...
[2023-06-29 09:01:44,302][00488] Num frames 2300...
[2023-06-29 09:01:44,423][00488] Num frames 2400...
[2023-06-29 09:01:44,518][00488] Avg episode rewards: #0: 17.107, true rewards: #0: 8.107
[2023-06-29 09:01:44,519][00488] Avg episode reward: 17.107, avg true_objective: 8.107
[2023-06-29 09:01:44,605][00488] Num frames 2500...
[2023-06-29 09:01:44,724][00488] Num frames 2600...
[2023-06-29 09:01:44,844][00488] Num frames 2700...
[2023-06-29 09:01:44,972][00488] Num frames 2800...
[2023-06-29 09:01:45,097][00488] Num frames 2900...
[2023-06-29 09:01:45,223][00488] Num frames 3000...
[2023-06-29 09:01:45,343][00488] Num frames 3100...
[2023-06-29 09:01:45,474][00488] Num frames 3200...
[2023-06-29 09:01:45,605][00488] Num frames 3300...
[2023-06-29 09:01:45,733][00488] Num frames 3400...
[2023-06-29 09:01:45,858][00488] Num frames 3500...
[2023-06-29 09:01:45,991][00488] Num frames 3600...
[2023-06-29 09:01:46,117][00488] Num frames 3700...
[2023-06-29 09:01:46,239][00488] Num frames 3800...
[2023-06-29 09:01:46,366][00488] Num frames 3900...
[2023-06-29 09:01:46,493][00488] Num frames 4000...
[2023-06-29 09:01:46,619][00488] Num frames 4100...
[2023-06-29 09:01:46,746][00488] Num frames 4200...
[2023-06-29 09:01:46,863][00488] Num frames 4300...
[2023-06-29 09:01:46,995][00488] Num frames 4400...
[2023-06-29 09:01:47,073][00488] Avg episode rewards: #0: 26.545, true rewards: #0: 11.045
[2023-06-29 09:01:47,075][00488] Avg episode reward: 26.545, avg true_objective: 11.045
[2023-06-29 09:01:47,177][00488] Num frames 4500...
[2023-06-29 09:01:47,308][00488] Num frames 4600...
[2023-06-29 09:01:47,432][00488] Num frames 4700...
[2023-06-29 09:01:47,557][00488] Num frames 4800...
[2023-06-29 09:01:47,675][00488] Num frames 4900...
[2023-06-29 09:01:47,798][00488] Num frames 5000...
[2023-06-29 09:01:47,886][00488] Avg episode rewards: #0: 23.452, true rewards: #0: 10.052
[2023-06-29 09:01:47,887][00488] Avg episode reward: 23.452, avg true_objective: 10.052
[2023-06-29 09:01:47,983][00488] Num frames 5100...
[2023-06-29 09:01:48,103][00488] Num frames 5200...
[2023-06-29 09:01:48,225][00488] Num frames 5300...
[2023-06-29 09:01:48,348][00488] Num frames 5400...
[2023-06-29 09:01:48,473][00488] Num frames 5500...
[2023-06-29 09:01:48,596][00488] Num frames 5600...
[2023-06-29 09:01:48,716][00488] Num frames 5700...
[2023-06-29 09:01:48,840][00488] Num frames 5800...
[2023-06-29 09:01:49,019][00488] Avg episode rewards: #0: 22.490, true rewards: #0: 9.823
[2023-06-29 09:01:49,020][00488] Avg episode reward: 22.490, avg true_objective: 9.823
[2023-06-29 09:01:49,031][00488] Num frames 5900...
[2023-06-29 09:01:49,160][00488] Num frames 6000...
[2023-06-29 09:01:49,283][00488] Num frames 6100...
[2023-06-29 09:01:49,415][00488] Num frames 6200...
[2023-06-29 09:01:49,544][00488] Num frames 6300...
[2023-06-29 09:01:49,662][00488] Num frames 6400...
[2023-06-29 09:01:49,790][00488] Num frames 6500...
[2023-06-29 09:01:49,909][00488] Num frames 6600...
[2023-06-29 09:01:50,044][00488] Num frames 6700...
[2023-06-29 09:01:50,170][00488] Avg episode rewards: #0: 21.797, true rewards: #0: 9.654
[2023-06-29 09:01:50,171][00488] Avg episode reward: 21.797, avg true_objective: 9.654
[2023-06-29 09:01:50,226][00488] Num frames 6800...
[2023-06-29 09:01:50,361][00488] Num frames 6900...
[2023-06-29 09:01:50,493][00488] Num frames 7000...
[2023-06-29 09:01:50,614][00488] Num frames 7100...
[2023-06-29 09:01:50,743][00488] Num frames 7200...
[2023-06-29 09:01:50,863][00488] Num frames 7300...
[2023-06-29 09:01:50,991][00488] Num frames 7400...
[2023-06-29 09:01:51,114][00488] Num frames 7500...
[2023-06-29 09:01:51,233][00488] Num frames 7600...
[2023-06-29 09:01:51,350][00488] Num frames 7700...
[2023-06-29 09:01:51,476][00488] Num frames 7800...
[2023-06-29 09:01:51,603][00488] Num frames 7900...
[2023-06-29 09:01:51,784][00488] Num frames 8000...
[2023-06-29 09:01:51,957][00488] Num frames 8100...
[2023-06-29 09:01:52,139][00488] Num frames 8200...
[2023-06-29 09:01:52,317][00488] Num frames 8300...
[2023-06-29 09:01:52,386][00488] Avg episode rewards: #0: 24.506, true rewards: #0: 10.381
[2023-06-29 09:01:52,389][00488] Avg episode reward: 24.506, avg true_objective: 10.381
[2023-06-29 09:01:52,557][00488] Num frames 8400...
[2023-06-29 09:01:52,739][00488] Num frames 8500...
[2023-06-29 09:01:52,916][00488] Num frames 8600...
[2023-06-29 09:01:53,096][00488] Num frames 8700...
[2023-06-29 09:01:53,272][00488] Num frames 8800...
[2023-06-29 09:01:53,455][00488] Num frames 8900...
[2023-06-29 09:01:53,675][00488] Num frames 9000...
[2023-06-29 09:01:53,812][00488] Avg episode rewards: #0: 23.268, true rewards: #0: 10.046
[2023-06-29 09:01:53,813][00488] Avg episode reward: 23.268, avg true_objective: 10.046
[2023-06-29 09:01:53,917][00488] Num frames 9100...
[2023-06-29 09:01:54,100][00488] Num frames 9200...
[2023-06-29 09:01:54,299][00488] Num frames 9300...
[2023-06-29 09:01:54,480][00488] Num frames 9400...
[2023-06-29 09:01:54,669][00488] Num frames 9500...
[2023-06-29 09:01:54,849][00488] Num frames 9600...
[2023-06-29 09:01:55,028][00488] Num frames 9700...
[2023-06-29 09:01:55,211][00488] Num frames 9800...
[2023-06-29 09:01:55,392][00488] Num frames 9900...
[2023-06-29 09:01:55,569][00488] Avg episode rewards: #0: 22.969, true rewards: #0: 9.969
[2023-06-29 09:01:55,571][00488] Avg episode reward: 22.969, avg true_objective: 9.969
[2023-06-29 09:02:54,731][00488] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
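After the upload, the experiment can be pulled back down and re-evaluated locally. A hedged sketch using huggingface_hub's generic snapshot_download (Sample Factory also ships its own load_from_hub helper for this purpose):

```python
from huggingface_hub import snapshot_download

# Fetch the uploaded experiment (checkpoint, config.json, replay.mp4) into a
# local train_dir mirroring the layout the logs above read from.
local_dir = snapshot_download(
    repo_id="nomad-ai/rl_course_vizdoom_health_gathering_supreme",
    local_dir="train_dir/rl_course_vizdoom_health_gathering_supreme",
)
print(local_dir)
```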