Upload . with huggingface_hub


Files changed (5)

.summary/0/events.out.tfevents.1677757307.ff5fdd36e73f +3 -0
README.md +1 -1
checkpoint_p0/checkpoint_000000856_3506176.pth +3 -0
replay.mp4 +2 -2
sf_log.txt +286 -0

.summary/0/events.out.tfevents.1677757307.ff5fdd36e73f ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dd97fbae60f5186db877d0d1680b08a0c9635120ef51163b4ad617394622de11
+size 410
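
The three added lines above are a Git LFS pointer file, not the TensorBoard event data itself: the repository stores only the spec version, a SHA-256 object ID, and the byte size of the real file. As a minimal sketch of reading those fields, the hypothetical helper `parse_lfs_pointer` below (not part of `huggingface_hub` or Git LFS) splits each `key value` line:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (illustrative helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:dd97fbae60f5186db877d0d1680b08a0c9635120ef51163b4ad617394622de11
size 410
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

The same helper applies to the pointers for the checkpoint and `replay.mp4` further down, since they share this three-line layout.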

README.md CHANGED

@@ -15,7 +15,7 @@ model-index:
       type: doom_health_gathering_supreme
     metrics:
     - type: mean_reward
-      value: 4.06 +/- 0.29
+      value: 3.62 +/- 0.56
       name: mean_reward
       verified: false
 ---

checkpoint_p0/checkpoint_000000856_3506176.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:83dd08008431c29eb9f979f192b9d1ac4662b824b7b7f7cde11cfc05d6934043
+size 34929220
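
The checkpoint file name appears to encode training progress: the log in `sf_log.txt` reports `self.train_step=856, self.env_steps=3506176` when loading `checkpoint_000000856_3506176.pth`. Assuming the pattern is `checkpoint_<train_step>_<env_steps>.pth` (an inference from that log line, not a documented guarantee), a small hypothetical helper can recover both counters from a path:

```python
import re
from pathlib import PurePosixPath
from typing import Tuple

# Assumed naming pattern, inferred from the log line that reports
# train_step=856 and env_steps=3506176 for this checkpoint.
CHECKPOINT_RE = re.compile(r"^checkpoint_(\d+)_(\d+)\.pth$")


def parse_checkpoint_name(path: str) -> Tuple[int, int]:
    """Return (train_step, env_steps) parsed from a checkpoint file name."""
    name = PurePosixPath(path).name
    match = CHECKPOINT_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    return int(match.group(1)), int(match.group(2))


train_step, env_steps = parse_checkpoint_name(
    "checkpoint_p0/checkpoint_000000856_3506176.pth"
)
print(train_step, env_steps)
```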

replay.mp4 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:481723c87624e186b57cc803e7ddac7e31e2d73bb360d1e9469344597411336c
-size 5837073
+oid sha256:432c1c300dd410967ba547c2ac5394f59153fe2cf5cf022f147c8bd00a8de6ca
+size 4879937

sf_log.txt CHANGED

@@ -853,3 +853,289 @@ main_loop: 967.7497
 [2023-03-02 11:10:59,408][09136] Avg episode rewards: #0: 4.464, true rewards: #0: 4.064
 [2023-03-02 11:10:59,408][09136] Avg episode reward: 4.464, avg true_objective: 4.064
 [2023-03-02 11:11:03,282][09136] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
+[2023-03-02 11:11:22,494][09136] The model has been pushed to https://huggingface.co/nhiro3303/rl_course_vizdoom_health_gathering_supreme
+[2023-03-02 11:41:48,937][09917] Saving configuration to /home/gpu/train_dir/default_experiment/config.json...
+[2023-03-02 11:41:48,938][09917] Rollout worker 0 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 1 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 2 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 3 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 4 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 5 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 6 uses device cpu
+[2023-03-02 11:41:48,938][09917] Rollout worker 7 uses device cpu
+[2023-03-02 11:41:48,966][09917] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-02 11:41:48,966][09917] InferenceWorker_p0-w0: min num requests: 2
+[2023-03-02 11:41:48,983][09917] Starting all processes...
+[2023-03-02 11:41:48,983][09917] Starting process learner_proc0
+[2023-03-02 11:41:49,683][09917] Starting all processes...
+[2023-03-02 11:41:49,686][09975] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-02 11:41:49,686][09975] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-03-02 11:41:49,686][09917] Starting process inference_proc0-0
+[2023-03-02 11:41:49,686][09917] Starting process rollout_proc0
+[2023-03-02 11:41:49,686][09917] Starting process rollout_proc1
+[2023-03-02 11:41:49,688][09917] Starting process rollout_proc2
+[2023-03-02 11:41:49,692][09917] Starting process rollout_proc3
+[2023-03-02 11:41:49,694][09917] Starting process rollout_proc4
+[2023-03-02 11:41:49,694][09917] Starting process rollout_proc5
+[2023-03-02 11:41:49,694][09917] Starting process rollout_proc6
+[2023-03-02 11:41:49,694][09917] Starting process rollout_proc7
+[2023-03-02 11:41:49,721][09975] Num visible devices: 1
+[2023-03-02 11:41:49,748][09975] Starting seed is not provided
+[2023-03-02 11:41:49,748][09975] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-02 11:41:49,748][09975] Initializing actor-critic model on device cuda:0
+[2023-03-02 11:41:49,749][09975] RunningMeanStd input shape: (3, 72, 128)
+[2023-03-02 11:41:49,750][09975] RunningMeanStd input shape: (1,)
+[2023-03-02 11:41:49,762][09975] ConvEncoder: input_channels=3
+[2023-03-02 11:41:50,005][09975] Conv encoder output size: 512
+[2023-03-02 11:41:50,005][09975] Policy head output size: 512
+[2023-03-02 11:41:50,018][09975] Created Actor Critic model with architecture:
+[2023-03-02 11:41:50,019][09975] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): VizdoomEncoder(
+    (basic_encoder): ConvEncoder(
+      (enc): RecursiveScriptModule(
+        original_name=ConvEncoderImpl
+        (conv_head): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Conv2d)
+          (1): RecursiveScriptModule(original_name=ELU)
+          (2): RecursiveScriptModule(original_name=Conv2d)
+          (3): RecursiveScriptModule(original_name=ELU)
+          (4): RecursiveScriptModule(original_name=Conv2d)
+          (5): RecursiveScriptModule(original_name=ELU)
+        )
+        (mlp_layers): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Linear)
+          (1): RecursiveScriptModule(original_name=ELU)
+        )
+      )
+    )
+  )
+  (core): ModelCoreRNN(
+    (core): GRU(512, 512)
+  )
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationDefault(
+    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
+  )
+)
+[2023-03-02 11:41:50,847][10004] Worker 0 uses CPU cores [0, 1]
+[2023-03-02 11:41:50,857][10011] Worker 7 uses CPU cores [14, 15]
+[2023-03-02 11:41:50,857][10005] Worker 1 uses CPU cores [2, 3]
+[2023-03-02 11:41:50,857][10009] Worker 5 uses CPU cores [10, 11]
+[2023-03-02 11:41:50,888][10006] Worker 2 uses CPU cores [4, 5]
+[2023-03-02 11:41:50,890][10007] Worker 3 uses CPU cores [6, 7]
+[2023-03-02 11:41:50,891][10010] Worker 6 uses CPU cores [12, 13]
+[2023-03-02 11:41:50,908][10003] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-02 11:41:50,908][10003] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2023-03-02 11:41:50,921][10003] Num visible devices: 1
+[2023-03-02 11:41:50,924][10008] Worker 4 uses CPU cores [8, 9]
+[2023-03-02 11:41:52,109][09975] Using optimizer <class 'torch.optim.adam.Adam'>
+[2023-03-02 11:41:52,109][09975] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000856_3506176.pth...
+[2023-03-02 11:41:52,129][09975] Loading model from checkpoint
+[2023-03-02 11:41:52,136][09975] Loaded experiment state at self.train_step=856, self.env_steps=3506176
+[2023-03-02 11:41:52,137][09975] Initialized policy 0 weights for model version 856
+[2023-03-02 11:41:52,140][09975] LearnerWorker_p0 finished initialization!
+[2023-03-02 11:41:52,140][09975] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-02 11:41:52,242][10003] Unhandled exception CUDA error: invalid resource handle in evt loop inference_proc0-0_evt_loop
+[2023-03-02 11:41:52,400][09917] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3506176. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-02 11:41:57,400][09917] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3506176. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-02 11:42:02,400][09917] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3506176. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-02 11:42:07,400][09917] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3506176. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-02 11:42:08,483][09917] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 9917], exiting...
+[2023-03-02 11:42:08,484][10009] Stopping RolloutWorker_w5...
+[2023-03-02 11:42:08,484][10008] Stopping RolloutWorker_w4...
+[2023-03-02 11:42:08,484][09917] Runner profile tree view:
+main_loop: 19.5008
+[2023-03-02 11:42:08,484][09917] Collected {0: 3506176}, FPS: 0.0
+[2023-03-02 11:42:08,484][10005] Stopping RolloutWorker_w1...
+[2023-03-02 11:42:08,484][10006] Stopping RolloutWorker_w2...
+[2023-03-02 11:42:08,484][10010] Stopping RolloutWorker_w6...
+[2023-03-02 11:42:08,484][10009] Loop rollout_proc5_evt_loop terminating...
+[2023-03-02 11:42:08,484][10007] Stopping RolloutWorker_w3...
+[2023-03-02 11:42:08,484][09975] Stopping Batcher_0...
+[2023-03-02 11:42:08,484][10004] Stopping RolloutWorker_w0...
+[2023-03-02 11:42:08,484][10008] Loop rollout_proc4_evt_loop terminating...
+[2023-03-02 11:42:08,484][10011] Stopping RolloutWorker_w7...
+[2023-03-02 11:42:08,484][10006] Loop rollout_proc2_evt_loop terminating...
+[2023-03-02 11:42:08,484][10005] Loop rollout_proc1_evt_loop terminating...
+[2023-03-02 11:42:08,484][10010] Loop rollout_proc6_evt_loop terminating...
+[2023-03-02 11:42:08,484][10007] Loop rollout_proc3_evt_loop terminating...
+[2023-03-02 11:42:08,484][10004] Loop rollout_proc0_evt_loop terminating...
+[2023-03-02 11:42:08,484][10011] Loop rollout_proc7_evt_loop terminating...
+[2023-03-02 11:42:08,484][09975] Loop batcher_evt_loop terminating...
+[2023-03-02 11:42:08,485][09975] Saving /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000856_3506176.pth...
+[2023-03-02 11:42:08,546][09975] Stopping LearnerWorker_p0...
+[2023-03-02 11:42:08,547][09975] Loop learner_proc0_evt_loop terminating...
+[2023-03-02 11:42:08,580][09917] Loading existing experiment configuration from /home/gpu/train_dir/default_experiment/config.json
+[2023-03-02 11:42:08,580][09917] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-03-02 11:42:08,580][09917] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+[2023-03-02 11:42:08,581][09917] Adding new argument 'hf_repository'=None that is not in the saved config file!
+[2023-03-02 11:42:08,582][09917] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-03-02 11:42:08,582][09917] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-03-02 11:42:08,582][09917] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-03-02 11:42:08,582][09917] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-03-02 11:42:08,582][09917] Using frameskip 1 and render_action_repeat=4 for evaluation
+[2023-03-02 11:42:08,592][09917] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-02 11:42:08,593][09917] RunningMeanStd input shape: (3, 72, 128)
+[2023-03-02 11:42:08,593][09917] RunningMeanStd input shape: (1,)
+[2023-03-02 11:42:08,606][09917] ConvEncoder: input_channels=3
+[2023-03-02 11:42:08,837][09917] Conv encoder output size: 512
+[2023-03-02 11:42:08,838][09917] Policy head output size: 512
+[2023-03-02 11:42:10,551][09917] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000856_3506176.pth...
+[2023-03-02 11:42:11,320][09917] Num frames 100...
+[2023-03-02 11:42:11,449][09917] Num frames 200...
+[2023-03-02 11:42:11,577][09917] Num frames 300...
+[2023-03-02 11:42:11,704][09917] Num frames 400...
+[2023-03-02 11:42:11,818][09917] Avg episode rewards: #0: 5.480, true rewards: #0: 4.480
+[2023-03-02 11:42:11,818][09917] Avg episode reward: 5.480, avg true_objective: 4.480
+[2023-03-02 11:42:11,886][09917] Num frames 500...
+[2023-03-02 11:42:12,014][09917] Num frames 600...
+[2023-03-02 11:42:12,141][09917] Num frames 700...
+[2023-03-02 11:42:12,267][09917] Num frames 800...
+[2023-03-02 11:42:12,404][09917] Avg episode rewards: #0: 5.320, true rewards: #0: 4.320
+[2023-03-02 11:42:12,404][09917] Avg episode reward: 5.320, avg true_objective: 4.320
+[2023-03-02 11:42:12,450][09917] Num frames 900...
+[2023-03-02 11:42:12,576][09917] Num frames 1000...
+[2023-03-02 11:42:12,701][09917] Num frames 1100...
+[2023-03-02 11:42:12,827][09917] Num frames 1200...
+[2023-03-02 11:42:12,940][09917] Avg episode rewards: #0: 4.827, true rewards: #0: 4.160
+[2023-03-02 11:42:12,940][09917] Avg episode reward: 4.827, avg true_objective: 4.160
+[2023-03-02 11:42:13,007][09917] Num frames 1300...
+[2023-03-02 11:42:13,136][09917] Num frames 1400...
+[2023-03-02 11:42:13,262][09917] Num frames 1500...
+[2023-03-02 11:42:13,393][09917] Num frames 1600...
+[2023-03-02 11:42:13,487][09917] Avg episode rewards: #0: 4.580, true rewards: #0: 4.080
+[2023-03-02 11:42:13,487][09917] Avg episode reward: 4.580, avg true_objective: 4.080
+[2023-03-02 11:42:13,578][09917] Num frames 1700...
+[2023-03-02 11:42:13,705][09917] Num frames 1800...
+[2023-03-02 11:42:13,834][09917] Num frames 1900...
+[2023-03-02 11:42:13,961][09917] Num frames 2000...
+[2023-03-02 11:42:14,036][09917] Avg episode rewards: #0: 4.432, true rewards: #0: 4.032
+[2023-03-02 11:42:14,036][09917] Avg episode reward: 4.432, avg true_objective: 4.032
+[2023-03-02 11:42:14,156][09917] Num frames 2100...
+[2023-03-02 11:42:14,296][09917] Num frames 2200...
+[2023-03-02 11:42:14,484][09917] Num frames 2300...
+[2023-03-02 11:42:14,624][09917] Num frames 2400...
+[2023-03-02 11:42:14,675][09917] Avg episode rewards: #0: 4.333, true rewards: #0: 4.000
+[2023-03-02 11:42:14,675][09917] Avg episode reward: 4.333, avg true_objective: 4.000
+[2023-03-02 11:42:14,803][09917] Num frames 2500...
+[2023-03-02 11:42:14,929][09917] Num frames 2600...
+[2023-03-02 11:42:15,056][09917] Num frames 2700...
+[2023-03-02 11:42:15,184][09917] Num frames 2800...
+[2023-03-02 11:42:15,299][09917] Avg episode rewards: #0: 4.497, true rewards: #0: 4.069
+[2023-03-02 11:42:15,300][09917] Avg episode reward: 4.497, avg true_objective: 4.069
+[2023-03-02 11:42:15,376][09917] Num frames 2900...
+[2023-03-02 11:42:15,507][09917] Num frames 3000...
+[2023-03-02 11:42:15,632][09917] Num frames 3100...
+[2023-03-02 11:42:15,772][09917] Num frames 3200...
+[2023-03-02 11:42:15,974][09917] Avg episode rewards: #0: 4.620, true rewards: #0: 4.120
+[2023-03-02 11:42:15,974][09917] Avg episode reward: 4.620, avg true_objective: 4.120
+[2023-03-02 11:42:15,982][09917] Num frames 3300...
+[2023-03-02 11:42:16,136][09917] Num frames 3400...
+[2023-03-02 11:42:16,291][09917] Num frames 3500...
+[2023-03-02 11:42:16,451][09917] Num frames 3600...
+[2023-03-02 11:42:16,637][09917] Avg episode rewards: #0: 4.533, true rewards: #0: 4.089
+[2023-03-02 11:42:16,638][09917] Avg episode reward: 4.533, avg true_objective: 4.089
+[2023-03-02 11:42:16,676][09917] Num frames 3700...
+[2023-03-02 11:42:16,842][09917] Num frames 3800...
+[2023-03-02 11:42:17,012][09917] Num frames 3900...
+[2023-03-02 11:42:17,185][09917] Num frames 4000...
+[2023-03-02 11:42:17,362][09917] Avg episode rewards: #0: 4.464, true rewards: #0: 4.064
+[2023-03-02 11:42:17,362][09917] Avg episode reward: 4.464, avg true_objective: 4.064
+[2023-03-02 11:42:21,426][09917] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
+[2023-03-02 11:42:32,345][09917] Loading existing experiment configuration from /home/gpu/train_dir/default_experiment/config.json
+[2023-03-02 11:42:32,345][09917] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-03-02 11:42:32,345][09917] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-03-02 11:42:32,345][09917] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-03-02 11:42:32,345][09917] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-03-02 11:42:32,345][09917] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-03-02 11:42:32,345][09917] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+[2023-03-02 11:42:32,345][09917] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'hf_repository'='nhiro3303/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-03-02 11:42:32,346][09917] Using frameskip 1 and render_action_repeat=4 for evaluation
+[2023-03-02 11:42:32,349][09917] RunningMeanStd input shape: (3, 72, 128)
+[2023-03-02 11:42:32,349][09917] RunningMeanStd input shape: (1,)
+[2023-03-02 11:42:32,356][09917] ConvEncoder: input_channels=3
+[2023-03-02 11:42:32,379][09917] Conv encoder output size: 512
+[2023-03-02 11:42:32,379][09917] Policy head output size: 512
+[2023-03-02 11:42:32,399][09917] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000856_3506176.pth...
+[2023-03-02 11:42:32,789][09917] Num frames 100...
+[2023-03-02 11:42:33,008][09917] Num frames 200...
+[2023-03-02 11:42:33,216][09917] Num frames 300...
+[2023-03-02 11:42:33,431][09917] Num frames 400...
+[2023-03-02 11:42:33,520][09917] Avg episode rewards: #0: 4.160, true rewards: #0: 4.160
+[2023-03-02 11:42:33,521][09917] Avg episode reward: 4.160, avg true_objective: 4.160
+[2023-03-02 11:42:33,708][09917] Num frames 500...
+[2023-03-02 11:42:33,921][09917] Num frames 600...
+[2023-03-02 11:42:34,148][09917] Num frames 700...
+[2023-03-02 11:42:34,362][09917] Num frames 800...
+[2023-03-02 11:42:34,484][09917] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160
+[2023-03-02 11:42:34,484][09917] Avg episode reward: 4.660, avg true_objective: 4.160
+[2023-03-02 11:42:34,633][09917] Num frames 900...
+[2023-03-02 11:42:34,835][09917] Num frames 1000...
+[2023-03-02 11:42:35,067][09917] Avg episode rewards: #0: 3.960, true rewards: #0: 3.627
+[2023-03-02 11:42:35,067][09917] Avg episode reward: 3.960, avg true_objective: 3.627
+[2023-03-02 11:42:35,089][09917] Num frames 1100...
+[2023-03-02 11:42:35,305][09917] Num frames 1200...
+[2023-03-02 11:42:35,521][09917] Num frames 1300...
+[2023-03-02 11:42:35,726][09917] Num frames 1400...
+[2023-03-02 11:42:35,864][09917] Avg episode rewards: #0: 4.100, true rewards: #0: 3.600
+[2023-03-02 11:42:35,864][09917] Avg episode reward: 4.100, avg true_objective: 3.600
+[2023-03-02 11:42:35,991][09917] Num frames 1500...
+[2023-03-02 11:42:36,195][09917] Num frames 1600...
+[2023-03-02 11:42:36,414][09917] Num frames 1700...
+[2023-03-02 11:42:36,616][09917] Num frames 1800...
+[2023-03-02 11:42:36,718][09917] Avg episode rewards: #0: 4.048, true rewards: #0: 3.648
+[2023-03-02 11:42:36,718][09917] Avg episode reward: 4.048, avg true_objective: 3.648
+[2023-03-02 11:42:36,882][09917] Num frames 1900...
+[2023-03-02 11:42:37,083][09917] Num frames 2000...
+[2023-03-02 11:42:37,277][09917] Num frames 2100...
+[2023-03-02 11:42:37,490][09917] Num frames 2200...
+[2023-03-02 11:42:37,561][09917] Avg episode rewards: #0: 4.013, true rewards: #0: 3.680
+[2023-03-02 11:42:37,561][09917] Avg episode reward: 4.013, avg true_objective: 3.680
+[2023-03-02 11:42:37,746][09917] Num frames 2300...
+[2023-03-02 11:42:38,005][09917] Num frames 2400...
+[2023-03-02 11:42:38,214][09917] Num frames 2500...
+[2023-03-02 11:42:38,449][09917] Avg episode rewards: #0: 3.989, true rewards: #0: 3.703
+[2023-03-02 11:42:38,449][09917] Avg episode reward: 3.989, avg true_objective: 3.703
+[2023-03-02 11:42:38,466][09917] Num frames 2600...
+[2023-03-02 11:42:38,670][09917] Num frames 2700...
+[2023-03-02 11:42:38,865][09917] Num frames 2800...
+[2023-03-02 11:42:39,070][09917] Num frames 2900...
+[2023-03-02 11:42:39,290][09917] Avg episode rewards: #0: 3.970, true rewards: #0: 3.720
+[2023-03-02 11:42:39,290][09917] Avg episode reward: 3.970, avg true_objective: 3.720
+[2023-03-02 11:42:39,353][09917] Num frames 3000...
+[2023-03-02 11:42:39,552][09917] Num frames 3100...
+[2023-03-02 11:42:39,763][09917] Num frames 3200...
+[2023-03-02 11:42:39,965][09917] Num frames 3300...
+[2023-03-02 11:42:40,144][09917] Avg episode rewards: #0: 3.956, true rewards: #0: 3.733
+[2023-03-02 11:42:40,145][09917] Avg episode reward: 3.956, avg true_objective: 3.733
+[2023-03-02 11:42:40,232][09917] Num frames 3400...
+[2023-03-02 11:42:40,452][09917] Num frames 3500...
+[2023-03-02 11:42:40,660][09917] Num frames 3600...
+[2023-03-02 11:42:40,745][09917] Avg episode rewards: #0: 3.816, true rewards: #0: 3.616
+[2023-03-02 11:42:40,745][09917] Avg episode reward: 3.816, avg true_objective: 3.616
+[2023-03-02 11:42:44,193][09917] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
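
The `true rewards` values in the final evaluation run look like running means over the episodes completed so far, and the last one (3.616) lines up with the new README metric `3.62 +/- 0.56`. Assuming that reading is right, and assuming the README spread is a population standard deviation over the 10 episodes (neither is confirmed by the log itself), the per-episode rewards can be recovered by differencing the running totals. This is a sketch of that arithmetic, not the code Sample Factory uses:

```python
from statistics import fmean, pstdev

# "true rewards" running averages after episodes 1..10, copied from the
# evaluation run that starts at 11:42:32 in the log above.
running_true_avg = [4.160, 4.160, 3.627, 3.600, 3.648,
                    3.680, 3.703, 3.720, 3.733, 3.616]


def per_episode_from_running_mean(averages):
    """Invert a running mean: reward_k = k * avg_k - (k - 1) * avg_{k-1}."""
    rewards = []
    prev_total = 0.0
    for k, avg in enumerate(averages, start=1):
        total = k * avg  # cumulative reward after k episodes
        rewards.append(total - prev_total)
        prev_total = total
    return rewards


episode_rewards = per_episode_from_running_mean(running_true_avg)
mean_reward = fmean(episode_rewards)
std_reward = pstdev(episode_rewards)  # population std (assumption)
print(f"{mean_reward:.2f} +/- {std_reward:.2f}")
```

Because the log rounds each running average to three decimals, the recovered spread is only approximate and can differ from the README value in the last printed digit.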