nhiro3303 committed on
Commit c3aa30b · Parent: 59c8ec8

Upload . with huggingface_hub

.summary/0/events.out.tfevents.1677757888.ff5fdd36e73f ADDED
File without changes
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
       type: doom_health_gathering_supreme
     metrics:
     - type: mean_reward
-      value: 3.62 +/- 0.56
+      value: 4.03 +/- 0.29
       name: mean_reward
       verified: false
 ---
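The updated `mean_reward` entry (4.03 +/- 0.29) has the usual "mean plus/minus standard deviation over the evaluation episodes" shape. A minimal sketch of how such a string can be produced, assuming the population standard deviation is used (some evaluators use the sample std instead), with purely illustrative episode values:

```python
import statistics

def summarize_rewards(episode_rewards):
    """Format episode returns as a 'mean +/- std' string like the model card entry."""
    mean = statistics.mean(episode_rewards)
    # Population standard deviation; swap in statistics.stdev for the sample std.
    std = statistics.pstdev(episode_rewards)
    return f"{mean:.2f} +/- {std:.2f}"

# Illustrative values only -- not the actual episode returns behind this commit.
rewards = [3.84, 4.48, 3.84, 4.48, 3.84, 4.48, 4.48, 3.84, 4.48, 3.84]
print(summarize_rewards(rewards))
```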
checkpoint_p0/checkpoint_000000856_3506176.pth CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:83dd08008431c29eb9f979f192b9d1ac4662b824b7b7f7cde11cfc05d6934043
-size 34929220
+oid sha256:ac7fce1699259f447856a11a89e31a0ef159b096688e88d138a6a2c799a58b45
+size 34928614
checkpoint_p0/checkpoint_000000948_3883008.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5b7b8d989be983293e3bf8afcd6581da7f32db06a313fbbf024d115d74315535
+size 34929220
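The checkpoint and replay files above are stored as Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 of the real payload, and its size in bytes. A small parser sketch for that format:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a {key: value} dict.

    Pointer files consist of 'key value' lines, e.g.:
        version https://git-lfs.github.com/spec/v1
        oid sha256:5b7b8d98...
        size 34929220
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5b7b8d989be983293e3bf8afcd6581da7f32db06a313fbbf024d115d74315535
size 34929220"""
info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])
```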
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:432c1c300dd410967ba547c2ac5394f59153fe2cf5cf022f147c8bd00a8de6ca
-size 4879937
+oid sha256:3d08993bc7c4a36eccde901d230a1a1eb1cf849e9bce017d1880e93ca0f0808a
+size 6078630
sf_log.txt CHANGED
@@ -1139,3 +1139,292 @@ main_loop: 19.5008
 [2023-03-02 11:42:40,745][09917] Avg episode rewards: #0: 3.816, true rewards: #0: 3.616
 [2023-03-02 11:42:40,745][09917] Avg episode reward: 3.816, avg true_objective: 3.616
 [2023-03-02 11:42:44,193][09917] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
+ [2023-03-02 11:45:25,806][09917] The model has been pushed to https://huggingface.co/nhiro3303/rl_course_vizdoom_health_gathering_supreme
+ [2023-03-02 11:51:30,436][10553] Saving configuration to /home/gpu/train_dir/default_experiment/config.json...
+ [2023-03-02 11:51:30,436][10553] Rollout worker 0 uses device cpu
+ [2023-03-02 11:51:30,436][10553] Rollout worker 1 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 2 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 3 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 4 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 5 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 6 uses device cpu
+ [2023-03-02 11:51:30,437][10553] Rollout worker 7 uses device cpu
+ [2023-03-02 11:51:30,464][10553] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-03-02 11:51:30,464][10553] InferenceWorker_p0-w0: min num requests: 2
+ [2023-03-02 11:51:30,481][10553] Starting all processes...
+ [2023-03-02 11:51:30,481][10553] Starting process learner_proc0
+ [2023-03-02 11:51:31,134][10553] Starting all processes...
+ [2023-03-02 11:51:31,137][10553] Starting process inference_proc0-0
+ [2023-03-02 11:51:31,137][10611] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-03-02 11:51:31,137][10611] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+ [2023-03-02 11:51:31,137][10553] Starting process rollout_proc0
+ [2023-03-02 11:51:31,137][10553] Starting process rollout_proc1
+ [2023-03-02 11:51:31,138][10553] Starting process rollout_proc2
+ [2023-03-02 11:51:31,138][10553] Starting process rollout_proc3
+ [2023-03-02 11:51:31,138][10553] Starting process rollout_proc4
+ [2023-03-02 11:51:31,140][10553] Starting process rollout_proc5
+ [2023-03-02 11:51:31,142][10553] Starting process rollout_proc6
+ [2023-03-02 11:51:31,142][10553] Starting process rollout_proc7
+ [2023-03-02 11:51:31,173][10611] Num visible devices: 1
+ [2023-03-02 11:51:31,197][10611] Starting seed is not provided
+ [2023-03-02 11:51:31,197][10611] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-03-02 11:51:31,197][10611] Initializing actor-critic model on device cuda:0
+ [2023-03-02 11:51:31,198][10611] RunningMeanStd input shape: (3, 72, 128)
+ [2023-03-02 11:51:31,199][10611] RunningMeanStd input shape: (1,)
+ [2023-03-02 11:51:31,212][10611] ConvEncoder: input_channels=3
+ [2023-03-02 11:51:31,450][10611] Conv encoder output size: 512
+ [2023-03-02 11:51:31,451][10611] Policy head output size: 512
+ [2023-03-02 11:51:31,463][10611] Created Actor Critic model with architecture:
+ [2023-03-02 11:51:31,463][10611] ActorCriticSharedWeights(
+   (obs_normalizer): ObservationNormalizer(
+     (running_mean_std): RunningMeanStdDictInPlace(
+       (running_mean_std): ModuleDict(
+         (obs): RunningMeanStdInPlace()
+       )
+     )
+   )
+   (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+   (encoder): VizdoomEncoder(
+     (basic_encoder): ConvEncoder(
+       (enc): RecursiveScriptModule(
+         original_name=ConvEncoderImpl
+         (conv_head): RecursiveScriptModule(
+           original_name=Sequential
+           (0): RecursiveScriptModule(original_name=Conv2d)
+           (1): RecursiveScriptModule(original_name=ELU)
+           (2): RecursiveScriptModule(original_name=Conv2d)
+           (3): RecursiveScriptModule(original_name=ELU)
+           (4): RecursiveScriptModule(original_name=Conv2d)
+           (5): RecursiveScriptModule(original_name=ELU)
+         )
+         (mlp_layers): RecursiveScriptModule(
+           original_name=Sequential
+           (0): RecursiveScriptModule(original_name=Linear)
+           (1): RecursiveScriptModule(original_name=ELU)
+         )
+       )
+     )
+   )
+   (core): ModelCoreRNN(
+     (core): GRU(512, 512)
+   )
+   (decoder): MlpDecoder(
+     (mlp): Identity()
+   )
+   (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+   (action_parameterization): ActionParameterizationDefault(
+     (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
+   )
+ )
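The printed architecture pins down the head sizes: a GRU core of size 512, a critic head Linear(512, 1), and an action head Linear(512, 5) for the 5 Doom actions. As a quick back-of-envelope check, the parameter counts of those layers follow from the sizes alone (a sketch assuming PyTorch's GRU convention of two bias vectors per gate; the conv encoder and normalizers are omitted):

```python
def linear_params(n_in, n_out, bias=True):
    # weight matrix plus optional bias vector
    return n_in * n_out + (n_out if bias else 0)

def gru_params(n_in, n_hidden):
    # A GRU has 3 gates; each gate has an input-to-hidden weight matrix,
    # a hidden-to-hidden weight matrix, and (in PyTorch) two bias vectors.
    return 3 * (n_in * n_hidden + n_hidden * n_hidden + 2 * n_hidden)

core = gru_params(512, 512)        # (core): GRU(512, 512)
critic = linear_params(512, 1)     # (critic_linear): Linear(512 -> 1)
action = linear_params(512, 5)     # (distribution_linear): Linear(512 -> 5)
print(core, critic, action)        # 1575936 513 2565
```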
+ [2023-03-02 11:51:32,207][10641] Worker 1 uses CPU cores [2, 3]
+ [2023-03-02 11:51:32,226][10640] Worker 0 uses CPU cores [0, 1]
+ [2023-03-02 11:51:32,280][10639] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-03-02 11:51:32,280][10639] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+ [2023-03-02 11:51:32,287][10644] Worker 4 uses CPU cores [8, 9]
+ [2023-03-02 11:51:32,287][10647] Worker 7 uses CPU cores [14, 15]
+ [2023-03-02 11:51:32,289][10643] Worker 3 uses CPU cores [6, 7]
+ [2023-03-02 11:51:32,289][10645] Worker 5 uses CPU cores [10, 11]
+ [2023-03-02 11:51:32,295][10639] Num visible devices: 1
+ [2023-03-02 11:51:32,300][10646] Worker 6 uses CPU cores [12, 13]
+ [2023-03-02 11:51:32,316][10642] Worker 2 uses CPU cores [4, 5]
+ [2023-03-02 11:51:33,451][10611] Using optimizer <class 'torch.optim.adam.Adam'>
+ [2023-03-02 11:51:33,451][10611] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000856_3506176.pth...
+ [2023-03-02 11:51:33,471][10611] Loading model from checkpoint
+ [2023-03-02 11:51:33,478][10611] Loaded experiment state at self.train_step=948, self.env_steps=3883008
+ [2023-03-02 11:51:33,478][10611] Initialized policy 0 weights for model version 948
+ [2023-03-02 11:51:33,481][10611] LearnerWorker_p0 finished initialization!
+ [2023-03-02 11:51:33,482][10611] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-03-02 11:51:33,588][10639] Unhandled exception CUDA error: invalid resource handle in evt loop inference_proc0-0_evt_loop
+ [2023-03-02 11:51:33,897][10553] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3883008. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+ [2023-03-02 11:51:37,481][10553] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 10553], exiting...
+ [2023-03-02 11:51:37,481][10641] Stopping RolloutWorker_w1...
+ [2023-03-02 11:51:37,481][10645] Stopping RolloutWorker_w5...
+ [2023-03-02 11:51:37,481][10553] Runner profile tree view:
+ main_loop: 7.0007
+ [2023-03-02 11:51:37,481][10646] Stopping RolloutWorker_w6...
+ [2023-03-02 11:51:37,482][10553] Collected {0: 3883008}, FPS: 0.0
+ [2023-03-02 11:51:37,482][10642] Stopping RolloutWorker_w2...
+ [2023-03-02 11:51:37,482][10643] Stopping RolloutWorker_w3...
+ [2023-03-02 11:51:37,482][10645] Loop rollout_proc5_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10644] Stopping RolloutWorker_w4...
+ [2023-03-02 11:51:37,482][10641] Loop rollout_proc1_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10647] Stopping RolloutWorker_w7...
+ [2023-03-02 11:51:37,481][10611] Stopping Batcher_0...
+ [2023-03-02 11:51:37,482][10643] Loop rollout_proc3_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10646] Loop rollout_proc6_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10642] Loop rollout_proc2_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10644] Loop rollout_proc4_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10647] Loop rollout_proc7_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10640] Stopping RolloutWorker_w0...
+ [2023-03-02 11:51:37,482][10611] Loop batcher_evt_loop terminating...
+ [2023-03-02 11:51:37,482][10640] Loop rollout_proc0_evt_loop terminating...
+ [2023-03-02 11:51:37,483][10611] Saving /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000948_3883008.pth...
+ [2023-03-02 11:51:37,527][10611] Stopping LearnerWorker_p0...
+ [2023-03-02 11:51:37,527][10611] Loop learner_proc0_evt_loop terminating...
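The checkpoint just saved, `checkpoint_000000948_3883008.pth`, encodes the two counters the learner reports above ("self.train_step=948, self.env_steps=3883008") directly in its filename. A small sketch of decoding that naming convention:

```python
import re

def parse_checkpoint_name(filename):
    """Extract (train_step, env_steps) from a checkpoint name of the form
    'checkpoint_000000948_3883008.pth', as seen in this log."""
    m = re.match(r"checkpoint_(\d+)_(\d+)\.pth$", filename)
    if m is None:
        raise ValueError(f"unrecognized checkpoint name: {filename}")
    return int(m.group(1)), int(m.group(2))

print(parse_checkpoint_name("checkpoint_000000948_3883008.pth"))  # (948, 3883008)
```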
+ [2023-03-02 11:51:37,566][10553] Loading existing experiment configuration from /home/gpu/train_dir/default_experiment/config.json
+ [2023-03-02 11:51:37,566][10553] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-03-02 11:51:37,566][10553] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-03-02 11:51:37,566][10553] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-03-02 11:51:37,566][10553] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-03-02 11:51:37,566][10553] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'hf_repository'=None that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-03-02 11:51:37,567][10553] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-03-02 11:51:37,577][10553] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-03-02 11:51:37,577][10553] RunningMeanStd input shape: (3, 72, 128)
+ [2023-03-02 11:51:37,578][10553] RunningMeanStd input shape: (1,)
+ [2023-03-02 11:51:37,589][10553] ConvEncoder: input_channels=3
+ [2023-03-02 11:51:37,800][10553] Conv encoder output size: 512
+ [2023-03-02 11:51:37,800][10553] Policy head output size: 512
+ [2023-03-02 11:51:39,438][10553] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000948_3883008.pth...
+ [2023-03-02 11:51:40,207][10553] Num frames 100...
+ [2023-03-02 11:51:40,334][10553] Num frames 200...
+ [2023-03-02 11:51:40,460][10553] Num frames 300...
+ [2023-03-02 11:51:40,615][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:51:40,615][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:51:40,634][10553] Num frames 400...
+ [2023-03-02 11:51:40,757][10553] Num frames 500...
+ [2023-03-02 11:51:40,880][10553] Num frames 600...
+ [2023-03-02 11:51:41,003][10553] Num frames 700...
+ [2023-03-02 11:51:41,127][10553] Num frames 800...
+ [2023-03-02 11:51:41,219][10553] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160
+ [2023-03-02 11:51:41,219][10553] Avg episode reward: 4.660, avg true_objective: 4.160
+ [2023-03-02 11:51:41,304][10553] Num frames 900...
+ [2023-03-02 11:51:41,429][10553] Num frames 1000...
+ [2023-03-02 11:51:41,552][10553] Num frames 1100...
+ [2023-03-02 11:51:41,677][10553] Num frames 1200...
+ [2023-03-02 11:51:41,749][10553] Avg episode rewards: #0: 4.387, true rewards: #0: 4.053
+ [2023-03-02 11:51:41,749][10553] Avg episode reward: 4.387, avg true_objective: 4.053
+ [2023-03-02 11:51:41,854][10553] Num frames 1300...
+ [2023-03-02 11:51:41,976][10553] Num frames 1400...
+ [2023-03-02 11:51:42,100][10553] Num frames 1500...
+ [2023-03-02 11:51:42,222][10553] Num frames 1600...
+ [2023-03-02 11:51:42,355][10553] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160
+ [2023-03-02 11:51:42,356][10553] Avg episode reward: 4.660, avg true_objective: 4.160
+ [2023-03-02 11:51:42,402][10553] Num frames 1700...
+ [2023-03-02 11:51:42,527][10553] Num frames 1800...
+ [2023-03-02 11:51:42,654][10553] Num frames 1900...
+ [2023-03-02 11:51:42,778][10553] Num frames 2000...
+ [2023-03-02 11:51:42,869][10553] Avg episode rewards: #0: 4.458, true rewards: #0: 4.058
+ [2023-03-02 11:51:42,869][10553] Avg episode reward: 4.458, avg true_objective: 4.058
+ [2023-03-02 11:51:42,960][10553] Num frames 2100...
+ [2023-03-02 11:51:43,085][10553] Num frames 2200...
+ [2023-03-02 11:51:43,213][10553] Num frames 2300...
+ [2023-03-02 11:51:43,341][10553] Num frames 2400...
+ [2023-03-02 11:51:43,489][10553] Avg episode rewards: #0: 4.628, true rewards: #0: 4.128
+ [2023-03-02 11:51:43,490][10553] Avg episode reward: 4.628, avg true_objective: 4.128
+ [2023-03-02 11:51:43,517][10553] Num frames 2500...
+ [2023-03-02 11:51:43,642][10553] Num frames 2600...
+ [2023-03-02 11:51:43,766][10553] Num frames 2700...
+ [2023-03-02 11:51:43,890][10553] Num frames 2800...
+ [2023-03-02 11:51:44,014][10553] Num frames 2900...
+ [2023-03-02 11:51:44,193][10553] Avg episode rewards: #0: 5.127, true rewards: #0: 4.270
+ [2023-03-02 11:51:44,193][10553] Avg episode reward: 5.127, avg true_objective: 4.270
+ [2023-03-02 11:51:44,207][10553] Num frames 3000...
+ [2023-03-02 11:51:44,336][10553] Num frames 3100...
+ [2023-03-02 11:51:44,460][10553] Num frames 3200...
+ [2023-03-02 11:51:44,582][10553] Num frames 3300...
+ [2023-03-02 11:51:44,726][10553] Avg episode rewards: #0: 4.966, true rewards: #0: 4.216
+ [2023-03-02 11:51:44,726][10553] Avg episode reward: 4.966, avg true_objective: 4.216
+ [2023-03-02 11:51:44,764][10553] Num frames 3400...
+ [2023-03-02 11:51:44,910][10553] Num frames 3500...
+ [2023-03-02 11:51:45,053][10553] Num frames 3600...
+ [2023-03-02 11:51:45,198][10553] Num frames 3700...
+ [2023-03-02 11:51:45,350][10553] Num frames 3800...
+ [2023-03-02 11:51:45,438][10553] Avg episode rewards: #0: 5.023, true rewards: #0: 4.246
+ [2023-03-02 11:51:45,438][10553] Avg episode reward: 5.023, avg true_objective: 4.246
+ [2023-03-02 11:51:45,553][10553] Num frames 3900...
+ [2023-03-02 11:51:45,697][10553] Num frames 4000...
+ [2023-03-02 11:51:45,846][10553] Num frames 4100...
+ [2023-03-02 11:51:46,005][10553] Num frames 4200...
+ [2023-03-02 11:51:46,068][10553] Avg episode rewards: #0: 4.905, true rewards: #0: 4.205
+ [2023-03-02 11:51:46,068][10553] Avg episode reward: 4.905, avg true_objective: 4.205
+ [2023-03-02 11:51:50,116][10553] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
+ [2023-03-02 11:52:10,809][10553] Loading existing experiment configuration from /home/gpu/train_dir/default_experiment/config.json
+ [2023-03-02 11:52:10,809][10553] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-03-02 11:52:10,809][10553] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-03-02 11:52:10,809][10553] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-03-02 11:52:10,809][10553] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-03-02 11:52:10,809][10553] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-03-02 11:52:10,809][10553] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'hf_repository'='nhiro3303/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-03-02 11:52:10,810][10553] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-03-02 11:52:10,813][10553] RunningMeanStd input shape: (3, 72, 128)
+ [2023-03-02 11:52:10,814][10553] RunningMeanStd input shape: (1,)
+ [2023-03-02 11:52:10,821][10553] ConvEncoder: input_channels=3
+ [2023-03-02 11:52:10,844][10553] Conv encoder output size: 512
+ [2023-03-02 11:52:10,845][10553] Policy head output size: 512
+ [2023-03-02 11:52:10,864][10553] Loading state from checkpoint /home/gpu/train_dir/default_experiment/checkpoint_p0/checkpoint_000000948_3883008.pth...
+ [2023-03-02 11:52:11,229][10553] Num frames 100...
+ [2023-03-02 11:52:11,422][10553] Num frames 200...
+ [2023-03-02 11:52:11,622][10553] Num frames 300...
+ [2023-03-02 11:52:11,837][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:11,837][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:11,873][10553] Num frames 400...
+ [2023-03-02 11:52:12,065][10553] Num frames 500...
+ [2023-03-02 11:52:12,256][10553] Num frames 600...
+ [2023-03-02 11:52:12,452][10553] Num frames 700...
+ [2023-03-02 11:52:12,641][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:12,641][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:12,710][10553] Num frames 800...
+ [2023-03-02 11:52:12,902][10553] Num frames 900...
+ [2023-03-02 11:52:13,095][10553] Num frames 1000...
+ [2023-03-02 11:52:13,289][10553] Num frames 1100...
+ [2023-03-02 11:52:13,456][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:13,456][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:13,557][10553] Num frames 1200...
+ [2023-03-02 11:52:13,745][10553] Num frames 1300...
+ [2023-03-02 11:52:13,935][10553] Num frames 1400...
+ [2023-03-02 11:52:14,136][10553] Num frames 1500...
+ [2023-03-02 11:52:14,270][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:14,271][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:14,406][10553] Num frames 1600...
+ [2023-03-02 11:52:14,609][10553] Num frames 1700...
+ [2023-03-02 11:52:14,802][10553] Num frames 1800...
+ [2023-03-02 11:52:14,994][10553] Num frames 1900...
+ [2023-03-02 11:52:15,091][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:15,091][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:15,246][10553] Num frames 2000...
+ [2023-03-02 11:52:15,446][10553] Num frames 2100...
+ [2023-03-02 11:52:15,640][10553] Num frames 2200...
+ [2023-03-02 11:52:15,835][10553] Num frames 2300...
+ [2023-03-02 11:52:15,895][10553] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
+ [2023-03-02 11:52:15,896][10553] Avg episode reward: 3.840, avg true_objective: 3.840
+ [2023-03-02 11:52:16,085][10553] Num frames 2400...
+ [2023-03-02 11:52:16,278][10553] Num frames 2500...
+ [2023-03-02 11:52:16,466][10553] Num frames 2600...
+ [2023-03-02 11:52:16,654][10553] Num frames 2700...
+ [2023-03-02 11:52:16,812][10553] Avg episode rewards: #0: 4.074, true rewards: #0: 3.931
+ [2023-03-02 11:52:16,813][10553] Avg episode reward: 4.074, avg true_objective: 3.931
+ [2023-03-02 11:52:16,907][10553] Num frames 2800...
+ [2023-03-02 11:52:17,096][10553] Num frames 2900...
+ [2023-03-02 11:52:17,285][10553] Num frames 3000...
+ [2023-03-02 11:52:17,478][10553] Num frames 3100...
+ [2023-03-02 11:52:17,609][10553] Avg episode rewards: #0: 4.045, true rewards: #0: 3.920
+ [2023-03-02 11:52:17,609][10553] Avg episode reward: 4.045, avg true_objective: 3.920
+ [2023-03-02 11:52:17,737][10553] Num frames 3200...
+ [2023-03-02 11:52:17,927][10553] Num frames 3300...
+ [2023-03-02 11:52:18,122][10553] Num frames 3400...
+ [2023-03-02 11:52:18,317][10553] Num frames 3500...
+ [2023-03-02 11:52:18,538][10553] Avg episode rewards: #0: 4.204, true rewards: #0: 3.982
+ [2023-03-02 11:52:18,538][10553] Avg episode reward: 4.204, avg true_objective: 3.982
+ [2023-03-02 11:52:18,577][10553] Num frames 3600...
+ [2023-03-02 11:52:18,769][10553] Num frames 3700...
+ [2023-03-02 11:52:18,965][10553] Num frames 3800...
+ [2023-03-02 11:52:19,169][10553] Num frames 3900...
+ [2023-03-02 11:52:19,370][10553] Num frames 4000...
+ [2023-03-02 11:52:19,493][10553] Avg episode rewards: #0: 4.332, true rewards: #0: 4.032
+ [2023-03-02 11:52:19,494][10553] Avg episode reward: 4.332, avg true_objective: 4.032
+ [2023-03-02 11:52:23,407][10553] Replay video saved to /home/gpu/train_dir/default_experiment/replay.mp4!
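The "Avg episode rewards" lines above are running means over the episodes completed so far: after the first episode the average equals that episode's reward, and each subsequent episode shifts it. A minimal sketch of that bookkeeping, with illustrative episode values rather than ones recovered from this log:

```python
def running_averages(episode_rewards):
    """Yield the running mean after each completed episode,
    matching the shape of the 'Avg episode rewards' log lines."""
    total = 0.0
    for i, reward in enumerate(episode_rewards, start=1):
        total += reward
        yield round(total / i, 3)

# Illustrative episode returns only.
for avg in running_averages([3.84, 3.84, 4.08, 4.32]):
    print(f"Avg episode rewards: #0: {avg:.3f}")
```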