The `env.step` return values changed, so the rollout loop now needs to look like this:
```python
# Number of timesteps run so far this batch
t = 0
while t < self.timesteps_per_batch:
    # Rewards this episode
    ep_rews = []

    obs = self.env.reset()
    if isinstance(obs, tuple):
        obs = obs[0]  # Assuming the first element of the tuple is the relevant data
    terminated = False

    for ep_t in range(self.max_timesteps_per_episode):
        # Increment timesteps ran this batch so far
        t += 1

        # Collect observation
        batch_obs.append(obs)

        action, log_prob = self.get_action(obs)
        obs, rew, terminated, truncated, _ = self.env.step(action)
        if isinstance(obs, tuple):
            obs = obs[0]  # Assuming the first element of the tuple is the relevant data

        # Collect reward, action, and log prob
        ep_rews.append(rew)
        batch_acts.append(action)
        batch_log_probs.append(log_prob)

        if terminated or truncated:
            break
```
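To see the API change in isolation, here is a minimal sketch of the new-style interface using a hypothetical toy environment (`ToyEnv` is a stand-in I made up, not part of Gym): `reset()` returns an `(observation, info)` tuple, and `step()` returns five values, with the old `done` flag split into `terminated` and `truncated`.

```python
class ToyEnv:
    """Dummy environment mimicking the new 5-value step API."""
    def __init__(self):
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0, {}  # reset() now returns (observation, info)

    def step(self, action):
        self._t += 1
        obs = float(self._t)
        rew = 1.0
        terminated = self._t >= 3  # episode ended naturally
        truncated = False          # episode cut short by a time limit
        return obs, rew, terminated, truncated, {}


env = ToyEnv()
obs, _ = env.reset()  # unpack the (obs, info) tuple
total = 0.0
done = False
while not done:
    obs, rew, terminated, truncated, _ = env.step(0)
    total += rew
    done = terminated or truncated  # episode is over if either flag is set
print(total)  # 3.0
```

The same unpacking pattern is what the tuple checks in the rollout loop above are guarding, so the code also works against older environments that still return a bare observation.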
Note that you now have to check the `terminated` and `truncated` return values. The latest documentation is here: https://www.gymlibrary.dev/api/core/

Without this change, if you follow along with the blog post, it will fail at the end of Blog 3 at this step:
Also, you need to update `Pendulum-v0` to `Pendulum-v1`.