
Merge reference with state #230

Open
XyDrKRulof opened this issue Mar 14, 2024 · 3 comments
@XyDrKRulof
Collaborator

Currently, GEM returns a tuple of (state, reference) on each step and reset. This is incompatible with the standard RL library stable-baselines3 and may be incompatible with other libraries as well. Since, at least for reinforcement learning applications, state and reference always have to be concatenated for the agent, both could be merged into one state. Additionally, for those interested only in the core simulation, an initialization option could be offered to disable reference generation (and thus reward calculation) to speed up the code.
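For illustration, the proposed merge is a plain concatenation of the two arrays. A minimal sketch with made-up state/reference values (in GEM, the actual arrays come from the environment's step and reset):

```python
import numpy as np

# Hypothetical example values; in GEM, step()/reset() return (state, reference) as a tuple.
state = np.array([0.1, -0.5, 0.3])   # e.g. motor state variables
reference = np.array([0.2, 0.0])     # e.g. reference trajectory values

# Merging both into one flat observation for the agent:
merged = np.concatenate((state, reference))
print(merged.shape)  # -> (5,)
```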

@XyDrKRulof XyDrKRulof changed the title Merge reference and state Merge reference with state Mar 14, 2024
@XyDrKRulof
Collaborator Author

Additionally, the stable-baselines3 example code should be updated as it assumes a very outdated version of GEM.

@RapheaSid
I am trying to run the stable-baselines3 example code and am facing the same issue of the observation space being a tuple. Could you kindly provide the edited code snippet? I have been trying for many days but could not resolve it.

@XyDrKRulof
Collaborator Author

@RapheaSid, to make the current GEM environment compatible with stable-baselines3, you can apply a wrapper that changes the tuple into a flat array:

import numpy as np
from gymnasium import ObservationWrapper  # for older GEM versions, import from gym instead
from gymnasium.spaces import Box

class ObservationFlatter(ObservationWrapper):

    def __init__(self, env):
        super().__init__(env)
        # GEM's observation space is a tuple of (state_space, reference_space).
        state_space = self.env.observation_space[0]
        ref_space = self.env.observation_space[1]

        # Concatenate the bounds of both spaces into a single flat Box space.
        new_low = np.concatenate((state_space.low, ref_space.low))
        new_high = np.concatenate((state_space.high, ref_space.high))

        self.observation_space = Box(new_low, new_high)

    def observation(self, observation):
        # Flatten the (state, reference) tuple into a single array.
        return np.concatenate((observation[0], observation[1]))

After you have defined your environment, you can simply wrap it:

env = ObservationFlatter(env)

I hope that helps. If you have any further questions, feel free to ask. The next GEM version will probably change the state/reference observation tuple to a flat array.
