ACME-GymEnvs

This repository was prepared as part of the Midterm and Final Project for BYU's Math 438 class, Modeling with Dynamics and Control 2. Students were put into groups of 4 and asked to solve the continuous cartpole problem using some form of Optimal Control. They succeed if they can demonstrate robust code that brings the pendulum upright and keeps it there "for all time".

Barebones Installation Guide

The environments here were initially built against Gym version 0.10.4, which was the latest release as of March 2018.

Make sure you have Gym installed by following the official installation instructions.

Clone this repository and do a local installation with the following commands:

git clone https://github.com/MitchProbst/ACME-GymEnvs
cd ACME-GymEnvs
pip install -e .

Note: You do not have to pip install the environment, but if you skip that step you will need to put the acme_gym folder in the parent directory of whatever solution file you are going to use, as in the layout below.
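For example, assuming a solution file named my_solution.py (a hypothetical name), the no-install layout would look like this:

ACME-GymEnvs/
    acme_gym/        <- the package that import acme_gym picks up
    my_solution.py   <- your script; import acme_gym finds the folder next to it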

You can test that you have installed it correctly by running a Python script like the following (see the end of this README for a couple of common troubleshooting issues):

import gym
import numpy as np

import acme_gym  # registers the custom environments with Gym

# CartPoleContinuous-v0 is one of the custom environments for you to use
env = gym.make("CartPoleContinuous-v0")
observation = env.reset()
# observation is a 4-tuple of x, x', θ, θ'
print(observation)
env.render()

# Push the cart to the right with a force of 20
new_obs, reward, done, info = env.step(np.array([20]))
env.render()

# Call env.close() at the end of a file to terminate the viewing window

Because Gym is meant for reinforcement learning, it makes some assumptions about when you want to terminate a run; this is managed with the done variable returned at each step. For the purposes of Optimal Control we can simply ignore it, even if doing so brings up a warning. The reward and info variables are likewise useless to us.
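To make this concrete, here is a minimal stepping loop that discards all three of those values and keeps simulating regardless; the zero force is just a placeholder for whatever your controller computes:

import gym
import numpy as np

import acme_gym  # registers the custom environments

env = gym.make("CartPoleContinuous-v0")
obs = env.reset()
for _ in range(500):                 # 500 steps = 10 simulated seconds
    force = np.array([0.0])          # placeholder: your controller goes here
    obs, _, _, _ = env.step(force)   # reward, done, and info are discarded
    env.render()
env.close()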

There are two examples illustrating how to work with the Gym environment, one in a .py file and one in a .ipynb notebook. Rendering from a Jupyter notebook can crash your kernel if you interrupt the notebook while the system is running. Proceed with caution, you have been warned!

See the .py example here.

See the jupyter example here.

Important Details

  1. The environments are designed to step at 0.02-second intervals, so 0.02 seconds of simulated time elapse between steps. Even if your code takes 3 seconds to compute each step, the beauty of Gym is that we control time. You may have a hard time getting a solver to work correctly if you do not mimic the 0.02 s step size (maybe you can; if so, you are awesome!).
  2. I highly recommend reading the source code for the CartPoleContinuousEnv class; it contains relevant information such as the variables of our system.
  3. If you look at the source code and see length set to 0.5, read the comment: it really means the total length of the pole is 1.
  4. One possible solution path is to re-use some of the principles from the ACME LQR lab, but recognize that the lab used a physical system that assumes a massless rod with a weighted object at the end. Here we are using a rod with mass and no additional object at the end.
  5. You might also want to implement some kind of continuous feedback into your model; you may not be able to perfectly describe the dynamics of the system in any solver you use, but you can hopefully get close. (A rough sketch combining items 4 and 5 appears after this list.)
  6. To see all possible environments and the string IDs used to instantiate them, look here.
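For concreteness, here is a minimal LQR sketch in the spirit of items 4 and 5. It is not the intended solution, just one possibility, and it assumes the environment uses the same default parameters as Gym's classic cart-pole (cart mass 1.0, pole mass 0.1, half-length 0.5, g = 9.8) and the standard linearization about the upright position; per item 2, verify these against the source before trusting the numbers.

import gym
import numpy as np
from scipy import linalg

import acme_gym

# Assumed physical parameters; confirm against CartPoleContinuousEnv's source.
g = 9.8
masscart, masspole = 1.0, 0.1
total = masscart + masspole
l = 0.5                                  # half-length of the pole (total length 1)
D = l * (4.0 / 3.0 - masspole / total)

# Linearized cart-pole dynamics about upright, state s = (x, x', θ, θ'),
# input u = horizontal force on the cart:
#   θ''  =  (g / D) θ - u / (total * D)
#   x''  = -(masspole * l * g / (total * D)) θ
#          + (1 / total) * (1 + masspole * l / (total * D)) u
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -masspole * l * g / (total * D), 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, g / D, 0.0],
])
B = np.array([
    [0.0],
    [(1.0 / total) * (1.0 + masspole * l / (total * D))],
    [0.0],
    [-1.0 / (total * D)],
])

# Continuous-time LQR: solve the algebraic Riccati equation for the gain K.
Q = np.diag([1.0, 1.0, 10.0, 1.0])       # penalize pole angle most heavily
R = np.array([[0.1]])
P = linalg.solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)          # feedback law u = -K s

env = gym.make("CartPoleContinuous-v0")
obs = env.reset()
for _ in range(1000):                    # each step advances time by 0.02 s
    u = -K @ obs                         # continuous state feedback (item 5)
    obs, _, _, _ = env.step(u)           # ignore reward, done, and info
    env.render()
env.close()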

Some Troubleshooting

Q: I am getting a "NotImplementedError: abstract" error, what should I do?

A: This is a known pyglet incompatibility; try downgrading pyglet to version 1.2.4 (pip install pyglet==1.2.4) and see if that fixes your problem.

Q: I am getting an "AssertionError: array([ BIG NUMBER ]) (<class 'numpy.ndarray'>) invalid" error, what should I do?

A: The environment is capped so you cannot apply forces of insane magnitude (2000 is already really large). If you are trying to apply forces stronger than that, you ought to refactor your solution.
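One cheap safeguard while you debug is to clip each action before stepping; u_max below is an arbitrary placeholder, not the environment's actual cap:

import numpy as np

u_max = 100.0                   # hypothetical bound, chosen well inside the cap
u = np.array([12345.0])         # an oversized force your solver might produce
u = np.clip(u, -u_max, u_max)   # clamp it before passing to env.step(u)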

Q: I cannot see the environment when I call env.render() even though I am getting no errors, what is going on?

A: The viewer showing the actual CartPole may not appear at the front of your screen. Check and see if it appeared in the background somewhere, especially if you are using a notebook.
