
gym-catcher

An environment for OpenAI Gym, introducing a selective attention problem.

The objective is to catch as many falling balls as possible using only discrete (True/False) sensors within a limited viewing angle.
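
A minimal usage sketch; the import name and environment id below are assumptions based on the repository layout (check gym_catcher/__init__.py for the actual registration):

import gym
import gym_catcher  # registers the environment with gym (assumed import name)

env = gym.make('Catcher-v0')  # 'Catcher-v0' is an assumed id
obs = env.reset()
env.render()
env.close()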

This README also discusses the trained agent; for more, see the Demo and Discussion sections.

Properties

Action space:

There are two available actions: move left (0) and move right (1).

Actions: Discrete(2)
Num      Action
0        move left
1        move right
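
Continuing the sketch above, an action is passed to step as a plain integer:

LEFT, RIGHT = 0, 1
obs, reward, done, info = env.step(RIGHT)  # moves the cart CART_MOVING_SPEED pixels to the right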

Observation space:

The observation is an array whose first value is the cart position; the remaining elements are binary values indicating whether the corresponding sensor is activated.

Observations: Box(1 + N_SENSORS)
Num             Meaning
0               position of the cart on the screen, in the range [-2.4, 2.4]
1-N_SENSORS     binary value for each sensor: 1 if the sensor sees an object, 0 otherwise
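
A sketch of unpacking an observation into its parts (the variable names are illustrative):

obs = env.reset()
position = obs[0]  # cart position in [-2.4, 2.4]
sensors = obs[1:]  # N_SENSORS binary activations
active = [i for i, s in enumerate(sensors) if s == 1]  # sensors that currently see a ball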

Reward:

The agent gains 1 reward point for each ball caught.

Max steps:

For training purposes, the number of steps is limited to 100; after that, the environment returns done=True.
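
Putting reward and termination together, a random-rollout sketch under the same assumed id as above; total_reward counts the caught balls and the loop ends once done is returned after 100 steps:

import gym
import gym_catcher  # assumed import name

env = gym.make('Catcher-v0')  # assumed id
obs = env.reset()
total_reward, done = 0.0, False
while not done:  # done becomes True after MAX_STEPS = 100 steps
    obs, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward  # +1 for every caught ball
env.close()
print('balls caught:', total_reward)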

Environment settings

The environment is highly customizable. The following parameters are defined as global constants in ./gym_catcher/envs/catcher_env.py and can be changed:

# screen parameters
SCREEN_WIDTH = 600
SCREEN_HEIGHT = 700

# cart parameters
CART_WIDTH = 50.0
CART_HEIGHT = 30.0
CART_MOVING_SPEED = 50 # pixels per move

N_SENSORS = 7 # sensors will be distributed uniformly and symmetrically
VISION_ANGLE = 45.

# falling balls parameters
BALL_RADIUS = 25.0
FREQUENCY = 5  # a new ball is created every FREQUENCY steps
MAX_BALLS = 2  # maximum number of balls on screen at once

BALL_MAX_HORIZONTAL_SPEED = 5
BALL_MAX_VERTICAL_SPEED = 49
BALL_MIN_VERTICAL_SPEED = 30

MAX_STEPS = 100  # number of steps after which the environment returns done=True
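
The comment on N_SENSORS says the sensors are distributed uniformly and symmetrically. A sketch of what that could mean geometrically, as an illustration of the stated property rather than the repository's actual code:

import numpy as np

N_SENSORS = 7
VISION_ANGLE = 45.

# Spread N_SENSORS ray angles symmetrically about the vertical,
# covering VISION_ANGLE degrees in total.
angles = np.linspace(-VISION_ANGLE / 2, VISION_ANGLE / 2, N_SENSORS)
print(angles)  # [-22.5 -15.  -7.5   0.    7.5  15.   22.5]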

Demo

Random agent action

The agent samples a random action from the action space and applies it.

[demo GIF: random agent]

Trained agent action

The agent was trained using the Salimans et al. algorithm.

[demo GIF: trained agent]

Trained agent action in a hard setting

The same trained agent, but with changed ball generation: balls are now generated so that the agent must commit to one of the falling objects in order to score.

[demo GIF: trained agent in the hard environment]

Discussion

An agent was trained in the environment to verify that a selective attention strategy can actually be learned. The agent was trained using the Salimans et al. algorithm. As the third demo shows, the trained agent can commit to one of the falling objects. Such a strategy is probably driven by preferring the ball that activates more sensors at once as it approaches. The attempt to train a selective attention strategy can therefore be considered successful.
