
Proximal Policy Optimization

An implementation of Proximal Policy Optimization (PPO), from the state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function incorporates an entropy bonus.
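For orientation, here is a minimal, hypothetical sketch of the two ingredients named above (normalized GAE and PPO's clipped surrogate loss with an entropy bonus). It is not the repository's code; the function and variable names are this sketch's own, so consult the source for the actual implementation.

    import torch

    def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
        # Generalized Advantage Estimation: a discounted sum of TD residuals.
        advantages = torch.zeros_like(rewards)
        gae = 0.0
        for t in reversed(range(len(rewards))):
            next_value = values[t + 1] if t + 1 < len(values) else 0.0
            delta = rewards[t] + gamma * next_value * (1 - dones[t]) - values[t]
            gae = delta + gamma * lam * (1 - dones[t]) * gae
            advantages[t] = gae
        # Normalize the advantages, as mentioned in the description above.
        return (advantages - advantages.mean()) / (advantages.std() + 1e-8)

    def ppo_loss(new_log_probs, old_log_probs, advantages, entropy,
                 clip_eps=0.2, entropy_coef=0.01):
        # Clipped surrogate objective from the PPO paper.
        ratio = torch.exp(new_log_probs - old_log_probs)
        clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
        policy_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
        # Entropy bonus: subtracting it from the loss encourages exploration.
        return policy_loss - entropy_coef * entropy.mean()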

The code contains a lot of comments and can be helpful for understanding both PPO and PyTorch.

How to use

  1. Clone the repository to get the files locally on your computer (see "Cloning an Existing Repository" at https://git-scm.com/book/en/v2/Git-Basics-Getting-a-Git-Repository).

  2. Navigate into the root folder of the project: /ppo

  3. Install the necessary dependencies, which are listed in requirements.txt. Use your favorite package manager/installer; we recommend pip. To install the requirements, run the following command in the root folder of the project (where requirements.txt is located):

    pip install -r requirements.txt

  4. All you need is an instance of the Environment class (see the source code for its specification; two are already provided) and a Learner object. See the example in main.py, and the sketch after this list.
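A minimal usage sketch, assuming module names and a training entry point along the lines of main.py. These names are assumptions, not the repository's confirmed API; check main.py and the source code for the real imports and signatures.

    # Hypothetical sketch; the actual class names, constructor arguments,
    # and method names may differ -- consult main.py for the real API.
    from environment import Environment   # assumed module layout
    from learner import Learner           # assumed module layout

    env = Environment()        # one of the two provided environments
    learner = Learner(env)     # wires the PPO agent to the environment
    learner.train()            # assumed training entry point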
