Policy Gradient Methods

What are policy gradient methods?

Reinforcement learning (RL) is about finding the best strategy for solving a given problem. This strategy is the policy that the agent uses to interact with the environment. Every RL algorithm, directly or indirectly, aims to find the optimal policy.

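Policy gradient methods find the policy directly: the policy is parameterized, and its parameters are adjusted by gradient ascent on the expected return, rather than being derived from a learned value function.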
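To make the idea concrete, here is a minimal sketch of a REINFORCE-style update on a two-armed bandit, using only the Python standard library. It is not taken from this repository's code: the function names (`softmax`, `train_bandit`), the reward means, the learning rate, and the running-average baseline are all assumptions chosen for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(reward_means, steps=2000, lr=0.1, seed=0):
    """REINFORCE with a softmax policy on a bandit (illustrative only)."""
    rng = random.Random(seed)
    theta = [0.0] * len(reward_means)  # policy parameters, one per action
    baseline = 0.0                     # running mean reward (assumed variance reducer)
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(reward_means[action], 1.0)  # noisy reward
        advantage = reward - baseline
        baseline += (reward - baseline) / t
        # d/d theta[a] of log pi(action) is 1[a == action] - probs[a]
        for a in range(len(theta)):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * advantage * grad_log  # gradient ascent step
    return softmax(theta)


final_probs = train_bandit([0.0, 1.0])
print(final_probs)
```

With these assumed settings the policy should end up favoring the second arm, which has the higher mean reward. The algorithms in this repository apply the same principle to sequential environments with neural-network policies.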

Using this repository

I will be implementing the following policy gradient (PG) algorithms, organized as shown in the repository structure.

Repository Structure

                |Readme.md
                |---VPG
                |---REINFORCE
                |---ACTOR CRITIC 
                |   |---A2C
                |   |---A3C
                |   |---SAC
                |---DETERMINISTIC POLICY GRADIENTS
                |   |---DPG
                |   |---DDPG
                |   |---D4PG
                |---TRPO
                |---PPO

Each subfolder is structured as follows:

                |Readme.md
                |---Main.py
                |---Solver.py
                |---UTILS.py
                |---Running Trained Model.py
                |---Trained Model.pt
