Policy Learning Landscape

This repository contains code to explore the policy optimization landscape.

Quick setup

To run CartPole, use:

python3 run_eager_policy_optimization.py --env CartPole-v0 --policy_type discrete

To run a MuJoCo environment, you must have MuJoCo installed along with its license. To run Hopper-v1, use:

python3 run_eager_policy_optimization.py --env Hopper-v1 --policy_type normal --std 0.5

Parameters are saved to ./parameters as NumPy files. After obtaining parameters from several runs, use the following commands to analyze the landscape.
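Before running the analysis, it can help to confirm what a saved file contains. The snippet below is a minimal sketch, assuming each file in ./parameters holds a single numeric array that `np.load` can read directly. The helper name `describe_parameters` and the file name `example_parameters.npy` are hypothetical and are not part of this repository; the example writes its own stand-in file so that it runs without a training run.

```python
import os
import tempfile

import numpy as np


def describe_parameters(path):
    """Load a saved parameter file and return a short summary of it.

    Assumes the file stores one numeric array (an assumption, not
    something this README documents).
    """
    params = np.load(path)
    return {
        "shape": params.shape,
        "dtype": str(params.dtype),
        "l2_norm": float(np.linalg.norm(params)),
    }


# Stand-in for a file from ./parameters. The real file names are not
# documented here, so this one is hypothetical.
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "example_parameters.npy")
    np.save(example_path, np.array([3.0, 4.0]))
    summary = describe_parameters(example_path)

print(summary)
```

To inspect a real run, point `describe_parameters` at one of the files written to ./parameters.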

First install eager_pg: pip install -e ..

Random Perturbations Experiment:

cd interpolation_experiments
python paired_random_directions_experiment.py --p1 ./path/to/parameter/1/npy \
--save_dir ./path/to/save/in/ \
--alpha 0.5 --std 0.5 --n_directions 500
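The idea behind a paired random directions experiment can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code in paired_random_directions_experiment.py: it assumes that each direction is drawn uniformly on the unit sphere, that `alpha` is the step size along that direction, and that the objective is evaluated at both `theta + alpha * d` and `theta - alpha * d`. The function `paired_random_directions` and the quadratic `toy_objective` are hypothetical stand-ins; in the real experiment the objective would be an estimate of the policy's return.

```python
import numpy as np


def paired_random_directions(theta, objective, alpha, n_directions, seed=0):
    """Measure the change in `objective` along random +/- direction pairs.

    Returns an array of shape (n_directions, 2). Column 0 holds
    objective(theta + alpha * d) - objective(theta), and column 1 holds
    objective(theta - alpha * d) - objective(theta).
    """
    rng = np.random.default_rng(seed)
    base_value = objective(theta)
    deltas = np.empty((n_directions, 2))
    for i in range(n_directions):
        # A normalized Gaussian sample is uniform on the unit sphere.
        direction = rng.standard_normal(theta.shape)
        direction /= np.linalg.norm(direction)
        deltas[i, 0] = objective(theta + alpha * direction) - base_value
        deltas[i, 1] = objective(theta - alpha * direction) - base_value
    return deltas


def toy_objective(theta):
    """Hypothetical objective: a concave quadratic with its maximum at 0."""
    return -float(np.sum(theta ** 2))


theta = np.zeros(4)  # stand-in for parameters loaded from a .npy file
deltas = paired_random_directions(
    theta, toy_objective, alpha=0.5, n_directions=500)

# At a maximum, both members of every pair decrease the objective.
print(np.all(deltas < 0))
```

Comparing the two columns gives a rough picture of local geometry: if both changes are negative the point looks like a local maximum along that direction, and if they have opposite signs the objective is still sloped there.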

Linear Interpolation Experiment:

cd interpolation_experiments
python simple_1d_interpolation_experiment.py --p1 ./path/to/parameter/1/npy \
--p2 ./path/to/parameter/2/npy --save_dir ./path/to/save/in/ \
--stds 5.0 --alpha_start -0.5 --alpha_end 1.5 --n_alphas 2
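As a rough picture of what a 1D linear interpolation computes, the sketch below builds the parameter vectors `(1 - alpha) * p1 + alpha * p2` on an evenly spaced grid of `alpha` values. It assumes that both parameter files hold flat vectors of the same length and that `--alpha_start`, `--alpha_end`, and `--n_alphas` define such a grid; the function `interpolate_parameters` is hypothetical and is not taken from simple_1d_interpolation_experiment.py.

```python
import numpy as np


def interpolate_parameters(p1, p2, alpha_start, alpha_end, n_alphas):
    """Return the alpha grid and the interpolated parameter vectors.

    Row i of the second return value is (1 - alphas[i]) * p1 + alphas[i] * p2.
    Values of alpha outside [0, 1] extrapolate beyond the two endpoints.
    """
    alphas = np.linspace(alpha_start, alpha_end, n_alphas)
    # Broadcast the (n_alphas, 1) weights against the (dim,) vectors.
    weights = alphas[:, None]
    thetas = (1.0 - weights) * p1 + weights * p2
    return alphas, thetas


# Stand-ins for two parameter vectors; real ones could be read with np.load.
p1 = np.array([0.0, 0.0])
p2 = np.array([1.0, 2.0])

alphas, thetas = interpolate_parameters(
    p1, p2, alpha_start=-0.5, alpha_end=1.5, n_alphas=5)

for alpha, theta in zip(alphas, thetas):
    print(alpha, theta)
```

Each interpolated vector would then be evaluated as a policy, so that the objective can be plotted as a function of `alpha`. With `alpha_start` below 0 and `alpha_end` above 1, the grid also covers points beyond both endpoints.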

Note that interpolation tools only work with continuous policies.

Code organization

eager_pg: contains a small library to enable quick research in policy gradient reinforcement learning.
analysis_tools: contains tooling to make nice figures in papers.
interpolation_experiments: contains experiments to explore the policy optimization landscape.

Citation

If you use the proposed method or code, we'd appreciate it if you could cite this work!

@article{ahmed2018understanding,
  title={Understanding the impact of entropy in policy learning},
  author={Ahmed, Zafarali and Le Roux, Nicolas and Norouzi, Mohammad and Schuurmans, Dale},
  journal={arXiv preprint arXiv:1811.11214},
  year={2018}
}

Disclaimer

This is not an official Google product.
