irl-lab

A WIP implementation of popular Inverse Reinforcement Learning algorithms for various tasks. Suggestions are welcome for more complex environments and algorithms, as well as possible bugs in the current implementation.

The code in this repository is heavily inspired by this work by Matthew Alger, while the idea itself (not to mention, the name) is inspired by OpenAI and UC Berkeley's rllab.

Environments

The following environments have been implemented:

Gridworld

Algorithms

The following algorithms have been implemented:

Policy Iteration
Relative Entropy Inverse Reinforcement Learning

Sample Usage

from env.GridWorld import GridWorld
from algo.PolicyIteration import PolicyIteration
from algo.RelEntIRL import RelEntIRL

gw = GridWorld(10)
print gw.rewards
# Obtain the optimal policy for the environment to generate expert demonstrations
pi = PolicyIteration(gw)
optimal_policy = pi.policy_iteration(100)

expert_demos = gw.generate_trajectory(optimal_policy)
nonoptimal_demos = gw.generate_trajectory()
relent = RelEntIRL(expert_demos, nonoptimal_demos)
# Train the model with default hyperparameters. 
relent.train()
print relent.weights.reshape(10,10)

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
algo		algo
env		env
scratch		scratch
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

algo

algo

env

env

scratch

scratch

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

Repository files navigation

irl-lab

Environments

Algorithms

Sample Usage

About

Releases

Packages

Languages

License

aravindsiv/irl-lab

Folders and files

Latest commit

History

Repository files navigation

irl-lab

Environments

Algorithms

Sample Usage

About

Resources

License

Stars

Watchers

Forks

Languages