Value iteration, policy iteration, and Q-Learning in a grid-world MDP.

kevin-hanselman/grid-world-rl

grid-world-rl

Implementations of MDP value iteration, MDP policy iteration, and Q-Learning in a toy grid-world setting.
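For readers new to these methods, here is a minimal sketch of value iteration on a toy grid. It is not taken from this repository: the 3x3 grid, the goal cell, the step reward of -1, the discount of 0.9, and the names `step`, `value_iteration`, and `greedy_policy` are all assumptions made for illustration.

```python
# Hypothetical 3x3 grid world, chosen for illustration only;
# this is NOT the repository's environment or API.
ROWS, COLS = 3, 3
GOAL = (2, 2)          # terminal cell (assumed)
GAMMA = 0.9            # discount factor (assumed)
STEP_REWARD = -1.0     # reward for every move (assumed)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Deterministic transition: a move off the grid leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if 0 <= r < ROWS and 0 <= c < COLS:
        return (r, c)
    return state


def value_iteration(tol=1e-9):
    """Apply Bellman optimality backups in place until the largest change is below tol."""
    values = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
    while True:
        delta = 0.0
        for s in values:
            if s == GOAL:
                continue  # the terminal state keeps value 0
            best = max(STEP_REWARD + GAMMA * values[step(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, for each non-terminal state, an action with the highest one-step lookahead value."""
    return {
        s: max(ACTIONS, key=lambda a: STEP_REWARD + GAMMA * values[step(s, a)])
        for s in values
        if s != GOAL
    }


values = value_iteration()
policy = greedy_policy(values)
```

In this sketch the transitions are deterministic, so each backup is a plain maximum over four successor values; a stochastic grid would replace that with an expectation over next states.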

TODO

The policy iteration implementation is suboptimal: its policy evaluation step does not use the closed-form solution of the Bellman equations. Pull requests are welcome.
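As a starting point for such a pull request, the sketch below shows what a closed-form policy evaluation could look like. For a fixed policy, the state values satisfy the linear system (I - gamma * P_pi) V = R_pi, so they can be obtained with a single linear solve instead of repeated sweeps. Everything here is an assumption for illustration: the names `solve_linear` and `evaluate_policy`, the list-of-lists matrix layout, and the 3-state chain are hypothetical and do not reflect the repository's data structures.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copied)
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def evaluate_policy(p_pi, r_pi, gamma):
    """Return V that solves (I - gamma * P_pi) V = R_pi for a fixed policy.

    p_pi[i][j] is the probability of moving from state i to state j under
    the policy, and r_pi[i] is the expected immediate reward in state i.
    """
    n = len(r_pi)
    a = [
        [(1.0 if i == j else 0.0) - gamma * p_pi[i][j] for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(a, r_pi)


# Hypothetical 3-state chain under a fixed policy: 0 -> 1 -> 2,
# where state 2 is absorbing and yields zero reward.
P_PI = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
R_PI = [-1.0, -1.0, 0.0]

V = evaluate_policy(P_PI, R_PI, gamma=0.9)
```

The hand-written solver only keeps the sketch dependency-free; with NumPy available, `numpy.linalg.solve` would be the more usual choice for the same linear system. The system is guaranteed to be solvable whenever gamma is strictly less than 1.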
