Hierarchical online planning and reinforcement learning on Taxi

This release contains the code for two projects:

The MAXQ-based hierarchical online planning algorithm: MAXQ-OP
The HAMQ-based hierarchical reinforcement learning algorithm: HAMQ-INT

Taxi domain:

Overall results:

Averaged over 200 runs.

HAMQ-INT

The idea is to identify and take advantage of internal transitions within a HAM, which is represented as a partial program, for efficient hierarchical reinforcement learning. Details can be found in:

Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions, Aijun Bai and Stuart Russell, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, August 19-25, 2017.
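As a rough illustration of the internal-transition idea, the sketch below memoizes choices that move the HAM to another choice point without changing the environment state. It answers Q-value queries for those choices by recursing into the successor choice point instead of learning a separate table entry. All names here (`InternalTransitionQ`, `record_internal`, `q_value`, `update`) are hypothetical and are not the classes in `HierarchicalFSMAgent.{h, cpp}`. The sketch also assumes that internal transitions are deterministic and acyclic.

```cpp
#include <algorithm>
#include <cmath>
#include <map>
#include <tuple>

// Hypothetical types for illustration only; not the repository's classes.
using EnvState = int;      // environment state id
using MachineState = int;  // id of a choice point inside the HAM
using Choice = int;        // index of the option taken at a choice point
using Key = std::tuple<EnvState, MachineState, Choice>;

struct InternalTransitionQ {
    int num_choices;  // number of options at every choice point (simplification)
    double alpha;     // learning rate
    double gamma;     // discount factor
    std::map<Key, double> q;               // learned values for external transitions
    std::map<Key, MachineState> internal;  // memoized internal transitions

    // Remember that taking `c` at (s, m) reached choice point `next`
    // without any change to the environment state.
    void record_internal(EnvState s, MachineState m, Choice c, MachineState next) {
        internal[Key{s, m, c}] = next;
    }

    // Value of a choice point: best Q over its options.
    double value(EnvState s, MachineState m) const {
        double best = q_value(s, m, 0);
        for (Choice c = 1; c < num_choices; ++c) {
            best = std::max(best, q_value(s, m, c));
        }
        return best;
    }

    // Q value of a choice. Internal transitions are short-circuited:
    // no reward is collected and the environment state is unchanged,
    // so the value equals the value of the successor choice point.
    double q_value(EnvState s, MachineState m, Choice c) const {
        auto it = internal.find(Key{s, m, c});
        if (it != internal.end()) {
            return value(s, it->second);
        }
        auto qit = q.find(Key{s, m, c});
        return qit == q.end() ? 0.0 : qit->second;
    }

    // SMDP-style Q-learning update for an external transition that took
    // `tau` environment steps and accumulated discounted reward `reward`.
    void update(EnvState s, MachineState m, Choice c, double reward, int tau,
                EnvState s2, MachineState m2) {
        double target = reward + std::pow(gamma, tau) * value(s2, m2);
        double& entry = q[Key{s, m, c}];
        entry += alpha * (target - entry);
    }
};
```

In this sketch, only external transitions consume samples. A choice recorded through `record_internal` picks up whatever has been learned at its successor choice point as soon as that successor is updated.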

MAXQ-OP

This is the code release of the MAXQ-OP algorithm on the Taxi domain, as described in the following papers:

Online planning for large Markov decision processes with hierarchical decomposition, Aijun Bai, Feng Wu, and Xiaoping Chen, ACM Transactions on Intelligent Systems and Technology (ACM TIST), 6(4):45:1-45:28, July 2015.
Online Planning for Large MDPs with MAXQ Decomposition (Extended Abstract), Aijun Bai, Feng Wu, and Xiaoping Chen, Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Valencia, Spain, June 2012.
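For orientation, the standard MAXQ value decomposition that hierarchical methods of this kind build on can be written as below. The notation is an assumption chosen for illustration and is not taken from the code. Here i is a task, a is one of its subtasks, A_i is the set of subtasks of i, R(s, i) is the expected immediate reward of primitive action i in state s, and C* is the completion function, that is, the expected discounted reward for finishing task i after subtask a terminates. How MAXQ-OP estimates the completion term online is described in the papers above.

```latex
\begin{align*}
V^{*}(i, s) &=
  \begin{cases}
    \max_{a \in A_i} Q^{*}(i, s, a) & \text{if } i \text{ is a composite task} \\
    R(s, i) & \text{if } i \text{ is a primitive action}
  \end{cases} \\
Q^{*}(i, s, a) &= V^{*}(a, s) + C^{*}(i, s, a)
\end{align*}
```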

Files

maxqop.{h, cpp}: the MAXQ-OP algorithm
HierarchicalFSMAgent.{h, cpp}: the HAMQ-INT algorithm
MaxQ0Agent.{h, cpp}: the MAXQ-0 algorithm
MaxQQAgent.{h, cpp}: the MAXQ-Q algorithm
agent.h: abstract Agent class
state.{h, cpp}: abstract State class
policy.{h, cpp}: Policy classes
taxi.{h, cpp}: the Taxi domain
system.{h, cpp}: agent-environment driver code
table.h: tabular V/Q functions
dot_graph.{h, cpp}: tools to generate graphviz dot files

Dependencies

libboost-dev
libboost-program-options-dev
gnuplot

Related Projects

MAXQ-OP on RoboCup Soccer Simulation 2D Challenge: https://github.com/wrighteagle2d/wrighteaglebase
Concurrent HAMQ on RoboCup Soccer Simulation 2D Keepaway Challenge: https://github.com/aijunbai/keepaway
