armahmood/totd-rndmdp-experiments

Random MDP experiments on the true online TD(lambda) algorithm

This project contains random MDP experiments that compare true online TD(lambda) (TOTD) by van Seijen and Sutton (2014) with TD(lambda) with accumulating traces (TD) and TD(lambda) with replacing traces (TDR). The experiments are part of a forthcoming work by van Seijen, Sutton, Mahmood, Pilarski and Machado (2015).
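
To make the difference between the three methods concrete, here is a minimal sketch of their per-step updates. It is not the code in this repository: it assumes linear function approximation, uses the commonly published textbook form of the updates, and treats replacing traces as defined for binary features only. The names `td_lambda_step`, `true_online_td_step` and `run_episode` are hypothetical and chosen for illustration.

```python
import numpy as np


def td_lambda_step(w, z, x, r, x_next, alpha, gamma, lam, trace="accumulating"):
    """One step of conventional TD(lambda) with linear function approximation."""
    delta = r + gamma * (w @ x_next) - (w @ x)  # TD error
    if trace == "accumulating":
        z = gamma * lam * z + x
    elif trace == "replacing":
        # Assumption: binary features. Active components are reset to the
        # feature value; inactive components decay.
        z = np.where(x != 0, x, gamma * lam * z)
    else:
        raise ValueError(f"unknown trace type: {trace}")
    w = w + alpha * delta * z
    return w, z


def true_online_td_step(w, z, v_old, x, r, x_next, alpha, gamma, lam):
    """One step of true online TD(lambda) using a dutch trace."""
    v = w @ x
    v_next = w @ x_next
    delta = r + gamma * v_next - v
    z = gamma * lam * z + (1.0 - alpha * gamma * lam * (z @ x)) * x
    w = w + alpha * (delta + v - v_old) * z - alpha * (v - v_old) * x
    return w, z, v_next  # v_next becomes v_old on the following step


def run_episode(features, rewards, method, alpha, gamma, lam):
    """Run one pass over a trajectory; features has one more row than rewards."""
    features = np.asarray(features, dtype=float)
    w = np.zeros(features.shape[1])
    z = np.zeros_like(w)
    v_old = 0.0
    for t, r in enumerate(rewards):
        x, x_next = features[t], features[t + 1]
        if method == "true_online":
            w, z, v_old = true_online_td_step(
                w, z, v_old, x, r, x_next, alpha, gamma, lam
            )
        else:
            w, z = td_lambda_step(
                w, z, x, r, x_next, alpha, gamma, lam, trace=method
            )
    return w
```

In this sketch the three methods differ only in how the trace vector `z` is built and, for true online TD(lambda), in the extra correction terms in the weight update. With `lam=0` all three reduce to the same one-step TD update, which gives a quick sanity check.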

The repository can be imported as an Eclipse PyDev project.

Read or execute runtotd-rndmdp-experiments.sh for an example of how to run the experiments and plot the resulting figures with Python.

References

van Seijen, H., Sutton, R.S. (2014). True online TD(lambda). In Proceedings of the 31st International Conference on Machine Learning. JMLR W&CP 32(1):692-700.

van Seijen, H., Sutton, R.S., Mahmood, A.R., Pilarski, P.M., Machado, M.C. (2015). An empirical evaluation of true-online TD(lambda). (forthcoming)
