Implementations of nine multi-armed bandit algorithms for the stationary stochastic Bernoulli environment:
- Thompson Sampling (see https://arxiv.org/pdf/1205.4217.pdf)
- KL UCB (see http://proceedings.mlr.press/v19/garivier11a/garivier11a.pdf)
- Bayes UCB (see http://proceedings.mlr.press/v22/kaufmann12/kaufmann12.pdf)
- UCB 1 (see https://homes.di.unimi.it/~cesabian/Pubblicazioni/ml-02.pdf)
- UCBV (see http://certis.enpc.fr/~audibert/Mes%20articles/TCS08.pdf)
- UCB Tuned (see http://imagine.enpc.fr/~audibert/ucbtuned0.5.pdf)
- MOSS (see https://www.di.ens.fr/willow/pdfscurrent/COLT09a.pdf)
- CPUCB (see http://proceedings.mlr.press/v19/garivier11a/garivier11a.pdf)
- DMED (see http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.232.3594)
For a demo, run the script demoStochasticMAB.m.
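
To make the setting concrete, here is a minimal sketch of two of the listed policies, UCB 1 and Thompson Sampling, playing stationary Bernoulli arms. It is written in Python for illustration only and is not taken from this repository's MATLAB code. The names `ucb1_index` and `run_bandit`, the uniform Beta(1, 1) prior, and the arm means in the usage note are all assumptions made for the example.

```python
import math
import random


def ucb1_index(successes, pulls, t):
    """UCB 1 index: empirical mean plus the sqrt(2 ln t / n) exploration bonus."""
    return successes / pulls + math.sqrt(2.0 * math.log(t) / pulls)


def run_bandit(policy, means, horizon, seed=0):
    """Play `policy` ("ucb1" or "thompson") on Bernoulli arms with the given means.

    Returns the number of times each arm was pulled. Hypothetical helper,
    written for this example only.
    """
    if policy not in ("ucb1", "thompson"):
        raise ValueError("unknown policy: " + str(policy))
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    successes = [0] * k
    for t in range(1, horizon + 1):
        if policy == "ucb1" and t <= k:
            # UCB 1 initialization: pull each arm once so every index is defined.
            arm = t - 1
        elif policy == "ucb1":
            arm = max(range(k), key=lambda a: ucb1_index(successes[a], pulls[a], t))
        else:
            # Thompson Sampling: draw one sample per arm from its Beta posterior
            # (assumed uniform prior) and play the arm with the largest sample.
            samples = [
                rng.betavariate(1 + successes[a], 1 + pulls[a] - successes[a])
                for a in range(k)
            ]
            arm = max(range(k), key=lambda a: samples[a])
        # Bernoulli reward: 1 with probability means[arm], else 0.
        reward = 1 if rng.random() < means[arm] else 0
        pulls[arm] += 1
        successes[arm] += reward
    return pulls


def pseudo_regret(means, pulls):
    """Expected reward lost relative to always playing the best arm."""
    best = max(means)
    return sum((best - m) * n for m, n in zip(means, pulls))
```

For example, `run_bandit("thompson", [0.2, 0.5, 0.8], 5000)` returns the pull counts per arm, and passing those counts to `pseudo_regret` gives the pseudo-regret of that run. Both policies should concentrate most pulls on the arm with mean 0.8 as the horizon grows.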