This project implements algorithms from papers on online learning (updated over time). I am new to this topic, and I use this project to track my study of online learning. I sincerely hope it also helps you better understand these algorithms. Communication is welcome.
This project only reproduces the simplest numerical experiments reported in these papers.
IDE
- Jupyter Notebook
The following folders contain the code and a description of the corresponding paper, with further details inside each folder:
A method for generating uniformly distributed points on N-dimensional spheres.pdf: This paper shows how to generate uniformly distributed points on an N-dimensional sphere. Its result can be used to generate the context vectors in each round.
"Ferrira-et-al-2018-Thompson_Sampling": https://proceedings.mlr.press/v49/agrawal16.html
"Zhang-et-al-2020-Neural_Thompson_sampling": http://arxiv.org/abs/2010.00827
"Zhou-et-al-2020-Neural_UCB_Exploration": http://arxiv.org/abs/1911.04462
"Jun-Nowak-2016-Anytime_Exploration_Best_Arm_Identification": https://proceedings.mlr.press/v48/jun16.html
"Karnin-Koren-Somekh-2013-Almost_Optimal_Exploration_Best_Arm_Identification": http://proceedings.mlr.press/v28/karnin13.pdf
"Abbasi-yadkori-et-al-Improved Algorithms for Linear Stochastic Bandits": https://papers.nips.cc/paper/2011/hash/e1d5be1c7f2f456670de3d53c7b54f4a-Abstract.html
"Jamieson-Nowak-2014-Best-arm-identification-algorithms-for-multi-armed-bandits-in-the-fixed-confidence-setting": http://ieeexplore.ieee.org/document/6814096/
"Badanidiyuru-et-al-2013-Bandits_with_Knapsacks": http://arxiv.org/abs/1305.2545
"Agrawal-et-al-2014-A_Dynamic_Near-Optimal_Algorithm_for_Online_Linear_Programming": https://pubsonline.informs.org/doi/abs/10.1287/opre.2014.1289
"David-Xu-Bypassing_the_Monster_A_Faster_and_Simpler_Optimal_Algorithm_for_Contextual_Bandits_under_Realizability": https://papers.ssrn.com/abstract=3562765
"Garivier-Kaufmann-Optimal-Best-Arm-Identification-with-Fixed-Confidence": https://arxiv.org/abs/1602.04589
"Wu-et-al-2016-Conservative_Bandits": http://arxiv.org/abs/1602.04282
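The sphere-sampling result mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard Gaussian-normalization trick (draw a standard normal vector and rescale it to unit length); the function name and signature are my own, not taken from the paper:

```python
import numpy as np

def sample_on_sphere(n_points, dim, rng=None):
    """Draw n_points uniformly from the surface of the unit sphere in R^dim.

    Works because the standard multivariate Gaussian is rotationally
    invariant, so normalizing each sample gives a uniform direction.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal((n_points, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Example: one unit-norm context vector per arm, 5 arms in R^10
contexts = sample_on_sphere(5, 10, rng=np.random.default_rng(0))
```

Each row of `contexts` has norm 1, so it can serve directly as a bounded context in a linear or neural bandit round.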
Contact: LI ZITIAN, lizitian@u.nus.edu