aubin-tchoi/sequential-learning

Sequential learning

This repo contains resources for testing a few algorithms for online convex optimization and for best arm identification in bandits.

Online convex optimization

  • Online gradient descent
  • Online gradient descent without gradient
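
As an illustration of the first item, here is a minimal sketch of projected online gradient descent. It is not taken from this repo: the function names, the choice of an L2 ball as the feasible set, and the step size `radius / (lipschitz * sqrt(t))` are all assumptions made for the example.

```python
import math


def project_l2_ball(x, radius):
    """Euclidean projection of x onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def online_gradient_descent(gradients, dim, radius=1.0, lipschitz=1.0):
    """Run projected OGD against a sequence of per-round gradient oracles.

    gradients: list of callables; gradients[t](x) returns the gradient of
               the round-t loss at x (hypothetical interface).
    Returns the list of points played and the final iterate.
    """
    x = [0.0] * dim
    played = []
    for t, grad in enumerate(gradients, start=1):
        played.append(list(x))                       # play x_t
        g = grad(x)                                  # observe the gradient at x_t
        eta = radius / (lipschitz * math.sqrt(t))    # assumed step-size schedule
        x = project_l2_ball([xi - eta * gi for xi, gi in zip(x, g)], radius)
    return played, x


if __name__ == "__main__":
    # Fixed quadratic loss (x - 0.5)^2 repeated every round, 1-D.
    rounds = [lambda x: [2.0 * (x[0] - 0.5)]] * 200
    _, final = online_gradient_descent(rounds, dim=1, radius=1.0, lipschitz=3.0)
    print(final)
```

The gradient-free variant would replace `grad(x)` with an estimate built from loss values only; that part is not sketched here.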

Best arm identification in stochastic bandits

Fixed budget

  • Uniform sampling
  • Successive rejects
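
For reference, a minimal sketch of Successive Rejects with the standard phase lengths of Audibert and Bubeck. The `pull` callback and the function name are hypothetical and do not reflect this repo's interface.

```python
import math
import random


def successive_rejects(pull, n_arms, budget):
    """Fixed-budget best arm identification by Successive Rejects.

    pull: callable arm_index -> reward (hypothetical interface).
    Returns the recommended arm and the number of pulls per arm.
    """
    if n_arms < 2 or budget <= n_arms:
        raise ValueError("need at least 2 arms and budget > n_arms")
    log_bar = 0.5 + sum(1.0 / i for i in range(2, n_arms + 1))
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    prev_n = 0
    for k in range(1, n_arms):
        # Cumulative number of pulls each surviving arm must reach in phase k.
        n_k = math.ceil((budget - n_arms) / (log_bar * (n_arms + 1 - k)))
        for arm in active:
            for _ in range(n_k - prev_n):
                sums[arm] += pull(arm)
                counts[arm] += 1
        prev_n = n_k
        # Dismiss the arm with the lowest empirical mean.
        worst = min(active, key=lambda a: sums[a] / counts[a])
        active.remove(worst)
    return active[0], counts


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.2, 0.5, 0.9]  # Bernoulli arms, made up for the demo
    best, counts = successive_rejects(
        lambda a: 1.0 if rng.random() < means[a] else 0.0, len(means), 300
    )
    print(best, counts)
```

Uniform sampling corresponds to skipping the rejection step: every arm receives the same share of the budget and the arm with the highest empirical mean is recommended.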

Fixed confidence

  • UCB-based with heuristic approximation of the GLRT stopping rule
  • Uniform sampling with heuristic approximation of the GLRT stopping rule
  • TTUCB (Top two with arm drawn by UCB as the leader)
  • EB-TC (Top two with empirically best arm as the leader)
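
The sketch below illustrates the second item: uniform (round-robin) sampling combined with a GLRT stopping rule. It assumes Gaussian arms with known standard deviation `sigma` and uses the heuristic threshold `log((1 + log t) / delta)`, which is common in the top-two literature; whether this repo uses exactly this threshold is an assumption, and all names are hypothetical.

```python
import math


def glrt_should_stop(means, counts, delta, sigma=1.0):
    """GLRT stopping check for Gaussian arms with known variance.

    Returns (stop, empirical_best). Requires every arm to have been pulled.
    """
    t = sum(counts)
    best = max(range(len(means)), key=lambda a: means[a])
    # Heuristic threshold (an assumption, not a proven calibration).
    threshold = math.log((1.0 + math.log(t)) / delta)
    # Smallest pairwise statistic between the empirical best and the others.
    stat = min(
        (means[best] - means[a]) ** 2
        / (2.0 * sigma ** 2 * (1.0 / counts[best] + 1.0 / counts[a]))
        for a in range(len(means))
        if a != best
    )
    return stat > threshold, best


def uniform_sampling_bai(pull, n_arms, delta, max_pulls=100_000):
    """Pull arms round-robin until the GLRT check fires or max_pulls is hit."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    while sum(counts) + n_arms <= max_pulls:
        for arm in range(n_arms):
            sums[arm] += pull(arm)
            counts[arm] += 1
        means = [s / c for s, c in zip(sums, counts)]
        stop, best = glrt_should_stop(means, counts, delta)
        if stop:
            return best, sum(counts), True
    means = [s / c if c else float("-inf") for s, c in zip(sums, counts)]
    return max(range(n_arms), key=lambda a: means[a]), sum(counts), False
```

The top-two variants keep the same stopping check and change only which arm is pulled next: a leader (the UCB arm for TTUCB, the empirically best arm for EB-TC) and a challenger are selected, and one of the two is sampled.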