stabgan/Upper-Confidence-Bounds

Upper-Confidence-Bounds

I implemented the reinforcement-learning-based Upper Confidence Bound (UCB) algorithm in both Python and R.

To find out which of several ads appeals most to customers, we can use a reinforcement learning approach:

  • Suppose we have X ads to display to a user each time they connect to a web page.
  • Each time a user connects, we count it as a round.
  • At each round n, we choose one ad to display to the user.
  • At each round n, ad i gives a reward Ri(n) ∈ {0, 1}: Ri(n) = 1 if the user clicked on the ad, and 0 if the user did not.
  • Our goal is to maximize the total reward we get over many rounds.
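The setup above can be sketched with simulated data. This is only an illustration: the click-through rates, the helper names `simulate_clicks` and `total_reward`, and the random-selection baseline are assumptions made for the example, not part of the repository.

```python
import numpy as np

def simulate_clicks(click_rates, n_rounds, seed=0):
    """Return an (n_rounds, X) matrix of 0/1 rewards; entry [n, i] plays the role of Ri(n)."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(click_rates, dtype=float)
    return (rng.random((n_rounds, rates.size)) < rates).astype(int)

def total_reward(rewards, chosen_ads):
    """Sum the rewards of the one ad actually shown in each round."""
    rounds = np.arange(rewards.shape[0])
    return int(rewards[rounds, chosen_ads].sum())

# Hypothetical click-through rates for X = 5 ads (illustrative values only).
click_rates = [0.05, 0.10, 0.20, 0.08, 0.12]
rewards = simulate_clicks(click_rates, n_rounds=1000)

# Baseline policy: show a uniformly random ad in every round.
rng = np.random.default_rng(1)
random_choice = rng.integers(0, rewards.shape[1], size=rewards.shape[0])
print("total reward with random selection:", total_reward(rewards, random_choice))
```

Any selection strategy can be scored the same way by passing its per-round choices to `total_reward`.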

Steps:

[Image: steps of the UCB algorithm]
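As a rough illustration of what such steps can look like in code, here is a minimal sketch of the standard UCB1 rule: in each round, pick the ad with the highest value of average reward plus the confidence term sqrt(2 ln n / Ni), where Ni is the number of times ad i has been shown so far. The exact confidence term used in this repository may differ; the function name `ucb_select` and the simulated click rates are assumptions for the example.

```python
import math
import numpy as np

def ucb_select(rewards):
    """Run UCB1 over an (n_rounds, n_ads) 0/1 reward matrix.

    Returns the ad chosen in each round, the selection count per ad,
    and the total reward collected.
    """
    n_rounds, n_ads = rewards.shape
    counts = [0] * n_ads   # Ni: times ad i has been shown
    sums = [0] * n_ads     # accumulated reward of ad i
    chosen = []
    for n in range(n_rounds):
        best_ad, best_bound = 0, -1.0
        for i in range(n_ads):
            if counts[i] == 0:
                # An untried ad gets an infinite bound so it is shown first.
                bound = math.inf
            else:
                mean = sums[i] / counts[i]
                bound = mean + math.sqrt(2 * math.log(n + 1) / counts[i])
            if bound > best_bound:
                best_ad, best_bound = i, bound
        chosen.append(best_ad)
        counts[best_ad] += 1
        sums[best_ad] += int(rewards[n, best_ad])
    return chosen, counts, sum(sums)

# Simulated data with hypothetical click-through rates (illustrative only).
rng = np.random.default_rng(0)
click_rates = np.array([0.05, 0.10, 0.20, 0.08, 0.12])
demo_rewards = (rng.random((2000, click_rates.size)) < click_rates).astype(int)

_, demo_counts, demo_total = ucb_select(demo_rewards)
print("selections per ad:", demo_counts)
print("total reward:", demo_total)
```

The confidence term shrinks as an ad is shown more often, so the rule gradually shifts from exploring all ads to exploiting the one with the best observed click rate.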

Comparison between UCB and Thompson Sampling:

[Image: comparison between UCB and Thompson Sampling]
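One well-known difference is that UCB is deterministic given the observed rewards, while Thompson Sampling is randomized: it draws a sample from a posterior distribution for each ad and shows the ad with the largest draw. The sketch below shows Thompson Sampling for 0/1 rewards with a Beta(1, 1) prior per ad, so it can be run on the same kind of data as a UCB implementation. The function name `thompson_select`, the prior, and the simulated click rates are assumptions for the example, not code from this repository.

```python
import numpy as np

def thompson_select(rewards, seed=0):
    """Run Thompson Sampling over an (n_rounds, n_ads) 0/1 reward matrix.

    Returns per-ad counts of rounds that ended in a click (wins)
    and rounds that did not (losses).
    """
    rng = np.random.default_rng(seed)
    n_rounds, n_ads = rewards.shape
    wins = np.zeros(n_ads, dtype=int)
    losses = np.zeros(n_ads, dtype=int)
    for n in range(n_rounds):
        # One draw per ad from its Beta posterior (uniform Beta(1, 1) prior).
        samples = rng.beta(wins + 1, losses + 1)
        ad = int(np.argmax(samples))
        if rewards[n, ad] == 1:
            wins[ad] += 1
        else:
            losses[ad] += 1
    return wins, losses

# Simulated data with hypothetical click-through rates (illustrative only).
rng = np.random.default_rng(0)
click_rates = np.array([0.05, 0.10, 0.20, 0.08, 0.12])
demo_rewards = (rng.random((2000, click_rates.size)) < click_rates).astype(int)

demo_wins, demo_losses = thompson_select(demo_rewards)
print("selections per ad:", (demo_wins + demo_losses).tolist())
print("total reward:", int(demo_wins.sum()))
```

Running both strategies on the same reward matrix and comparing total reward and selections per ad gives a simple, data-dependent way to compare them.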
