rachelsng/Multiarmed-Bandits-Website-Tuning

Using Multi-armed Bandits to Determine Best Website Layout

Overview

  • This project is based on a case study of the website-optimisation experiment run for Obama’s 2008 fundraising campaign.
  • A simulated dataset models the circumstances of that experiment and is used to test the effectiveness of several multi-armed bandit algorithms (implemented in Python) on the task.

Context

  • Obama's 2008 campaign team experimented with two parts of the splash page of their fundraising website: the “Media” section at the top and the call-to-action “Button”.
  • 4 buttons and 6 media variants (3 images and 3 videos) were tried, as shown below, giving a total of 24 different combinations that they could have used.
  • Each version was tested in real time.

[Image: the 4 button and 6 media variants tested]

From Obama's $60 Million Dollar Experiment.

Objective

The objective is to figure out which of the 24 website configurations gives the:

  1. highest sign-up rate (the greatest number of clicks) with the
  2. fewest tries.

As each 'try' represents a user visit (a conversion opportunity), it is critical to balance exploration, to find the best configuration, against exploitation, showing the suspected best configuration as often as possible to maximize sign-ups.

Dataset

As the dataset models a real-life website data stream, it is not available beforehand.

In the simulation, each website (arm) is shown for a specific period of time, garnering a certain number of sign-ups per 100 visits (the reward). The simulated data stream outputs a dictionary containing this information each time a website is shown (a pull).
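A minimal sketch of what such a stream might look like, assuming Bernoulli sign-up behaviour per visit; the arm names, sign-up rates, and dictionary keys here are illustrative assumptions, not the values the course-provided simulator actually uses:

```python
import random

# Hypothetical true sign-up rates for the 24 configurations (assumed,
# for illustration only -- the real simulator's rates are hidden).
TRUE_RATES = {f"arm_{i}": random.uniform(0.01, 0.10) for i in range(24)}

def pull(arm: str, visits: int = 100) -> dict:
    """Show one website configuration (arm) to `visits` users and return
    a result dictionary, mirroring the per-pull output of the stream."""
    signups = sum(random.random() < TRUE_RATES[arm] for _ in range(visits))
    return {"arm": arm, "visits": visits, "signups": signups}
```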

Steps

As the dataset is already clean, no pre-processing is needed.

  • Implement 4 multi-armed bandit algorithms:
    1. Uniform Exploration
    2. Epsilon-greedy
    3. Adaptive Epsilon-greedy
    4. Upper Confidence Bound 1 (UCB1)
  • Determine best website configuration
  • Compare performance of algorithms with cumulative regret
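As an illustration of step 2, here is a generic epsilon-greedy sketch (an assumed implementation, not the notebook's exact code), where `pull(arm)` is taken to return the reward for one showing of that configuration:

```python
import random

def epsilon_greedy(pull, arms, n_rounds=1000, epsilon=0.1):
    """Generic epsilon-greedy sketch: with probability epsilon explore a
    random arm, otherwise exploit the arm with the highest observed mean
    reward. Returns the arm with the best final estimate."""
    counts = {a: 0 for a in arms}
    totals = {a: 0.0 for a in arms}
    for t in range(n_rounds):
        if t < len(arms):
            arm = arms[t]                    # play every arm once to initialise
        elif random.random() < epsilon:
            arm = random.choice(arms)        # explore
        else:
            arm = max(arms, key=lambda a: totals[a] / counts[a])  # exploit
        counts[arm] += 1
        totals[arm] += pull(arm)
    return max(arms, key=lambda a: totals[a] / counts[a])
```

Uniform Exploration corresponds to always taking the explore branch; the adaptive variant decays `epsilon` over time so exploration tapers off as estimates improve.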

Findings

  • UCB1 converges on the best arm most quickly
  • It also achieves the lowest cumulative regret (as per the graph below)
    • Lower regret (the cumulative difference between the reward of the best arm and that of the chosen arm) is better, because it means the best arm was played most often
  • Certain algorithms, e.g. Uniform Exploration, did not find the best arm at all
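The winning algorithm and the regret metric can be sketched generically (assumed implementations, not the notebook's code; the arm names and rates in the usage are illustrative):

```python
import math

def ucb1(pull, arms, n_rounds=1000):
    """UCB1 sketch: play the arm maximising its observed mean reward plus
    the confidence bonus sqrt(2 * ln(t) / n_arm), which shrinks as an arm
    is played more often. Returns the per-arm pull counts."""
    counts = {a: 0 for a in arms}
    totals = {a: 0.0 for a in arms}
    for t in range(1, n_rounds + 1):
        if t <= len(arms):
            arm = arms[t - 1]  # play each arm once to initialise
        else:
            arm = max(arms, key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        counts[arm] += 1
        totals[arm] += pull(arm)
    return counts

def cumulative_regret(counts, true_rates):
    """Regret is computable in simulation because the true rates are
    known: sum over arms of pulls * (best rate - that arm's rate)."""
    best = max(true_rates.values())
    return sum(n * (best - true_rates[a]) for a, n in counts.items())
```

A good bandit algorithm keeps this sum growing slowly, because almost all pulls go to the best arm, whose per-pull regret is zero.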

[Image: cumulative regret of the four algorithms over time]

Files

  • 01_Multi_Armed_Bandit_Website_Tuning.ipynb : Notebook detailing implementation of 4 bandit algorithms and conclusions

Credits

The simulated dataset was provided by the National University of Singapore.

This was co-authored with Gino Martelli Tiu (@ginosytiu).

About

[Python] Four multi-armed bandit algorithms are implemented to determine which most effectively finds the website configuration that maximizes sign-ups.
