ShahedAlMashni/Starbucks-data-analysis

Starbucks-data-analysis

Overview

This project analyzes a simulated dataset that mimics customer behavior on the Starbucks rewards mobile app. Several features of the dataset were analyzed, and classification models were built to predict how a user would respond to an offer. For a more descriptive analysis, check out this Medium article that I wrote.

This project was completed as the capstone project of the Udacity Data Science Nanodegree.

Table of contents

  1. Getting Started
  2. Project Summary
  3. Datasets
  4. License
  5. Contact

Getting Started

This project uses the following libraries:

To get a local copy of the project, download the dataset from the data folder and place it in the same folder as the project.
The project contains the following:

  • Starbucks_Capstone_notebook.ipynb
  • data folder

The Jupyter notebook contains all of the code and data analysis.

Download the Jupyter notebook from this repository and have fun!

Project Summary

In this project, I analyze data from the Starbucks rewards mobile app and look for trends and relationships between user information and offer data. Finally, I build a machine learning model to predict whether a user will respond to an offer.

Each user on the application has an account that can include demographic information on the user. A user can make a purchase, receive an offer, view an offer or complete an offer. There are three types of offers that can be sent: buy-one-get-one (BOGO), discount, and informational.

  • BOGO: a user needs to spend a certain amount to get a reward equal to that threshold amount.
  • Discount: a user gains a reward equal to a fraction of the amount spent.
  • Informational: merely an advertisement for a drink.
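The three offer types above can be sketched as a small reward function. This is only an illustration of the descriptions in the list, not code from the notebook: the function name `offer_reward` and the parameters `threshold` and `fraction` are hypothetical, and the actual reward values in the dataset may be encoded differently.

```python
def offer_reward(offer_type, amount_spent, threshold=0.0, fraction=0.0):
    """Return the reward a user would earn under the simplified rules above.

    All names here are illustrative assumptions, not fields of the dataset.
    """
    if offer_type == "bogo":
        # BOGO: reaching the spending threshold earns a reward equal to it.
        return threshold if amount_spent >= threshold else 0.0
    if offer_type == "discount":
        # Discount: the reward is a fraction of the amount spent.
        return amount_spent * fraction
    if offer_type == "informational":
        # Informational: an advertisement only, so no reward is attached.
        return 0.0
    raise ValueError(f"unknown offer type: {offer_type!r}")


bogo_reward = offer_reward("bogo", amount_spent=12.0, threshold=10.0)
discount_reward = offer_reward("discount", amount_spent=20.0, fraction=0.2)
```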

Problem Statement:
The question we are trying to answer is how a customer responds when an offer is sent to them. The strategy we follow is:

  1. Data preprocessing and cleaning: we look more deeply at the data to understand its content. The data is then cleaned of anomalies, null values, and duplicates.
  2. Data analysis and visualization: the data is further analyzed and visualized to answer more detailed questions relating to our problem.
  3. Data modelling: we build a machine learning model that predicts whether a user will complete an offer or not. The model is evaluated using the F1 score.
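For reference, the F1 score used in step 3 is the harmonic mean of precision and recall. The sketch below computes it by hand for a binary "offer completed" label; the function name and the toy labels are assumptions made for illustration, and the notebook may well rely on a library implementation instead.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for one positive class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up): 1 = offer completed, 0 = offer not completed.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is 2/3 as well.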

Datasets

There are three datasets available:

  • portfolio.json - contains offer ids and metadata about each offer (duration, type, etc.)
  • profile.json - demographic data for each customer
  • transcript.json - records for transactions, offers received, offers viewed, and offers completed

Further information on the datasets can be found in the Jupyter notebook.
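A minimal sketch of loading the three files with the Python standard library is shown below. It assumes the files sit in a `data` folder next to the notebook and that each file is either a single JSON array or one JSON object per line; both points are assumptions, so adjust the path and parsing to match your copy. The helper names `load_records` and `load_datasets` are hypothetical.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a JSON file stored as one array or as one object per line."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if text.startswith("["):
        # The whole file is a single JSON array of records.
        return json.loads(text)
    # Otherwise treat every non-empty line as its own JSON object.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_datasets(data_dir="data"):
    """Load portfolio, profile, and transcript into a dict of record lists."""
    data_dir = Path(data_dir)
    names = ("portfolio", "profile", "transcript")
    return {name: load_records(data_dir / f"{name}.json") for name in names}
```

Calling `load_datasets()` returns a dictionary keyed by dataset name, and each list of records can then be passed to a data-frame library for analysis.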

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Shahed - shahedmashni@gmail.com
