Native ad prediction model and web app for "Truly Native?" Kaggle competition

ohryshyn/sponsored-ad-prediction

Kaggle Competition: Truly Native?

This repository contains the code for processing the data and building the model for the Truly Native? Kaggle competition, organized by Dato.

Main objective: predict whether the content of an HTML file collected from StumbleUpon is sponsored or not.

Project setup

Follow these steps to set up the project and run it on your local machine.

Prerequisites

  • Python 3.7+
  • Install the dependencies: pip install -r requirements.txt
  • Download the *.zip files containing the raw HTML pages from Kaggle and place them under the data folder (see the expected directory structure below)

Expected directory structure

.
├── data                   # Data files
│   ├── raw                # Raw zip files downloaded from Kaggle
│   ├── csv                # Transformed csv files
│   └── html_targets.csv   # Targets csv file downloaded from Kaggle
├── models                 # Jupyter notebooks: EDA, hyperparameter tuning, model evaluation
│   ├── eda.ipynb          # Exploratory analysis on the processed dataset
│   ├── hp_tuning.ipynb    # Hyperparameter tuning for selected models
│   └── models_eval.ipynb  # Model evaluation with the best parameters
├── app.py                 # Streamlit app
├── process_raw_html.py    # Extract features from zip files
└── ...
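
Before running anything, it can help to confirm that the inputs from the tree above are in place. The following is a minimal sketch, not part of the repository: the function name missing_inputs and the list EXPECTED_INPUTS are hypothetical, and the paths are taken only from the directory structure shown above.

```python
from pathlib import Path

# Paths copied from the expected directory structure above.
# The names EXPECTED_INPUTS and missing_inputs are illustrative, not from the repo.
EXPECTED_INPUTS = ["data/raw", "data/html_targets.csv", "process_raw_html.py"]


def missing_inputs(root="."):
    """Return the expected paths that do not exist under the project root."""
    root = Path(root)
    return [rel for rel in EXPECTED_INPUTS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected inputs found.")
```

An empty list means the raw data folder, the targets file, and the processing script were all found under the given root.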

Running the project

  1. Run python3 process_raw_html.py to extract features from the zip files
  2. Run hp_tuning.ipynb for hyperparameter tuning with randomized search
  3. Run models_eval.ipynb to evaluate the final model and save it to a pickle file
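
To illustrate what step 1 involves, here is a minimal sketch of extracting simple per-page features from a zip of HTML files, using only the Python standard library. It is an assumption-laden example, not the logic of process_raw_html.py: the feature set (link, image, and script counts, plus visible text length) and the names TagCounter, extract_features, and features_from_zip are all hypothetical.

```python
import zipfile
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts a few tag types and the visible text length of one HTML page."""

    def __init__(self):
        super().__init__()
        self.counts = {"a": 0, "img": 0, "script": 0}
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script/style bodies; count only non-whitespace-trimmed text.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def extract_features(html_text):
    """Return a dict of illustrative (hypothetical) features for one page."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return {
        "n_links": parser.counts["a"],
        "n_images": parser.counts["img"],
        "n_scripts": parser.counts["script"],
        "text_chars": parser.text_chars,
    }


def features_from_zip(zip_source):
    """Read every file in a zip archive and return one feature row per file."""
    rows = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            html_text = archive.read(name).decode("utf-8", errors="replace")
            row = {"file": name}
            row.update(extract_features(html_text))
            rows.append(row)
    return rows
```

The resulting rows could be written out with csv.DictWriter into data/csv and joined with html_targets.csv by file name; the exact columns the notebooks expect are not stated here, so treat that step as an assumption as well.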

Presentation

[Slide images: Slide 1 to Slide 14, Slide 16]
