Flu Shot Learning competition on Driven Data. Top 12% ranking

Jswig/drivendata-flu-learning

Driven Data: Flu Shot Learning

Repository for my work on the Flu Shot Learning competition on Driven Data. Driven Data profile: apoirel. Work in progress.

⚙ Setup

Requirements

  • snakemake
  • conda

For experimenting on local OS

This is only necessary if you intend to experiment with or modify the code. Create a conda environment with all the required packages:

conda env create -f environment.yml

♻ Reproducing the results

On local OS

From this directory, run:

snakemake --use-conda all

On Singularity (recommended)

snakemake --use-singularity --use-conda all
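
The two invocations above differ only in the `--use-singularity` flag. As a minimal sketch, a small Python wrapper could assemble either command. The function names `build_snakemake_command` and `run_workflow` are hypothetical and not part of this repository; only the flags and the `all` target come from the commands shown above.

```python
import subprocess
from typing import List


def build_snakemake_command(use_singularity: bool = True, target: str = "all") -> List[str]:
    """Assemble one of the two snakemake invocations documented in this README."""
    command = ["snakemake"]
    if use_singularity:
        # The recommended mode adds this flag to run inside a container.
        command.append("--use-singularity")
    # Both documented commands pass --use-conda followed by the target.
    command.extend(["--use-conda", target])
    return command


def run_workflow(use_singularity: bool = True) -> int:
    """Run the workflow from the current directory and return the exit code."""
    completed = subprocess.run(build_snakemake_command(use_singularity), check=False)
    return completed.returncode
```

Calling `run_workflow(use_singularity=False)` from the repository root would correspond to the local OS command, assuming `snakemake` is on the `PATH`.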

📁 Project structure

├── environment.yml          <- The file defining the conda Python environment.
├── Snakefile                <- Definition of the full workflow for reproducing the analysis.
├── LICENSE                                 
├── README.md                <- The top-level README.
├── data
│   ├── processed            <- The final, canonical data sets for modeling.
│   └── raw                  <- The original, immutable data dump.
├── output             
│   ├── models               <- Serialized models, predictions, model summaries.
│   └── figures              <- Graphics created during analysis.
├── paper                    <- Generated analysis as PDF, LaTeX.
└── src                      <- Source code for this project.
    ├── notebooks            <- Jupyter notebooks.
    └── __init__.py          <- Makes src a Python package.
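
As an illustration of how code might refer to this layout, the sketch below derives the main directories from a single root path. The `ProjectPaths` class is hypothetical and does not exist in `src`; only the directory names are taken from the tree above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectPaths:
    """Hypothetical helper mapping the documented layout to paths."""

    root: Path

    @property
    def raw_data(self) -> Path:
        # The original, immutable data dump.
        return self.root / "data" / "raw"

    @property
    def processed_data(self) -> Path:
        # The final data sets used for modeling.
        return self.root / "data" / "processed"

    @property
    def models(self) -> Path:
        # Serialized models, predictions, model summaries.
        return self.root / "output" / "models"

    @property
    def figures(self) -> Path:
        # Graphics created during analysis.
        return self.root / "output" / "figures"


paths = ProjectPaths(root=Path("."))
print(paths.raw_data, paths.models)
```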

🏆 Results

  • 0.8342 AUC on hidden test set, 181/948 on leaderboard (baseline LR)
  • 0.8462 AUC on hidden test set, 133/953 on leaderboard (tuned random forest)
  • 0.8473 AUC on hidden test set, 130/953 on leaderboard (XGBoost baseline)
  • 0.8530 AUC on hidden test set, 112/953 on leaderboard (moderately tuned XGBoost)
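
For readers unfamiliar with the numbers above, here is a minimal sketch of what they measure. It computes ROC AUC for a single binary target using the pairwise definition (the fraction of positive/negative pairs ranked correctly, ties counting half) and converts a leaderboard rank to a percentile. Both function names are illustrative, the toy labels and scores are made up, and the competition's exact scoring procedure is not described in this README, so this is not a reproduction of it.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the chance that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0   # pair ranked correctly
            elif pos_score == neg_score:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


def leaderboard_percentile(rank: int, participants: int) -> float:
    """Position on the leaderboard as a percentage (lower is better)."""
    return 100.0 * rank / participants


# Toy data, not taken from the competition.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))   # 0.75
print(round(leaderboard_percentile(112, 953), 1))      # 11.8
```

The second call shows where the "top 12%" figure comes from: rank 112 out of 953 entries is roughly the 11.8th percentile.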

Todo: refactor the notebook code for the last two models into proper Python scripts.

📃 License

This project is distributed under the MIT license.
