CriteoLab Click-Through-Rate

Course: w261 Final Project (Criteo Click Through Rate Challenge)

Team Number: 19

Team Members: Steve Dille, Naga Akkineni, Joanna Yu, Pauline Wang

Fall 2019: sections 1, 3, 4

Our final submission comprises the following components:

Jupyter notebook - main Jupyter notebook of our project.
.py files - Python files used for GCP submission to run the models in the cloud.

'steve_ctr_DT_full.py' - GCP job for Decision Tree base model (Model I) on the full training set.

'final_proj_GCP_RF_SSI2.py' - GCP job for Random Forest base model (Model II) on the toy dataset.

'steve_ctr_RF_full.py' - GCP job for Random Forest base model (Model II) on the full training set.

'steve_ctr_ ' - GCP job for Random Forest model with preprocessing, meaning scaled integer features and string-indexed categorical features (Model III), on the full training set.

'steve_crt_RF_full.py' - GCP job for Random Forest model with preprocessing listed in Model III plus Gradient Boosting (Model IV) on the full training set.
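The Model III preprocessing described above (scaled integer features, string-indexed categorical features, then a Random Forest) could be expressed as a `pyspark.ml` pipeline along the following lines. This is a minimal sketch, not the code in the project's GCP jobs: the column names `I1`..`I13`, `C1`..`C26` and `label`, the function name `build_rf_pipeline`, and all hyperparameter defaults are assumptions for illustration, and missing values are assumed to have been imputed beforehand.

```python
# Hypothetical column names: the Criteo data has 13 integer and 26 categorical
# features (39 in total); the names used here are assumptions.
INT_COLS = [f"I{i}" for i in range(1, 14)]
CAT_COLS = [f"C{i}" for i in range(1, 27)]


def build_rf_pipeline(label_col="label", num_trees=20, max_depth=5, max_bins=32):
    """Return an unfitted pyspark.ml Pipeline (requires an active SparkSession)."""
    # Imported inside the function so the column lists can be reused without Spark.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.feature import StandardScaler, StringIndexer, VectorAssembler

    # String-index every categorical column; "keep" sends unseen values to an extra index.
    indexers = [
        StringIndexer(inputCol=c, outputCol=f"{c}_idx", handleInvalid="keep")
        for c in CAT_COLS
    ]

    # Pack the integer columns into one vector and standardize it.
    int_vector = VectorAssembler(inputCols=INT_COLS, outputCol="int_vec")
    scaler = StandardScaler(
        inputCol="int_vec", outputCol="int_scaled", withMean=True, withStd=True
    )

    # Combine scaled integers and indexed categoricals into the final feature vector.
    features = VectorAssembler(
        inputCols=["int_scaled"] + [f"{c}_idx" for c in CAT_COLS],
        outputCol="features",
    )

    # Spark tree learners need maxBins to be at least the largest number of
    # categories among the indexed features, so high-cardinality columns must be
    # reduced first or max_bins raised accordingly.
    forest = RandomForestClassifier(
        labelCol=label_col,
        featuresCol="features",
        numTrees=num_trees,
        maxDepth=max_depth,
        maxBins=max_bins,
    )
    return Pipeline(stages=indexers + [int_vector, scaler, features, forest])
```

With a Spark DataFrame `train_df` that has these columns, `build_rf_pipeline().fit(train_df)` would fit all stages in order. Swapping `RandomForestClassifier` for `GBTClassifier` in the last stage would give a gradient-boosted variant over the same preprocessing.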

Pickle files - Pickle files for the Pandas tables that we created for pretty printing:

'summary.pkl' stores the summary statistics table for the training dataset.

'correlation.pkl' stores all the pairwise correlation values for all 39 features and the label for the training dataset.

'correlation_subset.pkl' takes the table from 'correlation.pkl' and filters out entries with correlation values > 0.5.

'DT_base_PD.pkl' stores the summary metrics table for the Decision Tree base model with varied MaxDepth to compare model performance.

'toyDF_PD.pkl' stores the Pandas dataframe for the toy dataset used for EDA. The toy dataset is 1.5% of the training set.
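As a rough illustration of how tables like the correlation pickles above can be produced and read back, the sketch below builds a pairwise correlation table with pandas, selects pairs by a 0.5 threshold, and round-trips the result through a pickle file. The data is synthetic and the column names are placeholders. It is also an assumption that the subset keeps pairs whose absolute correlation exceeds 0.5; flip the comparison if the intent was to drop them.

```python
import os
import tempfile
from itertools import combinations

import numpy as np
import pandas as pd

# Synthetic stand-in data; the real tables come from the Criteo training set.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "label": (x > 0).astype(int),
    "I1": x,
    "I2": x + 0.1 * rng.normal(size=200),  # strongly correlated with I1
    "I3": rng.normal(size=200),            # independent noise
})

# Pairwise Pearson correlations between all columns, label included.
corr = df.corr()

# One row per unordered pair, skipping self-pairs and mirrored duplicates.
rows = [(a, b, corr.loc[a, b]) for a, b in combinations(corr.columns, 2)]
pairs = pd.DataFrame(rows, columns=["feature_a", "feature_b", "corr"])

# Assumed reading of the threshold: keep pairs with |corr| > 0.5.
strong = pairs[pairs["corr"].abs() > 0.5].reset_index(drop=True)

# Round-trip through a pickle file, as with the .pkl files in this repository.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "correlation_subset_demo.pkl")
    strong.to_pickle(path)
    loaded = pd.read_pickle(path)

print(loaded)
```

The project's own files can be loaded the same way, for example `pd.read_pickle('summary.pkl')`. Pickled pandas objects are not guaranteed to load across pandas versions, so a version close to the one used in the notebook is the safer choice.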

