yil479/ds_hackathon_2019

2019 Columbia Data Science Society hackathon

This is our project for the 2019 Columbia Data Science Society Hackathon, sponsored by Facebook and NYC Academy. After two days of work, we won 1st place.

To find the ideal location for a new supermarket, we first used data visualization tools, including CARTO, to explore patterns in the dataset, and then used several statistical models to predict traffic as an indicator of profitability.

The presentation slides are in the repository folder.

• Expanded a dataset of only 155 data points and 12 variables into more than 10,000 data points and 22 variables by splitting fields into multiple variables, merging externally sourced datasets, and aggregating data
• Preprocessed the data for machine learning by applying data scaling, feature engineering guided by data visualization tools (the Python library Seaborn and the software CARTO), and one-hot encoding
• Applied multiple supervised regression models (KNN, decision tree, random forest) and split the dataset into several different partitions to check the learning speed of the models, reaching an R² score of 0.99
• Identified the four most important variables using PCA and produced a final demo, consistent with the goal of the analysis, that predicts customer traffic for a given store location
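The preprocessing and modeling steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code from the hackathon: the data is synthetic, and the column names (`latitude`, `longitude`, `population`, `borough`, `traffic`), the test-set fractions, and the model settings are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: every column name and value here is hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "latitude": rng.uniform(40.5, 40.9, n),
    "longitude": rng.uniform(-74.2, -73.7, n),
    "population": rng.integers(1_000, 50_000, n),
    "borough": rng.choice(["A", "B", "C"], n),
})
df["traffic"] = (
    0.01 * df["population"]
    + 100 * (df["borough"] == "A")
    + rng.normal(0, 5, n)
)

NUMERIC = ["latitude", "longitude", "population"]
CATEGORICAL = ["borough"]

# Factories so that each fit starts from a fresh, unfitted model.
MODELS = {
    "knn": lambda: KNeighborsRegressor(n_neighbors=5),
    "decision_tree": lambda: DecisionTreeRegressor(random_state=0),
    "random_forest": lambda: RandomForestRegressor(n_estimators=100, random_state=0),
}


def build_pipeline(model):
    """Scale numeric columns, one-hot encode categorical ones, then regress."""
    preprocess = ColumnTransformer([
        ("scale", StandardScaler(), NUMERIC),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([("preprocess", preprocess), ("model", model)])


def evaluate(data, test_sizes=(0.1, 0.2, 0.3)):
    """Return the test R^2 for every (model, test_size) combination."""
    X = data[NUMERIC + CATEGORICAL]
    y = data["traffic"]
    results = {}
    for test_size in test_sizes:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, random_state=0
        )
        for name, make_model in MODELS.items():
            pipeline = build_pipeline(make_model())
            pipeline.fit(X_train, y_train)
            results[(name, test_size)] = r2_score(y_test, pipeline.predict(X_test))
    return results


scores = evaluate(df)
for (name, test_size), r2 in sorted(scores.items()):
    print(f"{name:14s} test_size={test_size:.1f}  R2={r2:.3f}")
```

Wrapping the scaler and encoder in a pipeline means they are fitted on the training partition only, so comparing models across partitions is not skewed by information leaking from the held-out rows.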
