• Applied non-parametric statistical correlation tests to identify relationships between variables
• Cleaned messy demographic data (nearly 900k) into a usable form with Python
• Performed dimensionality reduction with PCA in scikit-learn, retaining 85% of the explained variance
• Clustered the data into groups using k-means
• Identified the segments of the population most likely to purchase their products, to target a mail-out campaign
• Developed an end-to-end ML/data pipeline application for data cleaning and unsupervised learning