Data Science Portfolio created for academic and personal projects.

prat0101/Data-Science-Portfolio

Data Science Projects Portfolio

1. Reduce the test bench time for vehicles

• Objective: Train a model to predict a vehicle's test bench time from 378 categorical features.

• Feature engineering and feature selection: Handled zero-variance features, handled multicollinearity, and treated categorical features with a large number of categories.

• Model Training: Used linear regression, ridge regression, gradient boosting, and XGBoost to train the model, and performed hyperparameter tuning to optimize performance metrics. The trained model was then used to select feature values that minimize testing time.
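The feature-selection step can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the function name `drop_uninformative_columns`, the column names, and the toy rows are all hypothetical, and exact duplicate columns stand in here for the more general multicollinearity handling mentioned above.

```python
from itertools import combinations


def drop_uninformative_columns(rows, columns):
    """Return the column names that survive two simple filters.

    rows    : list of dicts mapping column name -> value
    columns : ordered list of column names to inspect
    """
    # 1. Zero variance: a column with a single distinct value carries no signal.
    varying = [c for c in columns if len({row[c] for row in rows}) > 1]

    # 2. Exact duplicates: the most extreme case of multicollinearity.
    #    Keep the first column of each identical pair and drop the later one.
    dropped = set()
    for a, b in combinations(varying, 2):
        if a in dropped or b in dropped:
            continue
        if all(row[a] == row[b] for row in rows):
            dropped.add(b)
    return [c for c in varying if c not in dropped]


# Hypothetical toy data: four vehicles, four binary features.
rows = [
    {"X1": 0, "X2": 1, "X3": 1, "X4": 0},
    {"X1": 0, "X2": 0, "X3": 0, "X4": 1},
    {"X1": 0, "X2": 1, "X3": 1, "X4": 1},
    {"X1": 0, "X2": 0, "X3": 0, "X4": 0},
]
kept = drop_uninformative_columns(rows, ["X1", "X2", "X3", "X4"])
print(kept)
```

In this toy example `X1` is constant and `X3` duplicates `X2`, so only `X2` and `X4` remain. On real data, a correlation threshold or variance inflation factor would typically replace the exact-duplicate check.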

2. Customer Segmentation for a Retail Store

• Objective: Perform customer segmentation using RFM analysis (Recency, Frequency, and Monetary value) to identify the store's prominent customers.

• Exploratory data analysis (EDA): Performed cohort analysis and built RFM segments.

• Clustering on RFM data: Detected outliers, selected a feature scaling method, applied the K-means clustering algorithm to the scaled data, and used the elbow method and silhouette score to choose the optimal number of clusters.

• Data Visualization: Created a dashboard in Tableau showing average sales in different countries, top-selling products, hourly sales, and a heatmap of RFM values.
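As a rough illustration of how the RFM values behind these segments can be derived, the sketch below aggregates raw transactions per customer using only the standard library. The function `rfm_table`, the tuple layout, and the sample transactions are assumptions made for this example, and each row is counted as one purchase, which may differ from how the project defined frequency.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, snapshot):
    """Aggregate (customer_id, purchase_date, amount) rows into RFM values."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Track the most recent purchase date per customer.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1      # assumption: one row == one purchase
        monetary[customer] += amount  # total spend
    return {
        customer: {
            "recency": (snapshot - last_purchase[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
        }
        for customer in last_purchase
    }


# Hypothetical transactions for two customers.
transactions = [
    ("A", date(2021, 1, 5), 20.0),
    ("A", date(2021, 3, 1), 35.0),
    ("B", date(2021, 2, 10), 10.0),
]
rfm = rfm_table(transactions, snapshot=date(2021, 3, 11))
for customer, values in rfm.items():
    print(customer, values)
```

The three resulting columns are what would then be scaled and passed to K-means, since recency (days) and monetary value (currency) sit on very different scales.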

3. Comparison of Regions Based on Sales (Data Visualization)

• Objective: Compare sales data between two regions using a Tableau dashboard and suggest necessary improvements to management.

• Created parameters for regions, showed the sum of sales for different products, showed the variation of sales over time, and used maps to show the states in each region.

• Created a dashboard to compare sales characteristics of two different regions at a time.

4. Comcast Customer Project Details

• Objective: The data describes customer complaints received from different regions at different times of the year. Analyze the complaints by type, number, and regional distribution to help the telecom service provider take the actions needed to reduce the number of complaints.

• Tasks Performed: Used EDA techniques in the pandas library to analyze the registered complaints.
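The kind of grouping this analysis relies on can be sketched without any dependencies. The project itself used pandas. The version below uses `collections.Counter` instead, and the record layout, region names, and complaint types are placeholders rather than values from the real dataset.

```python
from collections import Counter

# Hypothetical records: (region, month, complaint_type)
complaints = [
    ("Region A", "2015-04", "billing"),
    ("Region A", "2015-04", "internet speed"),
    ("Region B", "2015-05", "billing"),
    ("Region A", "2015-06", "billing"),
]

# Count complaints along each dimension of interest.
by_region = Counter(region for region, _, _ in complaints)
by_month = Counter(month for _, month, _ in complaints)
by_type = Counter(kind for _, _, kind in complaints)

print(by_region.most_common())  # regions ordered by complaint volume
print(sorted(by_month.items()))  # complaints per month, chronological
print(by_type.most_common(1))    # the most frequent complaint type
```

With pandas, the same counts would usually come from `value_counts()` or a `groupby` on the corresponding columns.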

5. Marketing Mix Modeling

• Objective: Establish the correlation between marketing promotion spend and sales.

• Feature engineering and feature selection: Log transformation, outlier detection, multicollinearity check, and feature scaling (standardization).

• Model Training: Used statsmodels, linear regression, and XGBoost, and performed hyperparameter tuning to optimize metrics. Generated response curves.

• Data Visualization: Used Tableau dashboards to show the relationship between Product, Price, Promotion, and Place (the 4Ps) and sales, and to optimize promotion spend.
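To make the log transformation and response curve concrete, here is a minimal single-channel sketch. It assumes a model of the form sales = intercept + slope * log(1 + spend), which is one common choice and not necessarily the form used in this project. The function names and the synthetic data are hypothetical.

```python
import math


def fit_log_response(spend, sales):
    """Ordinary least-squares fit of sales = intercept + slope * log(1 + spend)."""
    x = [math.log1p(s) for s in spend]  # log transformation of spend
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(sales) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, sales))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def response(spend, intercept, slope):
    """Predicted sales at a given spend level (one point on the response curve)."""
    return intercept + slope * math.log1p(spend)


# Synthetic data generated from sales = 100 + 20 * log(1 + spend).
spend = [0.0, 10.0, 50.0, 200.0]
sales = [100 + 20 * math.log1p(s) for s in spend]

intercept, slope = fit_log_response(spend, sales)
curve = [response(s, intercept, slope) for s in (0, 100, 200, 300)]
print(round(intercept, 3), round(slope, 3))
print([round(v, 1) for v in curve])
```

Each additional 100 units of spend adds less predicted sales than the previous 100, which is the diminishing-returns shape a response curve is meant to expose when deciding how to allocate promotion budget.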