Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!

Saurabhkhandebharad/BigData-SK

BigData - Saurabh Khandebharad

E-Commerce Analytics and Big Data Processing (End-to-End Group Project)

Guidance by: Pradeep Tripathi

KAGGLE DATASET: https://www.kaggle.com/datasets/mkechinov/ecommerce-behavior-data-from-multi-category-store

File Name: 2019-Nov.csv

File Size: 8 GB

Project Architecture

Architecture

Since data analysis and visualization are my strengths, I volunteered to do the data cleaning and preprocessing in PySpark. Head over to PySpark.py to check out my code! Handling a dataset this large (8 GB) was fun and a great learning experience!

👉My PySpark Script

Power BI Visualizations

Page 1 - Dashboard Page1

Page 2 - Dashboard Page2



Don't forget to leave a star! ⭐
