Exploratory Data Analysis and Data Cleaning on an Amazon E-Commerce Dataset

babiotg/E-Commerce-Amazon-EDA

This project is an exploratory data analysis and data cleaning of a dataset that contains detailed insights into Amazon sales data. The dataset covers a 90-day time frame, from 31 March 2022 to 29 June 2022, and includes information such as SKU Code, Design Number, Stock, Category, and Size. The goal of the project is to help optimize product profitability by answering several key questions related to the sales data.

To see the project's Jupyter notebook, open this link: Exploratory Data Analysis and Data Cleaning.ipynb

Scope of the project

The project aims to answer the following questions:

  1. Which categories have sold the most?
  2. What are the 20 best-selling products by quantity?
  3. What are the 20 best-selling products by revenue?
  4. Which 20 cities placed the most orders?
  5. Which 20 cities generated the most income?
  6. How many orders are fulfilled by Amazon, and how many are fulfilled by the Merchant?
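Most of these questions reduce to the same operation: group the rows by a key, sum a numeric column, and keep the largest groups. The following is a minimal sketch of that idea in plain Python, not the notebook's actual code. The toy rows and the field names (`category`, `sku`, `qty`, `amount`) are assumptions made for illustration and may not match the dataset's real column headers.

```python
from collections import defaultdict

# Toy rows standing in for the sales report. The field names are
# hypothetical and chosen only for this example.
rows = [
    {"category": "SET", "sku": "A1", "qty": 2, "amount": 1200.0},
    {"category": "KURTA", "sku": "B1", "qty": 3, "amount": 900.0},
    {"category": "SET", "sku": "A2", "qty": 1, "amount": 700.0},
    {"category": "WESTERN DRESS", "sku": "C1", "qty": 1, "amount": 800.0},
    {"category": "KURTA", "sku": "B1", "qty": 1, "amount": 300.0},
]


def top_n(rows, group_key, value_key, n=20):
    """Sum value_key per group_key and return the n largest (group, total) pairs."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    # Sort by total, largest first; ties are broken by group name.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


revenue_by_category = top_n(rows, "category", "amount")
best_sellers_by_qty = top_n(rows, "sku", "qty", n=2)
```

Swapping `group_key` and `value_key` covers the category, product, and city rankings with one helper; in pandas the same aggregation is a `groupby` followed by `sum` and `nlargest`.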

Key findings

The analysis of the Amazon sales data yielded the following key findings:

  1. The three best-selling categories were SET, KURTA, and WESTERN DRESS.

Revenue by Categories

  2. The 20 best-selling products by quantity all belong to the categories Western Dress, Kurta, and Set.

20 Best-Selling Products by Quantity

  3. By contrast, the 20 best-selling products by revenue belong only to the categories Western Dress and Set.

20 Best-Selling Products by Revenue

  4. The 20 cities with the most orders were ranked, and each region was assigned its own color in the chart. Bangalore is the city with the highest number of orders.

First 20 Cities by Nr. of Orders
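A ranking by number of orders depends on how an order is counted. The sketch below assumes that one order can span several rows and that city names may differ in case or spacing, so it counts distinct order IDs per normalized city name. Both assumptions, and the field names `order_id` and `city`, are hypothetical; the notebook may count rows or clean the city column differently.

```python
from collections import defaultdict

# Hypothetical line items: order O1 appears twice because it has two rows.
line_items = [
    {"order_id": "O1", "city": "BANGALORE"},
    {"order_id": "O1", "city": "BANGALORE"},
    {"order_id": "O2", "city": "bangalore "},
    {"order_id": "O3", "city": "Mumbai"},
    {"order_id": "O4", "city": "HYDERABAD"},
    {"order_id": "O5", "city": "mumbai"},
    {"order_id": "O6", "city": "Bangalore"},
]


def orders_per_city(items, n=20):
    """Return the n cities with the most distinct orders as (city, count) pairs."""
    order_ids = defaultdict(set)
    for item in items:
        # Normalize case and surrounding whitespace before grouping.
        city = item["city"].strip().upper()
        order_ids[city].add(item["order_id"])
    counts = {city: len(ids) for city, ids in order_ids.items()}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


ranking = orders_per_city(line_items)
```

Using a set per city keeps a multi-row order from being counted more than once.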

  5. The 20 cities that generated the most income were also ranked. Bangalore again holds the leading position.

First 20 Cities by Revenue

  6. The analysis revealed that 69.5% of the orders were fulfilled by Amazon, while 30.5% were fulfilled by the Merchant.

Fulfilment Orders Overview
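A percentage split like this one can be computed by counting each label and dividing by the total. The sketch below uses ten made-up values, so its output is illustrative and does not reproduce the 69.5% and 30.5% figures; the list name `fulfilment` is an assumption.

```python
from collections import Counter

# Ten made-up fulfilment labels, one per order.
fulfilment = [
    "Amazon", "Amazon", "Merchant", "Amazon", "Merchant",
    "Amazon", "Amazon", "Amazon", "Merchant", "Amazon",
]


def share(values):
    """Return each label's share of the total as a percentage with one decimal."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: round(100 * count / total, 1) for label, count in counts.items()}


shares = share(fulfilment)
```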

The shipping state overview was also checked.

Shipping State Overview

The fulfilment ratio between Amazon and Merchant stays roughly constant (around 70% to 30%) even when we consider the individual destination cities.

First 20 Cities by Revenue - Fulfillment hue
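One way to check whether the split holds per city is to compute the same percentage within each city and compare the results. This is a sketch under assumed inputs: the `(city, channel)` pairs are invented and the numbers are not taken from the dataset.

```python
from collections import Counter, defaultdict

# Invented (city, fulfilment channel) pairs, one per order.
orders = [
    ("Bangalore", "Amazon"), ("Bangalore", "Amazon"),
    ("Bangalore", "Amazon"), ("Bangalore", "Merchant"),
    ("Mumbai", "Amazon"), ("Mumbai", "Amazon"), ("Mumbai", "Merchant"),
]


def share_by_city(orders):
    """Return, for each city, the percentage of orders per fulfilment channel."""
    per_city = defaultdict(Counter)
    for city, channel in orders:
        per_city[city][channel] += 1
    result = {}
    for city, counts in per_city.items():
        total = sum(counts.values())
        result[city] = {ch: round(100 * c / total, 1) for ch, c in counts.items()}
    return result


city_shares = share_by_city(orders)
```

If the per-city percentages stay close to the overall split, the ratio does not depend much on the destination.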

Overall, this exploratory data analysis and data cleaning project provides valuable insights into Amazon sales data and can help businesses optimize product profitability.

Data Source

The data for this project can be found at data.world, thanks to ANil.

Author

Barbara Callegari
To learn more about me, visit my LinkedIn profile.

Licence

All rights reserved 2023. All code is created and owned by Barbara Callegari.
If you use this code, please give me a skill endorsement in Python and Data Analysis on LinkedIn.
Visit me at https://www.linkedin.com/in/barbaracallegari
