
This repository contains the README and the Jupyter notebook for a data analysis project on a sugarcane production dataset covering countries around the globe. The project answers questions about the factors behind sugarcane production.

Syeda-Mal/Sugarcane-Production-Project


Sugarcane Production Analysis Project

Overview

This project analyzes sugarcane production across various countries. The dataset covers production quantity, production per person, acreage, and yield.

Purpose

This analysis answers questions that arise when we think about sugarcane production across various countries. Some example questions include:

  • Which continent has the highest number of countries producing sugarcane?
  • Which country has the highest production of sugarcane?
  • Which country has the most land, and does the most land translate to the highest production of sugarcane?
  • Does the number of countries in a continent affect its production of sugarcane? And so on...

Table of Contents

-- Loading the Dataset

-- Data Cleaning

-- Univariate Analysis

-- Checking for Outliers

-- Bivariate Analysis

-- Correlation

-- Analysis (Continent based)

Dataset (Columns)

  1. Country: Country name
  2. Continent: Continent of the country
  3. Production(Tons): Total sugarcane production in tons
  4. Production_per_Person(Kg): Sugarcane production per person in kilograms
  5. Acreage(Hectare): Total acreage dedicated to sugarcane cultivation in hectares
  6. Yield(Kg/Hectare): Yield of sugarcane in kilograms per hectare

Data Cleaning

The dataset underwent a cleaning process to ensure accuracy and consistency:

-> Removed dots and replaced commas for better numerical representation

-> Converted data types to appropriate formats

-> Handled missing values by dropping the affected entries
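The cleaning steps above could be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: it assumes the raw numeric columns are strings that use dots as thousands separators and commas as decimal marks, and the sample rows and the `clean` helper are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the values are invented for illustration only.
raw = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Continent": ["X", "X", "Y"],
    "Production(Tons)": ["1.000.000", "250.000", None],
    "Production_per_Person(Kg)": ["3.668,531", "260,721", "10,5"],
    "Acreage(Hectare)": ["10.000", "5.000", "100"],
    "Yield(Kg/Hectare)": ["75.167,5", "70.393,5", "1.000,0"],
})

NUMERIC_COLUMNS = [
    "Production(Tons)",
    "Production_per_Person(Kg)",
    "Acreage(Hectare)",
    "Yield(Kg/Hectare)",
]


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: normalise separators, convert types, drop missing rows."""
    out = df.copy()
    for col in NUMERIC_COLUMNS:
        # Assumption: "." is a thousands separator and "," is the decimal mark.
        text = (
            out[col]
            .str.replace(".", "", regex=False)
            .str.replace(",", ".", regex=False)
        )
        # Convert to floats; anything unparseable becomes NaN.
        out[col] = pd.to_numeric(text, errors="coerce")
    # Drop rows that still have missing values.
    return out.dropna().reset_index(drop=True)


cleaned = clean(raw)
print(cleaned.dtypes)
```

If the real file uses different separators, only the two `str.replace` calls would need to change.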

Exploratory Data Analysis

Univariate Analysis

Explored individual columns to understand their distribution and characteristics.
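As an illustration of a single-column check, the sketch below summarises one column and flags outliers with the common 1.5 × IQR rule. The README does not say which outlier method the notebook uses, so the IQR rule is an assumption here, and the values are made up.

```python
import pandas as pd

# Made-up values for illustration; not taken from the project's dataset.
production = pd.Series(
    [10.0, 12.0, 11.0, 13.0, 12.0, 100.0], name="Production(Tons)"
)

# Summary statistics for the column (count, mean, std, quartiles, ...).
print(production.describe())

# Assumed outlier rule: values beyond 1.5 * IQR from the quartiles.
q1 = production.quantile(0.25)
q3 = production.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = production[(production < lower) | (production > upper)]
print(outliers)
```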

Bivariate Analysis

Investigated relationships between different variables, including production, acreage, and yield.

Correlation

Explored the correlation matrix to identify relationships between different features.
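A correlation matrix of this kind can be computed with pandas' `DataFrame.corr`, which uses the Pearson coefficient by default. The sketch below uses made-up numbers and only three of the dataset's columns; it is an example of the technique, not the notebook's output.

```python
import pandas as pd

# Made-up numeric values for illustration only.
df = pd.DataFrame({
    "Production(Tons)": [100.0, 200.0, 300.0, 400.0],
    "Acreage(Hectare)": [10.0, 20.0, 30.0, 40.0],
    "Yield(Kg/Hectare)": [5.0, 3.0, 4.0, 1.0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))
```

Values close to 1 or -1 indicate a strong linear relationship between two columns, while values near 0 indicate a weak one.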

Analysis by Continent

Analyzed sugarcane production based on continents, examining factors such as the number of countries, production distribution, and correlation.
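A continent-level summary like this could be built with a `groupby` on the `Continent` column. The following is a minimal sketch with placeholder countries, continents, and numbers; the aggregate column names `countries` and `total_production` are hypothetical.

```python
import pandas as pd

# Placeholder rows for illustration; names and numbers are invented.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["X", "X", "X", "Y"],
    "Production(Tons)": [10.0, 20.0, 30.0, 100.0],
})

# Count countries and sum production per continent, largest producers first.
by_continent = (
    df.groupby("Continent")
    .agg(
        countries=("Country", "count"),
        total_production=("Production(Tons)", "sum"),
    )
    .sort_values("total_production", ascending=False)
)
print(by_continent)
```

In this toy data the continent with the fewest countries has the highest total, which is the kind of pattern the question about country counts and production is meant to probe.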

Conclusion

Concluded with key findings and insights derived from the analysis, emphasizing factors influencing sugarcane production.
