Sales forecasting is critical for businesses to allocate resources, manage cash flow, and meet customer expectations. The BigMart Sales Prediction project explores data processing, exploratory data analysis, and the development of various machine-learning models to predict product sales in different stores.
The goal of this project is to build and evaluate predictive models for sales forecasting, helping BigMart understand the factors influencing sales and develop better business strategies.
The dataset contains annual sales records for 1559 products across ten stores in different cities. Key attributes include:
- item_identifier: Unique item identifier
- item_weight: Item weight
- item_fat_content: Fat content in the item
- item_visibility: Product visibility in the outlet
- item_type: Product category
- item_mrp: Maximum retail price
- outlet_identifier: Outlet identifier
- outlet_establishment_year: Year of outlet establishment
- outlet_size: Outlet size
- outlet_location_type: Outlet location type
- outlet_type: Outlet type
- item_outlet_sales: Overall sales of the product in the outlet
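As a quick illustration of this schema, the sketch below builds a tiny sample with the listed columns and counts missing values per column, which is typically the first check before cleaning and imputation. The rows are invented for illustration and are not taken from the BigMart data, and the helper name `summarize_missing` is hypothetical.

```python
import numpy as np
import pandas as pd

# Column names as listed above.
COLUMNS = [
    "item_identifier", "item_weight", "item_fat_content", "item_visibility",
    "item_type", "item_mrp", "outlet_identifier", "outlet_establishment_year",
    "outlet_size", "outlet_location_type", "outlet_type", "item_outlet_sales",
]

# Invented rows, for illustration only.
sample = pd.DataFrame(
    [
        ["A01", 9.3, "Low Fat", 0.016, "Dairy", 249.8,
         "OUT1", 1999, "Medium", "Tier 1", "Supermarket", 3735.1],
        ["B02", np.nan, "Regular", 0.019, "Soft Drinks", 48.3,
         "OUT2", 2009, None, "Tier 3", "Grocery Store", 443.4],
        ["C03", 17.5, "Low Fat", 0.0, "Meat", 141.6,
         "OUT1", 1999, "Medium", "Tier 1", "Supermarket", 2097.3],
    ],
    columns=COLUMNS,
)


def summarize_missing(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values per column, largest first."""
    return df.isna().sum().sort_values(ascending=False)


missing = summarize_missing(sample)
# Show only the columns that actually have gaps.
print(missing[missing > 0])
```

On real data the same helper would be applied to the DataFrame loaded from the project's data source.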
- Language: Python
- Libraries: Pandas, NumPy, Matplotlib, Scikit-learn, Redshift Connector, Pyearth, PyGAM
- Data Exploration with Amazon Redshift
- Data Cleaning and Imputation
- Exploratory Data Analysis
  - Categorical Data
  - Continuous Data
- Correlation
  - Pearson’s Correlation
  - Chi-squared Test and Contingency Tables
  - Cramer’s V Test
  - One-way ANOVA
- Feature Engineering
  - Outlet Age
  - Label Encoding for Categorical Variables
- Data Split
- Model Building and Evaluation
  - Linear Regressor
  - Elastic Net Regressor
  - Random Forest Regressor
  - Extra Trees Regressor
  - Gradient Boosting Regressor
  - MLP Regressor
  - Multivariate Adaptive Regression Splines (MARS)
  - Spline Regressor
  - Generalized Additive Models
  - Voting Regressor
  - Stacking Regressor
  - Model Blending
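The feature engineering, data split, and model evaluation steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual pipeline: the generated values, the `REFERENCE_YEAR` constant, the chosen feature subset, and the model hyperparameters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the real data (all values are invented).
df = pd.DataFrame({
    "item_mrp": rng.uniform(30, 270, n),
    "item_visibility": rng.uniform(0.0, 0.3, n),
    "outlet_establishment_year": rng.choice([1985, 1999, 2004, 2009], n),
    "outlet_type": rng.choice(["Grocery Store", "Supermarket"], n),
})
df["item_outlet_sales"] = 15 * df["item_mrp"] + rng.normal(0, 100, n)

# Feature engineering: outlet age relative to an assumed reference year.
REFERENCE_YEAR = 2013  # assumption, not stated in the project description
df["outlet_age"] = REFERENCE_YEAR - df["outlet_establishment_year"]

# Label encoding for a categorical variable.
df["outlet_type"] = LabelEncoder().fit_transform(df["outlet_type"])

# Data split: hold out 20% of the rows for evaluation.
features = ["item_mrp", "item_visibility", "outlet_age", "outlet_type"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["item_outlet_sales"], test_size=0.2, random_state=0
)

# Two base models plus a voting ensemble that averages their predictions.
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "voting": VotingRegressor([
        ("linear", LinearRegression()),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]),
}

# Fit each model and report RMSE on the held-out set.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, preds)))
    print(f"{name}: RMSE = {rmse[name]:.1f}")
```

The other regressors in the list could be added to the `models` dictionary in the same way and compared on the same held-out split.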
- data: Contains project data.
- lib: Reference notebooks.
- ml_pipeline: Python files for functions.
- engine.py: Main execution script.
- requirements.txt: List of required packages.
- readme.md: Instructions for running the code.
- Create a Python virtual environment using the command 'python3 -m venv myenv'.
- Activate the environment by running 'myenv\Scripts\activate.bat' on Windows, or 'source myenv/bin/activate' on macOS or Linux.
- Install the requirements using the command 'pip install -r requirements.txt'.
- Run engine.py with the command 'python3 engine.py'.