Predict Future Sales

Objective and Overview

In this Notebook, I will work with a challenging time-series dataset consisting of daily sales data, kindly provided by one of the largest Russian software firms - 1C Company.
I will predict total sales for every product and store in the next month using Datasets given.

Fastai Library or API

Fast.ai is the first deep learning library to provide a single consistent interface to all the most commonly used deep learning applications for vision, text, tabular data, time series, and collaborative filtering.
Fast.ai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches.

Preparing the Model

I have used Fastai API to train the Model. It seems quite challenging to understand the code if you have never encountered with Fast.ai API before. One important note for anyone who has never used Fastai API before is to go through Fastai Documentation. And if you are using Fastai in Jupyter Notebook then you can use doc(function_name) to get the documentation instantly.

Data

I had prepared the Data for this Project from Kaggle

XGBRegressor

XGBoost is an open-source software library which provides a gradient boosting framework for C++, Java, Python, R, Julia, Perl, and Scala. It works on Linux, Windows, and macOS. From the project description, it aims to provide a "Scalable, Portable and Distributed Gradient Boosting Library".

Processing with Fastai API

While working on Timeseries Data then, adding following function or API will enhance the accuracy of the Model a lot.

def add_datepart(df, fldnames, drop=True, time=False, errors="raise"):
    if isinstance(fldnames, str):
        fldnames = [fldnames]
    for fldname in fldnames:
        fld = df[fldname]
        fld_dtype = fld.dtype
        if isinstance(fld_dtype, pd.core.dtypes.dtypes.DatetimeTZDtype):
            fld_dtype = np.datetime64
            
        if not np.issubdtype(fld_dtype, np.datetime64):
            df[fldname] = fld = pd.to_datetime(fld, infer_datetime_format=True, errors=errors)
        targ_pre = re.sub("[Dd]ate$", '', fldname)
        attr = ['Year', 'Month', 'Week', 'Day', 'Dayofweek', 'Dayofyear',
                'Is_month_end', 'Is_month_start', 'Is_quarter_end', 'Is_quarter_start', 'Is_year_end', 'Is_year_start']
        if time: attr = attr + ['Hour', 'Minute', 'Second']
        for n in attr: df[targ_pre + n] = getattr(fld.dt, n.lower())
        df[targ_pre + 'Elasped'] = fld.astype(np.int64) // 10**9
        if drop: df.drop(fldname, axis=1, inplace=True)

Snapshot of using XGBRegressor

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Dataset		Dataset
LICENSE		LICENSE
Predicting Future Sales.ipynb		Predicting Future Sales.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dataset

Dataset

LICENSE

LICENSE

Predicting Future Sales.ipynb

Predicting Future Sales.ipynb

README.md

README.md

Repository files navigation

Predict Future Sales

About

Releases

Packages

Languages

License

ThinamXx/PredictFutureSales_XGBoost

Folders and files

Latest commit

History

Repository files navigation

About

Topics

Resources

License

Stars

Watchers

Forks

Languages