This repository contains a collection of Jupyter Notebook files for various feature engineering techniques, including missing value handling, encoding, transformation, imbalanced dataset, and outlier detection. Each notebook provides practical examples of methods for handling the corresponding problem.

skprasad117/Feature-Engineering-Techniques

Feature Engineering Techniques

Welcome to the Feature Engineering Techniques repository! It contains a separate Jupyter Notebook for each of five feature engineering topics: missing value handling, encoding, transformation, imbalanced datasets, and outlier detection. Each notebook provides practical examples of methods for handling the corresponding problem.

Note: Please keep in mind that the methods demonstrated in the notebooks are not an exhaustive list of all possible feature engineering techniques. There are many other techniques that can be used depending on the dataset and the problem at hand. I will keep updating this GitHub repository with new methods over time, so stay tuned for updates!

Contents

  • 01-Missing-Value-Handling.ipynb: This notebook explores different methods for handling missing values, including mean imputation, median imputation, and multiple imputation.
  • 02-Encoding.ipynb: This notebook covers various encoding techniques such as one-hot encoding, label encoding, and target encoding.
  • 03-Transformation.ipynb: This notebook demonstrates feature transformation techniques, including scaling, normalization, and log transformation.
  • 04-Imbalanced-Dataset.ipynb: This notebook focuses on techniques to handle imbalanced datasets, such as random oversampling, random undersampling, and SMOTE.
  • 05-Outlier-Detection.ipynb: This notebook explores methods for detecting outliers, including z-score, IQR, and DBSCAN.
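
As a rough illustration of the kind of technique each notebook covers, here is a minimal sketch of one method per topic: median imputation, one-hot encoding, log transformation, random oversampling, and IQR-based outlier detection. It is not taken from the notebooks. It uses only the Python standard library so that it runs without extra packages, and the function names and sample data are hypothetical; the notebooks themselves may use other libraries and other approaches.

```python
import math
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def one_hot(labels):
    """One-hot encode labels; columns follow sorted category order."""
    categories = sorted(set(labels))
    rows = [[1 if lab == c else 0 for c in categories] for lab in labels]
    return categories, rows


def log_transform(values):
    """Apply log(1 + x), which compresses right-skewed non-negative data."""
    return [math.log1p(v) for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for row, lab in zip(rows, labels):
        by_class.setdefault(lab, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for lab, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([lab] * target)
    return out_rows, out_labels


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    print(impute_median([22, None, 35, 41, None, 29]))
    print(one_hot(["red", "blue", "red"]))
    print(log_transform([0, 9, 99]))
    print(random_oversample([[1], [2], [3], [4]], [0, 0, 0, 1]))
    print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

Each function takes and returns plain lists, so the steps can be tried one at a time. Random oversampling only duplicates existing minority rows; SMOTE, by contrast, synthesizes new ones and is not shown here.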

Installation

To run the notebooks in this repository, you need Jupyter Notebook installed on your machine. If it is not already installed, you can install it with the following command:

pip install jupyter

Then clone this repository using the following command:

git clone https://github.com/skprasad117/feature-engineering-techniques.git

Usage

To use the notebooks, change into the directory where you cloned this repository and start Jupyter Notebook (for example, by running jupyter notebook from a terminal). Then open the desired notebook and run its cells to see the results.

Contributing

To contribute to this repository, fork it and submit a pull request with your changes. Contributions of all kinds are welcome, including bug fixes, documentation improvements, and new feature engineering techniques.

License

This repository is licensed under the MIT License. See the LICENSE file for more information.

Credits

This repository was created by Sanjay Kumar Prasad with the guidance and support of Krish Naik as a mentor. If you have any questions or comments, please feel free to contact me.
