


chikeorah/ML


ML

Machine Learning Models Notebook

This repository contains a Jupyter notebook that implements and optimizes several machine learning models on a dataset.

Models Implemented

  1. Linear Regression Model: This model is used to predict a continuous outcome variable (also called the dependent variable) based on one or more predictor variables (also known as independent variables).

  2. Linear Regression Model Optimized using RFE (Recursive Feature Elimination): RFE is a feature selection method that repeatedly fits a model and removes the weakest feature (or features) until the specified number of features remains. The linear regression model is then fit on the selected features only.

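  3. Linear Regression Model Optimized using SVR (Support Vector Regression): SVR applies the principles of Support Vector Machines to a regression problem. Instead of maximizing the margin between classes, it fits a function that ignores errors smaller than a tolerance ε (the ε-insensitive margin) while keeping the model as flat as possible.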

  4. Random Forest: This is a versatile ensemble learning method capable of performing both regression and classification tasks. It trains many decision trees on bootstrap samples of the data and combines their predictions (averaging for regression, majority vote for classification) into a single, more robust model.

  5. Random Forest Optimized using Grid Search: This model uses Grid Search, which evaluates every combination in a predefined grid of hyperparameter values with cross-validation and keeps the best-scoring one, to tune a Random Forest model and improve its performance.

  6. k-Nearest Neighbors (k-NN): This is a simple and intuitive model that predicts the target of a new instance based on the targets of its 'k' closest instances in the feature space.

  7. k-NN Optimized using Grid Search: This model uses Grid Search to find the optimal hyperparameters of a k-NN model in order to improve its performance.

  8. Support Vector Machines (SVM): SVMs can model non-linear relationships using the kernel trick, and they work well in high-dimensional spaces.

  9. SVM Optimized using Grid Search: This model uses Grid Search to find the optimal hyperparameters of an SVM model in order to improve its performance.

  10. XGBoost: This is an implementation of gradient boosted decision trees designed for speed and performance.

  11. XGBoost Optimized using Grid Search: This model uses Grid Search to find the optimal hyperparameters of an XGBoost model in order to improve its performance.

  12. Neural Network Regression (NNR): This model uses a neural network for regression tasks. It can model complex, non-linear relationships.

  13. NNR Optimized using Grid Search: This model uses Grid Search to find the optimal hyperparameters of an NNR model in order to improve its performance.
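The two optimization techniques named in the list, RFE and Grid Search, can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the data is synthetic (`make_regression` stands in for the repository's dataset, which is not described here), and the hyperparameter grids, the number of selected features, and the R² metric are hypothetical choices. XGBoost and the neural network are left out because their libraries do not appear under Requirements.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a regression task. Synthetic data replaces the notebook's dataset.
X, y = make_regression(
    n_samples=300, n_features=10, n_informative=4, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model 1: plain linear regression as a baseline.
baseline = LinearRegression().fit(X_train, y_train)
baseline_r2 = r2_score(y_test, baseline.predict(X_test))

# Model 2: RFE drops the weakest feature each round until 4 remain
# (4 is an illustrative choice), then predicts with those features only.
rfe = RFE(estimator=LinearRegression(), n_features_to_select=4)
rfe.fit(X_train, y_train)
selected = rfe.support_  # boolean mask over the 10 input features
rfe_r2 = r2_score(y_test, rfe.predict(X_test))

# Models 5 and 7: grid search tries every combination in the grid with
# 3-fold cross-validation. The grids below are hypothetical examples.
searches = {
    "random_forest": GridSearchCV(
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [None, 5]},
        cv=3,
        scoring="r2",
    ),
    # k-NN is distance-based, so features are scaled inside a pipeline.
    "knn": GridSearchCV(
        make_pipeline(StandardScaler(), KNeighborsRegressor()),
        {
            "kneighborsregressor__n_neighbors": [3, 5, 7],
            "kneighborsregressor__weights": ["uniform", "distance"],
        },
        cv=3,
        scoring="r2",
    ),
}

results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "test_r2": r2_score(y_test, search.predict(X_test)),
    }

print(f"baseline R^2: {baseline_r2:.3f}")
print(f"RFE R^2: {rfe_r2:.3f} using {int(selected.sum())} features")
for name, info in results.items():
    print(f"{name}: {info['best_params']} -> R^2 {info['test_r2']:.3f}")
```

The same `GridSearchCV` pattern would apply to the SVM, XGBoost, and neural network entries: swap in the estimator and a grid over its own hyperparameters. Scoring on a held-out test split, as done here, keeps the reported score separate from the data used to pick hyperparameters.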

Getting Started

  1. Clone this repository.
  2. Install the necessary libraries mentioned in requirements.txt.
  3. Run the Jupyter notebook.

Requirements

  • Python 3.7+
  • Jupyter
  • scikit-learn
  • pandas
  • numpy
  • matplotlib
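Before opening the notebook, a short script can confirm that the environment meets the list above. The sketch below is only a convenience check under assumptions: the function name `missing_packages` is hypothetical, and the import names (`sklearn` for scikit-learn, `jupyter_core` for Jupyter) are assumed mappings from the package names listed here.

```python
import importlib.util
import sys

# Import names assumed for the packages listed under Requirements.
REQUIRED = ["jupyter_core", "sklearn", "pandas", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.7 or newer is required.")
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```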

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT