
EE769: Introduction to Machine Learning | Methods from Scratch

Coding ML models, sampling methods, and feature selection algorithms from scratch

Overview of the project

Linear and Logistic Regression

  1. Analytical solution of Linear Regression using the pseudo-inverse (Moore-Penrose inverse; PRML by Bishop); see the sketch after this list
  2. Time taken to compute the analytical solution as the sample size increases
  3. Linear Regression optimized with gradient descent
  4. Impact of increasing sample size and the L2-regularization parameter on test error
  5. Impact of increasing the L1-regularization parameter on test error and model weights
  6. Effect of the elastic net on model weights when features are correlated
  7. Linear Classification using logistic regression
  8. Impact of increasing sample size and the L2-regularization parameter on test error
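
A minimal sketch of the two linear-regression solvers described above (function and variable names are illustrative, not the repository's actual API):

```python
import numpy as np

def fit_linear_analytical(X, y, l2=0.0):
    """Least-squares / ridge weights via the Moore-Penrose pseudo-inverse."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])        # prepend bias column
    if l2 == 0.0:
        return np.linalg.pinv(X) @ y                     # w = X^+ y
    d = X.shape[1]
    # ridge: solve (X^T X + lambda I) w = X^T y (bias also regularized in this sketch)
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ y)

def fit_linear_gd(X, y, lr=0.01, l2=0.0, n_iter=1000):
    """Gradient descent on the L2-regularized mean squared-error loss."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = (X.T @ (X @ w - y)) / n + l2 * w          # dL/dw
        w -= lr * grad
    return w
```

With l2=0 the pseudo-inverse route recovers the ordinary least-squares solution, which the gradient-descent solver approaches as the number of iterations grows.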

Sampling Methods

  1. Sampling from a known distribution
  2. Rejection Sampling
  3. Importance Sampling
  4. Markov Chain Monte Carlo (MCMC) sampling; see the sketch after this list
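
A minimal sketch of the MCMC item above, using a random-walk Metropolis sampler (the function name and parameters are illustrative, not the repository's actual API):

```python
import numpy as np

def metropolis_hastings(log_p, x0, n_samples=10_000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for an unnormalized log-density log_p."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_new = x + rng.normal(scale=step)               # symmetric Gaussian proposal
        # accept with probability min(1, p(x_new) / p(x))
        if np.log(rng.random()) < log_p(x_new) - log_p(x):
            x = x_new
        samples[i] = x                                   # repeat current state on rejection
    return samples

# Example: sample from a standard normal (log-density up to an additive constant)
draws = metropolis_hastings(lambda x: -0.5 * x**2, x0=0.0)
```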

Python packages and standard-library modules used

  • numpy
  • math
  • time
  • seaborn
  • matplotlib
  • tqdm

Data

Data for regression and classification is generated synthetically, with user-defined noise variance.
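
A minimal sketch of such a data generator (the function names and the noise_var parameter are illustrative assumptions, not the repository's actual API):

```python
import numpy as np

def make_regression_data(n=200, d=5, noise_var=1.0, seed=0):
    """Linear-model data y = Xw + eps, with user-defined noise variance."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = X @ w_true + rng.normal(scale=np.sqrt(noise_var), size=n)
    return X, y, w_true

def make_classification_data(n=200, d=5, noise_var=1.0, seed=0):
    """Binary labels obtained by thresholding a noisy linear score."""
    X, score, _ = make_regression_data(n, d, noise_var, seed)
    y = (score > 0).astype(int)
    return X, y
```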

Future work

Code the Adam optimizer from scratch, taking inspiration from https://towardsdatascience.com/the-math-behind-adam-optimizer-c41407efe59b
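
A minimal sketch of the standard Adam update rule (not yet implemented in this repository; names and defaults follow the usual formulation):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based iteration counter used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad            # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2         # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1**t)                    # bias-corrected estimates
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # adaptive parameter step
    return w, m, v
```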

Sources of inspiration

https://github.com/heena-sharma-sys/Machine-Learning/blob/main/Blog/LogisticRegressionFromScratch.ipynb

https://github.com/MadhumithaKannan/linear-regression-using-only-numpy

https://www.geeksforgeeks.org/how-to-split-data-into-training-and-testing-in-python-without-sklearn/

https://medium.com/@Suraj_Yadav/compute-performance-metrics-from-scratch-53025140fe1d

https://inria.github.io/scikit-learn-mooc/overfit/learning_validation_curves_slides.html

https://jaketae.github.io/study/MCMC/
