SeventhPrize/INDE_577_Data_Science_and_Machine_Learning

Machine Learning & Why it Matters

Machine learning develops algorithms and statistical models that enable computer systems to improve their performance on a specific task over time. By learning from data and experience, these systems can make predictions or decisions on new, previously unseen data.

Machine learning has become increasingly important in our day-to-day lives, driving a wide range of applications such as image and speech recognition, natural language processing, recommendation systems, and predictive analytics. For example, machine learning algorithms power virtual assistants like Siri and Alexa, personalized movie recommendations on Netflix, and fraud detection systems in financial institutions.

The impact of machine learning is significant and far-reaching, transforming the way we work, live, and communicate. It has the potential to improve healthcare outcomes, optimize supply chains, automate mundane tasks, and enhance public safety, among other benefits. As such, it is critical for individuals and organizations to understand the basics of machine learning and its implications for society.

This repository is a semester-long project for the graduate-level course INDE 577: Data Science and Machine Learning at Rice University, taught by Dr. Randy Davila. We demonstrate from-scratch implementations of several machine learning models.

Contents

All models in this repository feature from-scratch Python implementations. The two exceptions are EnsembleMethods.ipynb and StrokePredictionML.ipynb, which use packaged implementations from scikit-learn and TensorFlow. A brief sketch of the from-scratch style appears after the contents list below.

  1. Dataset introduction
  2. Supervised learning
    1. Parametric models
      1. Gradient descent
      2. Single-neuron model
      3. Perceptron
      4. Linear regression
      5. Logistic regression
      6. Dense neural network
    2. Nonparametric models
      1. Decision trees
      2. Ensemble methods
    3. Stroke prediction
  3. Unsupervised learning
    1. k-nearest neighbors
    2. Clustering
    3. Principal component analysis
  4. Reinforcement learning
    1. k-armed bandit
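
To give a flavor of the from-scratch style used throughout the notebooks, below is a minimal sketch of batch gradient descent for least-squares linear regression. It is an illustration only; the function name, hyperparameters, and example data are ours and are not taken from the repository's code.

```python
# Minimal from-scratch sketch (illustrative, not the repository's exact code):
# batch gradient descent for least-squares linear regression, y ~ X @ w + b.
import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    """Fit linear-regression weights and bias by batch gradient descent."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        y_pred = X @ w + b
        error = y_pred - y
        # Gradients of the mean-squared-error loss with respect to w and b.
        grad_w = (2 / n_samples) * (X.T @ error)
        grad_b = (2 / n_samples) * error.sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Example: recover a known linear relationship from noisy synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -1.5]) + 0.5 + rng.normal(scale=0.1, size=200)
w, b = gradient_descent(X, y, lr=0.05, epochs=2000)
print(w, b)  # approximately [3.0, -1.5] and 0.5
```

The notebooks listed above follow the same pattern: implement the update rule directly with NumPy, then verify the result on a dataset where the expected answer is known.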