dhirajmahato/Machine_Learning_Module
Here, I will add Machine Learning models, their usage, and explanations.

Types of ML

Based on human supervision:

  1. Supervised
  2. Unsupervised
  3. Semi-supervised
  4. Reinforcement

Based on data ingestion (see the sketch below):

  1. Batch processing: the model is trained on the full dataset at once.
  2. Online processing: the model is updated incrementally as new data arrives.
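A minimal sketch of the difference, assuming scikit-learn (the dataset is synthetic and purely illustrative): batch learners see the whole dataset at once via `fit`, while online learners are updated incrementally via `partial_fit`.

```python
# Batch vs. online learning on a synthetic regression dataset.
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

# Batch processing: train on the full dataset in one go.
batch_model = LinearRegression().fit(X, y)

# Online processing: update the model one mini-batch at a time, as data arrives.
online_model = SGDRegressor(random_state=0)
for start in range(0, len(X), 100):
    chunk = slice(start, start + 100)
    online_model.partial_fit(X[chunk], y[chunk])
```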

Based on what is modelled:

  1. Discriminative: a discriminative model models the decision boundary between the classes, i.e. the conditional probability distribution $p(y|x)$.
    • Logistic regression, SVMs, CNNs, RNNs, nearest neighbours.
  2. Generative: a generative model explicitly models the actual distribution of each class, i.e. the joint probability distribution $p(x,y)$ (see the sketch after this list).
    • Uses Bayes' rule to calculate $p(y|x)$ from the joint distribution.
    • Naïve Bayes, Bayesian networks, Markov random fields, autoencoders, GANs.
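A minimal comparison sketch, assuming scikit-learn: logistic regression fits $p(y|x)$ directly, while Gaussian Naïve Bayes fits $p(x|y)$ and $p(y)$ and derives $p(y|x)$ via Bayes' rule.

```python
# Discriminative vs. generative classifiers trained on the same data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression  # discriminative: p(y|x)
from sklearn.naive_bayes import GaussianNB           # generative: p(x|y) and p(y)

X, y = make_classification(n_samples=500, n_features=4, random_state=0)

disc = LogisticRegression(max_iter=1000).fit(X, y)
gen = GaussianNB().fit(X, y)

# Both expose class probabilities, but the generative model obtains them
# from the joint distribution via Bayes' rule.
print(disc.predict_proba(X[:3]))
print(gen.predict_proba(X[:3]))
```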

Based on parameters:

  1. Parametric: a parametric model summarizes the data with a fixed-size set of parameters (independent of the number of training instances). E.g. Linear and Logistic Regression, linear SVM ($w^{T}x + b = 0$), Linear Discriminant Analysis, Perceptron, Naive Bayes, simple neural networks.
  2. Non-parametric: models that do not make strong assumptions about the form of the mapping function. "Non-parametric" does not mean the model has no parameters, but rather that the number of parameters is flexible and can grow with the training data. E.g. k-Nearest Neighbors, Decision Trees, kernel SVMs.

A parameter is something that is estimated from the training data and learnt while training a model; examples are weights, coefficients, and support vectors. A comparison sketch follows.
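A minimal sketch of the contrast, assuming scikit-learn (the data is synthetic and illustrative):

```python
# Parametric vs. non-parametric models: fixed-size parameters vs. stored data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Parametric: the fitted model is fully described by a fixed number of
# parameters, no matter how many training instances were used.
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)  # one coefficient plus one intercept

# Non-parametric: the model effectively keeps the training data around,
# so its complexity grows with the dataset.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
print(knn.predict([[0.5]]))  # prediction from the 5 nearest stored points
```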

Other techniques

  1. Ensemble Learning Algorithms: Bagging, Boosting, Stacking
  2. Deep Learning Algorithms: CNN, RNN
  3. Bayesian Learning Algorithms: Naive Bayes, Bayesian Networks
  4. Instance-Based Learning Algorithms: k-Nearest Neighbors (k-NN), Locally Weighted Regression (LWR)
  5. Clustering Algorithms: K-Means, Hierarchical Clustering
  6. Dimensionality Reduction Algorithms: PCA (linear), t-SNE (non-linear)

Concepts

Maximum Likelihood Estimation (MLE) is a method that determines the values of a model's parameters such that they maximise the likelihood of the observed data under an assumed probability distribution.
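As a worked example (standard textbook material, not specific to this repo), MLE picks the parameter that maximises the (log-)likelihood of the data:

$$
\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \prod_{i=1}^{n} p(x_i \mid \theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log p(x_i \mid \theta)
$$

For i.i.d. Gaussian samples with known variance $\sigma^{2}$, setting $\frac{\partial}{\partial\mu}\sum_{i}\log p(x_i \mid \mu) = \sum_{i}\frac{x_i-\mu}{\sigma^{2}} = 0$ gives $\hat{\mu} = \frac{1}{n}\sum_{i} x_i$, i.e. the sample mean.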

Supervised ML Table

Regression

Here, the model learns the relationship between input features (independent variables) and a continuous output variable in order to predict its value.

Model: Linear Regression

Concepts: there are four assumptions (see the sketch after this table for checking them):
  1. Linearity: the relationship between X and the mean of Y is linear: $Y=\beta_{0}+\beta_{1}X+\epsilon$, where $\epsilon$ is the error term. Detection: residual plots (against X) should show points evenly spread around zero.
  2. Homoscedasticity: the variance of the error terms is similar across the values of the independent variables. A plot of standardized residuals versus predicted values can show whether points are equally distributed across all values of the independent variables.
  3. Little to no multicollinearity: independent variables are not highly correlated with each other. This assumption is tested using Variance Inflation Factor (VIF) values. One way to deal with multicollinearity is centering (subtracting the mean).
  4. Normality: residuals should be normally distributed. This can be checked with a histogram of the residuals.

Notes:
  - Feature scaling is required.
  - Sensitive to missing values.
  - Good for sparse, high-dimensional data.

Usecases:
  1. Advanced House Price Prediction
  2. Flight Price Prediction

Model: Polynomial Regression

Concepts: the higher the degree of the polynomial, the better the fit, but the greater the risk of overfitting.
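A minimal sketch of checking two of these assumptions, assuming scikit-learn and statsmodels (the data is synthetic and illustrative):

```python
# Residual inspection (linearity/homoscedasticity) and VIF (multicollinearity).
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 + X @ np.array([1.0, -0.5, 0.3]) + rng.normal(scale=0.2, size=300)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# Residuals should be evenly spread around zero (in practice, plot them
# against the predictions; here we just summarize).
print("residual mean:", residuals.mean(), "residual std:", residuals.std())

# VIF near 1 means little correlation among predictors; values above ~5-10
# are a common warning sign for multicollinearity.
X_const = sm.add_constant(X)  # VIF is conventionally computed with an intercept
for i in range(1, X_const.shape[1]):
    print(f"VIF feature {i - 1}:", variance_inflation_factor(X_const, i))
```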

Evaluation Metrics: Regression models are typically evaluated using metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²), among others.
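A minimal sketch of computing these metrics, assuming scikit-learn (the predictions are made up for illustration):

```python
# MAE, MSE, RMSE and R² for a toy set of predictions.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.0, 6.5])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE
r2 = r2_score(y_true, y_pred)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```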

Classification

Model: Logistic Regression

Concepts: there are four assumptions:
  1. The outcome is binary.
  2. Linear relationship between the logit of the outcome and the predictors. Logit function: $\text{logit}(p) = \log\left(\frac{p}{1-p}\right)$, where $p$ is the probability of the outcome.
  3. No outliers/extreme values in the continuous predictors.
  4. No multicollinearity among the predictors.

Sigmoid/Logistic function: an S-shaped curve that maps any real number to a value between 0 and 1: $f(x)=\frac{1}{1+e^{-x}}$ (see the sketch after this table).

Notes:
  - Feature scaling is required.
  - Sensitive to missing values.

Model: Decision Tree

Model: Support Vector Machines
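A minimal sketch of the sigmoid and its inverse (the logit), assuming only NumPy:

```python
# The sigmoid squashes any real-valued score into (0, 1); the logit undoes it.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
p = sigmoid(z)
print(p)  # approx [0.018 0.269 0.5 0.731 0.982]

# logit(p) = log(p / (1 - p)) recovers the original scores.
print(np.log(p / (1 - p)))
```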

Evaluation Metrics: Classification models are evaluated using metrics such as accuracy, precision, recall, F1-score, and the confusion matrix, depending on the problem and class distribution.
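A minimal sketch of these metrics on made-up labels, assuming scikit-learn:

```python
# Accuracy, precision, recall, F1 and the confusion matrix for toy labels.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("f1       :", f1_score(y_true, y_pred))         # harmonic mean of P and R
print(confusion_matrix(y_true, y_pred))               # rows: true, cols: predicted
```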

Structuring ML

[image]

ML Algos

[image]

Sources:
  - https://github.com/nvmcr/DataScience_HandBook/tree/main/Machine_Learning
  - https://github.com/dhirajmahato/Machine-Learning-Notebooks
