ml_basics

Data preprocessing:

Preprocessing is an essential step that gets raw data ready to be fed into ML algorithms.

Importing the libraries
Importing the data
Handling missing data: either remove the affected rows or replace the missing values with an aggregate of the column (mean, median, mode, etc.)
sklearn (scikit-learn) is a widely used ML library for Python; it provides methods for all the necessary preprocessing steps
Handling categorical data: encode text data as numbers, since ML models work with mathematical equations
This can be done in many ways, depending on the data set and the ML model: replacing text values with numbers directly, binary encoding, using LabelEncoder, and many more.
OneHotEncoder replaces a categorical column with one new binary column per category
Splitting the data set into training and test sets, since you need held-out data to test how accurately the ML model has learned
Feature scaling: some ML models require it, because otherwise features with a larger range of values can dominate the result, for example when computing the Euclidean distance between samples
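
The steps above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration, not the code from this repo: the toy data set, its column names (`country`, `age`, `salary`, `purchased`), the choice of mean imputation, and the 75/25 split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical toy data set: one categorical feature, two numeric
# features with a missing value each, and a text label.
df = pd.DataFrame({
    "country": ["France", "Spain", "Germany", "Spain",
                "Germany", "France", "Spain", "France"],
    "age": [44, 27, 30, 38, np.nan, 35, 52, 48],
    "salary": [72000, 48000, 54000, 61000, 58000, np.nan, 83000, 79000],
    "purchased": ["no", "yes", "no", "no", "yes", "yes", "no", "yes"],
})
X = df[["country", "age", "salary"]]

# Encode the text label as integers ("no" -> 0, "yes" -> 1).
y = LabelEncoder().fit_transform(df["purchased"])

# Split first, so the imputer, scaler and encoder learn only from training rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

numeric = ["age", "salary"]

# Missing data: replace NaN with the column mean computed on the training set.
imputer = SimpleImputer(strategy="mean")
X_train_num = imputer.fit_transform(X_train[numeric])
X_test_num = imputer.transform(X_test[numeric])

# Feature scaling: bring age and salary to zero mean and unit variance,
# so that salary does not dominate distance computations.
scaler = StandardScaler()
X_train_num = scaler.fit_transform(X_train_num)
X_test_num = scaler.transform(X_test_num)

# Categorical data: one new binary column per category seen in training.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train_cat = encoder.fit_transform(X_train[["country"]]).toarray()
X_test_cat = encoder.transform(X_test[["country"]]).toarray()

# Final feature matrices: one-hot columns followed by scaled numeric columns.
X_train_ready = np.hstack([X_train_cat, X_train_num])
X_test_ready = np.hstack([X_test_cat, X_test_num])

print(X_train_ready.shape, X_test_ready.shape)
```

Fitting the imputer, scaler and encoder on the training rows only, and then reusing them on the test rows, is a design choice that keeps information from the test set out of the preprocessing.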
