ClassificationAlgorithms

What is a classification algorithm?

A classification algorithm takes in data, either raw or cleaned into features and often incomplete, and predicts which category it belongs to. Its task is to sort data into discrete classes: this or that? Pear or apple? For example, an algorithm could take in features like color, size and weight and output one of pear, apple or blueberry.
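
To make the fruit example concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. The measurements, the choice of two features (size and weight) and the helper names `fit_centroids` and `predict` are all made up for illustration; this is one simple way to classify, not the only one.

```python
from math import dist

# Hypothetical training data: (size in cm, weight in g) -> fruit label.
FRUITS = [
    ((7.5, 180.0), "apple"),
    ((8.0, 200.0), "apple"),
    ((9.5, 170.0), "pear"),
    ((10.0, 190.0), "pear"),
    ((1.2, 1.5), "blueberry"),
    ((1.5, 2.0), "blueberry"),
]

def fit_centroids(samples):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = fit_centroids(FRUITS)
print(predict(centroids, (1.3, 1.8)))    # a small, light fruit
print(predict(centroids, (7.8, 190.0)))  # a mid-sized, heavier fruit
```

The output is always one of the known labels, never a number in between, which is what makes this classification.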

Difference between regression and classification

Regression predicts a continuous value, whereas classification predicts a discrete label (a boolean yes/no in the simplest case, or one of several categories). Let's say you are given a bunch of data, like the altitude of an airplane, the weather conditions and so on, and you are asked to guess its speed. With classification you could put the speed into categories like slow, kinda fast and crazy fast, but in this case a number like 300 km/h, predicted by a regression algorithm, would be more helpful.
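
The difference can be sketched in a few lines of Python. The data points, the category thresholds and the function names `fit_line` and `speed_category` are assumptions for illustration only: the regression part returns a number, and the classification part collapses that number into a label.

```python
# Hypothetical observations: (altitude in km, speed in km/h).
FLIGHTS = [(2.0, 300.0), (4.0, 400.0), (6.0, 500.0), (8.0, 600.0), (10.0, 700.0)]

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def speed_category(speed_kmh):
    """Classification view: collapse the number into a discrete label."""
    if speed_kmh < 400:
        return "slow"
    if speed_kmh < 600:
        return "kinda fast"
    return "crazy fast"

slope, intercept = fit_line(FLIGHTS)
predicted_speed = slope * 5.0 + intercept  # regression output: a number
print(predicted_speed, speed_category(predicted_speed))
```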

Types of Classification Algorithms

Logistic Regression
Naive Bayes
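K-Nearest Neighbors (KNN)
Markov Decision Process (MDP) (strictly a framework for reinforcement learning rather than a classifier)
Decision Trees / Random Forest
K-Means Clustering (strictly an unsupervised clustering method rather than a classifier)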
Support Vector Machines
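
As one example from this list, K-Nearest Neighbors is short enough to sketch in plain Python. The points, the labels "A"/"B" and the function name `knn_predict` are hypothetical; a real project would more likely use a library implementation.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D points with two classes.
LABELED_POINTS = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_predict(LABELED_POINTS, (1.8, 1.2)))
print(knn_predict(LABELED_POINTS, (6.2, 6.8)))
```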

How do we construct an algorithm specific to our needs?

1. Type of Learning Involved:

A. Supervised Learning - Supervision :

The model learns from training examples provided by humans. In the fruit classification algorithm, the model is trained on images of fruits, and a human looks at each picture and tells the algorithm that this is a pear/apple/blueberry. Mostly used for discrimination (recognition) tasks. The labeled data is usually divided into training data, used to fit the model, and test data, used to gauge its performance, commonly in an 80%/20% ratio. This resembles a mathematics textbook, which has both solved problems to learn from and unsolved problems to check your understanding.
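
The 80%/20% split can be sketched like this. The file names, the fixed seed and the function name `train_test_split` are assumptions for illustration (the name mirrors a common library helper, but this is a standalone toy version).

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split off a held-out test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled examples: (image file name, label given by a human).
LABELS = ["pear", "apple", "blueberry"]
labeled = [(f"fruit_{i}.png", LABELS[i % 3]) for i in range(10)]

train, test = train_test_split(labeled)
print(len(train), len(test))
```

Shuffling before splitting matters when the data is stored in some order (for example, all pears first), otherwise the test set might contain only one class.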

B. Unsupervised Learning :

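The model learns from unlabeled data, without any human input other than how the algorithm is designed and the type of data it is exposed to. Mostly used for finding structure in data (clustering, dimensionality reduction) and for generative (imagination) tasks. Since there are no labels to score against, there is typically no labeled train/test split, although held-out data can still be used to evaluate the model.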
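
K-means clustering is a typical unsupervised method: it only sees points, never labels. The sketch below is a bare-bones version with a naive initialization (the first k points); the data and the function name `kmeans` are made up for illustration.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to the nearest center, then recompute centers."""
    centers = list(points[:k])  # naive initialization: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, clusters

# Hypothetical unlabeled 2-D points forming two loose groups.
POINTS = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]

centers, clusters = kmeans(POINTS, k=2)
print(centers)
print(clusters)
```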

C. Reinforcement Learning :

Here, the model learns the best strategy in a scenario set up by humans. We give the model a lot of information beyond the actual data: we also tell it some of the possible outcomes and some of the rules of the road. We are not labeling choices as right or wrong, but there is still some human input, typically in the form of a reward signal. The scenario is: what action can you, the algorithm, take as an actor in this scenario, and how is the environment going to respond to that action? Mostly used for decision-making tasks (e.g. robotics). Reinforcement learning is sometimes loosely compared to semi-supervised learning, because the model gets some human guidance but also learns on its own. It is usually treated as a separate paradigm, though: the guidance comes from rewards rather than from a partially labeled dataset.
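
The actor/environment loop can be illustrated with tabular Q-learning on a toy problem. Everything here is an assumption for the sake of the example: a 5-cell corridor, a reward of 1.0 for reaching the last cell, a purely random exploration policy, and the names `step` and `q_learning`.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # step left or step right

def step(state, action):
    """Environment: move within the corridor; reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning, exploring with uniformly random actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))
            next_state, reward = step(state, ACTIONS[a])
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = q_learning()
# Greedy policy for every non-goal cell: pick the action with the higher value.
policy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(policy)
```

Nobody tells the model that "right" is the correct move. It only sees rewards, and the preference for moving right emerges from the learned values.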

2. Number of Features :

How many variables do you have? (age, height, …)

More features: consider a model that copes well with high-dimensional data, such as a Support Vector Machine.

Fewer features: a simpler model is often enough - consider a Decision Tree.

Point to note: can you reduce the number of features to simplify your model? (Feature Selection)

Some features have more predictive value than others. For example, if you want to predict tomorrow’s weather, today’s weather is an important feature to take in, and what you had for cereal today is not :)
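
One simple (and far from the only) way to compare predictive value is to rank features by the absolute Pearson correlation with the target. The numbers, the feature names `today_temp` and `cereal_bowls`, and the helper `pearson` are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: tomorrow's temperature and two candidate features.
tomorrow_temp = [15.0, 18.0, 21.0, 19.0, 24.0, 27.0]
features = {
    "today_temp": [14.0, 17.0, 22.0, 18.0, 25.0, 26.0],
    "cereal_bowls": [2.0, 1.0, 2.0, 1.0, 1.0, 2.0],
}

# Strongest absolute correlation first.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], tomorrow_temp)),
    reverse=True,
)
print(ranking)
```

Correlation only captures linear relationships, so a low score does not prove a feature is useless. It is a quick first filter.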

3. Linearity :

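Is your data linear, i.e. can the classes be separated by a straight line (or a hyperplane in more dimensions)?

Linear: Consider a (linear) Support Vector Machine.

Non-linear: Consider a Decision Tree.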
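
A quick way to see what "linear" means here is a perceptron, a basic linear classifier. In the sketch below (function names and training settings are assumptions), it learns the AND function, which a straight line can separate, but it cannot fully learn XOR, which no straight line can separate.

```python
def train_perceptron(samples, epochs=50, lr=1.0):
    """Classic perceptron rule; converges only if the classes are linearly separable."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = label - predicted
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def accuracy(samples, w, b):
    """Fraction of samples on the correct side of the learned line."""
    hits = sum((1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == label
               for (x1, x2), label in samples)
    return hits / len(samples)

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # linearly separable
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # not linearly separable

and_acc = accuracy(AND_DATA, *train_perceptron(AND_DATA))
xor_acc = accuracy(XOR_DATA, *train_perceptron(XOR_DATA))
print(and_acc, xor_acc)
```

When a linear model gets stuck like this, a non-linear model such as a decision tree is worth trying.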

4. Training Time :

How much data do you have, and how fast is your computer?

Faster Training: Consider a logistic regression.

Slower Training: Consider a random forest.

More training usually means more accuracy, up to the point where the model starts to overfit.

5. Number of Parameters (parameters = model specs: number of iterations, error, etc.) :

How much flexibility do you want in your training?

Fewer Parameters: Consider Logistic Regression (4)

More Parameters: Consider K-means Clustering (8)

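Split the data into train/validation/test sets to explore the parameter space: fit on the training set, compare parameter settings on the validation set, and report final performance on the test set.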
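
A sketch of that workflow, using the number of neighbors k in K-Nearest Neighbors as the parameter being tuned. The 60/20/20 split, the synthetic two-blob data, the candidate values of k and all function names are assumptions for illustration.

```python
import random
from collections import Counter
from math import dist

def three_way_split(samples, seed=0):
    """Shuffle, then split 60% / 20% / 20% into train, validation and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    a, b = len(shuffled) * 6 // 10, len(shuffled) * 8 // 10
    return shuffled[:a], shuffled[a:b], shuffled[b:]

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, evaluation, k):
    """Share of evaluation samples that KNN (fit on `train`) labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return hits / len(evaluation)

# Hypothetical synthetic data: two noisy 2-D blobs labeled 0 and 1.
rng = random.Random(42)
data = [((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), label)
        for label, c in ((0, 0.0), (1, 3.0)) for _ in range(50)]

train, validation, test = three_way_split(data)
candidates = (1, 3, 5, 7)
# Choose k on the validation set only; the test set is touched once, at the end.
best_k = max(candidates, key=lambda k: accuracy(train, validation, k))
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Keeping the test set out of the parameter search is the point of the three-way split: otherwise the reported accuracy would be tuned to the very data it is measured on.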

sakshiagarwal99/ClassificationAlgorithms
