GitHub - hakngrow/knn: K-Nearest Neighbors (kNN)

K-nearest neighbors (kNN)

K-nearest neighbors (kNN) is a supervised machine learning algorithm that can be used to solve both classification and regression tasks. The algorithm works on the principle that data points that are close together, are similar. The hyper-parameter, k, determines the number of 'nearest' data points to consider when predicting the outcome. If k=5, the predicted class of a data point is derived via majority voting using the classes of the 5 nearest data points. In regression, the mean value of the 5 nearest points is used.

Data points are determined to be close by meausring the distance between them. The Euclidean distance is most commonly used as the distance metric to measure 'closeness'.

Each feature of the dataset represents a dimension. A simple dataset with 2 features is represented by a 2-dimensional space with (x, y) co-ordinates. This notebook will explain the K-nearest neighbors (kNN) algorithm using a synthetic dataset with 2 features. In real life problems, we typically have many samples with many features.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.ipynb_checkpoints		.ipynb_checkpoints
README.md		README.md
kNearestNeighbors.ipynb		kNearestNeighbors.ipynb
test_knn1.png		test_knn1.png
test_knn5.png		test_knn5.png
test_original.png		test_original.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.ipynb_checkpoints

.ipynb_checkpoints

README.md

README.md

kNearestNeighbors.ipynb

kNearestNeighbors.ipynb

test_knn1.png

test_knn1.png

test_knn5.png

test_knn5.png

test_original.png

test_original.png

Repository files navigation

K-nearest neighbors (kNN)

About

Releases

Packages

Languages

hakngrow/knn

Folders and files

Latest commit

History

Repository files navigation

K-nearest neighbors (kNN)

About

Resources

Stars

Watchers

Forks

Languages