Benjamin Radford
Duke University
Documentation for this code is found at: http://www.benradford.com/simple-machine-learning-and-examples. This includes an explanation of the k-NN algorithm and descriptions of the examples included in the demos.R file.
- kNN.R: An implementation of the k-Nearest Neighbors algorithm in R.
- demos.R: Code for three examples of classification using the kNN.R file.
kNN(train, test, trainlabels, k=1, trainsample=NULL, l=2)
- train: A set of training data in the form of a matrix or dataframe.
- test: A set of test data to be classified. Number of columns in test must match the number of columns in train.
- trainlabels: A set of categories or labels for the observations in train. Number of elements of trainlabels must match the number of rows (observations) in train.
- k: Number of closest neighbors on which to base inference. Default is 1.
- trainsample: Number of samples to take from the training data. Default (NULL) is to include full training set. If a number n is specified, a random sample of size n will be selected from train.
- l: The l-norm to be applied in the distance metric. Default is 2 (standard Euclidean distance). l=3 commonly improves predictive accuracy.
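
To make the `l` argument concrete, here is a minimal sketch of an l-norm (Minkowski) distance between two points. The helper name `lnorm_dist` is hypothetical and is not taken from kNN.R; it only illustrates how `l=1` and `l=2` give the Manhattan and Euclidean distances.

```r
# Hypothetical helper, not part of kNN.R: l-norm (Minkowski) distance
# between two numeric vectors of equal length.
lnorm_dist <- function(a, b, l = 2) {
  sum(abs(a - b)^l)^(1 / l)
}

a <- c(0, 0)
b <- c(3, 4)
lnorm_dist(a, b, l = 1)  # 7 (Manhattan distance)
lnorm_dist(a, b, l = 2)  # 5 (Euclidean distance)
```

Larger values of `l` give more weight to the coordinate with the largest difference.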
kNN(trainingdata, testdata, traininglabels, k=1, trainsample=NULL, l=2)
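
The sketch below shows one way the arguments above could fit together. `knn_sketch` is a hypothetical stand-in written for illustration, not the code in kNN.R. It assumes that sampling is done without replacement, that ties in the vote go to the label that sorts first, and that the result is a character vector of predicted labels; the real function may differ on all three points.

```r
# Minimal k-NN sketch using the same argument names as kNN().
# knn_sketch is hypothetical and is not the implementation in kNN.R.
knn_sketch <- function(train, test, trainlabels, k = 1, trainsample = NULL, l = 2) {
  train <- as.matrix(train)
  test <- as.matrix(test)
  stopifnot(ncol(train) == ncol(test), length(trainlabels) == nrow(train))

  # Optionally subsample the training set (assumed: without replacement).
  if (!is.null(trainsample)) {
    keep <- sample(nrow(train), trainsample)
    train <- train[keep, , drop = FALSE]
    trainlabels <- trainlabels[keep]
  }

  apply(test, 1, function(x) {
    # l-norm distance from this test row to every training row.
    d <- rowSums(abs(sweep(train, 2, x))^l)^(1 / l)
    nearest <- order(d)[seq_len(min(k, length(d)))]
    # Majority vote among the k nearest labels; ties go to the first label.
    votes <- table(as.character(trainlabels[nearest]))
    names(votes)[which.max(votes)]
  })
}

# Toy data: two well-separated clusters labelled "a" and "b".
trainingdata <- rbind(c(0, 0), c(0, 1), c(5, 5), c(6, 5))
traininglabels <- c("a", "a", "b", "b")
testdata <- rbind(c(0.2, 0.4), c(5.5, 5.1))

knn_sketch(trainingdata, testdata, traininglabels, k = 1, trainsample = NULL, l = 2)
# "a" "b"
```

To try the real function instead, source kNN.R and replace `knn_sketch` with `kNN` in the last call.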