The KNN (k-nearest neighbors) algorithm is a classification algorithm used in many applications such as image processing, statistical pattern recognition, and data mining.
As with any classification algorithm, KNN has a model part and a prediction part. Here the model is simply the input dataset. The predicted output is a class membership: an object is classified by a majority vote of its neighbors, being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small).
1. If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
2. If k = 3, and the class labels among those neighbors are Good = 2 and Bad = 1, then the predicted class label will be Good, which holds the majority vote.
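The majority-vote step above can be sketched with a few lines of Python; the neighbor labels here are illustrative values matching the k = 3 case described:

```python
from collections import Counter

# Class labels of the k = 3 nearest neighbors (illustrative values:
# two Good, one Bad, as in the example above)
neighbor_labels = ["Good", "Good", "Bad"]

# Majority vote: the most common label among the k neighbors wins
predicted = Counter(neighbor_labels).most_common(1)[0][0]
print(predicted)  # Good
```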
Let's see how to handle sample data with the KNN algorithm.
We have data from a questionnaire survey and objective testing, with two attributes, to classify whether a special paper tissue is good or not.
Here is the training sample.
Let this be the test sample.
1. Determine the parameter k, the number of nearest neighbors.
Say k = 3.
2. Calculate the distance between the query instance and all the training samples.
The coordinate of the query instance is (3, 7). Instead of the Euclidean distance we compute the squared distance, which is faster to calculate (no square root needed) and preserves the ordering of the distances.
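A minimal sketch of this step in Python. Since the original training table is not reproduced here, the sample coordinates below are assumed values for illustration:

```python
# Squared Euclidean distance: skipping the square root is cheaper
# and does not change which neighbors are nearest.
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

query = (3, 7)

# Assumed training coordinates (X1, X2) for illustration only;
# see the training table referenced above for the actual data.
training = [(7, 7), (7, 4), (3, 4), (1, 4)]

for point in training:
    print(point, squared_distance(query, point))
```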
3. Sort the distances and determine the nearest neighbors based on the k-th minimum distance.
4. Gather the category Y of the nearest neighbors.
-> In the second row, the category of the nearest neighbor (Y) in the last column is not included because the rank of this data is greater than 3 (= k).
5. Use the simple majority of the category of the nearest neighbors as the prediction value of the query instance.
We have 2 Good and 1 Bad; since 2 > 1, we conclude that a new paper tissue that passes the laboratory test with X1 = 3 and X2 = 7 belongs to the Good category.
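The five steps above can be combined into one small end-to-end sketch. The training data below is assumed for illustration (the original table is not reproduced here), chosen so that the query (3, 7) yields the 2 Good / 1 Bad vote described above:

```python
from collections import Counter

def knn_predict(training, query, k=3):
    """Predict the class of `query` by majority vote among the k
    training samples with the smallest squared distance to it."""
    ranked = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Assumed ((X1, X2), label) training pairs, illustrative only
training = [
    ((7, 7), "Bad"),
    ((7, 4), "Bad"),
    ((3, 4), "Good"),
    ((1, 4), "Good"),
]

print(knn_predict(training, (3, 7), k=3))  # Good
```

With these values the three nearest neighbors of (3, 7) have squared distances 9, 13, and 16, giving two Good votes and one Bad vote, so the prediction is Good.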