What does it do?
kNN, or k-Nearest Neighbors, is a supervised classification algorithm. However, it differs from the classifiers previously described because it’s a lazy learner.
What’s a lazy learner? A lazy learner doesn’t do much during training other than store the training data. Only when new unlabeled data arrives does it do the work of classification.
On the other hand, an eager learner builds a classification model during training. When new unlabeled data is input, this type of learner feeds the data into the classification model.
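The contrast can be sketched in a few lines of Python. This is only an illustration: the class names `LazyLearner` and `EagerLearner` are hypothetical, and the eager example uses a nearest-centroid model purely as a stand-in for "a model built during training". It is not one of the classifiers this article describes.

```python
import math
from collections import defaultdict


class LazyLearner:
    """Hypothetical lazy learner: fit() only stores the training data."""

    def fit(self, points, labels):
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # All the real work happens at prediction time:
        # scan every stored point and return the label of the closest one.
        best = min(range(len(self.points)),
                   key=lambda i: math.dist(self.points[i], query))
        return self.labels[best]


class EagerLearner:
    """Hypothetical eager learner: fit() builds a compact model up front."""

    def fit(self, points, labels):
        groups = defaultdict(list)
        for point, label in zip(points, labels):
            groups[label].append(point)
        # The "model" here is one centroid (mean point) per class;
        # the raw training data is not kept afterwards.
        self.centroids = {
            label: tuple(sum(coords) / len(pts) for coords in zip(*pts))
            for label, pts in groups.items()
        }
        return self

    def predict(self, query):
        # Prediction only consults the prebuilt model.
        return min(self.centroids,
                   key=lambda label: math.dist(self.centroids[label], query))


points = [(0, 0), (1, 0), (5, 5), (6, 5)]
labels = ["a", "a", "b", "b"]
lazy = LazyLearner().fit(points, labels)
eager = EagerLearner().fit(points, labels)
```

Both objects answer `predict` calls, but the lazy one pays its cost per query, while the eager one pays it once in `fit`.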
How does it work?
When new unlabeled data comes in, kNN operates in two basic steps:
- First, it finds the k labeled training data points closest to the new data point. These are the k nearest neighbors.
- Second, it uses the neighbors’ classes to decide how the new data point should be classified.
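As a rough sketch, the two steps might look like this in Python. The function names, the choice of Euclidean distance, and the toy data are assumptions made for illustration, and the second step here simply picks the most common class among the neighbors.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Step 1: return the k training examples closest to the query.

    `train` is a list of (point, label) pairs; distance is Euclidean
    (an assumption, other distance measures are possible).
    """
    return sorted(train, key=lambda example: math.dist(example[0], query))[:k]


def knn_classify(train, query, k=3):
    """Step 2: use the neighbors' classes to label the query."""
    neighbors = k_nearest(train, query, k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy training data (made up for this example).
train = [
    ((1.0, 1.0), "red"),
    ((1.5, 2.0), "red"),
    ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"),
    ((7.0, 6.5), "blue"),
]

print(knn_classify(train, (1.8, 1.5), k=3))  # prints "red"
```

Note that there is no training function at all: the "model" is just the `train` list, which matches the lazy-learner description above.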
How does kNN classify new data when neighbors disagree?
- Option 1: Take a simple majority vote among the neighbors.
- Option 2: Take a similar vote, but give more weight to neighbors that are closer.
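The two options can give different answers on the same neighbors. The sketch below assumes each neighbor is already available as a (distance, label) pair and, for Option 2, assumes a weight of 1/distance. That is one common weighting, but the text above does not specify which one to use. The function names and numbers are illustrative.

```python
from collections import Counter, defaultdict


def majority_vote(neighbors):
    """Option 1: every neighbor gets one vote, regardless of distance."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def weighted_vote(neighbors, eps=1e-9):
    """Option 2: closer neighbors count more (assumed weight: 1 / distance)."""
    scores = defaultdict(float)
    for distance, label in neighbors:
        # eps avoids division by zero when a neighbor sits exactly on the query.
        scores[label] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)


# Three neighbors as (distance, label): one very close "blue", two farther "red".
neighbors = [(0.5, "blue"), (2.0, "red"), (2.5, "red")]

print(majority_vote(neighbors))  # prints "red"  (2 votes to 1)
print(weighted_vote(neighbors))  # prints "blue" (about 2.0 vs. 0.9)
```

Here the single nearby "blue" neighbor outweighs the two more distant "red" ones under weighted voting, while the simple majority goes the other way.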