Top 10 Data mining algorithm – kNN

What does it do? 

kNN, or k-Nearest Neighbors, is a supervised classification algorithm. However, it differs from the classifiers previously described because it’s a lazy learner.

What’s a lazy learner? A lazy learner doesn’t do much during the training process other than store the training data. Only when new unlabeled data is input does this type of learner look to classify.

On the other hand, an eager learner builds a classification model during training. When new unlabeled data is input, this type of learner feeds the data into the classification model.

How does it work?

When new unlabeled data comes in, kNN operates in 2 basic steps:

  1. First, it looks at the k closest labeled training data points — in other words, the k-nearest neighbors.
  2. Second, using the neighbors’ classes, kNN gets a better idea of how the new data should be classified.

How does kNN new data when neighbors disagree?

  1. Option 1: Take a simple majority vote from the neighbors
  2. Option 2: Take a similar vote except give a heavier weight to those neighbors that are closer

Top 10 data mining algorithms in plain english

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s