Nearest neighbor

Definition

The nearest neighbor is a term used in data mining and statistics, particularly in the field of machine learning. It refers to a type of instance-based learning or non-generalizing learning where the function is approximated locally and all computation is deferred until function evaluation.

Nearest Neighbor Rule

The nearest neighbor rule is a simple strategy for classifying an object based on its distance to the nearest neighbor. The object is assigned to the class of its nearest neighbor in the feature space. This rule is based on the assumption that the class of an object can be accurately predicted by the class of its neighbors. The nearest neighbor rule is often used in pattern recognition and machine learning algorithms.

K-Nearest Neighbors Algorithm

The k-nearest neighbors algorithm (k-NN) is a type of instance-based learning where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is amongst the simplest of all machine learning algorithms.

Algorithm

The k-NN algorithm is a type of lazy learning where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm does not make any assumptions on the underlying data distribution, but it relies on feature similarity. The key elements of the k-NN algorithm include:

1. Set of labeled objects, e.g., a set of stored records. 2. Distance or similarity measure to compute distance between objects. 3. The value of k, the number of nearest neighbors to retrieve. 4. A classification decision rule to determine the output value (class) based on the values of the nearest neighbors.

Applications

The k-NN algorithm has been widely used in statistical estimation and pattern recognition in the beginning of the 1970s. It is also used in computer vision, information retrieval, and genomics.