this description of linear scan kNN may be a bit tedious for the hacker news crowd. here's the gist (in python):
def knn(y,X,k):
d = len(y)
least = [(float('inf'),[float('inf')]*d) for i in xrange(k)]
for x in X:
d = 0.0
for i in xrange(len(x)):
d+=abs(x[i]-y[i])
if d < least[k-1][0]:
least[k-1] = (d,x)
least.sort()
return least
Sorting even for linear isn't the best idea but keeps the algorithm terse. A data structure like a priority queue should probably be used instead of an array.
and one should use a kd tree or ball tree for exact, and LSH-knn for inexact in practice.
and one should use a kd tree or ball tree for exact, and LSH-knn for inexact in practice.