While the topic may seem trivial and the execution/implementation a bit too naive, it should be noted the author is a graphic designer! Check out David's homepage [0] -- original, if not very practical :-)
I personally think it's great to see such cross-pollination between various fields. It helps other newcomers to read about this stuff "in their own language", using a perspective of someone from their own field (as opposed to the usual academic / old-hand angle).
It's excellent – even though I somehow get the impression you think all designers are idiots :). I've met quite a few with more talent for abstract thought than some math professors.
I can only hope that we get more and more people from other disciplines involved in data science. The field is wide open and full of opportunities. Not just the stereotypical 'creative type' creating art with ANNs, but also, for example, in applying these methods to problems in the humanities.
"I've met quite a few with more talent for abstract thought than some math professors."
Such a statement can be made about almost any two fields.
I suspect "quite a few" is either hyperbole to strengthen your point or you suffer from the dunning-kruger effect.
All the parent comment says is that its good that two fields that are conventionally thought to be orthogonal have cross-pollination of knowledge. Nothing about "abstract thought" or intelligence is mentioned and is your own insecurity coming through.
I came here to say that its distance array was going to be large, but yes, pretty nice for someone who wasn't exposed to algorithm or the big o notation.
Maybe, but if the new cross-pollination just confuses the issue rather than clarifying it, what's the point?
"Suppose you have a bunch of flowers in a field, and they're planted so that the east-west direction corresponds to the diameter of the flower and the north-west direction corresponds to the length of the leaf."
"Huh?"
"Suppose you have a bunch of data points in an n-dimensional space"
"Suppose you have a flower data set, and each item has two parts: one value for flower diameter, and one for leaf length."
"Oh, that makes sense."
"Makes sense to me, too."
"And me."
"I wish my professor had just said that. Thanks!"
There is a lot of middle ground between an artsy, overly wordy scenario and a perfectly abstract scenario using jargon that's defined in terms of other jargon, etc.
The cross-disciplinary sweet spot is where you can define a problem from one side in terms compatible with an existing solution from the other side. You end up with language that's somewhere in the middle and easier to understand than it was at either extreme.
very nice write-up. if you're interested in this, the next logical step is to look at 'locality-sensitive hashing' and some simple state-of-the-art methods: https://github.com/spotify/annoy
this description of linear scan kNN may be a bit tedious for the hacker news crowd. here's the gist (in python):
def knn(y,X,k):
d = len(y)
least = [(float('inf'),[float('inf')]*d) for i in xrange(k)]
for x in X:
d = 0.0
for i in xrange(len(x)):
d+=abs(x[i]-y[i])
if d < least[k-1][0]:
least[k-1] = (d,x)
least.sort()
return least
Sorting even for linear isn't the best idea but keeps the algorithm terse. A data structure like a priority queue should probably be used instead of an array.
and one should use a kd tree or ball tree for exact, and LSH-knn for inexact in practice.
One reason is that the implementations are much more difficult. In k-d trees you have to deal with the distance from hyperrectangles. It is not something you would write in 30 minutes.
Also k-d trees don't scale much in higher dimensions (https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/...) so in large dimensional problems (the article says 20 is already large) the naive linear search tends to be actually faster. Voronoi is even worse in dimensions larger than 2.
I personally think it's great to see such cross-pollination between various fields. It helps other newcomers to read about this stuff "in their own language", using a perspective of someone from their own field (as opposed to the usual academic / old-hand angle).
[0] http://www.lettier.com/