k-D trees and BSPs don't work well in high dimensions (say beyond 20 to 30 dims), since your true nearest neighbor is highly likely to end up on the opposite side of some dividing hyperplane from your query.
If each dividing hyperplane splits a set of vectors into two roughly equal parts, then for N vectors the tree needs on the order of log2(N) levels of separating planes to get down to a single vector per terminal node. But for, say, a billion vectors in a 100-dimensional space (and assuming something like a k-D tree, where you split on one dimension per level), you have only partitioned along log2(10^9) ~= 30 dimensions this way, leaving roughly 70 other dimensions through which you can travel to reach your true nearest neighbor (the curse of dimensionality: in high dimensions everything is "near" everything else) and which you never check if you discard subspaces along the way.
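To make the "everything is near everything else" point concrete, here is a minimal numpy sketch (the sample sizes and dimensions are arbitrary choices, not from the discussion above): as the dimension grows, the relative gap between the nearest and the farthest point from a random query collapses, so there is very little for hyperplane-based pruning to exploit.

```python
# Sketch: distance concentration as dimension grows (illustrative only;
# sample sizes and dimensions below are arbitrary choices).
import numpy as np

rng = np.random.default_rng(0)

for d in (2, 10, 30, 100, 300):
    # Random uniform points stand in for "generic" vectors in d dimensions.
    points = rng.random((2000, d))
    query = rng.random(d)

    dists = np.linalg.norm(points - query, axis=1)
    nearest, farthest = dists.min(), dists.max()

    # As d grows, the relative gap (farthest - nearest) / nearest shrinks,
    # so discarding half-spaces prunes almost nothing useful.
    print(f"d={d:4d}  relative gap = {(farthest - nearest) / nearest:.3f}")
```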
This is a problem for other approximate methods too. IVF methods still do a form of partitioning (via Voronoi cells), but the partition comes from clustering the data, so it is attuned to the global structure of the data rather than to greedy divide and conquer. It is still possible that you need to check every IVF cell to find your true nearest neighbor; it is just less likely to matter in practice, precisely because the cells follow the overall structure of the data rather than a greedy subdivision.
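For what it's worth, here is a toy IVF-style sketch (plain numpy plus scikit-learn's KMeans; the cell count and probe count are arbitrary): k-means centroids define the Voronoi cells, and at query time only the few closest cells are scanned, which is exactly why the true neighbor can occasionally sit in a cell you never probe.

```python
# Toy IVF-style index: k-means cells, probe only a few cells per query.
# A sketch for illustration, not tuned; all parameters are arbitrary.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
data = rng.standard_normal((20000, 64)).astype(np.float32)

n_cells, n_probe = 256, 8
kmeans = KMeans(n_clusters=n_cells, n_init=4, random_state=0).fit(data)
cells = [np.where(kmeans.labels_ == c)[0] for c in range(n_cells)]  # Voronoi cell members

def ivf_search(query):
    # Rank cells by distance from the query to each centroid, scan the closest few.
    d_cent = np.linalg.norm(kmeans.cluster_centers_ - query, axis=1)
    candidates = np.concatenate([cells[c] for c in np.argsort(d_cent)[:n_probe]])
    d = np.linalg.norm(data[candidates] - query, axis=1)
    return candidates[np.argmin(d)]

query = rng.standard_normal(64).astype(np.float32)
exact = np.argmin(np.linalg.norm(data - query, axis=1))   # brute-force ground truth
approx = ivf_search(query)
print("exact:", exact, "ivf:", approx, "match:", exact == approx)
```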
I wonder if nearest neighbor is actually what you want to begin with. Imagine points distributed in a U-shape: the nearest neighbor might require jumping across the gap, while a better match could be a bit further away but somewhere along the shape. Especially if the dimensions do not have a meaningful intrinsic scale, I could just scale the x dimension and pull the two sides of the U apart. So maybe it could actually be more useful to look at the topology of the points, something like a Vietoris–Rips complex [1] but invariant under [local] rescaling of dimensions. Not sure if something like that exists or is possible in a robust and efficient way.
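As a tiny illustration of the rescaling point (the shape and spacing below are made up): with the two arms of the U close together in x, the Euclidean nearest neighbor of a point near the tip of one arm sits across the gap; stretch the x axis and the "nearest" point becomes one further down the same arm.

```python
# Sketch: rescaling one axis changes which point is "nearest" (Euclidean NN
# is not invariant under per-dimension rescaling). Shape/spacing are made up.
import numpy as np

# Two parallel arms of a U (the connecting bottom is omitted for brevity):
# left arm at x = 0.0, right arm at x = 0.5, points spaced 1.0 apart in y.
ys = np.arange(0.0, 11.0)
left = np.column_stack([np.zeros_like(ys), ys])
right = np.column_stack([np.full_like(ys, 0.5), ys])
points = np.vstack([left, right])

query = np.array([0.0, 10.0])          # the tip of the left arm

def nearest(pts, q):
    d = np.linalg.norm(pts - q, axis=1)
    d[d == 0.0] = np.inf               # ignore the query point itself
    return pts[np.argmin(d)]

print("original scale:", nearest(points, query))                      # across the gap: [0.5, 10.]

stretch = np.array([10.0, 1.0])        # pull the two sides of the U apart in x
print("x stretched:   ", nearest(points * stretch, query * stretch))  # same arm: [0., 9.]
```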
I think you would prefer something like a support vector machine over k-NN for better learned point-wise local rescaling (you could also use e.g. ISOMAP or MDS to try to linearize the manifold), which you could also combine with various TDA methods (see the giotto-tda GitHub repo Jupyter notebooks for some similar examples).
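A short scikit-learn sketch of the ISOMAP suggestion (an assumed workflow, not taken from the giotto-tda notebooks): unroll an S-curve with Isomap, then compare which point comes out as nearest in the original coordinates versus in the unrolled ones, where distance now tracks position along the shape.

```python
# Sketch of the ISOMAP idea: embed the manifold, then do NN in the embedding,
# so "near" is measured along the shape. Parameters here are arbitrary.
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap
from sklearn.neighbors import NearestNeighbors

X, t = make_s_curve(n_samples=2000, random_state=0)   # t = position along the curve

# Isomap approximates along-the-manifold (geodesic) distances via a k-NN graph,
# then embeds them with classical MDS.
Z = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

query_idx = 0
nn_ambient = NearestNeighbors(n_neighbors=2).fit(X)
nn_embedded = NearestNeighbors(n_neighbors=2).fit(Z)

# Index [:, 1] skips the query point itself (its own distance is zero).
i_ambient = nn_ambient.kneighbors(X[[query_idx]])[1][0, 1]
i_embedded = nn_embedded.kneighbors(Z[[query_idx]])[1][0, 1]

print("curve position of query:            ", round(t[query_idx], 3))
print("curve position of ambient-space NN: ", round(t[i_ambient], 3))
print("curve position of embedded-space NN:", round(t[i_embedded], 3))
```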