Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yes, given the dimensionality of the input is at least the dimensionality of the hash value, location-sensitive hashing is always a dimensionality reduction operation.

Conversely, any dimensionality-reducing technique with a fixed number of finite range output dimensions can be used for location-sensitive hashing as long as a couple of extra constraints hold: (1) nearby inputs (given an appropriate distance metric) will produce nearby outputs (given a second appropriate distance metric), and (2) far-apart inputs are likely to produce far-apart outputs.



That seems to imply that LSHs are a subset of DRs. Are there LSH techniques that are not applicable to DR?


Like so many of these algorithms, there are a few perspectives, and the most useful way to contextualize it depends on the problem you're trying to solve.

I would say that it's both a dimensionality reduction and an approximate nearest neighbor search technique.

One example of a scenario where I think that the ANN perspective is more useful than the DR one is near duplicate detection. There, the buckets tend not to be treated as dimensions so much as filtered lists of items to consider for more detailed comparison. Assuming you're trying to do a high recall search for everything within one distance, you may just look at the union of all the matching buckets.

On the other hand, if you're doing a k-nearest-neighbors search (rather than a dragnet search for all items within some threshold), you might treat them more like dimensions, since the items that share more buckets with the search vector are likely to also be the most similar. So perhaps you start by ranking by Hamming distance in the LSH space, and then refine using comparisons in the higher-dimensional space.


> Are there LSH techniques that are not applicable to DR?

They're all applicable, but perhaps not useful in your particular problem domain. It all depends on how useful the distance metric used by the LSH is in your problem domain.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: