
I did some indexing using Elasticsearch and some homegrown stuff based on geohashes about seven years ago. At the time Elasticsearch was just adding support for geoshape indexing as well. Initially this was also based on geohashes. Later they added proper quad tree support (instead of indexing the geohash as a term), and recently they revamped the implementation using BKD trees. The current implementation is way faster and more scalable than what they had a few years back and allegedly quite nice for handling complex GIS data.

What's the advantage of Uber's approach over any of this? Even my primitive geohash solution worked pretty nicely, and you can implement it on just about any type of DB. I had a simple algorithm to cover any shape with geohashes of a certain size, as well as a quick way to generate a polygon for a circle. The two combined allowed me to do radius searches for any shape overlapping or contained by the circle with simple terms queries on the geohash prefixes. My main headache was keeping the number of terms in a query (i.e. the number of geohashes) to a reasonable size (below 1024, which if I recall was the default limit).
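For anyone who wants to try the geohash approach, here is a rough sketch of the covering step in Python. It is not the commenter's actual code: the function names, the nearest-point test in degrees, and the fallback to shorter hashes when the term count exceeds the cap are all assumptions for illustration, and it ignores the poles and the antimeridian.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
EARTH_RADIUS_M = 6_371_000.0


def geohash_encode(lat, lon, length):
    """Standard geohash: interleave longitude/latitude bisection bits, 5 bits per char."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value, rng[0] = value * 2 + 1, mid
        else:
            value, rng[1] = value * 2, mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def cover_circle(lat, lon, radius_m, length):
    """Geohashes of the given length whose cells (approximately) intersect the circle."""
    lon_bits, lat_bits = (5 * length + 1) // 2, (5 * length) // 2
    cell_w, cell_h = 360.0 / (1 << lon_bits), 180.0 / (1 << lat_bits)
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))  # assumes the circle is far from the poles
    # Candidate rows/columns, padded by one cell and clamped to the grid.
    row_lo = max(0, math.floor((lat - dlat + 90.0) / cell_h) - 1)
    row_hi = min((1 << lat_bits) - 1, math.floor((lat + dlat + 90.0) / cell_h) + 1)
    col_lo = max(0, math.floor((lon - dlon + 180.0) / cell_w) - 1)
    col_hi = min((1 << lon_bits) - 1, math.floor((lon + dlon + 180.0) / cell_w) + 1)
    cells = set()
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            south, west = -90.0 + row * cell_h, -180.0 + col * cell_w
            # Point of the cell closest to the circle centre (clamped in degrees).
            near_lat = min(max(lat, south), south + cell_h)
            near_lon = min(max(lon, west), west + cell_w)
            if haversine_m(lat, lon, near_lat, near_lon) <= radius_m:
                cells.add(geohash_encode(south + cell_h / 2, west + cell_w / 2, length))
    return cells


def cover_circle_capped(lat, lon, radius_m, length, max_terms=1024):
    """Shorten the hashes until the cover fits under max_terms (hypothetical policy)."""
    while True:
        cells = cover_circle(lat, lon, radius_m, length)
        if len(cells) <= max_terms or length == 1:
            return cells
        length -= 1


terms = cover_circle_capped(51.5074, -0.1278, 20_000, 6)
print(len(terms), sorted(terms)[:5])
```

The resulting set is what would go into a terms query against an indexed geohash-prefix field. Coarser cells keep the query small but pull in more false positives near the edge of the circle, so a distance filter on the results is still needed.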




They go into some detail on this in this talk: https://youtu.be/ay2uwtRO3QE?t=712.

What I get from their explanation is that hexagon is a better shape for map grids because they are the most complex shape that can tesselate (the other two are triangles and squares). As they are closer to a circle, distances within a cell are more uniform, and the distance from a cell center to each of its neighbours is the same as well.
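The neighbour-distance point is easy to check on a flat grid. The sketch below is only an illustration on an idealized plane (not anything from the talk or from Uber's library); the function names and the axial-coordinate layout for the hexagons are my own choices.

```python
import math


def square_neighbor_distances(side=1.0):
    """Centre-to-centre distances to the 8 neighbours of a square cell."""
    return sorted(
        math.hypot(dx * side, dy * side)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )


def hex_neighbor_distances(edge=1.0):
    """Centre-to-centre distances to the 6 neighbours of a pointy-top hexagon.

    Uses axial coordinates (q, r): centre x = edge*sqrt(3)*(q + r/2), y = edge*1.5*r.
    """
    directions = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]
    return sorted(
        math.hypot(edge * math.sqrt(3) * (q + r / 2), edge * 1.5 * r)
        for q, r in directions
    )


squares = square_neighbor_distances()
hexes = hex_neighbor_distances()
print("square:", [round(d, 3) for d in squares])  # two distinct distances (edge vs corner)
print("hex:   ", [round(d, 3) for d in hexes])    # a single distance for all six
```

With squares, the four corner neighbours are a factor of sqrt(2) farther away than the four edge neighbours; with hexagons all six neighbours sit at the same distance, which is what makes things like "everything within k rings" behave more like a radius.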

I think the reason hexagons are not more common is that subdivisions are hard to create compared to triangles or squares: a hexagon cannot be split exactly into smaller hexagons. Uber solved this by subdividing each cell into 7 smaller hexagons and rotating them so they cover the bigger shape approximately, with some small overlap at the edges.
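To see why the fit is only approximate, here is a flat-plane sketch of a one-into-seven subdivision. It is an idealization and not Uber's implementation (which works on a sphere); the rotation angle atan(sqrt(3)/5), about 19.1 degrees, is simply the angle at which seven hexagons of one-seventh the area line up with the parent's corners, and all names are hypothetical.

```python
import math


def hexagon_vertices(cx, cy, circumradius, first_vertex_deg):
    """Vertices of a regular hexagon centred at (cx, cy)."""
    return [
        (
            cx + circumradius * math.cos(math.radians(first_vertex_deg + 60 * k)),
            cy + circumradius * math.sin(math.radians(first_vertex_deg + 60 * k)),
        )
        for k in range(6)
    ]


def hexagon_area(circumradius):
    return 1.5 * math.sqrt(3) * circumradius ** 2


def inside_parent(x, y, parent_radius, tol=1e-9):
    """True if (x, y) is inside a parent hexagon centred at the origin.

    The parent has vertices at 30 + 60k degrees, so its edge normals are at 60k degrees.
    """
    apothem = parent_radius * math.cos(math.radians(30))
    return all(
        x * math.cos(math.radians(60 * k)) + y * math.sin(math.radians(60 * k))
        <= apothem + tol
        for k in range(6)
    )


def seven_children(parent_radius):
    """Seven child hexagons (as vertex lists), each with 1/7 of the parent's area."""
    child_radius = parent_radius / math.sqrt(7)
    tilt = math.degrees(math.atan(math.sqrt(3) / 5))  # about 19.1 degrees
    spacing = math.sqrt(3) * child_radius  # centre-to-centre distance of adjacent hexagons
    centers = [(0.0, 0.0)] + [
        (
            spacing * math.cos(math.radians(tilt + 60 * k)),
            spacing * math.sin(math.radians(tilt + 60 * k)),
        )
        for k in range(6)
    ]
    # Vertex directions of a hexagon are 30 degrees off its neighbour directions.
    return [hexagon_vertices(cx, cy, child_radius, tilt + 30) for cx, cy in centers]


def count_spilled_vertices(parent_radius=1.0):
    """Number of child vertices that fall outside the parent hexagon."""
    return sum(
        not inside_parent(x, y, parent_radius)
        for child in seven_children(parent_radius)
        for x, y in child
    )


print("area ratio:", 7 * hexagon_area(1 / math.sqrt(7)) / hexagon_area(1.0))
print("child vertices outside parent:", count_spilled_vertices())
```

The seven children have exactly the parent's total area, yet some of their corners poke outside the parent (and matching slivers of the parent are covered by the neighbouring parents' children). So a point's coarse cell is not always the strict geometric container of its fine cell, which is the trade-off versus squares or triangles.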

Distortion is also a big problem. I never thought it would be that significant, but it makes sense. They go into a lot of the details later in the same video.


Which is why, btw, hexagons have been used for decades in wargames (and, to a more limited extent, in other types of board games).


> hexagon is a better shape for map grids because they are the most complex shape that can tesselate

[Penrose tiling intensifies]


I think it's clear what they meant was not actually 'the most complex shape' but the regular polygon with the most sides that can tessellate.


Autocomplete of 'most compact'.


The Techzing podcast talked about the origin of this several years ago, back when one of the hosts had a critical role in its creation. Good stuff.


Geohashes are cool, and lopping off a trailing letter reduces precision.

But if you look at how they overlay London, you get quite a split higher up: two cells right next to each other have hashes that don't look like neighbours at all, so I can see where the number of terms would get big.
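A small sketch of both effects, using a plain textbook geohash encoder (the function name and the sample coordinates are just for illustration): truncating a hash gives the enclosing coarser cell, while two points a few kilometres apart on either side of the prime meridian share no prefix at all.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Standard geohash: alternate longitude/latitude bisection bits, 5 bits per char."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value, rng[0] = value * 2 + 1, mid
        else:
            value, rng[1] = value * 2, mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


def common_prefix_len(a, b):
    """Number of leading characters two hashes share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Lopping off trailing letters yields the coarser cell containing the same point.
fine = geohash_encode(51.5074, -0.1278, 8)
coarse = geohash_encode(51.5074, -0.1278, 4)
print(fine, coarse, fine.startswith(coarse))

# Two points in London on either side of the prime meridian.
west = geohash_encode(51.5, -0.1, 6)  # first character is "g"
east = geohash_encode(51.5, 0.1, 6)   # first character is "u"
print(west, east, common_prefix_len(west, east))
```

Because the very first longitude bit flips at 0 degrees, a prefix query that is supposed to cover "London" needs terms from both top-level cells, and a shape straddling such a boundary can only be covered by many short-lived prefixes rather than one long one.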

The new Elasticsearch implementation is now the default, and I think they are deprecating the prefix-based versions like geohash.

Too bad their stuff is not embeddable into an app. Know of a lib for this?


This is all part of Lucene and not specific to Elasticsearch as far as I understand.

Update with link: https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/...


Nice! Thanks for the link. I was hoping more for an embeddable lib for mobile apps.



