I did some indexing using elasticsearch and some home grown stuff based on geohashes about seven years ago. At the time Elasticsearch was just adding support for geoshape indexing as well. Initially this was also based on geohashes. Later they added proper quad tree support (instead of indexing the geohash as a term), and recently they revamped the implementation using BKD trees. The current implementation is way faster and scalable than what they had a few years back and allegedly quite nice for handling complex GIS data.
What's the advantage of Uber's approach over any of this. Even my primitive geohash solution worked pretty nicely and you can implement it on just about any type of DB. I had a simple algorithm to cover any shape with geohashes of a certain size as well as a quick way to generate a polygon for a circle. The two combined allowed me to do radius searches for any shape overlapping or contained by the circle with simple terms queries on the geohash prefixes. My main headache was keeping the number of terms in a query (i.e. number of geohashes) to a reasonable size (below 1024, which if I recall was a the default limit).
What I get from their explanation is that hexagon is a better shape for map grids because they are the most complex shape that can tesselate (the other two are triangles and squares). As they are more close to a circle, distances within a cell are more stable, also computing the distance from a cell center to its neighbours is stable in hexagons as well.
I think the reason hex is not more common is that subdivisions are hard to create compared to triangles or squares. Uber solved this by subdividing in 7 smaller hex and tilting it so they cover the bigger shape with some small overlap.
Also a big problem is distortion, I never thought this would be that huge of a problem but it makes sense. They go into a lot of the details later on the same video.
geohashes are cool, and lopping off a letter reduces precision.
But if you look at have they overlay London, you get quite a split higher up and two next to each other don’t look like they are and so I can see where the number of terms would get big.
The new ElasticSearch implementation is now the default and I think they are deprecating the prefix versions like geohash.
Too bad their stuff is not embeddable into an app. Know of lib for this?
What's the advantage of Uber's approach over any of this. Even my primitive geohash solution worked pretty nicely and you can implement it on just about any type of DB. I had a simple algorithm to cover any shape with geohashes of a certain size as well as a quick way to generate a polygon for a circle. The two combined allowed me to do radius searches for any shape overlapping or contained by the circle with simple terms queries on the geohash prefixes. My main headache was keeping the number of terms in a query (i.e. number of geohashes) to a reasonable size (below 1024, which if I recall was a the default limit).