
The consistent hashing stuff in the paper is pretty cool. In order to distribute traffic among backends, they came up with a new "Maglev hashing" algorithm that gives a more even distribution than existing techniques, but is less robust to changes in the set of servers. The trick is that you can locally memoize the results of your mostly-consistent hash function, and rely on an extra layer of connection affinity from the upstream ECMP routers to the load-balancers, so that the same memoized value is used every time. So you never actually drop connections unless both a backend instance and a load balancer fail at the same time, which should be very rare. Clever!
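For anyone who wants to poke at the idea, here is a rough Python sketch of the table-filling scheme plus the local memoization described above. It is not the paper's code: SHA-256 stands in for whatever hash functions Google uses, and the names `h`, `build_table`, `Balancer`, and `conn_track` are made up for illustration. The table size `m` is assumed to be a prime larger than the number of backends.

```python
import hashlib
from collections import Counter


def h(value: str, salt: str) -> int:
    """Stable 64-bit hash. SHA-256 is an assumption, not the paper's choice."""
    digest = hashlib.sha256(f"{salt}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(backends, m=251):
    """Fill an m-slot lookup table; backends take turns claiming slots."""
    if not backends:
        raise ValueError("need at least one backend")
    n = len(backends)
    # Each backend gets its own permutation of the slots: offset + j * skip.
    offsets = [h(b, "offset") % m for b in backends]
    skips = [h(b, "skip") % (m - 1) + 1 for b in backends]
    nxt = [0] * n  # position reached in each backend's permutation
    table = [None] * m
    filled = 0
    while filled < m:
        for i in range(n):
            # Walk backend i's permutation until a free slot turns up.
            slot = (offsets[i] + nxt[i] * skips[i]) % m
            while table[slot] is not None:
                nxt[i] += 1
                slot = (offsets[i] + nxt[i] * skips[i]) % m
            table[slot] = backends[i]
            nxt[i] += 1
            filled += 1
            if filled == m:
                break
    return table


class Balancer:
    """One load balancer: a hash table plus a local memo of flows seen."""

    def __init__(self, backends, m=251):
        self.m = m
        self.conn_track = {}  # flow -> backend (the memoized result)
        self.set_backends(backends)

    def set_backends(self, backends):
        self.backends = set(backends)
        # Sort so every balancer builds the same table from the same set.
        self.table = build_table(sorted(backends), self.m)

    def pick(self, flow: str) -> str:
        cached = self.conn_track.get(flow)
        if cached in self.backends:
            return cached  # memo hit: unaffected by table reshuffles
        backend = self.table[h(flow, "flow") % self.m]
        self.conn_track[flow] = backend
        return backend


if __name__ == "__main__":
    table = build_table([f"backend-{i}" for i in range(5)])
    print(Counter(table))  # slot counts per backend differ by at most one
```

Because backends claim slots in strict rotation, the slot counts can never differ by more than one, which is where the even distribution comes from. The memo in `pick` is what keeps an established flow on its backend when the table is rebuilt, as long as the flow keeps landing on the same balancer.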

As an aside, I couldn't help noticing these lines on adjacent pages:

> Maglev has been serving Google’s traffic since 2008. It has sustained the rapid global growth of Google services, and it also provides network load balancing for Google Cloud Platform.

> Maglev handles both IPv4 and IPv6 traffic, and all the discussion below applies equally to both.

In that case, any chance GCE will be getting IPv6 support soon? ;)




They drop connections when a backend dies.

And not just the ones that that backend was handling, but also some % of the overall traffic for re-balancing.

The degree of connection affinity from the ECMP routers is limited, and there's no "reliance" on it. If a connection flip-flopped between two or more load balancers, there would be no drops, thanks to the consistent hashing.
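A quick way to see both points is to build the lookup table twice with no shared state, then again with one backend removed, and count the slots that moved. This is only a sketch under assumptions: SHA-256 is a stand-in hash, and `maglev_table`, `disruption`, and the backend names are hypothetical.

```python
import hashlib


def h(value, salt):
    """Stable 64-bit hash (SHA-256 is an assumed stand-in)."""
    digest = hashlib.sha256(f"{salt}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def maglev_table(backends, m=251):
    """Compact table fill; m is assumed prime and > len(backends)."""
    backends = sorted(backends)  # same set -> same table on every balancer
    if not backends:
        raise ValueError("need at least one backend")
    prefs = []
    for b in backends:
        offset, skip = h(b, "offset") % m, h(b, "skip") % (m - 1) + 1
        prefs.append(iter([(offset + j * skip) % m for j in range(m)]))
    table = {}
    while len(table) < m:
        for b, pref in zip(backends, prefs):
            # Advance this backend's preference list to its next free slot.
            slot = next(s for s in pref if s not in table)
            table[slot] = b
            if len(table) == m:
                break
    return [table[s] for s in range(m)]


def disruption(before, after):
    """Fraction of table slots whose backend changed."""
    moved = sum(1 for a, b in zip(before, after) if a != b)
    return moved / len(before)


full = [f"backend-{i}" for i in range(5)]
lb_a = maglev_table(full)
lb_b = maglev_table(list(reversed(full)))  # second balancer, no shared state
shrunk = maglev_table(full[:-1])           # backend-4 has died

print("balancers agree:", lb_a == lb_b)
print("share owned by dead backend:", lb_a.count("backend-4") / len(lb_a))
print("share of slots that moved:", disruption(lb_a, shrunk))
```

Any gap between the last two numbers is the extra re-balancing traffic: slots that moved even though their backend was still alive.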


Google uses IPv6 internally for its services (no IPv4 at all?) but apparently doesn't see fit to let the rest of us use IPv6. Very annoying.


Source?



As a GCloud user I've never seen any mention of IPv6. That's the main reason I've not bothered trying to get on the IPv6 bandwagon. I don't want to struggle through it if my hosting provider doesn't even support it.


CloudSQL instances support IPv6,[1] and App Engine apps can receive IPv6 traffic, but the only explanation I've seen for the delay in getting IPv6 to users was hardware limitations.[2] I wanted to know why OP thinks this is only for external customers.

[1] https://cloudplatform.googleblog.com/2014/11/cloudsql-instan...

[2] https://code.google.com/p/google-compute-engine/issues/detai...


Cool, thanks for the extra details. Maybe they are using IPv6 for a portion of the internal infrastructure and the OP extrapolated that to mean they use it everywhere internally.


The "mostly consistent" hashing reminds me of Stochastic Fair Queuing. Not perfect, but good enough with less overhead.



