
I'm not saying there's never a use case for a proxy to do health checks, TLS termination, load balancing, sticky sessions, etc., but I don't see how that can compete with using ECMP with hash-based flows and TLS termination at the service endpoints.



For those who do not know what ECMP means, I found this blog post helpful: https://blog.talentica.com/2016/12/09/hash-based-ecmp-load-b...

tl;dr -- ECMP means Equal Cost Multi-Path


off the top of my head:

* non-equal-cost load balancing (give the Skylake boxes 25% more traffic than the Broadwell boxes)

* ECMP doesn't know anything about service health, so what happens when one app server out of 10 gets wedged and stops responding to requests?

* TLS termination on the proxy means you limit what machines need to hold your private keys.

* what if you want to load-balance or direct traffic based on something other than a 5-tuple hash?

* ECMP doesn't scale very well. Not all that long ago, at $DAYJOB, we had scalability problems because a certain large network vendor couldn't do more than 8 next hops for a single prefix.
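To make the 5-tuple and weighting points above concrete, here is a rough sketch of hash-based next-hop selection. Everything in it is an assumption for illustration: the names (FiveTuple, ecmp_next_hop, weighted_next_hop) are made up, real switches hash in hardware with vendor-specific functions rather than SHA-256, and repeating a next hop in the slot list is just one simple way to approximate unequal weights.

```python
import hashlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical flow key: the fields a router typically hashes on."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # 6 = TCP, 17 = UDP


def flow_hash(flow: FiveTuple) -> int:
    # SHA-256 is a stand-in here; hardware uses cheaper hash functions.
    key = "|".join(str(field) for field in flow).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def ecmp_next_hop(flow: FiveTuple, next_hops: list[str]) -> str:
    # Equal cost: every next hop owns the same share of the hash space.
    # Nothing here looks at whether the chosen host is actually healthy.
    return next_hops[flow_hash(flow) % len(next_hops)]


def weighted_next_hop(flow: FiveTuple, weights: dict[str, int]) -> str:
    # Unequal cost approximated by giving a hop one slot per unit of weight.
    slots = [hop for hop, weight in weights.items() for _ in range(weight)]
    return slots[flow_hash(flow) % len(slots)]


if __name__ == "__main__":
    flow = FiveTuple("10.0.0.1", "192.0.2.10", 40000, 443, 6)
    print(ecmp_next_hop(flow, ["app1", "app2", "app3"]))
    # Weights of 5 and 4 give the Skylake boxes 25% more slots.
    print(weighted_next_hop(flow, {"skylake": 5, "broadwell": 4}))
```

Note that each weighted slot consumes a next-hop entry, so this trick runs straight into the next-hop limits mentioned in the last bullet.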


I agree with all that you said except the scaling limits of ECMP. We have boxes that support 64 ECMP destinations and I've seen others mention 256: https://youtu.be/ciClZdwHelU


Nice to hear things have gotten better. 8 and 16 next-hop limits were very common a few years ago, especially on devices that you'd use as a top-of-rack switch.

ECMP still has some scaling challenges, IMHO, since each destination host still needs to peer with the upstream switch over BGP (unless you do some cleverness).


I don't understand: Google uses ECMP in Maglev [1]. Is there something specific about Maglev that makes their usage of ECMP viable while the scenarios that you encountered were poor fits?

[1] https://research.google.com/pubs/pub44824.html


Maglev is a network load balancer that operates at L3/L4. ECMP on the upstream routers spreads traffic for a VIP across a pool of Maglevs, but the Maglev itself does more complicated backend selection.

Envoy is an L7 proxy, so it's the moral equivalent of the Google GFE (which, conveniently enough, is the next layer of load balancing behind Maglev for most Google services).
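The ECMP-then-Maglev arrangement described above can be sketched roughly as follows. This is a simplified illustration of a consistent-hashing lookup table in the spirit of the linked paper, not Google's implementation: the function names, the salts, the use of SHA-256, and the tiny table size of 13 are all assumptions made for the example.

```python
import hashlib


def _h(data: str, salt: str) -> int:
    # Salted SHA-256 as a stand-in for independent hash functions.
    digest = hashlib.sha256((salt + data).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_lookup_table(backends: list[str], size: int = 13) -> list[str]:
    """Fill a table of `size` slots; `size` should be prime and > len(backends)."""
    if not backends:
        raise ValueError("need at least one backend")
    # Each backend gets its own permutation of the slots: offset + j * skip.
    offsets = [_h(b, "offset") % size for b in backends]
    skips = [_h(b, "skip") % (size - 1) + 1 for b in backends]
    position = [0] * len(backends)
    table = [None] * size
    filled = 0
    while filled < size:
        # Backends take turns claiming the next free slot in their permutation.
        for i, backend in enumerate(backends):
            while True:
                slot = (offsets[i] + position[i] * skips[i]) % size
                position[i] += 1
                if table[slot] is None:
                    break
            table[slot] = backend
            filled += 1
            if filled == size:
                break
    return table


def ecmp_pick_maglev(flow_key: str, maglevs: list[str]) -> str:
    # Tier 1: the upstream router hashes the flow onto one of the Maglevs.
    return maglevs[_h(flow_key, "ecmp") % len(maglevs)]


def pick_backend(flow_key: str, table: list[str]) -> str:
    # Tier 2: the Maglev looks the flow up in its table.
    return table[_h(flow_key, "flow") % len(table)]


if __name__ == "__main__":
    table = build_lookup_table(["backend-a", "backend-b", "backend-c"])
    flow_key = "10.0.0.1:40000->192.0.2.10:443/tcp"
    print(ecmp_pick_maglev(flow_key, ["maglev-1", "maglev-2"]))
    print(pick_backend(flow_key, table))
```

Because every Maglev builds the same table from the same backend list, a flow should land on the same backend regardless of which Maglev the router's ECMP hash happens to pick, which is what makes plain ECMP acceptable as the first tier.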



