I'm not saying there's never a use case for a proxy to do health checks, TLS ter...

mahmoudimus · on Jan 15, 2018

For those who do not know what ECMP means, I found this blog post helpful: https://blog.talentica.com/2016/12/09/hash-based-ecmp-load-b...

tl;dr -- ECMP means Equal Cost Multi-Path

packetslave · on Jan 15, 2018

off the top of my head:

* non-equal-cost load balancing (give the Skylake boxes 25% more traffic than the Broadwell boxes)

* ECMP doesn't know anything about service health, so what happens when one app server out of 10 gets wedged and stops responding to requests?

* TLS termination on the proxy means you limit what machines need to hold your private keys.

* what if you want to load-balance or direct traffic based on something other than a 5-tuple hash?

* ECMP doesn't scale very well. Not all that long ago, at $DAYJOB, we had scalability problems because a certain large network vendor couldn't do more than 8 next hops for a single prefix.
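To make the first few points in the list above concrete, here is a minimal Python sketch of hash-based next-hop selection. It is an illustration only, not how any particular router implements ECMP: the function names, the use of SHA-256 as the hash, the host names, and the 5:4 weights (chosen to give the "skylake" group 25% more traffic) are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the 5-tuple.

    Every hop is equally likely, and nothing here knows whether a hop is healthy.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return next_hops[digest % len(next_hops)]


def weighted_next_hop(flow, weights):
    """Hypothetical weighted variant: repeat each hop in proportion to its weight."""
    expanded = [hop for hop, weight in weights.items() for _ in range(weight)]
    return ecmp_next_hop(flow, expanded)


def health_aware_next_hop(flow, next_hops, healthy):
    """Hypothetical health-aware variant: hash only over hops known to be healthy."""
    candidates = [hop for hop in next_hops if hop in healthy]
    return ecmp_next_hop(flow, candidates)


# Synthetic flows: (src_ip, src_port, dst_ip, dst_port, protocol)
flows = [("10.0.0.%d" % (i % 250), 1024 + i, "192.0.2.1", 443, "tcp")
         for i in range(10000)]
hops = ["app1", "app2", "app3", "app4"]

# Plain ECMP: roughly equal shares, and a wedged "app3" would still get traffic.
equal = Counter(ecmp_next_hop(f, hops) for f in flows)

# Weighted: 5 slots vs 4 slots gives "skylake" about 25% more flows.
weighted = Counter(weighted_next_hop(f, {"skylake": 5, "broadwell": 4}) for f in flows)

# Health-aware: "app3" is excluded once something marks it unhealthy.
healthy = {"app1", "app2", "app4"}
filtered = Counter(health_aware_next_hop(f, hops, healthy) for f in flows)

print(equal, weighted, filtered, sep="\n")
```

The weighted and health-aware variants are the kind of logic a proxy or smarter load balancer adds on top; plain ECMP only provides the first function.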

_jbez · on Jan 16, 2018

I agree with all that you said except the scaling limits of ECMP. We have boxes that support 64 ECMP destinations and I've seen others mention 256: https://youtu.be/ciClZdwHelU

packetslave · on Jan 16, 2018

Nice to hear things have gotten better. 8 and 16 next-hop limits were very common a few years ago, especially on devices that you'd use as a top-of-rack switch.

ECMP still has some scaling challenges, IMHO, since each destination host still needs to peer with the upstream switch over BGP (unless you do some cleverness).

devonkim · on Jan 15, 2018

I don't understand. Google uses ECMP in Maglev [1]. Is there something specific about Maglev that makes their usage of ECMP viable while the scenarios that you encountered were poor fits?

[1] https://research.google.com/pubs/pub44824.html

packetslave · on Jan 15, 2018

Maglev is a network load balancer that operates at L3/L4. ECMP on the upstream routers spreads traffic for a VIP across a pool of Maglevs, but the Maglev itself does more complicated backend selection.
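A rough Python sketch of that two-tier arrangement follows. It is a simplification under stated assumptions: the router tier is modeled as a modulo hash over the 5-tuple, and the balancer tier uses rendezvous (highest-random-weight) hashing as a stand-in for consistent backend selection. It is not the Maglev hashing algorithm from the paper, and all names and addresses are made up for the example.

```python
import hashlib


def _hash(*parts):
    """Stable 64-bit hash of the given parts (SHA-256 is an arbitrary choice)."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def router_pick_balancer(flow, balancers):
    """Tier 1: upstream router spreads flows for a VIP across balancers (ECMP-style)."""
    return balancers[_hash(*flow) % len(balancers)]


def balancer_pick_backend(flow, healthy_backends):
    """Tier 2: each balancer picks a backend with rendezvous hashing.

    The choice depends only on the flow and the healthy backend set, so every
    balancer that sees the same flow picks the same backend.
    """
    return max(healthy_backends, key=lambda backend: _hash(backend, *flow))


balancers = ["lb1", "lb2", "lb3"]
backends = ["be1", "be2", "be3", "be4", "be5"]
flows = [("198.51.100.%d" % (i % 200), 2000 + i, "192.0.2.10", 443, "tcp")
         for i in range(1000)]

# Which balancer each flow lands on, and which backend that balancer selects.
placement = {f: (router_pick_balancer(f, balancers),
                 balancer_pick_backend(f, backends)) for f in flows}

# Take "be3" out of service: only flows that were on "be3" get a new backend.
remaining = [b for b in backends if b != "be3"]
after = {f: balancer_pick_backend(f, remaining) for f in flows}
moved = sum(1 for f in flows if placement[f][1] != after[f])
print("flows moved after removing be3:", moved)
```

The point of the split is that the router tier can stay simple and stateless, while health checks and backend selection live in the balancer tier.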

Envoy is an L7 proxy, so it's the moral equivalent of the Google GFE (which, conveniently enough, is the next layer of load balancing behind Maglev for most Google services).