Hacker News | snowman17's comments

The solution you mention has some tradeoffs. Imagine that you have hundreds of services, each of which has its own port and some long-running connections. With your solution you can only restart the load balancer as often as the longest-running connections finish, or until the ports run out, whichever comes first.

Also, you'd then have to maintain the logic for the port mapping, iptables switching, draining, etc.

The post mentions that they considered this option but were worried about the engineering risk, which seems like a reasonable concern.


> With your solution you can only restart the load balancer at the rate of the longest running connections

Well, perhaps I should have provided more detail. Our Ansible deployment script actually counts the port number up until it finds a free one (we keep a port range reserved for this purpose). So if there are multiple changes in rapid succession, more than two haproxy instances may linger for a while.

The "port discovery" is a shell-task that registers the port to be used as a variable, which is then used by the templates of the haproxy and iptables roles.
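For illustration, the counting-up logic could look roughly like the Python below. This is only a sketch, not the actual shell task: the function names, the bind-based free check, and the port range in the usage note are all assumptions made for the example.

```python
import socket
from typing import Callable, Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def find_free_port(
    start: int,
    end: int,
    is_free: Callable[[int], bool] = port_is_free,
) -> Optional[int]:
    """Count up from start to end (inclusive) and return the first free port.

    Returns None when every port in the reserved range is taken.
    """
    for port in range(start, end + 1):
        if is_free(port):
            return port
    return None
```

Calling something like `find_free_port(20000, 20099)` (a made-up range) would give the port to register as the variable for the templates. A `None` result means the reserved range is exhausted, which is the "ports run out" case mentioned above.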

The cleanup is done by a cron job that runs every 15 minutes and kills all haproxy instances that have no connections in netstat and don't match the haproxy pidfile.
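The selection rule in that cleanup can be sketched as a small Python function. This is an illustration under assumptions, not the real cron job: the function name and arguments are hypothetical, and collecting the running pids and the pids that still show connections in netstat is left to the caller.

```python
from typing import Iterable, List, Optional, Set


def stale_haproxy_pids(
    running: Iterable[int],
    with_connections: Set[int],
    pidfile_pid: Optional[int],
) -> List[int]:
    """Pick the haproxy pids that are safe to kill.

    A pid is stale when it is not the instance named in the pidfile
    and it no longer owns any connection.
    """
    return sorted(
        pid
        for pid in running
        if pid != pidfile_pid and pid not in with_connections
    )
```

The caller would then send a signal to each returned pid, for example with `os.kill(pid, signal.SIGTERM)`. Old instances that still hold connections are skipped and picked up by a later run once they have drained.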


There has been ongoing work in HAProxy for many years to support zero-downtime reloads via a few different potential mechanisms (file descriptor passing, a socket server, etc.). Unfortunately, it turns out that this is really hard given the architecture of HAProxy. That being said, I'm sure patches are always welcome.

The post does mention that they did consider something similar to what you are suggesting with the multiple HAProxy instances but decided against it due to engineering uncertainties. Could be that they just overestimated how hard it would be.

