Haproxy 2.2

lukeqsee · on July 12, 2020

HAProxy is a unique piece of software.

I’ve had the privilege to interact directly with Willy (the main developer for many years, still the project lead) on the mailing list, in person at the conference, and even though I’ve never paid a dime, the interaction has been the best open-source experience I’ve ever had. Willy routinely writes multi-paragraph responses on the mailing list to my hair-brained suggestions for how HAProxy could be better for my company’s rather unique needs.

I often feel bad that I cannot do more for the project, because it is so well thought-through and delivered.

The software cuts no corners and delivers at fractions of the TCO of competition (limited by my opinion and experience, of course). For instance, just last week, I spun a rate limiting solution in half a morning, that mitigated some annoying proxy bots instantly (and is flexible enough to automatically block offenders without affecting legitimate users).

The stats, DNS support, and integrations and many other featured are second to none among other load balancers.

It’s an impressive and refreshing project. We could all learn something from HAProxy and the team.

topspin · on July 12, 2020

I asked for a feature on GitHub (fetching SHA-2 fingerprint of client certificates, as opposed to SHA-1) and it was handled in one day. That was the feature that pulled me away from nginx (which still only provides SHA-1 via $ssl_client_fingerprint btw.)

TimWolla · on July 12, 2020

Oh hi! That was me [1]. I'm a community contributor, but I'm equally impressed by the speed patches get merged on the mailing list.

[1] https://github.com/haproxy/haproxy/issues/123#issuecomment-5...

lukeqsee · on July 12, 2020

Sounds like par for the course.

I compiled it from source on our edge servers for a long time (as opposed to packaged), simply so I could apply these real time patches as needed.

apple4ever · on July 12, 2020

Why do you choose HAProxy over Ngxinx?

It's sounds like you love it, and I have had no experience with HAProxy. So I'm curious about the reasons you love it.

(Just to be clear this is a sincere question to learn since tone is hard to express clearly with text.)

lukeqsee · on July 12, 2020

As a load balancer, NGINX is subpar in almost every feature comparison, especially at the open-source (free to use) tier.

HAProxy gives you the following that are musts for load balancing (in my opinion), that NGINX does not, at least not easily:

1) A HTML (or JSON) stats page that precisely and completely tells you what’s going on at a high-level. A visit to this during outages is often all that’s required.

2) Support for DNS (and other) discovery mechanisms in a flexible way. (This is paid in NGINX)

3) Active health checks (also paid in NGINX)

4) The ACL system, while somewhat difficult to learn, is amazingly powerful.

5) Flexible L7 retries are brilliant.

We replaced NGINX with HAProxy and eliminated a whole class of bugs, micro-outages, and annoyances just by following HAProxy’s best practices.

I still use NGINX when I need a static web file server, though. :)

raghava · on July 13, 2020

HAProxy - traffic shaping { load-balancing, req-throttling, health-checks }

nginx - { static file serving, good proxy for python/ruby apps, req manipulation with advanced scripting capabilities via lua engine through open-resty [mirroring, WAF(naxsi), url rewriting etc] }

A combo of haproxy + nginx would always bring good delight for many practitioners.

wtarreau · on July 13, 2020

This is actually what I often recommend and often encounter in field: haproxy for LB, varnish serving as a smart cache, and nginx for the applications+static file serving.

All 3 components are free, combine extremely well because they've grown together, and are extremely efficient. This is important in virtualised or containerized environments where you want to save resources to minimize response time and leave the CPU for the applications.

Of course each of them can do a little bit of the other ones' job. This is fine, it allows easier initial deployments, but as your site grows, whichever you initially start with, you'll always end up installing the two other ones to constitute the most robust stack ever. And it's easy to insert one next to the others without having to break everything, which further adds to the fun.

lukeqsee · on July 13, 2020

Yes. We use all three at different points of the stack for different purposes.

It's great.

apple4ever · on July 20, 2020

Wow that is very helpful. I am going to take a look now!

jedberg · on July 13, 2020

I switched reddit from nginx to haproxy because nginx had a pathological problem where it would send 90% of the load to the first app server listed when there was a burst of requests (and 9% to the next, and .9% to the next and so on, leaving almost every app server idle).

I'm sure they fixed the problem by now, but it was not a good look for a load balancer.

HAProxy on the other hand has never failed me, and Wily (the creator) is super responsive if you have questions or need features.

takeda · on July 13, 2020

Why you chose Nginx over HAProxy?

I see a lot of people preferring Nginx, but I can't understand why they are so fascinated about it and try to squeeze it where it has no place. Nginx is just a web server that has some load balancing functionality, it is sub par with HAProxy.

I never seen people saying why are you using HAProxy instead of Apache, but Nginx load balancing functionality is closer to Apache than HAProxy.

heipei · on July 13, 2020

I think people choose nginx because it does everything decently, even if it can't beat HAProxy for load-balancing or Varnish for caching (at least the FOSS nginx). I run a small-to-mmedium size application with multiple upstream (local web app tier, remote asset servers) and some basic vhost and caching use-cases, e.g. cache specific paths with different cache keys than others, different cache behaviours and life-times, request rate limiting, some local file serving. nginx does everything I need, without any additional modules. I get close enough to zero-downtime deployments by using multiple backends and proxy_next_upstream, haven't needed active healthchecks for the upstreams yet. I have JSON access logs, so my dashboard about vhost activity is in Kibana now. There are a few juicy features missing (cache PURGE, proxy request coalescing, extended stub status endpoint, JSON error log). Not trying to fanboy, but I wouldn't introduce additional complexity into my stack by running haproxy + varnish + nginx just to get the best of each tool, not at my current scale.

apple4ever · on July 20, 2020

I choose it because I have a lot of experience with Nginx, and none with HAProxy. So I set it up as a LB, and it worked for my needs (at least from what I can tell).

I will definitely checkout out HAProxy now.

bpicolo · on July 13, 2020

One good reason is that HAProxy doesn't payment gate a bunch of the baseline features. I'm still surprised that NGINX doesn't come with active healthchecks in the free version - baseline feature for a load balancer in the world of treating computers as cattle.

HAProxy is such a great example of software that does what you expect, generally.

takeda · on July 13, 2020

> HAProxy is such a great example of software that does what you expect, generally.

It's also rock solid and battle tested. I don't think I ever observed crash. It's typically one of tools that once you configure correctly it works for years without issues.

toast0 · on July 13, 2020

> I don't think I ever observed crash.

It got a lot better quickly, but running with thread was crashy with a lot of threads. I don't remember crashes with single threaded/forking mode though, and I ended up with a forking config when I was setting something up at my last job.

TimWolla · on July 12, 2020

I gave my reasoning back on the HAProxy 2.0 announcement: https://news.ycombinator.com/item?id=20198232

apple4ever · on July 20, 2020

Perfect thank you!

This is why I love HN. I ask a good question and get a lot of great feedback!

thayne · on July 13, 2020

Totally agree. HAproxy is an incredible piece of software. It is both extremely fast, and very versatile, which is a rare combination. The 2.2 release checks off almost every item on my haproxy wishlist, which makes me very happy.

To all the developers of such a great piece of software, I offer you my gratitude.

aduitsis · on July 12, 2020

Haproxy is such a nice piece of software, sensible configuration, very stable and versatile. And, one thing that is highly appreciated, I've never seen it do something I wasn't expecting it to. This quality of minimal surprises in its operation isn't going unnoticed by any measure.

kqr · on July 13, 2020

There are some edges that were still a bit rough at least in 1.8. In particular, the way it handles reloading state from the state file, and the interactions with that state and the configuration file.

I don't remember the exact details, but the allowed formats of some identifiers were different, and it didn't do "the obvious thing" when the configuration and state file contained a different set of definitions, IIRC. This was consistent, but not very well documented.

It works fine for us now, so we like it, of course, but it took some production incidents to figure out how it all worked. (We did test before shipping, of course, but these were hard-to-predict edge cases.)

closeparen · on July 12, 2020

Has it solved the reloading problem yet?

rbjorklin · on July 12, 2020

Yup, https://www.haproxy.com/blog/hitless-reloads-with-haproxy-ho...

jontro · on July 12, 2020

Here is the announcement blogpost https://www.haproxy.com/blog/announcing-haproxy-2-2/

js4ever · on July 12, 2020

Finally full SSL certs management at runtime through the API is here! No need to restart Haproxy anymore to add or update a cert! Brilliant, thanks!

kitotik · on July 12, 2020

FWIW the ability to send a SIGUSR2 to reload certs has been there for awhile.

leehampton · on July 12, 2020

I love HA Proxy, but one thing I'm confused about is why the http-tunnel feature was removed in this release (and deprecated in earlier 2.x releases). http-tunnel allowed you to start a session with an HTTP request/response, then keep the socket to the backend alive without further inspection of the protocol.

This is useful for things like RTSP where you kick things off with HTTP but then stream lower level TCP content over the same socket. There are also lots of other custom protocols that benefit from this type of set up, including one that I'm working on wrangling HA Proxy to work with now.

Does anyone know if there's some replacement way to handle this in HA Proxy that I'm overlooking?

thayne · on July 13, 2020

> why the http-tunnel feature was removed in this release

From the haproxy 2.0 documentation:

> This mode should not be used as it creates lots of trouble with logging and HTTP processing. And because it cannot work in HTTP/2, this option is deprecated and it is only supported on legacy HTTP frontends. In HTX, it is ignored and a warning is emitted during HAProxy startup.

As for a way to handle this, I believe if you are using an HTTP CONNECT or a websocket (Connection: Upgrade), then haproxy will detect that it is a tunnel, and handle that correctly. If that's not the case, you might be able to use haproxy in tcp mode.

thedanbob · on July 13, 2020

There is a bug in 2.2 which prevents websockets (and other tunneling protocols?) from working[0]. Maybe what the OP is running into.

[0] https://github.com/haproxy/haproxy/issues/737

therockspush · on July 13, 2020

Seeing a lot of love for Willy, the main developer for years, on here. I've dabbled in HAProxy, definitely a lovable product.

Looked up Willy on LinkedIn,"DO NOT SEND ME FCKING INVITES IF WE HAVE NOT WORKED TOGETHER! "

No Fcking around with this guy.

snvzz · on July 12, 2020

It's amazing how much better the .org site is over the .com.

TimWolla · on July 12, 2020

And the best thing is Cyril Bonté’s configuration.txt to HTML converter: http://cbonte.github.io/haproxy-dconv/

emmanueloga_ · on July 13, 2020

it is not bad... but in my opinion pretty much every site out there could benefit from a little

    body { max-width: 70rem; margin: 2rem auto; }

... or something similar (pet peeve)

rumanator · on July 12, 2020

Isn't it the other way around? Haproxy.org is not responsive and looks like it was designed in the 90s.

zokier · on July 12, 2020

> looks like it was designed in the 90s.

I think plenty of HNers will see that as a positive aspect.

rumanator · on July 12, 2020

I understand the nostalgia or liking the retro look, but the site is completely unreadable in mobile. There is absolutely no redeeming quality in that.

Claiming that an unreadable version of a site is better than a readable one is simply wrong.

somehnguy · on July 12, 2020

In my opinion a site like this working perfectly on mobile is in the 'who cares' category. Almost nobody is going to haproxy.org on their smartphone with any intention of actually doing anything with the software. Almost nobody is installing or configuring haproxy from mobile.

It's information dense, which is much appreciated on desktop. The .com looks like every other generic 'look at our product!' site, and to actually do anything you have to sort through 5 different dropdowns and other UI items designed to grab your attention.

When I have to install or configure the software I want the .org, 100%.

rumanator · on July 13, 2020

> Almost nobody is going to haproxy.org on their smartphone with any intention of actually doing anything with the software.

This is where you get it entirely wrong. I read this news and I, as an extensive nginx/traffic user, wanted to check out haproxy to understand if it was worth a shot. The .org page is plagued with general usability and readability problems to the point that it's practically unreadable when compared with the page served through the .com domain. There is no way around it.

You don't fix problems by turning a blind eye and playing the denying card. More importantly, this sort of technical snafu is helps form the public image of the product, and thus this sort of poor performance reflects poorly on the product.

somehnguy · on July 13, 2020

If you were actually serious about it you would just mentally note it and revisit when you were on a more capable device. Perusing for replacement infrastructure software via mobile is just a casual thing, and foolish if you're actually trying to get anything done.

Nobody is swapping out any tech via their mobile browser impression. All this is pretty complicated software and you're going to want to do a lot of reading/inspection before making decisions like that. That is not done via a 5" display.

I stand with my view that there is no problem to fix here. The site is clear & understandable to anyone who seriously plans on using it or is using it.

vbezhenar · on July 12, 2020

Some people are using phones to call and computers to browse the Internet.

rumanator · on July 12, 2020

What's your point? Are you trying to argue that just because you imagine someone does not have a smartphone then it's ok to fool ourselves to believe that no one has a smartphone?

Because mobile has been a basic requirement and competency for, say, the last decade.

jazzyjackson · on July 13, 2020

basic requirement and competency? Are you the CSS police?

I would've thought allowing user agent style sheets would be a basic requirement for a web browser but it seems that not everyone agrees with me, and sometimes you just have to accept that not everyone is on the same page as you.

rumanator · on July 13, 2020

> basic requirement and competency? Are you the CSS police?

No, I'm a potential haproxy user who is unable to access the site because whoever put it together either failed to follow basic CSS tutorials or has no idea that there are more reading surfaces than 19' 1980x1080 LCD monitors.

And the surprising thing is that here I am, in HN of all places, where readers are expected to be educated and somewhat informed and with a basic understanding, reading comments like yours. Baffling, to say the least.

jwitthuhn · on July 12, 2020

I think "unreadable" is a pretty big stretch here. This is what the site looks like for me in firefox on android.

https://i.imgur.com/Zmslysb.jpg

The text in the navigation section and the table is a bit small, but quite far from unreadable. That can also easily be solved by zooming in.

On the topic of mobile, I think it is also very important to look at data usage.

~327KB for all assets on haproxy.org ~5.8MB for all assets on haproxy.com including nearly a megabyte of javascript

briffle · on July 13, 2020

It's completely readable on my mobile. (Android running Firefox beta).

fgonzag · on July 17, 2020

Maybe they're not using it on mobile?

netcraft · on July 12, 2020

Slightly OT, we have several different kinds of services that need rate limiting, written in different stacks. We would like to have one solution for rate limiting, ideally that we could put in front of any service, that was light weight, but also could work with AWS target groups that are already splitting traffic across nodes inside a service - so I believe that means some sort of clustered solution or at least communication. Is haproxy a good fit for this? Maybe nginx (paid)?

social_quotient · on July 12, 2020

We use HAproxy for similar reasons you describe if I’m understanding correctly.

As one of the other posts kinda suggested you can get a ton done with a few hours, it might be worth just standing up a box real quick and trying it out. As a note when we try stuff like this we put behind a AWS LB so we can push partial traffic to our experiment and aren’t betting the farm whilst testing in prod.

Good luck!

nine_k · on July 13, 2020

I wonder why Envoy is not mentioned.

warmfusion · on July 13, 2020

Haproxy is brilliant, we've used it for years as a simple mesh on all our services, but we're considering moving to envoy as we need opentracing support to help understand how requests flow between services.

Anyone managed to make this work on haproxy?

rogerdonut · on July 13, 2020

There's an opentracing integration coming very soon! We were hoping to have it available with the 2.2 release but there were still a few things to finalize.

arrty88 · on July 13, 2020

The only open source software better than Haproxy is redis IMHO.

stephenr · on July 13, 2020

I agree that Redis is "good", I wouldn't put it at the level of HAProxy - particularly when it comes to "what stuff is deliberately kept out of the open source version".

If I had to choose a project/tool to put at a similar 'level' as HAProxy in terms of: doing one thing well; a working open source project with a private company backing it; and a well run project, I'd say it's Varnish, which just happens to pair very well with HAProxy.

sillysaurusx · on July 12, 2020

Haproxy isn’t quite as fast as iptables (we switched because of this) but it was delightful to configure. The tradeoff is definitely worth it in most cases.

takeda · on July 13, 2020

That's because iptables do layer 3 load balancing (on a kernel level) while HAProxy does layer 7 (on user level). Layer 3 load balancing is much simpler so there's less work to do. If layer 3 load balancing does work for your use case then HAProxy wasn't the right tool.

BTW: Willy Tarreau (the author of HAProxy) is also a Linux kernel developer and made contributions in those areas, so . If you configure HAProxy to do zero copy forwarding you can get quite good performance.

Thaxll · on July 12, 2020

IPtables does 0.1% of what Haproxy can do.

enitihas · on July 12, 2020

Just curious, what were you doing where you needed the performance of iptables over haproxy?

sillysaurusx · on July 12, 2020

We forward a cluster of 2,560 TPU pod cores from our GCE project to other GCE projects in europe-west4-a. Originally it was because we had a separate GCE project with a bunch of credits, but that project had no access to TPUs. The question was, could we still take advantage of the credits? It turns out, we could; the solution involved VPC Network Peering, which I later learned is how the TPUs themselves work. Some configuration details are here: https://www.shawwn.com/swarm#iptables

Nowadays we forward the TPU pods to pretty much anyone who wants to try them out, in hopes of getting more people involved in the TPU programming scene. The TPUs are managed via a website (https://www.tensorfork.com/tpus) and we coordinate TPU access via spreadsheet. Each researcher has their own GCE project, and we simply flip a switch to give them access.

If anyone reading this happens to be into ML and into programming for big hardware rigs, feel free to hop into the Tensorfork discord server and we can show you the ropes. https://github.com/shawwn/tpunicorn#ml-community

toast0 · on July 12, 2020

I've done some dead simple forwarding/load balancing work, and if you can do it with nat instead of a proxy application it'll use a lot less memory, in addition to less cpu.

That means fewer load balancers needed, or smaller machines (or both). So I'd say that means anytime you run out of capacity on your proxy machines would be an opportunity to look for other techniques. Haproxy is probably easier to use though, and would tend to need less work to get the features you want, though. So there's an opex/capex vs development time argument.

Hyperscaling Haproxy is a lot of fun too, though. There's a huge difference in connections/second between a normal config and a totally tuned config with haproxy and kernel patching on the table.