> Unfortunately, some new non-technical management at F5 recently
decided that they know better how to run open source projects. In
particular, they decided to interfere with security policy nginx
uses for years, ignoring both the policy and developers’ position.
Ah, I completely forgot F5 was involved in this; probably most everyone else did too, and F5 gets no money from it. It shouldn't matter to them, though. Do they even have competition in the enterprise load balancer space? I spent 9 years of my career managing these devices; they're rock solid, and I remember anecdotes about MS buying them by the truckload. They should be able to cover someone working on nginx, maybe even advertise it for some OSS goodwill.
I dunno about rock solid. I’ve had plenty of issues forcing a failover/reboot, multiple complicated tickets open a year, etc. But we have a sh ton of them. To be fair, some are kernel bugs with connection table leaks, SNAT + UDP, etc.
Buuuut, they have by far the best support. They’re as responsive as Cisco, but every product isn’t a completely different thing, team, etc. And they work really well in a big company used to having Network Engineering as a silo. I’d only use them as physical hardware, though. As a virtual appliance, they’re too resource hungry.
Nginx or HA-Proxy are technically great for anything reasonable and when fronting a small set of applications. I prefer nginx because the config is easier to read for someone coming in behind me. But they take a modern IT structure to support because “Developers” don’t get them and “Network Engineers” don’t have a CLI.
For VMware, the NSX-V HA-Proxy and NSX-T nginx configs read like someone followed the HOWTO and never got to production-ready deployments. They’re poorly tuned and failure recovery is sloooow. AVI looked so promising, but development slowed and seemed to lose direction post-acquisition. And that was before Broadcom. Sigh.
I'm very out of date so take my opinion with a grain of salt. The customer support I received from F5 when they acquired a telco product was about the worst support I've ever seen. Now this wasn't the general LB equipment that F5 has the reputation around, it's some specific equipment for LTE networks.
We'd get completely bogus explanations for bugs, escalate up the chain to VPs and leadership because there was an obvious problem with training, understanding, and support for complex issues, and then get the VPs trying to gaslight us into believing their explanations were valid. We're talking things like: on our IPv4-only network, the reason we're having issues is bugs in the equipment receiving IPv6 packets.
So it's one of those things where I've personally been burned so hard by F5 that I'd probably look for other vendors, maybe to an unreasonable degree. The only thing is, this was a while ago, and the rumors I've heard are that no one involved is still employed by F5.
I completely get this. I feel like every product I’ve had outside of a vendor’s wheelhouse has gone that way. We just use the BigIP gear from F5 and they’re better than the load balancers we used in the past. Thank god Cisco just abandoned that business.
I can’t imagine them supporting telco gear. The IPv6 thing has me LOLing because I just had a similar experience with a vendor where we don’t route IPv6 in that segment and even if we did, it shouldn’t break. Similarly, a vendor in a space they don’t belong that I imagine we bought because of a golf game.
A thing I dread is a product we’ve adopted being acquired… and worse, being acquired by someone extending their brand into a new area. It’s also why we often choose a big brand over a superior product. It’s not the issue of today, but when they get bought and by who. I hate that so much and not my decision, but it’s a reality.
It’s also a terrible sign if you’re dealing with a real bug and you’re stuck with a sales engineer and can’t get a product engineer directly involved.
I have a list of “thou shalt not” companies as well, and some may be similar where a few bad experiences ruined the brand for me. Some we’re still stuck with and I maaaay be looking for ways to kill that.
First, I don’t make these decisions but sometimes have influence. These opinions are my own and not my intentionally unnamed employer, and might be flat out wrong. This list is very focused on big companies at stupid scale with a lot of legacy… applied tech.
Generally my rule is “except for their very core product.” But this is full “hate everything” that pops into my mind:
RedHat won’t accept gifted patches for critical bugs in their tools that they won’t troubleshoot themselves. Getting the patch upstream means you get to use it in the next major version, years later. That predates IBM. I won’t use their distribution-specific tooling anymore. Outside the OS, it’s even worse. If I hear ActiveMQ one more time… [caveat: I probably hate every commercial Linux distro and Windows because my nonexistent beard is grayer than my age]
IBM… kind of feel sad about it, but they now suck at everything.
Oracle has good support, but they’re predatory and require an army of humans to manage inherently hodgepodge systems. They also create an organizational unit of certified admins who can’t transition to alternatives because they’ve only memorized the product. Cisco’s the same, minus the predatory part, and there aren’t many good alternatives for core DC gear.
CA and Symantec were awful pre-Broadcom and are even worse now that they’re Broadcom’s annuity. It’s where products go to die.
Trellix (ex-McAfee) is like the new Symantec or something.
There’s more I wish I could list for you, but can’t for various reasons.
On the other end, Satya has made MS a reasonable choice in so many things. Still a lot that sucks or is immature, but still… I didn’t think that was possible. I had to shift my mindset.
When was this? I worked with them 2009-2018, support was really top notch. We could get super technical guys on the call and even custom patches for our issues, but our usage was relatively simple. I contrast them with McAfee products we've used, now that was a complete shitshow as a product and support.
The world has moved on in the sense that "good enough" and cloud eats into their balance sheets I'm sure, but there's loads and loads of banks and legacy enterprises that maintain their ivory tower data centers and there's nothing to replace these with AFAIK. Google has Maglev, AWS perhaps something similar, MS no idea, everyone else just buys F5 or doesn't need it.
My org moved off nginx to haproxy after we learned that (at the time, maybe it changed) reloading an nginx config, even if done gracefully through kernel signals, would drop existing connections, whereas haproxy could handle it gracefully. That was a fun week of diving into some C code looking for why it was behaving that way.
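For context, the "graceful" reload here is just a signal to the nginx master process, something like this (the pid file path varies by distro):

```
# Ask the running master process to re-read its config and swap workers:
nginx -s reload
# equivalent to sending SIGHUP to the master yourself:
kill -HUP "$(cat /var/run/nginx.pid)"
```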
Nginx abruptly drops http/1.1 persistent connections on reloads. This has been an issue forever and Maxim refused to ever fix it, saying it was to spec (yes it was, but there are better ways to deal with it).
It’s a reason why many large, modern infra deployments have moved away from nginx.
It doesn't drop it, it's just not persistent on reload, isn't that what you mean? Actually dropping a connection mid-request is something I haven't seen nginx (or indeed Apache) do for many years despite doing some weird things with it.
I can see where you're coming from, but it's not unreasonable behaviour, is it? Connections need to be migrated over to the new worker, and that's how all major servers do it. If that's a problem, then maybe something designed as a proxy only, instead of a real server, is the way to go?
It doesn't drop mid request. But it closes the TCP socket abruptly after any in flight requests are completed. Clients have no idea the connection is closed, and try to reuse it and get back a RST. In heavily dynamic environments where nginx reloads happen frequently, it leads to large amounts of RSTs/broken connections and high error rates (you can't necessarily auto-retry a POST, a RST could mean anything).
The sane approach is connection draining - you send a `connection: close` response header on the old worker, then finally remove any remaining idle connections at the end of the drain.
In http/2 it's not an issue as it has a way for the server to explicitly say the connection is closed.
What you describe is basically how persistent http works, is it not? Even a persistent connection terminates at some point. Which web server does not work like that?
I guess you could send the connection header on draining, but anything less than what the big servers do is bound to cause some compatibility problem with some niche client somewhere. I can certainly see why a web server with millions of installs would be reluctant to change behaviour, even if it is within spec.
I can only guess at the use case here, but maybe something designed from the start as a stateless proxy and not a general purpose web server would be a better fit.
I'm late to return to the thread, but this was the exact scenario we hit. We had it behind a CDN as well as behind an L4 load balancer for some very high volume internal consumers, and when it would just blast back RST packets, the consumers would freak out and break connection, returning errors on their end that weren't matched in our logs, unless we were lucky and got a 499 (now Maxim can talk about standards). As a general purpose reverse proxy for many clients on the Internet, I'm sure that's fine, but in our use case this made nginx unpredictable and no longer desirable.
Isn't the typical behaviour of an application to re-establish the persistent connection on demand? I wonder what the requirement is to have these persistent with no timeout.
we went in the opposite direction, not because haproxy was bad, just because nginx had a simpler config, and i think we were paying for haproxy but don't pay for nginx.
all that said, neither drops existing connections on reload
> nginx is already good at mitigating HTTP desync / request smuggling attacks, even without using HTTP/2 to backends. In particular because it normalizes Content-Length and Transfer-Encoding while routing requests (and also does not reuse connections to backend servers unless explicitly configured to do so)
HAProxy is a wonderful load balancer that doesn't serve static files, thus forcing many of us to learn Nginx to fill the static-file-serving scenarios.
Caddy seems like a wonderful alternative that does load balancing and static file serving but has wild config file formats for people coming from Apache/Nginx-land.
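To be fair, for the basic static-files-plus-proxy case the Caddyfile form is short. A rough sketch (domain, paths, and backend address are made up):

```
example.com {
    encode gzip
    root * /srv/site
    file_server                            # static files
    reverse_proxy /api/* 127.0.0.1:3000    # proxy the app under /api/
}
```

TLS certificates are handled automatically, which is part of the appeal.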
I keep a Caddy server around and the config format is actually much, much nicer than nginx's in my experience. The main problem with it is that everybody provides example configurations in the nginx config format, so I have to read them, understand them, and translate them.
This works for me because I already knew a fair bit about nginx configuration before picking up Caddy but it really kills me to see just how many projects don't even bother to explain the nginx config they provide.
An example of this is Mattermost, which requires WebSockets and a few other config tweaks when running behind a reverse proxy. How does Mattermost document this? With an example nginx config! Want to use a different reverse proxy? Well, I hope you know how to read nginx configuration because there's no English description of what the example configuration does.
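From memory, the essential bits of that nginx example boil down to something like this (the port and path are illustrative, so double-check against their docs):

```
location /api/v4/websocket {
    proxy_pass http://127.0.0.1:8065;
    proxy_http_version 1.1;                   # WebSockets need HTTP/1.1
    proxy_set_header Upgrade $http_upgrade;   # forward the upgrade handshake
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_read_timeout 600s;                  # keep idle sockets open
}
```

But none of that is spelled out in words, which is the problem.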
Mastodon is another project that has committed this sin. I'm sure the list is never-ending.
> The main problem with it is that everybody provides example configurations in the nginx config format, so I have to read them, understand them, and translate them.
This is so real. I call it "doc-lock" or documentation lock-in. I don't really know a good scalable way to solve this faster than the natural passage of time and growth of the Caddy project.
You're absolutely right. I'm going to do this today.
It's clear from this thread that a) Nginx open source will not proceed at its previous pace, b) the forks are for Russia and not for western companies, and c) Caddy seems like absolutely the most sane and responsive place to move.
LLMs do a horrendous job with Caddy config as it stands. It doesn't know how to differentiate Caddy v0/1 config from v2 config, so it hallucinates all kinds of completely invalid config. We've seen an uptick of people coming for support on the forums with configs that don't make any sense.
For just blasting a config out, I'm sure there are tons of problems. But (and I have not been to your forums, because...the project just works for me, it's great!) I've had a lot of success having GPT4 do the first-pass translation from nginx to Caddy. It's not perfect, but I do also know how to write a Caddyfile myself, I'm just getting myself out of the line-by-line business.
Thanks for the link! Maybe less thanks for the attitude, though--I'm well-versed in how these tools fail and nothing goes out the door without me evaluating it. (And, for my use cases? Generally pretty solid results, with failures being obvious ones that fail in my local and never even get to the deployed dev environment.)
> This is so real. I call it "doc-lock" or documentation lock-in. I don't really know a good scalable way to solve this faster than the natural passage of time and growth of the Caddy project.
I think you are totally right here: gaining critical mass over time as a battle-tested solution. On the other hand, doc authors [who prefer Caddy] will likely stop providing sample Nginx configs, and someone else will complain about that on HN.
"Battle tested" can be seen differently of course, but in my opinion, things like the next one,
> IMO most users do require the newer versions because we made critical changes to how key things work and perform. I cannot in good faith recommend running anything but the latest release.
from https://news.ycombinator.com/item?id=36055554, said by someone working on Caddy, don't help.
Maybe in their bubble (can I say your bubble, since you are from Caddy as well?) no one really cares about LTS stuff and everyone just uses "image: caddy:latest", with everything in containers managed by dev teams. Just my projection on why it may be so.
How would you imagine this working in practice? Should one provide instructions on how to unwrap the Docker images/Dockerfiles a project uses (quite a few lean on Docker/containers nowadays rather than a regular system setup) so you can, for example, set up the same thing in FreeBSD jails? Where do you stop?
Just for completeness' sake, and probably not useful to many people: HAProxy can serve a limited number of static files by abusing back-ends and error pages. I have done this for landing pages and directory/table-of-contents pages. You just make a properly formed HTTP response with the desired HTTP headers embedded in it, configure it as the error page for a new back-end, and use ACLs to direct specific URLs to that back-end. Then just replace any status codes with 200 for that back-end. Probably mostly useful to those with a little hobby site or landing page that needs to give people some static information while the rest of the site is dynamic. It reduces moving parts and reduces the risk of time-wait assassination attacks.
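Roughly like this, if anyone wants to try it (backend and file names are made up; landing.http has to be a complete HTTP response, headers and all, with a 200 status line):

```
frontend fe_main
    bind :80
    acl is_landing path / /index.html
    use_backend be_landing if is_landing
    default_backend be_app

backend be_landing
    # no servers on purpose: every request "fails" into the 503 error page,
    # which is really a hand-written 200 response
    errorfile 503 /etc/haproxy/pages/landing.http

backend be_app
    server app1 127.0.0.1:8080
```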
This method is also useful for abusive clients that one still wishes to give an error page to. Based on traffic patterns, drop them in a stick table and route those people to your pre-compressed error page in the unique back-end. It keeps them at the edge of the network.
FYI: Serving static files is easier and more flexible in modern versions of HAProxy via the `http-request return` action [1]. No need to abuse error pages and no need to embed the header within the error file any longer :-) You even have some dynamic generation capabilities via the `lf-file` option, allowing you to embed e.g. the client IP address or request ID in responses.
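Something along these lines, for anyone curious (paths and the log-format tag are illustrative):

```
frontend fe_main
    bind :80
    # serve a static file straight from the frontend, no back-end needed
    http-request return status 200 content-type text/html file /etc/haproxy/pages/landing.html if { path / }
    # lf-file evaluates log-format tags, e.g. a template containing "your ip: %[src]"
    http-request return status 200 content-type text/plain lf-file /etc/haproxy/pages/whoami.txt if { path /whoami }
```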
Nice, I will have to play around with that. I admit I sometimes get stuck in outdated patterns due to old habits and being lazy.
I'm a community contributor to HAProxy.
I think I recall chatting with you on here or email, I can't remember which. I have mostly interacted with Willy in the past. He is also on here. Every interaction with the HAProxy developers has been educational and thought-provoking, not to mention pleasant.
> I think I recall chatting with you on here or email, I can't remember which.
Could possibly also have been in the issue tracker, which I helped bootstrap and maintain for quite a while after initially setting it up. Luckily the core team has taken over, since I've had much less time for HAProxy contributions lately.
True, and I've made use of the Nginx adapter, but the resulting series of error messages and JSON was too scary to dive into further. The workflow that would make the most sense to me (to exit Nginx-world) would be loading my complex Nginx configs (100+ files) with the adapter, summarizing what could not be interpreted, and then writing the whole thing out in Caddyfile format for me to modify further. I understand that JSON to Caddyfile would be lossy, but reading or editing 10k lines of JSON just seems impossible and daunting.
For a lot of web apps, having an all-in-one solution makes sense.
nginx open source does all of these things and more wonderfully:
Reverse proxying web apps written in your language of choice
Load balancer
Rate limiting
TLS termination (serving SSL certificates)
Redirecting HTTP to HTTPS and other app-level redirects
Serving static files with cache headers
Managing a deny / allow list for IP addresses
Getting geolocation data[0], such as a visitor’s country code, and setting it in a header
Serving a maintenance page if my app back-end happens to be down on purpose
Handling gzip compression
Handling websocket connections
I wouldn't want to run and manage services and configs for ~10 different tools here but nearly every app I deploy uses most of the above.
nginx can do all of this with a few dozen lines of config and it has an impeccable track record of being efficient and stable. You can also use something like OpenResty to have Lua script support so you can script custom solutions. If you didn't want to use nginx plus you can find semi-comparable open source Lua scripts and nginx modules for some individual plus features.
[0]: Technically this is an open source module to provide this feature.
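To make that concrete, here is a rough, untested sketch of what such an all-in-one config can look like (server names, paths, and zone sizes are purely illustrative):

```
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

upstream app {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    server_name example.com;
    return 301 https://$host$request_uri;            # redirect HTTP to HTTPS
}

server {
    listen 443 ssl;
    server_name example.com;
    ssl_certificate     /etc/ssl/example.com.pem;    # TLS termination
    ssl_certificate_key /etc/ssl/example.com.key;

    gzip on;                                         # compression
    limit_req zone=perip burst=20 nodelay;           # rate limiting
    deny 203.0.113.0/24;                             # simple IP deny list

    location /static/ {
        root /srv/app;                               # static files...
        expires 30d;                                 # ...with cache headers
    }

    location / {
        proxy_pass http://app;                       # reverse proxy to the app
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;      # websocket pass-through
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header Host $host;
        error_page 502 503 /maintenance.html;        # maintenance page when the app is down
    }

    location = /maintenance.html {
        root /srv/maintenance;
        internal;
    }
}
```

The geolocation headers would come from the module mentioned in [0].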
Quite interesting. In theory, a "pure" load balancer shouldn't, but in practice most of mine do, especially for small projects. Even for larger projects I combine proxy_cache on the LB, making it serve static files or a site's public content, while splitting load over several application servers for dynamic content.
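For the curious, that pattern is only a couple of directives. A rough sketch (zone name, sizes, and the upstream name are illustrative):

```
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m max_size=1g inactive=60m;

server {
    location /assets/ {
        proxy_cache static_cache;
        proxy_cache_valid 200 301 302 10m;    # cache successful responses briefly
        proxy_pass http://app_servers;        # app_servers = your upstream group
    }
}
```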