HTTP/2-exclusive threats caused by implementation flaws and RFC imperfections (portswigger.net)
393 points by Berg0X00 on Aug 5, 2021 | 112 comments



So I was scratching my head for a good couple of minutes trying to figure out how this works, being familiar only with HTTP Response Splitting/HTTP Cache Poisoning.

So it seems that somewhere along the years, while I wasn't paying attention to websec, it became common practice to send requests from different clients through the same TLS connection. And due to the non-conforming way HTTP/2-to-HTTP/1.1 interop was implemented by these webservers/load balancers, request boundaries were not delimited correctly, making it possible to inject requests on behalf of follow-up clients.
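If I've understood the article correctly, the H2.CL case boils down to bytes like these reaching the back end after the downgrade. A rough sketch (hostnames and paths invented for illustration):

    # What the back-end sees after a naive HTTP/2 -> HTTP/1.1 downgrade
    # (the article's H2.CL case), assuming the front-end forwarded the
    # client-supplied content-length instead of the real HTTP/2 body length
    downgraded = (
        b"POST /example HTTP/1.1\r\n"
        b"Host: victim.example\r\n"    # hypothetical host
        b"Content-Length: 0\r\n"       # lie: the body below is not empty
        b"\r\n"
        # the back-end thinks the request ended above, so everything from
        # here on is parsed as the start of the *next* request on the
        # reused connection, i.e. a prefix of the next client's request
        b"GET /smuggled HTTP/1.1\r\n"
        b"X-Ignore: "
    )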

Fine I get the issue. Sounds like another "good enough" optimization that backfired.

What is the solution, aside from playing the patching whack-a-mole game? Should a hypothetical HTTP/2.1 work without a strict TLS requirement, so that protocol downgrades aren't necessary to squeeze out extra performance (unless I misunderstand why the HTTP/2-to-HTTP/1.1 bridge was in place)? Or is the problem that some application servers still don't support HTTP/2 out of the box?


It's worth knowing that this is an extension of a very widespread attack on HTTP/1.1; many (maybe most?) HTTP/1.1 implementations were broken by desync attacks just a couple of years ago.


Out of the ecosystems I’m familiar with, Python application servers have terrible http2 support: neither gunicorn nor uwsgi supports it, and even new hotness like uvicorn is pretty far from it.

I don’t think Ruby is doing much better? Correct me if I’m wrong.


But why would you need perfect HTTP/2 support in a real-world application server? They are never going to terminate the client traffic; they will speak with a load balancer, which can speak HTTP/1.1 with them. Sure, if you are at web scale (or even below it) you want everything on HTTP/2 for optimization's sake. But in the rest of the cases, even in a solo project, you can easily enough put an nginx in front of it, or a cloud-native solution, or HAProxy, or whatever.


The whole point of this article is that proxies speaking HTTP2 with clients and HTTP1.1 with servers introduce new vulnerabilities. The author found such vulnerabilities in AWS ALB, several WAF solutions, F5 BigIP, and others.


Yeah, but serving traffic from an application server directly is probably even worse, exposing a plethora of other failure modes.

EDIT: and yes I understand that you should use http/2 on the LB and http/2 on the backend to get the best of both worlds.

EDIT2: anyway my opinion is that the general reaction to a security discovery like this one shouldn't be "let's stop using this tech immediately" but "let's get this patched ASAP"


No one was talking about serving from the application directly. The issue is in the scenario you are describing. Please read the article.


Multiplexing API requests could be a thing, if only HTTP/2 passthrough on proxies were more popular.


I forwarded this discussion to the lead maintainer of HAProxy and he confirmed that HAProxy is not impacted by this. It doesn't surprise me. He implements things to the strictest interpretation of the specs.


The building blocks are there. In Python we have the wonderful (and wonderfully sans-io) h2[1] by Cory Benfield. E.g. here's a Twisted h2-using implementation: https://python-hyper.org/projects/hyper-h2/en/stable/twisted...

h2: https://github.com/python-hyper/h2
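For a taste of the sans-io style, here's a minimal sketch (mine, untested) of a server-side h2 loop over a plain socket; all the actual I/O is yours to own, which is the point:

    import h2.config
    import h2.connection
    import h2.events

    def serve_connection(sock):
        # sock: an accepted TCP (or TLS) socket; h2 itself never does I/O
        conn = h2.connection.H2Connection(
            config=h2.config.H2Configuration(client_side=False))
        conn.initiate_connection()
        sock.sendall(conn.data_to_send())
        while data := sock.recv(65535):
            for event in conn.receive_data(data):  # bytes in, events out
                if isinstance(event, h2.events.RequestReceived):
                    conn.send_headers(event.stream_id,
                                      [(":status", "200"),
                                       ("content-length", "2")])
                    conn.send_data(event.stream_id, b"hi", end_stream=True)
            sock.sendall(conn.data_to_send())      # bytes out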


I think the main problem is indeed missing HTTP2 support in backend servers. This is often just a case of people not being willing to upgrade for various reasons, even if the technology they are using would support HTTP2 in newer versions.


The problem is that HTTP/2 pretty much forces encryption. Most people don't want to deal with certificate management/rotation on every single microservice's application server.


Why does HTTP/2 force encryption? The HTTP/2 RFC (RFC 7540) also defines how to run HTTP/2 over plaintext.

Terminating HTTP/2 over TLS on a web frontend and then HTTP/2 over plaintext to the application servers sounds like a viable model.


In practice, HTTP/2 forces encryption. For example, Amazon's ALB docs say "Considerations for the HTTP/2 protocol version: The only supported listener protocol is HTTPS." [1]

[1]: https://docs.aws.amazon.com/elasticloadbalancing/latest/appl...


My Apache server is fine speaking HTTP/2 over port 80:

  curl -v --http2 --http2-prior-knowledge http://localhost
  * Connected to localhost (::1) port 80 (#0)
  * Using HTTP2, server supports multi-use
  * Connection state changed (HTTP/2 confirmed)
  * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
  * Using Stream ID: 1 (easy handle 0x559a7c6545c0)
  > GET / HTTP/2
  > Host: localhost
  > user-agent: curl/7.74.0
  > accept: */*
  > 
  * Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
  < HTTP/2 301 
  < date: Fri, 06 Aug 2021 11:16:05 GMT
  *snip*
Sadly none of the services that I reverse proxy through Apache support HTTP/2..


It makes sense that internet-facing clients and servers only support HTTP/2 over TLS. But that is different for internal connections or debug tools.


If you're running microservices, won't you be running them on a platform?

If on Kubernetes, just install cert-manager. Or if using FaaS, your platform will already do TLS termination, no?


That assumes your application can connect to the internet and can be accessed from the internet. There is a vast array of offline-only kubernetes clusters.

You can still, of course, use self-signed certificates (or set up your own "CA"), but you'll hit other problems related to runtime certificate reloads and so on. It's still a lot of work to enable SSL for fully offline services.


On kubernetes, cert-manager might get certificates for you, but you'll still need to make sure the application correctly reloads those certs (many application frameworks have no way to reload a certificate at runtime).
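For what it's worth, in Python you can at least swap in a fresh certificate for new handshakes via the SNI callback, without restarting. A rough, untested sketch (the paths are hypothetical):

    import ssl

    CERT, KEY = "/etc/certs/tls.crt", "/etc/certs/tls.key"  # hypothetical

    def fresh_context():
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(CERT, KEY)
        return ctx

    current = fresh_context()

    def sni_cb(ssl_sock, server_name, initial_ctx):
        # runs during each handshake; swap in the most recent context
        ssl_sock.context = current

    listener_ctx = fresh_context()
    listener_ctx.sni_callback = sni_cb

    def reload_certs():
        # call when your cert rotates (e.g. from a file watcher)
        global current
        current = fresh_context()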


Your reverse proxy can handle the TLS termination. Nginx or Traefik or whatever.

Besides, if you're doing microservices, whatever is managing them should be able to restart them gracefully. No need for reload as such.


This kind of terrible implementation is a lot of why encrypted QUIC exists.


Though I think these same problems would happen over HTTP/3/QUIC.

HTTP/3 is almost exactly HTTP/2, but over QUIC. This means it should be possible to force the same desyncs talked about in the article.


It seems most of the vulnerabilities here (at least, the first several that I read in the article) come at a point where an HTTP/2 request is transformed into an HTTP 1.1 request. So the insecurity seems to come at the point where a translation occurs and there is redundant information that can be handled in several ways, meaning that there's lots of room for bugs.
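As a toy illustration of that (hypothetical code, not any particular product's): a front end that copies HTTP/2 header values byte-for-byte into the HTTP/1.1 stream turns header data into request delimiters.

    def naive_downgrade(method, path, headers):
        # toy buggy front-end: copies HTTP/2 fields verbatim into
        # HTTP/1.1, where \r\n is a delimiter rather than just data
        req = f"{method} {path} HTTP/1.1\r\n"
        for name, value in headers:
            req += f"{name}: {value}\r\n"
        return (req + "\r\n").encode()

    # in HTTP/2's binary framing this value is just bytes; after the
    # downgrade it injects an extra header into the request
    payload = naive_downgrade("GET", "/", [
        ("host", "example.com"),
        ("user-agent", "x\r\nTransfer-Encoding: chunked"),
    ])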


@albinowax_, can you say anything more about how widely you have investigated the prevalence of the "H2.X via Request Splitting" vulnerability?

Although all of your attack vectors are fascinating, this seems to be the one which is particularly terrifying. It seems like one single request can basically mess up user authentication across an entire website, and for every active user, until the server becomes resynchronized somehow. If I were running a large public-facing site or application, checking for this vulnerability would be my #1 priority this morning.


Good question! So, my understanding is that the majority of servers that are vulnerable to regular cross-user HTTP Request Smuggling (i.e., that reuse connections to the back-end server) are exposed to this terrifying response queue poisoning attack. This applies to all desync types: CL.TE, TE.CL, H2.CL, etc. The reason I discovered this in the H2.X case is that it's particularly easy to trigger response queue poisoning by accident in that scenario.

That said, I haven't actually tested this on very many live servers, for obvious reasons!


Burying the lede... Unfortunately, at the time of this whitepaper being published - 86 days after Apache was notified of the vulnerability - 2.4.49 had not yet come out, so although there's a patch on master, this is effectively a zero-day.


I'm not seeing anything about this in the nginx changelogs (or matching CVEs) either. Disabling HTTP/2 for now.


Why would an Apache bug show up in Nginx changelogs?


It would not, but given how widespread this issue is it seems likely nginx is vulnerable, and sensible to assume it is unless otherwise demonstrated.

It’s not like the feature is super useful in general.


Really good explanations here. As a Cloudflare Enterprise customer I've opened a case with them to see if any of these vulnerabilities apply to their WAF. I was a bit surprised to see the article mention that Imperva's WAF was vulnerable.

edit: Cloudflare has responded that "[they] are asking your question internally."


As far as I'm aware, none of the techniques referenced in this paper work on Cloudflare as-is. Of course I probably missed a bunch of variations, so it's still worth them doing some internal checks.


You can partially answer this question yourself. The issue can occur when HTTP/2 requests get translated into HTTP/1.1 calls. Your HTTP call stack from client to web app presumably looks something like this:

Client → (HTTP/2) → Cloudflare ingress → (???) → Cloudflare egress → (???) → Your endpoint → (???) → Your web app

In your endpoint (LB or webserver) logs, you can check whether the requests coming in from Cloudflare are using HTTP/2 or not. Even if they do use HTTP/2, Cloudflare might still internally translate down to HTTP/1.1 and back.

But more importantly, you should investigate how your LB or webserver talks to the web app. In many cases that part won't be TLS encrypted and therefore on an HTTP/1.1 channel.

The exception is if your outward-facing, TLS-terminating webserver interacts with the web app via CGI or FastCGI. While that part of the call stack would not be affected by this particular attack, (Fast)CGI comes with its own set of risks.

In the end, the only part whose susceptibility to this attack (i.e., having to translate between HTTP/2 and HTTP/1.1) you can't easily validate yourself is the bit that occurs internally to Cloudflare.


> But more importantly, you should investigate how your LB or webserver talks to the web app. In many cases that part won't be TLS encrypted and therefore on an HTTP/1.1 channel.

Or even HTTP/1.0. I found out recently, while inspecting some requests on the upstream server, that nginx was using HTTP/1.0 after terminating TLS. I was dumbfounded that this was still the default.

http://nginx.org/en/docs/http/ngx_http_proxy_module.html#pro...


Our parent company is an Akamai Enterprise user. We've opened a case with them as well.


The title seems very much anti-HTTP/2. However, unless I missed something huge, the vast majority, if not all, of the issues found relate to how hard it is to handle HTTP/1. And it seems kinda odd to blame HTTP/2 for HTTP/1 being difficult to implement.

I'm certainly not trying to downplay the seriousness of these issues. But it seems like an equally (if not more) valid title might be something like "HTTP/1: Continues to rear its ugly head" or something like that.


Hi, I'm the author. This paper is about HTTP/2, and the dangerous things that happen behind the scenes with HTTP/1 when people enable it. It's arguing that enabling HTTP/2 on your front-end makes your security worse if it's downgrading behind the scenes.

This is a deliberately cheeky title, selected as a presentation title, rather than being aimed at social media where many people will read the title but not the contents. For HN I'd go with something dry like "HTTP/2-exclusive attack vectors".


My employer's mandatory proxy downgrades all HTTP/2, presumably so they can better inspect traffic. This results in a bad/slow experience on many sites. I wish HTTP/2 had no downgrade path at all, so they would have to deal with it.


More than that, they relate to proxies. This sentence seems the crux to me:

> If you're coding an HTTP/2 server, especially one that supports downgrading, enforce the charset limitations present in HTTP/1 - reject requests that contain newlines in headers, colons in header names, spaces in the request method, etc.
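A rough sketch of what those checks might look like (my code, not the article's):

    def check_downgrade_safe(method: bytes, name: bytes, value: bytes):
        # enforce HTTP/1's charset limits before re-serializing an
        # HTTP/2 request; pseudo-headers (:path etc.) assumed handled
        if b"\r" in value or b"\n" in value or b"\0" in value:
            raise ValueError("newline/NUL in header value")
        if b":" in name or b" " in name or name != name.lower():
            raise ValueError("malformed header name")
        if b" " in method or b"\r" in method or b"\n" in method:
            raise ValueError("malformed request method")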


I imagine it also relates to things that really are "proxies" but are not normally thought of as such, like PHP-FPM, WSGI, etc. Those are also doing protocol and header translation, and are very ubiquitous.


What's wrong with it? It seems reasonable to protect yourself from header-injections allowed by a downgrade?

Am I misunderstanding?


I don't think there's anything wrong with it, it's great advice!


I guess this is more in line with the narrative that the older protocols people are more familiar with are somehow always better. That there is a hidden evil agenda for introducing something new. Or at least that we all need to pay a price for the sole benefit of a tech giant. It is a convenient narrative, which could be true, but it feels like too easy a conclusion. Which makes it good clickbait.


> It is a convenient narrative

It's also not the narrative the content of the article pursues ;-)


Yes, it is weird that a broken protocol translator results in blame for the newer protocol.

As far as I can tell, the article doesn't attack HTTP/2, which seems to work fine. The article clearly demonstrates the problems of HTTP/1. But the real problem is sloppy HTTP/2 front-ends that generate broken HTTP/1.


> Yes, it is weird that a broken protocol translator results in blame for the newer protocol.

From a look at the HTTP/2 spec, this protocol translation is an expected use case and explicitly results in several restrictions on the contents of HTTP/2 headers. So, going by the HTTP/2 spec, at least some of those headers should have been rejected by a conformant front end and never made it into HTTP/1.


So how is a collection of broken translators the fault of HTTP/2? The title says 'The Sequel is Always Worse'. It is not HTTP/2 that is bad. It is the translation to HTTP/1, problematic from a security point of view, that is the problem.


> It is the translation to HTTP/1, problematic from a security point of view, that is the problem.

Parts of the HTTP/2 spec specify what malformed headers look like, so this bug is almost entirely in the code not validating those HTTP/2 headers. This also isn't something that a few high-school dropouts got wrong in their side projects; several major services got it wrong. Given that the spec made at least some attempt to warn its implementers, maybe future standards need a security section titled "Important: Why ignoring security advice in standards is a bad idea".


The majority of the cases were requests that HTTP/2 clearly specifies as malformed but that were not treated as such. It's reasonable to expect there will be other such implementation vulns that don't stem from HTTP/1's historical baggage; that's just where the author was looking today. So I'm fine with the "HTTP/2 is worse on its own" presumption, because it's harder to implement.

Then again, the \r\n in headers can easily trip up an application that was naively ported to an HTTP/2-supporting network library. That's pretty ripe for blame shifting too :)


Thanks for this @albinowax_ - super interesting!

Just a quick question about the H2.CL case study. Did Netflix have a separate vulnerability where their server wrongly trusted Host headers with a netflix.com suffix and returned a 302 response which could redirect users to an arbitrary host? I'm just trying to see whether I'm interpreting the case study correctly.


Yes that's right. If the back-end received a request that didn't contain X-Forwarded-SSL or suchlike, and had a host-header that ended in .netflix.com, it would redirect you to the host-header.

I wouldn't exactly class this as a vulnerability - more of a useful gadget. The front-end would refuse to forward such a request so it's impossible to hit this code path without request smuggling. Even if you could hit it without request smuggling it would still be useless.


Given the long history of request parsing vulnerabilities in HTTP/1.1 servers and proxies, is HTTP/2 actually worse, or have most of the HTTP/1.1 bugs just been fixed already?


These vulnerabilities are all from badly-written HTTP/2 → HTTP/1.1 translations. Most of them come from simple carelessness, rookie errors that should never have been made, dumping untrusted bytes from an HTTP/2 value into the HTTP/1.1 byte stream. This is security 101, straightforward injection attacks with absolutely nothing HTTP-specific in it.

Some of them are a little more complex, requiring actual HTTP/2 and HTTP/1.1 knowledge (largely meaning HTTP/2 framing and the content-length and transfer-encoding headers), but not most of them.

Is HTTP/2 actually worse? Not in the slightest; HTTP/1.1 is the problem here. This is growing pains from compatibility measures as part of removing the problems of an unstructured text protocol. If you have a pure-HTTP/2 system and don’t ever do the downgrade, you’re in a better position.


I'd agree that HTTP/1 deserves a significant portion of the blame.

On the other hand, one maxim I've learned from my time bug hunting is that nobody ever validates strings in binary protocols. As such, I'm utterly unsurprised there are so many implementations with these kinds of bugs, and I'd say they could have been predicted in advance.

In fact… let's see… yep, they were predicted. Some of them, at least. In the HTTP/2 RFC, under Security Considerations, 10.3 'Intermediary Encapsulation Attacks' [1] describes one of the attack classes from the blog post, the one involving stuffing newlines into header names.

Does that mean something could have been done about it? Perhaps not. The ideal solution would be to somehow design the HTTP/2 protocol itself to be resistant to misimplementation, but that seems pretty much impossible. The spec already bans colons and newlines in header names, but there's no way to be sure implementations won't allow them anyway, short of actually making them a delimiter like HTTP/1 did – in other words, reverting to a text-based protocol. But a text-based protocol would come with its own share of misimplementation risks, the same ones that HTTP/1 has.

On the other hand, perhaps the bug classes could have been mitigated if someone designed test cases to trigger them, and either included them in conformance tests (apparently there was an official HTTP/2 test suite [2] though it doesn't seem to have been very popular), or set up some kind of bot to try them on the entire web. In principle you could blame the authors of HTTP/2 collectively for the fact that nobody did this. But I admit that's pretty handwavey.

[1] https://datatracker.ietf.org/doc/html/rfc7540#section-10.3

[2] https://github.com/http2/http2-test


> On the other hand, one maxim I've learned from my time bug hunting is that nobody ever validates strings in binary protocols.

I wonder how much of this has to do with the way strings need to be handled in the programming languages these protocols are implemented in. If dealing with strings seems to be even more of a danger (when done incorrectly), you might just not do it.


It's tough to say that something is a "rookie error" when basically every serious professional team makes the same mistake. This apparently broke every AWS ALB, for instance.


I am genuinely astonished at the number of implementations and major players that are experiencing problems here. I’ve done plenty of HTTP/1 parsing (most significantly in Rust circa 2014) and some HTTP/2 parsing in its earlier draft days, and I can confidently and earnestly state that my code (then and now) would never under any circumstances be vulnerable to the ones I’m calling rookie errors, because I’m always going to validate the user input properly, including doing any subsequent validation necessary in the translation layer due to incompatibilities between the versions, because I know it’ll blow up on me if I don’t do these things. Especially since all of this stuff has already been pointed out in the HTTP/2 RFC’s Security Considerations section, and you’re a fool to ignore such sections when implementing an IETF protocol. I’m not quite so confident about the attacks that depend on content-length and transfer-encoding, though I believe that any of my code that I wrote then or that I would write now would be safe.

It’s quite possible that my attitude to these sorts of things has been warped by using Rust, which both encourages proper validation and makes it easier and more natural than it tends to be in languages like C or C++. I’d be curious to see figures of these sorts of vulnerabilities in varying languages—I strongly suspect that they occur vastly less in Rust code than in C or C++ code, even when they’re not directly anything to do with memory safety.


An error that's extremely common among people doing their first work on a specific domain seems like a good fit for "rookie error".

It's easy to believe most professional teams make that mistake at some point. I'd hope that it's far more rare to make that mistake twice.


No, that doesn't make sense. The errors that trip seasoned pros up are very likely to trip rookies up as well. Words mean things; rookie mistakes are the mistakes that don't trip up the pros.


You're assuming the "pros" hired people with experience in the domain and retained them, and didn't let rookies make said mistakes.


Ah, the venerable "no true professional" argument. A sufficiently optimizing professional would never make these mistakes, it's true!


I would bet that a lot of these are not rookie errors; they are more akin to Spectre or Meltdown: inherently unsafe code where the risk was considered worth it for performance.

In general, when writing a high performance middle box, you want to touch the data as little as possible: ideally, the CPU wouldn't even see most of the bytes in the message, they would just be DMA'd from the external NIC to the internal NIC. This is probably not doable for HTTP2->HTTP1, but the general principle applies. In high-performance code, you don't want to go matching strings any more than you think is strictly necessary (e.g. matching the host or path to know where to actually send the packet).

Which is not to say that it wasn't a mistake to assume you can get away with this trade-off. But it's not a rookie error.


No, as I said most of these are absolutely trivial injection attacks from not validating untrusted inputs, being used to trigger a class of vulnerability that has been well-documented since at least 2005.


My point is that the code is doing the most performant thing: sending the values from A to B with as little bit twiddling as possible. They almost certainly failed to even consider that there are different restrictions between the 2 protocols that could pose security issues.


Is a new bucket leaking in a dozen places worse than an old one with all leaks fixed? I would say yes, until those holes in the new one are also fixed.


When I implemented an HTTP/2 server several years ago, it was all of the "fun" of HTTP/1.1 parsing and semantics, plus the extra challenges of the HTTP/2 optimizations such as HPACK, mapping the abbreviated headers to cached in-memory representations, stream management, and, if you supported Push Promises, then that too.


Hi, I'm the author - please let me know if you have any questions!


Just want to thank you for publishing this and taking the time to present it in such a clear manner.

As someone who has implemented HTTP/1.1 message parsing myself, it blows my mind that downgrade implementers could've made such basic mistakes as not validating the contents of headers. The companies selling security software with these mistakes should be held accountable, IMHO: the attacks are so simple and so numerous that they've probably been exploited extensively in the wild, and similar attacks almost certainly continue to be.


In light of these vulnerabilities, should we disable Internet-facing use of HTTP/2 wherever we can?


I think HTTP/2 is fine when it's used end to end. So if you've got a single webserver setup, or a reverse proxy that speaks HTTP/2 to the back-end, it's great.

However, if the only way you can use HTTP/2 is by having the front-end downgrade it to HTTP/1, I would recommend disabling it.


In fact it will depend on the implementations. As you've found, the root cause of the problem is that H1 and H2 use different delimiting techniques for protocol elements. Some implementations need to transcode H2 into an internal representation. In this case, and provided the protocol elements are checked in the intermediate representation, any protocol on the other side will be fine. But if the internal representation is H1-like with poor message delimitation, you could very well end up splitting requests internally and emitting 2 H2 requests on the backend for a single front H2 request for example.

I'd say that the main cause of these issues is the directive language used in the H2 spec, which almost always just says "this must be done like this" without giving the rationale for the rules, meaning that implementations that were unable to implement them exactly the same way were left with no hint about what the trouble was nor how to address it. The new spec in progress, and the extraction of the core semantics, makes the whole thing a lot cleaner.

In addition, H2 uses much more resources per connection than H1, which encourages coalescing connections between the proxy and the server. But coalescing connections can easily cause some head-of-line blocking if the coalesced front streams have different bandwidths. So even end-to-end H2 is not always a panacea.


That JIRA exploit is mind-bogglingly bad. Posts like this remind me that we work on a house of cards, not a fortress.


I've been burnt one too many times by trying to "upgrade" to HTTP/2.

It simply broke client certificate authentication and NTLM authentication, leaving only cookie-based authentication fully functional.

Can you guess which two of the three popular authentication mechanisms Google doesn't use?

It's not a protocol designed to advance the Internet, it's a protocol designed by Google to shave 1% off their network bill.


> Can you guess which two of the three popular authentication mechanisms Google doesn't use?

Google is hardly alone. Non-cookie-based authentication mechanisms aren't used (and aren't even usable) on any other public web site, either.

TLS client certificates have always been a UX nightmare. There is no standard UI for creating, installing, or selecting a certificate to use to authenticate to a web server; many browsers (especially on mobile devices) don't even support those features. There's no way to log out without closing the browser. There's no way for an average user to transfer a certificate from one device to another.

NTLM is simply irrelevant outside the scope of a Windows network. Other HTTP password mechanisms have many of the same failings as client certificates -- the UI is clunky and sometimes unavailable, and there's no standard way to log out.


> aren't used ... on any other public web site, either.

Internal-use web sites are a thing, however. A very common, heavily used thing. Many of them use NTLM or Kerberos authentication.

> TLS client certificates have always been a UX nightmare.

Which isn't an issue for APIs accessed over HTTPS, many of which are "public" yet use client certificates for authentication.


> There's no way to log out without closing the browser.

Well, with TLS client certs, each request is essentially similar to a new login, request, logout, so "logout" doesn't really make sense, unless you want it to mean "stop using the cert temporarily". Perhaps the UX for certs should be more like "this site wants you to log in": you press login in the browser UI, then all future requests are signed with your cert, until you click logout in the browser UI and then they aren't signed any more.


> TLS client certificates have always been a UX nightmare.

Indeed, it is very sad; they could be a great auth standard if only the client UX was acceptable.


> There's no way for an average user to transfer a certificate from one device to another.

Wouldn't you just enrol a new certificate from the new device?


How would you do that? The old device doesn't know anything about the new certificate, and the new device can't prove that it's the successor to the old device.


You use some other form of authentication, like a password, or clicking an emailed link, or clicking an "approve" button on the old device if available, or anything else that makes sense for your relationship with your users.


And how does the server remember that you've provided one of those other forms of authentication? Recall that the original goal was to be an alternative to cookies...


Same as how you verify the keys of another person, manually transferring the hex digits of the fingerprint, or something more user-friendly like QR codes or audio or BT data transmission.


"There's no way to log out without closing the browser. "

Itw like theu gave up halfway when inplementing it. Wtf


To be fair, NTLM authentication being broken has nothing to do with HTTP/2.


I always considered NTLM and Kerberos to be very similar, so I am surprised to hear that NTLM is broken.

Kerberos and HTTP/2 work just fine together.

What in particular is broken with NTLM?


Extending the sibling comment a bit: if you run it exclusively over HTTPS, with Firefox or some other browser that understands it shouldn't share your password with any site that asks, it's just a bug-prone, slow authentication mechanism.

But MS also has the "Negotiate" protocol, which they made to replace NTLM in IE (because even the IE team considered NTLM irremediably broken). That one is a variant of Kerberos, and a lot of people say just "NTLM" when they mean "Negotiate with a possible downgrade to NTLM".


At least when it was released, IIS 10 supported HTTP/2, but did not support Kerberos, Negotiate or NTLM authentication over HTTP/2.

https://docs.microsoft.com/en-us/iis/get-started/whats-new-i...

I’m not sure that it’s a problem with HTTP/2 that Microsoft found no reason to implement Windows authentication for it.


The HTTP/2 protocol specifically makes this kind of authentication difficult. It's not just IIS, in general it doesn't quite work right.


NTLM


"HTTP/2's binary format means you can't use classic general-purpose tools like netcat and openssl. HTTP/2's complexity means you can't easily implement your own client, so you'll need to use a library."

IOW, HTTP/2 is not designed for users. It is designed for online advertisers and the companies that serve them.

How do I know this? Because HTTP/1.1 pipelining still works great. I am using it on a separate task as I type this (I use it outside the browser).

It just doesn't work well for companies that profit from online advertising as a "business model", the so-called "ecosystem" of actors seeking to capture "eyeballs" and Hoover up data about users.

That includes people who design whiz-bang websites making dozens to hundreds of "automatic" requests not initiated by the user, many for arguably unnecessary external resources (or for telemetry), to gather data on the user and/or to serve her with ads.

The type of websites that drive HN readers nuts. That is not a problem with HTTP/1.1, that's a problem with the effects of online advertising, surveillance and greed run amok.

It also includes popular browsers supported by advertising. HTTP does not exist exclusively for a handful of corporate-sponsored browsers. It exists for all present and future clients that users wish to use to retrieve information from the web.

HTTP/2 is biased against users in favour of a web that exists only for online advertising (so it can keep filling Google's pockets with cash). It is a protocol that is so complex that users, not to mention the author of this blog post, "cannot easily implement clients for it."


HTTP/1.1 pipelining is fundamentally flawed and largely unused. There isn't some conspiracy not to use it - many, many projects evaluated it and pretty much all came to that same conclusion.


I am not a "project", nor is my goal the support of online advertising as a "business model", it's information retrieval. I'm a web user. I "evaluated" HTTP pipelining 20 years ago; it has always worked great for me. It is not "fundamentall flawed" for my purposes. If it did not work, I suspect servers would have disabled it by default ages ago. They didn't. Today, I use it daily. The fact is, the majority of websites I encounter enable HTTP/1.1 pipelining. Therefore I use it.


Which servers / sites support HTTP/1.1 pipelining? This seems somewhat hard to lookup.

https://en.wikipedia.org/wiki/HTTP_pipelining: says that proxies and web browsers mostly don't support it. It claims that it's easy for servers to support it, but provides no more details. It also mentions that curl removed pipelining support.

https://forum.nginx.org/read.php?2,269248,269249#msg-269249: Some person says that Nginx doesn't support pipelining. Other people agree. I can't find anything to contradict that.

https://serverfault.com/questions/266184/does-apache-webserv...: Someone says that Apache doesn't support pipelining. Again, I can't find anything to contradict that.

https://stackoverflow.com/questions/17299489/iis-and-http-pi...: This person says IIS doesn't support pipelining. Again, I can't find contradictory evidence.

Twisted used to support pipelining, but removed it: https://twistedmatrix.com/trac/ticket/8320

So, who exactly supports HTTP/1.1 pipelining?

It is true that if you fire multiple requests off to an HTTP/1.1 host without waiting for a response, you'll probably get responses back. The thing is that most hosts will process those requests one at a time. This is not pipelining - this is just processing requests one at a time as they come in. So, you're saving the latency required to get the request to the server - but not getting any benefit from parallel processing since the servers process the requests serially. With HTTP/2, however, at least some servers will actually process those requests in parallel - potentially with better performance.
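To make that concrete, here's roughly what "pipelining" against most hosts amounts to (an untested sketch; example.com stands in for a real host). The requests go out back-to-back on one connection, and the responses come back strictly in request order, whether or not anything was processed in parallel:

    import socket

    reqs = (
        b"GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n"
        b"GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n"
        b"GET /c HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
    )
    with socket.create_connection(("example.com", 80)) as s:
        s.sendall(reqs)                # three requests, one connection
        while chunk := s.recv(65536):  # responses arrive in request order
            print(chunk.decode("latin-1"), end="")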


This isn't a conspiracy - the problem with HTTP/1.1 pipelining is that responses have to go out in the same order as the requests that generated them. So, if a slow request comes in that generates a small amount of data, followed by 10 fast requests that generate a lot of data, the server has to hold all those big, fast responses in memory until the slow one finishes. That's bad for utilization - and it makes it easier to DOS a server. With HTTP/2, the server can respond immediately to whichever request finishes first - which is much better for utilization of server resources.


From a server's point of view there is very little difference between connection reuse and pipelining. The latter just means more data might already be in the input socket when the first request is handled; it will not be read yet, and will just sit around in socket buffers or some TLS buffer.

And I actually guess the person who mentioned pipelining just meant connection reuse.


I guess the difference is that the server is still free to return the "Connection: close" header on the first response and simply not read and/or process the other requests?

The client would then be expected to close the socket, possibly causing a TCP reset (if the server hadn't read all the data off the socket)


"And I actually guess..."

Your guess is incorrect. I mean sending multiple requests over a single TCP connection. Usually 100 or more.

When performing information retrieval, e.g., fetching a series of pages from the same host, I do not want out of order responses. I want the responses returned in order. I want the HTTP headers, too, as record separators and so I can be sure all requests were filled.

This sort of pipelining is not useful for websites that want to source myriad resources into their pages from external sources to serve ads, perform tracking and all that commercially-oriented stuff that is necessary for companies like Google to survive. HTTP/2 is useful for commercially-oriented use of the web. Surveillance and ads. I'm not interested in using the web that way.

HTTP/1.1 pipelining is useful for information retrieval, i.e., retrieving hundreds of resources from the same host without opening up hundreds of connections. That sort of bulk information retrieval is not compatible with online advertising and tracking. Thus, Google and the other companies supporting HTTP/2 have no interest in it. It benefits users, not advertisers.

I fully expect some nasty comments from "tech" workers whenever I bring up this topic. I am speaking from a user perspective, not a "developer" perspective.

In the early days of the web, opening up hundreds of connections at once would be poor etiquette (toward the server operator). Today, many so-called "engineers" do not know any other way. Not only do they do it to servers (e.g., asynchronous requests), they do it to clients, causing a user's browser to make hundreds of requests to different servers for a single web page. Looking at the Network panel in Developer Tools in a popular browser when accessing an "average" web page reveals the sheer insanity of today's so-called "web development". The other commenter has clearly never even used HTTP/1.1 pipelining, at least not consciously, and yet he wants to offer his opinion on it. Nice.

I use HTTP/1.1 pipelining every day. For example, I use it to retrieve bulk DNS data from DoH servers. The future of HTTP/2 is uncertain. HTTP/1.1 is not going away anytime soon.

I can test every website that is currently submitted to HN for pipelining support. I would bet the majority allow pipelining. If I wanted to retrieve a large number of pages from any of them, I could use pipelining to do it.

I get no benefits from HTTP/2 because I mainly use the web for non-commercial purposes. For that use, I do not use a popular browser. I do not wait for websites to "load" whilst they open dozens or even hundreds of connections to other servers. I retrieve HTML from the command line using netcat and haproxy. It's fast. I use a text-only browser to read HTML.

When using the web in the way I do, without seeing any ads, without all the automatically triggered requests to external servers, performance is not an issue. When using the web the way Google wants people to use it, chock full of ads, then performance is an issue and something like HTTP/2 makes sense. Thus, how one uses the web matters. One size does not fit all.


There’s a couple of golang servers that implement it to make their benchmarks look better.


"This is not pipelining - this is just processing requests one at a time as they come in."

That's incorrect. This is pipelining as it is defined in RFC 2616.

https://tools.ietf.org/html/rfc7230#section-6.3.2

It is not latency we are trying to save with HTTP/1.1 pipelining, it is server resources, namely the number of simultaneously open connections. (See Section 6.4)

RFC 2616 does not require parallel processing. It's optional.

Personally, I do not care about parallel processing. I want the responses returned in order. I get satisfactory performance from FIFO. That's because I only request resources from the domain I type into the computer.

A website that allows ads to be served from a variety of domains that the user never typed might have a problem with performance. However that is the web developer's problem, not the user's. Online ads are optional. There is nothing in RFC2616 that requires online advertising. The web works great without ads, and that is how I use it.

HTTP/2 is designed to serve the goals of companies that assist with online advertising. Google and others. Automatically triggering requests for ads from third party domains through webpages is a particular type of web usage, promoted by "tech" companies to support their online advertising "business model", but it is not the only type of web usage. It has performance problems. Go figure.

There is nothing to indicate any person outside of the "tech" industry is interested in this type of web use. How many users intentionally request ads? None. No user ever types in the domain of an ad server.

Requesting many small resources from the same domain, i.e., the domain the user types into the computer, using pipelined requests generally does not suffer from performance problems. It is fast and efficient for the types of web use that are not requesting ads from third party domains. Not to mention it is far more energy efficient.


"Which servers / sites support HTTP/1.1. pipelining?"

There are probably millions. IME, over 20 years of using pipelining, it is quite rare to find ones that don't. Here is a simple example.

Download the k-tree transcript archive from stackexchange.com

1116 HTTP requests, 1 TCP connection

26MB download

        # TLS client tunnel: plaintext HTTP on 127.0.0.77:80 is forwarded
        # to chat.stackexchange.com (198.252.206.29:443) over TLS 1.3
        stunnel -fd 0 << eof
        debug=debug
        pid=$HOME/1.pid
        foreground=no
        [ x ]
        accept=127.0.0.77:80
        client=yes
        connect=198.252.206.29:443
        options=NO_TICKET
        options=NO_RENEGOTIATION
        renegotiation=no
        sni=
        sslVersion=TLSv1.3
        eof
        # generate the 1116 transcript URLs, convert them to pipelined
        # HTTP/1.1 requests with a.out (built from 1.l below), and send
        # them all over a single TCP connection
        export Connection=keep-alive;
        sh -c "$(sh -c "$(seq -f "seq -f 'seq -f "https://chat.stackexchange.com/transcript/90748/%g/%%g/%%%%g" 1 31' 1 12" 2019 2021)")" \
        |a.out \
        |nc -vvn 127.77 80 > 1.htm
        cd;kill $(cat 1.pid)

        # flex source for the URL-to-pipelined-request converter; builds a.out
        cat > 1.l

        int yy_get_next_buffer();
        int fileno(FILE *);
        int setenv (const char *, const char *, int);
        int dprintf(int, const char *__restrict, ...);
        #define httpMethod "GET"
        #define httpVersion "1.1"
        #define Host ""
        #define jmp BEGIN
        #define Y(x,y) fprintf(stdout,x,y)
        int count,path,ka;
        int httpheaders(){
          setenv("httpMethod",httpMethod,0);Y("%s ",getenv("httpMethod"));
          Y("%s HTTP/",getenv("Path"));
          setenv("httpVersion",httpVersion,0);Y("%s\r\n",getenv("httpVersion"));
          if(0==setenv("Host","",0))Y("Host: %s\r\n",getenv("Host"));
          if(getenv("Connection"))Y("Connection: %s\r\n",getenv("Connection"));
          fputs("\r\n",stdout);
          return 0;}
    %option nounput noinput
    %s xa xb xc
    xa "http://"|"https://"
    xb [-A-Za-z0-9.:]*
    xc [^#'|<> \r\n]*
    %%
    ^{xa} count++;setenv("Scheme",yytext,0);jmp xa;
    <xa>{xb} setenv("Host",yytext,1);if(!getenv("Host"))setenv("Host",Host,0);jmp xb;
    <xb>\n path=0;setenv("Path","/",0);httpheaders();jmp 0;
    <xb>{xc} path=1;setenv("Path",yytext,1);httpheaders();jmp 0;
    .|\n
    %% 
       int main(){yylex();exit(0);}
       int yywrap(){if(count>1){
         fputs("GET /robots.txt HTTP/1.1\r\n",stdout); 
         Y("Host: %s\r\n",getenv("Host")); 
         fputs("Connection: close\r\n",stdout);
         fputs("\r\n",stdout);};
         exit(0);}

      ^D

     flex 1.l
     cc -std=c89 -Wall -pedantic -static -pipe lex.yy.c


Note that in the quote above "use" refers to security researchers trying to send malformed traffic.

The only usability difference between HTTP/2 and HTTP/1(.1) is that it's easier to write a broken implementation of a quarter of HTTP/1 and still do some useful things with it.

A full, production-grade HTTP/1.1 client or server is more or less as complex as an HTTP/2 client or server. HTTP/2 is actually easier to implement at the base level, since it's much easier to work with binary protocol formats than with the horrible string encoding of HTTP/1. HTTP/2 does add a lot of complexity with the stream multiplexing features afterwards; more or less enough to match the endless headache of deciphering HTTP/1.1 requests and arcane connection headers.


In the reddit thread someone posted the recording: https://www.youtube.com/watch?v=rHxVVeM9R-M


Wow. A fantastic set of descriptions, service, and detection tooling from PortSwigger and colleagues.

The fact that HTTP/1 downgrading is so prevalent and unlikely to change in the short term is the point.


I'd love to see a post from someone about their experiences using H2 between servers (i.e., instead of H1). Many big companies already use TLS internally, and I can imagine perf wins.


I'm a bit confused how these attacks can be used.

Inconsistencies between HTTP/1.1 and HTTP/2 allow for these behaviours, but you still need to inject these malicious prefixes into a user's request flow, right?

Can you actually set the Content-Length header from a browser on a HTTP/2 request? And isn't the suffix just ignored by the final remote (or does this depend on HTTP/1.1 pipelining?)


Hi, I'm the author. This paper is building on techniques explained here: https://portswigger.net/research/http-desync-attacks-request...

This attack does not require a MITM - the attacker would use a tool like Burp Suite to issue the (technically RFC-violating) HTTP request. The prefix injection happens because front-end places the attacker's request and the victim's request on the same HTTP/1.1 connection to the back-end, as shown in this diagram: https://portswigger.net/cms/images/9c/c1/4c32-article-http2-...

> isn't the suffix just ignored by the final remote

The back-end treats the suffix as the start of the next request, due to TCP buffering. The vast majority of servers have this kind of accidental pipelining support thanks to TCP.


The target's own front-end proxy acts as the man in the middle!


> This means there's little room for ambiguity about the length of a message, and might leave you wondering how desync attacks using HTTP/2 are possible. The answer is HTTP/2 downgrading.

I would guess a huge percentage of attacks are made possible because of protocol or algorithm downgrading. I wonder if building downgrade abilities into protocols is a security smell.


The attacks shown aren't a true case of protocol downgrading, but of an overeager direct translation of request headers across protocols. The attacks weren't possible because the client was doing some backhanded protocol negotiation, but because the client/front/back were very directly connected.

If, instead of forwarding a direct translation of the headers, the front ends calculated the relevant requests and sent that, there wouldn't have been any HTTP header attacks (it would have been much slower though).


I'm not a security person at all. Is there a quick explanation for how these attacks allow the attacker to mess with other clients' requests?

I'm also confused at their usage of the word "prefix". Did they mean "suffix"? Or does "prefix" have a specific meaning in this case?


The next client's request gets sent after your request. If you get frontend and backend servers to desync, the backend gets to your first request, responds, and the response is sent to you; then the backend gets your second request + the next client's request, responds, and the response is sent to that next client.

This means you are making other clients get responses for their request prefixed by your smuggled second request.


So if AWS ALB HTTP/2 listener -> HTTP/1.1 target downgrade was vulnerable, and the recommendation is to use HTTP/2 end to end... am I reading the AWS docs [1] correctly that ALB only supports HTTP/2 -> HTTP/1.1 downgrades, but not HTTP/1.1 -> HTTP/2 upgrades? In other words, using HTTP/2 as the target group protocol only works for HTTP/2 clients, and HTTP/1.1 clients will receive an error?

[1] https://docs.aws.amazon.com/elasticloadbalancing/latest/appl...


I can't believe this, http and the internet as a whole is basically getting horsefucked



