Httpoxy – A CGI application vulnerability (httpoxy.org)
153 points by omribahumi on July 18, 2016 | 46 comments



Phusion Passenger author here. Ruby and Python apps deployed through Passenger are not vulnerable to this issue, despite Rack and WSGI using CGI-style names. That's because Passenger does not actually set OS environment variables. It merely passes a hash table/dictionary to the application with CGI-style keys.
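A minimal Python sketch of the distinction described above. The `app` function and the hand-built `request_environ` dict are hypothetical stand-ins for what a WSGI server hands to an application; this is not Passenger's code. It only illustrates that a CGI-style key in a per-request dictionary never reaches the process environment that HTTP client libraries consult.

```python
import os

def app(environ, start_response):
    """Hypothetical WSGI app that reports where it can see HTTP_PROXY."""
    seen_in_request = environ.get("HTTP_PROXY")
    seen_in_process = os.environ.get("HTTP_PROXY")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [f"request={seen_in_request} process={seen_in_process}".encode()]

# Start from a process environment without HTTP_PROXY (assumption for the demo).
os.environ.pop("HTTP_PROXY", None)

# Simulate a request that carried "Proxy: http://evil.example:8080".
request_environ = {
    "REQUEST_METHOD": "GET",
    "PATH_INFO": "/",
    "HTTP_PROXY": "http://evil.example:8080",  # lives in the dict only
}

body = b"".join(app(request_environ, lambda status, headers: None))
print(body.decode())
```

The request dictionary contains the attacker's value, but `os.environ` does not, so a library that reads the process environment has nothing to pick up.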


I must say I really appreciate your hard work FooBarWidget. It seems, to me, that anytime something like this gets brought up, you've already addressed/fixed the problem or never even had it in the first place.

I've deployed a few large apps using Phusion Passenger that, years later, are still running with incredible stability. Thank you!


Another fancy bug website for the list! https://github.com/KeenRivals/Bugsite-Index


Perl CGI applications using Taint mode would be unaffected (unless they intentionally break the tainted variables). I've always wondered why other languages never implemented a similar "security" feature. http://perldoc.perl.org/perlsec.html#Taint-mode


libwww-perl has been unaffected since 2001



That's pretty crazy assembly syntax... It looks like "AX" actually references "EAX" (in 32bit mode) and "RAX" (in 64bit mode), so neither Intel nor AT&T syntax. Wonder when we'll see the first crypto bug in ported code that mixed up AX/EAX/RAX...


> That's pretty crazy assembly syntax... It looks like "AX" actually references "EAX" (in 32bit mode) and "RAX" (in 64bit mode), so neither Intel nor AT&T syntax.

Yeah it's plan9 assembly which is its own thing (just as plan9 "C" was its own thing): http://plan9.bell-labs.com/sys/doc/asm.html


It's actually Go assembly, which (while being most similar to plan9 assembly) is also its own thing: https://golang.org/doc/asm


AX is just the lower 16 bits of EAX, so all correct. This is GAS / AT&T syntax.


From the context it's obvious that 0x0 is right. AX must be 32 bits (or 64 bits) wide.

https://github.com/golang/go/blob/fad2bbdc6a686a20174d2e73cf...


The L suffix of the MOVL says it's using 32 bits of the register. You move a byte with MOVB and use CMPQ to compare the full 64 bits.

You see this too with AT&T syntax, sometimes.


> RFC 3875 (CGI) puts the HTTP Proxy header from a request into the environment variables as HTTP_PROXY

I don't get it. So if I use CGI and, from my code, query the env variable named 'HTTP_PROXY', I will get what was set by the request header Proxy, and not an environment variable 'HTTP_PROXY' as defined by the system the CGI executable is running on?

Edit:

I looked at CGI

https://en.wikipedia.org/wiki/Common_Gateway_Interface

I don't understand why this protocol doesn't just pass information as an argument to the CGI script. Why does it have to use environment variables?


Arguments would be trickier, as escaping/quoting requirements aren't universal, and some environments (especially in 1993) have a pretty limited argument space. There is a non-parsed-header mode for CGI that passes the headers through to the CGI on stdin before the body, which is maybe a better way. Environment variables are also convenient if you only care about headers deep inside many layers of code -- it would be tricky to parse the command line there, but grabbing the environment variables is simple.


Script arguments are not standardized (various ways to access and different conventions). Environment variables are pretty universal. It's one of the reasons that the 12-Factor App guide recommends env vars for configuration.


If your question is regarding the precedence of the Proxy header vs. an environment variable, I'm not sure which of the two would win, but even if it's the latter, anyone who doesn't set it (which is probably the majority) would still be vulnerable.


CGI implementations will take the value of the Proxy header and clobber whatever might have originally been set in the environment at HTTP_PROXY.
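A minimal Python sketch of that clobbering. `cgi_environ` is a hypothetical helper, not any real server's code; it only applies the RFC 3875 naming rule (upper-case the header name, replace "-" with "_", prepend "HTTP_"). The host names and proxy URLs are made up.

```python
def cgi_environ(base_env, headers):
    """Build a CGI child environment from the server's own environment
    plus the request headers, using the RFC 3875 naming rule."""
    env = dict(base_env)
    for name, value in headers.items():
        env["HTTP_" + name.upper().replace("-", "_")] = value
    return env

# The administrator configured an outgoing proxy for the server process.
server_env = {"PATH": "/usr/bin", "HTTP_PROXY": "http://corp-proxy.internal:3128"}

# The attacker sends an ordinary request with a "Proxy" header.
request_headers = {"Host": "example.com", "Proxy": "http://evil.example:8080"}

child_env = cgi_environ(server_env, request_headers)
print(child_env["HTTP_PROXY"])
```

Because both values land on the same key, the header value replaces whatever the system had set.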


Passing data as arguments is even more dangerous, especially when the program is a shell.


Good point. Well, at least namespace the data correctly, or dump the request (headers + body) into a single CGI env variable.


CGI did namespace these headers correctly: header X becomes HTTP_X. The problem is that the outgoing proxy config variables aren't PROXY_X, they're X_PROXY. (Other env variables from CGI aren't namespaced, however.)

Dumping the headers into a single env variable means a) you need to have support for pretty big environment variables which probably wasn't realistic in 1993; b) every program that wants to use headers needs to know how to parse them, which is tricky.

Dumping the body into an env variable is bad because any reasonably sized body is not going to fit, and you really want the program to be able to start working on it before it completes.
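As a rough Python illustration of the streaming point: CGI delivers the request body on stdin and its size in CONTENT_LENGTH, so a script can process it piece by piece. `read_body_chunks` is a hypothetical helper, and the `BytesIO` object stands in for `sys.stdin.buffer` so the sketch runs anywhere.

```python
import io

def read_body_chunks(stream, content_length, chunk_size=8192):
    """Yield the request body in chunks instead of holding it all at once."""
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # stream ended early
            break
        remaining -= len(chunk)
        yield chunk

# Stand-in for a 20000-byte body arriving on stdin.
fake_stdin = io.BytesIO(b"x" * 20000)
sizes = [len(chunk) for chunk in read_body_chunks(fake_stdin, 20000)]
print(sizes)  # [8192, 8192, 3616]
```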


That's how CGI works.

> pass information as an argument to the CGI script

Elaborate?


Correct.



It says that API requests using TLS are not vulnerable. However, many applications won't do the appropriate certificate checking. If HTTP_PROXY is set to a MITM proxy, the attack can still succeed.


The reason why they aren't affected is that you need to set HTTPS_PROXY for https://
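A small Python sketch of that per-scheme lookup. `proxy_for` is a hypothetical helper that follows the common `<SCHEME>_PROXY` convention; real HTTP clients differ in the details. Note that a request header can never produce `HTTPS_PROXY`, since every header-derived variable starts with `HTTP_`.

```python
from urllib.parse import urlsplit

def proxy_for(url, env):
    """Pick a proxy for `url` by scheme, using the <SCHEME>_PROXY convention."""
    scheme = urlsplit(url).scheme
    return env.get(scheme.upper() + "_PROXY")

# What a malicious "Proxy" header can inject under CGI.
env = {"HTTP_PROXY": "http://evil.example:8080"}

http_choice = proxy_for("http://api.example/v1", env)
https_choice = proxy_for("https://api.example/v1", env)
print(http_choice)   # http://evil.example:8080
print(https_choice)  # None
```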


I could imagine not all applications doing that, though.



You can go even further back to 2001 when this was first noticed:

http://www.nntp.perl.org/group/perl.libwww/2001/03/msg2249.h...



WPEngine with a not-so-reassuring response: "We're aware of it, but no posts have been made thus far. We will update our blog or email customers if we feel that there's anything to be concerned about."

(To be fair, this was probably a low-level support engineer, so probably not that "official" of a response)


I always wondered why the proxy configuration env variable is usually http_proxy, not HTTP_PROXY, even though env variables are usually uppercase. This made it clear why.

Moreover, curl has http_proxy, but also HTTPS_PROXY, FTP_PROXY, ..._PROXY, ALL_PROXY and NO_PROXY.


Yeah. But the really interesting thing for me was seeing that when Curl originally fixed it in 2001, they admitted the fix might not work for "Windows NT" (where environment variables are case-insensitive).

From our testing, we could get getenv in mod_php to return HTTP_PROXY when you ask for getenv('http_proxy') (seems to happen in the apr stuff?) - but that didn't affect PHP's libcurl extension, which made it a whole lot less interesting.

But yeah, if you're running curl itself under CGI with case-insensitive env vars you might still be in trouble.
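For comparison, Python's standard library handles the upper/lower-case question in a similar spirit. The sketch below assumes a CPython release that includes the httpoxy fix (CVE-2016-1000110) and a platform where environment variable names are case-sensitive; `proxies_with` is a hypothetical wrapper that temporarily swaps the process environment so the demo is repeatable.

```python
import os
from unittest import mock
from urllib.request import getproxies_environment

def proxies_with(env):
    """Return the proxies urllib would read from exactly this environment."""
    with mock.patch.dict(os.environ, env, clear=True):
        return getproxies_environment()

injected = {"HTTP_PROXY": "http://evil.example:8080"}

# Outside CGI the upper-case variable is honoured.
outside_cgi = proxies_with(injected)

# REQUEST_METHOD signals a CGI context, so upper-case HTTP_PROXY is dropped.
inside_cgi = proxies_with({**injected, "REQUEST_METHOD": "GET"})

# The lower-case spelling cannot come from a header, so it is still used.
lowercase_in_cgi = proxies_with(
    {"http_proxy": "http://corp-proxy.internal:3128", "REQUEST_METHOD": "GET"}
)

print(outside_cgi, inside_cgi, lowercase_in_cgi)
```

On a platform with case-insensitive environment variables the two spellings collapse into one, which is the caveat raised above.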


The real lesson here is that environment variables can't be blindly trusted and that they can be controlled by an attacker unless proven otherwise. They should be treated as untrusted input that needs to be validated.

If you start a new process, always be explicit about the environment variables you want to pass on. Don't just let the subprocess inherit your environment variables.

If you write a library, do not rely on environment variables unless the user of the library has explicitly opted-in to that.
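One way to follow the subprocess advice in Python is an explicit allowlist, sketched below for a POSIX system. The variable names in `ALLOWED` are an assumption chosen for this demo, and the parent's `HTTP_PROXY` is set by hand to play the attacker-controlled value.

```python
import os
import subprocess
import sys

# Pretend the parent process ended up with an attacker-controlled value.
os.environ["HTTP_PROXY"] = "http://evil.example:8080"

# Copy only the variables the child is known to need (assumed list).
ALLOWED = ("PATH", "LANG", "LD_LIBRARY_PATH")
child_env = {name: os.environ[name] for name in ALLOWED if name in os.environ}

result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('HTTP_PROXY'))"],
    env=child_env,        # explicit environment, nothing inherited implicitly
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # None
```

The child never sees `HTTP_PROXY` because it was not on the list, whatever the parent's environment contains.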


CGI prefixes environment variables with "HTTP_" (for HTTP headers); the problem is that this prefix is not unique.


Hah, I've found that on my development machine I have to set `http_proxy`, `HTTP_PROXY`, `https_proxy` and `HTTPS_PROXY` for everything to work right.

This issue is pretty bad though.


Somewhat related, but not security-related: Quite a few HTTP libraries take the HTTP_USER_AGENT environment variable and insert it as the User-Agent header on outgoing requests. When used in a CGI script, this results in the upstream User-Agent header being forwarded, which is probably not what was intended.


This must be the most elegant vulnerability I've seen in ages. It almost feels like the pieces are interacting as designed. You politely ask the server to use a proxy -- and it does!


An extra problem here is many app-embedded HTTP clients are configured to ignore HTTPS certificate validation.


Not sure this is an issue as you don't make http requests TO the app.


Off-topic but this site should enable gzip encoding for CSS; this would save 83% on the 113 kB main.css…


How on Earth does somebody even have 113 KB of CSS?


  .highlight .sb {
    color: #ec490f; }
  .highlight .sc {
    color: #ec490f; }
  .highlight .sd {
    color: #ec490f; }
  .highlight .s2 {
    color: #ec490f; }
  .highlight .se {
    color: #ec490f; }
  .highlight .sh {
    color: #ec490f; }


I removed the stylesheet in devtools, and the site was way more readable, and basically the same in terms of organisation.


How do you remove an asset file for a web page via Chrome dev tools?


Deleting the `link` tag is one way to do it.


Deleting the link tag (the website had a single style sheet) also undoes its effects. I'm on a webkit browser.



