That's great that you are considering this more now.
But the xz story taught us that every contributor is dangerous, and the most dangerous ones are probably the most helpful and most skilled contributors. If someone barely gets a PR accepted, they probably lack the skills to add a sophisticated backdoor.
Another thing that was not talked about a lot: There are many ways to compromise existing maintainers. Compromising people is the core competency of intelligence, happens all the time, and most cases probably never come to public knowledge.
> consider the route of getting "kompromat" on a developer to make them "help" them
I suppose that’s an option, but it also introduces an additional risk of exposure for your operation as it doesn’t always work and makes it much more complicated to manage even when it does work.
They might not even use blackmail, they might just "help out" in a difficult financial situation. Some people are in severe debt, have a gambling problem, are addicted to expensive drugs, or might need a lot of money for a sick relative. There are many possibilities.
The trick is finding the people that can be compromised.
I think you're going overboard on what's required. Take anybody who is simultaneously offered a substantial monetary incentive (let's say 4 years of total current/vesting comp), and also threatened with the release of something that we'll say is little more than moderately embarrassing. And this dev is being asked to do something that carries basically zero risk of consequences/exposure for himself due to plausible deniability.
For instance, this is the Heartbleed bug: "memcpy(bp, pl, payload);". You're copying (horrible naming conventions) payload bytes from pl to bp, without ensuring that the size of pl is >= payload, so an attacker can trivially get random bytes from memory. Somehow nobody caught one of the most blatant over-read vulnerabilities, even though memcpy calls are likely one of the very first places you'd check for this exact issue. Many people think it was intentional because of this, but obviously there's zero evidence, because it's basically impossible for evidence for this to exist. And so accordingly there were also 0 direct consequences, besides being in the spotlight for a few minutes and having a bunch of people ask him how it felt to be responsible for such a huge exploit. "It was a simple programming mistake" ad infinitum.
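To make the missing check concrete, here's a toy model in Python. It is not OpenSSL code: the "memory" layout, the secret, and the function names are all made up for illustration. It only shows why trusting an attacker-supplied length leaks whatever sits next to the buffer.

```python
# Toy model of a heartbeat over-read. "memory" stands in for process memory;
# every name here is hypothetical and none of this is OpenSSL code.
SECRET = b"PRIVATE-KEY-MATERIAL"


def make_memory(request_payload: bytes) -> bytes:
    # The request payload happens to sit right next to unrelated secret data.
    return request_payload + SECRET


def heartbeat_vulnerable(request_payload: bytes, claimed_len: int) -> bytes:
    memory = make_memory(request_payload)
    # Mirrors memcpy(bp, pl, payload): trusts the attacker-supplied length.
    return memory[:claimed_len]


def heartbeat_checked(request_payload: bytes, claimed_len: int) -> bytes:
    # The missing check: the claimed length must not exceed what was sent.
    if claimed_len > len(request_payload):
        return b""  # drop the malformed request
    memory = make_memory(request_payload)
    return memory[:claimed_len]


leaked = heartbeat_vulnerable(b"hi", 10)  # 2 real bytes plus 8 bytes of SECRET
safe = heartbeat_checked(b"hi", 10)       # rejected, nothing echoed back
print(leaked, safe)
```

The whole difference is one comparison, which is exactly why it's so hard to tell malice from a mistake here.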
So, in this context - who's going to say no? If any group, criminal or national, wanted to corrupt people - I really don't think it'd be hard at all. Mixing the carrot and the stick really changes the dynamics vs a basic blackmail thing where it's exclusively a personal loss (and with no guarantee that the criminal won't come back in 3 months to do it again). To me, the fact we've basically never had anybody come forward claiming they were a victim of such an effort means either that no agency (or criminal organization) anywhere has ever tried this, or that it works essentially 100% of the time.
Absolutely. And that's the point I'm making here. It is essentially impossible to discern between an exploit injected due to malice, and one injected due to incompetence. It reminds one of the CIA's 'simple sabotage field manual' in this regard. [1] Many of the suggestions look basically like synopses of Dilbert sketches, written about 50 years before Dilbert, because they all happen, completely naturally, at essentially any organization. The manual itself even refers to its suggestions as "purposeful stupidity." You're basically exploiting Hanlon's Razor.
I suppose the point is that even though any given instance of an error like this is overwhelmingly likely to be an innocent mistake, there is some significant probability that one or two such instances were introduced deliberately with plausible deniability. Although this amounts to little more than the claim that "sneaky people might be doing shady things, for all we know", which is true in most walks of life.
> They might not even use blackmail, they might just "help out"
If the target knows or suspects what you’re asking them to do is nefarious then you still run the same risk that they talk before your operation is complete. It’s still far less risky to avoid tipping anyone else off and just slip a trusted asset into a project.
> “I am so and so of the Egyptian intelligence service and would like to blackmail you”
No, but practically by definition the target has to know they’re being forced to “help” and therefore know someone is targeting the project. Some percentage of the time the target comes clean about whatever compromising information was gathered about them, which then potentially alerts the project to the fact they’re being targeted. When it does work you have to keep their mouth shut long enough for your operation to succeed which might mean they have an unfortunate accident, which introduces more risks, or you have to monitor them for the duration which ties up resources. It’s way simpler just to insert a trusted asset into a project.
From what I know of today's developer culture the solution will be for one company, probably Microsoft given their ownership of GitHub, to step in and become undisputed king and single point of failure for all open source development. Developers will say this is great and will happily invite this, with security people repeating mantras about how securing things is "hard" and "Microsoft has more security personnel than we do." Then MS will own the whole ecosystem. Anyone objecting will be called old or a paranoid nut. "This is how we do things now."
As a positive counterexample, the US recently reduced federal funding for the program which manages CVEs [1]. There was/is risk of CVE data becoming pay-for-play, but OSS developers have also pushed for decentralization [2]. A recent announcement is moving in the right direction, https://medium.com/@cve_program/new-cve-record-format-enable...
The CVE Board is proud to announce that the CVE Program has evolved its record format to enhance automation capabilities and data enrichment. This format, utilized by CVE Services, facilitates the reservation of CVE IDs and the inclusion of data elements like CVSS, CWE, CPE, and other data into the CVE Record at the time of issuing a security advisory. This means the authoritative source (within their CNA scope) of vulnerability information — those closest to the products themselves — can accurately report enriched data to CVE directly and contribute more substantially to the vulnerability management process.
> solution will be for one company, probably Microsoft given their ownership of GitHub, to step in and become undisputed king and single point of failure for all open source development.
A single vendor solution would be unacceptable to peer competitors who also depend on open-source software. A single-foundation (like LF) solution would also be sub-optimal, but at least it would be multi-vendor. Long term, we'll need a decentralized protocol for collaborative development, perhaps derived from social media protocols which support competing sources of moderation and annotation.
In the meantime, one way to decentralize GitHub's social features is to use the GH CLI to continually export community content (e.g. issue history) as text that can be committed to a git repository for replication. Supply chain security and identity metadata can then be layered onto collaboration data.
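A rough sketch of that export, assuming the `gh` CLI is installed and authenticated. The one-file-per-issue layout, the file naming, and the function names are my own choices for illustration, not an established convention.

```python
import json
import subprocess
from pathlib import Path


def fetch_issues(repo: str, limit: int = 1000) -> list:
    """Ask the GitHub CLI for issues as JSON (needs an authenticated `gh`)."""
    cmd = [
        "gh", "issue", "list",
        "--repo", repo,
        "--state", "all",
        "--limit", str(limit),
        "--json", "number,title,state,createdAt,body",
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def write_issues(issues: list, out_dir: Path) -> list:
    """Write one plain-text file per issue so that git diffs stay readable."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for issue in sorted(issues, key=lambda i: i["number"]):
        path = out_dir / f"issue-{issue['number']:05d}.txt"
        text = (
            f"Title: {issue['title']}\n"
            f"State: {issue['state']}\n"
            f"Created: {issue['createdAt']}\n\n"
            f"{issue.get('body') or ''}\n"
        )
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Something like `write_issues(fetch_issues("OWNER/REPO"), Path("issues"))` run on a schedule, followed by an ordinary `git add` and `git commit`, would give every clone a copy of the issue history. Comments, PR reviews and discussions would need their own export.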
It can be a stepping stone towards a world in which we use sandboxing and (formal) verification to safeguard against cultural degradation. There's no alternative, too many bad actors are roaming about. I hate that as much as the next guy :(
> If someone barely get's a PR accepted, they probably lack the skills to add a sophisticated backdoor.
That's true, but it's also true that a sophisticated and well formed PR is probably genuine too. Hostile PRs are the exception rather than the rule. And if only the high quality PRs are treated with suspicion, then the attackers will tailor their approach to mimic novices. General vigilance is required, but failure is likely because these attacks are so rare that maintainers will grow weary of being paranoid about a threat they've never seen in years of suspicion and let their guard down.
Earlier this year, I received a hostile PR for a "maintenance only" JavaScript authentication library with fewer than 100 stars but which is actively used by my employer.
It added a "kinda useful but not really needed" feature and removed an unrelated line of code, thereby introducing a minor security vulnerability.
My suspicion is that these low quality PRs are similar to the intentional typos in spam emails: Identify projects/maintainers who are sloppy/gullible enough and start getting a foot in the door.
But source-not-available proprietary systems are just totally hopeless from this point of view. Of course an intelligence agency could slip something in. A bored developer at the company could too. Users of this sort of proprietary system have just chosen to have 100% faith for some incomprehensible reason.
I don't think that's relevant to this discussion though, as open- and closed-source subversion would seem to follow really different paths.
- Open-source subversion has the big advantage of having the code, testing and build processes in the open which allows for the attack surface to be exhaustively studied, whereas closed source requires code exfil, reverse engineering, inside intel on processes etc.
- Closed-source subversion can hide in other places -- binaries can be corrupted on a compromised server etc. Seeking to influence the code-based development seems like the hardest road IMO.
- Open-source maintenance (at least the kind under discussion here) stops at the maintainer, whereas most corporate dev is in a hierarchy with non-uniform commit authority. None of the same social techniques would apply.
One follow up to compromising existing maintainers: This makes the creators or long-term good faith maintainers maybe even more "dangerous" than new maintainers.
Who can share a threat model with specific probability estimates on this? FWIW, I’m less interested in the particular estimates (priors) and more interested in the structure.
>If someone barely get's a PR accepted, they probably lack the skills to add a sophisticated backdoor.
Unfortunately it's easy to sandbag being dumb. Just because someone submits a PR defining constants for 0-999 does not mean they're actually bad at programming.
Yes! Anytime you see a function signature like "int timeout", it's safe to assume that the unit is in femtoseconds and pass a gigantic number while you curse out the incompetence of the developer. Either name your variables correctly (timeoutZeptoseconds), or use a proper data type (like a Duration or Period in Java, TimeSpan in C#, or a user-defined literal in C++).
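For what it's worth, the same idea in Python is `datetime.timedelta`. A minimal sketch (the `wait_for` helper is hypothetical, made up for this example) where the unit lives in the type rather than in the caller's guess:

```python
import time
from datetime import timedelta


def wait_for(condition, timeout: timedelta,
             poll: timedelta = timedelta(milliseconds=10)) -> bool:
    """Poll `condition` until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout.total_seconds()
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll.total_seconds())
    # One last check after the deadline has passed.
    return condition()


# The call site states its own unit, so nobody has to guess.
ready = wait_for(lambda: True, timeout=timedelta(seconds=2))
```

Passing a bare `5` fails immediately with an `AttributeError` instead of silently being read as seconds, milliseconds, or femtoseconds.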
If someone gets stabbed in the eye, we find out about it. So our statistics on eye-stabbing are probably accurate.
We literally have no idea how many xz-style compromises are out there in the wild. We got really lucky with xz - it was only found because the backdoor was sloppy with performance and a Microsoft employee got curious. But we have no data on all the times we got unlucky. How many packages in the Linux ecosystem are compromised in this way? Maybe none? Maybe lots? We just don't know.
It did at least reveal the playbook, and that you have to get pretty creative to hide things in plain sight.
I'm sure any binary blobs in OSS software, no matter the reason for having them, will be viewed with suspicion, and build scripts will get extra inspection after that.
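Listing the blobs is the easy part. A crude sketch, assuming "contains a NUL byte in the first few KB" is a good enough definition of binary (it's a heuristic, and the function names are just mine):

```python
from pathlib import Path


def looks_binary(path: Path, sample_size: int = 8192) -> bool:
    """Heuristic: call a file binary if its first bytes contain a NUL byte."""
    with path.open("rb") as fh:
        return b"\x00" in fh.read(sample_size)


def find_binary_files(root: Path) -> list:
    """Return binary-looking files under `root`, skipping the .git directory."""
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue
        if path.is_file() and looks_binary(path):
            hits.append(rel)
    return hits
```

Running `find_binary_files(Path("."))` in a checkout gives you the list. Deciding whether each blob is a legitimate test fixture is the hard part, and no script does that for you.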
Maybe I'm naive in thinking that some people are already looking into packages that are included in all base Linux builds? Including simplifying the build env, and making sure that the build tools themselves (cmake, pkgconfig, gmake, autotools etc) are also not compromised.
The de facto standard serialization library for Rust, serde, started using binary blobs to speed up builds only a few months before the xz back door was discovered. Lots of people asked the author to include build scripts so they could (re)generate the blobs on their own and his response was basically if you want it, fork it.
You can always use the "we have no idea" argument because you can't prove something doesn't exist. Go find evidence. It's been over a month since xz and thus far we have zero additional incidents. And if you look at the specifics of xz attack: that wouldn't work for most projects because most don't have binary test files.
I'm nobody so you have no reason to believe me - but there have indeed been other, very prominent projects targeted in very similar attacks. We're still inside the responsible disclosure window.. hell, even in the blog post we're commenting on, three JS projects were targeted in failed attempts. That's 4 public projects now..
> seems to have 0% chance of succeeding for almost any project.
It's obviously more than 0% given xz was successfully taken over and backdoored. Even a 5% chance of malicious takeover per project would make the situation pretty worrying given how many well funded, motivated government agencies are out there.
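To put a number on "pretty worrying": if each project independently had a 5% chance of takeover (the 5% is the hypothetical above, and independence is my simplifying assumption, not a measured fact), the chance that at least one of n projects is compromised is 1 - (1 - p)^n.

```python
def p_at_least_one(p_per_project: float, n_projects: int) -> float:
    """Chance that at least one of n independent projects is compromised."""
    return 1.0 - (1.0 - p_per_project) ** n_projects


# Hypothetical 5% per project: about 40% for 10 projects, over 99% for 100.
for n in (1, 10, 100):
    print(n, round(p_at_least_one(0.05, n), 4))
```

A typical dependency tree has far more than 100 projects in it, so even a much smaller per-project probability adds up quickly.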
I'm not talking about xz, I'm talking about that OpenJS thing: random people emailing out of the blue "plz gimme maintainer". Entirely different situation.
I did quote the "three JS projects were targeted in failed attempts" bit, which should have made that abundantly clear.
Is it a different situation? Seems similar to me, except the examples we know about (the obvious ones) are the low skill examples. If someone played the long game like xz and made some helpful improvements to the project in that time, we wouldn’t know about it.
People have also done the same thing (to great effect) on the chrome extension “store” to get all manner of malware into chrome extension updates.
“Nobody unsubtle was successful” tells us nothing about the success rate of subtle attackers. It’s like looking at all the dodgy ssh and http requests any host on the internet is connected to and concluding “yep, 0% of low effort script kiddie attacks get through. I’m 100% safe from hackers!”
Are people really looking though? Are all open source libraries being run through extensive performance profiling to look for known heuristics? Are they being looked at line by line for aberrations?
I don’t have confidence that people are looking for evidence of potential exploitation because of reasons like the ones you bring up.
With hindsight it's not the runtime behaviour of the library that you'd want to test - the weakest point in the chain is where the distributed source .tar.gz can't be regenerated from the project repository.
For how many projects is that actually checked? I bet barely any.
It's especially difficult because most projects aren't built in a reproducible way. You should be able to uncompress and compare a source tarball. But if you get a binary and the source code used to generate that binary, there's no way to tell that they match.
Luckily the source tarball is the more important one to check, because that's the difference between backdooring one distribution and backdooring them all.
It's still not trivial because there might well be legitimate processing steps that are used to create the tarball, but it should be doable.
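A sketch of what that check could look like, assuming the tarball unpacks into a single top-level directory and you have a checkout of the matching tag. The function names and the report format are made up for illustration.

```python
import hashlib
import tarfile
from pathlib import Path


def hash_tarball(tar_path: Path, strip_components: int = 1) -> dict:
    """Map each regular file in the tarball to its SHA-256 digest."""
    hashes = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Drop the leading "project-1.2.3/" directory from the name.
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def hash_tree(root: Path) -> dict:
    """Map each file in a checkout (ignoring .git) to its SHA-256 digest."""
    hashes = {}
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if ".git" in rel.parts or not path.is_file():
            continue
        hashes[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def compare(tar_path: Path, checkout: Path) -> dict:
    """Report files that differ between a release tarball and a checkout."""
    tar_hashes, tree_hashes = hash_tarball(tar_path), hash_tree(checkout)
    shared = tar_hashes.keys() & tree_hashes.keys()
    return {
        "only_in_tarball": sorted(tar_hashes.keys() - tree_hashes.keys()),
        "only_in_checkout": sorted(tree_hashes.keys() - tar_hashes.keys()),
        "modified": sorted(k for k in shared if tar_hashes[k] != tree_hashes[k]),
    }
```

For an autotools project `only_in_tarball` will be full of legitimately generated files (configure and friends), so the output is a list of things for a human to look at, not a verdict.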
Most commonly-used projects are watched by a bunch of people, or diffed on updates. These are not in-depth reviews, but should catch most of it. So yes, people are looking, and have been looking for a long time.
The reason Jia Tan could do their thing is because 1) the main meat was in a binary test file, 2) the code to use that seemed relatively harmless at a glance, and 3) people were encouraged to use the .tar.gz files instead of git clone. Also you need to actually get maintainer status, which is not as easy as it sounds.
I've been thinking of inserting a "// THIS LINE IS MALICIOUS, PLEASE REPORT IF YOU SEE IT" in some of my projects to see how long it would take. I bet it would be pretty fast either after commit or after tagging a release.
No. If there is strong incentive to compromise, and little to no chance a compromise is being found, it's statistically most likely to assume compromises happen on a regular basis and only rarely are found out.
Your choice of language in your comments (in this thread, not in general) isn’t bolstering your argument.
Why not be curious rather than just dismissive? This seems to be people just talking past each other at this point.
There have been a lot of changes in the last ~five years that point in the direction of supply chain security being at greater risk.
Evidence comes in many forms. The relevance of evidence depends on what part of the problem you are looking at.
Also, it is rational to talk about the probability by which different evidence is likely to be surfaced!
I think it is possible you are sensitive to people making such claims for self-interested purposes. Fair? But I don’t think it’s fair to assume that of commenters here.
> Your choice of language in your comments (in this thread, not in general) isn’t bolstering your argument.
Yeah, you're probably not wrong. I've had this argument a few times now, and it's the same dismissive "we don't know what we don't know" every time. Well, you can say that for everything and given the complexities of the xz attack that seems a bit unlikely to me, which is then again countered with "but we don't know!!11"
"Every contributor is dangerous" is a spectacularly toxic type of attitude. I've already seen random people be made a target and even had their employers contacted over this before they even had a chance to explain(!!) To say nothing of "there are many ways to compromise existing maintainers. Compromising people is the core competency of intelligence, happens all the time" – so great, now I'm also potentially dangerous after spending untold hours and money over the last 20 years because I could be compromised. Great.
This was never a nuanced conversation about risk management to start with. This is not the type of community I've worked for all this time. "Let's use some common-sense tech so this isn't that easy". Sure, let's talk about that. "Let's treat every volunteer involved as potentially hostile and compromised after we've seen a single incident"? Yeah, nah.
> "Every contributor is dangerous" is spectacularly toxic type of attitude.
I view this from the lens of "How well can people reason about probabilities?" and research has shown, more or less, "not very well". In the short term, therefore, it is wise to tailor communications so as to avoid predictable irrational reactions. In the medium term, we need to _show_ people how to think about these questions rationally, meaning probabilistically.
For what it is worth, I prefer to avoid using the phrase "common sense", as it invites so many failure modes of thinking.
My current attitude is, more or less, "let's put aside generalizations and start talking about probabilities and threat models". This will give us a model that makes _probabilistic predictions_. Models, done well, serve as concrete artifacts we can critique and improve _together_.
I hope to see some responses to my other comment at https://news.ycombinator.com/item?id=40271146 but I admit it takes more effort to share a model. It is well outside the usual interaction pattern here on HN to make a comment with a testable prediction, much less a model for them! Happily, there are online fora that support such norms and expectations, such as LessWrong. But I haven't given up hope on HN, as it seems like many people have the mindset. I think the social interaction pattern here squanders a lot of that individual intelligence, unfortunately... but that pattern can change in a bottom-up fashion as people (more or less) demand, at the very least, clearer explanations.
In the end you can never fully trust anyone, including yourself. This has always been true for anything: people get drunk, have psychotic episodes or have other mental health issues, things like that. It happens. Remember that Malaysian pilot flying the passenger plane into the ocean?
Every pilot in the world will agree that we need to think about risk management to prevent that sort of thing. I think a lot of them will have issues if we start saying things like "every pilot is dangerous" and (in a follow-up) "long-term good faith pilots are maybe even more dangerous than new pilots". Then you've gone from "risk management" to just throwing shade.
I don't disagree. But my follow-up response is "don't leave it there; factor that into the probability tree".
What should professionals in cybersecurity do? (Not my field, so I could be off-target here) My recommendation: communicate a risk model [1], encourage people to update it for their situation, and demand that people act on it [2]. Not too different from what the field of cybersecurity recommends now. (Or am I wrong?)
[1] based on a set of attack trees (right?)
[2] based on the logic that if you get pwned, you become a zombie to attack me
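Since I asked for structure rather than numbers, here is the kind of thing I mean: a tiny attack tree with AND/OR nodes. Every probability below is a placeholder I made up so the example runs, not an estimate, and treating the branches as independent is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    kind: str = "leaf"   # "leaf", "and", or "or"
    p: float = 0.0       # leaf probability (placeholder, NOT an estimate)
    children: List["Node"] = field(default_factory=list)

    def probability(self) -> float:
        if self.kind == "leaf":
            return self.p
        child_ps = [c.probability() for c in self.children]
        if self.kind == "and":   # every step has to succeed
            result = 1.0
            for cp in child_ps:
                result *= cp
            return result
        if self.kind == "or":    # at least one branch succeeds
            miss = 1.0
            for cp in child_ps:
                miss *= 1.0 - cp
            return 1.0 - miss
        raise ValueError(f"unknown node kind: {self.kind}")


tree = Node("backdoor ships in a release", "or", children=[
    Node("via new contributor", "and", children=[
        Node("gains maintainer status", p=0.10),
        Node("backdoor survives review", p=0.50),
    ]),
    Node("via existing maintainer", "and", children=[
        Node("maintainer is compromised", p=0.02),
        Node("backdoor survives review", p=0.50),
    ]),
])
print(round(tree.probability(), 4))
```

The point is that anyone can now argue about one leaf ("is 0.10 too high for gaining maintainer status?") instead of arguing about whether "every contributor is dangerous".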
> This was never a nuanced conversation about risk management to start with. This is not the type of community I've worked for all this time.
I'm not quite following the second sentence. What kind of community have you worked for? Do you mean "worked for" as in e.g. "the spirit of your comments on HN"? Or something else?
I think they are using community to refer to F/OSS projects as a monolithic entity, rather than a million separate and often competing and disagreeing fiefdoms that have always had issues with toxic assholes worming their way into too much power.
There are a lot of dismissive folks who think this is some kind of one-off event because you can't prove it's not- oh wait, the other attempts we can prove aren't enough evidence either!
I understand being wary of America trying to solve this the only way we know how (PRIVATIZE IT!), but dismissing it as a non-issue makes that more likely because you're basically saying you plan on ignoring it rather than putting your own controls in place.
Yes, FOSS projects need to be welcoming to new devs. No, they don't need to pretend malicious actors aren't an issue in order to do that.
You can vet new people, and be welcoming, at the same time.
This is about "social engineering takeovers of open source projects", not "socially-engineered cybersecurity attack", which is much much broader.
I've been pretty clued up on open source for the last 20 years, and I don't really recall any other similar incidents other than the two I mentioned. I tried to find other examples a few weeks ago and came up empty-handed. It's certainly not common. So please do post specifics if you know of additional incidents, because from what I can see, it's exceedingly rare.
You seem super confident that there have been zero similar attacks that achieved their goals without detection. By definition, almost anyone who pulled off this kind of thing would try really hard not to burn that backdoor by being super obvious (for instance, using it to deface a website). We literally would not know anything about it, in all likelihood. Therefore I feel like it’s a lot more intellectually honest to say we have no idea if that has happened elsewhere, than it is to confidently proclaim that it certainly has not just because it’s been a month since xz.
What I'm arguing against is absolutist fear-mongering statements such as "every contributor is dangerous".
I'm not confident about anything, but anything could happen or have happened all the time. We need to operate on the reality that exists, not the reality that perhaps maybe possibly could perhaps maybe possibly exist. And we certainly shouldn't be treating anyone sending you a patch as a dangerous hostile actor by default.
You seem to think that vetting contributors or reviewing all code commits for malicious actions or code is some unreasonable ask. That should be standard practice.
If someone is getting angry that you actually check their code for vulns, or that you don't let them make changes to certain core areas of a large app without establishing some credibility first, you probably don't want them working on your project.
You can be welcoming AND cautious at the same time.
It has been standard practice for decades. Sometimes this goes wrong, because everything can go wrong. It happens. Casting doubt on any contributor, any maintainer, and any long-term maintainer with fantastical stories is just throwing shade. Of course no one can be trusted absolutely; that has always been true for anything from software to child care to launching nuclear bombs. Anyone and anything can become suspect if you analyse things with enough of a suspicious mindset.
And no, being cautious is never throwing shade, unless you're doing it in a discriminatory way, like assuming that Chinese or Russian contributors are more dangerous.
I don't think the "armed to the teeth" theory is correct. If you were right, people wouldn't honk at each other or otherwise involve themselves in any sort of road rage. But people rage at each other all the time, and only very rarely does someone get shot.
The reason people aren't walking around stabbing you in the eye with a needle is because there is no reason for them to do that. They gain nothing. They don't desire that it be done.
If the news articles about Instagram extortion are anything to go by, adding weapons to an extortion situation is more likely to lead to a suicide than the extortionist being dissuaded.