Thanks, that was really good, with amazingly convincing detail on the technical things. Wow! Now: scared. Eh.
The most jarring thing (to me) on a quick perusal was the idea that a programmer would walk to a Peet's Coffee in another building (from the sound of it, a bit of a walk at least) and return, and still be drinking from an espresso. I mean, those things are tiny, and meant (by The Italian Coffee Overlords) to be drunk more or less like a shot, straight at the bar. Whatever.
Obviously, this being a human behavior thing, there are going to be hundreds of people who walk around with espressos and sip on them for hours; it's just ... a bit off, at least to me.
EDIT: As in, the text clearly looks like fiction on its own, but it's being posted here as a companion piece to a series of events that definitely happened, hence my confusion.
I heard about this long ago, and understood it to be a thought experiment, if an ominous one.
And I was shocked to hear reports that it was "real" and "live" in the C compiler (which one? GNU?)
But I'm not sure if I believe this; could Ken still be pulling our leg at this point?
What is the nature of the malware or backdoor? How is it accessed? What does it do? Does it grant root privileges to an unprivileged user? Does it execute untrusted code? Does it open a socket for a RAT? Does it work under Windows? MacOS? Android?
Is Ken keeping mum about the nature and capabilities of the backdoor so he still has leverage?
If this is true, it's a security concern about the size of Intel's IME, don't you think?
That is so neat! I never did read the paper itself on trusting trust or whatever it was called, so I always thought this was mainly a theoretical kind of thing. Didn't know that the man actually made a real-life proof-of-concept for that exploit!
it is very real indeed. I've heard tell of this kind of thing being used in the wild once, and there must be uses of this kind of exploit in the wild that have not been detected.
read the paper; it will scare you at least a little if you understand what it lays out.
we really do rely on the hope that our compilers are pure, and we have very few tools to detect a bad compiler if our tools are also compiled with a malicious compiler. even if we compile the compiler from source, we can't know, because the compiler itself could be "in on it."
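to make that concrete, here's a minimal sketch of the trick, with made-up names and nothing like Thompson's actual code; a real version would hide the pattern matching far better:

    /* hypothetical sketch of the "trusting trust" trigger: the binary
       carries behavior its own source never shows, because it re-inserts
       the payload whenever it recognizes it is compiling itself. */
    #include <stdio.h>
    #include <string.h>

    static void emit(const char *code) { puts(code); }  /* stand-in for codegen */

    void compile_line(const char *line) {
        /* trigger 1: compiling login(1) -> plant a password backdoor */
        if (strstr(line, "check_password"))
            emit("if (strcmp(pw, \"magic\") == 0) return OK; /* injected */");
        /* trigger 2: compiling the compiler -> re-insert both triggers,
           so a pristine compiler source still yields a dirty binary */
        if (strstr(line, "compile_line"))
            emit("/* ...payload that reproduces this very function... */");
        emit(line);  /* otherwise compile the line normally */
    }

    int main(void) {
        compile_line("int check_password(const char *pw) {");
        compile_line("void compile_line(const char *line) {");
        return 0;
    }

note that the committed source of compile_line could omit both triggers entirely; the binary you bootstrap with just puts them back.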
> we really do rely on the hope that our compilers are pure
Jeremiah Orians hacked his way through the whole supply chain, down to raw machine code, to get a provably clean, up-to-date GCC for Linux on amd64¹, solving the bootstrapping problem in a complete way. He and some Guix people have since worked to integrate this into GNU Guix (a cross-distro package manager) and GuixSD (a GNU operating system based on that package manager), so it's actually not too hard to make practical use of that work, either!
Imo, this is an incredible achievement that deserves much wider recognition. It must have taken a very principled, curious, obsessive, stubborn personality to even seriously take up this work. Pretty damn cool that it even happened.
I forgot the guy's name and fucked up by only looking at some of the most recent commits. Another hacker to highlight, and the one whose lectures taught me about these efforts when I found them on YouTube, is Jan Nieuwenhuizen, who goes by janneke online.
He's the author of GNU Mes (Maxwell's Equations of Software), the Scheme interpreter used in this bootstrap effort, and IIRC he's worked on many parts of this whole thing.
As a bit of an apology as well as a followup, here are some talks he gave a few years ago about this whole bootstrap story!
Under certain assumptions. This method relies on making its assumptions expensive to violate. Which is good enough in practice...
...unless you're dealing with an attacker with vastly more resources than you, and the will to spend them. It's always worth keeping in mind that magic tricks usually work because the performer invested far more time and effort in preparation and practice than anyone in the audience would consider reasonable.
When I learned about it, our professor told us it was an "if I did it..." type of scenario. Very cool to see from the mailing list that it was more than a hypothetical.
Reverse engineering the dump, through assembler back into inferred intent in a higher-level language, would have shown some very odd things being done. You would go "ok, this is some crt0 initialisation, it must be needed". But some rigour would reveal that what it does has nothing to do with runtime initialisation; it's just a quine reproducing its nefarious intent inside the code (see the sketch after this comment).
Which I think goes to the idea that Reflections on Trusting Trust is, in the end, a reflection on what trust you have in other people: either not to do this, or to be able to ask the right people to confirm it hasn't been done, and to have reason to believe them.
If the tools they use are made by the person who did the embedding, your trust is highly conditional. It argues for diversity of tooling.
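Since the quine mechanism is the crux here: for reference, this is the classic minimal C quine, a program that prints its own source because it carries a representation of itself. The payload uses the same trick, except it emits itself into the compiler's output instead of stdout. (Illustrative only, of course; not the actual payload.)

    #include <stdio.h>
    int main(void){char*s="#include <stdio.h>%cint main(void){char*s=%c%s%c;printf(s,10,34,s,34,10);return 0;}%c";printf(s,10,34,s,34,10);return 0;}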
Because it was a bug. Maybe he always appended a trailing zero byte ('\0') to some string.
In general, if you're not aware of the content of the paper, you might want to familiarize yourself with it. In short, and maybe simplified: the pre-compiled compiler contains malware that, when compiling the compiler source, will insert itself again. (Of course the source does not contain the malware.) They used the pre-compiled compiler ("stolen" from Ken) to compile the compiler's source code; this of course inserted the malware into the newly built compiler. Then they used the newly built compiler to compile the compiler's source code again, and so on. And every time they did that, it grew by one byte because of a bug in the malware component.
The compiler copies a part of itself directly into the new compiler binary (i.e. it is a type of quine).
If a bug causes that copy mechanism to make the "part to be copied" one byte larger than it should be, then each time the compiler compiles itself, that part, and therefore the compiler itself, will be one byte larger than the time before.
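A toy model of that, assuming the trailing-'\0' guess from upthread (all names and numbers here are illustrative, not the actual mechanism):

    /* toy model of the growth bug: the quine payload is copied into each
       new compiler, but the copy routine writes one byte too many (say,
       dragging a trailing '\0' along), and the next generation dutifully
       records that extra byte as payload to reproduce. */
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        unsigned char payload[64] = "EVIL";  /* stand-in for the quine payload */
        size_t len = 4;                      /* its true length, in bytes */

        for (int build = 1; build <= 5; build++) {
            unsigned char next[64];
            memcpy(next, payload, len + 1);  /* BUG: copies len + 1 bytes */
            len += 1;                        /* the new compiler carries the bigger blob */
            memcpy(payload, next, len);
            printf("build %d: embedded payload is now %zu bytes\n", build, len);
        }
        return 0;
    }

Run it and you can watch the embedded copy creep one byte per build, exactly the drift described above.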