I wanted to make the point that old tools do not get in the way of professional results -- the point kind of fell flat because they are not actually using CVS and it's just the label that Bugzilla is giving to Git.
Wow. This failure case of a compiler almost looks like the same sort of nonsense that AI/neural networks can generate, no doubt caused similarly by their immense complexity. The common theme is that no sane human would even think of generating such output.
It seems a pretty straightforward (and all-too-common) kind of misunderstanding between humans: one person gave a function used by both memcmp and strcmp a name to do with strcmp, then another person wrote a comment that talked about strcmp, and eventually a third person made a change that would have been valid if the function was used only by strcmp. No individual would make such a mistake, true, but an organisation of several people could easily fail in the same way.
> No individual would make such a mistake, true, but an organisation of several people could easily fail in the same way.
You might be able to claim that it's unlikely for someone to make this mistake if the changes were in one single change (I doubt even this!), but the same person revisiting code at multiple points in time might as well be different people.
yeah I've definitely made equivalent mistakes like that, where I gave something a misleading name because I didn't spend enough time thinking about it, and then eventually confused myself into introducing a bug
Actually this is a common sort of bug I've seen programmers implement, e.g. using strcpy where memcpy should be used or vice versa. It's most common on interface boundaries with other languages, particularly "everything is a string" languages.
So it's not that shocking to see a similar sort of bug made inside the compiler.
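For anyone who hasn't been bitten by this: the str* functions treat a 0x00 byte as the end of the data, while the mem* functions take an explicit length, so swapping one for the other silently changes how much gets compared. A minimal sketch (illustrative only, not from any of the affected code):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Two different 4-byte binary blobs that share a leading 0x00 byte. */
        unsigned char a[4] = {0x00, 0x01, 0x02, 0x03};
        unsigned char b[4] = {0x00, 0xff, 0xff, 0xff};

        /* memcmp compares all 4 bytes and correctly sees a difference. */
        printf("memcmp differs: %d\n", memcmp(a, b, 4) != 0);              /* 1 */

        /* strcmp stops at the first 0x00 byte, so the blobs look "equal". */
        printf("strcmp differs: %d\n", strcmp((char *)a, (char *)b) != 0); /* 0 */

        return 0;
    }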
That's how the Wii got pwned. They were comparing binary SHA-1 hashes in RSA signatures with strncmp.
They also weren't checking padding, so you didn't have to use a real signature with a 00 byte in a convenient place (though I did find one), but rather could just use an all-00 signature, because performing the RSA computation on zero yields zero.
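To make the failure mode concrete: strncmp treats a 0x00 byte as end-of-string, so a 20-byte binary SHA-1 comparison quietly collapses to comparing only the bytes before the first zero. A hypothetical sketch of the broken check (function and parameter names are illustrative, not the actual Wii code):

    #include <string.h>

    /* BROKEN: strncmp stops at the first 0x00 byte in either buffer. If both
     * hashes start with 0x00, they compare "equal" regardless of the other
     * 19 bytes. */
    int signature_ok_broken(const unsigned char expected_hash[20],
                            const unsigned char computed_hash[20])
    {
        return strncmp((const char *)expected_hash,
                       (const char *)computed_hash, 20) == 0;
    }

    /* Correct: memcmp compares all 20 bytes, embedded zeros included. */
    int signature_ok(const unsigned char expected_hash[20],
                     const unsigned char computed_hash[20])
    {
        return memcmp(expected_hash, computed_hash, 20) == 0;
    }

With the all-00 signature described above, the "expected" hash comes out as all zeros, so an attacker only needs content whose SHA-1 happens to start with a 0x00 byte, which takes on the order of a couple hundred tries to brute-force.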
I wonder how many bugs C strings have been responsible for, all because storing a null byte at the end seems more efficient than storing a length up front.
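For contrast, here's roughly what the length-up-front alternative looks like (a Pascal-style string sketch, not any standard C API): the length travels with the bytes, and embedded 0x00 bytes are just ordinary data.

    #include <stddef.h>
    #include <string.h>

    /* A minimal length-prefixed string type. */
    typedef struct {
        size_t len;
        const unsigned char *data;
    } lpstr;

    /* Compare by length first, then by content; no terminator involved. */
    int lpstr_cmp(const lpstr *a, const lpstr *b)
    {
        if (a->len != b->len)
            return a->len < b->len ? -1 : 1;
        return memcmp(a->data, b->data, a->len);
    }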
The original secp256k1 developer Russell O’Connor, who was hit by the bug (he was on HN [0]) [1], said he rebuilt every package in his system after the bug was identified and found only 10 lines of code that were miscompiled. Also, note that the bug was found during an attempt to add more test cases; pre-existing code in secp256k1 was not affected.
> Three lines are tests. All of the lines could be rewritten as a comparison to 0. None of the lines looked that serious. I am not sure which one is the worse: the reduced message integrity code(?) from some ARCFOUR implementation or the something something from an ATM driver?
Two unusual conditions are needed to trigger it...
> * at least one of the compared byte arrays is constant and has a zero byte in position 0, 1, 2, or 3, and
> * the result of the memcmp isn't immediately used in a "== 0" or "!= 0" test (or equivalently "if(memcmp(...))" or "if(!memcmp(...))").
> In particular the second condition makes this bug pretty rare and explains why it's mostly hit in non-inlined memcmp wrappers. (But in our case we hit it with a "<" comparison. )
Looks like a critical bug, but doesn't look like a conspiracy.
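For concreteness, here's a hypothetical snippet (not taken from the bug report) that meets both conditions: one memcmp operand is a constant with a zero byte in the first four positions, and the result feeds a "<" comparison rather than a plain equality test against 0.

    #include <string.h>

    /* Constant operand with a zero byte in position 1: condition 1. */
    static const unsigned char key[4] = {0x61, 0x00, 0x62, 0x63};

    /* The memcmp result is used in a "<" comparison, not "== 0"/"!= 0":
     * condition 2. An affected GCC 10 build could miscompile this by
     * treating the constant as if it ended at the embedded zero byte. */
    int key_less_than(const unsigned char candidate[4])
    {
        return memcmp(candidate, key, 4) < 0;
    }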
For those interested, ATM doesn't refer to an automated teller machine, but to the UPD98401 high-performance Asynchronous Transfer Mode (ATM) controller.
> it's very difficult to detect, if most code slips it by.
What would be the point if it has missed nearly every target? Based on the field survey by Russell O’Connor, no major crypto project or server project has been affected. The only security victim is an RC4-HMAC implementation for Kerberos 5, which has very limited uses (Windows 2000) and has been deprecated for a long time.
The only possible explanations would be either:
1. A long-term time bomb, intended to hit a big fish in the future, in a random, non-specific way.
2. A targeted attack (which doesn't make much sense: if you already have enough exploitation capability to launch many types of targeted attacks, why bother leaving a public record in GCC?).
Sure. I don't think it's particularly likely to have been a conspiracy. Just speculating. If I were to try to put an illicit backdoor in a compiler, I would try and make damned sure it's not traceable to me.
It's a very subtle thing. You have to submit a patch with an improvement (or, at least, an apparent one) to the compiler, which also contains your backdoor. You have to ensure said patch doesn't immediately cause test failures in gcc, or in projects using gcc. If you can manage that, though, it'd be very difficult to prove that you were responsible, which is why I think the targeted attack is entirely plausible.
> If you can manage that, though, it'd be very difficult to prove that you were responsible, which is why I think the targeted attack is entirely plausible.
So the only argument you have to suggest it might be an intentionally introduced bug is that it doesn't look like one… I understand the sentiment but this is clearly a ridiculous line of reasoning.
The patch that introduced the bug is at [1]. Does it look to you like it introduced a bug? Granted, you're not a gcc developer—neither am I—but the gcc developers didn't think so either.
Bugs are accidentally introduced all the time, true. But I don't think it's fair to say that this is a ridiculous line of reasoning.
There is no security concern about it in libsecp256k1: the miscompilation only happened in a newly written test case and didn't impact the production code.
The miscompilation of the implementation of draft-brezak-win2k-krb-rc4-hmac-04.txt seemed interesting to me and potentially of security concern. It looked to me like it was potentially significantly reducing the length of the message integrity check in an authentication protocol.
Though it's hard to say for sure without even a passing familiarity with that protocol or its implementation. If you wanted to go vulnerability hunting, however, that is where I'd start.
That said, even if there is a bug that is no evidence that it would have been intentional. Sadly, an intentionally introduced bug could be absurdly subtle. One of the downsides of software bugs being relatively common is that you can't be sure if a bug was intentional or not... and at the same time it's silly to assume one was intentional because we know that real bugs are so common.
It's long been my view that if you're not occasionally finding compiler bugs -- even in somewhat older compilers -- you're just not testing hard enough. They've become rarer, but not so rare that a single developer comprehensively testing a small library won't eventually find one (or more). (Libsecp256k1 is ~9,964 lines of code and ~8,235 lines of test code.)
Many other cases found by Russell's system wide recompilation were also in test cases or were otherwise pretty obviously not that big of a deal.
It was a fairly recent regression (this past March?) and based on developer comments in the source it really looks more like a mistake than anything. Personally I’m gonna go with Occam’s razor on this one.
Unlikely. We have been hit by a series of similar string-builtin-related bugs in the tree optimizer since gcc 9.0, and the bugfest is still ongoing. The most likely explanation is an incompetent developer touching these areas.
Not sure why this comment has been downvoted. It simply describes the bug in context ("there has been a series of similar string builtin related bugs in the tree optimizer since gcc 9.0") and offers a plausible explanation of the bug ("an incompetent developer touching these areas") based on Hanlon's Razor.
rurban often gets flak for his method of communication, which is generally blunt. Whether from a language barrier or just how he communicates, I think people have a problem separating the message from the delivery, so that might be what's going on here.
I agree with his assessment that it's probably just someone's mistake. I also personally wouldn't have worded it quite that way, even if it's technically correct ("incompetent" is fairly harsh wording in English). I don't think it's worth downvoting over.
A bunch of mistakes all in the same area means you're doing something wrong, more so than "everyone makes mistakes". And if it's an accident, and not caused by someone overworking you, then it falls under the umbrella of "incompetence". That doesn't mean you're a failure in general, it means you weren't up to doing this particular work at this moment in your life.
Depends on how much work you did in that area, and even if the proportion of mistakes is high, it doesn’t necessarily imply incompetence. Could be a bad day.
Also, I would guess gcc does code reviews. If so, gcc seems to be full of incompetent programmers, making it a miracle that this thing can even compile hello world. I don’t think that’s the case.
If you had that bad of a day while being extremely "productive", and never double checked, then that falls under the umbrella of incompetence.
> If so, gcc seems to be full of incompetent programmers
Are you bristling at a word that looks like an insult and then ignoring what I'm actually saying? I specifically said you can be incompetent at a particular thing at a particular moment in time and that it is not a summary of you in general!
"Incompetent", in special if you use it without any further qualification, is not usually meant as "incompetent at a particular thing at a particular time".
I'd say that pretty much everyone is completely incompetent when it comes to correctly implementing a complicated piece of software like GCC. OTOH, the people who develop GCC are incredibly competent at what they do, because the compiler does insanely complicated things the vast majority of the human race would struggle to comprehend, and yet it's surprisingly correct, with the occasional bug and confusion here and there.
We all work at the limits of our wits. Right now I'm waiting for a timeout so I can continue to test a hypothesis. If I am lucky, I'll be right and the system will malfunction in the way I expect it to. I'd say my chances at this point are 50/50.
In the context of malice vs. incompetence, malice certainly doesn't refer to an innate quality you possess all the time, so why would the alternative mean that?
> with the occasional bug and confusion here and there
And the point of this thread was talking about a big batch of bugs, the opposite of "occasional".
Also, because it badly models how the error occurred in the first place. It wasn't any one developer who made a mistake, as a comment in the thread makes clear, it was a series of mistakes, made by different people, coupled with poor documentation that ultimately led to this.
Certainly not needless. The whole world relies on a new gcc release not introducing new severe bugs in good code. Mistakes need to be found and fixed before the release, not 3 years after. Then the mistake becomes incompetence. I wish I would not need to sprinkle my code with
Hanlon's razor isn't a binary device. It recommends a preference ordering of two well-known attributions, malice and stupidity, but does not close the door to the many other possible explanations for any given undesirable outcome.
Pepe Silvia, this name keeps comin' up over and over and over again. Every day Pepe's mail's getting sent back to me. Pepe Silvia, Pepe Silvia, I look in the mail, this whole box is Pepe Silvia! So I say to myself I gotta find this guy. I gotta go up to his office, I gotta put his mail in the guy's goddamn hands! Otherwise he's never gonna get it, it's gonna keep coming back down here. So I go up to Pepe's office and what do I find out, Mac, what do I find out? There is no Pepe Silvia. The man does not exist, okay? So I decided, ohh shit, buddy, I gotta dig a little deeper.
And it's wrong, as the comment pointed out. Bernd Edlinger has been active since 2013; his most recent work was on GCC 10's static analyzer, which led to the discovery of an exploitable OpenSSL bug and was covered in the press in April 2020 [0]. The initial bug report was posted in March; this can hardly be qualified as "going silent".
FandangoRanger made a comment saying it could be a backdoor, and rurban made a comment saying it's unlikely to be a backdoor. Both have been downvoted. I guess the hivemind is fair enough...
A hivemind by definition does not conflict with itself. If both are being downvoted, that implies your hivemind does not exist, at least for this topic.
> If both are being downvoted, that implies your hivemind does not exist, at least for this topic.
Not necessarily so. Perhaps the hivemind is not interpreting the question as a binary one. For example, the downvote for "Reflections on trusting trust?" could result from cliché fatigue.
What do you mean?
This is from the first patch message on the mailing list:
> An enhancement to GCC 10 to improve the expansion of strncmp calls with strings with embedded nuls introduced a regression in similar calls to memcmp.
That patch was approved and merged.
Sounds like a compiler bug to me.
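The semantic difference in play is exactly the embedded-nul case: strncmp may stop at a 0x00 byte, memcmp must not. A minimal illustration of what the enhancement folds correctly versus what regressed (assumed values, not taken from the patch's test suite):

    #include <string.h>

    /* The enhancement was about folding strncmp over constants containing
     * embedded NULs, e.g. this legitimately evaluates to 0 because strncmp
     * stops at the NUL in position 2. */
    int folded_strncmp(void)
    {
        return strncmp("ab\0cd", "ab\0zz", 5);      /* 0 */
    }

    /* Per the quoted patch message, similar calls to memcmp regressed:
     * memcmp must compare all 5 bytes, so this must return nonzero. */
    int regressed_memcmp(void)
    {
        return memcmp("ab\0cd", "ab\0zz", 5) != 0;  /* must be 1 */
    }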
It was posted in the correct place, but you can avoid it entirely with the same compiler by skipping builtins. So it is a library bug rather than a platform, lexer, parser, or optimizer bug.
Builtins are functions that are built into the compiler rather than (or in addition to) being provided as part of any library. That is why they are called builtins. A bug in their implementation is a compiler bug, not a library bug.
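A quick way to see the distinction: for a small constant size, GCC can expand memcmp inline (or fold it at compile time) via its builtin, so libc's memcmp is never called and a bug in that expansion can only be a compiler bug. Disabling the builtin forces a real library call. A small illustration (the function is hypothetical; -fno-builtin-memcmp is a standard GCC option):

    #include <string.h>

    /* With optimization, GCC may expand this fixed-size memcmp inline via
     * __builtin_memcmp instead of calling libc's memcmp, so any bug in the
     * expansion lives in the compiler, not the library.
     *
     * Building with -fno-builtin-memcmp (or -fno-builtin) forces a genuine
     * library call and sidesteps the bad expansion entirely. */
    int matches_expected(const unsigned char *p)
    {
        static const unsigned char expected[4] = {0x00, 0x11, 0x22, 0x33};
        return memcmp(p, expected, 4);
    }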