I wanted to make the point that old tools do not get in the way of professional results -- the point kind of fell flat because they are not actually using CVS and it's just the label that Bugzilla is giving to Git.
Wow. This failure case of a compiler almost looks like the same sort of nonsense that AI/neural networks can generate, no doubt caused similarly by their immense complexity. The common theme is that no sane human would even think of generating such output.
It seems a pretty straightforward (and all-too-common) kind of misunderstanding between humans: one person gave a function used by both memcmp and strcmp a name to do with strcmp, then another person wrote a comment that talked about strcmp, and eventually a third person made a change that would have been valid if the function was used only by strcmp. No individual would make such a mistake, true, but an organisation of several people could easily fail in the same way.
> No individual would make such a mistake, true, but an organisation of several people could easily fail in the same way.
You might be able to claim that it's unlikely for someone to make this mistake if the changes were in one single change (I doubt even this!), but the same person revisiting code at multiple points in time might as well be different people.
yeah I've definitely made equivalent mistakes like that, where I gave something a misleading name because I didn't spend enough time thinking about it, and then eventually confused myself into introducing a bug
Actually this is a common sort of bug I've seen programmers implement, e.g. using strcpy where memcpy should be used or vice versa. It's most common on interface boundaries with other languages, particularly "everything is a string" languages.
So it's not that shocking to see a similar sort of bug made inside the compiler.
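For anyone who hasn't been bitten by this: the str* functions treat a 0x00 byte as the end of the data, while the mem* functions take an explicit length, so swapping one for the other silently changes how much gets compared. A minimal sketch (illustrative only, not from any of the affected code):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Two different 4-byte binary blobs that share a leading 0x00 byte. */
        unsigned char a[4] = {0x00, 0x01, 0x02, 0x03};
        unsigned char b[4] = {0x00, 0xff, 0xff, 0xff};

        /* memcmp compares all 4 bytes and correctly sees a difference. */
        printf("memcmp differs: %d\n", memcmp(a, b, 4) != 0);              /* 1 */

        /* strcmp stops at the first 0x00 byte, so the blobs look "equal". */
        printf("strcmp differs: %d\n", strcmp((char *)a, (char *)b) != 0); /* 0 */

        return 0;
    }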
That's how the Wii got pwned. They were comparing binary SHA-1 hashes in RSA signatures with strncmp.
They also weren't checking padding, so you didn't have to use a real signature with a 00 byte in a convenient place (though I did find one), but rather could just use an all-00 signature, because performing the RSA computation on zero yields zero.
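To make the failure mode concrete: strncmp treats a 0x00 byte as end-of-string, so a 20-byte binary SHA-1 comparison quietly collapses to comparing only the bytes before the first zero. A hypothetical sketch of the broken check (function and parameter names are illustrative, not the actual Wii code):

    #include <string.h>

    /* BROKEN: strncmp stops at the first 0x00 byte in either buffer. If both
     * hashes start with 0x00, they compare "equal" regardless of the other
     * 19 bytes. */
    int signature_ok_broken(const unsigned char expected_hash[20],
                            const unsigned char computed_hash[20])
    {
        return strncmp((const char *)expected_hash,
                       (const char *)computed_hash, 20) == 0;
    }

    /* Correct: memcmp compares all 20 bytes, embedded zeros included. */
    int signature_ok(const unsigned char expected_hash[20],
                     const unsigned char computed_hash[20])
    {
        return memcmp(expected_hash, computed_hash, 20) == 0;
    }

With the all-00 signature described above, the "expected" hash comes out as all zeros, so an attacker only needs content whose SHA-1 happens to start with a 0x00 byte, which takes on the order of a couple hundred tries to brute-force.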
I wonder how many bugs C strings have been responsible for, all because storing a null byte at the end seems more efficient than storing a length up front.
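For contrast, here's roughly what the length-up-front alternative looks like (a Pascal-style string sketch, not any standard C API): the length travels with the bytes, and embedded 0x00 bytes are just ordinary data.

    #include <stddef.h>
    #include <string.h>

    /* A minimal length-prefixed string type. */
    typedef struct {
        size_t len;
        const unsigned char *data;
    } lpstr;

    /* Compare by length first, then by content; no terminator involved. */
    int lpstr_cmp(const lpstr *a, const lpstr *b)
    {
        if (a->len != b->len)
            return a->len < b->len ? -1 : 1;
        return memcmp(a->data, b->data, a->len);
    }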
The original secp256k1 developer Russell O’Connor, who was hit by the bug (he was on HN [0]) [1], said he rebuilt every package in his system after the bug was identified and found only 10 lines of code that were miscompiled. Also, note that the bug was found during an attempt to add more test cases; pre-existing code in secp256k1 was not affected.
> Three lines are tests. All of the lines could be rewritten as a comparison to 0. None of the lines looked that serious. I am not sure which one is the worse: the reduced message integrity code(?) from some ARCFOUR implementation or the something something from an ATM driver?
Two unusual conditions are needed to trigger it...
> * at least one of the compared byte arrays is constant and has a zero byte in position 0, 1, 2, or 3, and
> * the result of the memcmp isn't immediately used in a "== 0" or "!= 0" test (or equivalently "if(memcmp(...))" or "if(!memcmp(...))").
> In particular the second condition makes this bug pretty rare and explains why it's mostly hit in non-inlined memcmp wrappers. (But in our case we hit it with a "<" comparison. )
Looks like a critical bug, but doesn't look like a conspiracy.
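For concreteness, here's a hypothetical snippet (not taken from the bug report) that meets both conditions: one memcmp operand is a constant with a zero byte in the first four positions, and the result feeds a "<" comparison rather than a plain equality test against 0.

    #include <string.h>

    /* Constant operand with a zero byte in position 1: condition 1. */
    static const unsigned char key[4] = {0x61, 0x00, 0x62, 0x63};

    /* The memcmp result is used in a "<" comparison, not "== 0"/"!= 0":
     * condition 2. An affected GCC 10 build could miscompile this by
     * treating the constant as if it ended at the embedded zero byte. */
    int key_less_than(const unsigned char candidate[4])
    {
        return memcmp(candidate, key, 4) < 0;
    }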
For those interested, ATM doesn't refer to an automated teller machine, but to the UPD98401 high-performance Asynchronous Transfer Mode (ATM) controller.
> it's very difficult to detect, if most code slips it by.
What would be the point if it has missed nearly every target? Based on the field survey by Russell O’Connor, no major crypto project or server project has been affected. The only security victim is an RC4-HMAC implementation for Kerberos 5, which has very limited uses (Windows 2000) and has been deprecated for a long time.
The only possible explanations would be either:
1. A long-term time bomb, intended to hit a big fish in the future, in a random, non-specific way.
2. A targeted attack (which doesn't make much sense: if you already have enough exploitation capability to launch many types of targeted attacks, why bother leaving a public record in GCC?).
Sure. I don't think it's particularly likely to have been a conspiracy. Just speculating. If I were to try to put an illicit backdoor in a compiler, I would try and make damned sure it's not traceable to me.
It's a very subtle thing. You have to submit a patch with an improvement (or, at least, an apparent one) to the compiler, which also contains your backdoor. You have to ensure said patch doesn't immediately cause test failures in gcc, or in projects using gcc. If you can manage that, though, it'd be very difficult to prove that you were responsible, which is why I think the targeted attack is entirely plausible.
> If you can manage that, though, it'd be very difficult to prove that you were responsible, which is why I think the targeted attack is entirely plausible.
So the only argument you have to suggest it might be an intentionally introduced bug is that it doesn't look like one… I understand the sentiment but this is clearly a ridiculous line of reasoning.
The patch that introduced the bug is at [1]. Does it look to you like it introduced a bug? Granted, you're not a gcc developer—neither am I—but the gcc developers didn't think so either.
Bugs are accidentally introduced all the time, true. But I don't think it's fair to say that this is a ridiculous line of reasoning.
There is no security concern about it in libsecp256k1: the miscompilation only happened in a newly written test case and didn't impact the production code.
The miscompilation of the implementation of draft-brezak-win2k-krb-rc4-hmac-04.txt seemed interesting to me and potentially of security concern. It looked to me like it was potentially significantly reducing the length of the message integrity check in an authentication protocol.
Though it's hard to say for sure without even a passing familiarity with that protocol or its implementation. If you wanted to go vulnerability hunting, however, that is where I'd start.
That said, even if there is a bug that is no evidence that it would have been intentional. Sadly, an intentionally introduced bug could be absurdly subtle. One of the downsides of software bugs being relatively common is that you can't be sure if a bug was intentional or not... and at the same time it's silly to assume one was intentional because we know that real bugs are so common.
It's long been my view that if you're not occasionally finding compiler bugs -- even in somewhat older compilers -- you're just not testing hard enough. They've become rarer, but not so rare that a single developer comprehensively testing a small library won't eventually find one (or more). (Libsecp256k1 is ~9,964 lines of code and ~8,235 lines of test code.)
Many other cases found by Russell's system wide recompilation were also in test cases or were otherwise pretty obviously not that big of a deal.
It was a fairly recent regression (this past March?) and based on developer comments in the source it really looks more like a mistake than anything. Personally I’m gonna go with Occam’s razor on this one.
Unlikely. We have been hit by a series of similar string-builtin-related bugs in the tree optimizer since gcc 9.0, and the bugfest is still ongoing. The most likely explanation is an incompetent developer touching these areas.
Not sure why this comment has been downvoted. It simply describes the bug in context ("there has been a series of similar string builtin related bugs in the tree optimizer since gcc 9.0") and offers a plausible explanation of the bug ("an incompetent developer touching these areas") based on Hanlon's Razor.
rurban often gets flak for his method of communication, which is generally blunt. Whether from a language barrier or just how he communicates, I think people have a problem separating the message from the delivery, so that might be what's going on here.
I agree with his assessment that it's probably just someone's mistake. I also personally wouldn't have worded it quite that way, even if it's technically correct ("incompetent" is fairly harsh wording in English). I don't think it's worth downvoting over.
A bunch of mistakes all in the same area means you're doing something wrong, more so than "everyone makes mistakes". And if it's an accident, and not caused by someone overworking you, then it falls under the umbrella of "incompetence". That doesn't mean you're a failure in general, it means you weren't up to doing this particular work at this moment in your life.
Depends on how much work you did in that area, and even if the proportion of mistakes is high, it doesn’t necessarily imply incompetence. Could be a bad day.
Also, I would guess gcc does code reviews. If so, gcc seems to be full of incompetent programmers, making it a miracle that this thing can even compile hello world. I don’t think that’s the case.
If you had that bad of a day while being extremely "productive", and never double checked, then that falls under the umbrella of incompetence.
> If so, gcc seems to be full of incompetent programmers
Are you bristling at a word that looks like an insult and then ignoring what I'm actually saying? I specifically said you can be incompetent at a particular thing at a particular moment in time and that it is not a summary of you in general!
"Incompetent", in special if you use it without any further qualification, is not usually meant as "incompetent at a particular thing at a particular time".
I'd say that pretty much everyone is completely incompetent when it comes to correctly implementing a complicated piece of software like GCC. OTOH, the people who develop GCC are incredibly competent at what they do, because the compiler does insanely complicated things the vast majority of the human race would struggle to comprehend, and yet it's surprisingly correct, with the occasional bug and confusion here and there.
We all work at the limits of our wits. Right now I'm waiting for a timeout so I can continue to test a hypothesis. If I am lucky, I'll be right and the system will malfunction in the way I expect it to. I'd say my chances at this point are 50/50.
In the context of malice vs. incompetence, malice certainly doesn't refer to an innate quality you possess all the time, so why would the alternative mean that?
> with the occasional bug and confusion here and there
And the point of this thread was talking about a big batch of bugs, the opposite of "occasional".
Also, because it badly models how the error occurred in the first place. It wasn't any one developer who made a mistake, as a comment in the thread makes clear, it was a series of mistakes, made by different people, coupled with poor documentation that ultimately led to this.
Certainly not needless. The whole world relies on a new gcc release not introducing new severe bugs in good code. Mistakes need to be found and fixed before the release, not 3 years after. Then the mistake becomes incompetence. I wish I would not need to sprinkle my code with
Hanlon's razor isn't a binary device. It recommends a preference ordering of two well-known attributions, malice and stupidity, but does not close the door to the many other possible explanations for any given undesirable outcome.
Pepe Silvia, this name keeps comin' up over and over and over again. Every day Pepe's mail's getting sent back to me. Pepe Silvia, Pepe Silvia, I look in the mail, this whole box is Pepe Silvia! So I say to myself I gotta find this guy. I gotta go up to his office, I gotta put his mail in the guy's goddamn hands! Otherwise he's never gonna get it, it's gonna keep coming back down here. So I go up to Pepe's office and what do I find out, Mac, what do I find out? There is no Pepe Silvia. The man does not exist, okay? So I decided, ohh shit, buddy, I gotta dig a little deeper.
And it's wrong, as the comment pointed out. Bernd Edlinger has been active since 2013; his most recent work was on GCC 10's static analyzer, which led to the discovery of an exploitable OpenSSL bug and was covered in the press in April 2020 [0]. The initial bug report was posted in March; this can hardly be qualified as "going silent".
FandangoRanger made a comment saying it could be a backdoor, and rurban made a comment saying it's unlikely to be a backdoor. Both have been downvoted. I guess the hivemind is fair enough...
A hivemind by definition does not conflict with itself. If both are being downvoted, that implies your hivemind does not exist, at least for this topic.
> If both are being downvoted, that implies your hivemind does not exist, at least for this topic.
Not necessarily so. Perhaps the hivemind is not interpreting the question as a binary one. For example, the downvote for "Reflections on trusting trust?" could result from cliché fatigue.
What do you mean?
This is from the first patch message on the mailing list:
> An enhancement to GCC 10 to improve the expansion of strncmp calls with strings with embedded nuls introduced a regression in similar calls to memcmp.
That patch was approved and merged.
Sounds like a compiler bug to me.
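The semantic difference in play is exactly the embedded-nul case: strncmp may stop at a 0x00 byte, memcmp must not. A minimal illustration of what the enhancement folds correctly versus what regressed (assumed values, not taken from the patch's test suite):

    #include <string.h>

    /* The enhancement was about folding strncmp over constants containing
     * embedded NULs, e.g. this legitimately evaluates to 0 because strncmp
     * stops at the NUL in position 2. */
    int folded_strncmp(void)
    {
        return strncmp("ab\0cd", "ab\0zz", 5);      /* 0 */
    }

    /* Per the quoted patch message, similar calls to memcmp regressed:
     * memcmp must compare all 5 bytes, so this must return nonzero. */
    int regressed_memcmp(void)
    {
        return memcmp("ab\0cd", "ab\0zz", 5) != 0;  /* must be 1 */
    }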
It was posted in the correct place, but you can avoid it entirely with the same compiler by skipping builtins. So it is a library bug rather than a platform, lexer, parser, or optimizer bug.
Builtins are functions that are built into the compiler rather than (or in addition to) being provided as part of any library. That is why they are called builtins. A bug in their implementation is a compiler bug, not a library bug.
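A quick way to see the distinction: for a small constant size, GCC can expand memcmp inline (or fold it at compile time) via its builtin, so libc's memcmp is never called and a bug in that expansion can only be a compiler bug. Disabling the builtin forces a real library call. A small illustration (the function is hypothetical; -fno-builtin-memcmp is a standard GCC option):

    #include <string.h>

    /* With optimization, GCC may expand this fixed-size memcmp inline via
     * __builtin_memcmp instead of calling libc's memcmp, so any bug in the
     * expansion lives in the compiler, not the library.
     *
     * Building with -fno-builtin-memcmp (or -fno-builtin) forces a genuine
     * library call and sidesteps the bad expansion entirely. */
    int matches_expected(const unsigned char *p)
    {
        static const unsigned char expected[4] = {0x00, 0x11, 0x22, 0x33};
        return memcmp(p, expected, 4);
    }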