Zero-Cost Well-Defined Signed Integer Overflow in C++ (rentry.co)
21 points by signa11 on Dec 10, 2022 | 20 comments



When I hear about making overflow safe by making it wrap, my main question is how much code is actually ready for overflow. I've seen many vulnerabilities where a bounds check was bypassed because a computation overflowed. It seems that the only really safe behaviour here is crashing. If you aren't going to pay the cost of those checks, you may as well call it undefined behaviour and allow the optimizations (of course also including wrapping integer types in your language for when you do want to handle wrapping).
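
A sketch of the kind of bug I mean (hypothetical code, not from any real codebase): if offset is attacker-controlled and the addition wraps, the bounds check passes even though the access is far out of range.

    // pos and offset are expected to be non-negative on the happy path
    bool read_ok(const char* buf, int len, int pos, int offset) {
        if (pos + offset < len)                 // wraps to a negative value on overflow
            return buf[pos + offset] != '\0';   // out-of-bounds read
        return false;
    }

Wrapping turns this into a silent out-of-bounds access; trapping turns it into a crash you can actually find.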


> For the oppressed who must write secure code in C/C++ but don't want to think about signed integer overflow, let alone undefined signed integer overflow, you can always use the "-wrapv" compiler option.

Very little code in practice. Wrapping by default is typically the wrong thing to do. The only reason things like Java get away with it is that when you wrap on overflow and then use the result to index into an array, there's an additional bounds check to save you. In C/C++ there are almost no codebases that should not trap, hard, on overflow.


It's not like we get away with it in Java. At the bank I work for, we index arrays past the integer limit far less often than we multiply ever-increasing prices by quantities, and I have memories of stupid overflows turning values negative. Especially since we used fixed-point integers to represent decimal numbers...

We have entire special libraries for this and we still fail sometimes. Which reminds me, I have to reconcile a total-price mismatch between a C# interface and a Java backend on Monday...


I think Rust will trap on overflow in debug builds and wrap in release builds, unless you explicitly ask for different behavior.


Yes, and you can alter this behavior either globally (with compile options) or locally (by using specific types).


The problem with undefined behavior is that it has led compiler designers to irresponsible interpretations, where some later computation is optimized based on propositions about an earlier computation that only hold if the earlier computation avoided undefined behavior. You really just want the calculation to settle on some well-behaved result like wrapping, or else to raise an exception.
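
A minimal sketch of the pattern being described (hypothetical code): the later range check can be deleted because the compiler assumes the earlier addition did not overflow.

    #include <limits>

    int f(int x) {
        int y = x + 100;  // undefined behavior if x > INT_MAX - 100
        if (x > std::numeric_limits<int>::max() - 100)  // "can't happen", may be folded away
            return -1;
        return y;
    }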


> You really just want the calculation to be settled with some well-behaved result like wrapping, or else to raise an exception.

Perhaps you do, but others don't. This compiler behaviour isn't done just for the hell of it; it's an unavoidable consequence of optimisations that some people rely on to improve the performance of valid code. At the same time, other people have almost-valid/invalid code (call it what you will) that behaves as intended when less optimised but breaks under these more aggressive optimisations. Whether to prioritise the handling of valid or invalid code is something people will never agree on; everybody will say to prioritise the kind of code that they themselves wrote. I think compilers made the right call in having this as an option so that people can choose what works best for them.
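
A hedged example of the sort of optimisation being traded off (my own sketch, not from the article): because signed overflow is undefined, the compiler may assume i never wraps past n, so it knows the exact trip count, which enables vectorisation and strength reduction; with -fwrapv it has to allow for i wrapping around.

    void zero_every_fourth(float* a, int n) {
        for (int i = 0; i < n; i += 4)  // non-unit stride: under wrapping semantics
            a[i] = 0.0f;                // i could skip past n and wrap back around
    }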


The assumption "construct X must have undefined behavior, so we are going to optimize following construct Y accordingly" is entirely avoidable. I believe that a professional engineer would avoid such a thing. That we have such a situation just reflects badly on our field.


s/must/must not/


Oh, more than is ready for UB, that's a certainty.

But yeah, the normal way to handle an overflow should be an error. Most languages disagree, but I do think nearly every language is wrong here.


Is it possible to configure the CPU to interrupt on overflow? Otherwise the cost will be too high.


You use -ftrapv, which uses the CPU-accelerated overflow detection.
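
The exact code -ftrapv generates varies between compilers, but the GCC/Clang builtin below is an explicit way to get the add-plus-jump-on-overflow pattern the hardware provides (a sketch of the idea, not what -ftrapv literally emits):

    #include <cstdlib>

    int checked_add(int a, int b) {
        int result;
        if (__builtin_add_overflow(a, b, &result))  // compiles to add + jo on x86-64
            std::abort();                           // trap instead of continuing
        return result;
    }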


We are in a loop where CPU designers don't create good portable ways to detect overflow, because languages won't use them; and languages do not detect overflow because CPUs don't have a good portable way of doing it.

On AMD64 you can at least avoid a check after each single instruction, so it becomes less expensive.
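
One possible reading of that claim (my own sketch, not necessarily what is meant): do the intermediate arithmetic in a wider type, where the 32-bit inputs cannot overflow, and range-check once at the end instead of after every operation.

    #include <cstdint>
    #include <stdexcept>

    std::int32_t sum3(std::int32_t a, std::int32_t b, std::int32_t c) {
        std::int64_t wide = static_cast<std::int64_t>(a) + b + c;  // cannot overflow int64_t
        if (wide < INT32_MIN || wide > INT32_MAX)
            throw std::overflow_error("sum3 overflow");
        return static_cast<std::int32_t>(wide);
    }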


> On AMD64 you can at least avoid a check after each single instruction, so it becomes less expensive.

How?


I'd guess approx. zero code is ready for overflow unless it's fuzzed extensively, written in a language that makes you think about it, or part of a section where security or reliability was paramount.

At my job we have fuzz tests which complain about this class of bugs all the time. So far I haven't been able to prioritize fixing them since there's always bigger fish to fry (and the alignment bugs are more fun to fix anyway!).


There are two problems with integer overflows: 1) undefined behavior (ugh), and 2) any defined behavior other than a) behavior that you specified as desired in code, or b) an error (or exception, ugh) that you are forced to check for.

Of the two, (1) is easily the worse, but once (1) is solved, (2) becomes the worse problem (naturally, being the only problem left).


Yes, I agree.

In my opinion, using -fwrapv to handle signed overflow is a great mistake.

The right compiler option is "-fsanitize=undefined -fsanitize-undefined-trap-on-error".

There is no need to implement any code to handle signed overflow; all decent compilers have appropriate options, which unfortunately are not the default because too many people prefer speed over correctness.
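
For example (my own toy program, compiled with the flags above), the overflow below becomes an immediate trap at runtime instead of silently continuing:

    // g++ -O2 -fsanitize=undefined -fsanitize-undefined-trap-on-error overflow.cpp
    #include <limits>

    int main() {
        int x = std::numeric_limits<int>::max();
        return x + 1;  // traps here under the flags above
    }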


The static_assert should probably use Int and not int (upper vs. lower case), so that it checks the template parameter instead of the built-in int type; the same goes for the make_safe_signed function.
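
A guess at what the corrected check might look like (the surrounding template is my reconstruction, not the article's actual code):

    #include <type_traits>

    template <typename Int>
    constexpr auto make_safe_signed(Int value) {
        static_assert(std::is_signed_v<Int>,  // check the template parameter Int,
                      "Int must be a signed integer type");  // not the built-in int
        // ... rest of the conversion ...
        return value;
    }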


How is that zero cost? There's a blatantly obvious opportunity cost when you remove the associative and commutative properties.


"Zero cost" here means the author is asserting it doesn't add a bunch of extra branches or instructions or whatever (compared to normal addition) by the time it gets compiled down to assembly.

For example, there's a slow-looking if statement in there:

    if ((_is_neg() && _storage < usmin) ||
        (!_is_neg() && _storage > usmax))
But in theory an optimizing compiler could remove it entirely since it can never be true (on C++20 or above).

Whether the compiler does this in practice depends on a variety of factors, but in general they're fairly good at spotting "obvious" optimizations.
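
For reference, the general technique (my sketch of the idea, not the article's exact code) is to do the arithmetic on the unsigned counterpart, where wraparound is defined, and convert back. Since C++20 the conversion back to signed is well-defined two's complement, so the whole thing typically compiles to a single add:

    #include <cstdint>

    std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
        return static_cast<std::int32_t>(
            static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b));
    }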



