Having designed microcoded pipelined processors from gates on up in uni, that's true.
integer cmp x, y often reduces to microcode that effectively runs:
- 3A: sub x, y, dummy_result (att syntax) <-- most likely^
- 2A: sub copy_of_x, y (intel syntax)
- 1A: push x, push y (or y, x), sub, update flags, pop
^ Depends heavily on the microarchitecture, because there's likely all sorts of extra features and ops that can be set in the same microinstruction cycle to implement the macroinstructions.
In signed 2's-complement (mostly everything), the zero flag just checks that all result bits are zero, and the less-than flag is just the result's MSB (negative result if 1).
Signed and unsigned have their under/overflow issues. Being able to detect that and handle it in code, rather than crashing or silently producing undefined results, can be important, such as safely re-callocing memory (multiply overflow). (There are other issues like pointer/slice index over/underflow too, and these are related problems.)
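For the safe-realloc case specifically, a minimal sketch (this assumes the GCC/Clang __builtin_mul_overflow extension; other compilers need a manual nmemb > SIZE_MAX / size check):
/* sketch: guard the size computation before reallocating; __builtin_mul_overflow
   is a GCC/Clang extension, not standard C */
#include <stdlib.h>
void *grow_array(void *p, size_t nmemb, size_t size)
{
    size_t bytes;
    if (__builtin_mul_overflow(nmemb, size, &bytes))
        return NULL;              /* nmemb * size would wrap; refuse instead */
    return realloc(p, bytes);
}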
The issues are usually around whether branch-free code is absolutely necessary, or whether mispredicted branches would incur disastrous pipeline stalls (it greatly depends on the specific use-case, especially inner loops... don't prematurely overoptimize).
I almost feel there needs to be something in between IR languages like LLVM IR and general-purpose C: not assembly, but still general-purpose, functional/imperative, amenable to static analysis, and able to expose math and branching state/options/differences cleanly across a variety of platforms without a fugly/verbose syntax.
To make up a pseudoreligion (if it doesn't look like your preferred Turing-complete language, sorry) at random:
var x, y int32
# ... assign x and y somehow here
begin
x = x +! y # or x +=! y
rescue IntegerOverflowError
# ... int32.max + int32.max => (0x7FFFFFFF + 0x7FFFFFFF) & 0xFFFFFFFF => 0xFFFFFFFE (-2, int32)
rescue IntegerUnderflowError
# ... int32.min + int32.min => (0x80000000 + 0x80000000) & 0xFFFFFFFF => 0x00000000 (0, int32)
end
var s, t uint64
# ... assign s and t somehow here
begin
s *=! t
rescue IntegerOverflowError
# ... uint64.max * uint64.max => ... => 1 (1, uint64)
end
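For what it's worth, the closest you can get in plain C today is probably the GCC/Clang checked-arithmetic builtins, which report the overflow instead of silently wrapping (a sketch, not standard C):
/* sketch: checked add/mul via GCC/Clang builtins; the builtins also store the
   wrapped value, which matches the comments in the pseudocode above */
#include <inttypes.h>
#include <stdio.h>
int main(void)
{
    int32_t x = INT32_MAX, y = INT32_MAX, sum;
    if (__builtin_add_overflow(x, y, &sum))
        printf("int32 add overflowed, wrapped value %" PRId32 "\n", sum);   /* -2 */

    uint64_t s = UINT64_MAX, t = UINT64_MAX, prod;
    if (__builtin_mul_overflow(s, t, &prod))
        printf("uint64 mul overflowed, wrapped value %" PRIu64 "\n", prod); /* 1 */
    return 0;
}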
Signed 2's-complement architectures (essentially any microprocessor made in the mid-70s or later) that have flags (some RISC-based CPUs, like MIPS, don't have individual flags) will usually have an Overflow flag (O or V), a Carry flag (C), a Zero flag (Z) and a Sign or Negative flag (S or N; here I'll be using N and V since I have a preference for Motorola CPUs). Yes, internally, a comparison is typically done via a subtraction; the flags are updated, but not the destination register. Now, whether you interpret the result as a signed or an unsigned comparison depends on which flags you check (and the conditional jump (or branch) instructions will do this check for you). The flags that are checked for an unsigned comparison are (!f means NOT f, i.e. the flag is not set):
above or equal: !C
above: !C and !Z
below or equal: C or Z
below: C
And the signed comparisons are:
above or equal: N == V
above: N == V and !Z
below or equal: N != V or Z
below: N != V
And, just for completeness, for both signed and unsigned:
equal: Z
not equal: !Z
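If it helps to see those rules in action, here's a toy C model of an 8-bit cmp x, y (just the textbook flag definitions, not any particular CPU's microcode):
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    uint8_t x = 0x80, y = 0x01;                 /* -128 vs 1 signed, 128 vs 1 unsigned */
    uint8_t r = (uint8_t)(x - y);

    bool N = (r & 0x80) != 0;                   /* Negative: sign bit of the result */
    bool Z = (r == 0);                          /* Zero */
    bool C = x < y;                             /* Carry: borrow out of the subtraction */
    bool V = ((x ^ y) & (x ^ r) & 0x80) != 0;   /* oVerflow: operand signs differ and result sign flipped */

    printf("unsigned: x %s y\n", (!C && !Z) ? "above" : Z ? "equal" : "below");
    printf("signed:   x %s y\n", (N == V && !Z) ? "above" : Z ? "equal" : "below");
    return 0;
}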
Cool. Brings back memories of implementing various jump instructions and carry-select adders (do both adds up front, then pick the result once the actual carry is known). (I recall I had the smallest microprogram in the class and it used the fewest microcycles by far... everyone hated me. :)
Wouldn't it be nice to be able to access the carry in and carry out as "variables", as well as full-precision imul/idiv (e.g. 32-bit args -> 64-bit full-precision results) without dropping down into assembly? Perhaps it's unreasonable, but there seem to be so few differences per generic instruction across processors that treating them all as "special unique snowflakes" is obviously untrue FUD.
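To be fair, the widening multiply at least is already reachable without assembly on mainstream compilers; a sketch (the 64x64 -> 128 half assumes the non-standard __int128 that GCC/Clang provide on 64-bit targets):
#include <stdint.h>
uint64_t mul32x32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;                      /* full 64-bit product, no truncation */
}

void mul64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    *lo = (uint64_t)p;
}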
Understanding architecture/assembly comes in handy for replacing high-level branching code with branch-free code, to avoid pipeline stalls when the branch predictor is guessing incorrectly. Also, being able to go down the stack is really helpful, because there are most definitely bugs all the way down.
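A tiny example of the kind of substitution meant here (a standard bit-twiddling sketch, only worth it where the branch is genuinely unpredictable):
#include <stdint.h>
/* branchy version */
int32_t min_branchy(int32_t a, int32_t b) { return a < b ? a : b; }

/* branch-free version: -(a < b) is all-ones when a < b, all-zeros otherwise */
int32_t min_branchless(int32_t a, int32_t b)
{
    return b ^ ((a ^ b) & -(int32_t)(a < b));
}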
Because that overflow flag makes all the difference (no pun intended), I wouldn't say that they are the same, but rather that comparison usually involves subtraction.
I remember coming across a set of lecture slides for a CS course on architecture that used its own toy architecture with neither flags nor separate comparison instructions - saying that comparison is the same as subtraction gives people the entirely wrong idea, and leads to the sort of bugs mentioned in this article.
When something is easy at the hardware level but there is no easy way to express it in programs, it suggests a failure of the abstraction. Or more positively, room for improvement. I'd love to see some way that this can be expressed elegantly in C.
The way that C expresses it elegantly is via the comparison operator. The author is concerned that people try to be clever and use subtraction instead. Here is how the author corrected this error by using comparison operators instead of subtracting (in the tree man page):
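The exact man page code isn't reproduced here, but presumably it's something of this shape (illustrative only, not the verbatim tree source):
/* not the verbatim tree source; just the shape described below */
int cmp(int x, int y)
{
    return x < y ? -1 : x > y;   /* -1, 0, or 1; no subtraction, no overflow */
}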
The ternary operator is necessary, because in C a relational operator is evaluated to 0 (false) or 1 (true). So, the author added a special case for returning a negative value (indicating the first argument should be sorted before the second).
Regardless of the correctness of the code, if it's not obvious and you need to explain it, you probably should not write it (unless you have profile data that makes you do otherwise).
"Programs must be written for people to read, and only incidentally for machines to execute."
That's a good one for integers and if you don't intend the comparator to be inlined. For inlined comparison, I prefer if (x == y) return 0; return (x < y) ? -1 : 1;. The control flow is more transparent, and C compilers are usually able to perform the if-conversion necessary to get the single comparison you'd expect in hand-rolled sorts or searches.
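Spelled out, the two styles being compared are roughly:
/* the branch-free one-liner from upthread vs. the explicit form preferred here */
int cmp_branchfree(int x, int y) { return (x > y) - (x < y); }

int cmp_explicit(int x, int y)
{
    if (x == y) return 0;
    return (x < y) ? -1 : 1;
}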
From a category theoretic point of view, subtraction is a natural candidate for comparison: It is the internal hom-functor in the ordered monoid of integers. One could argue that internal hom functors are the very essence of a "comparison".
The sole problem seems to be that integers in C are not actually integers.
This isn't a problem with comparison by subtraction, it's a problem with integer overflow. There's nothing wrong with comparison via subtraction if you account for overflow, which you always need to do in programming.
In mainstream compilers on Windows x64 (the LLP64 model), sizeof(long) == sizeof(int). You need long long instead. Of course, in the general case there's exactly zero guarantee that there even exists an integral type wider than int.
On a (slightly) more practical level: there exists some largest integer type. You may someday need to sort it. If you want your sort to always work, you can't afford to overflow.
One way to reduce the chance of this problem occurring is to change the specification of the comparison function to only allow -1, 0, and 1 to be returned (or even better, use enums like LESS, EQUAL, GREATER). This makes doing the wrong thing more cumbersome when implementing a comparison function, but it also makes it easier to implement the actual sort function because jump tables and switch statements become possibilities.
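For example, the narrowed contract might look something like this (the names and shape are just illustrative):
/* illustrative shape of the narrowed contract: only three possible return values */
typedef enum { LESS = -1, EQUAL = 0, GREATER = 1 } ordering;

ordering compare_int(int x, int y)
{
    if (x < y) return LESS;
    if (x > y) return GREATER;
    return EQUAL;
}

void on_compare(int x, int y)
{
    switch (compare_int(x, y)) {   /* callers can now switch cleanly */
    case LESS:    /* handle "sorts before" */ break;
    case EQUAL:   /* handle "equal" */        break;
    case GREATER: /* handle "sorts after" */  break;
    }
}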
There are all kinds of code patterns that are sensitive to overflow, and people do use those patterns because they are simple. A program that guards against every possible overflow is almost illegible, and for most applications the only difference all that checking makes is displaying the correct error message before closing, because the problem domain has no procedure for handling overflow anyway.
C programmers are implicitly expected to estimate the value ranges they'll be dealing with, and correctly size their variables.
/* compare.c */
#include <stdio.h>

int compare( int x, int y )
{
    return x - y;
}

void testCompare( int x, int y )
{
    if ( compare( x, y ) < 0 )
    {
        printf( "%d is less than %d\n", x, y );
    }
    else
    {
        printf( "%d is greater than or equal to %d\n", x, y );
    }
}

int main( void )
{
    testCompare( 5, 10 );
    testCompare( 10, 5 );
    testCompare( 1987654321, -1987654321 );
    return 0;
}
Then why isn't his example stupid, but fine? They're basically doing the same bad thing - subtracting two signed integers without caring about possible overflow.
I'm learning javascript, so this sort of issue doesn't really make sense to me, but I gather this is a specific issue with C and it has to do with how large integers are handled. Fair enough, but I'm actually quite curious how you should solve this kind of problem in a low level language and why.
You just write a function that returns -1, 0, or 1. It's not specific to C, it's in any language that has fixed-size types.
Another reason (a lame one) to return -1, 0, or 1 is that often the caller expects one of those return values for some reason (because the caller is dumb and your comparison function is replacing another one that behaved that way).
x = 1987654321 and y = -1987654321? Then the difference between them is -319658654 (negative) which proves that x is less than y. That’s less than correct.
Which is completely 100% wrong. Surely the difference would be x - y, i.e. (1987654321 - (-1987654321)) = (1987654321 + 1987654321) = 3975308642.
Which is perfectly ok, because that's positive and so proves x is greater than y. So this comparison works just fine for negative integers...
Don't use subtraction for comparison in any language that uses floating-point (like Lua or JS) or silent overflow (like, typically, C, C++, Java, and C#). There are a few languages, like Python, where integer subtractions that overflow will transparently produce bignums, but they are kind of the exception.
If performance really is of concern, qsort() or similar functions probably shouldn't be used. Instead, a dedicated function can be used, which allows choosing a specific algorithm (qsort() doesn't have to use quicksort) and also avoids calling a comparison function through a function pointer.
One interesting thing I have learned recently is that GCC can do inlining through function pointers.
That doesn't change your point about a dedicated function being the better idea for performance, but it's one thing the "std::sort is faster by design" people often miss.
From GCC documentation:
> -findirect-inlining
Inline also indirect calls that are discovered to be known at compile time thanks to previous inlining. This option has any effect only when inlining itself is turned on by the -finline-functions or -finline-small-functions options.
I have no idea when it can and when it can't do that to be honest. I tested qsort vs std::sort on my machine on my data and performance was the same (Windows, MinGW, GCC 4.8, -flto enabled) but other people reported different results.
I would hope that on modern compilers on modern architectures, this would be done with zero branches. That is the case in a quick test on my machine (gcc 4.6.3).
The simplest way to implement that would be two SETxx instructions and a subtract on 386+, SLT/SGT on MIPS, two subtracts (or one compare and one subtract) and two predicated moves on ARM. No branching required at all.
Exactly, don't use subtraction for comparison in C.... when the difference might exceed 2^31.
For many data types, like time, that will normally never happen, making subtraction the safest no-brainpower-needed approach. If in doubt, use the next larger integer type.
If you ever want to make a vulnerability researcher drool visibly, have a C programmer say, "that will normally never happen, making ${XXX} the safest no-brainpower-needed approach".
Actually, with time, the sign of (iXX)((uXX)x-(uXX)y) is the "odometer comparison" of x and y, which handles overflow more gracefully than the typical < comparison.
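In code that's the same trick TCP uses for sequence numbers; a sketch with wrapping 32-bit timestamps:
/* "odometer" comparison: a is "before" b as long as the two timestamps are
   less than half the 32-bit circle apart */
#include <stdint.h>
int time_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}
/* e.g. time_before(0xFFFFFFF0u, 0x00000010u) is true:
   0xFFFFFFF0 - 0x10 wraps to 0xFFFFFFE0, which is -32 as int32 */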
You know what causes bugs in C (besides memory leaks)? Complex or clever expressions that don't reveal their semantics at first glance. As an example from elsewhere in the thread:
return (x > y) - (x < y);
Quick. What's that do?
What will the next person who looks at the code think it does?
C programmers should also be familiar with carry propagation; I see no reason they shouldn't be expected to work out the correctness of
int cmp(int x, int y) {
const unsigned a = x;
const unsigned b = y;
const unsigned s = sizeof x * 8 - 1;
return ((b^((b^(b-a))&(a^(b-a))))>>s)&~-((a^((a^(a-b))&(b^(a-b))))>>s)^-((a^((a^(a-b))&(b^(a-b))))>>s)&~-((b^((b^(b-a))&(a^(b-a))))>>s);
}
It's a common expression which is very efficient (why else would you write anything in C if you don't care about code being fast/efficient?), and what you proposed is neither.
The real difference is that it's simple. There are only 11 symbols in the expression, only two things that can vary, and only three interesting cases. As opposed to 129 symbols and some large number of interesting cases.
Yeah but that's what happens when people become so adapted to their tools that they forget how real math/world works... and they even have the nerve to stand up for their flawed tools.
I actually like and use C everyday, but it would be aberrant for me to say essentially what the article implies: "A math algorithm is wrong because it doesn't work in C"