It could make some sense to use strchr, because in idiomatic UNIX tools, single character command line options can be clustered. But that also means that subsequent code should not be tested for a specific position.
And if you ever find yourself actually doing command line parsing, use getopt(). It handles all the corner cases reliably, and consistent with other tools.
Your code actually has 2 bugs. The first I assume is just a typo and you meant to use [1][1] == ‘r’. The second one is that you would accept “-rblah” as well.
Not to mention the potential signed integer overflow in (*right - *left) and (*left - *right), which is undefined behavior. And even if you rely on common two's complement wraparound, the result may be wrong; for example, (INT_MAX-(-1)) should mathematically yield a positive value, but the function will produce INT_MIN, which is negative.
And then we have this "modern" way of spelling pointers, "const int* right" (note the space). In C, declaration syntax mirrors use, so it should be "const int *right", because "*right" is a "const int".
You are right, the implementation of the "compare" function should have used only comparison operators, not subtraction, because unlike subtraction the comparison operations are not affected by overflow (the hardware implementation of integer comparison handles overflow automatically).
When there is no overflow, the sign of the subtraction result provides the same information as a comparison operator, but this is no longer true when overflow happens.
Using only arithmetic operators would require explicit checks for overflow and this would be a very inefficient implementation, because the comparison operators handle overflow implicitly without any overhead (because the comparison operators do not use the sign of the subtraction result to decide their truth value, but they compute the true sign of the result, by the modulo 2 sum, a.k.a. XOR, of the result sign with the integer overflow flag; this is done in hardware, with no overhead).
The feature that integer comparison must be correct regardless of overflow has been a requirement in CPU design since the earliest times. Very few CPUs, the most notable examples being Intel 8080 and RISC-V, lacked this feature. The competitors and successors of Intel 8080, i.e. Motorola MC6800 and Zilog Z80 have added it, making them much more similar to the previously existing CPUs than Intel 8080 was. The fact that even microprocessors implemented with a few thousand transistors around 1975 had this feature emphasizes how weird is that RISC-V lacks such a feature in 2025, after 50 years and being implemented with a number of transistors many orders of magnitude greater.
Sorry, but this is the kind of ridiculous reply that the RISC-V fans give when they are asked why their ISA lacks many of the features that any decent ISA has and which have a negligible implementation cost, therefore no reason to be missing.
The workaround suggested by the RISC-V documentation consists in replacing a very large fraction of all instructions of a program (because there are a lot of integer additions, subtractions and comparisons in any program, close to a half of all instructions) with 3 or more instructions, in order to approximate what in any other CPU is done with single instructions.
The other ridiculous workaround proposed to save RISC-V is that any high-performance implementation must supplant its missing features by instruction fusion.
Yes, the missing hardware for overflow detection can be replaced by multiplying the number of instructions for any operation and the missing addressing modes can be synthesized by instruction fusion, but such implementation solutions are extraordinarily more expensive than the normal solutions used for 3 quarters of century in the other computers, since they were made with vacuum tubes.
Because of the extreme overhead of checking for overflow, I bet that most programs compiled for RISC-V do not check for overflow, which is not acceptable in reliable programs (even when using C/C++, I always compile them with overflow checking enabled, which should have been the default option, to be disabled only in specific cases where it can be proven that overflow is impossible and the checks reduce the performance).
Oh, sorry, I thought you were saying that "RISC-V's comparison instructions don't properly handle integer overflow that internally happens when they do the comparisons, i.e. it only has unsigned comparisons".
The cost of overflow checks turns out to be largely about missed optimizations due to heavier constraints wrt. how the program should behave if overflow occurs (e.g. preserving partial results). Having an overflow check instruction in the ISA just doesn't matter all that much, it can even hurt in bignum computation (often cited as a favorable case for overflow checks) by introducing unwanted insn dependencies.
What you say about missed optimizations is true only when the compiler attempts to handle itself in a graceful way the cases when overflows would occur, instead of raising exceptions.
This is not what is normal overflow checking. Normal overflow checking just raises a specific exception when integer overflow happens.
This has absolutely no effect upon compiler optimizations. The compiler always generates code ignoring the possibility of exceptions. When exceptions happen, the control is passed far away to the exception handler, which decides what to do, e.g. to save debugging information and abort partially or totally the offending program, because an overflow is normally the consequence of an unforeseen program bug and it is impossible to do any action that will allow the continuation of the execution.
You should remember that there is nothing special about integer overflow, almost every instruction that the compiler generates can raise an exception at run time. Any branch instruction, any memory-access instruction, any floating-point instruction, any vector instruction can raise an exception due to hardware. In recent CPUs, integer overflow is not raised implicitly, so you have to insert a conditional branch, but this is irrelevant.
If your theory that the possibility of raising exceptions can influence compiler optimizations were true, there would exist no compiler optimizations, because from every 10 or so instructions generated by a compiler at least a half can raise various kinds of exceptions, in a manner completely unpredictable by the compiler. Adding integer overflow exceptions changes nothing.
For testing, I use a custom qemu plugin to calculate the dynamic instruction count, dynamic uop count, and dynamic instruction size.
Every instruction with multiple register writebacks was counted as one uop per writeback, and to make the results more comparable, SIMD was disabled.
I used this setup to run self-compiling single-file versions of chibicc (assembling) and tinycc (generating object file), which are small C compilers of 9K and 24K LOC respectively.
Both compilers were cross-compiled using clang-22 and were benchmarked cross-compiling themselves to x86.
Let's look at the impact of -ftrapv first.
In chibicc O3/O2/Os the dynamic upos increased due to -ftrapv for RISC-V by 5.3%/5.1%/6.7%, and for ARM by 5.1%/5.0%/6.4%.
Interestingly, in tinycc it only increased for RISC-V by 1.6%/1.0%/1.0%, while ARM increased slightly more with 1.6%/2.0%/1.3%.
In terms of dynamic instruction count, ARM needed to execute 6%/15% fewer instructions than RISC-V for chibicc/tinycc.
Looking at the uops, RISC-V needs to execute 6% more uops in tinycc, but ARM needs to execute 0.5% more uops with chibicc.
The dynamic instruction size, which estimates the pressure on icache and fetch bandwidth, was 24%/10% lower in RISC-V for chibicc/tinycc.
Note that this did not model any instruction fusion in RISC-V and only treated incrementing loads and load pairs as multiple uops (to mirror Apple Silicon).
If the only fusion pair you implement is adjacent compressed sp relative stores, then RISC-V ends up with a lower uop count for both programs.
They are trivial to implement because you can just interpret the two adjacent 16-bit instructions as a single 32-bit instruction, and compilers always generate them next to each other and in sorted order in function prolog code.
You can do this directly in your RVC expander; it only adds minimal additional delay (zero with a trick), which is constant regardless of decode width.
int main(int argc, char* argv[]) {
}Why not
if (argc > 1 && strcmp(argv[1], "-r") == 0) {
}for example?