> the same people who think that it's okay that memcpy(0, 0, 0) is undefined behavior.
I am curious about this particular example. How does that ever make sense, where do you have that?
I'd accept that it would be nice to have memcpy(p1, p2, n) with valid pointers p1 and p2 to still be defined when n=0, because maybe n is variable and might work out to be 0 in some cases, which you then don't need to treat specially. I don't know whether that case is defined or not.
But in the case where either p1 or p2 are NULL, the memcpy will be illegal no matter what n is, so if p1 or p2 are variable you always have to check them anyway, and to run into the memcpy(0, 0, 0) case you would actually have to special case n=0 when p1 or p2 were NULL, which makes no sense.
Unless 0 is actually a valid address (kernel code with identity mapping at 0, simple platforms), but then NULL becomes a whole different beast.
Because memcpy(x, y, 0) for any x and y is a no-op. Why should we have a special undefined case for x==nullptr || y==nullptr? In general, we shouldn't introduce unnecessary boundary conditions to APIs.
Because not making that case undefined puts an additional and completely unnecessary burden on any implementation of memcpy? Now you have defined that memcpy(_, _, 0) is a no-op, so the implementation has to make sure that it remains a no-op, which, if a particular memcpy needs to perform an operation before the actual copy, may well involve adding conditionals to your implementation. For a case that a program will never sensibly run into (see my previous comment for why).
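As a rough sketch of what I mean, assume a platform where the destination region has to be prepared (say, unlocked) before it can be written; unlock_region/lock_region are made-up names purely for illustration:

    #include <stddef.h>

    /* Stand-ins for hypothetical platform hooks; imagine the real ones
       trap or misbehave when handed a NULL pointer. */
    static void unlock_region(void *p, size_t n) { (void)p; (void)n; }
    static void lock_region(void *p, size_t n)   { (void)p; (void)n; }

    void *memcpy_locked(void *dst, const void *src, size_t n)
    {
        if (n == 0)            /* guard needed only so that dst==NULL with n==0
                                  stays a no-op; with the UB rule the unlock
                                  below could run unconditionally */
            return dst;
        unlock_region(dst, n); /* preparation step before the actual copy */
        unsigned char *d = dst;
        const unsigned char *s = src;
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        lock_region(dst, n);
        return dst;
    }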
It's perfectly sensible to run into the case: imagine we're copying from a buffer that's allocated on demand. In the case where we haven't allocated the buffer, we have buf==nullptr and bufsz==0. Why should it be wrong to memcpy(other_buffer, buf, bufsz) without first checking whether bufsz is zero (and thus buf is nullptr)?
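Something like this (names invented for illustration) is the pattern I have in mind:

    #include <stddef.h>
    #include <string.h>

    /* Illustrative only: a lazily allocated buffer that starts out empty. */
    struct growbuf {
        unsigned char *buf;   /* NULL until the first write */
        size_t         bufsz; /* 0 until the first write   */
    };

    size_t growbuf_snapshot(const struct growbuf *g, unsigned char *dst)
    {
        /* If nothing was ever written, this call is memcpy(dst, NULL, 0),
           which the standard makes undefined even though it copies nothing. */
        memcpy(dst, g->buf, g->bufsz);
        return g->bufsz;
    }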
I don't think your optimization actually exists. In what possible world can we generate better code by assuming that for memcpy(x,y,n), when n==0, x!=nullptr&&y!=nullptr?
I think that in some cases, the claim that undefined behavior allows for better optimization is bullshit.
> In the case where we haven't allocated the buffer, we have buf==nullptr and bufsz==0. Why should it be wrong to memcpy(other_buffer, buf, bufsz) without first checking whether bufsz is zero (and thus buf is nullptr)?
That's not enough. For the case that resolves to memcpy(0, 0, 0), other_buffer is also 0. That means you need to guard against memcpy(0, buf, bufsz) with buf/bufsz != 0 somewhere anyway. Unless you are saying that the standard says that the more general memcpy(p, 0, 0) is undefined, which is a slightly different discussion.
> I don't think your optimization actually exists. In what possible world can we generate better code by assuming that for memcpy(x,y,n), when n==0, x!=nullptr&&y!=nullptr?
In any world where you need to e.g. lock the destination memory before accessing it, or do any other preparation/teardown. Not all platforms out there have one linear address space. In small scale applications, as just one example, it is not uncommon to have different pointer types for different types of memory, e.g. internal vs. external memory, I/O address spaces (lots of side effects there), maybe even a non-volatile memory address space.
In my opinion, it is easy and sensible to declare that memcpy(0,0,0) shall be undefined, especially because, again, no sensible program should ever reach that case without error (which, again, is different from the memcpy(p1, p2, 0) with p1/p2 != 0 case, which can be sensibly reached).
In fact, I'd argue that your proposal is the one that introduces the special case that all implementations would now have to take care of, adding complexity at best and a performance impact at worst, for something that a program won't ever do.
Can you cite one example of a modern platform that would be meaningfully impacted by making memcpy(a,b,0) well-defined for all a, b? These hypothetical situations you're talking about aren't real. We don't have computers with 18-bit bytes or one's complement arithmetic either.
As for the distinction between a==0&&b==0 and a==0||b==0: first of all, the standard does make memcpy(a,b,0) undefined when a==0||b==0. Second, it's perfectly reasonable to run into a situation where both a and b can be zero: consider the same demand-allocated buffer case. Why wouldn't you also want to demand-allocate the destination buffer?
> In my opinion, it is easy and sensible to declare that memcpy(0,0,0) shall be undefined, especially because, again, no sensible program should ever reach that case without error (which, again, is different from the memcpy(p1, p2, 0) with p1/p2 != 0 case, which can be sensibly reached).
It's common to see valid but semantically-meaningless constructs like that in autogenerated C code. Same with quite a few other cases mentioned in the article.
It's just not a good idea for a compiler author to arbitrarily start throwing warnings for memcpy(a,b,0) at this late date, regardless of a and b. That ship has sailed, leaks and all.
You've phrased the result of the UB in a way that disguises the benefit: your complaint is about adding a special case, and for an arbitrary memcpy(x, y, n) call, the compiler would have to prove it isn't in this special case (i.e. n != 0) or else it couldn't assume x != nullptr && y != nullptr. The codegen benefit isn't when n == 0, but when n is unknown.
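Roughly like this (a sketch, not a claim about any particular compiler version, though gcc and clang are known to delete null checks based on exactly this kind of reasoning):

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    void consume(unsigned char *dst, const unsigned char *src, size_t n)
    {
        memcpy(dst, src, n);   /* UB if dst or src is NULL, for any n */

        /* Because of that, the compiler may assume src != NULL here and
           delete this check entirely, even on the path where n == 0.
           If memcpy(x, y, 0) were defined for null pointers, the
           assumption would only hold when n is known to be nonzero. */
        if (src == NULL)
            puts("never reached, as far as the optimiser is concerned");
    }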
This is not how an assembly programmer sees this problem.
(memcpy is often hand coded in assembly).
This means the code takes advantage of CPU pipelines and branch prediction, which means the source/destination pointer might be dereferenced and loaded into a register before the number of bytes is inspected, or the copy sequence is executed with a count of 0, which still makes the CPU do a load or store on the src or dest pointer. This is done to gain performance, and doing things in another order or explicitly validating the input decreases performance.
If then the src/dest pointer is NULL, the above scenario plays out badly, but in order to not sacrifice performance, the burden is placed on the caller to correctly use memcpy() instead of memcpy() detecting invalid use.
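A C-level caricature of that (just a sketch; real implementations do this in assembly with wide or vector loads): the first source byte is loaded before the length is even looked at, so a NULL src faults immediately, n == 0 or not.

    #include <stddef.h>

    void *copy_eager(void *dst, const void *src, size_t n)
    {
        unsigned char *d = dst;
        const unsigned char *s = src;
        unsigned char first = *s;   /* speculative load, issued before testing n */
        if (n == 0)
            return dst;
        d[0] = first;
        for (size_t i = 1; i < n; i++)
            d[i] = s[i];
        return dst;
    }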
Sure: except that rule isn't useful. You can't _rely_ on undefined behavior, so preserving the rule that "memcpy(x, y, n) is undefined if x or y are null" isn't helpful. OTOH, preserving the rule that memcpy(x,y,n) is defined for all x,y when n==0 really is useful.
It is useful. I'm focusing on the effect on code generation: the compiler relies on that rule to deduce that x and y are not null. The special case for n==0 makes it harder for the compiler to deduce the non-null-ness and thus harder for it to optimise based on that. Restricting when optimisations can be applied is the effect of your special case.
You might not agree with this as a design decision, but I'm not arguing that this is the right trade-off, just answering:
> I don't think your optimization actually exists. In what possible world can we generate better code by assuming that for memcpy(x,y,n), when n==0, x!=nullptr&&y!=nullptr?
In mathematics, this is called a discontinuity. Google the value of 0^0 for comparable discussion (it ‘is’ 1 because that makes more formulas look nice)