
> Every UB could very well be wrapped in error checking

Not in C and C++. How would you check that a pointer is safe to dereference for example?




What do you mean? Just check if it's not null?


Not all non-null pointers are safe to dereference.


Like?


On the hardware and assembly level, all unmapped memory regions are illegal to access and will cause a segfault.

In high level languages it gets a lot more complicated (look up "pointer provenance").


Like one that has been freed, or one that points to a stack object that is no longer in scope. Or one that points one past the last element of an array.
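To make the one-past-the-end case concrete, here is a minimal sketch in C. The helper names (`passes_null_check`, `one_past_end_passes_null_check`) are made up for illustration. It only shows that a null check accepts such a pointer; the dereference itself is left in a comment because it would be UB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the null test is the only portable check
   plain C offers on a bare pointer value. */
int passes_null_check(const int *p) {
    return p != NULL;
}

/* Returns 1 if a one-past-the-end pointer passes the null check. */
int one_past_end_passes_null_check(void) {
    int arr[3] = {1, 2, 3};
    const int *end = arr + 3;  /* valid to form and to compare */
    /* *end would be undefined behavior: no element lives there. */
    return passes_null_check(end);
}
```

So the null check says "fine" for a pointer that must never be dereferenced. The freed and out-of-scope cases are worse, since nothing in the pointer value itself records that the object's lifetime has ended.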


It's unsafe to do that, but it's not UB


C23:

> The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type "pointer to type", the result has type "type". If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

below in a note (emphasis mine):

> Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

Now the note is not normative, but I assume there is normative wording defining what the "invalid" pointer values are, scattered around the standard.

C++:

https://eel.is/c++draft/expr.unary.op#1.sentence-4

C++ is very particular in what it means for a pointer to point to an object, and this is also UB there.


I don't have all ca 200 UBs in my head (see https://gist.github.com/Earnestly/7c903f481ff9d29a3dd1), but I'm pretty sure all types of illegal memory accesses count as UB.

For instance from skimming the list:

- An object is referred to outside of its lifetime (6.2.4).

- The value of a pointer to an object whose lifetime has ended is used (6.2.4).

- An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).


Technically the array subscript example is UB even before the dereference. a[1][7] is equivalent to `*(a[1] + 7)` and a[1] + 7 itself is UB.

(6.5.7) Additive operators

> If the pointer operand and the result do not point to elements of the same array object or one past the last element of the array object, the behavior is undefined.

Interestingly, a[1][5] is also UB, but for dereferencing a past-the-end pointer, not for the arithmetic.
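Since `a[1] + 7` is already UB before any dereference, a check for this case has to look at the indices first and only then form the pointer. A rough sketch for the `int a[4][5]` example, assuming the array bounds are known at the call site; `checked_get` and `demo_checked_get` are hypothetical names, not anything from the standard:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };

/* Hypothetical helper: validates both indices before any pointer is
   formed, so neither a[1] + 7 nor *(a[1] + 5) is ever evaluated. */
bool checked_get(int a[ROWS][COLS], size_t row, size_t col, int *out) {
    if (row >= ROWS || col >= COLS) {
        return false;
    }
    *out = a[row][col];
    return true;
}

/* Small self-check: returns how many of the three accesses were accepted. */
int demo_checked_get(void) {
    int a[ROWS][COLS] = {{0}};
    int value = -1;
    int accepted = 0;
    a[1][4] = 42;
    accepted += checked_get(a, 1, 4, &value);  /* in range */
    accepted += checked_get(a, 1, 5, &value);  /* rejected: past the end of a[1] */
    accepted += checked_get(a, 1, 7, &value);  /* rejected: arithmetic would be UB */
    return accepted;
}
```

This only works because the bounds travel with the type here. With a bare `int *` there is nothing to check against, which is the original point of the thread.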


Yes, that's exactly what I meant.

Also, dereferencing a pointer with the wrong dynamic type.

> all ca 200 UBs in my head

Technically there is no enumerated set of all possible UB. Anything not explicitly defined in the standard is UB.

/extremely-pedantic


WG21 has an ambition to write an appendix listing all the UB and IFNDR, but there's no specific timeline.

It's a bit like that "Migrate off obsolete DB server" card that's been sat in your team's (well most teams) backlog for a couple of years. Everybody agrees it should be done... But like not this sprint.

My assessment is that at their current rate WG21 is adding new UB to the language slightly faster than it's being documented for the appendix, so this won't actually finish.



