C23 Implications for C Libraries

vbtemp · on Nov 21, 2022

I have a question I've always wanted to know but too embarrassed to ask (Especially because I've extensively used C for well over a decade now and am intimately familiar with it):

Who exactly are these new C-standards for?

I interact and use C on an almost daily basis. But almost always ANSI C (and sometimes C99). This is because every platform, architecture, etc has at least an ANSI C compiler in common so it serves as the least common-denominator to make platform-independent code. As such it also serves as a good target for DSLs as a sort of portable-assembly. But when you don't need that, what's the motivation to use C then? If your team is up-to-date enough to quickly adopt C23, then why not just use Rust or (heaven forbid, C++23)?

I'd love to hear from someone who does actively use "modern" C. I would love to be a "modern C" developer - I just don't and can't see its purpose.

acuozzo · on Nov 21, 2022

> Who exactly are these new C-standards for?

An example: The C11 memory model + <stdatomic.h> + many compilers supporting C11 has/had a positive impact on language runtimes. Portable CAS!

> If your team is up-to-date enough to quickly adopt C23, then why not just use Rust or (heaven forbid, C++23)?

Another example: If you're programming e.g. non-internet-connected atomic clocks with weather sensors like those produced by La Crosse, then there's no real security model to define, so retraining an entire team to use Rust wouldn't make much sense. (And, yes, I know that Rust brings with it more than just memory safety, but the semantic overhead comes at a cost.)

Another example: Writing the firmware to drive an ADC and broker communication with an OS driver.

Another example: The next Furby!

attractivechaos · on Nov 21, 2022

Atomic is one of the few things in C11 I like most. Unfortunately, it is an optional feature along with threading [1]. It is not portable. In the end, I am still using gcc/clang's __sync or __atomic builtins.

[1] https://en.wikipedia.org/wiki/C11_(C_standard_revision)#Opti...

fweimer · on Nov 21, 2022

<stdatomic.h> is provided by GCC (not the libc), so I expect it to be available everywhere the atomic builtins are supported.

I prefer the builtins. With _Atomic you can easily get seq-cst behavior by mistake, and the <stdatomic.h> interfaces are strictly speaking only valid for _Atomic types.

acuozzo · on Nov 22, 2022

> Unfortunately, it is an optional feature along with threading [1]. It is not portable.

Portability isn't binary. It's the result of work being done behind-the-scenes to provide support for a common construct on a variety of hardware and operating systems. It's a spectrum.

GCC certainly is portable, but it doesn't support every ISA and OS. Over time it has even dropped support for several.

Random thoughts since I'm still in the process of waking up…

1. Most of <stdint.h> is optional

2. long is 64 bits on Tru64 Unix which is valid under all versions of the standard

kazinator · on Nov 21, 2022

Requiring C11 is itself a portability issue.

I can easily whip up CAS for all platforms I care about, plus a few more for bonus, using nothing but C99 or C90 (and have actually done it before).

Possibly, without even using any language extensions, unless I want the operations inlined.

MaxBarraclough · on Nov 21, 2022

Did you manage to write a truly portable compare-and-swap in standard C? Or did the code just happen to seem to work on your platform?

I'd be surprised if the former were possible. From a quick web search, C doesn't even give you the guarantees necessary for Peterson's Algorithm, [0][1] and volatile doesn't help. [2][3]

[0] https://codereview.stackexchange.com/a/124683

[1] https://stackoverflow.com/questions/35527557/errors-with-pet...

[2] https://web.archive.org/web/20160525000152/https://software....

[3] https://old.reddit.com/r/programming/comments/d457c/volatile...

vbtemp · on Nov 21, 2022

I think the point by kazinator still stands. C11 is still a portability issue (which was the root of my original question above). ANSI C is the de-facto baseline everything has in common, using C11+ narrows which platforms you're able to build for. Even if C11 does provide the primitives for correct implementation of Peterson's Algorithm, what about the other platforms that don't have a C11 compiler - it does them no good. As to the point more directly, a lot of the C code I've seen is for real time and embedded systems that are usually time and memory partitioned, and do not have the same concerns regarding concurrency.

I guess if I could rephrase my original question: People who are going to adopt C23 - who are you and what field/industry/line-of-work are you in?

Asking because in my line-of-work, C is ubiquitous and I personally love coding in C, but anything beyond ANSI C (or C99) is "cool" but undermines the point of C as I've used it, which is its use cross-platform for a huge set of common and uncommon architectures and instruction sets. If something only needs to run on common, conventional platforms, C, however much I love it, would no longer necessarily be a strong contender in light of many alternatives. It seems like these standards target an ever shrinking audience (much smaller than the whole universe of software developers working in C).

MaxBarraclough · on Nov 21, 2022

  > ANSI C is the de-facto baseline everything has in common,
  > using C11+ narrows which platforms you're able to build for.

Sure, but which platforms are you thinking of? I think you're overestimating the proportion of C codebases that care about targeting highly exotic hardware platforms. GCC can target all sorts of ISAs, including embedded platforms. It can't target, say, the 6502, or PIC, or Z80, but they're small niches.

I'm not an embedded software developer though, perhaps there are more developers using niche C compilers than I realise.

  > Even if C11 does provide the primitives for correct implementation
  > of Peterson's Algorithm, what about the other platforms that don't
  > have a C11 compiler - it does them no good.

If portable atomics/synchronisation machinery is not offered by your C compiler or its libraries, I figure your options are:

1. Use platform-specific atomics/synchronisation functionality

2. Leverage compiler-specific guarantees to write platform-specific atomics/synchronisation functionality, if your compiler offers such guarantees

3. Write your atomics/synchronisation functionality in assembly, and call it from C

Here's a project that uses all 3 approaches. [0] (It's old-school C++ rather than C, but it's the same in principle.)

I'm fairly sure it's not possible to implement your own portable synchronisation primitives in standard-compliant C90 code. As I understand it, the C90 standard has nothing at all to say on concurrency. It's possible that such an attempt might happen to work, on some given platform, but it would be 'correct by coincidence', rather than by definition. (Again, unless the particular compiler guaranteed the necessary properties.)

[0] https://github.com/gogglesguy/fox/blob/fe99324/lib/FXAtomic....

kazinator · on Nov 21, 2022

Neither. The code wasn't standard C, and it didn't just "happen" to work, either.

You probably wouldn't want to build an atomic compare-exchange in standard C, even if it were possible; you find out what the hardware provides and work with that.

MaxBarraclough · on Nov 21, 2022

  > The code wasn't standard C, and it didn't just "happen" to work

Thanks, that makes sense.

At the risk of sounding pedantic, you did say using nothing but C99 or C90, implying use of standard features only.

  > You probably wouldn't want to build an atomic compare-exchange in
  > standard C, even if it were possible; you find out what the
  > hardware provides and work with that.

Agreed.

kazinator · on Nov 22, 2022

Right; I could do it using nothing but standard C features in C source files, by defining some C compatible assembly language routines. The compare_swap primitive can be an external function, e.g.:

  bool cmpswap64(uint64_t *location, uint64_t if_old, uint64_t then_new);

Code that relies on the header doesn't have to process anything compiler-specific. FFI could be used to bind to that function from non-C languages.

vbtemp · on Nov 21, 2022

Interesting discussion, lots of good points all around. Thanks to you both.

flohofwoe · on Nov 21, 2022

In my case: because writing C code (specifically C99 or later - designated init and compound literals!) gives me joy in a way that neither C++ nor Rust provide (C++ was my go-to language for nearly two decade between ca. 1998 and 2017), and I tinkered with Rust a couple of years ago, enough that I realized that I don't much enjoy it.

IMHO, both C++ and Rust feel too much like puzzle solving ("how do I solve this problem in *C++*" or "how do I solve this problem in *Rust*?"), when writing C code, the programming language disappears and it simply becomes "how do I solve this problem?").

PS: I agree that the C standard isn't all that relevant in practice though, you still need to build and test your code across the relevant compilers.

mathstuf · on Nov 21, 2022

Maybe we just work on different kinds of software, but I feel like I'm actually solving problems in Rust when I'm using it. I don't have to think about all the terrible string manipulation APIs and how they can come back and bite me, who owns what is something I still have to decide except that the compiler actually helps out, and I have access to nice APIs that solve ancillary problems for me already (e.g., rayon, serde, etc.). I can't wait for the day when another parser will never be written in C again.

In C, I feel like I'm building a house out of tinker toys, C++ is Lego Techniks, and Rust I'm using bricks and mortar. FWIW, Python feels like waterballoons and drywall to me; while it might look OK from the outside, one thing pierces your exterior and things tend to sag sadly from there.

flohofwoe · on Nov 21, 2022

> terrible string manipulation APIs

Don't use (most of) the C stdlib, it's useless and hopelessly antiquated (not just for string manipulation), instead use 3rd party libs.

> who owns what is something I still have to decide except that the compiler actually helps out

Lifetime management for dynamically allocated memory in C should also be wrapped in libraries and not left to the library user. In general, well designed C library APIs replace "high level abstractions" in other languages.

But I agree, the memory safety aspect of Rust is great, and I'd love a small language similar to C, but with a Rust-like borrow checker (but without all the stdlib bells'n'whistles that come with it, like Box<>, RefCell<>, Rc<>, Arc<>, etc etc etc...) - instead such a language should try to discourage the user from complex dynamic memory management scenarios in the first place.

It's not the memory safety in Rust that turns me off, but the rest of the 'kitchen sink' (for instance the heavy functional-programming influence).

mathstuf · on Nov 21, 2022

> use 3rd party libs

And now I have more problems ;) . (And I develop on CMake; consistently using external deps is a nightmare.)

The thing is that I usually am that library author (or working in an area that acts like that).

I'm not sure what you expect to be left if you say "the stack is all you can use" (which is what I understand to be remaining when you remove those "bells'n'whistles").

I also really enjoy the functional aspects. I don't want to think about what some of the iterator-based code I've done looks like in C or C++ (even with ranges).

flohofwoe · on Nov 21, 2022

> "the stack is all you can use" (which is what I understand to be remaining when you remove those "bells'n'whistles")

Not what I meant, heap allocations are allowed (although the stack should be preferred if possible), but ideally only with long lifetimes and stable locations (e.g. pre-allocated at program startup, and alive until the program shuts down), and you need a robust solution for spatial and temporal memory safety, like generation-counted index handles: https://floooh.github.io/2018/06/17/handles-vs-pointers.html).

mathstuf · on Nov 21, 2022

Yeah, I feel like we work in completely different realms of programming. Which is fine; there are plenty of languages out there for all kinds of use cases. FWIW, pre-allocation absolutely doesn't work when you support loading GB-size datasets. You certainly can't leave it sitting around forever either once it is loaded.

steveklabnik · on Nov 21, 2022

What is your favorite 3rd party string library? I'm aware of several, but haven't used any of them in anger.

anfilt · on Nov 21, 2022

"IMHO, both C++ and Rust feel too much like puzzle solving ("how do I solve this problem in C++" or "how do I solve this problem in Rust?"), when writing C code, the programming language disappears and it simply becomes "how do I solve this problem?")."

This statement very much resonates with me. It's honestly one of the things I like about C. Although it's not perfectly like this for me all the time. For example string manipulation is not great.

An other aspect I like about C is there is not a plethora ways of doing the same thing which I have found always made it more readable than rust and C++.

Mindless2112 · on Nov 21, 2022

In contrast, when I write C, I spend far too much time thinking "how do I solve this problem without causing undefined behavior?"

flohofwoe · on Nov 21, 2022

That what UBSAN is for (along with ASAN, TSAN, static analyzers and compiler warnings, just dial everything to eleven and you can offload a lot of that thinking to the compiler - it's not the 1990's anymore ;)

Mindless2112 · on Nov 21, 2022

UB sanitizers can only show that your code has undefined behavior, not that it does not. And the results are only as good as your tests. Those sanitizers are also not available with old embedded toolchains.

I do dial up the warnings to 11, yet it is not enough.

I've written C code that's currently running on hardware orbiting the earth. I'll never do it again if I can help it; it wasn't worth the stress. You only get one chance to get it right.

flohofwoe · on Nov 21, 2022

> I've written C code that's currently running on hardware orbiting the earth.

I guess in such a situation I would not not trust any compiler (for any programming language), no matter how 'safe' it claims to be, but instead carefully audit and test every single assembly instruction in the compiler output ;)

quelsolaar · on Nov 21, 2022

I'm in the WG14, and I, like you, only use c89. So why does c23 matter? Well in terms of features it matters very little but a big part of wg14s work is clarifying omissions from previous standards. So when c23 specifies something that has been unclear for 30+ years, compiler developers back port it in to older versions of C where it was simply unclear. It matters a lot for things like the memory model and things like that.

kazinator · on Nov 21, 2022

> compiler developers back port it in to older versions of C where it was simply unclear

You cannot rely on that. If you're maintaining C90 code, with a C90 compiler or compilation mode, you should go by what is written ISO 9899:1990, plus whatever the platform itself documents.

We actually don't want compiler writers mucking with the support for older dialects to try to modernize it. It's a backward-compatibility feature; if you muck with backward compatibility features, you risk breaking ... backward compatibility!

quelsolaar · on Nov 21, 2022

C cares a hell of a lot about backwards compatibility. Whenever there is a corner case that gets fixed, the number one goal is to retain compatibility. most of the time, these clarifications clarify what everyone is already doing and has been doing for decades.

Also, most of these corner cases are so obscure that the vast majority of people with decades of C experience have not encountered them. C is an extremely explored space.

kccqzy · on Nov 21, 2022

At least for C++ there is something called defect reports. When agreed, those defect reports to retroactively applied to previously published C++ standards.

As a random example for something as fundamental as classes in C++, the page https://en.cppreference.com/w/cpp/language/classes shows ten defect reports.

vbtemp · on Nov 21, 2022

That's very interesting. Thank you.

layer8 · on Nov 21, 2022

Existing C code needs to be maintained, and can take advantage of the newer features when available in the compiler. The Linux kernel is moving to C11, and may move to C17/C23 later. Also not everyone wants to put up with the compilation times, object sizes, and aesthetics of Rust.

As for new developments, see for example https://news.ycombinator.com/item?id=33675462 which uses C11.

fweimer · on Nov 21, 2022

I doubt that the kernel will adopt the C++ memory model (the big change in C11). Instead, they will keep doing their own thing. Given the problems with the memory model, I can't really fault them. But framing this in terms of standards versions is a bit of a stretch. They could easily adopt additional GCC extensions over time as they move minimum compiler versions forward. Standardization does not really matter there.

enriquto · on Nov 21, 2022

> Who exactly are these new C-standards for?

For me and for many colleagues in my lab? C is quite big in scientific computing and signal processing. Fortran would be slightly better, and it is widely used, but not directly around me. The C99 standard, which added complex numbers and variable length arrays, was truly a godsend in the field. I cannot imagine working without it.

If you write a numerical algorithm that needs to be run 15 years from now, then C and Fortran are possibly the sanest choices. If you do something in other, fancier, languages, you can be sure that your code will stop working in a few years.

The new C standards are really minor changes to the language, and they happen in the span of a decade. It is quite easy to be up to date. And in the rare case that your old code stops compiling, the previous (less than a handful) versions of the language are all readily available as compiler options in all compilers. You can be reasonably sure that a C program written today will still compile and run in 20 years. You can be 100% sure that a python+numpy program won't. If you care about this (for example, if you are writing a new linear algebra algorithm to factor matrices), then choosing C is a rational, natural choice.

hutrdvnj · on Nov 21, 2022

> You can be 100% sure that a python+numpy program won't.

It's possible to use a phyton+numpy program in 20 years, but you also have to save the entire environment and make sure that it works air-gapped (otherwise external dependencies would fail). One possiblity would be to store it as a qemu virtual machine. It's very possible today to boot up stuff as VMs that is 20 years and older (e.g. 20 year old Linux distros or Windows XP iso from early 2000s).

LAC-Tech · on Nov 21, 2022

If your team is up-to-date enough to quickly adopt C23, then why not just use Rust

There's a lot of reasons to use C23 over rust

- multiple compiler implementations

- works on more platforms

- defined standard

- ability to create self-referential data structures without hacky workarounds

- immediate, easy access to large numbers of C libraries

(For the record I like rust, but the evangelism over the past half decade has been pretty ridiculous. Consider this counter propaganda).

addaon · on Nov 21, 2022

Keep in mind that if you want to write probably-maybe-correct code, Rust is maturing to be able to get you there more easily than C. But if you want actually-correct code, you need to do the legwork regardless of language; and C has a much more mature ecosystem (things like CompCert C, etc) that lets you do much of the analysis portion of that legwork on C code, instead of on generated assembly code as you'd have to do for Rust. Combined with verification costs that don't vary that much from language to language, and there's a long future where, for safety-critical applications, there's no downside to C -- the cost of verification and analysis swamps the cost of writing the code, and the cost of qualifying a new language's toolchain would be absurd. For this reason, C has a long, long future as one of the few languages (along with Ada, where some folk are making a real investment in tool qualification) for critical code; and even if it takes a decade for C23 features to stabilize and make it to this population, well, we'll still be writing C code well beyond '33.

MaxBarraclough · on Nov 21, 2022

  > Combined with verification costs that don't vary that much from
  > language to language, and there's a long future where, for
  > safety-critical applications, there's no downside to C -- the cost
  > of verification and analysis swamps the cost of writing the code

That doesn't sound right. You really want to get the code right early on. The later bugs are discovered, the more costly the fix. You may have to restart your testing, for instance.

If the language helps you avoid writing bugs in the first place, that should translate to quicker delivery and lower costs, as well as a reduced probability of bugs making it to production. The Ada folks are understandably keen to emphasise this in their promotional material.

  > the cost of qualifying a new language's toolchain would be absurd

As I understand it, this typically falls to the compiler vendor, not to the people who use the compiler. A compiler vendor targeting safety-critical applications will want to get their compiler certified, e.g. [0]. To my knowledge we're nowhere near a certified Rust compiler, although it seems some folks are trying. [1]

[0] https://www.ghs.com/products/compiler.html

[1] https://ferrous-systems.com/blog/sealed-rust-the-pitch/

electroly · on Nov 21, 2022

Are you asking about greenfield development only? One big obvious reason to use C23 instead of Rust or C++23 is if you already have a codebase written in C. Switching to C23 is a compiler flag; switching to Rust is a complete rewrite.

aidenn0 · on Nov 21, 2022

Places that are just now adopting C11 will probably adopt C23 in 12 years? C++ is (unfortunately, IMO) making inroads into embedded, but C is also still pretty widely used.

davidtgoldblatt · on Nov 21, 2022

My usages are similar to yours, but new C standards still benefit me because I can opportunistically detect and make use of new features in a configure script.

To use my baby as an example: free_sized(void *ptr, size_t alloc_size) is new in C23. I can detect whether or not it's available and use it if so. If it's not available, I can just fall back to free() and get the same semantics, at some performance or safety cost.

fweimer · on Nov 21, 2022

The free_sized function is kind of a bad example, though. For years to come, using free_sized will break malloc interposition. Interposed mallocs will not support free_sized initially. (It's currently not in jemalloc, tcmalloc, mimalloc as far as I can see.) If we add it to glibc and your application picks it up, calls to free_sized will end up with the glibc allocator even if malloc/free/… have been interposed. Maybe there is a way to paper over this in the glibc implementation of free_sized (rather than calling free unconditionally), and still do something useful for the glibc allocator. I don't know.

davidtgoldblatt · on Nov 22, 2022

> Maybe there is a way to paper over this in the glibc implementation of free_sized (rather than calling free unconditionally), and still do something useful for the glibc allocator. I don't know.

We emailed about this a little contemporaneously (thread "Sized deallocation for C", from February), and I think we came to the conclusion that glibc can make interposition work seamlessly even for interposed allocators lacking free_sized, by checking (in glibc's free_sized) if the glibc malloc/calloc/realloc has been called, and redirecting to free if it hasn't. (The poorly-named "Application Life-Cycle" section of the paper).

Bhurn00985 · on Nov 21, 2022

I don't fully understand the need or benefit of having free_sized() available tbh.

Spec says it's functionally equivalent to free(ptr) or undefined:

If ptr is a null pointer or the result obtained from a call to malloc, realloc, or calloc, where size size is equal to the requested allocation size, this function is equivalent to free(ptr). Otherwise, the behavior is undefined

Even the recommended practice does not really clarify things:

Implementations may provide extensions to query the usable size of an allocation, or to determine the usable size of the allocation that would result if a request for some other size were to succeed. Such implementations should allow passing the resulting usable size as the size parameter, and provide functionality equivalent to free in such cases

When would someone use this instead of simply free(ptr) ?

jabl · on Nov 21, 2022

> I don't fully understand the need or benefit of having free_sized() available tbh.

It's a performance optimization. Allocator implementations spend quite a lot of time in free() matching the provided pointer to the correct size bucket (as to why they don't have something like a ptr->bucket hash table, IDK, maybe it would consume too much memory overhead particularly for small allocations?). With free_sized() this step can be jumped over.

Bhurn00985 · on Nov 22, 2022

Thanks for your insights, which prompted to actually jump into the malloc.c implementation.

connicpu · on Nov 21, 2022

Many many many teams writing C won't be using C23 the day it's out, but they have to get these changes in now if they want the people who always use a 10 year old standard to have these features available 10 years from now

gavinhoward · on Nov 21, 2022

I'm also a C developer, but I do use the more modern versions.

There are four big reasons why:

* Atomics. These are the biggest missing feature in older C.

* Static asserts. I can't tell you how much I love being able to put in a static assert to ensure that my code doesn't compile if I forget to update things. For example, I'll often have static constant arrays tied to the values in an enum. If I update the enum, I want my code to refuse to compile until I update the array. I have 20 instances of static asserts in my current project.

* `max_align_t`. It's super useful to have a type that has the maximum alignment possible on the architecture.

* `alignof()` and friends. It's super useful to get the alignment of various types. Combined with `max_align_t`, it is actually possible to safely write allocators in C. Previously, it wasn't really possible to do safely or portably. And I have at least three allocators in my current project.

You're right that C11 doesn't have nearly the reach the ANSI C does, but it does have slightly more than Rust, much more if you consider Rust's tier 3 support to be iffy, which I do.

And it does have one HUGE advantage against Rust: compile times. On my 16-core machine, I can do a full rebuild in 2.5 seconds. If I changed one file in Rust, it might take that long just to compile that one file.

That's not to say Rust is without advantages; one of my allocators is designed to give me as much of Rust's borrow checker as possible, on top of API's designed around that fact.

tl;dr: I use modern C for a few features not found in C89, for the slightly better platform support against Rust, and for the fast compiles.

fweimer · on Nov 21, 2022

Except for max_align_t (which is broken even for scalar types on some targets, and doesn't help with vector types by design), all these things were available long before standardization. So I'm not sure if this is a compelling argument for standardization.

gavinhoward · on Nov 21, 2022

Without standardization, I have to rely on specific compilers. That's not great, either.

jrmg · on Nov 21, 2022

Outside our bubble, there’s an _ocean_ of embedded software/firmware and lower level library stuff, on up-to-date platforms, written in C by people or teams that would find switching to Rust just a _massive_ chore. I’d guess there is at least an order of magnitude more of this than Rust.

And I certainly appreciated C11 when writing Objective-C, so I’m sure people with large codebases of ObjC will appreciate it (though most will be using Swift for new features nowadays).

torstenvl · on Nov 21, 2022

One thing I'm really looking forward to is standardization of binary literals. Bitwise masking makes a lot more sense with binary literals than hex literals.

Example:

https://pasteboard.co/VkjrJIOZzaiR.jpg

(Sorry for pasting code as an image, I'm on my phone)

fulafel · on Nov 21, 2022

Old software is very slow and expensive to change. Adopting a new C version doesn't need a failure prone expensive synchronized collective-action rewrite throughout your sectors supply chain, new tooling, platform runtime ports, etc. Rust would.

zozbot234 · on Nov 21, 2022

C provides the only stable ABI for Rust, and changes to the C++ ABI may also occur in the future. So the implications of new C standards for library code are especially relevant.

sharikous · on Nov 21, 2022

There is no "modern" C but "C with additional niceties". And those additions are usually low key enough to be adopted by a good portion of the compilers out there.

When you have a C code base or experience with C those features may be enough not to make a complex transition.

Having a simple tool evolve a bit may be what you need as opposed to making the change to a much more complex tool.

fweimer · on Nov 21, 2022

I believe the current WG14 charter is here: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2611.htm

This text is still in force, it seems:

“ 13. Unlike for C99, the consensus at the London meeting was that there should be no invention, without exception. Only those features that have a history and are in common use by a commercial implementation should be considered. Also there must be care to standardize these features in a way that would make the Standard and the commercial implementation compatible. ”

I read this as saying that anything that gets standardized should be available in one of the major implementations. In practice, most of the qualifying features will have been implemented in both GCC and Clang in the same way, so for most users, there is not much benefit from standardization. Some may feel compelled to support the ”standard” way and the “GCC/Clang” way in the same sources, using a macro, but that isn't much of a win in most cases. Of course, there will be shops that say, “we can't use feature until it's in the standard”, but that never really made sense to me.

Things are considerably murky on the library side. In my experience, library features rarely get standardized in the same way they are already deployed: names change, types change, behavioral requirements are subtly different. (Maybe this is my bias from the library side because I see more such issues.) For programmers, the problem of course is that typical applications do not get exposed to different compiler versions at run time, but it's common for this to happen with the system libraries. This means that the renaming churn that appears to be inherent to standardization causes real problems.

Others have said that new standards are an opportunity to clarify old and ambiguous wording, but in many cases the ambiguity hides unresolved conflict (read: different behavior in existing implementations) in the standardization committee. It's really hard to revise the wording without making things worse, see realloc.

So I'm also not sure what value standardization brings to users of GCC and Clang. Maybe it's different for those who use other compilers. But if standardization is the only way these other vendors implement widely used GCC and Clang extensions (or interfaces common to the major non-embedded C libraries), then the development & support mode for these other implementations does not seem quite right.

avar · on Nov 21, 2022

Not new in C23, but I still think it's a glaring hole in the standard that there's still no standard way to ask the compiler which (if any) of "J.5 Common extensions" is supported.

For the C version you have __STDC_VERSION__, but there's no similar facility to check if e.g. J.5.7 is supported, which effectively makes the behavior that's explicitly omitted in 7.22.1.4 and 6.3.2.3 go from "undefined" to supported by C23 + the extension.

I understand why C can't have some generic "is this undefined?" test, but it seems weird not to be able to ask if extensions defined in the standard itself are in effect, as they define certain otherwise undefined behavior. The effect is that anyone using these extensions must be intimately familiar with all the compilers they're targeting.

phkahler · on Nov 21, 2022

>> One major change in the language is that two’s complement is now mandatory as a representation for signed types.

This pleases me greatly. Two's complement won decades ago. This also means they could define integer overflow as 2's complement rollover, which is almost universal but is still considered undefined behavior.

layer8 · on Nov 21, 2022

It might remove too many compiler optimization opportunities (see e.g. http://kristerw.blogspot.com/2016/02/how-undefined-signed-ov...) for the standard to disallow them now.

ElevenLathe · on Nov 21, 2022

Meaning there are too many compilers currently producing fast-but-probably-not-what-the-programmer-meant code that will need to be changed produce slower-but-probably-what-the-programmer-meant code? Or do I misunderstand? Seems like a win to me if so.

kllrnohj · on Nov 21, 2022

No, you misunderstand. Wide contacts are not universally superior to narrow ones, and more often than not narrow contacts catch more bugs than wide ones do.

For example almost never does a programmer want int to rollover in a for loop. Defining that behavior doesn't help, and indeed makes tools like linters or sanitizers less useful.

dzaima · on Nov 21, 2022

If you do "for (int i = 0; i < len; i++) data[i] = 0;" and len happens to be above 2^31, is your intent to write 2 billion zeroes, and then write 2 billion zeroes behind the data pointer, and repeatedly do that forever (never mind other UB that that results in)? Probably not.

While two's complement signed rollover can be pretty useful, in 99% of cases you don't need it, and casting to unsigned for the 1% is a worthwhile sacrifice for the better optimizations for the common case, and it'll be pretty clear when you want wrapping or not (and sanitizers will be able to keep reporting signed overflow as errors without false positives).

a1369209993 · on Nov 21, 2022

No, my 'intent' is to get:

  error: signed integer variable used as array index (-Wsigned-index)
  `int i`: should be `size_t i`
  error: comparison of signed to unsigned integer (-Wsigned-comparison)
  `i < len`: `i` has type `int`; `len` has type `size_t`

Scare quotes on "intent", because I don't write (that particular kind of) obviously broken code in the first place, but the upshot is that i and len both need to be size_t, and anything that lets the compiler obscure that is de facto bad.

Edit: previously assumed len was wrong too, but on rereading, you were implying len was already the correct type, and only i was wrong.

(Also, the word "variable" is significant; IIRC something like 5+bytecode_next_char promotes to int, but can't actually be negative[1] so shouldn't generate a warning.)

1: Assuming the compiler is configured correctly, ie char is unsigned.

dzaima · on Nov 21, 2022

The compiler could do those things too (-Wsign-compare is a thing; nothing for signed index though), but that won't change the existing code that uses "for (int ..." (incl. 637K C source files on github), and those warnings could be false positives too (e.g. signed indexes into lookup tables, or keeping using an int32_t after having asserted somewhere before (possibly out of the compiler's vision) that it's non-negative).

pwdisswordfish9 · on Nov 21, 2022

Slower-and-probably-not-what-the-programmer-meant-either code.

phkahler · on Nov 21, 2022

So GCC is defining one behavior for overflow in the optimizer (assuming there is no overflow so algebraic expressions can be optimized) and generating code that behaves differently (a+b will wrap around for signed and unsigned types)

That seems suspect to me but I understand why they'd do it.

zajio1am · on Nov 21, 2022

> This also means they could define integer overflow as 2's complement rollover, which is almost universal but is still considered undefined behavior.

No, they should not. Integer overflow is in most cases logic error, like division by zero or NULL pointer dereference, so it should stay undefined.

marssaxman · on Nov 22, 2022

That's only true for the relatively recent and generally unhelpful "nasal demons" interpretation of undefined behavior; under the older concept, which is more like "do whatever the sensible thing would be on the target platform", rollover is probably just what you want.

zajio1am · on Nov 22, 2022

Distinction between undefined, unspecified, and implementation-defined is already in ISO C99 specification and is rather specific:

"NOTE: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (...), to terminating a translation or execution (...)."

> which is more like "do whatever the sensible thing would be on the target platform"

That is a meaning of 'implementation-defined' in C parlance.

marssaxman · on Nov 22, 2022

I am familiar with the verbiage in the specification. My contention is that the currently popular tendency for C compilers to generate code which behaves in ways that are absurd and unpredictable, on the grounds that you shouldn't have been doing that anyway, is an unhelpful indulgence on the part of the compiler engineers.

pantry-man · on Nov 21, 2022

It can still be well defined behaviour and still an error

rwmj · on Nov 21, 2022

The page gives a network error.

Edit: Ugh, it needs javascript to render simple HTML.

ainar-g · on Nov 21, 2022

It's especially weird considering that the Markdown file is publicly available:

https://icube-forge.unistra.fr/icps/c23-library/-/blob/main/...

Raw Markdown, for no-JS readers:

https://icube-forge.unistra.fr/icps/c23-library/-/raw/main/R...

account42 · on Nov 21, 2022

The content seems to be hosted in a GitLab repo with the raw view serving the HTML but with the wrong content type. The JS is there to fix that. Weird setup.

jwilk · on Nov 21, 2022

Archived copy that works without JS:

https://archive.today/Hz81c

turminal · on Nov 21, 2022

And it's still broken on my phone.

avar · on Nov 21, 2022

"Extended integer types may be wider than intmax_t". I'm sure there's a good reason for this, but it was introduced in C99, which says (in 7.8.1.5): "[intmax_t] designates a signed integer type capable of representing any value of any signed integer type".

That was already portable between 16 bit, 32 bit, 64 bit etc. Why is it that just because the compiler supports 128 bit or 256 bit integers that compiling in such a mode doesn't correspondingly update "[u]intmax_t"?

The linked page says they 'cannot be "extended integer types" in the sense of C17', but that printf() and scanf() should still support these?

tpush · on Nov 21, 2022

Presumably because the size of intmax_t was set to something specific on its introduction and changing it now would constitute an ABI break most everywhere.

stephencanon · on Nov 21, 2022

That’s exactly right. On platforms without ABI stability concerns, there’s no issue, but if you can’t change the ABI intmax_t is stuck being what it’s always been (which is why it was always obviously a bad idea and should not have been included in the standard).

codeflo · on Nov 21, 2022

No problem, just introduce intreallymax_t and later intmaxthistimewereallymeanit_t.

formerly_proven · on Nov 21, 2022

int -> intpro_t -> intmax_t -> intultra_t

Joker_vD · on Nov 21, 2022

And then there is intpremium_t which is required to be somewhere between intpro_t and intultra_t but has no relation to intmax_t, just for some additional fun.

amag · on Nov 21, 2022

intpremium_t is only available to paying ANSI-members.

pwdisswordfish9 · on Nov 21, 2022

    typedef int intpremium_mediocre_t;

dang · on Nov 22, 2022

I need to ask you a couple things. (This is not particularly a response to the current comment - I just need a recent place to post this.)

First, please stop breaking the site guidelines. You've repeatedly posted unsubstantive/flamebait comments—which as you know, that is not what HN is for. In particular, we ban accounts that do abusive things like https://news.ycombinator.com/item?id=33617059.

Second, please stop routinely creating accounts. As the guidelines say, Throwaway accounts are ok for sensitive information, but please don't create accounts routinely. HN is a community—users should have an identity that others can relate to.

https://news.ycombinator.com/newsguidelines.html

jraph · on Nov 21, 2022

I think you missed the intpromax_t type, which is a bit more better.

layer8 · on Nov 21, 2022

They could introduce a similar type that is prohibited (checked by the compiler) from occurring in declarations with external linkage (plus some wording regarding casting of data across translation units being undefined behavior for that type). Effectively prohibit it from being used in ABIs. It would then still be useful for intermediate calculations and in macros.

a1369209993 · on Nov 21, 2022

That doesn't work because possibly[2] the most important single use of uintmax_t is the printf specifier "%ju", ie a ABI boundary.

Ironically, this is actually one of the only[0] legitimate uses for standard[1] PRI* macros, since that could expand to whichever of "%llu", "%w128u", etc, was appropriate to the caller.

0: And I'm not sure "one of" is actually needed.

1: as opposed to nonstandard ones like PRIu_xlib_atom or the like

2: Depending on how you define "single", it competes with (x*(uintmax_t)y)>>UINTPTR_BIT, but that's not actually reliable since uintmax_t isn't (IIRC) guaranteed to be larger than uintptr_t.

layer8 · on Nov 22, 2022

I meant a new, separate type, in addition to the defunct existing (u)intmax_t. For that new type, %j(u) wouldn't apply, but of course a PRI* macro could be added for it as you suggest.

Gibbon1 · on Nov 22, 2022

I'd argue that printf and va_args are terrible and should have been depreciated 25 years ago.

loeg · on Nov 21, 2022

They could have retroactively made intmax_t that type. Anyone using intmax_t in ABIs can keep using C11 until they fix their ABIs.

stephencanon · on Nov 22, 2022

C uses intmax_t in its ABI. I’m all for deprecating those ABI, because they’re crap, but the C committee doesn’t see it that way.

SAI_Peregrinus · on Nov 21, 2022

Or include the std version in the name.

intmax_199409L_t for C89's 1994 amendment, etc.

rwmj · on Nov 21, 2022

Also a 128 bit intmax_t would be slower and the 128 bit extensions are used only rarely.

avar · on Nov 21, 2022

So, a "what your mother didn't tell you about C standardization!". I.e. that some popular compilers flaunted the standard, as wider and supported integers should surely have been "[u]intmax_t".

And now code written to conform with the intent and wording of C99 will need to change?

AnssiH · on Nov 21, 2022

No, the ABI issue is fundamentally there regardless of what compilers did or didn't do.

If you have an interface that has a function that e.g. takes an intmax_t parameter (and even the C standard has those, e.g. imaxabs()), increasing intmax_t size (ABI change) would break existing callers.

So you can only change intmax_t size if you do not care about ABI stability.

avar · on Nov 21, 2022

Exactly, and the way "intmax_t" was previously defined in C99 implied that you needed to break that ABI compatibility if you introduced larger integer types.

gpderetta · on Nov 21, 2022

Are you saying that the standard should attempt to impose requirements to non-conforming compiler modes? How would that even work?

avar · on Nov 22, 2022

I'm saying that if you wrote conforming C99 code that made the assumptions C99 guaranteed your code will be subtly broken by C23. Whether that was a worthwhile trade-off is another matter.

gpderetta · on Nov 22, 2022

To be fair C23 is simply standardising the status quo. There is no point in having a standard if nobody implements it.

loeg · on Nov 21, 2022

Technically, it would break ABIs. I agree, they should have just broken any ABI relying on intmax_t and made it definitionally unstable (instead of making it useless, which they've chosen to do instead).

zxwrt · on Nov 21, 2022

Sad that C23 didn't get symbol visibility attributes. They are very useful for libraries. Perhaps there is still a chance to get them?

rwmj · on Nov 21, 2022

I suppose because they are an ELF feature rather than a language feature?

Anyway you can (and should!) use -fvisibility=hidden and add __attribute__((__visibility__("default"))) to public symbols when writing a C library. It will make calls between non-visible symbols faster because the compiler doesn't have to generate code to handle ELF symbol interposition.

account42 · on Nov 21, 2022

You might also want to compile with -fno-semantic-interposition since no one actually needs semantic interposition for symbols used and defined in the same library so you might es well disable it even or public functions.

planede · on Nov 21, 2022

And link with the `-Bsymbolic` or `-Bsymbolic-functions` linker options to save on the indirection when calling public functions within the same library but across object files.

zxwrt · on Nov 21, 2022

These are non-standard GNU extensions, unfortunately.

account42 · on Nov 21, 2022

All compiler flags are non-standard because the C standard does not concern itself with them at all.

That said, there are effectively two standard compiler interfaces: MSVC and GCC and everyone else emulates one or both of these.

Similarily, symbol visibility is not something that the C standard cares about because it doesn't even care about libraries in the first place. Again, for all platforms that have symbol visibility (e.g. PE and ELF based ones, although the details differ) there is a defacto standard for the compiler flags and attributes to control the visilibity: the MSVC and GCC extensions.

zxwrt · on Nov 21, 2022

Actually, the only compiler interface that can claim to call itself a standard is POSIX c99. Everything else is too volatile to be a standard.

kllrnohj · on Nov 21, 2022

You can (and should!) use a linker version script instead to ensure you're not accidentally leaking any exported symbols from other static library dependencies ;)

brandmeyer · on Nov 21, 2022

Not surprising, though. The issues surrounding if/how a system provides libraries of any kind (static or shared) are completely implementation-defined. The Standard doesn't have anything to say about them at all except for the program-wide symbol scope rules.

GrumpySloth · on Nov 21, 2022

> Trigraphs are removed from the language.

Finally! My joy knows no bounds.

MaxBarraclough · on Nov 21, 2022

> My joy knows no bounds.

Much of the time, neither does a C compiler.

turminal · on Nov 21, 2022

Why?

They seem pretty harmless, are they difficult to implement?

GrumpySloth · on Nov 21, 2022

They’re not difficult to implement. It’s a sed pass running before any preprocessor or even lexer, which is the horrifying thing. They replace characters in source code, they’re not normal tokens like digraphs. And so e.g. the following program won’t print what a mere mortal would expect:

  #include <stdio.h>

  int main() {
    puts(“What??(Really.)”);
  }

Try running that. Only change the quote marks to the ones on a normal keyboard (I’m on my phone.)

Edit: Oh, and remember to compile with -std=c11 e.g. gnu11 has trigraphs disabled by default.

avar · on Nov 21, 2022

The standard itself has a better example (well, older versions). In C99 it's:

    printf("Eh???/n");

Which is the same as:

    printf("Eh?\n");

I.e. the "??/" supplies the "\" to the "n", making it a "\n". I think it's good that they're gone (they were there to support some ancient non-ASCII systems). But it's worth mentioning that the "horrifying" semantics are there so that these trigraphs combine with <s>digraphs</s> (edit: "escape sequences", see downthread) in this way.

Otherwise you'd just get:

    printf("Eh?\\n");

GrumpySloth · on Nov 21, 2022

This example doesn’t involve digraphs. Trigraphs also preceded digraphs, so their semantics couldn’t have been motivated by digraphs. In fact digraphs were added as a better-thought-out alternative to trigraphs. Digraphs were added in C99, whereas trigraphs were added in C89.

avar · on Nov 21, 2022

Sorry, you're completely right. I had digraphs confused with character-constant escape sequences (as defined in "6.4.4.4" in C99).

I.e. that the trigraphs need to be parsed like that both because they're replacements for basic syntax like "{" and "}", but also because they need to be considered as if though the parser saw a "\" in a constant, as they next character may be e.g. "n", forming a "\n", not "\\n".

smcameron · on Nov 21, 2022

Edit: Nevermind. I had an extra space so my experiment was flawed.

So, I tried it with gcc and clang, and neither did anything strange.

Also tried -std=c99 and -std=c89. All behaved the same.

I thought maybe my distro had some default option configured to turn of trigraphs/digraphs, but --verbose didn't seem to show anything suspicious.

    $ cat garbage.c
    # include<stdio.h>
    
    int main()
    {
     puts("What?? (really)");
     return 0;
    }
    $ clang -std=c11 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $ clang -std=c99 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $ man gcc
    $ clang -std=c89 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $ gcc -std=c11 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $ gcc -std=c99 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $ gcc -std=c89 -o garbage garbage.c
    $ ./garbage
    What?? (really)
    $

GrumpySloth · on Nov 21, 2022

You need to get rid of the space between “??” and “(“.

zwirbl · on Nov 21, 2022

or put the code into godbolt and take a look at the strings with -std=C11 and without https://godbolt.org/z/zn44rdW6P

merlincorey · on Nov 21, 2022

I've only seen them recently in things like the IOCCC.

[0] https://www.ioccc.org

ufo · on Nov 21, 2022

Why is stdbool.h deemed "mostly useless now"? Is it always available via another header now?

pdw · on Nov 21, 2022

In C23, `bool`, `true` and `false` are available without including `stdbool.h`.

ufo · on Nov 21, 2022

Great news! \o/

cesarb · on Nov 21, 2022

It's great news unless you're maintaining software which defines its own version of these three keywords (which was necessary before C99, and AFAIK there is a popular C compiler which still doesn't have full support for C99). And until everyone moves to C23 (which will probably take a very long time; a quick web search tells me that it took until 2015 for the aforementioned popular C compiler to add support for the C99 stdbool.h header), all C projects will still have to include that header.

pjmlp · on Nov 21, 2022

If you mean MVSC, it never will, because it now supports C11 and C17, minus the optional annexes from C99 that were dropped in C11, and atomics aren't fully supported as well.

sigsev_251 · on Nov 21, 2022

I've heard that they are working on C11 atomics, now that the C++ standard includes <stdatomic.h> for compatibility with C.

pjmlp · on Nov 21, 2022

Might be, that was the main goal when they gave up on C, just to improve C compatibility to the extent of ISO C++ requirements.

It was the whole change of management that somehow made them backtrack on that decision.

mananaysiempre · on Nov 21, 2022

Typo (I hope... :) ): BITITNT_MAX instead of BITINT_MAX, multiple times.

sigsev_251 · on Nov 21, 2022

If I remember correctly, it's not even correct. The macro is supposed to be named BITINT_MAXWIDTH.

rurban · on Nov 21, 2022

The last line has a typo: besearch_s => bsearch_s

dottedmag · on Nov 21, 2022

Hah

> Extended integer types may be wider than intmax_t

rwmj · on Nov 21, 2022

Already true in many implementations. The committee is just catching up with reality ;-(