New C features in GCC 13 (redhat.com)
205 points by petercooper on May 4, 2023 | hide | past | favorite | 271 comments


> C++11 standardized a similar feature under the name decltype, so sadly we wound up with two names for a nearly identical feature.

If you wonder why:

https://open-std.org/JTC1/SC22/WG14/www/docs/n2927.htm#exist...


My follow-up question is: why does decltype((value)) return int& in C++?



There are a lot of words on that page and none really jump out at me as explaining why C++ chose to have decltype yield something different for decltype(value) and decltype((value)). But I was unable to read them all, so maybe it is buried somewhere in there.


Sorry, could not find a video.

See decltype section.


It provides a series of rules for exactly when decltype() picks a reference or non-reference type, but as far as I can see, it does not explain why the language chose to sometimes return references in some scenarios, such as for '(value)' vs 'value'.


Because decltype(*&value) is int &.

The real question to ask is why decltype(value) is int.


That doesn't really answer the question. It's not obvious why decltype(*&value) is int& either.


Normally decltype(E) returns the type of the expression E. The types of the expressions *&value and value are both int&, for instance. However, they also wanted a way to query the type an identifier was declared with, e.g. for `int x` that would be int. For some reason they chose to expose this by overriding decltype(E) when E is a plain identifier. decltype((value)) avoids this special case and returns the type of the expression (value), which is obviously the same as the expression value.


Thank you, I looked specifically for whether this had already been discussed here and found your reference.


Since GCC 13 made the first page, it’s worth pointing out that 13 marks *-solaris2.11.3 obsolete and the backend will be removed entirely in 14. In order to get it in 13, you need to build GCC with --enable-obsolete. It’s always hard to judge who may still use or need this, but just throwing it out there.

edit: Sorry, didn't mean to say 11.4 was obsolete. If you know of 11.3 machines that still exist, upgrade them!


Citation? The 13.1 change notes [1] indicate Solaris 11.3 has been dropped, because it has some things in it that make supporting it and 11.4 hard. [2]

I don't see anything about sparc generally, let alone sparc-solaris specifically[1]

[1] https://gcc.gnu.org/gcc-13/changes.html

[2] https://gcc.gnu.org/pipermail/gcc/2022-December/240322.html


You're correct -- Just 11.3. I've only ever used Solaris on SPARC hardware, so that colors my view.


The end of an era. The SPARC ISA was a work of art.


Not so soon. SPARC is still (somehow) alive and is produced as SPARC M8. The highest spec'd CPU (SPARC M8-8) comes with 256 CPU cores and 2048 threads in up to 8 CPU configurations.

https://www.oracle.com/a/ocom/docs/sparc-t8-m8-server-archit...


SPARC is dead, long live RISC-V.


register windows in particular stood the test of time!


More like, they were bad at introduction, and remained bad to date.

They rank high alongside delay slots, in harmful features.


Curious why you think both were bad designs in a time when memory access was slow, branch prediction was bad/non-existent, and pipelining was new?

The constraints applicable in the late '80s aren't the constraints applicable now.

Plus, this is a story about C, so I feel comfortable embracing the principle of explicit control flow. Delay slots just take that principle down to the branch prediction level.


Sorry, I forgot the /s


All quite reasonable additions to the language, kudos to the GCC team.

I'm /very/ sad to see unprototyped functions go, though -- that was my favourite way to catch newbies off-guard! Oh well ...

As for those who like to spend their Friday afternoon feeling sad about C++isms creeping into C: You are very late! You should have done that when GCC implemented ``__attribute__((cleanup))`` years^Wdecades ago.

edit: https://gcc.gnu.org/legacy-ml/gcc/2003-05/msg00528.html


It's unfortunate that a scope cleanup mechanism, like __attribute__((cleanup)), wasn't introduced. There was a proposal for one, but it relied on lambdas, which didn't make the cut for C23.


Now if only they'd kill implicit function declarations too...


Clang says it's already been killed in C99:

https://www.godbolt.org/z/KaWzsc6qT

...in GCC it's still just a warning though.


That is a 'warning as default error', so for at least Clang16, it is still disable-able. The Clang code owner is on a bit of a well-deserved war-path against these, so they are likely to be completely illegal in the next version or two.


Huh I had no idea! Seems weird that it's not an error by default!


Meh, I compile with -Werror :)


All these new features, but still no arrays/strings that keep track of their length!


Oh, man!

Imagine the humble pie C would have to eat.

C beat out Pascal in the 80s for a lot of reasons, many of them not technical.

Pascal had the correct format for strings, the one you mentioned and the one used by all modern languages.

Imagine if after 40 years C would start using Pascal strings :-))


Standard Pascal did not store string length and had no notion of variable-length strings. This was such an egregious design failure that various dialects came up with non-standard, non-portable, compiler-specific work-arounds. In the 1980's, it was next to impossible to write a Pascal program that worked cross-platform and did anything with strings.

The first ISO standard to address this wasn't until 1990, with yet a new dialect called Extended Pascal, which of course wasn't compatible with the various earlier system-specific dialects, and anyway was way too late for Pascal to "win" over C.

See Wikipedia for confirmation: "Brian Kernighan, who popularized the C language, outlined his most notable criticisms of Pascal as early as 1981 in his article "Why Pascal is Not My Favorite Programming Language". The most serious problem Kernighan described was that array sizes and string lengths were part of the type, so it was not possible to write a function that would accept variable-length arrays or even strings as parameters."


Hence Modula-2 in 1978, which Brian ignored in that famous rant.


Modula-2 compilers were slow to appear on many important architectures, and again there were various divergent dialects, and the whole shebang seemed to get semi-abandoned when Wirth et al. shifted yet again to Oberon; lather, rinse, repeat. So none of his post-Pascal languages posed practical competition to C for writing portable, cross-platform code, either.


VMS, UNIX, MS-DOS, Amiga, Archimedes, seemed important enough for me.

C only became portable by bringing UNIX alongside via POSIX.

Then there were all those dialects, K&R C, Small-C, RatC, DBS C,...


> Pascal had the correct format for strings, the one you mentioned and the one used by all modern languages.

Not really. The most crucial insight is that your string references should be fat pointers, which is not what Pascal does. Look at Rust's &str for the state of the art. Other choices are less important, but having string references be fat pointers is a huge boon. It would probably have seemed profligate in the 1980s, but it was the right choice.


> Look at Rust's &str for the state of the art. Other choices are less important, but having string references be fat pointers is a huge boon. It would probably have seemed profligate in the 1980s, but it was the right choice.

ANS Forth[1], released 1994, used (addr, len) pairs throughout, eliminating most uses of “counted” (Pascal) strings in Forth-83 and earlier.

[1] https://forth.sourceforge.net/std/dpans/


That would be a mess for interop with 8, 16, and 32-bit targets. What size should the length be? What happens when you exceed it? Nowadays you can just 32-bit all the things and ignore most range issues. That luxury didn't exist in the 80s.


> That would be a mess for interop with 8, 16, and 32-bit targets. What size should the length be? What happens when you exceed it?

It works fine. The length has the same size as an address. A fat pointer holding a pair of addresses is technically equivalent, but has worse performance in practice, so of course you should make it an (address, length) pair instead.

You can't exceed this length because of the pigeonhole principle, basically just arithmetic.


> The length has the same size as an address, the fat pointer being a pair of addresses is technically equivalent but has worse performance in practice so you should make it an (address, length) pair instead of course.

Is it worse though? Serious question, I’ve been wondering for quite a bit. IME processing a string from the start is easier with a (start, end) pair (one operation instead of two when you chop off things from the start). You can’t use those in standard C because it reserves the right to blow up on pointer subtraction essentially randomly (ptrdiff_t overflow is UB, and the only requirement for ptrdiff_t is that it hold 2^15-1; C89 doesn’t even tell you what the maximum allowed value is), but that’s not a performance problem, and not a problem at all in a different language.


C with fat pointers and boxing/unboxing would be far more super duper than people realize.

The other three are first class types, tagged unions, and closures.

Standards committee says no.


I don't reckon first class types are required. Tagged unions, closures, method syntax sugar, and some kind namespaces/modules would be my wishlist.


With first class types you can implement containers without type erasure.


Gotta love the Str255 used everywhere on Macs before OS X came around. That was how Pascal did things on Macs in the 1980s, and it’s not exactly modern—if you want something longer than 255 characters, it’s going to be cumbersome.

Been digging through old newsgroups, and it sounds like one of the big reasons C won out over Pascal was because it was easier to find good C compilers in the 1980s compared to good Pascal compilers.


Yeah, C compilers were ubiquitous, practically every platform had more than one of them. There were even third-party C compilers for the Mac almost from the beginning.

It didn't help that Pascal didn't have a usable standard, which is what led to articles like Kernighan's "Why Pascal is Not My Favorite Programming Language".

Edited to add: there was also some amount of antiestablishment fervor in the microcomputer community, and Pascal was for better or worse perceived as a somewhat ivory-tower language.


Pascal on a micro was basically useless unless you were playing with algorithms copied from a book, because it's a language written by people who only worked on mainframes connected to dumb terminals.

Want to write the string 'fat dog' on line 12, column 10? With Pascal you'd have to write a routine in assembly and call it. With C you use a trivial macro. Pascal was also generally very, very slow, almost as slow as BASIC, which had better strings than Pascal.


“Pascal was slow” is a complaint about the compilers at the time, which is just reiterating my point. Anyway, if you wanted to draw something on the screen, just,

  MoveTo(10, 12);
  DrawString('fat dog');
You could do worse.


Even 8 bit Pascal compilers for ZX Spectrum were better than that.

Anyway, Pascal was designed for teaching, and Modula-2 was designed for production code and systems programming.

UCSD Pascal, and Object Pascal also took many of their ideas from Modula-2.


Except that Pascal specified that strings of different lengths were actually different types.

https://www.cs.virginia.edu/~evans/cs655/readings/bwk-on-pas...


Pascal wasn't alone, almost all C alternatives are better in that regard.


C also doesn't have linked lists, hash maps, and a http server. It never stopped anyone from using those things in C.


You can use them, but if you have two pieces of independently developed code, chances are good they're not going to use the same kinds of data structures. Unless they are arrays of 0-terminated strings.


C (unlike e.g. Zig) doesn't have any builtin 'struct types' (not sure about VLAs though, but those had been a dead end anyway). Adding such types into the language would be quite a change in philosophy.


VLAs are exactly an array type that knows its length, i.e. a dependent type. VLAs on the stack had some issues, but pointers to VLAs, i.e. variably modified types, are quite powerful. Language and compiler support is still evolving, though.

But you can get run-time bounds checking already: https://godbolt.org/z/jhcavobYj

Support for statically detecting problems is also (slowly) improving.


C certainly does have struct types.

Apparently you mean something by "struct types" other than the well known C feature. Can you elaborate?


Emphasis on 'builtin' :)

For example, Zig's slice type is a 'builtin struct' which contains a pointer and size, and the builtin error union type is made of an error code and the actual result type. These things are not provided by the Zig standard library, but instead built into the language.

In contrast, C's builtin types are all 'primitives' (pointers, integers and floats) - with the arguable exception of VLAs, which have been made non-mandatory in C11.


We made VLA types and derived types (variably modified types) mandatory in C23 (but not VLAs on the stack as automatic variables, which are still optional).


struct tm in time.h has been around since C89, there may be others but that's all that immediately comes to mind.


But these are stdlib features, not language features.

"Proper" strings, arrays, slices, tagged unions, etc... would need to be integrated into the language (unless you want to end up with a mess like C++).


I am exploring ways to implement this in C (with some extensions) as a library:

https://github.com/uecker/noplate

Certainly not production ready.


Not really, something like sds for strings would be doable.

For arrays you may be right; possibly a fat pointer, as suggested by Dennis himself.


Seems to me that a struct isn't really philosophically different from, say, a float. The only difference is that it's larger.


One problem I can imagine is that builtin struct types might have implications for ABIs. The C standard probably would prefer to not define their interior and memory layout for portability, but these things need to be clearly defined for ABIs.

For instance in a slice type, does this have a start and end pointer, or a start pointer and size? Does the pointer or size come first? Is size the number of elements, or number of bytes? Alignment? Padding? Etc...


_Complex


Oops you're right, I never understood why complex numbers ended up as a language feature or are even in the stdlib, but anyway, thanks for the correction.


I think that if you want that, you're working with the wrong language.


> I think that if you want that, you're working with the wrong language.

Maybe yes, maybe no. Walter Bright, the creator of D, has a nice article on the subject [1]. I think Zig's approach which allows slices to decay to pointers and pointers+lengths to create slices is ideal. The real benefit of slices is the implicit bounds checking which some C folks might object to since one of the appeals of C is no "hidden magic".

[1] https://digitalmars.com/articles/C-biggest-mistake.html


One could say that for any feature that a language got later on and everybody feels fine with it added.

Go/Java and generics for example. C++ and auto or closures. Javascript and async/await ("just learn regular Javascript async patterns").

The thing is, C could have had a string type that knows its length since day one, with known performance tradeoffs, while also allowing everybody to build their own type in the rarer cases where it's needed.

And most buffer overflow and string handling bugs would have been avoided, as most programs using strings don't need the performance or special handling of C strings (or don't need them where the bugs occur, e.g. when reading some configuration file).


Go/Java is a weird one. I don't see many instances of generics in the wild. Honestly the biggest benefit I enjoy is that people who never intended to use Go in the first place are slightly less loud about Go's position on generics. Beyond that, it's a small quality of life improvement to only have to define min and max once for all of my integer types (incidentally, this is really only a problem in Go because it's so dang easy/convenient to create your own int types, whereas in Rust, Java, TypeScript etc it seems like everyone just uses the standard int types for everything).


Indeed, stuff like sds should have long been available as vocabulary types.


The thing is, many times we aren't the ones who choose the language, but rather the ecosystem we need to integrate with.

So working with C or C++, it is.


D from Walter Bright took that problem head on and solved it quite nicely.

https://digitalmars.com/articles/C-biggest-mistake.html https://dlang.org/articles/d-array-article.html


I am not that against most of these. However, I don't think I ever will be for using 'auto' in C. Good C is about being extremely clear about what you want to happen at every step. Type inference goes against that imo. If you complain about writing a few more characters I don't think C is the language you are looking for. And I try to not use the "then C is not for you" line very often - I promise.


I do wish there was a programming language that just never added new syntax, ever.

They could call the language "Stone", and then we'd get to say "such and such program is written in Stone", "not written in Stone", etc.

If they keep adding features to C, such as type inference, won't C just ultimately become as feature-laden as C++, albeit on a much longer timescale?

What's the point of that?


> [T]hey keep adding features to C, such as type inference [...].

The thing about typeof (and its close cousin auto) is that it exposes information that the compiler’s frontend has to know anyway for type checking. That’s why GCC has had typeof for many years, even back when it only spoke pure C. So while it’s a new feature, it doesn’t really have a lot of implications for the rest of the language.


There's C90. There's also C99.

As far as ISO is concerned, each edition of the C standard is made obsolete by its successor, but that doesn't impose any obligation on you or on compiler writers. If you have gcc, you can still use `gcc -std=c90 -pedantic-errors` (or c99, or c11, or ...).


Hare is planning on doing this for what it's worth.


Hey, thanks for letting me know about this, I'll check it out


hence -std=stone -pedantic


I don't write C / C++ so I'm not too aware of what's going on there, but wouldn't someone that wants those features just switch to C++? Is there any reason to change C at this point?


Switching to C++ gets you lots of things that aren’t “C-like” (that, of course, is a vague term that, as this thread shows, people will disagree about, but I think there’s consensus that C++ has many features that aren’t C-like), and may get you subtle bugs because of small incompatibilities between C and C++. For example, sizeof('x') is 1 in C++, but >1 in C because 'x' is a char in C++ and an int in C.

https://en.wikipedia.org/wiki/Compatibility_of_C_and_C%2B%2B


Why is

  sizeof('x')
equal to 4 if

  char letter = 'x';

  sizeof(letter)
is equal to 1, just like `sizeof(char)`? If `'x'` is represented as an `int` in C, shouldn't `letter` in this example also be represented as an `int`?


No. The type of 'x' is int. It so happens on your platform (and most available systems today) sizeof(int) == 4.

The type of letter was explicitly char, and sizeof(char) == 1 by definition in C.

char letter = 'x'; is a type coercion. That literal is an integer, with the value 120 and then it's coerced to fit in the char type, which is an 8-bit integer of some sort (might be signed, might not, doesn't matter in this case).


People often forget that multi-character constants like 'ab' or '1ert' are allowed in C. They are almost unusable as they are highly un-portable (because of endianness issues between the front-end and the back-end).


This is once in a while kinda useful, aside from the data layout issue for stuff like a FourCC.

Rust has e.g. u32.to_le_bytes() to turn an integer into some (little endian) bytes, but I don't know if there's a trivial way to write the opposite and turn b"1ert" (which is an array of four bytes) into a native integer.

Edited to Add: oh yeah, it has u32::from_le_bytes(b"1ert"). I should have checked.


Does this mean that `word` in

  char *word = "xyz";
is a pointer to an array of four `int`s, `'x'`, `'y'`, `'z'`, and `'\0'`? When I evaluate

  sizeof(*word)
I do get 1 instead of 4, even though `*word` is pointing to `'x'`. Where are the remaining 3 bytes in memory?


A char is 1 byte by definition. But the type of a character literal (the 'x' syntax) is not a char, but an int instead.

The C type system generally matters so little that the type of an expression has little relevance (sizeof is the most notable exception to that rule), which obscures this fact.


Not at all. There are no character literals in "xyz", this is a string literal and it's unrelated to what your parent was saying.


word is of type char*, a pointer to a (single) object of type char.

The initializer means that the char object it points to happens to be the first (0th) element of an array containing 4 elements with values 'x', 'y', 'z', and '\0'.

Most manipulation of arrays in C is done via pointers to the individual elements, and arithmetic on those pointers. (Incrementing a pointer value yields a pointer to the next element in the array.)

For example, `sizeof word` gives you the size of the pointer object, but `strlen(word)` yields 3, because it calls a library function that performs pointer arithmetic to find the trailing '\0' that marks the end of the string. (A "string" in C is a data layout, not a data type.)


If you specifically type it as char * then it's a pointer to chars, each of which has size 1.


you'll have to understand the 'x' syntax and the "xyz" syntax as two different things. Different quotes.


I know. But my understanding was that `"xyz"` is an array of characters so that these two would have the same representation in memory:

  char word[] = {'x', 'y', 'z', '\0'};  // sizeof(word) = 4, sizeof(*word) = 1
  char word[] = "xyz";                  // sizeof(word) = 4, sizeof(*word) = 1
What I did not realize was that the above two are not the same as this:

  char *word = "xyz";  // sizeof(word) = 8, sizeof(*word) = 1


The representation of an object is determined by how the object itself is defined.

An initializer doesn't change that. It only affects the value stored in the object when it's created.

A special case exception is that an array object defined with empty square brackets gets its length from the initializer, so

    char word[] = "xyz";
is a shorthand for, and is exactly equivalent to:

    char word[4] = "xyz";


What I see there is that you seem to highlight the difference between using sizeof with an array and sizeof with a pointer, which makes a difference, even if array-decays-to-pointer is a rule in most other contexts.


Right, I am mixing up two things here. You are right that bringing up pointers here is a mistake.

But apart from that, I would expect `{'x', 'y', 'z', '\0'}` to have size 16 rather than size 4 because it consists of four character literals which each have size 4 on my machine.


Maybe do not overthink it. 'x' is called a character literal, but it has the type int.

`{'x', 'y', 'z', '\0'}` does not have a type by itself, but it's valid syntax to use it to initialize various structs and arrays - some of those will have the size you are looking for, depending on which type of array or struct you choose to initialize with that: https://gcc.godbolt.org/z/Tqjq3xzKo


Thank you for the explanation and the Godbolt example! I appreciate it. Apologies for fumbling around in confusion.


sizeof() returns the number of "units" that something -- an expression or a type -- takes up. What do you think those units are?

They are literally defined as "characters". sizeof(char) is always 1.

Your confusion (besides the pointer thing) is that 'x' is a funny way to write an int, not a char.


It seems to me that `sizeof` returns the number of bytes that the thing takes up in memory. For example:

  int numbers[] = {1, 2, 3};  // sizeof(numbers) = 12
> Your confusion (besides the pointer thing) is that 'x' is a funny way to write an int, not a char.

Yes, this might be it. So the way to get a `char` value that contains "c" is to use type coercion and write it as `(char) 'c'`. This changes the representation in memory so that it now takes up only one byte rather than four, right?


`(char)'c'` is an expression of type char.

Its size is one byte -- but the size of an expression isn't really relevant, since it's (conceptually) not stored in memory.

You can assign the value of an expression to an object, and that object's size depends on its declared type, not on the value assigned to it. The cast is very probably not necessary.

    char c1 = 'c'; // The object is one byte; 'c' is converted from int to char
    int  c2 = 'c'; // The object is typically 4 bytes (sizeof (int))
The fact that character constants are of type int is admittedly confusing -- but given the number of contexts in which implicit conversions are applied, it rarely matters. If you assign the value 'c' to an object of type char, there is conceptually an implicit conversion from int to char, but the generated code is likely to just use a 1-byte move operation.


In the declaration

    char letter = 'x';
the initialization expression 'x', which for historical reasons is of type int in C, is implicitly converted to the type of the object. `letter` is a `char` because you defined it that way.

If you had written

    int letter = 'x';
that would be perfectly valid, and the conversion would be trivial (int to int).

It's just like:

    double x = 42;
`sizeof 42` might be 4 (sizeof (int)), but `sizeof x` will be the same as `sizeof (double)` (perhaps 8).


The type of the expression 'x' is int, not char (in C). The type of an expression consisting of a variable name is the type of the variable (as far as sizeof is concerned).


Many things that C++ added on top of C aren't actually improvements (I guess the most charitable thing that can be said about C++ is that it identifies and weeds out all the stupid ideas before they can make it into C).


A usual programmer doesn’t need most features of C++ but there are many important:

Generic programming with templates, a usable std::string, smart pointers, references, the whole standard library (streams, file access, containers, threads).

The controversial ones seem to be exceptions and classes. Exceptions affect program flow, exception safety is very hard, and the runtime costs are an issue depending on the environment. Classes and inheritance are complicated features; operator overloading is one of the best things I've seen. But I can understand why many programmers don't want to handle all the special rules involving classes.


This would add a lot of complexity into the C compiler and stdlib, and C would just end up as another C++. IMHO it is still important that a C compiler and stdlib can be written in reasonable time by an individual or small team.

And just one example (because it's one of my favourite topics where the C++ stdlib really failed):

std::string as a standard string type has so many (performance) problems that it is borderline useless. You can't simply use it anywhere without being very aware of the memory management implications (when and where does memory allocation and freeing happen - and std::string does its best to obscure such important details), and this isn't just an esoteric niche problem: https://groups.google.com/a/chromium.org/g/chromium-dev/c/EU...


Classes, which have constructors and destructors, simplify code immensely, as do copy and move constructors and operators.


At least WG21 actually acknowledges that security issues and UB are real problems that need to be sorted out.

Meanwhile, WG14, "not our problem ".


Trying to figure out whether it's well-defined to compare two pointers to different objects in memory...

...By reading the C++ standard: "it's well-defined"

...By reading the C standard: *self-referencing Zalgo text quoted by dozens of StackOverflow thread debates where no completely confident conclusion is ever reached, although 3/4 people way smarter than you think it's well-defined and blame observations to the contrary on compiler bugs which they've reported to all the major compiler maintainers with varying reception from said maintainers as to whether they agree that those are actually bugs, forcing you, at the end of the day, to realize that you should really be writing your compiler's C, not aspiring to a universal, platonic ideal of C*


Why? I do not find the wording in the C standard less clear than the C++ wording (where the result is unspecified for unrelated pointers).


C++ drags in a ton of other baggage that a large enough number of programmers don't want.


Especially when there's Go, Rust, etc. these days. There is so much legacy with C++ the language that it's pretty easy to do stuff subtly wrong if you're not rigidly careful and adhering to a style guide that forces you to only use the safer bits.


C and C++ are separate languages and lots of people don't like C++.


Maybe. But by reading the article one does get the impression that GCC devs (and C2X proposal authors) really like C++: the language is mentioned 16 times, and easily half of the features are lifted more or less as-is from there.


As a C programmer, it just seems like they finally took the minority of ideas added by C++ that were actually good ideas and added them back to C. Aside from `auto`, which I'm ambivalent about (I think the only place where it's useful is macros), those all make perfect sense in the context of C, and I believe the only reason C++ had them first is that C++ simply evolved faster.


It makes a lot of sense for the languages to be harmonized with each other. Differences like noreturn versus [[noreturn]] does nobody any favors. C++ has all these wacky things you can do with constexpr functions, and C is getting a VERY LIMITED version of this that only applies to constants, addressing a long-standing deficiency, where C provides only a way to define named constants as int type (using enum) or using macros, and you really want to be able to define constants as any type you like. The "const" qualifier doesn't do that, you see... it really means a couple different things, but the main one is "read-only", which is not the same as "constant".


One of the benefits, historically, to both languages is that they share a very large chunk of the language in common. It's therefore in their common interest to try and maintain that common subset wherever possible. The goal here (just to be clear, from my outside perspective) isn't to unify the languages, it's to ensure that stuff that's the same stays roughly the same. If the same code produces two different things, based on the language, that's unfortunate. Code that works in one but doesn't compile in the other is totally fine, of course.


I guess my question is: If you want `auto`, why put it in C instead of using C++ with no other C++ specific feature besides auto?

I get people don't like classes / templates / .. but there isn't any reason one has to use those.


> I guess my question is: If you want `auto`, why put it in C instead of using C++ with no other C++ specific feature besides auto?

Because they're orthogonal, and making function bodies less verbose with no loss of expressivity is nice, without needing to significantly alter the language?

Pretty much every C competitor has local type inference, and C actually needs more than most due to the `struct` namespace, typing `struct Foo` everywhere is annoying, and needing to typedef everything to avoid it is ugly.

Also C++ is missing convenient C features, like not needing to cast `void*` pointers, and designated initialisers were only added in C++20.


Type inference only makes the code harder to read. You end up doing mental compiler work when you could just write the damn type.

And to the people who say "I just hover over the variable in my IDE": that doesn't work in a terminal with grep, you can't hover over a diff file, and not even GitHub does the hover thing.

Combine that with the implicit type promotion rules of C. Have fun.


> Type inference only make the code harder to read.

Nonsense.

> Combine that with the implicit type promotion rules of C. Have Fun.

This sort of trivial type inference does not make that any worse. C's promotion rules are broken; inference neither breaks nor unbreaks them.


typeof and auto are useful for writing type-generic macros.


Yeah, but looking at how auto has been abused in C++, I don't think it's worth it.


C23's auto is not nearly as magic as C++'s auto. You can use it for the simple things but not for the perversely creative abuse scenarios you fear.


I guess one possible reason is if there’s no C++ compiler for an obscure platform as it would be too much work, but there is an up-to-date C compiler


Yeah, seems like half of the embedded architectures are like this. Well, the C compiler is not quite standards compliant, and often made to an older version of the C spec, but give it time—the C2x standard comes out this year, and it may not benefit people in the embedded space until some years down the road.

Second-best time to plant a tree, and all.


You end up having to turn a lot of C++ features off in order to get the experience you want in certain environments. In an application running on a modern Windows/Linux/Mac system, it’s no big deal to use those features.

Some platforms also just don’t have C++ compilers. Yes, they still exist. You buy some microcontroller, download an IDE from the manufacturer's web site, and you get some version of C with a couple extensions to it. And then there are all the random incompatibilities between C and C++, where C code doesn’t compile as C++, or gives you a different result.


What's "a lot of features" ? -fno-rtti, -fno-exceptions?

> Some platforms also just don’t have C++ compilers

It's not like they're going to have C23 compilers either


> It's not like they're going to have C23 compilers either

Niche compilers like SDCC (https://sdcc.sourceforge.net/) are actually keeping track of recent C language improvements quite well.


C++ has a lot of funny rules when it comes to constructors and initializers. It's easy to accidentally to do something unintended, and end up with code that relies on initialization order.


Yes, so why copy paste C++ into C?


I wish the C Standard Committee stopped smearing all this C++ bullshit into C. Now that many of the C++ people who promoted those features are abandoning ship.

It's what you get when your C compilers are implemented in C++.


Why "bullshit"? I looked at the article, and everything looks extremely reasonable, and desirable in C.

* nullptr: fixes problems with eg, va_arg

* better enums: Who doesn't want that? C is a systems language, dealing with stuff like file formats, no? So why shouldn't it be comfortable to define an enum of the right type?

* constexpr is good, an improvement on the macro hell some projects have

* unprototyped functions removed: FINALLY! That's a glaring source of security issues.

Really I don't see what's there to complain about, all good stuff.


> nullptr: fixes problems with eg, va_arg

nullptr is an overkill solution. The ambiguity could have been solved by mandating that NULL be defined as (void*)0 rather than giving implementations the choice of (void*)0 or 0.


dmr would've approved of going a step further - only nullptr, no 0 and (void*)0:

>Although it would have been a bit of a pain to adapt,

>an '89 or '99 standard in which the only source representation

>of the null pointer was NULL or nil or some other built-in token

>would have had my approval.

https://groups.google.com/g/comp.std.c/c/fh4xKnWOQuo/m/IAaOe...


Are there any mainstream implementations where NULL is not typed as (void *)? That seems like a choice that would cause so many problems (type warnings, va_arg issues), I wonder why anyone would do that.


Vintage code or code written by vintage coders.

Code written by C++ programmers.

Code written to be both C and C++.


No.


That would have been my preference as well. Either force it to be (void*)0 or, maybe, allow it to be 0 iff it has the same size and parameter passing method.


> Really I don't see what's there to complain about, all good stuff.

It's called "change" and people don't like it.


constexpr is terrible.

- constexpr is not anything like constexpr in C++.

- It makes no guarantees about anything being compile time.

- It in no way reflects the ability of the compiler to make something compile time.

- It adds implementation burden by forcing implementations to issue errors that do not reflect any real information. (For instance, you may get an error saying your constexpr isn't a constant expression, but if you remove the constexpr qualifier, the compiler can happily solve it as a constant expression.)

- All kinds of floating point issues.

We should not have admitted this in to the standard, please do not use.

nullptr is the third definition of null. One should be enough, two was bad. Why three?


Well, that's interesting.

Got any more information on that? Why does it fail in that way? Is that an implementation or a specification problem?


I'm in WG14, so I've been involved in the discussions.

It fails for 2 reasons:

In order to make it easy to implement, it had to be made so limited that it is in no way useful.

The second reason, and the real killer, is the "as if" rule. It states that an implementation can do whatever it wants, as long as the output of the application is the same. This means that how a compiler goes about implementing something is entirely up to the compiler, so any expression can be evaluated at compile time or at execution time. You could even run the preprocessor at run time if you liked! This enables all kinds of optimizations.

In reality, modern compilers like gcc, llvm and MSVC are far better at optimizing than what constexpr permits. However since the specification specifies exactly what can be in a constexpr, the implementations are required to issue an error if a constexpr does something beyond this.


Okay, so that's a good start, but I still don't get it.

> In order to make it easy to implement it had to be made so limited, that it in no way useful.

Such as?

> The second reason, and the real killer, is the "as if" rule.

Why is that a problem? It sounds like a benefit. It means that the optimization can't break anything, which to me is kind of the point.


> Such as?

Loops, function calls.... things available in C++

>Why is that a problem? It sounds like a benefit. It means that the optimization can't break anything, which to me is kind of the point

As-if is great! But the problem is that constexpr tricks people into thinking that something is done at compile time, while the as-if rule overrides that and lets the implementation do it whenever it wants. constexpr is a feature overridden by a fundamental rule of the language.


"Modern compilers" still fail at:

    const int bla = 23;
    const int blub[bla] = { 0 };
(see: https://www.godbolt.org/z/hjessMhGK)

Isn't this exactly what constexpr is supposed to solve?


If so why not fix const? Why add a whole new keyword? Why complicate the language?


Don't know, you're in WG14, not me :D

Maybe 'const' can't be fixed without breaking existing source code?

I don't really have a problem with adding a new keyword for 'actually const', maybe calling it 'constexpr' means C++ people have wrong expectations though, dunno.

For me, having constexpr be explicit about only accepting actual constant expressions, and producing an error otherwise (instead of silently falling back to a 'runtime expression'), is fine. But the existing keyword 'const' has an entirely different meaning; there's just an unfortunate naming collision with other languages where const means 'compile-time constant'.


It has a lot of costs to add to the C language, even if it's just the increased complexity in the documentation, and it doesn't affect C99. Every processor, OS, and programming language used in business needs to fully support a C standard. So adding to C affects every processor and computer architecture, every new OS, and every new language.

If you look at CPPreference you can see how much complexity has been added to the C standard in the last few years.


What do those have to do with that? A processor has no need to know anything about constexpr, auto, or static_assert.

In fact I don't see anything that needs support anywhere but the actual compiler.


constexpr is also ridiculously simple to implement -- because the existing compilers already do something similar internally for all enumeration constants.

(Enumeration constants are the identifiers defined inside enum xxx {...})


...and most compilers also already silently treat compile time constant expressions like constexpr, an explicit constexpr just throws an error if the expression isn't actually a compile time constant.


This is a completely unfair mischaracterization.

A lot of these ARE relevant and useful improvements to the C language itself; constexpr reduces the need for macro-constants (which is nice), the ability to specify the enum storage type is often helpful, and clean keywords for static_assert etc. are a good idea too.

And getting rid of the "void" for function arguments is basically the best thing since sliced bread.


> constexpr reduces the need for macro-constants

const is sufficient to eliminate the use of macro-constants with the exception of the use of such constants by the preprocessor itself (in which case constexpr is also inapplicable).


    #define DEF   (ABC+(GHI<<JKL_SHIFT))


Please make a point.


I did. This is exactly what we need constexpr for.


Nothing you've demonstrated requires the use of constexpr. A const is perfectly suited for that.


One of us doesn't know C.


https://godbolt.org/z/de1YGG5r1

I hope for your sake that you don't speak this way other than anonymously on the Internet; it reflects very poorly on you.


I don’t understand why anyone would use the “auto” variable type thing. In my experience it makes it impossible to read and understand code you aren’t familiar with.


Well, the obvious (?) reason is to type less, and also reduce the risk of doing the wrong thing and using a type that is (subtly) wrong and having values converted which can lead to precision loss.

Also it can (in my opinion, brains seems to work differently) lower the cognitive load of a piece of code, by simply reducing the clutter.

Sure it can obscure the exact type of things, but I guess that's the trade-off some people are willing to do, at least sometimes.

Something like:

    const auto got = ftell(fp);
saves you from having to remember if ftell() returns int, long, long long, size_t, ssize_t, off_t or whatever and in many cases you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.

If you want to do I/O (print the number) then you have to know or convert to a known type of course.

This was just a quick response off the top of my head, I haven't actually used GCC 13/C2x yet although it would be dreamy to get a chance to port some old project over.


typing less sounds like a minor benefit to me and the downsides are major: auto makes code unintelligible to humans on casual inspection

#noauto


> you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.

No no no no nonononono. No!

Loose typing was a mistake. I think any sober analyst of C and C++ knows that. The languages have been trying to rectify it ever since.

But dynamic typing was an even bigger mistake. Perversely, it's one caused by a language not having a compiler that can type check the code, which C does.

I want to actually know what my code is doing, thanks. If you want "expressive" programs that are impossible to reason about, just build the whole thing in Python or JS. (And then pull the classic move of breaking out mypy or TypeScript half way in to development, tee hee.)

The only time `auto` is acceptable is when used for lambdas or things whose type is already deducable from the initializer, like `auto p = make_unique<Foo>()`.


There is nothing "dynamic" about what I suggested. There is a real, concrete, static and compile-time known type at all times.

In this case it would be long. I fail to see the huge risk you're implying by operating upon a long-typed value without repeating the type name in the code.

    const auto pos_auto = ftell(fp);
    const long pos_long = ftell(fp);
I don't understand what you can do with 'pos_long' that would be dangerous doing with 'pos_auto' (again, disregarding I/O since then you typically have to know or cast).


> again, disregarding I/O since then you typically have to know or cast

Thank you for answering for me!


`const long pos_long = ftell(fp);` contains a potential implicit conversion in the future if the return type of `ftell()` changes.

That's one reason type inference is safer than not inferring. Your program doesn't include semantics you didn't actually intend to be part of its meaning.

Also, I think lambdas would be annoying without it.


You are confusing dynamic types with type inference.


I'm not, although it apparently came off that way.

I meant that, to a person reading the code, `auto` tells you about as much about the type you're looking at as no type at all (like in a dynamically typed language).

This chatter said it better: https://news.ycombinator.com/item?id=35814337


This is where tooling can help. An IDE could display the type next to all auto symbols if you want. Or allow you to query it with a keyboard shortcut. This gives the best of both worlds, rather than forcing everyone to write out the types all the time. Sometimes we simply don't care what the exact type is, e.g. if it's an object created and consumed by an API. We still want it type-checked, but we don't want to be forced to write out the type.


There is this guideline of "Almost Always Auto" (https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style...) and I have been following it for yeears both in my job and my personal projects and I have never been very confused by it or had any sort of bug because of it. I felt very reluctant about using it at all for quite a while myself, but in practice it just makes stuff easier and more obvious. A huge reason it's useful in C++ is generic code (stuff that depends on template parameters or has many template parameters) or deeply nested stuff (typing std::unordered_map<std::string, MyGreatObject::SomeValue>::iterator gets annoying), but it's nice almost everywhere. Most types are repeated and getting rid of those repetitions makes refactorings a lot easier and gets rid of some source of bugs. For example sometimes you forget to change all the relevant types from uint32_t to uint64_t when refactoring some value and stuff breaks weirdly (of course your compiler should warn on narrowing conversions, but just to illustrate the point, because it is very real).


>For example sometimes you forget to change all the relevant types from uint32_t to uint64_t when refactoring some value and stuff breaks weirdly

Use size_t


`size_t` may not be helpful. You can argue that some other specific typedef should have been used in this case, but it's kind of water under the bridge already.


size_t is for object sizes. It might not be 64 bits.


auto is at its best when you have something like:

    std::unordered_multimap<string, std::unordered_multimap<string, someclass>> getWidgets();
With templates you can easily have very unwieldy types, and there's not that much benefit from spelling them out explicitly.

Like any tool, there are good and bad uses of it. Well used, it removes unnecessary clutter and makes the code more readable.


I agree with you here. Many people might find it useful but this is something better suited for C++ which is full of nebulous typing features.


It makes sense to have it in macros. Though standard C still doesn't have statement expressions, so...


As an example of this, a generic `MAX` macro that doesn't evaluate its arguments multiple times, would be (using said GNU extension of statement expressions):

    #define MAX(A, B) ({ auto a = (A); auto b = (B); a>b? a : b; })
As-is, for such things I just use __auto_type, as it's already in the GNU-extension-land.


Do you really need to parenthesize the parameters? Is there something that can break the variable declaration into multiple statements?


Here, no. It's just a habit or common style guideline to always parenthesize macro parameters since so many macros can otherwise break.


Here, probably not (with proper arguments at least; without the parens something like `MAX(1;2, 3)` would compile without errors though), but I'm just used to doing it everywhere.


I wish ({...}) had been in C23.


I agree, this is one of the more important common extensions we are still missing.


I'm still miffed it wasn't in C99 ;)


Sorry, it was on my list of things to propose, but I did not have enough time (due to a job change). Others were more interested in lambdas.


You are forgiven. It must be like pushing water up a mountain. At least we got #embed, typeof, auto, constexpr, and sensible enums this time around.

How many of you guys have half of a C compiler lying around in a directory somewhere on your machines?

(And how do you find the time for WG14 AND writing code AND doing research? My cousin is in roughly the same field as you and you publish more than he does.)


It is Sisyphean, but there also many synergies (I work on real-time computational imaging systems). And the specific things you mention were mostly not my work (I made typeof in GCC more coherent, which helped getting it standardized). But yes, I have a semi-complete C compiler around and many unfinished GCC branches...


Isn't auto already a keyword for specifying scope? I know it's never used and kind of redundant, but something like `auto x = foo(y);` is a terrible direction for C. Type being abundantly clear at all times is a huge feature.


The accepted proposal does address and preserve `auto`'s use as a storage class, so `auto x = foo(y);` means type deduction based on the return type of `foo`, and `auto int x = 10;` declares an integer with automatic storage duration.


It becomes tolerable by using a text editor that is too clever for its own good and fills the type information back in as a hint.

But I'm not a big fan either.


This gives a pretty good explanation why auto is useful in C:

https://thephd.dev/c23-is-coming-here-is-what-is-on-the-menu...

(TL;DR: it makes sense in macros)


I miss the simple days of just regular K&R and a decent lint. C was supposed to be a nice small simple language, but each standard adds more and more complexity.


Oh come on, pure K&R C is a pain to read and write even for old-school C enthusiasts, and C99 brought some really nice improvements (like compound literals and designated initialization) while not overloading the language with bells and whistles; it's mostly just logical extensions of existing syntax that should have worked to begin with.

And C-isms like '= { 0 };' instead of '= { };' never really made a lot of sense.


I used to think like this. Then I had to use `thread_local`, where it was so much more convenient and portable (and likely faster) than pthread_getspecific. I think it's good that quality-of-life features get added and standardized.

Even the Linux kernel uses a lot of GNU extensions. Computers are there to abstract stuff for us.


I have to disagree here.

C++ is becoming more complex and bloated over time.

C, on the other hand, is becoming more convenient with almost no bloat or confusing features.

C is aging slowly but graciously, in my opinion.


Except C23 is going to have a lot of anti-features: https://queue.acm.org/detail.cfm?id=3588242


The article is a boring rant with no substance. His main issue seems to be the realloc zero change which he claims breaks idiomatic code. But this is certainly not idiomatic code as it never worked correctly in portable C code, because different implementations did different things for decades with no willingness to change.


Many people still write in a C89 dialect just fine.

Most of the good stuff is written in it: sqlite, curl, STB libraries, zlib, OS kernels etc.


> OS kernels

Many are, but Linux moved to C11 (well gnu11 I guess) since v5.18

Summary of discussion: https://lwn.net/Articles/885941/

Actual move https://git.kernel.org/linus/e8c07082a810fbb9db303a2b66b66b8...

Being able to replace

  int i;
  for (i = 0; i < foo; i++) { ... }
with

  for (int i = 0; i < foo; i++) { ... }
makes it already worthwhile.

Linus thinks so too:

  > I think the loop iterators are the biggest user-visible thing, but
  > there might be others.
Also interesting:

  > Of course, the C standard being the bunch of incompetents they are,
  > they in the process apparently made left-shifts undefined (rather than
  > implementation-defined). Christ, they keep on making the same mistakes
  > over and over. What was the definition of insanity again?
-- https://lwn.net/ml/linux-kernel/CAHk-=wicJ0VxEmnpb8=TJfkSDyt...


> Also interesting:

The issues here are:

- if it's defined, then people will rely on it, which means UBSan can't report it as an error if it sees it.

- IIRC x86 defines the overflow behavior differently for scalar and vector ints, so x86 compilers that want to do autovectorization would probably leave it undefined.

C's original sin here is that the numeric promotion rules are wrong ('unsigned short' should not promote to 'int'!) but as long as you can't fix that, you can't get rid of UB and still be performant.


What is wrong with promoting unsigned short to int?


C syntax is already too rich and complex. What it needs is not more but less, plus some fixing: only sized primitive types (u8/s8...u64/s64, f32/f64); typedef, enum, switch, and all but one loop keyword (loop{}) should go away; no more integer promotion, only compile-time/runtime explicit casts (except for literals, and maybe void * pointers); explicit compile-time constants; no anonymous code blocks; etc.

That said, if risc-v is successful, many components will be written directly in assembly, and trash-abl code will be written in very high-level languages with risc-v assembly written interpreters (python/perl5/lua/javascript/ruby/php/haskell/tintin/milou).


The variable-size int, unfortunately, made a lot of sense in the early days of C. On processors like the x86 and 68000, it made sense for int to be 16-bit, so you don't pay for the bits you don't need. On newer systems, it makes sense for int to be 32-bit, so you don't pay to throw away bits.


The variable-sized word made more sense when writing code to work across machines with 16-bit and 18-bit words, or 32-bit and 36-bit words. This is also why you get C99's uint_least32_t and friends, so you're not accidentally forcing a 36-bit machine to have 32-bit overflow behavior everywhere.

Before the mid-late 1990s, programmers rarely needed to worry about the difference in size between 32 and 36 bit words.


That's why it needs fixing. We are not in the early days anymore.


Problem is simple—there are still systems out there like that, and people are still buying them and writing C code for them. They're just in the embedded space, where day-to-day programmers don't encounter them.


If those systems are still maintained, they could correct their legacy C code to "fixed-C" (which should not be that much work in real life anyway).

It would be possible to do a quick-and-dirty job with preprocessor definitions. The real problem is when you want to write "fixed-C" with a legacy compiler: you realize that it does so many things without telling you that you would need a really accurate warning system to catch them all. I was told GCC can report all integer promotions, true?


You shouldn't fix it by making users choose what bit width their ints are. That's not the right answer for performance (they don't know what's fastest) or correctness (the only choices are incorrect).

If you have a variable whose values are 0-10, then its type is an integer that goes from 0-10, not from 0-255 or -127-127.


> Only sized primitive types (u8/s8...u64/s64, f32/f64);

C's success came from portability, so that would have killed it. Certainly you need fixed-size types occasionally to match externally-defined structures (hardware, protocols) but if you write u8 loop counters and u32 accumulators you're screwed on a DSP with only u24.

> That said, if risc-v is successful, many components will be written directly in assembly

There are already too many RISC-V extensions for code to be portable between different RISC-V chips without using a higher-level language.


> Certainly you need fixed-size types occasionally to match externally-defined structures (hardware, protocols) but if you write u8 loop counters and u32 accumulators you're screwed on a DSP with only u24.

This "portability" argument always rings hollow. How often are you actually reusing code between a DSP and a desktop? When real sizes don't matter, but just a minimum range, there's `(u)int_least8_t` (could be renamed as `u_min8`). On a DSP with only, say, 16-bit "bytes" (like the TMS320), that would internally be a `u16`.

C is not a "portable assembler" anymore. That's a myth. Almost no one writes C in a portable way. Every library or program has its own version of <stdint.h>. glibc, for example, uses gu8 for `unsigned char`. Put that on your 24-bit DSP, and `gu8` just became the same as `gu16` and the nonexistent `gu24`.

C is a language designed around the PDP-11 and inherits that legacy baggage. The C committee that refuses to accept reality for "purity" reasons holds back the language.


Yep, that's why you would have had a "fixed-C" compiler with an explicit u24 and "portability", if it has any meaning here, would have to be done at the user program level.

The C committee is just adding stuff, making even a naive C compiler more and more costly, which kills many "real life" alternatives and, in the end, promotes planned obsolescence more than anything else.

We have to remove stuff from C, not add stuff to C, make things more explicit and not more implicit (the abomination of the integer promotion...).

The stdlib has nothing to do in the C syntax even though not having memcpy and co internal to the compiler feels really meh on modern CPU.


Most of the C related cost of a new* compiler seems to come from declarations (they are a lot nastier to figure out than they look) and the preprocessor. The rest of the language doesn't seem to be that expensive to implement.

And then there is the other stuff: calling conventions, per-CPU code generator, general IR optimizations. This can be very cheap if we accept poor quality of the generated code.

---

Edit: I inserted the word 'new'.


Yet int is still 32 bits when it should be 64 bits on modern CPUs ;)


Yes, that is what I stick to also, and that is better for portability.


IMHO C should remain a language that is easy to write a compiler for (and there have been many examples of individuals doing it, some of which have appeared on HN in the past), i.e. one should be able to "bootstrap" from it.


What we need is a good old wave of deprecation. Re-simplify around a slightly more modern syntax.


   gcc -std=c89
still works


Or you can choose to use only the modern subset, e.g. only nullptr and never NULL.


Sure; but you’ll need to learn all those other features if you want to read other people’s code.

I was adamantly against Javascript adding OO-style classes and public/private methods and stuff for that reason. More stuff in the language makes it strictly harder to learn. Even if I don’t want to use those language features myself, I will inevitably end up reading and debugging other people’s code which does use those features. So I still need to learn it all, even if I don’t ever see myself using that stuff.


A "nice small simple" thing implies that someone has the power to say "that's it, we're freezing it the way it is". Instead, usually there is pressure to add, change, 'improve', keep up with the Joneses.


Not enough, if you ask me. It's treated mostly like a museum language, like classical music, with some polishing and quality of life improvements (that came 50 years too late, like checked arithmetic).


I am starting to see value in fixing a language when you hit 1.0. No more language changes at that point. If you want to change something, it has to be in a library.


Everyone in the comments is so angry about C adopting C++ traits. I do not see it that way. This is not the language standard, this is just GCC. Clang and GCC have always had compiler-specific features which could only be described as wizardry to a regular C user. GCC has always kept the C standard at arm's length; Clang is a bit better, but using compiler-specific features has always been reserved for mavericks. Conversely, C23 has some great additions, none that remind me of C++? [1]. I don't think the standards enthusiasts who hang out in #c on Freenode would be happy to think this many people see C going in the direction of C++.

[1] https://open-std.org/JTC1/SC22/WG14/www/docs/n3054.pdf


It's weird because even the first C standard was influenced by C++: function prototypes and const.


AFAIK, all the features described in the post are not GCC specific but are in the next standard (hence mention of C2X).


I am not angry, just very sad. But I guarantee you that most people who love to program in C or do it for a living are not "standards enthusiasts".


I am very happy about #embed, auto, constexpr, static_assert, unreachable, and ditching of K&R function parameters. Aren't you?


#embed is a great feature and should have been added long ago.

Knowing how much memory your numbers take up is important for many applications, so I find things like "auto i = 5" to be questionable.

Fancy compound literals seem like a solution in search of a problem.

Removing ancient unused misfeatures is good.

I don't have strong feelings about the rest. But I think people are reacting to the process more than the specific features. There's always a good reason for new features -- that's how feature creep works. Over time, adding a few features here and a few features there is how you go from a nice, simple language like C to a gargantuan beast like C++. C has around 35 or so common keywords, almost all of which have a single meaning. C++ has many more keywords (80+?), many of which have overloaded meanings or subtle variations in behavior, or that express very fine distinctions -- const vs. mutable or static/dynamic/const/reinterpret_cast, for instance. All of this makes C++ a very large language that is difficult to learn.

In a world of constant feature additions and breaking changes in programming languages, C is largely the same language it was in 1989. For some applications, that's a good thing, and there isn't another language that fills that niche.


> Knowing how much memory your numbers take up is important for many applications, so I find things like "auto i = 5" to be questionable.

Automatic variables[0] don't take up any defined amount of memory. They certainly don't take up exactly sizeof(i) memory.

Struct members and global variables are more likely to do what you say; in that case it either won't be allowed or will be sizeof(int). Conversely, `long i = 5` involves two different sizes (long vs int), which could be a latent bug.

[0] aka local variables, not `auto` variables


auto, constexpr: Kill them with fire.

unreachable: no, the optimizing compiler can f*#k up my code if you combine it with unintended undefined behavior. I just use assert(0) in debug builds. Not kidding.


Unreachable is a quite important optimization hint (note how the 'blub()' function removes a range check because of the unreachable in the default branch):

https://www.godbolt.org/z/Ph8PY1drc


I know, but if you have a bug that reaches the unreachable path, the compiler can't tell you anything. That is why I use assert(0) in debug builds.


And you can easily do a macro check and define a custom thing that's either assert(0) or unreachable() depending on the build type. But you still need unreachable() to exist to do that. (and under -fsanitize=undefined you get an error for unreachable() too)


And you can effectively do both:

    #ifdef NDEBUG
    #  define UNREACHABLE()   unreachable()
    #else
    #  define UNREACHABLE()   assert(0)
    #endif
That's what I have been doing for years, except with __builtin_unreachable()... and __assume(0) if I bothered to make it work under MSVC.


And why is this not the default behavior?

I am pretty sure many users are going to think it is a correctness check and not an optimization attribute.


I'd rather not have a basic feature be put behind needing to define an arbitrary NDEBUG; having to define your debugging setup around NDEBUG would not fit some things - e.g. in my case I'd end up having to always define NDEBUG, and continue with my own wrapper around things. (with <assert.h> you have the option of writing your own thing if you don't like its NDEBUG check, which is what I do; with unreachable(), if you literally cannot get its functionality other than NDEBUG, you're stuck with NDEBUG).


Because C is a "do exactly as I say" language.


It can, just turn on UBSan.


unreachable() is just the standardized form of __builtin_unreachable() (gcc/clang) and __assume(0) (MSVC).

I often have a macro called UNREACHABLE() that evaluates to assert(0) or __builtin_unreachable() depending on NDEBUG.

It improves the generated code a bit.

One trick is to define ASSERT(x) as a wrapper around assert() or something like

    #define ASSERT(x)  do { if (!(x)) unreachable(); } while (0)
This is a really nice way to tell the compiler about invariants -- and generate better code (and better warnings!).

There are no fuck ups involved. None.

constexpr is great because it reduces the need for macros. There are three features that make almost all macro use unnecessary. They are enums, inline, and constexpr. Good enough inline support has only really been available for a few years -- by "good enough", I mean "doesn't slow down CPU emulator code".

Things are really looking up for C, in my view.


What's wrong with constexpr?


It's a crippled form of code generation in an era where compile-time execution is supported by many modern systems languages.

But I don't like any of them. I prefer to write my own code generators.


C doesn't have that version of constexpr. In C2x, constexpr is just a way to define constants, like

  constexpr unsigned long long kMyBigNum = 0x1234123412341234ull;
Previously, you had to #define. Using enum causes problems when it doesn't fit in an int. And const doesn't mean the right thing:

  const int kArraySize = 5;

  void MyFunction(void) {
    int array[kArraySize]; // NO!
  }
The above function will work if you have VLAs enabled, or if your compiler specifically allows for it. It's nice to have a standardized version that works everywhere (VLAs don't work everywhere).


Constexpr in C is essentially what const should have been, it's not as "powerful" as the C++ version.


The most desired C++ feature I'd like to see is the automatic typedef'ing of structures. Does anyone know why such a fundamental thing hasn't been implemented yet?


I disagree that this is a fundamental thing.

C++ doesn't exactly do "automatic typedef'ing of structures".

The difference is that in C++, if you define a type "struct foo", you can refer to it either as "struct foo" or as "foo" (likewise for class, union, and enum).

In C, if you define a type "struct foo", its name is "struct foo". If you want to call it "foo", you have to define an alias using "typedef".

Personally, I see "struct foo" as a perfectly valid name. I seldom feel the need to define another name for it. (typedef is the only way that a C type's name can be a single user-defined identifier; "int", "char" et al are keywords.)

I'll define a typedef for a struct type only if the code that uses it shouldn't know that it's a struct type.

Yes, it's a little extra typing. I save up the keystrokes I save by typing "{" rather than "BEGIN" and use them to type "struct". 8-)}

This is a matter of personal taste, and if you want to call it "foo", there are common idioms for doing that.


The common workaround isn't too bad:

    typedef struct { float x, y; } vec2;
...or if forward declaration is needed via a named struct:

    typedef struct vec2 { float x, y; } vec2;


Because it would break a lot of existing code.


How would falling back to an existing structure definition when an otherwise-undefined type name is found break anything?

Compilers already know what you want to do as it will print an error such as: "unknown type name ‘Vec’; use ‘struct’ keyword to refer to the type".


So you want Vec to refer to struct Vec but only if there is no other type Vec defined before or until one is defined later? That would work, but might be a bit confusing.


C++ has struct & typedef and things work quite naturally. It always seemed like an obvious thing to bring to C, but I'm not sure about the nuances of the rules governing this.


C++ had this forever (I assume), but for C this would be a breaking change which we try very hard to avoid.


Could you please give an example of what would break? Perhaps I'm being dense, but it seems a new C standard supporting this would still compile existing code just as C++ can.


In C, it's perfectly legal to do this:

    struct S { ... };
    typedef int S;
That's not valid in C++ (so would be a breaking change in C, if it were to adopt this).

I don't really think changing this in C would break all that much code, but it's definitely not backwards compatible.


That's how it works in C++ for an object's members inside member functions, since the this-> prefix is optional. I hate it, but there's precedent.


> The auto feature is only enabled in C2X mode. In older modes, auto is a redundant storage class specifier which can only be used at block scope.

Is it redundant? I'm no C expert, but I remember several very specific cases where auto is necessary, and not using it leads to problems. Nested functions being the big one I remember.

The 'new' meaning for the 'auto' keyword doesn't seem compatible with this change. Won't these affect lots and lots of code that relies on nested function semantics?


Nested functions are not C.

It's a GCC extension, but it is not used much. Many Debian/Red Hat packages that used it had it removed more than a decade ago because the implementation did things like stuffing code into the stack and executing it. When we began hardening our machines and disallowing that kind of behaviour, those packages either had to be upgraded to not use that feature or be removed.


My software and Debian package uses nested functions. I use it because code quality improves a lot with nested functions. But yes, the executable stack is a problem. I had a patch for GCC to use function designators which would have removed this problem, but it was not accepted due to some backwards compatibility issues. I think a generic solution could be a new function pointer type.


Nested functions are a GCC extension, not standard C. As far as I know, there's never been a reason to use the auto keyword in standard C.


Interesting, I hadn't realised...

So in standard C, all function prototypes and definitions must always appear at top level?


That is correct, yes. You can limit the scope of a function to a single file using the static keyword; otherwise all functions are in the same global namespace for static linking.

In practice, this is not a big limitation by itself. Although C has function pointers, it's not a functional programming language -- there are no closures, and you can't create new functions at run-time under any circumstances. (Not within the language itself, anyway. You'd need to embed a compiler, or at least a minimal code generator.) On Harvard-architecture systems like microcontrollers, code and data may be in different memories (ROM vs. RAM) with different access permissions, and some platforms even put them in entirely separate address spaces.


Definitions, yes. Declarations can appear inside functions, but it's not common to see.

    int f(int x)
    {
      extern int g(int);
      return g(x + 1) - 1;
    }


This is doubly interesting, since the whole point of auto in the nested scenario is precisely to prevent the default "externness" of the declaration.

But if the definition itself cannot appear in the enclosing block scope in the first place without the GCC extension, then I suppose an auto keyword here would still be meaningless (if not misleading or dangerous, possibly even UB).


auto is not allowed for a function declaration with block scope without the GCC extension.


Is it just me, or is C turning into a C++ dialect?


ANSI adopted several features which first appeared in C++ when they drafted the first C standard for 1989.


These are not really C++ features, it's just basic features included in most similar languages these days.


It’s just you.


Evidently not.


My dream is that someday C will have safe varargs functions that know how many arguments the caller passed in.


Still no #embed


#embed will be in the C23 standard. It's not yet implemented in GCC 13.


Huh? Embed is there (and this is great).


#embed isn't mentioned in the post and I'm pretty sure it didn't make it into GCC 13.




