This just goes to show how hard it is to create an API to last 30 years. So much has changed since then, but libc needs to keep the same API. Things like Unicode and the internet (which brought the need for security to the forefront) have come into popularity since then, but you can't fix the old functions.
So one should focus on creating versioned APIs, IMO. Nobody does that for some reason. I should be able to declare that my program uses the v2023 API and still consume libraries which use the v1975 API.
No, one should focus on creating smaller independent modules and not mix unrelated stuff like networking fundamentals with string convenience functions.
Are you familiar with linker hell? I agree in theory with your suggestion, but not for C or C-like languages which have such a poor story for linking compilation units.
That doesn't work in languages with nominal types, and even structurally-typed languages can have a hard time.
You call a function that returns a v1975/SomeStruct and pass it to a function expecting a v2022/SomeStruct. What does the new function do when some of the struct fields are set to invalid values?
What if the type that got passed was a "Some1975Struct"? Then you'd get a big, blaring error, because it's not the same type as a "Some2022Struct".
The naming convention isn't important. If the structs are from different versions of the library then you either have a type error (nominal types) or have to figure out the semantics of arbitrarily uninitialized fields.
I think you may have lost track of what this thread is discussing.
To be more concrete, let's say you have the following code (pseudo-Go):
package main
import "example.com/http_server" // v2
import "example.com/utils" // transitively depends on http_server v1
func main() {
var srv *http_server.Server = http_server.New()
utils.ServeHttp(srv)
}
You're going to have a problem, because ServeHttp expects an HTTP server struct from the v1 API, and you're passing in a struct from the v2 API.
If you tried to write an adapter for this then (1) you'd need to somehow construct a concrete proxy of a type you don't control, and (2) there's no way to import both the v1 and v2 APIs at the same time in your code, so you don't have a way to declare a conversion function.
That's why APIs with versions declared by dependency edges are impractical.
... where "build system" means a bunch of preprocessor macros, at which point you just use the oldest version of everything available and never update anything that works.
1. Every source file must be annotated with language version.
2. Every language version can remove things from previous language versions. More like "hide", I guess, but you can't compile a source file that uses removed features or APIs. So to migrate to a new version you're supposed to change your sources, preferably using some migration tools, but that's out of scope.
3. You can still access the old API indirectly, for example if you're using a library which uses the old API and returns a struct from it.
4. There should be some well-thought-out rules for the situation you describe, to make structs forward- and backward-compatible as much as possible. Maybe even some implicitly run migrations to convert between structs of different versions.
5. If there's no way to automatically convert a v1975/SomeStruct to a v2022/SomeStruct, then you can't do that implicitly and need to convert it manually.
This is a hard problem and must be thought through at every level: data layout compatibility, ABI compatibility, type system compatibility. But I'm not convinced that it's an unsolvable problem. And if solved, it would provide great benefit, allowing a lot of freedom and agility for language development.
It really doesn't, though. In principle Editions could provide an escape hatch, but using them that way would be so drastic as to be largely unthinkable.
Because of Rust's commitment to long-term stability, all of the standard library is there forever, even if it's a mistake and thus deprecated. std::u8::MAX will be in the library forever, even though you can write u8::MAX to get the same constant; name.trim_left_matches(remove) will exist forever even though you ought to write name.trim_start_matches(remove), because "left" assumes an LTR writing system.
If NonZeroI64 is a bad idea, too bad it's in the standard library forever. If AddAssign is a bad idea, too bad it's in the standard library forever. The language syntax is allowed to evolve via Editions, but the standard library never breaks backward compatibility.
Whereas C++ versions are each distinct languages that aren't necessarily interoperable, Rust Editions promise to interoperate, and this is used all over the place. As a result, Editions can only "really" touch syntax (how you, in some sense, spell Rust), not what it means.
Essentially, Rust 2015 Edition and Rust 2021 Edition are exactly the same language except that the spelling is different, and as a result there are some things you can spell in one that you can't spell in the other. In Rust 2015 Edition I can name my function async, because I want to; in Rust 2021 Edition that's fine too, but it's written "r#async". Same function, same name in a sense, but different spelling. On the other hand, if I have an actual async function in Rust 2021 Edition, I can't write it in the 2015 Edition at all: there's no way to spell the async keyword in that edition.
The library however, is mostly semantics, not syntax. We care about what it does, not how to spell things on the whole.
But the point is that there's no reason why editions couldn't touch semantics, too.
There's no reason why they couldn't add a new attribute
#[available_in_editions(2018, 2021)]
that could allow them to actually remove deprecated functions, structs, traits, macros, whatever in newer editions. Code written for edition X would still compile and work, but code written for edition Y wouldn't be able to use things marked as unavailable.
So, with this hypothetical attribute, the thing isn't gone, but it doesn't compile any more, whereas with the current situation the thing also isn't gone, and you get a warning (unless you told the compiler to forbid this rather than warning, in which case it doesn't compile).
This is identical to a [really_deprecated] attribute. We know it's deprecated but kelnos wants to force us to use a separate crate to use it for some reason.
Hey, don't get me started on semantics... what if you trim from the left or right, but are using RTL. Imagine your surprise when the exact opposite happens.
Well, that and also a showcase of just how BAD the design of the C stdlib string functions was, and continued to be with every iteration where someone introduced a new bad-and-poorly-thought-out API to replace the old bad-and-poorly-thought-out API.
It still boggles the mind how a function that leaves a C string potentially unterminated ever even made it into the standard...
> Easy to say with 50 years of hindsight. These are the pioneers we are talking about here.
Eh, well kinda. Languages with decent string handling predate C, so it's not like there wasn't precedent to follow. The creators of C were pioneers, but they were also people who favoured a quick, hacky approach over a clean careful one. That has certain advantages, but it's rather unfortunate that C has become so foundational and we've been stuck with those hacks.
What would those languages be? I'm only used to relatively modern languages with "decent string handling", but they almost always treat strings as opaque high level objects with dynamic memory allocation under the hood. Such an approach wouldn't exactly fit into the C philosophy.
Also, once Unicode is added to the mix (which involves a lot more than just the text encoding), a decent string processing library isn't exactly trivial: it either needs a very big chunk of the stdlib, or a handful of specialized 3rd party libs.
Even in Zig, which has a very decent modern low-level approach for handling string data, people used to high level string objects would probably be shocked at how 'inconvenient' it is to work with string data (which can be solved with specialized libraries though).
Not in the way that is described here. I took "always treat strings as opaque high level objects with dynamic memory allocation under the hood" to mean that the string would have operations, like append, that would reallocate/resize the string if needed. C strings are definitely not that.
While strcpy is forgivable considering it came from the primordial soup, there is no excuse for strncpy since it was designed to solve the problems with strcpy. And the string library is full of gaffes like this.
No, strncpy is not "designed to solve the problems".
strncpy is a function for filling out fixed-size data structures, such as some on-disk data structures: given a variable-length string, it right-pads the structure with null bytes.
It does exactly what it was supposed to do, something that was pretty useful in 1970s UNIX programming but rarely if ever what you need today.
The "needs to" is debatable. Compared to the C++ stdlib, the C stdlib is so small that adding a modernized and incompatible "v2" next to "v1" is realistic. The effort could start as a 3rd party implementation similar to MUSL.
The old headers with the old APIs would still exist for "legacy code" but would generate "deprecated" warnings.
Once that new "3rd party stdlib" has proven itself in the real world, the C committee might consider it for inclusion in the standard.
No, at the very least I would add a (reserved) stdc_ prefix to all stdlib functions (and defines, and header filenames...), and keep options open for API versioning (e.g. stdc2_...).
That way we could also get rid of all the random reserved identifiers we need to (theoretically) adhere to now.
> As with volatile, C is using the type system to indirectly achieve a goal. Types are not atomic, loads and stores are atomic
I don't get this argument. The same wording could be used to make pointers untyped: "pointers are not inherently typed; loads and stores are typed" (in fact, LLVM IR recently made pointers untyped too). And yet there's clearly value in having them typed at the language level. Similarly here, there's clear value in making atomicity part of the type system, if only to prevent the footgun of someone forgetting to mark a load as atomic/volatile.
These type qualifiers cause the opposite footgun. The programmer confidently writes compound assignments, which look like they're atomic/volatile thanks to the type qualifier, but of course they are not.
With intrinsics you don't have that problem: you just can't write the compound assignment, because it doesn't exist, whereas in C it is quietly compiled to two separate operations.
C++20 deprecated this nonsense, but at almost the last opportunity WG21 voted to un-deprecate it for C++23.
I'm not a big fan of restarting the discussion from the big Reddit argument on the same topic[1], but as I understand it: from embedded (or at least some developers') POV, this is a non-concern. Volatile never implied atomic in regards to interrupts, and pretending it should is wrong. On some platforms, single volatile load/store (whether `*ptr = 1234`, or `volatile_store(ptr, 1234)`) already can compile to "two separate operations" that can be interrupted in the middle. With that in mind, you are already supposed to be aware of that and execute such operations in no-interrupt contexts, and having this also apply to compound assignments is no more of a footgun than any other operation on such memory.
(and if you do not need to care about the above for your platform/use-case, then I don't see why you would care about whether compound assignment compiles to one, two or more discrete opcodes)
Not to mention, `REG |= 0x4` is just too entrenched (and seen as idiomatic) on some platforms.
> With that in mind, you are already supposed to be aware of that and execute such operations in no-interrupt contexts
It's certainly possible embedded C programmers have told you this, but, imagine if this was actually true - you mustn't touch these MMIO registers unless interrupts are disabled. But, wait, how do we turn off the interrupts? That's an MMIO write, which supposedly we mustn't do until the interrupts are switched off...
The paper mentioned in that Reddit post isn't actually what ended up happening at Kona, by the way, though I assume the Redditor didn't know that. The paper's authors weren't able to produce any evidence at all that this is used correctly in practice (e.g. a survey of 100 microcontroller C++ projects which use compound assignment showing that yup, no correctness bugs here), and they could only explain how it might be used correctly for some bit ops, so their paper just un-deprecates the bit ops. As a result x /= 23; would have remained deprecated on volatile x, a small piece of sanity.
After all your micro-controller might (some do, some don't) have a single CPU instruction which atomically clears bit six of I/O register 0x3B but it's fair to say it definitely doesn't have a CPU instruction which somehow atomically divides that register by 23. Because nobody needs that.
However, at Kona WG21 voted (though by the smallest margin at the event) to undo the whole deprecation. So you can write x /= 23 in C++ 23 without even a warning that this isn't sane.
At Kona the committee also was very exercised about EU and US agencies pointing out that writing more C or C++ is a terrible idea because these languages are unsafe. Surely, several prominent WG21 members harrumphed, it's wrong to treat C++ the same as C on this issue. And yet, on this relatively trivial issue of volatile compound assignment, keeping C++ consistent with obsolete C that might not even exist was considered to trump safety considerations at the same meeting, with the same people.
As to REG |= 0x4 what you probably should look for are actual intrinsics for your platform, which do only and exactly what the platform can actually implement, rather than offering the "Eh, we'll just muddle along and maybe it'll work" approach these C SDKs have today. This is less error-prone, and can often be more efficient.
It's also very wrong for volatile: the compiler has to handle a volatile type differently from regular types: you cannot 'cache' the value in a register, and memory must be updated on every store.
I understand that wm4 was probably a pretty difficult person to work with, but he got stuff done well and from the safe distance of my chair he was a rather entertaining fellow. The other mpv devs can't miss him but I kinda do.
When Go showed up on the scene, I was just about losing my patience with libc, and one of the things I found miraculous about Go was "no more strtok and friends".
> If you would like to see interesting innovation, check out what Cosmopolitan Libc is up to. It’s what I imagine C could be if it continued evolving along practical dimensions.
I found that sentence weird because when I checked Cosmopolitan, it included most if not all of the functions this article was claiming shouldn't be used.
It's not as strange as you might think. Back in the classic Macintosh era (pre-OS X), it was quite common to write programs in C which essentially ignored libc. It was not a Unix system, so C's filesystem API made for an awkward fit, and the system's native callback-based IO was more efficient anyway. For similar reasons Mac programmers had little use for C's string functions or its allocator. Nor was there any terminal, so printf had little value.
I never spent any time in the early DOS/Windows world, but from what I heard similar coding patterns were found there.
I learned C on the Amiga, and similar patterns were often followed. Example: we generally used the platform-specific AllocMem instead of malloc, even though a mostly complete C library implementation was available. One reason was AllocMem gave you finer control of what sort of memory was allocated (Amiga had "chip" memory, "fast" memory, and other oddities I now forget.)
For many things there are much better versions of the functions available in the operating system. For example on macOS you really should be using Core Foundation for strings, dates, etc.
In a cross-platform application you can create a platform layer that wraps OS API usage and provides a POSIX fallback for the platforms you don't directly implement.
> Without libc you don't have to use this global, hopefully thread-local, pseudo-variable. Good riddance. Return your errors, and use a struct if necessary.
On the contrary, you should absolutely use errno, if only to report your IO errors.
Or maybe he never does IO. After all, a program that doesn't output anything is likely safer, since those are all evil side effects. I can just see that new programming paradigm, "side-effect-free programming", taking the world by storm.
Just based on that gaffe I'm not sure I can take anything else in that post seriously. It's clearly not based on any practical experience.
It's not a gaffe, it's moving C in a direction that's safer and more like modern languages by removing mutable global state. That's what this means: "Return your errors, and use a struct if necessary."
If you don't understand why errno is bad then it'll be hard to explain it.
Errno is fine if you're accessing it right after a stdlib function that sets it, because it's the equivalent of a function's error code. Once you get out of that context errno becomes less and less useful and probably shouldn't be used, because without that context you won't know who/what actually set it.
“Previously both POSIX and X/Open documents were more restrictive than the ISO C standard in that they required errno to be defined as an external variable, whereas the ISO C standard required only that errno be defined as a modifiable lvalue with type int.”
Even now that it is guaranteed to be thread-local, it still is a bad API because of (same page)
“A program that uses errno for error checking should set it to 0 before a function call”
That library functions don't reset errno is considered a feature/convenience: it allows calling a bunch of them and then checking whether the entire sequence succeeded or failed.
Obviously this assumes the failure of one function does not trigger UB down the line, and that you care about general but not specific failure. And also obviously, this is easy to replicate with an API which returns error objects/codes.
IIRC this is most(ly) convenient in graphics APIs, I don’t know how common leveraging this in the libc actually is.
It also requires code to explicitly set errno many, many times. I think there are many errors lurking in C code there, with code calling foo, not checking errno or checking it but not resetting it, later calling bar, and assuming errno was set by bar.
> It also requires code to explicitly set errno many, many times.
Yes, errno should be cleared every time you want to check it. Although you can check it multiple times with a single clear, as long as a set errno leads to an exit.
A more likely cause of the issue you outline is that errno is not context-local, though: if you call something which causes errno to be set, you might get an errno from an unexpected error. This is similar to overly broad exception contexts, but much harder to diagnose.
You won't find that on either ISO C nor POSIX specifications.
It only happens to be implemented that way in some environments, as a macro expanding to either a thread-local variable or some function call that retrieves the right errno.
Still, even that doesn't take care of signal handling.
That's a total strawman. Errno has nothing to do with signal handling at all. While errno is archaic, it is a completely sound design.
That you need to be very careful in signal handlers is common knowledge, and obvious from what they do; signal handlers, as a first approximation, behave as if executed on a separate thread, but it's even worse than that, since they hijack a running thread and thus block the hijacked thread from executing the previously running code. (So you could say they're like fibers / cooperative multitasking?)
The bottom line is, you can't touch errno in your signal handler, obviously -- just like you can't printf() or any other thing that has "critical sections". And since you don't do that, you're safe.
BTW2: I noticed on that page that errno is explicitly mentioned, but obviously only as an example of modifying thread-local data in a signal handler. As the docs point out, any such modified data must be restored before leaving the signal handler, so as not to break code running on that thread. So in that sense, thread-local variables like errno are "safer" than functions like printf() that take locks: you _can_ touch them in a signal handler, but you need to restore them.
So, what does matter? What is your actual statement? That you can't use errno in a multi-threaded environment? That you can't use errno in the presence of signal handlers?
What exactly is your complaint about the GNU link? As far as I can see it more or less regurgitates what POSIX has to say about async-signal-safety.
You made some vague statements that a mechanism used on billions of devices is fundamentally unsound, while providing no evidence other than maintaining that it would break in conjunction with signal handlers, when signal handlers are expressly to be used with caution, no matter what you intend to touch.
If you worry that you can't safely touch errno in a signal handler (provided that you reset it) on some broken platform from decades ago, because a spec like this [0] is too much to swallow (understandably so, since that spec is a futile attempt to reconcile all the relevant history into a useful document), then I have a solution: don't touch errno in a signal handler, which is probably solid practical advice anyway.
Oh, and how do you reconcile the fact that many syscalls which may themselves set errno are declared async-signal-safe by POSIX?
"Operations which obtain the value of errno and operations which assign a value to errno shall be async-signal-safe, provided that the signal-catching function saves the value of errno upon entry and restores it before it returns."
Where "shall" is usually understood as equivalent to "must". Are you going to cite one reliable source for the adventurous claims you keep making, or are you just going to continue putting out blatant falsehoods and strawmen?
> As for soundness, it is C we are talking about here.
Well the only thing we're doing here is trolling, nothing of substance has been put on the plate so far.
Still doesn't cover signals, only works if C11 threads are being used (it's an open question whether OS threads follow the same TLS mechanism), and those C89 and C99 code bases get nothing from it anyway.
You can call any system function that is marked as async-signal-safe. Whether that's a great idea is a different question.
> Don't expect a compiler error, this is C, where the programmer knows best.
As for this snark, these considerations are pretty much OS and concurrency issues that transcend any implementation language [0]. No one prevents you from switching back to single-threaded, cushy, OOP-y JavaScript; just note that someone has to implement and run that for you.
[0] yes, signals aren't beautiful, and in 2023 a beautifully designed OS might be better off delivering asynchronous messages using event handling and/or dedicated threads. However, that's computing history for you, and it can be worked with. If you don't like it, find a different platform. C doesn't really require signals, and if you use Unix with a different language, signals won't just go away.
The main issues with errno are that you need to remember to reset it before you enter a context whose failability you care about, so any code which relies on errno must be:
errno = 0;
// code which may error here
if (errno) { … }
And that errno is notably not scoped, so the code which may error should only be composed of calls to libc, or code whose interaction with libc you understand perfectly, otherwise you need to save and restore errno around uncontrolled calls.
> The main issues with errno are you need to remember to reset it before you enter a context whose failability you care about, so any code which relies on errno must be:
> errno = 0;
> // code which may error here
> if (errno) { … }
Which are the functions that set errno, and only errno, on failure?
In practice, you will clear errno once, and then repeatedly check for failure after every call to a libc function.
> And that errno is notably not scoped, so the code which may error should only be composed of calls to libc, or code whose interaction with libc you understand perfectly, otherwise you need to save and restore errno around uncontrolled calls.
Consider you are trying to do a foreign function call from an interpreted language. You make the call and then want to check errno to see what the error was, but you don't know what precise C calls the runtime may have made in the meantime, or whether it might have reset errno. The only reliable thing to do is to mark library functions explicitly as modifying errno, and storing that somewhere else so that you can reliably retrieve that value.
It's a pain, and if it's not done correctly by everyone then it leaves you with subtle intermittent bugs.
> Lots of APIs allow reporting of I/O errors without use of an errno-like construct.
libc included: the C11 threads API returns some errors directly, using the constants thrd_busy, thrd_nomem, and thrd_timedout. Unfortunately, it also uses a catch-all constant, thrd_error, for other errors.
The pthreads API returns errno values directly as return values, but that's POSIX, not C.