Setenv Is Not Thread Safe and C Doesn't Want to Fix It

pitdicker · on Nov 20, 2023

This also caused a lot of trouble for time libraries in Rust. The two foundational libraries, chrono and time, rely on localtime_r to get the local time instead of the clock value in UTC. localtime_r reads the TZ environment variable (and optionally others like TZDIR). Rust declares it safe to modify the environment, while POSIX declares it unsafe.
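To make the dependency concrete, here is a minimal C sketch of how the local-time result follows whatever `TZ` currently holds. The helper name `local_hour_in` is made up for illustration, and the call to `tzset()` is explicit because POSIX does not require `localtime_r` to re-read `TZ` on its own. This is an illustration, not how chrono or time call into libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: hour of day for instant `t` under the POSIX TZ string `tz`. */
int local_hour_in(const char *tz, time_t t)
{
    struct tm out;

    assert(tz != NULL);
    if (setenv("TZ", tz, 1) != 0)   /* mutates the process-wide environment */
        return -1;
    tzset();                        /* re-reads TZ into libc's time zone state */
    if (localtime_r(&t, &out) == NULL)
        return -1;
    return out.tm_hour;
}
```

Every thread in the process shares that one `TZ` value, which is why a `setenv` in one thread can change (or corrupt) what a concurrent `localtime_r` sees in another.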

CVE-2020-26235, RUSTSEC-2020-0071 and RUSTSEC-2020-0159 were opened against the crates. That left the Rust ecosystem with a pretty much unsolvable issue for many months. Chrono went with the solution to parse the timezone database of the OS natively and read the environment using the Rust locks. Time tries to detect if the libc version has thread-safety guarantees to access the environment, and otherwise panics if there are multiple threads.

More reading: https://docs.rs/chrono/latest/chrono/#security-advisories

lifthrasiir · on Nov 20, 2023

To be exact, it was Chrono and time-rs 0.1, while time-rs 0.2 and later was rewritten from scratch and didn't have that issue... because the new time-rs didn't yet support general time zones other than fixed offsets. The accepted solution for Chrono surprised me a lot, because as far as I reckon it was the hardest solution. (Disclaimer: I'm the original author of Chrono.)

But a bad API design doesn't end at environment variables. Many POSIX systems rely on `/etc/localtime` to define the system-wide time zone, and every `localtime` call has to check if the file has been changed or not because there is no way to subscribe to the system-wide time zone change event. Of course there is a cache, but many libcs call at least `stat` per each `localtime` call AFAIK. I had even experienced a possible glibc bug due to the lack of guard against I/O error during this process [1]. Windows got this right, I can't see why POSIX couldn't do the same when it does have an asynchronous signal delivery mechanism anyway.

[1] https://news.ycombinator.com/item?id=9953898

mprovost · on Nov 20, 2023

Unix was "designed" (if you can call it that) a long time before it was possible to move a running system between timezones. So many of these decisions were made in completely different circumstances (I almost said environment) and are lying around like old WWII bombs just waiting for someone to dig one up.

mcguire · on Nov 20, 2023

Presumably, GNU Hurd will fix these issues without introducing fun new ones.

vacuity · on Nov 21, 2023

Poe's law; I can't tell if this is a joke. I was under the impression that Hurd is not going to be ready for usage any time soon (a while).

mcguire · on Nov 30, 2023

That is the joke.

pitdicker · on Nov 20, 2023

My respects for your work on Chrono!

And you are right about time-rs (or I think you are). Version 0.1 was never fixed, and version 0.3 does the OS and thread count checks.

It does have some advantage for chrono to do everything in Rust: it can now return two results for ambiguous local time during DST transition fold, and properly return None during a transition gap.

lifthrasiir · on Nov 20, 2023

> My respects for your work on Chrono!

Thank you. To be frank as a first-time maintainer I did a mediocre job---my biggest regret for Chrono is that I did know most forthcoming issues beforehand and yet didn't take enough time to make them public and explicit so that someone else could prepare for the future.

account42 · on Nov 20, 2023

> Many POSIX systems rely on `/etc/localtime` to define the system-wide time zone, and every `localtime` call has to check if the file has been changed or not because there is no way to subscribe to the system-wide time zone change event.

But you can subscribe to file change events so why not do that?

lifthrasiir · on Nov 20, 2023

I did seriously consider inotify back then, but in order to take advantage of inotify I had to parse all binary TZif files (because otherwise I still had to call `localtime` that would `stat` every time anyway). It was so cumbersome that it was only halfway finished when I stepped down as a maintainer. Hence my surprise when I learned that someone actually did implement all of them.

mjevans · on Nov 20, 2023

Offhand, and a quick google search; I was unable to find the exact definition / specification for how time-zone data must be obtained rather than how it happens to conventionally be obtained.

It is entirely reasonable that any of the following _might_ be valid behavior.

* Simple but syscall heavy approach which re-reads the env, and possibly /etc/localtime each call and has no stability. (Results may mutate as other processes / threads change things.)

* Same as above, then caches the decision result for some application specific reasonable time; which may be until the application exits.

* The elsewhere mentioned stat / inotify approaches that only track updates to /etc/localtime (and ideally update the cached decision result when notified).

All approaches seem valid. It's sort of like the hostname or any other system level configuration where a reboot may be a reasonable expectation for a complete update.
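As a rough illustration of the `stat`-based variant, the sketch below remembers the identity and modification time of a file and reports whether it changed since the last check. The type `tz_file_stamp` and the function `tz_file_changed` are hypothetical names, not taken from any libc, and the comparison uses whole-second `st_mtime`, so a rewrite within the same second could be missed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cache record: what the file looked like at the last check. */
struct tz_file_stamp {
    dev_t dev;
    ino_t ino;
    time_t mtime;
    bool valid;
};

/* Returns true if `path` changed since the last call, and refreshes the stamp. */
bool tz_file_changed(const char *path, struct tz_file_stamp *stamp)
{
    struct stat st;

    assert(path != NULL && stamp != NULL);
    if (stat(path, &st) != 0) {
        /* File missing or unreadable: a change only if we had seen it before. */
        bool was_valid = stamp->valid;
        stamp->valid = false;
        return was_valid;
    }

    bool changed = !stamp->valid
        || st.st_dev != stamp->dev
        || st.st_ino != stamp->ino
        || st.st_mtime != stamp->mtime;

    stamp->dev = st.st_dev;
    stamp->ino = st.st_ino;
    stamp->mtime = st.st_mtime;
    stamp->valid = true;
    return changed;
}
```

A caller would re-parse the zone file only when this returns true, which still costs one `stat` per check; the inotify approach avoids even that.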

lifthrasiir · on Nov 21, 2023

That was also my thought. To my knowledge `/etc/localtime` is the creation of Arthur David Olson, the founder of the tz database (now maintained by IANA), but his code never read `/etc/localtime` multiple times unless `TZ` environment variable was changed. Tzcode made it into glibc but Ulrich Drepper changed it to not cache `/etc/localtime` when `TZ` is unset [1]; I wasn't able to locate the exact rationale, given that the commit was very ancient (1996-12) and no mailing list archive is available for this time period.

[1] https://github.com/bminor/glibc/commit/68dbb3a69e78e24a778c6...

wavesquid · on Nov 20, 2023

I believe systemd has a way to subscribe to timezone changes.

SkiFire13 · on Nov 20, 2023

> Rust declares it safe to modify the environment, while POSIX declares it unsafe.

Arguably, Rust declares it is safe to modify the environment through its stdlib methods. The tricky detail is that this means it is unsafe to read/modify the environment through other means, but sometimes this is really hard to avoid.

manwe150 · on Nov 20, 2023

Does rust also add a pthread_atfork handler? Otherwise, it seems likely still unsafe for rust to claim to support calling fork (for execv) or posix_spawn, as most libcs call realloc on the `environ` contents, but do not appear to take any care to ensure that (v)fork/posix_spawn doesn't happen concurrently with that. Worse yet, the `posix_spawnp` API takes an `envp` parameter and expects you to pass it the global pointer `environ`, which is completely unsynchronized across that fork call. It is not obvious to me that this is a security gap, but certainly it seems to me that this would violate rust's safety claim, if it is not taking added precautions there.

The Apple Libc appears to just unconditionally drop the environ lock in the child (https://github.com/apple-oss-distributions/Libc/blob/c5a3293...), while glibc doesn't appear to even bother with that (https://github.com/bminor/glibc/blob/6ae7b5f43d4b13f24606d71...)

connicpu · on Nov 20, 2023

I don't think Rust's stdlib provides any kind of safe way to call just fork(), it only has methods for creating child processes because that's the only interface that works on every supported Tier 1 platform. Calling fork is always going to necessarily be an unsafe{} libc call or syscall, and the caller will have to take care to ensure nothing funny is going on.

namibj · on Nov 20, 2023

There are OS specific APIs where needed, probably also for threads.

connicpu · on Nov 20, 2023

`std::os::unix` does add some additional methods in that vein like exec(), but no fork(). `std::os::linux` only adds the ability to get `pidfd`s for child processes you create. There's simply no safe way for the stdlib to provide safe fork() without knowing a lot of things about how you're going to set up your process and what other libraries you might pull in that may not be fork-safe. If you're willing to ensure you only call it in a safe way, you can still call fork, the language just cannot guarantee it will be safe, same as when you're doing it in C.

asveikau · on Nov 20, 2023

> The tricky detail is that this means it is unsafe to read/modify the environment through other means, but sometimes this is really hard to avoid.

If you have C and Rust in the same process and C code calls setenv(3), for one ...

Edit: why downvotes? It's very typical to link to C libraries which may call the libc environment stuff ... My point is you can't control library code as easily, if it's some dependency of a dependency eventually calling libc.

tsukikage · on Nov 20, 2023

If you are modifying TZ while another thread is relying on it to calculate time, those threads are racing, and hiding the crash won't solve the race: the reading thread will now randomly return values in the wrong timezone instead, subsequent code will use it in whatever operation it is it wanted the time for, the end result will be garbage, and this will be super hard to debug because there won't be a loud obvious crash pointing to the root cause and also depending on the winner of the race the symptoms will be random/intermittent.

Fix the high level race, and suddenly you no longer need the low level mutex.

asveikau · on Nov 20, 2023

> the reading thread will now randomly return values in the wrong timezone instead, subsequent code will use it in whatever operation it is it wanted the time for, the end result will be garbage,

I really strongly disagree with how bad you seem to think this is. If you are designing your application to use the timezone and modify it at the same time, it is a totally natural consequence that you may see the previously set time zone in a timing dependent fashion. That's the nature of the beast. To "solve this" is seemingly to make that other thread capable of time travel or something. It read something before it was written, and acted on it. Reasonable!

The harmful data races are when you read intermediate results. If setting the timezone is a multi-step process, or involves manipulation on complex data structures with pointers that might be deallocated, then you are in grave danger. Seeing a previously valid result is ... I honestly don't know how you'd expect to solve it without threads being able to see the future, or some other unreasonable expectation.

formerly_proven · on Nov 20, 2023

> If you are modifying TZ while another thread is relying on it to calculate time

environ is a single contiguous null-terminated segment of null-terminated key-value pairs; any change of any environment variable might reallocate it, changing the address and invalidating the old address.

Also why it's a bad idea to store the pointer returned by getenv, it might be invalidated by any environment modification.

Sprocklem · on Nov 20, 2023

The strings in environ are only contiguous at program start. In every libc I'm aware of, both putenv and setenv replace only the specified key-value pair (and possibly environ itself, if it needs to be larger) and should not affect the address of any other environment variables. It is still thread-unsafe, but far more limited in its unsafety.

comex · on Nov 20, 2023

In current glibc master, it's unsafe for any putenv/setenv to race with any getenv, even if the variable names are different, for two reasons. (Note that multiple calls to putenv/setenv are serialized by a lock, but getenv does not take the lock.)

(1) setenv resizes environ using realloc, which frees the old buffer, so getenv can end up reading from a freed array.

(2) The code does not use atomics or memory barriers, so on weakly ordered architectures, getenv could observe another thread's write to one of the pointers in the environ array, or to the environ pointer itself, while observing stale values for the memory behind it.

In both cases, getenv could end up returning a bogus pointer or just crashing.

However, those issues can be fixed without changing the API, and at least Apple's libc seems to do the right thing here. On the other hand, other libcs such as musl, FreeBSD libc, and even OpenBSD libc (!) do worse than glibc and have no locking at all.

If someone could convince the maintainers of all those libcs to add a lock and make getenv/setenv 'thread safe as long as you're not racing on the same variable name', then that would be a good starting point. But in my opinion it would still be a half-measure. We need a fully thread-safe environment.

And honestly, it might be easier to convince the maintainers to add a full solution than a half-measure, even if it involved API changes. (But it may be hard either way. Rich Felker showed up in a Rust thread a while back and was highly negative on the idea of making any changes to musl.)

mjevans · on Nov 20, 2023

IMHO - I am sympathetic to the BSDs, Apple (presumably forked BSD), and musl approach.

In what sane world would someone reasonable treat (initial shell) Environment Variables as a proper ACID-compliant database? About the 'best' solution I can see for preventing segmentation faults related to resizing the env array during runtime is to defer reclaiming freed memory chunks until after all in-process threads have been given another uninterrupted timeslice to process. Even that wouldn't be 100% but probably would cover any non-pathological case.

bensecure · on Nov 20, 2023

atomic ordering is very easy if you don't care about performance. So on the other hand we could ask why get/put/setenv have such a terrible need for performance that we can't afford to put a simple lock around them.

Sytten · on Nov 20, 2023

The shame of those CVEs is that they created a split in the rust community between chrono and time. For a time it looked like people were all moving to time (whose handling of TZ is a bit stupid IMO since it just refuses to work if there is more than one thread). But with chrono 0.4 now things are stale and there is no clear winner anymore.

I would argue that those splits are in great part responsible for the feeling that rust is hard to learn. I remember having to dig into pretty complex time code to understand why it broke our program that relied on timezone when we switched from chrono to time. It hinders your productivity for sure even if you learn the how.

rewmie · on Nov 20, 2023

> Rust declares it safe to modify the environment, while POSIX declares it unsafe.

There's your problem right there, and it ain't the behavior specified in the standard.

kibwen · on Nov 20, 2023

But Rust doesn't declare it safe to modify the environment in general. It declares it safe to modify the environment using std::env::set_var, which uses locking internally. The docs explicitly note that there's potential unsafety if non-Rust code modifies the environment:

"Note that while concurrent access to environment variables is safe in Rust, some platforms only expose inherently unsafe non-threadsafe APIs for inspecting the environment. As a result, extra care needs to be taken when auditing calls to unsafe external FFI functions to ensure that any external environment accesses are properly synchronized with accesses in Rust."

https://doc.rust-lang.org/std/env/fn.set_var.html

Ultimately the problem here is with Posix. Rust can only do so much to paper over the pitfalls in the underlying platform.

Although note that if you replace libc with eyra, then the behavior goes from thread-unsafe to "just" a memory leak: https://blog.sunfishcode.online/eyra-does-the-impossible/

Sytten · on Nov 20, 2023

There is an issue in the std to name setenv unsafe but that is a breaking change so it's complicated.

kibwen · on Nov 20, 2023

One problem is that marking that function as unsafe would unfairly penalize platforms like Windows that don't have this issue. Even if it turns out to be the least-bad compromise solution, it sure would be nice if we could have nice things.

pitdicker · on Nov 20, 2023

You are right. POSIX specifies one thing, the standard library in Rust and some other libraries specify something different. 'Safe to use unless there are other threads' is not really something you can or want to encode in a type system.

But libraries and users are caught in the middle.

eptcyka · on Nov 20, 2023

It is safe to use the Rust standard library interface.

pitdicker · on Nov 20, 2023

Unless the environment is also touched by a part of the program written in Go, Julia, I don't know... The lock is not shared across languages.

the_mitsuhiko · on Nov 20, 2023

> The lock is not shared across languages.

Which just to be clear: it cannot without changing the standard. There is really nothing anyone can do without a change in the standard.

bbatha · on Nov 20, 2023

However, to access any of those languages from rust you need to use unsafe.

eptcyka · on Nov 20, 2023

There is no safe way to access the environment, even if you mark this API unsafe, what are you going to do?

bbatha · on Nov 20, 2023

You can safely access the environment so long as you use the rust apis and don't have unsafe code that calls `setenv` without synchronization.

usrbinbash · on Nov 20, 2023

> C Doesn't Want to Fix It

Or: C knows that it doesn't need fixing.

How often do I need to `setenv()` anything? The answer is "Never" in the vast majority of programs, because envvars are usually read rather than set, so this issue is nonexistent for them.

For the vast majority of the small amount of programs that actually need to use `setenv()`, the answer is: "Maybe once or twice during the entire lifetime of the process, and then only at the very start, probably even before running any threads", meaning this issue is nonexistent for them as well.

So, is there a potential issue with thread safety? Yes. Does it matter given where and under what circumstances it occurs? Not really.

> such as Go's os.Setenv (Go issue)

Here is the link to the "issue":

https://github.com/golang/go/issues/63567

What kind of actual real life production code would continuously set envvars while simultaneously calling a function that tries to read the environment?

Yes, this is a footgun. But even the issue's author acknowledges, in the issue thread:

    Realistically: this is a pretty rare problem, and documenting 
    it is probably a fine solution. This is probably going to cost
    someone else a couple of days of debugging every couple of
    years

> It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it.

Source?

jeroenhd · on Nov 20, 2023

> Or: C knows that it doesn't need fixing.

People don't like APIs that can randomly crash your program while there's no good technical reason for why they should. Why not fix the problem? People like you, who have no issues with the current implementation, won't see any regressions because you're already a good citizen, and myriad other programmers whose programs do occasionally crash because of this will be helped.

> So, is there a potential issue with thread safety? Yes. Does it matter given where and under what circumstances it occurs? Not really.

"The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.

> What kind of actual real life production code would continuously set envvars while simultaneously calling a function that tries to read the environment?

The reproduction sample calls setenv in a loop so the issue can be reproduced. A single setenv anywhere in the code is enough to trigger the crash, but then you would get one of those "you need to run the program a million times to reproduce it" bug reports that gets pushed down the line.

usrbinbash · on Nov 20, 2023

> Why not fix the problem?

Because doing so breaks backwards compatibility, simple as that.

The problem isn't even that `setenv` isn't thread safe. The problem is that `getenv` returns a `char *` directly into the environment memory space. Many many many programs rely on that being the case.

> People like you

People like me would like every software to be perfect, but that's not the world we live in, so we are forced to be pragmatic. When fixing something causes more problems by breaking backwards compatibility promises, than it prevents, then there is no good argument for a fix, and the correct approach is to say "yes, this sucks, let's document it well so people don't waste too much time on this".

The setenv/getenv problem is such a case. Anyone who disagrees is free to fork glibc, implement whatever fix they think is adequate, and then try to compile the software packages found on a typical Linux server against the result.

> so the issue can be reproduced.

"Can be reproduced" and "is a common issue in production code" are not the same.

Fact is, almost all production programs that set envvars, do so once, very early in the process lifecycle, and then never again, and so are never affected by this.

mastax · on Nov 20, 2023

So why not implement the fix suggested in the article: improve the existing interface to the extent possible, and introduce a new interface which is easier to use correctly.

fch42 · on Nov 20, 2023

There is nothing to "improve" on the existing interface, really. From a C point of view ... a _hidden_ global lock is worse than no lock at all. Because in the latter case ... you, as the programmer, have a choice what to do. If you never call setenv(), no locks. If you only ever call setenv() in your startup code, no locks. If you only ever call setenv() after fork&co, no locks. And if you do believe you need to call it at runtime, but are singlethreaded ... still no locks. And if you really really really need to call it from a multithreaded process, concurrently with getenv(), then lock around both and make your getenv() "safe" wrapper create you an owned point-in-time copy - basically a getenv_r().
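A minimal sketch of that last option, assuming a pthread mutex owned by the application: the names `env_set_locked` and `env_get_copy` are invented here, and the scheme only helps if every environment access in the process goes through these wrappers. Library code that calls `getenv()` or `setenv()` directly bypasses the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Application-level lock; libc and third-party libraries know nothing about it. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: setenv() under the application lock. */
int env_set_locked(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}

/*
 * Hypothetical getenv_r-style wrapper: returns an owned point-in-time copy
 * that the caller must free(), or NULL if unset or out of memory.
 */
char *env_get_copy(const char *name)
{
    assert(name != NULL);
    pthread_mutex_lock(&env_lock);
    const char *cur = getenv(name);
    char *copy = cur ? strdup(cur) : NULL;  /* copy while the lock is held */
    pthread_mutex_unlock(&env_lock);
    return copy;
}
```

The returned copy stays valid after later `env_set_locked` calls, at the price of an allocation per lookup, which is the trade-off described in the next paragraph.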

Note also that "global references" like getenv() returns and point-in-time owned snapshots don't behave the same way. Say, a library initializer code could retrieve a number of env var references by calling getenv(), and then use those at runtime. No more need/use for getenv() again after - and even perf-sensitive code could look at the env var. With a func that copies, the perf-sensitive code would need to do that each time (lock, lookup, copy). Not strongly desirable.

Also ... UNIX is rather flexible ... and if you so wish, you _can_ substitute _your own_ setenv()/getenv() by the magic of dynamic linking. To create a set that locks and returns you leaked copies (changes the semantics of getenv so that the caller must free the pointer to avoid a leak). It's all possible to do this.

I'm getting the impression from this that we see a "go tantrum" here. "I make my own standards but I wanna use that C/Unix standard thing as well but not how it is because it's not nice it should take go into account waaaahwaaah ...".

It is not _nice_ to modify your own env at runtime. Maybe, just maybe ... that's for reasons. Because not everything that can be done is also a great idea.

rerdavies · on Nov 20, 2023

So why not implement it yourself, instead of polluting the standard runtime with functionality that nobody needs?

zare_st · on Nov 20, 2023

> People don't like APIs that can randomly crash your program while there's no good technical reason for why they should. Why not fix the problem?

I think you're not seeing this from the right POV. People that consume POSIX API need to know POSIX API.

https://pubs.opengroup.org/onlinepubs/009604499/functions/se...

It says loud and clear "The setenv() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe."

> "The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.

If you get a crash over setenv() reading the manual page of setenv C call should be your first step. And the only step. The bigger issue is in design of application that has wrongly assumed setenv() is thread-safe. That requires a refactoring and is solely due to developer misunderstanding the API.

tikhonj · on Nov 20, 2023

"RTFM" is not a coherent defense for awful API design and we shouldn't accept it as such.

zare_st · on Nov 20, 2023

Who is "we"?

I'm a UNIX/C programmer for decades and I don't care about this.

There is no such thing as beautiful API design. Every design is a compromise. If you think non-reentrant calls should be deprecated in POSIX take it to the committee.

There is a myriad of non-reentrant code both in POSIX spec and in libc implementations. You need to RTFM, I'm sorry.

There is no "coherent API" as far as null termination goes too. Some library functions deal with it, some calls don't. You need to RTFM.

I also want to know OP's reason to even use setenv() in a multithreaded piece of software. It's like an oxymoron. setenv and vars are useful to pass on data from parent process to forked children because they inherit the environment. If you use the threading model you don't need it. If your application is a single process setenv() is useless.

usrbinbash · on Nov 20, 2023

What should we accept? That every library is made under the assumption that it has to work as expected, even if people ignore the documentation?

As someone who made and maintains multiple libraries: No. Not gonna happen.

JohnFen · on Nov 20, 2023

Putting aside whether or not the design is awful, the fact that it's standardized and documented is absolutely a valid argument. Changing it now would break backward compatibility. That should always be a showstopper.

Programmers who are using any library code without reading and understanding the documentation are asking for trouble regardless of language.

The correct solution to your objections is to create new functions that behave as you prefer.

wredue · on Nov 20, 2023

The real skinny of it is that it’s in the name: “Environment”.

If you’re calling setenv in the middle of your program, you fucked up.

There are those things in programming that should be extremely triggering to your “what the actual fuck?!” senses, and “setenv in the middle of runtime” is one of those things.

kstrauser · on Nov 20, 2023

True, but for every envvar a program reads, something called setenv on it originally. It’s not like no programs call setenv in the middle of runtime. Examples:

- Shells

- CI runner

- Container launchers

- IDEs

DSMan195276 · on Nov 20, 2023

> but for every envvar a program reads, something called setenv on it originally

That's not true, that's just misunderstanding how it works. `execve()` takes an entirely new copy of environment variables to give to the child, that's the "real" way to do it.

tsukikage · on Nov 20, 2023

The child process's environment for these purposes is constructed without mutating its parent's environment - a copy is used - and before the child process actually runs the target code it was created to run. So there is no possibility of race between mutations to the environment and reads of the environment. If you are writing such a tool but doing something other than this, you are doing it wrong.
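A small C sketch of that pattern, using `posix_spawn` (which takes an `envp` array the same way `execve` does). The helper `run_with_env` and the choice of `/bin/sh` as the child program are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: runs `/bin/sh -c script` with exactly the environment
 * given in `envp`. Returns the child's exit status, or -1 on failure.
 * The parent's own environment is never modified.
 */
int run_with_env(const char *script, char *const envp[])
{
    assert(script != NULL && envp != NULL);
    char *const argv[] = { "sh", "-c", (char *)script, NULL };
    pid_t pid;
    int status;

    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, envp) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A launcher would build `envp` as a private array (for example a copy of its own variables plus overrides) and pass it here, so no `setenv` call is needed at any point.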

stefan_ · on Nov 20, 2023

No, a process gets its environment variables from the operating system (just like argc, argv) before any code is ever executed and the majority never change them.

ric2b · on Nov 20, 2023

Then why does setenv even exist? Maybe that's the issue and it should be deprecated and throw compilation warnings?

loup-vaillant · on Nov 20, 2023

You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.

It’s simple, really: we indeed rarely call `setenv()`. So it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible. In exchange for this small price, safety will increase.

Sacrificing any amount of safety for a negligible improvement in performance is flat out unprofessional, and should be grounds for immediate termination in most contexts.

usrbinbash · on Nov 20, 2023

> You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.

Yes. I do. These two concepts don't contradict each other.

> So it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible

Who said anything about performance being the problem, or a reason not to change it!?

The problem is BACKWARDS COMPATIBILITY. The issue is that `getenv` returns a `char *` into the envvar array. Basically every application that uses this function relies on this fact.

So we have:

a) A potential issue that occurs only in very unusual circumstances, most of which will never occur in production code and on the odd chance that they do, they can easily be avoided. Documenting that well can help prevent time wasted in debugging.

b) A fix that may prevent a) but breaks backwards compatibility promises, and would necessitate reworking god knows how many programs, the vast majority of which were never impacted by the issue in the first place.

Of these 2 options, a) is just the better one. Yes, in an ideal world, we could have pure, 100% bug free code, and spend an unlimited amount of time on fixing every last problem. That's not the world we live in however, and so a pragmatic approach is simply a necessity.

DSMan195276 · on Nov 20, 2023

How do you propose making it thread-safe? The real problem here is that `getenv()` was designed around it returning a `char *` into some read-only memory. It's a bad API if the backing data can change because the returned pointer is assumed to exist 'forever'.

`setenv()` has no way of knowing where those pointers are floating around so there's no way to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).

cnity · on Nov 20, 2023

Here's my proposal: Introduce a new threadsafe API (`tgetenv` or whatever) which takes _two_ `char *`s, one of which is a return buffer. This leaves allocation as a responsibility of the caller.

And then you can leave the existing functions as they are (thread unsafe) while having a separate thread safe version.
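One possible shape for such a function, sketched in user code since no libc provides it: `tgetenv_sketch` and its return convention are invented for illustration, and the mutex here only protects against writers that take the same lock. A real libc version would have to share its lock with `setenv`/`putenv` internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the lock a libc would share between its getters and setters. */
static pthread_mutex_t tgetenv_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical API: copies the value of `name` into the caller's buffer.
 * Returns the full length of the value, or -1 if the variable is unset
 * (in which case `buf` is left untouched). A return value >= buflen means
 * the copy was truncated; the copy is NUL-terminated whenever buflen > 0.
 */
long tgetenv_sketch(const char *name, char *buf, size_t buflen)
{
    assert(name != NULL && (buf != NULL || buflen == 0));
    long len = -1;

    pthread_mutex_lock(&tgetenv_lock);
    const char *cur = getenv(name);
    if (cur != NULL) {
        size_t n = strlen(cur);
        if (buflen > 0) {
            size_t ncopy = n < buflen - 1 ? n : buflen - 1;
            memcpy(buf, cur, ncopy);   /* copy before releasing the lock */
            buf[ncopy] = '\0';
        }
        len = (long)n;
    }
    pthread_mutex_unlock(&tgetenv_lock);
    return len;
}
```

Because the caller owns the buffer, no pointer into `environ` escapes, so a later `setenv` cannot invalidate the result.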

DSMan195276 · on Nov 20, 2023

I agree that would be the way to do it, but now we're no longer talking about simply 'fixing' the implementation of the existing API but rather introducing a new function you have to use.

`setenv()` would only be safe if your program never uses `getenv()`, and calls to `getenv()` are so numerous and all over the place that for most non-trivial programs it would be hard to ensure they never happen.

There's also the rub that `setenv()` is not part of the C standard, it's POSIX. I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have, so non-POSIX code would have to continue to call `getenv()` since that's all that is available to them.

josefx · on Nov 20, 2023

> I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have

The C standard has no problem acknowledging that getenv is subject to data races for most of its implementations. As far as I can tell that part was even added at the same time as threading support.

DSMan195276 · on Nov 20, 2023

Well actually I'll have to eat my words on that one - I didn't catch that Annex K in C11 includes `getenv_s` (even if it is optional).

leoh · on Nov 20, 2023

>you have to use

I mean, why not just deprecate the old one; add a warning if it’s used

DSMan195276 · on Nov 20, 2023

That doesn't really help you determine whether a given library is using `getenv()` or not. That also requires that things are actually recompiled/updated, which for some C libraries is not that often.

There's also the rub that many C programs do not target the latest standard (for a variety of reasons). I didn't realize `getenv_s` was added in C11 (though it's optional), but it doesn't really matter because programs/libraries that target C89 or C99 can't use it anyway.

leoh · on Nov 20, 2023

That’s a good point. I would guess there are ways of doing static analysis to see if a given binary is making getenv() calls though, even if one doesn’t fully grok its source.

Maybe some combo of that with setenv() in your source or something

Or do “live” analysis under integration and give a low priority warning

But yeah it’s hairy, you’re right

salawat · on Nov 20, 2023

GNU convention is <funcname>_r for a reentrant version of a non reentrant function.
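For illustration, here is the `_r` pattern with a function that already exists: `gmtime_r()` (POSIX) writes into a `struct tm` supplied by the caller, where `gmtime()` returns a pointer to storage owned by libc. The helper name `format_utc_date` is made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: write `t` as a UTC "YYYY-MM-DD" date into `buf`.
 * Returns 0 on success, -1 on failure (including a too-small buffer). */
int format_utc_date(time_t t, char *buf, size_t buflen)
{
    assert(buf != NULL);
    struct tm tm;   /* result storage owned by this call, not by libc */

    if (gmtime_r(&t, &tm) == NULL)
        return -1;
    return strftime(buf, buflen, "%Y-%m-%d", &tm) == 0 ? -1 : 0;
}
```

Two threads can call this at the same time without sharing a result buffer, which is the property a hypothetical `getenv_r`-style function would aim for.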

I'm in the process of working on a tool in C at the moment, so for once I actually have some context on what's being grumped about here!

JohnFen · on Nov 20, 2023

Entirely this. It works so well that I've seen this in various utility libraries for decades.

cnity · on Nov 22, 2023

Yeah. Most of my code these days works like this, to the point where almost all of my allocation happens roughly in my main entry-point. Or I use arena allocators and pass those into my utilities. But there's basically never a `realloc` or `malloc` call deep in my code anymore.
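A rough sketch of that style, assuming a very simple byte-oriented bump arena (no alignment handling, so it is only suitable for strings as written). The `arena` type and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bump arena over storage provided by the caller. */
typedef struct {
    unsigned char *base;   /* start of the backing storage */
    size_t cap;            /* total bytes available */
    size_t used;           /* bytes handed out so far */
} arena;

/* Hand out `n` bytes, or NULL when the arena is exhausted. */
void *arena_alloc(arena *a, size_t n)
{
    assert(a != NULL && a->used <= a->cap);
    if (n > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* A utility that allocates only from the arena it is given. */
char *arena_strdup(arena *a, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = arena_alloc(a, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

The entry point decides where the backing storage comes from (stack, static buffer, one `malloc`), and the utilities never call an allocator themselves.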

forrestthewoods · on Nov 20, 2023

> to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).

Arguably worse? My goodness no.

This is a rare edge case that most programs don’t encounter. Option 1 is to crash and explode and die. Option 2 is to leak tens of bytes.

Leaking tens of bytes is for sure NOT worse than crashing.

none_to_remain · on Nov 20, 2023

You could even put a note in the man page that `setenv()` will leak memory. Then ten or twenty years from now there will be a blog post about how a currently trendy language/runtime can be manipulated into looping over `setenv()` zillions of times and OOM'ing, and comments about how no one can possibly be expected to read the man page for this horrible footgun, and it's wrong to expect developers to have any idea about what they're doing, give a shit, or pay attention at all.

marcosdumay · on Nov 20, 2023

> Leaking tens of bytes is for sure NOT worse than crashing.

I do really disagree here. The answer is not clear at all.

But then, you are mischaracterizing the problem. The issue is not with crashing, you can get plain bad data too, and this is clearly worse than both leaking memory and crashing.

Also, the GP is mischaracterizing the options. You don't need to leave the old values around, you can just copy them into userspace memory.

DSMan195276 · on Nov 20, 2023

My reasoning is simple - the issues here can be avoided if you're careful about how you use `setenv()` and `getenv()`, which many programs already are. The memory leak in contrast would never be avoidable regardless of how you use it.

forrestthewoods · on Nov 20, 2023

The problem with “be careful” is that libraries often want to use the very unsafe API and there is no standard mechanism to expose safety. It’s fundamentally a bad API design. It could be good. But it is not.

There’s a reason this problem comes up on HN once or twice a year. And don’t even get me started about printf grabbing a mutex for a stupid locale…

leoh · on Nov 20, 2023

This is not a good argument imo. Its “rarity” still affects a tremendous number of folks in profoundly vexing ways that are difficult to debug on account of this not only affecting C but innumerable other languages’ compilers and interpreters that rely on the stock getenv implementation.

I wouldn’t be surprised if a good chunk of compilers and interpreters in other languages suffer from this gotcha’.

I mean, I wouldn’t even be surprised if some JVM implementations silently expose their users to bugs on account of this implementation.

EDIT: … ha, sure looks like it https://github.com/openjdk/jdk/blob/a2c0fa6f9ccefd3d1b088c51...

zlg_codes · on Nov 20, 2023

Ooh, the old 'unprofessional' epithet! What do you mean by that slur here? Most can't agree on what professional even means. Additionally, why should one be held to artificial, inconsistent, and poorly defined standards of 'professionalism' when they aren't a professional?

My care for code robustness scales with income.

loup-vaillant · on Nov 21, 2023

> Ooh, the old 'unprofessional' epithet! What do you mean by that slur here?

For an action I generally mean "malpractice". Something bad enough to bar repeat offenders from the profession (if we even were a profession, which we’re not). For a person I mean "unfit to program code other people rely on".

> My care for code robustness scales with income.

Good point: the conditions for us to write code that actually works are too rarely met. The only answer I have for this one is political though, not technical.

MrBuddyCasino · on Nov 20, 2023

Unpopular opinion: Neither Go's os.Setenv nor Rust's std::env::set_var() should exist. I was pleased to find that Java only has System.getenv(), but not a setter.

jeroenhd · on Nov 20, 2023

I think there are good reasons for Setenv and set_var to exist, but if they are implemented, they shouldn't be wrappers around POSIX's shitty API. They should implement their own environment variable system instead (one whose initial variables could be initialised by a call to getenv to keep them compatible).

There's no reason why these languages need to restrict themselves the same way C does.

OskarS · on Nov 20, 2023

That doesn't fix the problem: these languages have to be able to coexist peacefully with C in the same address space. You can have a dynamically linked library written in Rust in a host program written in C, you can use C libraries in Go, etc.

Even if that wasn't an issue: this is a bug in C as well! You should absolutely be able to use setenv/getenv safely in multi-threaded C, it's insanity that you can't.

MrBuddyCasino · on Nov 20, 2023

The bug in Golang was because DNS lookups interact with the C library, which looks up environment variables. As long as everything happens in Go land, there is no problem - but this is simply not good enough.

jeroenhd · on Nov 20, 2023

Go makes the assumption that the DNS lookups are thread-safe, but it doesn't have that guarantee (or the C library is spec-incompliant, but I doubt that). It's still something Go can fix.

You can't fix C libraries loaded into Go programs (i.e. an external library calling C's setenv, or I suppose explicit FFI calls by the user), but Go can be responsible for the APIs it calls itself. That may necessitate writing a thread-safe alternative for DNS lookups, or documenting and/or adding compile time warnings that threaded programs doing DNS lookups will just crash sometimes, but the language's standard library can still make it much harder for developers to write buggy code.

MrBuddyCasino · on Nov 20, 2023

My impression is that this was Golang's plan from the start - this is why they didn't want to use the C stdlib at all, issuing the Kernel syscalls directly from the Golang runtime. A good idea, but then they had to backpedal to solve issues such as DNS resolution respecting certain OS settings, and this bug is a symptom of that.

fch42 · on Nov 20, 2023

Yes, there are certain things in UNIX which _are_ part of the standard (POSIX / IEEE1003) but _aren't_ usually implemented as system calls.

Name lookups (whether user identities or network resources) are the biggest chunk of these. You have a "choice" as a user/programmer here. Say, the existing name lookup interfaces in most libc implementations don't do DNS-over-HTTPS (DoH); you can implement that yourself and just use the addresses returned by your library/package where the system calls ... want addresses.

If you have the go stance, go all the way. Don't say "the C runtime is sh*te but I really really really want that one particular teensy tiny bit of it could someone somewhere somehow please do something to make it a little less sh*te". Legacy baggage is a burden and backwards compatibility shackles you. The C/Unix interfaces are full of this, and with the hindsight of 50 years no one today, not even "C programmers", would implement them all the same way again. But that doesn't mean their behaviour can be arbitrarily changed.

o11c · on Nov 20, 2023

> Go makes the assumption that the DNS lookups are thread-safe

DNS functions are thread-safe.

The thing people aren't understanding here is when you set loose nasal demons (such as by calling `setenv` in a multithreaded program), they can cause problems even in safe code.

dwattttt · on Nov 20, 2023

If a function is safe only if everyone else you rely on never calls a particular function, it's not that safe. Certainly less safe than other functions guaranteed not to result in crashes if you use them right.

o11c · on Nov 21, 2023

You can make any safe function fail trivially with appropriate uses of unsafe functions.

For example, imagine the chaos of `memset(stderr&-4096, 0x42, 4096)`.

usrbinbash · on Nov 20, 2023

That is an unpopular opinion for the simple reason that some programs do in fact need to set envvars, particularly programs that will start child processes.

pajko · on Nov 20, 2023

Nope, execve() and friends ending in 'e' accept a pointer to a completely new set of environment variables, no need to do setenv. Windows has _execve() too.
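A minimal sketch of that approach: the child is started with an explicit `envp`, and the parent's own environment is never modified. The helper name `run_with_env` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with exactly the environment in `envp`.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);   /* only returns on error */
        _exit(127);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that `envp` here replaces the environment entirely; nothing is inherited unless the caller copies it into the array.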

usrbinbash · on Nov 20, 2023

The fact that there is an alternative doesn't change the fact that a lot of software relies on the worse method to work.

naniwaduni · on Nov 20, 2023

It makes it a pretty silly idea to invite people writing new programs in a new language to use that method, though.

rerdavies · on Nov 20, 2023

A lot of software doesn't modify the environment when exec-ing.

usrbinbash · on Nov 21, 2023

So? A lot of software doesn't use file compression, should we remove these libraries?

MrBuddyCasino · on Nov 20, 2023

That is still possible using the java.lang.ProcessBuilder API: you can launch a child process and give it a modified environment, but just at launch time. This side-steps the issue.

ryan-c · on Nov 20, 2023

Programs that need to set environment variables for child processes should use `execvpe` or `execle`.

account42 · on Nov 20, 2023

Or posix_spawn / posix_spawnp.
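A sketch of the `posix_spawn` variant, assuming default file actions and attributes (both passed as `NULL`). The helper name `spawn_with_env` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: spawn `path` with the environment in `envp`.
 * Returns the child's exit status, or -1 on failure. */
int spawn_with_env(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);
    pid_t pid;

    /* NULL file actions and attributes: a plain spawn with defaults. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, envp) != 0)
        return -1;

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

There is no user code running between the fork and the exec here, so there is no window in which to call `setenv()` at all.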

badsectoracula · on Nov 20, 2023

Or programs that rely on libraries that for some unfathomable reason expose some functionality only via environment variables without an API.

looks at SDL

account42 · on Nov 20, 2023

There is SDL_SetHint [0] which doesn't modify the environment but instead changes the value internally to SDL only.

[0] https://wiki.libsdl.org/SDL2/SDL_SetHint

badsectoracula · on Dec 1, 2023

That is SDL2, what i had in mind is SDL1 - but also the SDL1-on-SDL2 implementation which had some SDL2-specific extras (for things like scaling).

kevincox · on Nov 20, 2023

I think the problem is really that there are 2 times where you should be setting env vars 99% of the time.

1. Right after program startup before any threads are spawned.

2. After a fork before an exec.

In both cases it can be known that no threads are running. (Ok, for 1 it can actually be non-trivial if you have code before main or if you call functions that spawn helper threads, but let's assume that you can know this).
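Case 2 can be sketched as follows, under the assumption that the parent is single-threaded when it forks. In a multithreaded parent, POSIX only allows async-signal-safe calls between `fork()` and `exec`, and `setenv()` is not one of them. The helper name `run_child_with_var` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with the inherited environment plus
 * `name=value`. Returns the child's exit status, or -1 on failure. */
int run_child_with_var(const char *name, const char *value,
                       const char *path, char *const argv[])
{
    assert(name != NULL && value != NULL && path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: only this process's copy of the environment changes. */
        if (setenv(name, value, 1) != 0)
            _exit(126);
        execv(path, argv);   /* the new program inherits the child's environ */
        _exit(127);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```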

However no languages actually have ways to enforce this. So the APIs can be called at any time and are huge footguns.

I think that the proposed improvement of `getenv_s` is great. It is cheap and easy to use, then software can slowly migrate off of the less safe stuff. You can imagine that if libc stopped using `getenv` internally most of this problem would be solved.

Blikkentrekker · on Nov 20, 2023

No, in many cases one needs to set them interactively.

Consider for instance something as simple as implementing a shell. Such a program needs to be able to set the environment based on user interaction and this change needs to show up in /proc/$pid/environ.

__david__ · on Nov 20, 2023

Why does a shell need its current environment to be visible in /proc/$pid/environ (as opposed to just its initial environment)?

Blikkentrekker · on Nov 30, 2023

Because the specification of the POSIX shell says that `export` changes the current environment of the running process, not just of any newly started processes.

This is useful to recognize various processes I suppose. I have written code that scans the environment of processes to find particular processes and group them together.

rerdavies · on Nov 20, 2023

If you need to set environment variables for child processes in a thread-safe manner, use execvpe or execle.

crabbone · on Nov 21, 2023

This is a bizarre take...

Programmers have to deal with a lot of badly written programs all of the time. You'd need this functionality to either debug a program that responds differently to different values of environment variables, or to control it, because, maybe it's the only reasonable way to do so.

It's OK to say that programmers shouldn't rely on this functionality ideally, but, for practical reasons, this functionality is needed. Same happens in "pure" functional languages, for example, when you need to debug programs in such languages interactively, and struggle to create the program state that reproduces the problem, or, in some extreme cases, due to I/O being "impure" even struggle to output diagnostic information.

grodriguez100 · on Nov 20, 2023

I fully agree with your unpopular opinion.

xbar · on Nov 20, 2023

While I am sure that thousands of hours have been spent debugging threaded setenv() attempts (and developing & discarding Annex K), it is clearly not a problem that needs a solution.

Languages that compile to C need to be careful not to promise thread-safe implementations of POSIX or C functions that are explicitly documented as not reliably thread-safe, including setenv(). The author seems to want to change C, and POSIX, so that Go can reliably do so.

somat · on Nov 20, 2023

Right, why is the problem "changing memory in a shared memory execution model will cause corruption" and not "Why are we using such a fragile shared memory execution model in the first place."

jcalvinowens · on Nov 20, 2023

This is silly, setenv() isn't reentrant for the same reason that getopt() isn't reentrant: there's no valid reason to use it except at the very beginning of the program.

The most common misuse I see is changing env before forking a child: nobody has to do that, execve() lets you pass arbitrary envp to the new process without changing yours.

If you need to change env in threaded tests... frankly I think there was probably a better way to do whatever you're doing, but you can just declare a global lock and use it. I bet you could even LD_PRELOAD a custom setenv() that uses your lock.

Nobody is pointing at concrete problems outside of Rust. Rust is just wrong here, sorry, the manpage has said this for a long time:

> POSIX.1 does not require setenv() or unsetenv() to be reentrant.

I think a more intellectually honest version of this article would have been "POSIX should have made setenv() reentrant", not "C is buggy": it's not buggy, it obviously complies with the standard. There's nothing to "fix", he wants to change the standard.

duped · on Nov 20, 2023

> there's no valid reason to use it except at the very beginning of the program.

POSIX doesn't define the "beginning of a program." Nor do you, if you're compiling C code. Libraries can (and do) spawn threads before main, so it's not even safe to use if you restrict yourself to "only the beginning of the program."

> The most common misuse I see is changing env before forking a child: nobody has to do that, execve() lets you pass arbitrary envp to the new process without changing yours.

Just remember to do it before fork() and not between fork() and exec() because you probably want to copy the existing envp and allocators usually aren't async signal safe. If you want to be sure to be correct, use posix_spawn to create a child process.

--

I think it's fair to say "setenv is buggy" in the sense that the POSIX specification for setenv guarantees it to be buggy in most programs that think they should be using it. What makes it an even bigger footgun is that the path of least resistance for people who need/want the behavior is the most difficult to use correctly. POSIX is full of shit like this, and "you should know better" isn't a good enough excuse.

It's like saying "hey you should have known unless you press the doohickey for 5 seconds and turn the chainsaw at 90 degrees when you start it, the chain is going to fly off."

jcalvinowens · on Nov 20, 2023

> Libraries can (and do) spawn threads before main

I don't think any library should be calling setenv(), there's always a better way. If you know of a counterexample, please share it, I'd like to see it.

> Just remember to do it before fork() and not between fork() and exec() because you probably want to copy the existing envp and allocators usually aren't async signal safe.

Why would you go to all that trouble? Most implementations just use the stack to build the arguments for execve(), making all that irrelevant.

> If you want to be sure to be correct, use posix_spawn to create a child process.

That's not the purpose of posix_spawn(), it exists to deal with vfork-only nommu architectures:

>> [posix_spawn() was] specified by POSIX to provide a standardized method of creating new processes on machines that lack the capability to support the fork(2) system call.

>> These functions are not meant to replace the fork(2) and execve(2) system calls. In fact, they provide only a subset of the functionality that can be achieved by using the system calls.

Anyway... back to our regularly scheduled programming:

> the POSIX specification for setenv guarantees it to be buggy in most programs that think they should be using it. [...] POSIX is full of shit like this, and "you should know better" isn't a good enough excuse.

I completely disagree.

The C standard library is chock full of non-reentrant APIs. Nobody who has read more than ten manpages would reasonably assume that anything is reentrant without an explicit assurance. Locks are trivial. POSIX expects you to use a lock if you want to use setenv() like this.

I also want to point out that (at least on Linux) getenv() returns a pointer to the stack! How could anybody with basic programming literacy reasonably expect that to be thread safe? It's exactly analogous to getopt() and argv.

No, the authors obviously couldn't imagine that $FAANG would be building 4GB binaries with 1000+ recursive library dependencies, each of which has its own chance to reenact the printer scene from Office Space with your shared envp. I think that's an organizational problem, not a POSIX problem.

I'm not saying the standard shouldn't change: I would love to see argv and envp become immutable. If we're going to change it, that's the right move IMHO. But I don't really think that's practical...

> It's like saying "hey you should have known unless you press the doohickey for 5 seconds and turn the chainsaw at 90 degrees when you start it, the chain is going to fly off."

I think it's more like saying "we can't stop people from getting hurt jaywalking, so we're going to solve the problem by legally requiring everybody to wear helmets at all times outdoors".

layer8 · on Nov 20, 2023

> I don't think any library should be calling setenv()

It’s already a problem if the library is calling getenv(), because this could happen concurrently to the main program calling setenv(). The only universally safe solution is to not use setenv()/putenv() at all.

Which I think is actually reasonable. But yes it makes those functions broken in a multithreaded program.

GoblinSlayer · on Nov 21, 2023

>I also want to point out that (at least on Linux) getenv() returns a pointer to the stack! How could anybody with basic programming literacy reasonably expect that to be thread safe?

getenv_r is the standard way to provide thread safety in such situations: https://man.netbsd.org/getenv.3

kibwen · on Nov 20, 2023

> the manpage has said this for a long time

Nobody is disputing what the manpage says. What people are arguing is that the specification should be improved so as to no longer say that. Please stop quoting the Posix docs, as it merely broadcasts that one has missed the point here. Documenting the behavior does not automatically excuse the behavior.

As for Rust, the Rust docs make it clear that the underlying mechanism is fraught. Rust is well-acquainted with trying to find satisfactory solutions to the unhelpful and nonsensical tech stack that has been foisted upon the world by decades of worse-is-better laziness. And even if Rust were to mark std::env::set_var as unsafe, that doesn't magically fix anything; the underlying mechanism is broken, and actually fixing it is beyond Rust's control. Only the platforms can fix it.

jcalvinowens · on Nov 20, 2023

> Documenting the behavior does not automatically excuse the behavior.

It's a standard, and I'm citing the standard. The non-reentrant behavior doesn't need to be "excused", it is correct!

If a microcontroller used a different bit than you expected it to for something, would the documentation need to be "excused" for "disagreeing" with you? That's how absurd what you're saying here sounds.

> What people are arguing is that the specification should be improved

That would be more reasonable, but that's not the argument. They're saying the standard is wrong. Standards can't be wrong, they're tautological. All that can be wrong is the programmer's understanding of them.

paulddraper · on Nov 20, 2023

If your point is "the standard is correct as judged by the standard" your point is accurate and meaningless.

Joker_vD · on Nov 21, 2023

Worse than that, it means that standards shouldn't be changed (e.g. to improve them): because the changed standard would be different from the standard and that would make it wrong by, well, being different from the standard!

I've actually heard a similar argument about moral systems, about why the debater should not change his moral beliefs: it would be morally wrong for him to do that, you see, because he'd then consider some things he currently considers morally right to be morally wrong and vice versa. Probably completely coincidentally, that debater was not a very nice person to talk to.

JohnFen · on Nov 21, 2023

> it means that standards shouldn't be changed (e.g. to improve them)

I think it's more nuanced than that. It means standards shouldn't be changed in a way that breaks backward compatibility. That seems like an important feature of any standard.

Nobody is saying standards shouldn't be improved.

Joker_vD · on Nov 21, 2023

> They're saying the standard is wrong. Standards can't be wrong, they're tautological. All that can be wrong is the programmer's understanding of them.

I personally do not possess enough mental flexibility to believe simultaneously both that "the standard is not wrong, it cannot possibly be wrong" and that "the standard should be improved".

And how do you improve standards anyway, without breaking compatibility? Making a previously thread-safe function non-thread-safe is an incompatible change: the clients relied on the thread-safety but new implementations won't provide it. Making a previously non-thread-safe function thread-safe is an incompatible change: the clients would rely on the thread-safety but old implementations don't provide it. And adding a new function to the standard is an undue burden on the implementers, and for almost no value since no client uses that function yet, and they won't for years because they want to keep backward compatibility with existing implementations.

JohnFen · on Nov 21, 2023

> simultaneously both that "the standard is not wrong, it can not possibly be wrong" and that "the standard should be improved".

I'm not the one saying that a standard cannot possibly be wrong. However, it is entirely coherent to say that a standard is not wrong and yet there is still room for it to be improved. "Not ideal" isn't the same as "wrong".

> And how do you improve standards anyway, without breaking compatibility?

Lots of ways, depending on what you think should be improved. In this case, you can improve it by adding a new set of functions that behave as you wish rather than altering the old functions in an incompatible way.

If it's not possible to improve a standard while maintaining backward compatibility, then the solution is to introduce a new standard entirely. One of the main points of even having a standard is that you can rely on it, and things you make that adhere to it will continue to work into the future.

Ferret7446 · on Nov 22, 2023

> What people are arguing is that the specification should be improved so as to no longer say that

What I hear is "we should invalidate all environments in existence", for the purpose of, um, something. Satisfying some Rust devs, I guess?

boring_twenties · on Nov 20, 2023

> The most common misuse I see is changing env before forking a child: nobody has to do that, execve() lets you pass arbitrary envp to the new process without changing yours.

That's pretty much never what you would want. You want to set a single variable while inheriting the rest of the existing environment. In order to do that with execve() you would have to copy the existing environment first, yuck.

And you wouldn't use setenv() before forking anyway, you would do it after forking, in the child, before exec.

Too · on Nov 21, 2023

What is yuck about copying the existing environment first?

boring_twenties · on Nov 21, 2023

It's a complete waste all around? Wasted cycles at runtime, wasted development time, and a bunch of unnecessary lines of code to read and debug later.

Too · on Nov 22, 2023

Wasted lines is a syntax sugar problem. It can be solved in the language level. Python for example has dict destructuring where you can solve this in under 10 characters. You could also imagine wrapper functions to execv taking in modifications to env only, that it merges with the inherited env.
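In C, the merging wrapper imagined above might look roughly like this. The names `env_with` and `env_free` are hypothetical, and the sketch reads `environ` without any synchronization, so it assumes no other thread is modifying the environment while the copy is made.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Free an environment array built by env_with(). */
void env_free(char **env)
{
    if (env == NULL)
        return;
    for (size_t i = 0; env[i] != NULL; i++)
        free(env[i]);
    free(env);
}

/* Hypothetical helper: return a private, NULL-terminated copy of the
 * inherited environment with `name` set to `value`. */
char **env_with(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    size_t nlen = strlen(name);
    size_t len = nlen + strlen(value) + 2;   /* '=' plus terminating NUL */
    size_t n = 0, k = 0;

    while (environ != NULL && environ[n] != NULL)
        n++;

    /* calloc zero-fills, so the array stays NULL-terminated at every step. */
    char **out = calloc(n + 2, sizeof *out);
    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (strncmp(environ[i], name, nlen) == 0 && environ[i][nlen] == '=')
            continue;   /* drop the inherited definition of `name` */
        if ((out[k] = strdup(environ[i])) == NULL)
            goto fail;
        k++;
    }
    if ((out[k] = malloc(len)) == NULL)
        goto fail;
    snprintf(out[k], len, "%s=%s", name, value);
    return out;

fail:
    env_free(out);
    return NULL;
}
```

The returned array can be passed as the `envp` argument of `execve()` or `posix_spawn()` and released with `env_free()` in the parent afterwards.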

Copying is something that happens anyway if you write to the environment after fork, due to the copy-on-write model used by forking.

If you don’t need to edit the environment at all, one could imagine a pure inherit option used.

Besides, how big are the environment variables anyway? There are many other bigger bottlenecks if you are launching processes frequently enough for this to matter.

loevborg · on Nov 20, 2023

You nailed it. The real villain in this story is mutability. We're addicted to changing variables in place, which is inherently complex – especially so in multithreaded environments. Environment variables are clearly best treated as immutable. Rust, despite its advances in some areas, perpetuates our addiction to mutable variables.

orange_fritter · on Nov 20, 2023

Every attempt at escaping mutability basically kills the language in the mainstream because so much of "real" programming is just bit-twiddling that gets too verbose when immutability is involved. It's a good question whether Rust nudges the world toward functional/declarative spiritual purity by placing constraints on mutation. I'm betting that No, it doesn't.

vacuity · on Nov 21, 2023

An explicit goal of Rust is to be "low level", which is an admittedly vague phrase. While you could certainly write a Rust library that clones linked lists left and right, and maybe someone would prefer that, someone else should be able to write a library that does in-place mutation with that delicious imperative goodness (badness?). To GP's point about environment variables, I think that's more of an issue with the fact that Rust tries to be compatible with existing C conventions in the OS. I don't think Rust can do much about it, since it can interface with C libraries that don't care about any Rust constraints on environment variables.

zlg_codes · on Nov 20, 2023

That's essentially the problem with any technology, or community around said technology, that's primed against some target they want to replace.

It has kept me away from Rust for years. If its fans weren't such fanboys for disruptive activity like rewriting things in Rust for the fuck of it, I might look closer into it. But considering where it came from and their politics, it doesn't seem like Rust is actually for everyone.

Its angle of picking on C is also rich. If C is so bad, why has no major language supplanted it or C++? Anything with high importance on performance is written in low level languages, unconcerned with pushing a narrative or making BS 'more inclusive' which is really another view on affirmative action.

My identity alone makes me unfit for the project.

jcalvinowens · on Nov 20, 2023

> or community around said technology, that's primed against some target they want to replace.

It's a vocal minority of the community, in my experience. Most people I've personally met who are passionate about Rust have a much more reasonable attitude about the whole thing, and see it as moving the needle in the right direction rather than a fully formed solution.

It also really is interesting: obviously it's not the panacea it is sometimes made out to be, but it truly does eliminate a class of error. I admit to a bit of stubbornness myself, but I'm trying to work with it more.

mike_hock · on Nov 20, 2023

> If C is so bad, why has no major language supplanted it or C++?

Because no one, not even Rust, aims for feature parity with C (except on the language level).

JohnFen · on Nov 20, 2023

> It has kept me away from Rust for years.

Yes. The Rust community is what has kept me away from Rust for a long, long time. Now I'm learning Rust because it may become an important skill to have, but it's despite the community. They're very hard to put up with.

vacuity · on Nov 21, 2023

> If C is so bad, why has no major language supplanted it or C++?

I think a big reason is the massive codebases with legacy practices that no one has the time or money to fix or rewrite. But there are plenty of companies using newer languages for newer products, Go and Rust being prime examples.

As to your points about the Rust community, there are some people who make it a religious thing (although Rust isn't especially abnormal in this regard; there are fanatics everywhere), but I think you're exaggerating.

coldtea · on Nov 21, 2023

>Its angle of picking on C is also rich. If C is so bad, why has no major language supplanted it or C++?

Because the assumption "If some thing/technology was bad it would surely have been supplanted" is wrong to begin with.

There are lots of reasons things (customs/technologies/even politicians) stick around, and sticking around is not necessarily proof they aren't bad.

Once a language has been entrenched (and it doesn't have to be great: just good for its time, or having advantages like being free to implement when competitors cost money or had to be licensed) it's very difficult to migrate countless pieces of very critical infrastructure.

It's also very difficult to coordinate a huge industry in adopting a new language - there's the cost of retraining, the uncertainty on whether it will catch on and be worth the investment, the cost of the migrating existing stuff (or the cost of not migrating it, and having to deal with both the old and new language in your codebase). It's also difficult to change language and tooling vendors, to make sure the new language has as mature tooling, that the libs you need are available for it (or you have to deal with wrapping and calling across languages and dealing with a mixed codebase) and so on.

So any replacing is slow.

And if any language, aside from C++ (which in many domains has eclipsed C), shows signs of replacing C, that would be Rust: it has a maturing and fast implementation, it has an increasing number of libs, it has major investment and adoption from big name companies, and has increased vendor support (not to mention JetBrains making an IDE for it, always a sign of a language doing well, and a level of success no C-replacement wannabe had reached before Rust).

And the very thing you complain about "disruptive activity like rewriting things in Rust" is actually part of what is needed for that eventual replacement.

>Anything with high importance on performance is written in low level languages

Increasingly it's written in Rust, even at the biggest of the FAANG companies: Fuchsia OS from Google, for example, and critical infrastructure at Google, Cloudflare and Apple ("Following a very successful first foray into Rust we are migrating an established codebase from C to Rust, and building new functionality primarily in Rust"), MS "rewriting core Windows code in memory-safe Rust", and others.

>My identity alone makes me unfit for the project

Don't let the door hit you on your way out.

zlg_codes · on Nov 22, 2023

How inclusive of you, lol.

Georgelemental · on Nov 20, 2023

> But considering where it came from and their politics

> unconcerned with pushing a narrative or making BS 'more inclusive' which is really another view on affirmative action

For what it's worth: I am conservative, right-wing, as opposed to "affirmative action" as it is possible to be—and I love Rust. Don't judge the language by the politics of a tiny section of its community, judge it on the technical merits.

zlg_codes · on Nov 21, 2023

I judge things holistically. Ignore the community and you ignore the future of that tech. Ignore the tech and you miss some features.

If others want to delude themselves by ignoring half the picture, more power to them.

beltsazar · on Nov 20, 2023

> This is silly, setenv() isn't reentrant for the same reason that getopt() isn't reentrant: there's no valid reason to use it except at the very beginning of the program.

It is not, unless you'd argue that "there's no valid reason to use [anything that transitively uses setenv()] except at the very beginning of the program." Did you even read the article? The author and the GitHub links mentioned provide some examples that use setenv() not directly, but transitively.

jcalvinowens · on Nov 20, 2023

> Did you even read the article?

Of course. I also clicked through and looked at his examples, did you?

> The author and the GitHub links mentioned provide some examples that use setenv() not directly, but transitively.

The big list at the end of the article? That's absolutely not true: they're all read only usecases that prove my point. Nobody should be changing any of those in the middle of the program. If you disagree, please point out specifically which one and explain why, because I don't see it.

beltsazar · on Nov 20, 2023

Ah yes, silly mistakes—I mixed up getenv and setenv. Those examples transitively use getenv, not setenv.

Having said that, my point still stands. You can only control what you directly use / don't use. Third-party libraries you use might use setenv at times other than "the very beginning of the program."

verall · on Nov 20, 2023

In C/C++ land you control what third party libraries you use / don't use and you don't use the ones that behave badly

vacuity · on Nov 21, 2023

Imagine if programmers did that with memory safety. "Hey, is the program interfacing with any C/C++ libraries that aren't rigorously sanitized and fuzzed for memory safety? If so, throw them out." Yes, theoretically you can control these things. In practice, most people aren't going to and the few that try may have to expend a lot of effort.

Ferret7446 · on Nov 22, 2023

I thought that was in fact standard practice in any sort of serious software engineering. Do you not audit your dependencies? I mean, running ASan is basically free.

vacuity · on Nov 22, 2023

Well, it hasn't stopped memory safety bugs from being a thing, that's for sure.

verall · on Nov 21, 2023

You're free to spend your free time/money rewriting mountains of legacy C/C++ code in memory safe languages to use in your projects.

Good luck.

vacuity · on Nov 21, 2023

It's like you didn't read my comment. Of course I'm free to spend my time doing that, but significant memory safety or thread safety hazards can't just be waved away because "you can vet your dependencies". For smaller projects, perhaps, but it's not feasible in general. This can be a source of actual exploits that cost millions of dollars or leak personal information. Footguns don't just become not a problem because they don't affect you. Whether setenv's lack of thread safety matters in practice and whether backwards compatibility hinders a fix are both factors to consider, but your reasoning is just absurd. I guess everyone who doesn't rigorously vet their C/C++ dependencies up to and including libc is just a bad programmer.

verall · on Nov 21, 2023

I did read your comment, but I guess I didn't fully understand it?

I'm not waving away memory safety or thread safety hazards, that's just the state of things. There's mountains of legacy C and C++ code that you can use. Or you can not use it. It's inexpensive to use (you don't have to write it), but not free - you may have to chase a wild bug down now and then.

I guess the part I don't understand, is that you said "Imagine if we applied this logic to memory safety", so I imagined it, and I imagined it would mean a lot of code would have to be rewritten.

Or maybe, you meant something like "it's absurd to vet your dependencies for setenv issues when no one vets dependencies for memory safety", which still doesn't make sense to me.

The point I was trying to make, is that C/C++ developers generally _do_ vet (and vendor) their dependencies, while developers in other ecosystems don't as much. Go is big enough that if they want to provide a library call to setenv/getenv and _not_ make their users read the man pages, maybe they should add a lock or provide their own implementation.

I still think GP saying basically "you can't control what your dependencies use" is kind of ridiculous, because you do control what dependencies you use, and you should probably fix or toss the ones crashing your program.

> I guess everyone who doesn't rigorously vet their C/C++ dependencies up to and including libc is just a bad programmer.

I'm not saying this at all. It's just your job to fix it, or to convince your manager it's not your fault it crashes. If you rely on other people's code, eventually they will have a bug. It's inevitable.
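To make the "add a lock" suggestion concrete, here is a minimal Python sketch of a lock-guarded environment wrapper. It is an illustration only, not how Go or any other runtime actually does it; `locked_getenv`, `locked_setenv` and `DEMO_VAR` are made-up names. Note the limitation the thread keeps coming back to: the lock only protects callers that go through these helpers. C code in the same process that calls getenv() directly never sees the lock.

```python
import os
import threading

# Hypothetical module-level lock. It only helps if every reader and writer
# in the process goes through the two helpers below.
_env_lock = threading.Lock()


def locked_getenv(name, default=None):
    """Read a variable while holding the lock."""
    with _env_lock:
        return os.environ.get(name, default)


def locked_setenv(name, value):
    """Write a variable while holding the lock.

    Assigning to os.environ also updates the C-level environment, which is
    exactly the operation that is unsafe without synchronization.
    """
    with _env_lock:
        os.environ[name] = value


if __name__ == "__main__":
    # Several threads write the same variable; the lock serializes them.
    threads = [
        threading.Thread(target=locked_setenv, args=("DEMO_VAR", str(i)))
        for i in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Prints whichever write happened last.
    print(locked_getenv("DEMO_VAR"))
```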

vacuity · on Nov 22, 2023

We mostly agree then, but I think the scale of enterprise software development in languages such as Go and Rust is such that telling people to vet their dependencies isn't a feasible solution. If the dependencies are free software, it may be fairly easy to fix the setenv bugs, but they might not be. Ideally, there should be action on the C side to make this less of a problem overall, instead of just for the people who have the time and energy to fix this locally. Not sure if that's feasible based on backwards compatibility, but that's a different problem.

verall · on Nov 22, 2023

Anyone can submit a patch on the C side to make this less of a problem, but if it's not a problem for most people on the C side, why wouldn't it get fixed in Go or Rust instead?

The cycle goes: "it's hard to support a lot of platforms" -> "let's support a common interface" -> "posix sucks" -> "it's hard to support a lot of platforms"

Changing POSIX means distributing your work among all of the platforms rather than adding your own implementations and confirming they work on everything. Not likely to get buy-in unless something is causing problems for whoever is paying those platform developers, or libc maintainers.

vacuity · on Nov 22, 2023

It is a catch-22 of sorts. I would like for us to move on from POSIX and the current libc interface, but that's easier said than done.

eqvinox · on Nov 20, 2023

> This is a list of some uses of environment variables from fairly widely used libraries and services. This shows that environment variables are pretty widely used.

Widely used, yes. Used as in read. Why do any of these need to change at runtime? And if they do - why are they environment variables?

(NB: starting a new process is not "at runtime")

dmytroi · on Nov 20, 2023

Mostly integration: for example, some library can only be configured via env variables, but a developer might want to configure it from within the app it's integrated into and used from.

Also, a few weeks ago I found a use for them when trying to pass configuration from Java/Kotlin to a C++ library to be used during static constructors (invoked during dlopen) on Android, because at that phase native code cannot call back to the JVM.

guappa · on Nov 20, 2023

> for example some library can only be configured via env variables

The library has already been loaded by the time you call setenv, so what you're describing doesn't work in most cases.

That sounds like a need created by poorly written libraries. You might consider fixing them instead.

jonhohle · on Nov 20, 2023

I agree that would be a poor implementation, but the library could be loaded at runtime using dlopen or equivalent.

The issue with that “interface” is that the environment is process-global. If the library is being loaded dynamically (specifically for some task) it would seem that the parameters are local to that task and should be taken by some reentrant init method. Alternatively, the process could be forked and environment set in the child without concern for thread safety or polluting the environment (think of the children!).

guappa · on Nov 20, 2023

The only library I've seen to use env vars is libc, which uses them to decide how malloc should behave for example.

the_svd_doctor · on Nov 20, 2023

Some libraries' behaviour/API can be tweaked with env vars. Env vars are read at runtime, not at load time.

guappa · on Nov 21, 2023

Can you link one such library here?

wzdd · on Nov 20, 2023

Indeed -- it's an extremely unconvincing list, because any sensible library which may require a library user to set env variables (which includes all the ones I checked on the list) can also be configured without setting env variables. Most of the time the env variables set fallback defaults for parameters not specified by the caller. In these cases, the sane thing to do, regardless of the thread-safety of setenv(), is simply to supply the parameter in code.

The only exception is things like debug logging, which is unlikely even to work dynamically.

On the other hand, setenv() is clearly broken in modern code, particularly in a library context, and the man page (at least on my Linux machine) does not make that particularly obvious -- "Thread safety: MT-Unsafe" is the only note, with a reference to attributes(7) for more information. It could definitely be made more obvious.

oefrha · on Nov 20, 2023

Have you ever exported anything in a shell script? Sure you can keep the necessary changes in local state and pass those to execve(2)/execvpe(3)/posix_spawn(3), and that would be safe AFAIK, but setenv(3) is there and more convenient if you're unaware of the hidden dangers. Also that doesn't work for PATH in execvp/execvpe, which is read from the current process; how do you change search paths for execvp without setenv (short of doing the search yourself)?

Edit: I just realized macOS/FreeBSD have execvP(), which allows passing a custom search path, so PATH is now safe, but without a -e variant, everything else is again unsafe.

xxs · on Nov 20, 2023

>Have you ever exported anything in a shell script

So, shells use a single thread that can safely modify the environment, then start new child processes from the same thread. The child processes get a =copy= of said environment. That's a textbook example of how to use env.

Starting multiple threads on your own, then modifying env should be considered a textbook example of how not to do things - env is not intended for interprocess communication.

rcxdude · on Nov 20, 2023

Shell scripts are not really prone to this problem because AFAIK no shells are multithreaded: subshells and the like are implemented with fork()

oefrha · on Nov 20, 2023

Yes, I’m not saying shell scripts are affected, merely using them as an example to answer the question “Why do any of these [env vars] need to change at runtime?”

xxs · on Nov 20, 2023

The discussion is only relevant for a shared unguarded resource (the env) modified and read by multiple threads. Single threaded operations are just fine.

Someone · on Nov 20, 2023

> Single threaded operations are just fine.

Sort of. https://pubs.opengroup.org/onlinepubs/9699919799/functions/g...:

“The returned string pointer might be invalidated or the string content might be overwritten by a subsequent call to getenv()”

There’s little you can do with a broken API, so Linux has that ‘feature’, too. https://man7.org/linux/man-pages/man3/getenv.3.html:

“The string pointed to by the return value of getenv() may be statically allocated, and can be modified by a subsequent call to getenv(), putenv(3), setenv(3), or unsetenv(3).”

FreeBSD chooses to leak memory, instead. https://man.freebsd.org/cgi/man.cgi?getenv(3):

“Successive calls to setenv() that assign a larger-sized value than any previous value to the same name will result in a memory leak. The FreeBSD semantics for this function (namely, that the contents of value are copied and that old values remain accessible indefinitely) make this bug unavoidable”

eqvinox · on Nov 20, 2023

Shells don't generally use the libc environment; this would be too limited to implement even standard POSIX shell functions with local variables, or non-exported variables. It's much easier to set up purpose-built data structures to track variables, and construct an argument for execve().

(Edit: removed unneeded pointing out execve)

Also shells generally have their own program search anyway since they need to support built-in commands. It's not particularly hard to implement PATH search.
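As a rough illustration of how little a hand-rolled PATH search needs, here is a Python sketch. `find_executable` is a made-up helper, not any shell's actual code; the search path is passed in as an argument, so the process-wide PATH is never modified.

```python
import os


def find_executable(name, search_path):
    """Return the first executable file called `name` in `search_path`, or None.

    `search_path` is a PATH-style string supplied by the caller.
    """
    if os.sep in name:
        # A name containing a slash is used as given, with no search.
        ok = os.path.isfile(name) and os.access(name, os.X_OK)
        return name if ok else None
    for directory in search_path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(find_executable("sh", "/usr/local/bin:/usr/bin:/bin"))
```

Python's standard library has an equivalent in `shutil.which(cmd, path=...)`; the resolved path can then be handed to an exec or spawn call that takes an explicit environment.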

oefrha · on Nov 20, 2023

Once again, the OP asked why setenv is even needed, which implies they likely don’t have much experience with spawning processes in low level languages, so I used the more familiar shell script setting as an illustrative example, as setenv is analogous to export in POSIX sh. I never said export is implemented with setenv, or shell script exports aren’t thread safe. Unfortunately, the replies got hung up on shell scripts.

As for me not being aware of execve etc… You need to re-read my comment, which clearly mentions execve, execvpe, posix_spawn, as well as implementing PATH search on your own.

eqvinox · on Nov 20, 2023

> Once again, the OP asked why setenv is even needed, which implies they likely don’t have much experience with spawning processes in low level languages

I am the OP and your assumption is incorrect. You may consider why the post ends with:

  (NB: starting a new process is not "at runtime")

Izkata · on Nov 20, 2023

"export" in shells has to change the environ before they start the new process. It may not be "at runtime" for the new process, but it would be for the shell.

account42 · on Nov 20, 2023

Wrong, export does not have to change the shell's environment at all. There are plenty of exec variants that accept a different environment pointer, same for posix_spawn.
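A small Python sketch of that point: build the child's environment as a private copy and pass it to the spawn call, leaving the parent's environment untouched. `run_with_env` and `DEMO_TZ` are names invented for the example; in C the same idea would use the envp argument of execve() or posix_spawn().

```python
import os
import subprocess
import sys


def run_with_env(argv, overrides):
    """Run `argv` with `overrides` applied to a private copy of the environment."""
    child_env = dict(os.environ)  # a copy; the parent's environment is not modified
    child_env.update(overrides)
    return subprocess.run(
        argv, env=child_env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    result = run_with_env(
        [sys.executable, "-c", "import os; print(os.environ['DEMO_TZ'])"],
        {"DEMO_TZ": "UTC"},
    )
    print("child saw:", result.stdout.strip())
    print("parent has DEMO_TZ:", "DEMO_TZ" in os.environ)
```

No setenv call happens in the parent at any point, so there is nothing for another thread to race with.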

quickthrower2 · on Nov 20, 2023

Shell scripts are different as you are likely exporting environment variables and then starting new processes.

oefrha · on Nov 20, 2023

Shell scripts aren’t different from “real” programs using exec or posix_spawn in this regard, it’s just that fewer people have done the latter than the former, so the former is a more relatable example. “Real” programs spawn other processes too you know, sometimes with modified environ.

quickthrower2 · on Nov 20, 2023

Just so I understand this right: I thought the issue is about multiple threads, but in a shell you wouldn’t have those, just new processes.

In a program you could have either.

anttihaapala · on Nov 20, 2023

In the case of execvp, you would pretty much be required to fork before it and then you could change PATH.

oefrha · on Nov 20, 2023

Yeah, fork()+immediately exec() should be safe, but those use cases are almost always better with posix_spawn(), due to issues with fork(), like memory copying. And if you want to use the p-variant of posix_spawn you’re back to setting PATH beforehand. These APIs designed back in the Stone Age just aren’t very well thought-out wrt concurrency and high performance.

jstimpfle · on Nov 20, 2023

Why would you change the path just to call posix_spawnp()? If you want that control, that is an indication that you want to specify the path to the executable, not use PATH.

leoh · on Nov 20, 2023

Testing, for one thing…

I mean YES you can factor your code (tests, whatever) to make this a non-issue, but suppose some person wrote some code 10 years ago in an OSS project or on your team and you start banging into this issue.

It’s not going to be trivial to unwind let alone find the root issue.

Let’s start fixing things like this for our future selves, right?

Digging heels in and saying “eh, you just got to learn this one weird quirk.. oh yeah this other one too..” is kind of a fun glass bead game until it’s not; it is also not a way to win hearts and minds.

withinboredom · on Nov 20, 2023

Changing the env during runtime is actually quite handy for debugging and forcing the program into specific states.

Other than that, it can also be handy in k8s with a VPA. You get more/less memory and then update the env to reflect that. Your service picks up the env change and updates the runtime.

IIRC, there is/was some way to listen to those changes in C#, and automatically update runtime settings.

eqvinox · on Nov 20, 2023

> Other than that, it can also be handy in k8s with a VPA. You get more/less memory and then update the env to reflect that. Your service picks up the env change and updates the runtime.

You… can't change the env from outside the process…

are you saying this is used by disjoint components within a single process? Or is this just a misunderstanding?

pjc50 · on Nov 20, 2023

> You… can't change the env from outside the process…

Not with that attitude you can't.

(OK, without the joke: you can do this with an interactive debugger. But I think OP just meant "change it in the container and then restart the child process")

withinboredom · on Nov 20, 2023

You can spawn as many processes as you want in a container, did you not know that?

But you only need access to the /proc/pid directory to change another process's env.

eqvinox · on Nov 20, 2023

> But you only need access to the /proc/pid directory to change another processes env.

/proc/$pid/environ is not writable

(and as a matter of fact, due to how the environment works, it cannot be writable.)

LegionMammal978 · on Nov 20, 2023

But /proc/pid/mem is, if you like living dangerously! You'd just have to parse the dynamic-linker metadata to find where libc's environ is hiding. (Though statically-linked programs would be tougher.)

SAI_Peregrinus · on Nov 20, 2023

Spawning a new process doesn't require changing the parent's environment.

rewmie · on Nov 20, 2023

> Changing the env during runtime is actually quite handy for debugging and forcing the program into specific states.

Most debuggers nowadays support altering variables at runtime after hitting breakpoints. In the meantime this was the very first time I ever heard anyone even considering changing env vars at runtime, let alone use it to debug stuff. Sounds like an ass-backwards way of going about debugging.

riffraff · on Nov 20, 2023

> Changing the env during runtime is actually quite handy for debugging and forcing the program into specific states.

Wait, why would this need to happen at runtime? I have used env vars a lot to trigger specific cases, but why would you want to do this while the process is running, from within the process itself?

If you control the process you can start it with the right env to begin with, no?

qwertox · on Nov 20, 2023

Just asking: If you pass security tokens via environment variables to the process, doesn't it make sense to delete them from within the process after they have been used?

eqvinox · on Nov 20, 2023

Yes, it would make sense, but no, there is no way to actually ensure they have been deleted. A trivial but nonetheless very common case would be if your process is started with a wrapper shell script. But even just within your process, there is no guarantee at all against some random library (or the kernel) making a copy of the entire environment.

If you want to pass secrets into a process at startup, I would strongly recommend passing a pipe as an additional open file descriptor (e.g. fd #4; you can then put this FD number in an env variable) and writing the secret to the pipe. It can only be read once, and you can control where the value propagates.
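The pipe approach could be sketched like this in Python (POSIX only, since it relies on `pass_fds`). `spawn_with_secret` and the `SECRET_FD` variable name are made up for the example, and the child here only reports the length of what it read so the secret never reaches a log.

```python
import os
import subprocess
import sys

# Child program for the demo: it looks up the descriptor number in the
# hypothetical SECRET_FD variable and reads the secret from that descriptor.
CHILD_CODE = (
    "import os\n"
    "fd = int(os.environ['SECRET_FD'])\n"
    "with os.fdopen(fd, 'rb') as f:\n"
    "    secret = f.read()\n"
    "print(len(secret))\n"
)


def spawn_with_secret(secret):
    """Start the child and hand it `secret` (bytes) over a pipe."""
    read_fd, write_fd = os.pipe()
    # Only the descriptor number goes into the child's environment copy.
    child_env = dict(os.environ, SECRET_FD=str(read_fd))
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        env=child_env,
        pass_fds=(read_fd,),  # the only extra descriptor the child inherits
        stdout=subprocess.PIPE,
        text=True,
    )
    os.close(read_fd)  # the parent keeps only the write end
    with os.fdopen(write_fd, "wb") as w:
        w.write(secret)  # closing the write end gives the child EOF
    out, _ = proc.communicate()
    return proc.returncode, out.strip()


if __name__ == "__main__":
    print(spawn_with_secret(b"example-token"))
```

The secret itself never appears in any environment block or argv, and once the child has drained the pipe there is nothing left to read.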

yrro · on Nov 21, 2023

Now if only someone would explain this to the authors of mongosh (which REFUSES to accept credentials in an environment variable and will ONLY read them from stdin or from argv...)

Zandikar · on Nov 20, 2023

Damn, learning new tricks everyday, thanks for the tip.

DangerousDoctor · on Nov 20, 2023

The essential problem is that there is no thread-safe way to implement this while maintaining backwards compatibility -- applications can alter the environment block by changing the environ global pointer, applications can also alter the environment block by replacing individual pointers in the environ array, applications can also alter the environment block by altering the strings pointed to by the individual members of the environ array, applications can also alter the environment block by using setenv/putenv/etc.

Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.

jcelerier · on Nov 20, 2023

> Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.

by that logic mutexes themselves are pointless because nothing ever forces you to use them, even in memory-safe languages you can still access /dev/mem and change bytes? It's still a useful thing to have.

PhilipRoman · on Nov 20, 2023

The difference is that modifying the environ pointer is explicitly supported behaviour in the standard, poking through /dev/mem is not.

Although I guess a middle ground solution wouldn't be too bad either - most programs don't modify environ directly, so POSIX could offer thread safety for the functions and make multithreading through "environ" UB. This is already kind of explained in the standard:

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...

jcelerier · on Nov 20, 2023

> The difference is that modifying the environ pointer is explicitly supported behaviour in the standard

they just have to fix the standard. e.g. in my country they manage to improve for instance the standard for electrical plugs every three years, there is NO REASON posix cannot do the same

jeroenhd · on Nov 20, 2023

I think the memory leak solution (copy over the env variables to a new location in memory every time you call setenv and keep the old pointers alive) will cause the fewest crashes.

I would personally go for the aggressive approach (release a new major version of libc that detects multithreaded environments and intentionally crashes out when calling setenv() so people actually notice and fix their broken programs) but I suspect not many people will agree with me on that.

The API is not necessarily bad (it's just very 80s UNIX), but the lack of enforcement of thread-safety is causing all kinds of bugs and crashes.

lelanthran · on Nov 20, 2023

Yeah, but the program may not be broken until you, the glibc maintainer, call `raise(SIGSEGV)`.

Most programs using setenv call it before starting any threads. That is not broken.

Detecting the linkage of thread support and crashing that program on purpose is, frankly, a pathological way to fix a non-broken program.

Besides which, your proposal won't work anyway, because this remains a potential problem in single-threaded programs too: a program calling getenv, storing the result, and then calling setenv on the same variable and using the previously stored result will still break.

In summary, your proposal is broken in two different ways: 1) it breaks well-defined programs, and 2) it fails to break broken programs.

jeroenhd · on Nov 20, 2023

I wouldn't implement it during linkage; obviously single-threaded putenv/setenv calls should still be permitted as part of initialisation routines. Count the number of children in /proc/self/task for all I care; the detection needs to happen at runtime.

You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions altogether, replacing them with either crashes or no-ops, but that'd be an even harder sell.
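For reference, the runtime check described here (counting entries in /proc/self/task) is only a few lines. The Python sketch below is an illustration under assumptions: `thread_count` and `checked_setenv` are made-up names, the /proc layout is Linux-specific, and the wrapper raises an exception where the proposal would crash the process.

```python
import os
import threading


def thread_count():
    """Number of threads in this process, as listed in /proc/self/task (Linux)."""
    try:
        return len(os.listdir("/proc/self/task"))
    except OSError:
        # No usable /proc: fall back to Python's own count, which misses
        # threads that were not created through the threading module.
        return threading.active_count()


def checked_setenv(name, value):
    """Hypothetical setenv wrapper that refuses to run once other threads exist."""
    if thread_count() > 1:
        raise RuntimeError("setenv called while more than one thread is running")
    os.environ[name] = value
```

The check is deliberately blunt: it fires whenever a second thread exists, whether or not that thread ever touches the environment.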

lelanthran · on Nov 20, 2023

> I wouldn't implement it during linkage, obviously single threaded putenv/setenv calls should still be permitted as part of initialisation routines. Count the number if children in /proc/self/task for all I care, the detection needs to happen during runtime.

That still breaks well-defined, non-broken programs which don't call getenv/setenv in racing ways. There is no way for you to do a conditional-upon-threads mechanism without false positives.

> You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions all together, replacing them with either crashes or no-ops, but that'd be an even harder sell.

But you don't need to in order to meet your original goal - breaking programs which do call setenv/getenv in the wrong order. Proposing to remove them altogether doesn't fulfill the goal of finding the breakages immediately and introduces breakages in existing programs.

My alternative: use LD_PRELOAD and provide alternative setenv/getenv functions which raise SIGSEGV when setenv is called on a variable more than once, and when getenv is called on a variable that was already set once. It requires nothing more than a counter for each of setenv/getenv per variable.

That finds programs which actually are broken, with no false positives, and ignores threads altogether because they don't matter under the counter system[1].

Best of all, you can implement this in an afternoon, without needing to modify glibc, and then test it with every single executable on your system to see which ones break.[2]

[1] Since the caller knows they are not thread-safe anyway, we aren't looking for the error where the caller calls setenv concurrently in different threads. That's a different problem.

[2] I would wager good money that few systems, if any, will break under this test.
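The counting rule itself can be modelled in a few lines. The Python sketch below is one literal reading of the rule as stated above, not the LD_PRELOAD shim: `CountingEnv` and `EnvUsageError` are made-up names, the environment is a plain dict, and an exception stands in for raise(SIGSEGV).

```python
class EnvUsageError(Exception):
    """Stands in for the raise(SIGSEGV) of the proposal."""


class CountingEnv:
    """Toy model of an environment with per-variable setenv/getenv counters."""

    def __init__(self, initial=None):
        self._vars = dict(initial or {})
        self._set_count = {}  # variable name -> number of setenv calls
        self._get_count = {}  # variable name -> number of getenv calls

    def setenv(self, name, value):
        # Rule 1: flag a second setenv on the same variable.
        if self._set_count.get(name, 0) >= 1:
            raise EnvUsageError("setenv called more than once on " + name)
        self._set_count[name] = 1
        self._vars[name] = value

    def getenv(self, name):
        # Rule 2: flag a getenv on a variable that setenv has already touched.
        if self._set_count.get(name, 0) >= 1:
            raise EnvUsageError("getenv on " + name + " after it was set")
        self._get_count[name] = self._get_count.get(name, 0) + 1
        return self._vars.get(name)


if __name__ == "__main__":
    env = CountingEnv({"HOME": "/home/demo"})
    print(env.getenv("HOME"))   # never set during the run, so reads are fine
    env.setenv("TZ", "UTC")     # first set is allowed
    try:
        env.setenv("TZ", "CET")  # second set on the same variable is flagged
    except EnvUsageError as err:
        print("flagged:", err)
```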

fch42 · on Nov 20, 2023

Leaking memory is not a "solution". Ever. Maybe for a commercial problem. But not for one in systems API design.

If you copy, provide a new interface. It's time-honoured and proven in Unix to give *_r() ones in such a case.

jeroenhd · on Nov 20, 2023

It solves hard crashes during DNS lookups by wasting a few kilobytes of RAM. Seems like a fine solution to me. The memory leak only occurs in circumstances where the program would've crashed or started messing with random memory anyway.

A proper solution would be to either nuke put/setenv() in the C standard library or redesign the *env() calls entirely, but that would break existing programs.

JohnFen · on Nov 21, 2023

A proper solution would be to implement your own put/setenv in the program that needs special behavior. That solves everything and does not affect any other programs.

slaymaker1907 · on Nov 20, 2023

There are circumstances where it's a perfectly valid solution. For example, suppose you're trying to acquire the lock on something to destroy it. It can be the lesser of two evils just to leak that memory instead of just waiting forever/a very long time to acquire that lock. You just need to ensure that you aren't leaking memory too quickly for whatever your constraints are. For example, most programs wouldn't care about a 1kb/day leak of memory because that would take a very long time to actually become noticeable. Furthermore, there's pretty much always some degree of memory growth just due to heap fragmentation (at least if you're using a language like C which can't do memory compaction via GC).