So, for a top level bit of code to pass arguments to lower level bits of code yo...

kbenson · on May 24, 2021

The only two things I use this for are settings that I consider runtime flags, which I want honored throughout that run of the application.

In the case of dry run, I think it's far more important that it's correctly seen and honored than that globals are avoided. I view dry run mode as a contract with the person running the program. If I can avoid passing arguments along every time, and avoid having to go change the arguments of all the utility functions I call and then all their call sites, that's a win: if that work is required to honor the flag correctly, sometimes it won't get done and some worse solution will happen instead.

> but code passing arguments to other bits of code in the same language with environment variables seems odd.

I don't pass the environment variables around; they exist as part of the program state. If foo() calls bar(), I don't pass the environment variable to bar(). It exists as a global flag, and I just check for it in bar(). That's the benefit: it's a global and it's set at runtime.
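A minimal Python sketch of that foo()/bar() arrangement, for illustration only. The `DRY_RUN` name matches the Perl example further down the thread; the `path` argument and the returned strings are made up here, and treating an empty string or "0" as "off" is an assumption chosen to mirror Perl's truthiness.

```python
import os

def dry_run_enabled():
    # Assumption: unset, empty, or "0" means "off", mirroring Perl truthiness.
    return os.environ.get("DRY_RUN", "") not in ("", "0")

def bar(path):
    # bar() reads the flag from the process environment directly;
    # nothing was threaded through foo()'s signature to get it here.
    if dry_run_enabled():
        return f"DRY RUN: would delete {path}"
    return f"deleted {path}"  # placeholder for the real side effect

def foo(path):
    # foo() has no dry_run parameter and does not forward one.
    return bar(path)
```

Adding the check to bar() required no change to foo() or to any of foo()'s callers, which is the trade being described.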

The other benefit is that as an environment variable, it's inherited by child processes. Even if I system out to another utility script, I don't have to pass a debug flag there either; it's inherited as part of the environment the child runs in.
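The inheritance part can be checked with a short Python sketch. The `child_sees` helper is hypothetical, and the "utility script" is stood in for by a one-line Python child process.

```python
import os
import subprocess
import sys

def child_sees(name):
    # Start a separate Python process that reports one variable
    # from its own environment. No flag or argument is passed to it.
    code = f"import os; print(os.environ.get({name!r}, '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

os.environ["DRY_RUN"] = "1"    # set once in the parent
print(child_sees("DRY_RUN"))   # prints 1
```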

raziel2p · on May 24, 2021

In many programming languages, the environment is exposed as a mutable dictionary/map, which makes it the ultimate place to store global variables.

It's definitely my guilty go-to-hack when I'm not up for refactoring everything to be more functional and/or take a dry_run function parameter everywhere.

Buttons840 · on May 24, 2021

It's a global mutable map that multiple uncoordinated processes can change. If you want a global map, just make a global map. Or just make global variables, since the variable namespace itself is a map.

kbenson · on May 24, 2021

All that assumes you have some stuff to set up that global map, and that means that setup code is a requirement.

I end up writing lots of library code. Sometimes that library code is called from within a command line utility I created, sometimes it's called from a web service, sometimes it's just a small driver script because I'm not developing or testing the utility, but the library code itself.

I can make that include to set up the shared global and try to make sure it's included in all instances and all ways I want to use the code, or make all the call sites resilient to it not existing, or I can just use the included OS mechanism for doing this and since that's always available, I get it for free.

Also, dry run mode isn't necessarily something you want set in a config. It's generally something you run once or twice before running for real (the normal case), or while developing and debugging. It's not something you want to set in code, accidentally forget, and push live. A good dry run mode will generally look like it succeeded without actually doing the work, mocking responses that would otherwise fail along the way, because you usually aren't testing one small thing; you're testing a workflow with a few steps.

That said, I fully admit the trade-off might go a different way for different languages. Using a compiled, strongly typed language may mean there are enough bits to check that you need to write a debug/dry run helper function to make it convenient, so there's not a lot lost by requiring setup in that as well. But for something like Perl (and I assume Python and Ruby and JS, to almost the same degree) where I can do:

    use Data::Dumper;  # provides Dumper()
    warn "Calling out to foo() with args: " . Dumper($args) if $ENV{DEBUG} and $ENV{DEBUG} >= 2;
    foo($args) if not $ENV{DRY_RUN};

and it will be completely valid, obvious and idiomatic with zero additional work, there's a real draw to using environment vars for these two specific cases (even if not for all config).
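For comparison, a rough Python counterpart to those two Perl lines. This is a sketch, not a claim about how anyone in the thread writes it: `env_int` and `call_foo` are hypothetical helpers, and `foo` is whatever function has the real side effect. Environment values are always strings in Python, so the numeric `DEBUG` comparison needs an explicit conversion that Perl does implicitly.

```python
import os
import sys

def env_int(name, default=0):
    # Unset, empty, or non-numeric values fall back to the default.
    try:
        return int(os.environ.get(name, ""))
    except ValueError:
        return default

def call_foo(foo, args):
    # Log the call at DEBUG level 2 or higher.
    if env_int("DEBUG") >= 2:
        print(f"Calling out to foo() with args: {args!r}", file=sys.stderr)
    # Assumption: like Perl, treat unset, empty, or "0" as "not a dry run".
    if os.environ.get("DRY_RUN", "") in ("", "0"):
        return foo(args)
    return None
```

So "almost the same degree" seems about right: still no setup code, but a couple more lines than the Perl version.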

hnick · on May 25, 2021

Is this true on other OSs? On Windows at least env vars are inherited and will not change in a long running process if changed elsewhere.

So you can do something like nuke $ENV{PATH} in your perl script and it'll apply to any subsequent child system() calls.
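The "will not change if changed elsewhere" part is easy to try. Below is a small Python sketch, with `DEMO_FLAG` as a made-up variable name: the child overwrites the variable in its own environment, and the parent's value is read again after the child exits.

```python
import os
import subprocess
import sys

os.environ["DEMO_FLAG"] = "parent"

# The child overwrites DEMO_FLAG in its own environment and prints the result.
child_code = (
    "import os; "
    "os.environ['DEMO_FLAG'] = 'child'; "
    "print(os.environ['DEMO_FLAG'])"
)
child = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True, text=True, check=True,
)

child_value = child.stdout.strip()       # what the child saw after its change
parent_value = os.environ["DEMO_FLAG"]   # the parent's copy, read afterwards
print(child_value, parent_value)         # prints: child parent
```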

p_l · on May 25, 2021

On POSIX, the environment is just a pointer to an array of pointers to NAME=VALUE strings, terminated with NULL.

The contents are private to a process, and the only method to modify them from outside the process is to have write access to process memory and change the pointer to the memory block.

When you start a new process, ultimately your calls turn into a call from the execve family with the full set of arguments: the actual binary to replace your process with, its arguments, and the contents of the environment it's going to use. The wrapper functions that don't ask for a new environ value simply copy the current one.
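The same split between "copy the current one" and "supply your own" shows up in higher-level wrappers. Here is a Python sketch using `subprocess.run`, whose `env` parameter either defaults to the current environment or replaces it. The `run_child` helper and the `DRY_RUN` variable are only for illustration, and as far as I know this ends in an exec-family call on POSIX.

```python
import os
import subprocess
import sys

REPORT = "import os; print(os.environ.get('DRY_RUN', '<unset>'))"

def run_child(env=None):
    # env=None hands the child a copy of the current environment;
    # passing a dict replaces the child's environment entirely.
    result = subprocess.run(
        [sys.executable, "-c", REPORT],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

os.environ["DRY_RUN"] = "1"
inherited = run_child()   # child gets the parent's environment

# Same environment minus DRY_RUN, supplied explicitly by the caller.
stripped = {k: v for k, v in os.environ.items() if k != "DRY_RUN"}
explicit = run_child(env=stripped)
```

Here `inherited` is "1" and `explicit` is "<unset>": the child only ever sees what it was handed at start.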