So this is something I’ve never understood. If you modify a shell script while it’s running, the shell executes the modified file. This usually, but not always, causes the script to fail.
Now I’ve known about this behaviour for a very long time and it always seemed very broken to me. It’s not how binaries work (at least not when I was doing that kind of thing).
So I guess bash or whatever does an mmap of the script it’s running, which is presumably why modifications to the script are visible immediately. But if a new file was installed, e.g. using cp/tar/unzip, I’m surprised that this didn’t just unlink the old script and create a new one - which would create a new inode and therefore make the operation atomic, right? And this (I assume) is why a recompiled binary doesn’t have the same problem (because the old binary is first unlinked).
So, how could this (IMO) bad behaviour be fixed? Presumably mmap is used for efficiency, but isn’t it possible to mark a file as in use so it can’t be modified? I’ve certainly seen on some old Unices that you can’t overwrite a running binary. Why can’t we do the same with shell scripts?
Honestly, while it’s great that HP is accepting responsibility, and we know that this sort of thing happens, the behaviour seems both arbitrary and unnecessary to me. Is it fixable?
> isn’t it possible to mark a file as in use so it can’t be modified?
That's the route chosen by Windows for binary executables (exe/dll) and by various other systems. Locking a file against writes, delete/rename, or even reads is just another flag in the Windows equivalent of fopen [1]. This makes for software that's quite easy to reason about, but hard to update. The reason you have to restart Windows to install Windows updates, or even to install some software, is largely this locking mechanism: you can't update files that are open (and rename tricks don't work because locks apply to files, not inodes).
With about three decades of hindsight I'm not sure it's a good tradeoff. It makes it easy to prevent the race conditions that are an endless source of security bugs on Unix-like systems; but on the other hand, most software doesn't use the mechanism because it's not in the lowest-common-denominator file APIs of most programming languages, and MS is paying for it with users refusing to install updates because they don't want to restart their PC.
I've updated .so files on FreeBSD while they were in use. They weren't busy, and a program which had one mmapped in order to run promptly crashed (my update wasn't intended to be hot loaded and wasn't crafted to be safe, although it could have been if I'd known it was possible). And now I won't forget why I should use install instead of cp: by default, install unlinks before writing, while cp opens and overwrites the existing file.
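A minimal sketch of the difference, using ls -i to watch the inode (filenames and inode numbers here are illustrative):

    $ echo v1 > libfoo.so
    $ ls -i libfoo.so                  # note the inode number
    1000001 libfoo.so
    $ echo v2 > new-libfoo.so
    $ cp new-libfoo.so libfoo.so       # cp opens and truncates in place...
    $ ls -i libfoo.so                  # ...same inode; a process with it mmapped sees the new bytes
    1000001 libfoo.so
    $ install new-libfoo.so libfoo.so  # install unlinks the target first...
    $ ls -i libfoo.so                  # ...new inode; the old one lives on for anyone who has it open
    1000002 libfoo.so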
This behavior in shell scripts predates mmap. In very early versions of Unix it was arguably even useful; there was a goto command which was implemented by seeking on the shell-script file descriptor rather than as a shell builtin, for example. I don't know of any use for it since the transition to the Bourne shell, but my knowledge is far from comprehensive. (I suppose if your shell script is not small compared to the size of RAM, it might be undesirable to read it all in at the start of execution; shar files are a real-life example even on non-PDP-11 machines.)
As I understand it, the reason for ETXTBSY ("on some old Unices...you can't overwrite a running binary") was to prevent segfaults.
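Assuming a Linux box (the exact error text varies by system), this is easy to reproduce with a copied binary and paths of your choosing:

    cp /bin/sleep /tmp/mysleep
    /tmp/mysleep 60 &              # execute the copy
    cp /bin/sleep /tmp/mysleep     # opening a running binary for writing fails:
    # cp: cannot create regular file '/tmp/mysleep': Text file busy
    mv /tmp/mysleep /tmp/oldsleep  # rename and unlink are still allowed, so...
    cp /bin/sleep /tmp/mysleep     # ...replacing it via a fresh inode succeeds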
cp usually just opens the file with O_WRONLY|O_TRUNC, which seems like the wrong default; Emacs, for example, usually does create a new file and rename it over the old one when you save, allocating a new inode as you say. By default it makes an exception if there are other hard links to the file.
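That safer pattern, roughly what Emacs does, is easy to sketch in shell (filenames here are hypothetical; the key point is that rename(2) within one filesystem is atomic):

    tmp=$(mktemp script.sh.XXXXXX)   # new file, new inode
    cat new-version.sh > "$tmp"
    chmod +x "$tmp"
    mv "$tmp" script.sh              # atomic: readers see old or new, never a half-written mix

A shell already running the script keeps its open file descriptor on the old inode, so it finishes executing the old version undisturbed.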
Btrfs and xfs have a "reflink" feature that allows you to efficiently make a copy-on-write snapshot of a file, which would be ideal for this sort of thing, since the shell or whatever won't see any changes to the original file, even if it's overwritten in place. Unfortunately I don't think you can make anonymous reflinks, so for the shell to reflink a shell script when it starts executing it would need write access to somewhere in the filesystem to put the reflink, and then it would need to know how to find that place, somehow. And of course that wouldn't help if you were running on ext4fs or, I imagine, Lustre, though apparently an implementation was proposed in 02019: https://wiki.lustre.org/Lreflink_High_Level_Design
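For example, with GNU cp on a reflink-capable filesystem (filenames hypothetical):

    cp --reflink=always script.sh /tmp/script.snapshot.sh   # shares extents, copy-on-write
    bash /tmp/script.snapshot.sh                            # unaffected by later edits to script.sh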
> there was a goto command which was implemented by seeking on the shell-script file descriptor rather than as a shell builtin, for example.
Oh noooo I just realized you could probably implement a shared library loadable module for bash `enable` that does the same thing... just lseek()s the fd...
> Emacs, for example, usually does create a new file and rename it over the old one when you save, allocating a new inode as you say. By default it makes an exception if there are other hard links to the file.
Though the trade-off is that all operation ceases on a full hard drive.
I don’t have a better solution, but it’s worth noting.
Emacs gives you an error message in that case rather than destroying the old version of the file and then failing to completely write the new version, in the cases where it does the tempfile-then-rename dance. This is usually vastly preferable if Emacs or your computer crashes before you manage to free up enough space for a successful save.
It doesn't cease all operation; other Emacs features work as they normally do. Bash, by contrast, stops being able to tab-complete filenames, at precisely the time when you most need to be able to rapidly manipulate your files. At least, that's the case with the default completion setup in a few recent versions of Ubuntu.
Well, it looks like creating another hard link is a nearly free solution. And beyond that, since Emacs already has both behaviors, presumably you can tell it you want the in-place modification.
The reason why modifying a script during execution can have unpredictable results (not demonstrated in this test) is that Unix shells traditionally alternate between reading commands and executing them, instead of reading the entire file (potentially very large compared to 1970s RAM sizes) and executing commands from the in-memory copy. On modern systems, shell scripts are usually negligible in size compared to system RAM, so you can manually cause the entire file to be buffered by enclosing the script in a function or subshell:
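    #!/usr/bin/env bash
    # a minimal sketch: bash parses a compound command in full before
    # executing any of it, so the whole body is buffered in memory
    main() {
        echo "step 1"
        sleep 10
        echo "step 2"   # runs as originally written, even if the file is edited mid-sleep
    }
    main "$@"; exit     # 'exit' guards against anything appended after this line

The trailing exit matters: without it, the shell would keep reading past the call to main and execute whatever had been appended to the file in the meantime.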
bash will read(), do its multi-step expansion and parsing, and then lseek() back so that the next read starts at the next input it needs to handle. This is why the problems described in the story can happen.
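If you want to see it, something like this works on Linux (strace is Linux-specific; the exact trace varies by bash version):

    # watch bash read a chunk of the script, then lseek back to just
    # past the last command it actually consumed:
    strace -e trace=read,lseek bash ./test.sh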
The other way to fix this is simply to use editors that make a new file and rename it over the target on save. I believe vim or neovim does this by default, but the likes of ed or vi do not. Emacs will do something similar on first save if you did not (setq backup-by-copying t), but any write after that will still be done in place. I tested this trivially, without reviewing the Emacs source, by doing the following, and you can too with your $EDITOR of choice:
    #!/usr/bin/env bash
    echo test
    sleep 10
    # evil command below, uncomment me and save
    # echo test2
While the script is in the sleep, uncomment the evil line and save. If "test2" prints after the sleep finishes, your editor wrote the file in place and may cause the problem described.
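A faster check than racing the sleep, assuming your ls supports -i (it's in POSIX): compare inodes across a save.

    ls -i test.sh   # note the inode, then edit and save in $EDITOR
    ls -i test.sh   # same inode = in-place write; new inode = rename-over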
> If you modify a shell script while it’s running, the shell executes the modified file
That depends on the OS. In this case, wasn't the shell script just executed fresh from a cronjob?
I remember on Digital Unix - on an Alpha, so this was a few years ago - that you could change a C program (a loop that printed something then slept, for example), recompile, and it would change the running binary.
> wasn't the shell script just executed fresh from a cronjob?
The description said that the script changed while it was running, so certain newly introduced environment variables didn’t have values and this triggered the issue.
My reading was that this was just a terrible coincidence - the cron job must have started just before the upgrade.
Regarding changing a C program, now you mention it I think that the behaviour you describe might also have happened on DG/UX, after an upgrade. IIRC it used to use ETXTBSY and after an upgrade it would just overwrite.
Not really behaviour that you want (or expect), though.
It's nice to see the same mistakes that people have been making for as long as I've been alive, on small and large systems all over the world, still happening on projects with professional teams from HPE or IBM that cost hundreds of millions of dollars.
From what I know, Linux so far doesn't have a mandatory exclusive-lock capability on a file; Windows does, however. So on Linux you can't mark a file as being in the exclusive possession of a process.
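What Linux does offer out of the box is advisory locking, e.g. via flock(1), but it only serializes processes that opt in (paths here are hypothetical):

    # cooperating writers serialize on the lock file; a process that
    # never calls flock can still modify script.sh underneath it
    flock /tmp/script.sh.lock -c './script.sh'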