Moved ~/.local/share/steam. Ran steam. It deleted everything owned by user (github.com/valvesoftware)
603 points by stkhlm on Jan 15, 2015 | hide | past | favorite | 268 comments



This seems like yet another good example of why robust application-level access control would be a helpful thing to build into modern operating systems, in addition to the typical user-based controls. This may have been both a rookie mistake and a regrettable failure of code review processes, but in any case it simply shouldn’t be possible for an application running on a modern system to wipe out all user data without warning in such a sweeping way.

I have often made this argument in the context of sandboxing communications software like browsers and e-mail clients, where it is relatively unusual to need access to local files except for their own data. In that context, restricting access to other parts of the filesystem unless explicitly approved would be a useful defence against security vulnerabilities being exploited by data from remote sources. It’s hard to encrypt someone’s data and hold it for ransom or to upload sensitive documents if your malware-infected process gets killed the moment it starts poking around where it has no business being.

More generally, I see no reason that we shouldn’t limit applications’ access to any system by default, following the basic security principle of least privilege. We have useful access control lists based on concepts of ownership by users and groups and reserving different parts of the filesystem for different people. Why can’t we also have something analogous where different files or other system resources are only accessible to applications that have been approved for that access?


Isn't this the basic idea behind the sandbox in OS X?

I think OS X (and mobile app development in general) shows both that this is great in theory and a net improvement over not having it, but that there are some common pitfalls to address.

First, there are a handful of apps where this model doesn't work so well -- e.g. text editors, FTP clients, etc. So you're inconveniencing quite a few legit apps which need broader access.

Second, as a corollary of the first, that means you're going to have a lot of apps that legitimately need to ask users to approve broader access. And as the number of apps asking for approval goes up, the more likely users are to simply ignore the warning and approve all. This is especially problematic since we can't assume the average user is a good judge of which apps need which access.

Edit: One way of reducing user acceptance fatigue might be to introduce greater granularity into the requested permissions and then tier the permissions requested -- e.g. commonly asked vs. uncommon. E.g. an app may legitimately need permission to write to any file in your home directory, but it's highly unlikely it'll need permission to write to more than X number of files per second. Or at least it shouldn't be able to do so without the OS throwing up lots of warnings outside of the app.


Sandboxed applications in OS X can read/write to arbitrary locations if they use the system Open/Save dialogs to ask the user about those files (after opting into sandboxing, of course). See here [1].

For files and folders the user cares and knows about (documents, projects, etc), this shouldn't be a problem. For files the user doesn't care about (caches, configuration), you can just leave them in your sandboxed container.

[1] https://developer.apple.com/library/mac/documentation/Securi...
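
If you're curious which apps have opted in, you can inspect an app's entitlements from the shell (a rough sketch; codesign ships with OS X, and TextEdit is just an example of a sandboxed Apple app):

    # dump the entitlements embedded in the app's signature
    codesign -d --entitlements :- /Applications/TextEdit.app
    # sandboxed apps carry com.apple.security.app-sandbox, and the Open/Save
    # "powerbox" behaviour corresponds to entitlements like
    # com.apple.security.files.user-selected.read-write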


Lots of applications (like Emacs and vim) don't use the system file dialogs though. It'd be nice to preserve old-fashioned file access for them.


But these are applications for knowledgeable users, I think, which is not the type of user the GGP is talking about. For these it might be ok to ask for permission, or to grant it automatically if you are root or sudo'ed. And many use cases could be covered by silently allowing a write if the file has previously been opened by the same application.


I will not use a system that pops up some kind of UAC-like approval dialog for every call Emacs makes to open(2).


The OS X sandbox doesn't work that way. If you do "Open" and pick the root of your hard drive one time, the application gets and keeps access to the entire drive.

Disk scanning programs like DaisyDisk from the app store have to make you do this before they can get any information about disk usage.


Well, of course, sandboxing is opt-in, and there are still lots of programs out there that don't use it.

I'm not sure Emacs or vim, as an example, would ever be able to be sandboxed from the filesystem.


Aquamacs seems to use the system dialogs with no trouble.


Sure, File->Open invokes the system dialog. Not C-x C-f. Not automatic filename history, or tags table traversal, or perusal of the M-x grep results buffer, and so on. Emacs is a generic runtime environment and needs generic file access.


It helps if you assume good faith of the application developer. Part of developing an application should be defining what permissions you need at install time. This won't help against legitimate malice, but it would defend against this type of mistake, and (if it is configured such that the app cannot change its own permissions) will also mitigate any exploit of the app.


Is it really a net improvement? There are a number of developers out there that have pulled out of the app store (off the top of my head, Atlassian for SourceTree, and Panic for Transmit) because the sandbox restrictions would force them to remove functionality from their applications.

As far as Apple's implementation goes, sandboxes are for kids, not adults that need to get work done.


SourceTree and Transmit need to stomp all over directories to get things done. Yes, it's useful. But it's not common for applications to need that kind of access. The sandbox seems to work fine for the other 99% of applications. I think Apple even uses the sandbox heavily for their own apps, check ~/Library/Containers next time you use an OS X system.

The only complaint here is about the app store, sandboxing is wonderful.


Steam is another category of application which wants to write to directories used by other applications...


Why? In my Windows VM, all Steam data lives in C:\Program Files\Steam. (Plus start menu entries etc.)


Steam manages game installation and updates; those games are themselves separate applications. That's what I was referring to.


There seem to be several reasonable ways to address this kind of situation without requiring universal access.

One would be analogous to an ACL arrangement rather than simple ownership. Steam applications could be installed with Steam also having permission to access their resources.

A second possibility would be to have the operating system provide dedicated services for installing and maintaining software. We’re already heading in that direction on some platforms anyway, and it would be useful generally given the kind of security model I suggested. Then software like installers/updaters or package managers can do their jobs in a tightly controlled way, without needing any general access or introducing the accompanying security risks.


Silly question; how common is this class of bug? We're talking about an application that lives on the local system, and is probably only exploitable via social engineering bugs (i.e. we convince the user to do something stupid).

I can count the times I've been owned through an app that doesn't run content from the internet (either accessed by or being a server for) on zero hands.

What is the problem that sandboxing every app into a homogenous set of thou-shalt-not's solves?


Silly question; how common is this class of bug? We're talking about an application that lives on the local system, and is probably only exploitable via social engineering bugs (i.e. we convince the user to do something stupid).

We live in a world where merely installing software might also install a silent updater in the background, or might interfere with existing software that it has no need to touch, or might start monitoring peripherals and phone home with data in ways that could invade privacy. We also live in a world where once popular software, particularly freely available software, sometimes drifts into borderline malware territory over time. In this world, “doing something stupid” can be as simple as turning on your computer and installing (not running, just installing) some of the most popular software in the world today on it.

What is the problem that sandboxing every app into a homogenous set of thou-shalt-not's solves?

To give a few examples, some of us would consider it a bug for everyday applications to splat junk all over a filesystem during a build/install, or to hide data in odd places as part of a copy protection scheme, or to scan a whole disk and automatically upload any files that might support “cheating” in a game to the mothership.

Unlike some here, I am not willing to trust the good intentions of a software developer just because I have paid good money to use their product. Far too many shady practices go on in parts of our industry for that to be a sensible policy without adequate safeguards in place any more.


And so the answer is to hamstring what all apps can do in the aim of safety?

...there's a commonly misattributed quote I have in mind that describes this situation.


Steam runs content from the internet: the steam store, and all the downloaded games are from the internet.

And where do your apps come from? Wasn't there a thread the other day about installing the top apps from download.com and counting the chaos they inflicted on the system? Sure, you can avoid that, but not everyone does.


Steam runs mostly upstream vetted content, i.e. it's identical to the app store. If malicious code makes it through there, everyone is varying degrees of boned, sandboxes be damned. Yeah, it has a web browser, but 99% of the time, that browser is pointed at https://something.steampowered.com.

It's not like an average browser like Chrome where your main use case is running random code from random domains made by random people.

Besides, external apps don't answer the question of whether the tradeoff imposed by Apple's variant of sandboxing (where there are certain things you are never allowed to do, even if they are integral to the primary purpose of your software) is really worth preventing a limited class of issues.

App store requirements don't stop the user from downloading malware outside of the store. They do stop the user from doing certain things outright, and so push the user outside of the store, and so I'd argue, actually reduce safety as a knock on effect.


Why not have Apple require apps to run usefully with just basic permissions? Anything beyond access to own files is optional. This could certainly be gamed, but scrupulous app authors could gain a ton of trust from playing ball.


This is essentially already the case as Mac App Store apps must be sandboxed. This works fine for a lot of apps, but has presented issues for many prominent developers. (e.g. Panic had a fairly difficult time adapting Coda to the sandbox, if I recall correctly.)


Like AppArmor, or SELinux, or any of the other applications which have their hooks in the LSM? They do a fantastic job of this, if you can figure out how to use them.

The truth is that they are too hard for even your average Sysadmin to configure & manage, let alone your average desktop user.

setenforce 1 (yeah, right).


Yes, and I also agree that there are substantial unsolved problems in making such fine-grained systems practically useful for non-expert users.

I’d like to see the industry moving in that general direction, though. Even a much simpler model could bring real benefits relative to the status quo, where in application terms our current security model is analogous to everything being root.


Broadly I agree; the original work on access control assumed that it was users who might be untrustworthy and programs were safe, in a "classified documents" context.

However, applying program-level access control is very un-UNIX. How do you compose multiple programs with different security regimes? This bug happened because the "steam" program called the "rm" program via the "shell" program. Inheriting capabilities mostly solves this, but we're familiar with how hard selinux is to use as a result and it still doesn't save the user from command line typos.

I think it's time to make a stronger case for time-reversible filesystems. Accidental deletion matters less if you can just get in your time machine.


You are describing a problem that has been solved several times over. Blame steam, the distro devs, or the user for not implementing one of the many long existing solutions: chroot [0].

[0] http://en.wikipedia.org/wiki/Chroot
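
For illustration, a very rough sketch of the idea (all paths hypothetical, and a real jail also needs the binaries, libraries, and device nodes the program expects copied or bind-mounted in):

    # run the client inside a minimal jail so a stray rm can't escape it
    mkdir -p /srv/steam-jail/home/steamuser
    sudo chroot /srv/steam-jail /bin/sh -c 'HOME=/home/steamuser steam'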


If a problem has been solved several times over, that's a sign that it has never really been solved. If there's a thing that most apps should do, then doing it should be the default, and avoiding the default should be the thing that takes work.


So what, your definition of a solved problem is one that never occurs again due to uniform implementation of countermeasures? That isn't realistic. That is like saying that plane geometry isn't a solved problem because people are still walking around having not read Euclid's "Elements", doing geometry wrong.


For the record, plane geometry wasn't solved by Euclid's "Elements"; the conjecture about the 5th postulate remained open for a substantial amount of time, and other axioms were introduced when it was pointed out that Euclid's work relied on numerous implied axioms.


Damn it. Well at least I can still trust in Plato's theory of forms.


Just because you've invented solar panels does not mean that you've solved the energy crisis. Sometimes the implementation of technology is just as inventive as the invention itself.


lol, ok let's break this down Barney style:

Converting sunlight to electricity is a solved problem. The "energy crisis" remains unsolved.

Jailing processes is a solved problem. Preventing users from shooting themselves in the foot remains unsolved.

Now let us reexamine the post:

> ...would be a helpful thing to build into modern operating systems...

Already built in.

> Why can’t we also have something analogous where different files or other system resources are only accessible to applications that have been approved for that access?

We do. Solved problem. Now if we want to prevent users from shooting themselves, that is a different problem.

That last paragraph sure makes it sound like the writer is unaware of these solutions' existence... but let us suppose that he knows about them and is coming at it from your perspective. In that case I agree, we don't have a solution in place - and I'd love to see one in common use. I'd think you'd be able to implement such a system through package management for the majority of software, with an added chroot step. That is why I included distro and application devs in those that share blame for the problem. Users are also included, to a smaller degree, because this is a known problem with a known solution.


I’m well aware of strategies using chroot, virtual machines, and the like. These are useful tools up to a point, but a long way short of what I would ideally like to see. For example, they restrict access at a very coarse level compared to the kinds of user/group/ACL models we use in many other contexts. By their nature, they also do not admit convenient ways to break out of the jail with the user’s explicit consent. Once you get beyond individual applications with their own dedicated file types and consider more generic cases like text editors working on text files that are likely to exist throughout a filesystem, this lack of flexibility is a serious limitation. Another key distinction is that chroot and the like are voluntary mechanisms, usually off by default and therefore not completely enforced by the OS.

Some systems mentioned in this HN discussion are closer to the kind of model I had in mind. As other posters have pointed out, the difficulty is how you structure a system so that it is reasonably effective by default but still usable by non-experts. I believe we could achieve this — or at least get much closer than the typical security models we use at the moment — but it will surely take a lot more thought and experimentation than we have attempted as an industry so far. Microsoft’s UAC mechanism makes an interesting case study here: it was fundamentally a reasonable idea, but the first implementation proved too intrusive for average users to tolerate and lost much of its effectiveness as a result.


It sounds to me like your ideal situation could be implemented with the tools we already have at hand - it is just an issue of default settings and how the package maintainers set compilation and installation options. Unless you really wanna go hardcore and advocate for something that is impossible for users/developers to break, which would most likely require a formally verified microkernel [0] - at the very least.

[0] http://en.wikipedia.org/wiki/L4_microkernel_family#High_assu...


Role-based and other access control mechanisms unfortunately come off all too frequently as bolted on and arcane hacks.

The real issue lies in the fact that the file system resides in a global namespace, when it shouldn't. Much like each process has its own environment variables, so should it have its own namespace. Linux does support so-called "mount namespaces" now, but once again they're not inherent parts of the system, but have to be tacked on through explicit unshares, and thus lose the cohesiveness of platforms such as Plan 9. [1]

[1] http://doc.cat-v.org/plan_9/4th_edition/papers/names
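
Concretely, the explicit-unshare route looks something like this (a sketch assuming util-linux's unshare and root; the paths and user name are made up):

    # give one process tree its own private view of /home
    sudo unshare --mount /bin/sh -c '
        mount --bind /srv/app-private-home /home/user &&
        exec su - user -c some-untrusted-app
    '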


As a more immediate fix with less collateral damage, since Unix programmers refuse to stop putting `rm -rf` commands in shell scripts (they seem to think the suggestion is an insult to their manhood), change the behavior of rm so that by default it either disregards -rf or moves the target files to a trash directory where they can be retrieved in the event of an error.
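
As a sketch of the trash-directory half of that (a user-level approximation rather than an actual change to rm's defaults; the trash path follows the usual XDG location, and tools like libtrash or trash-cli do this properly):

    # move targets into a per-user trash directory instead of unlinking them
    del() {
        local trash="$HOME/.local/share/Trash/files"
        mkdir -p "$trash" && mv -- "$@" "$trash/"
    }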


A simple alias would work as a quick fix:

    alias rm='rm -I'
With the -I flag, rm asks once before removing more than three files or before removing recursively. One of the top lines in my aliases file.


Already exists - libtrash.


> robust application-level access control would be a helpful thing to build into modern operating systems

Something like Windows Store apps, but which ideally wouldn't require the use of a specific store? (Only Enterprise apps can bypass the store from what I know)


Or use a checkpointed file system with off-site storage. My PC can catch fire any day and I would lose nothing.

Sure, this Steam thing sucks, but your disk could die any moment; be prepared. RAID is not backup.


Here's the offending shell script code:

  # figure out the absolute path to the script being run a bit
  # non-obvious, the ${0%/*} pulls the path out of $0, cd's into the
  # specified directory, then uses $PWD to figure out where that
  # directory lives - and all this in a subshell, so we don't affect
  # $PWD
  STEAMROOT="$(cd "${0%/*}" && echo $PWD)"
  [...]
  # Scary!
  rm -rf "$STEAMROOT/"*
The programmer knew the danger and did nothing but write the "Scary!" comment. Sad, but all-too-familiar.
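
For comparison, a more defensive version of that fragment might look like this (just a sketch; the ${VAR:?} expansion and the sanity checks are plain shell, nothing Steam-specific):

  STEAMROOT="$(cd "${0%/*}" && pwd)" || exit 1
  # bail out if the computed path is empty or the filesystem root
  case "$STEAMROOT" in
    ""|"/") echo "refusing to delete: suspicious STEAMROOT '$STEAMROOT'" >&2; exit 1 ;;
  esac
  rm -rf "${STEAMROOT:?}/"*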


Yikes. That should never have passed a code review. I know mistakes happen, I often defend screw-ups, but anything in a shell script that has an "rm" command should automatically get the squinty eyes.


Also, someone should have been asking "Why is that scary?" - which should have led to a discussion on alternatives.



Actually, that repo is just a user who extracted the Steam .debs into a github repo, the comments are completely misdirected.


The reaction images/memes are priceless.


I thought I was a bad programmer until I saw this thread. How did this make it to prod?


Tight deadlines, limited resources (programmers), limited budget, limited QA


I worked at a solar SCADA company that rolled their own APT packages.

In the pre and post install deb package scripts there was all kinds of crazy shit, like upgrading grub to grub 2 and manually messing with boot sectors. All this stuff in packages innocuously named modbus_driver.deb or what have you, and all in absolutely the most archaic bash syntax possible. I did suggest, strongly, that we jail all application binaries that we twiddle around with, with something like chroot, but was rebuffed.

Eventually somebody mixed a rm -rf /bin/* with rm -rf / bin/*, and the rest is history. They bricked about 100 embedded PCs, all in remote locations, all powering powerplant SCADA systems that did stuff like connect to CalISO for grid management or collect billing information. It cost hundreds of thousands of dollars to fix.


Would this have stopped things getting deleted?

    if [ -z "$STEAMROOT" ]; then
        # something isn't right...
        exit 1
    fi


Not if the previous line was:

    STEAMROOT=$SOME_OTHER_UNSET_VARIABLE/
"rm -r " is a code smell, as much as "cc -o myprog .c" is. You should always know what files make up your system, and track them in a MANIFEST file. There's rarely a good reason to use wildcards when a program is dealing with its own files.

    xargs rm -df -- < MANIFEST
fixes this.
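
For instance (a sketch; $INSTALL_DIR is a placeholder, and as the reply below points out, this plain xargs form still chokes on whitespace in names):

    # at install time: record every path created, children listed before parents
    find "$INSTALL_DIR" -depth > "$INSTALL_DIR/MANIFEST"
    # at uninstall time: remove exactly those paths and nothing else
    xargs rm -df -- < "$INSTALL_DIR/MANIFEST"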


That looks like it works until your MANIFEST file ends up with a space character in one of the file names.

For GNU xargs I like adding -d\\n which handles everything except files with an embedded newline. Those are much rarer than files with a space, though.

Sadly, OS X xargs (probably BSD based) doesn't have that option, so I have an alias to do the same thing:

    alias xargsn="tr '\n' '\0' | xargs -0"


Good catch.


Steam doesn't deal with its own files. It deals mainly with random games that are creating tons of random files.


Doesn't it have an API? It could mandate that "random files" should only be created and deleted via the API, and update the manifest accordingly. Put the game in a read-only folder to make sure it happens.


Steam actually sells a number of games which haven't been modified for use with Steam at all - no DRM integration, no achievements. Further to that, it sells games which use closed engines that are never going to be modified to use Steam's APIs to do things.


Or worse, basically just glorified installers for Games for Windows Live, or Ubisoft's giant portal thing that you have to then run simultaneously (or in the right order) to get to the game.

It's definitely a Rube Goldberg machine in action.

I feel like the idea of sandboxing its progeny is going to need to look like docker or some sort of container where it appears to be a standard OS (since games use a lot of low level hacks) but is actually partitioned from the rest of the system.


   rm --preserve-root
isn't a bad thing to have either. That way, even if you do screw up, you won't be able to run rm against '/', even with '-f'.

It's one of the top aliases in my .bash_aliases file.
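
Concretely, that's just (GNU rm; as noted below, newer coreutils make this the default anyway):

    alias rm='rm --preserve-root'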


I think that

    rm --preserve-root -rf /*
doesn't save you.


You are quite correct.


Agreed. Thankfully this is the default on many systems (I'm guessing recent coreutils).

Unfortunately dumb lines like "rm -rf $HOME/$STEAMDIR" still get through unscathed.


What if you somehow wound up with a single line of / in the MANIFEST, though?


"rm -df /" does nothing. "rm -df" does not remove non-empty directories.


That's awesome, I didn't catch the s/r/d/.


That's because I ninja'd the -d in after ;) also just ninja'd in a -- to prevent stray options.

Note also that putting a "*" in the MANIFEST doesn't do anything either, as neither xargs nor rm expands wildcards (only bash does).


    rm: cannot remove ‘/’: Is a directory


Fyi: the -d option is specific to the BSD implementation of rm, it's not available in the GNU coreutils rm.


It's in my coreutils (8.22). Maybe it's recently added?


It looks like it was added in version 8.19. Probably about time to update my Ubuntu box...

http://savannah.gnu.org/forum/forum.php?forum_id=7342


> Would this have stopped things getting deleted?

No. "${STEAMROOT}" will contain something when the "rm" command runs. It just might not be what's expected.

> STEAMROOT="$(cd "${0%/*}" && echo $PWD)"



Whenever I write something scary like that, I usually wrap it in a function, double checking if it's really the folder you wish to delete ... Often even adding a user-verification if possible.


Since the script is written in bash they could also have used $BASH_SOURCE, which will always point to the correct path of the script being executed, so you can do something like this:

    SCRIPT="$BASH_SOURCE"
    SCRIPT_DIR="$(dirname "$BASH_SOURCE")"


Gotta question why they used -f.


That's normal, but why the trailing slash?! That's just pointless, and almost looks like an explicit deathtrap.

Without the slash, an empty variable would result in a command line of "rm -rf" which would simply fail due to the missing argument.

There is absolutely no need for having a trailing slash, it's not as if "rm -rf foo" and "rm -rf foo/" can ever mean two different things, there can be only one "foo" in the file system after all.

Very interesting way of introducing an epic fail with a single character, that really looks harmless.


> it's not as if "rm -rf foo" and "rm -rf foo/" can ever mean two different things

That's true, but the original code was akin to "rm -rf foo/*" and that's different, since it removes the contents of the directory while preserving it.


I have observed that many people feel that directories “need” to have a slash appended to them. They are afraid of ever doing, e.g. “cd /foo/bar” and will always do “cd /foo/bar/”. I’m guessing they feel like it would be some sort of type error, like treating a directory like a file or something.

This behavior is especially common with regards to URLs; there are many flamewars about whether URLs “need” a trailing slash or not.


The trailing /* is to delete the directory _contents_ rather than the dir itself. It would be safer to delete and recreate the dir, as then you can possibly hit some of the built-in rm protections.


So the script wouldn't pause and ask for user input during normal operation.


But why would the script prompt for user input unless something was awry? Presumably they control the contents of $STEAMROOT, so I don't see why rm -r should prompt unless it's about to do the wrong thing.


most systems have "alias rm='rm -i'" by default, so it would prompt on every single file regardless of ownership, etc.


"most systems"? Not on Debian or Ubuntu, and that's most of them.


fwiw, making those "-i" aliases to rm, mv and cp is one of the first things I do on any new Linux machine I'm on.

I don't understand how anyone works in the very unforgiving-of-accidental-delete *nix world without those aliases.
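
For reference, the aliases amount to:

    alias rm='rm -i'
    alias cp='cp -i'
    alias mv='mv -i'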


The "rm -i" alias is a horrible, horrible idea. Red Hat has a lot of stupid defaults, but this is probably the most questionable one.

The useless confirmations on every deletion is so intrusive that people will instinctively try to work around it. In the best case they'll undefine those crappy aliases in their own shell config, or maybe gravitate toward writing /bin/rm rather than rm to avoid the alias expansion. In the bad case they'll learn that "rm -f" will override "rm -i", and get in the habit of using that to shut rm up. Too bad that "-f" does more than negate "-i"...

People who don't actively work to circumvent the braindamage will almost certainly end up reflexively teaching themselves to just answer "y" to the prompts without reading. Or worse, they'll learn to depend on the prompts being there, and doing "rm * " when their intent is not to remove everything. "Yeah, I'll just answer 'n' for the files I want to keep". That's going to be a really nasty surprise when they use a machine without that alias.

No. Just no. Don't do it.

The solution in zsh is much better. Warn for "rm * " (or "rm * .o", etc) no matter what, since that's both very dangerous and very rare. But don't waste the user's attention on every single deletion.


Hmm. Well I guess it's just me then. I've been burned so many times by rm/mv/cp that I actually do always read their confirmations and give it a 2nd thought. I rarely delete files unless it's a mass delete or files & folders that I do "rm -rf". and I pretty much never intend for mv or cp to overwrite an existing file.


zsh has a nice feature where rm is only interactive if you are deleting everything in a folder. Another safety feature I really like in zsh is to tab-expand wildcards, so I can check that it's not deleting anything it shouldn't.
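
If you want to tune that prompt, it's controlled by shell options (a sketch for ~/.zshrc; RM_STAR_SILENT and RM_STAR_WAIT are the relevant options as far as I know):

    unsetopt RM_STAR_SILENT   # keep the "sure you want to delete all the files?" prompt (the default)
    setopt RM_STAR_WAIT       # and wait ten seconds before accepting an answer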


99% of the time, my terminal is open to a git repo, and -i would just be more noise. The other 1% of the time I add the alias or use Nemo.


Frequent snapshots and backups (non-local)


With care.


or Arch Linux, or OS X, or Cygwin, or …

… I don't think anyone does that.


> Not on Debian or Ubuntu, and that's most of them

Perhaps I should have said "sane systems" ;)

Fedora ships with that alias by default FWIW.


I don't know why anyone would use Fedora by choice.


Frankly put, it's a fantastic development environment -- mostly because it's targeted at developers/power-users and not the average user.

It has the added benefit of being inside the "RHEL/CentOS" ecosystem (similar commands, structure, etc), provides a glimpse of what's-to-come in future releases of RHEL/CentOS, and since the majority of servers in the enterprise are RHEL/CentOS based, it's a natural fit.

All that aside -- have you tried Fedora 21? It's a complete overhaul from previous Fedora releases and has a lot to like and offer.


I guess I have always had ideological issues with RedHat, as a distribution. It doesn't surprise me that they would put the -i alias in Fedora; this kind of "helpful" addition that is actually totally annoying and inappropriate is, IMO, emblematic of the distro.

I'm sure it works for some but it's not for me.


Shells (at least bash) don't evaluate aliases when running non-interactively.


Anyone considering putting customization like aliases where they'll execute even for non-interactive shells should go home and rethink their life. Thankfully, I've never encountered such a system.


If all one is in the habit of using with a particular command is some single letter flags glommed together like that then it's possible to forget that they're actually separate flags.


Probably to delete subfolders.


To those name-calling the author of the script:

The product/update is hyped and the release date is set in stone. Tensions are high and your boss has already let you know that you're on thin ice and not delivering on the project goals.

A last-minute showstopper bug comes in, caused by file leaks. Everyone is scrambling, and the file belongs to you so it's on you to fix it alone. There is no time for code review, and delaying isn't an option (so says management). "I'm afraid if we keep seeing these delays in your components, we might have to consider rehiring your position".

The rm -rf works -- it's a little bit scary, but it works. You write a test case, and everything passes. Still, you add the "Scary!" comment for good measure. You have two more bugs to fix today and you'll be lucky if you're home by midnight and see your wife or kids. You've been stuck in the office and haven't seen them in days.

Are you an "idiot", "talentless" engineer that "deserves to have his software engineering license permanently revoked"? How do you know this wasn't the genesis of this line of code?


Yes, that person is still an unprofessional idiot.

If a doctor accidentally removes the wrong organ because administrators have overscheduled him, "whoopsie, not my fault" is not the appropriate answer. The same applies to engineers working on bridges. Professionals take responsibility for their working conditions.

There is an enormous shortage of programmers right now. Anybody shipping stuff that is bad or dangerous is choosing that. If we drop our professional standards the moment a boss makes a sadface, then we're not professionals.


It's hard to be professional if no one wants or values it but you. The word your manager would use is "obstinate."

If they have not internalized the consequences of the risks they're asking their subordinates to take, they'll weigh what look like vague misgivings about "mumble, should be better, dangerous, blah blah" against the better understood risk of their bonus disappearing if the product doesn't ship on time.

Even if you choose to sacrifice yourself, your reputation, and your future prospects--again, if almost no employer in your industry would value what you call professionalism over short-term profits--someone else will ship the code you wouldn't.

That isn't a defense of anything; it's just a fact. Taboos (e.g. against bad code) don't work if they're not shared by the majority.


Sure, you can tell yourself that, and it will remain true. Or you can act like a professional and seek out places that value that. I have, and know others who do. I don't think we've sacrificed anything.


> If a doctor accidentally removes the wrong organ

Incidentally, that sort of thing does happen sometimes.

http://www.cnn.com/2010/HEALTH/10/18/health.surgery.mixups.c...


Agreed 100%. The number of people defending the author(s) of the code or saying the user should have backed up their data is disturbing. This isn't about protecting programmers' egos, it's about not deleting users' data.


So I guess it's a choice between getting fired for not meeting a deadline, and getting fired for destroying customer data.

I'd rather take the first option. At least my reputation will still be somewhat intact.

And as a bonus, I can get out of that hostile environment earlier.


> I'd rather take the first option. At least my reputation will still be somewhat intact.

I'd take the same option, but for different reasons: the customer's data. Only pictures of e.g. a deceased wife, no backup, and you just deleted them. You can arm-wave all you like about backing up, but you deleted them.


Dude, I am already stressed when hitting big red buttons in production, and from now on I can imagine the possibility of erasing unrecoverable memories about lost loved ones... This profession is tough on the most unexpected levels.


If you want more nightmare fuel (or just want to see various ways technology goes wrong), check out http://catless.ncl.ac.uk/Risks .


The choice is between definitely getting fired and maybe someone's data getting deleted.


This.

This is the reality of corporate development. I think we should be demanding more before buying software from vendors, especially when you can't just whip out the source code to audit or fix yourself.

Stuff like this doesn't have to happen, but we let it.


> Tensions are high and your boss has already let you know that you're on thin ice and not delivering on the project goals.

Valve doesn't have bosses, remember?


Except of course they actually do. It'll just take you half a year to figure out who they are.


Totally agreed but isn't Valve a company famous for not having all that corporate deadline stuff?


but the same wouldn't be acceptable in other lines of work, right? Try a civil engineer or surgeon... what would you expect someone to do in similar situations? Probably escalate the issue and perhaps delay the release.

Pushing in shitty broken code means you're not doing your job. If your company forces you to do this then they are not doing their job.


How do you know it was?


A while ago I made a small program to cache Steam's grid images and search missing ones (https://github.com/boppreh/steamgrid). More of an experiment in programming in Go, really, but it works and makes Steam a little better.

When I tried on Linux, it threw a permission error. Turns out Steam installs the folder "~/.local/share/Steam/userdata/[userid]/config/grid" without the executable permission bit. Without this bit no one can create files in there, Steam included, and the custom images feature gets broken.

I reported the problem, saying they should fix their installer, and got a "have you tried reinstalling it?" spiel. When I said I did, and manually changing the permission fixes the problem, so it must really be it, they closed the ticket with "I am glad that the issue is resolved".

This was a ticket at support.steampowered.com, because I didn't know Valve had a github account. I would open an issue there, but I don't have a linux installation to test again and this sort of misunderstanding burns a lot of goodwill.


Valve's customer support is known to be pretty awful. This is all too familiar an issue.

To be fair to them, when the six-sigma of your customer service is 15-year-olds asking to be unbanned cause they totally didn't use hacks, your CS probably gets a bit desensitised.


Too many IT support people either have no idea what they're doing or have no interest in helping improve their product. I'm experiencing the same with a highly specific and expensive commercial product, so it doesn't surprise me with Steam at all.


Something like this with Steam happened to my friend not too long ago. It was very saddening because he literally lost years of files (including personal projects) and salvaged what he could. That was with the Steam beta, and I caught Steam doing this myself (after he told me what happened). I was lucky to stop the script and switched out of the beta. At the time he reported this to Valve themselves, and they said they were "investigating the issue and knew of it". Seems to still be here, sigh.

I know the moral here is to keep your files backed up, but come on, this is a ridiculous issue Valve still hasn't fixed.


That is quite frustrating, and consumer vendors should be mindful of creating life-changing experiences.

Also: backups. I know it sounds cliche, but look, if it has a mechanical hard drive, the manufacturer could have slightly mis-calibrated one of the mechanical assemblies, and this could have happened because the nature of digital storage is that it is essentially ephemeral.

Protect yourself from things outside your control. You don't need the most sophisticated solution, just an external usb drive.


But if that external usb drive is mounted at the time (as in the case of the user this thread is about), then all data on that drive will be deleted.

For this reason, the recommended way to use things like rsnapshot is to have your backup directories owned by root and with permissions masked to something like rwxr--r--. If you then want to read your backups easily, you do things like mount it under NFS as read-only.


A (RAID6) fileserver running ZFS with filesystem-level snapshotting. It's really, really good.
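
For anyone who hasn't tried it, the snapshot workflow is only a couple of commands (pool and dataset names here are made up):

    zfs snapshot tank/home@2015-01-15         # cheap, near-instant snapshot
    ls /tank/home/.zfs/snapshot/2015-01-15/   # browse and copy files straight out of it
    zfs rollback tank/home@2015-01-15         # or roll the whole dataset back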


Btrfs snapshots, and then cp --reflink from the snapshot to the current tree whatever files or directories are missing; i.e. no need to rollback to a snapshot.
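
Something like this, for the record (the subvolume layout is hypothetical):

    # take a read-only snapshot of the home subvolume
    sudo btrfs subvolume snapshot -r /home /home/.snapshots/2015-01-15
    # later, recover individual files without rolling everything back
    cp --reflink=auto -a /home/.snapshots/2015-01-15/user/Documents /home/user/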


Or if you're on a Mac you don't even need any extra hardware.

Just ARQ and an AWS Account [or google drive, or dreamobjects, or SFTP, or...].

Your backup is client side encrypted and completely painless. Take half an hour to set it up and then just forget it till it saves your ass.

Not affiliated, just a fan: http://www.haystacksoftware.com/arq/


I think Valve should be sued over this, and lose. Sure, there will be clauses in their EULA stating that they aren't liable, but morally, those clauses should not be valid in any licensing agreement. Their commercial software caused damage to people, and they should pay through the nose for it.


There are several problems with your ideas. The most important one is that programmers only call themselves engineers until it comes time to take legal responsibility for their work; then suddenly they're artists creating works for hire. Programmers have worked very hard through the years to create the current liability-free environment; people die because of health care programming bugs, pilots crash because of avionics bugs, and people lose fortunes -- or welfare checks -- because of finance programming errors, but programmers just throw up their hands, and say that programming is hard.

The other problem here is the concept that commercial software should be held to some higher standard of liability than noncommercial software. If some random group of strangers build a bridge, which then collapses and kills someone, they're still very much open to lawsuits. So far, the programmers of the world have fended this off by shouting in all-caps about warranties express or implied -- but eventually (I hope) the world will get sick of their shit and hold them accountable for failure. When that happens, whether or not the software is sold should have nothing to do with damage liability (outside of any sale contracts that may apply).

A man can dream.


It's a bit tricker than that though.

In software, if I build a toy, and release it to the world, many people may find it useful. They might trust it with their most personal data, or vital business documents. But, it is still a toy.

If I build a toy bridge no-one tries to make it span the grand canyon. They see it for what it is - its faults are obvious. They try to cross my toy bridge and it easily collapses under the weight of their foot without costing the lives of anyone. And, if they somehow managed to string along my toy to the point where it could span the grand canyon, do you think anyone in their right mind would find me liable as the toy maker?

Ultimately, this is where the differences are huge. People are building toy bridges, pet bridges, foot bridges, bridges for single cars and bridges for 10 lanes of traffic. It's very difficult to accidentally substitute a bridge built for 1 person into a spot where you need a bridge for 10 lanes of traffic.

But it's trivial in software to put a "toy bridge" or "pet bridge" where you need something like the golden gate bridge.

So, when someone takes my toy and tries to use it as the golden gate bridge, who's at fault?


Good points. But the reason why software being sold should have a higher standard of liability is because in the case of open source software, that's generally just posted on some website and anyone can download it, modify it, and run it on any platform. The authors don't have any control over what anyone does with it, and never received any money for it in exchange for their liability. With many open source projects, there's a feeling that it was created for the author to use themselves and anyone else being able to see the source and possibly get some benefit from it is just a bonus.

With a bunch of random people just making a bridge that collapses, then unless it's on their own property, they know that other people will use it and expect it to not collapse and so liability is justified there. (If it's on their own property, then morally there shouldn't be any liability unless they invited someone to use it, but legally is another matter.)

If this was ANY product other than software, there wouldn't even be a question of Valve being liable, and that's just disgusting to me.


Can anyone design a safe and reliable bridge with a span of indeterminate length across unpredictable geography?


This is precisely why I hate comparing building software to building bridges. Nobody builds bridges without knowing exactly where that bridge will be standing, but it happens all the time with software - furthermore same piece of software usually must fit all kinds of "unpredictable geography". As such, yes, I do think that building software is a problem far more difficult than building bridges.


Which leads back to the number one reason why software projects go bad; not knowing what you're meant to be building because you didn't get clear requirements. The number of times the requirements gathering has ended up with questions genuinely being answered "we don't know what we want" makes me want to hurt myself.


There's just no sense in telling a hobbyist what to do with his personal time. That means it's no longer a hobby, it's a charity exercise.

I think the problem is EULAs. FOSS EULAs are pretty straightforward - no warranty: use at your own risk. Corporate software takes the same approach, and I think that's cowardly and underhanded because they are a profit-driven entity which, if not incentivized properly, would literally rob you blind.

Corporations are profit driven. No profit, no corporation.

If you're going to profit off of selling me something which then burns my house down because it was cheaper for the vendor to not build to standards (best practices), shouldn't they be held liable?

If someone made a lamp for themselves that wasn't up to standard, and then left it out on the curb for garbage pickup when you came along and took it home, plugged it in, and burned down your house, whose fault is that? They didn't sell it to you. There was no purchase.


Your analogy is flawed. What is happening instead is someone making a lamp, and then putting a sign on it that says "LAMP is a lamp that lights your home! It's compatible with electricity and is under no warranty, express or implied!" and offering to let you take it home.

And you seem to have missed the part where I explicitly said that both commercial and noncommercial software should be exposed to liability lawsuits. EULAs are a waste of everyone's time: anything distributed in an executable format should expose the distributor to liability suits for damages.


No, what is happening is someone making the instructions on how to create a lamp public and saying "I use this lamp in a controlled environment where using it cannot hurt anyone. If you try to use it outside of a similar controlled environment it may explode in your face, create a black hole and/or annihilate the universe. I don't really know, since I did not test it in that environment. Use it at your own risk." Then people go and use it outside that controlled environment, maybe making other people pay for using the lamp created following the instructions.

And now the creator of the instructions is liable for damages? Wait, what?


There is a real and unavoidable distinction between throwing stuff up on github and distributing software packaged for easy insertion into your operating system. If it's so incredibly difficult to test, why provide binaries?


Hm, that's fair.

For what it's worth, I didn't miss that part, I just disagree with it.

But let's explore the idea that anything distributed in an executable format should expose the distributor to liability.

How do you define executable? I mean, what about languages which are optionally compiled, or python or ruby packages for example? And when you say distributor, do you mean distributor or do you mean author? Eg, is Github responsible because they hosted/distributed my code?


I think that would impact the viability of open-source software in a way that would change the whole world for the worse.

By a lot.

No one sane would put a pet project out in the open, and we'd all have far fewer kernels from which great things could grow.

(Unless you're specifically talking about closed-source software, in which case I just might largely agree with you.)


I don't know. I think the main problem with the parent idea is '...and lose'. Perhaps they should be sued (presumably as a class-action), but whether or not they lose is TBD. The parent wants them to be sued only so that they necessarily lose.


What are you whining about? Why would anyone desire for a company to be sued and have them win?


>Programmers have worked very hard through the years to create the current liability-free environment; people die because of health care programming bugs, pilots crash because of avionics bugs, and people lose fortunes -- or welfare checks -- because of finance programming errors, but programmers just throw up their hands, and say that programming is hard.

Yeah, we've all noticed the rampant mass deaths of millions because of software issues...

/sarcasm


So you're telling us that accidental death is okay as long as it isn't in the millions? How did you decide on that number? Where is the line drawn?


>So you're telling us that accidental death is okay as long as it isn't in the millions?

Yes, of course. We take such decisions everyday. Cars lead to deaths from traffic accidents, and still we are ok with it as a necessary evil, since we want the benefits they provide.

If we restricted technology to "things that can cause absolutely no deaths" we would never even have used fire (tons of people die every year from it). Heck, houses too can collapse in an earthquake etc -- we should make sure no one builds one until it's a 100% safe house (that costs some millions to build).

But my sarcasm was directed at the parent blowing this "deaths due to software" thing out of proportion, and making it sound like someone dies every minute from a buffer overflow.


   rm -rf "$STEAMROOT/"*
This is why serious bash scripts use

   set -u # trap uses of unset variables
Won't help with deliberately blank ones, of course.

Scripting languages in which all variables are defined if you so much as breathe their names are such a scourge ...

I did this once in a build script. It wiped out all of /usr/lib. Of course, it was running as root! That machine was saved by a sysadmin who had a similar installation; we copied some libs from his machine and everything was cool.


While you're at it:

    set -e # exit on unchecked failure
That way you don't trudge forward through an untested code path after a failure, you stop there and then.


I use this holy trinity in most of my scripts:

set -o nounset # set -u

set -o errexit # set -e

set -o pipefail # I'm not sure this has a short form.

I use the long forms for the option names because they're self documenting.

If anyone has more suggestions, I'm all ears.


David A. Wheeler recommends setting IFS=$'\n\t' to handle spaces in filenames & variables.

http://www.dwheeler.com/essays/filenames-in-shell.html
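
Putting this subthread together, the prologue ends up looking something like this (bash; the IFS line is Wheeler's suggestion from the linked essay):

    #!/bin/bash
    set -o nounset    # set -u: error on use of unset variables
    set -o errexit    # set -e: stop on unchecked failures
    set -o pipefail   # a failing stage fails the whole pipeline
    IFS=$'\n\t'       # word-split on newlines and tabs only, not spaces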


All those rules add up to: "avoid writing anything complicated or general purpose in the shell language that is intended to be used by any user over any set of files; stick to handling known input materials which are closely associated with the script."


Are these bash-exclusive or do they also work on posix-like sh?


I doubt it. set -e is POSIX, I think, but the rest are probably bash extensions.

I'll probably get flak for this, but for 99% of the scripts, at this point I'd just use bash. Linux has it as a default, as does Mac OS, anyone running BSDs can install it trivially, and Unixes probably have it as well since most of them have antiquated CLI environments and administrators usually install GNU utils to have a more human working environment.


See here:

http://pubs.opengroup.org/onlinepubs/009695399/utilities/set...

"-u The shell shall write a message to standard error when it tries to expand a variable that is not set and immediately exit. An interactive shell shall not exit."


That one I almost always use, but it's not very relevant to this specific case. If you add it to an existing script, there is usually work to be done because up until now, the script might have been working properly only because it persisted through some failing commands.


Beware, this doesn't always work. For instance, inside a function called by a pipeline that is part of an if condition.


    rm --preserve-root -rf "$STEAMROOT/"*
would have worked too.


Really? Have you tried it? If your shell expands globs and you have any directories below / (both very common conditions), rm will not receive any arguments that are the root directory, and your --preserve-root will not alter its behavior. --preserve-root would help if the author of this particular nugget hadn't made the additional mistake of inserting a needless /*, but with that there the root directory would never be an argument to the rm.


Another beautiful example of developer/user priority inversion.

All system architects ever:

1) System data are sacred, we must build a secure system of privileges disallowing anyone to touch them

2) User data are completely disposable, any user's program can delete anything.

All users ever:

1) What? I can reinstall that thing in 20 minutes, there's like 100 million copies worldwide of these files.

2) These are my unique and precious files worth years of work, no one can touch them without my permission!


Many of the comments mentioned this should have been caught in the code review. I suspect they don't perform code reviews.

Makes me wonder, is there a tool, system, or service for auditing how many 'pairs of eyes' have reviewed a given line of code? This would be hard to determine, but could be useful. I am envisioning a heatmap bar or overlay that indicates the number of reviews a line of code has received.


It is trivial to find systems where you can assign specific people to review code, of course. Plenty exist. But to guarantee that they truly read a line of code, rather than skimmed/scrolled through it? Even if they paused their scrolling on that line, it doesn't mean that they really read it, or did so with proper understanding of what it was doing. Short of placing an actual comment or question for that line (demonstrating some real interaction), it doesn't seem like an easy problem to tackle.


I think there's merit in the idea -- not tracking what a code reviewer reads but more of tracking how many times each line of code has been _included_ in a review, by being modified or maybe within so many lines of a change (like how diffs show X lines of surrounding context). The idea would be that code changes _near_ a buggy line would be more likely to draw attention to that bug and perhaps lines with less attention would be more likely to facilitate hidden bugs.


I suspect it would work out the other way round - code in a frequently modified section would be more likely to hold bugs because the requirements/understanding of that code is changing more frequently.


I would use that. For unpopular open source projects. Let me review and comment on other people's code, and make them review mine. Some rating is probably needed. Very nice idea.


>I suspect they don't perform code reviews.

It certainly seems that way. CSGO (their FPS) is notorious for updates that break things, like very recently a gun having less ammo than it should, or masks (halloween thing, I think) being rendered through smoke (a crucial feature of the game).


That's not a lack of code review, that's a lack of QA.


For those of you worried about important files, chattr +i is a useful defence. No easy way of applying this automatically.

Long ago I had a kernel hack that would kill any process that attempted to delete a canary file. Worked OK but no chance of it ever going mainstream.
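
For the record, that's (Linux ext* attributes; the path is just an example):

    sudo chattr +i ~/Documents/thesis.tex   # immutable: even a root rm -rf is refused
    lsattr ~/Documents/thesis.tex           # the 'i' flag shows it's protected
    sudo chattr -i ~/Documents/thesis.tex   # lift it when you genuinely need to change the file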


Reminds me of a shell trick I saw many years ago for short circuiting accidental 'rm -rf's by issuing a 'touch -- -i' in a sensitive location. In bash (and others), the glob operator inadvertently feeds the '-i' (now a file) into rm as an argument which then interprets it as its "interactive" flag, causing it to prompt for continued removal.
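
A quick illustration (the directory is arbitrary; as the reply notes, this only helps the glob forms):

    cd ~/precious && touch -- -i
    # an accidental "rm -rf *" now expands to something like: rm -rf -i file1 file2 ...
    # and since a later -i overrides an earlier -f, rm prompts instead of silently deleting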


Which only helps "rm -rf *". It does nothing for "rm -rf /anything" or "rm -rf /anything/" or "rm -rf /*" or any other way of spelling doom.


Which is why I mentioned the glob operator! Speaking of which, if you did touch '/-i', it would catch

  rm -rf /*
Ironically, I was thinking of adding that specific directories would not be caught (obviously), but I figured that would be understood implicitly since most people ought to know what <asterisk> actually does in the shell. And if they don't...

Edit: I just noticed that the glob operator in my first comment didn't show up, because it was eaten by markdown. Incidentally, so was the asterisk in your post! That might be the source of your confusion. In that case, I should specify such a trick only works with:

  rm -rf *
  rm -rf /*
  rm -rf ~/*
Or similar. Not specific files. But, again, I appeal to the importance of understanding what the glob operator actually does!

As an aside, the context of this post is a mistake in steam.sh which may essentially do:

  rm -rf /*
So, the discussion implicitly has nothing to do with exact paths. :)


Nice. Although if I forgot I did this, I suspect that I will have a difficult time figuring out why I cannot delete the file.


The biggest lesson here is that backing up your files is extremely important. Both local backups and remote backups.

I like the 3-2-1 rule:

  At least three copies,
  In two different formats,
  with one of those copies off-site.
Software is written by humans, who will undoubtedly miss a corner case and not think of every possible environment.


You're too kind :)

Granted, part of the blame lies in the archaic Unix security model which doesn't sandbox applications. But ANY line containing "rm -rf" should be reviewed by the most senior dev in the company, or at least one who actually understands shell scripting. It has such a terrible failure mode, there's no excuse not to. (Especially when the dev to blame knew that it's "Scary!".)



Are you trying to make me have an aneurism? :)


Always live by the saying:

"If it doesn't exist in at least three places, it doesn't actually exist"

- Allan Jude - TechSnap


>In two different formats

What does this mean?


Probably different media, e.g. an external hard drive and DVDs.


Probably means an onsite and an offsite backup for example.


It should be noted, as listed in that issue thread, this is apparently also present on Windows (same bug in two different shell scripts!).


This is a danger of me-ware being used before it becomes software. Software assumes and defends against others using and running it. Me-ware makes no assumptions, because "me" is the only one running it.

The transition from me-ware to software is a hard one - and usually it's how we get terrible reputations as an industry. Basically it's a prototype till it's burned enough beta users.

Edit: OMG - that is actually Steam from Valve - I take it back - this is supposed to be software.


Though it turned out to be irrelevant, I really enjoyed your spiel about "me-ware". The distinction between it and finished software is too easy to forget.


I'm pretty sure I stole it off the guy who wrote Source Vault SCM, but his name escapes me.


    # Scary!
    rm -rf "$STEAMROOT/"*
Anybody who writes a line like this deserves their software engineer license revoked. This isn't the first time I've seen shit like this (I've seen it in scripts expected to be run as root, no less); it makes my blood boil.

Seriously. "xargs rm -df -- < MANIFEST" is not that hard.
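Roughly along these lines (a sketch only; GNU find/xargs/rm assumed, manifest path made up):

  # at install/update time: record exactly what was written, children before parents
  find "$STEAMROOT" -depth -print > "$HOME/.steam_manifest"

  # at uninstall time: remove only what the manifest lists (-d drops the now-empty dirs)
  xargs -d '\n' rm -df -- < "$HOME/.steam_manifest"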

EDIT: I shouldn't be so harsh, if it weren't for the comment admitting knowing how poor an idea this line is.


The programmer committed a cardinal sin to be sure. But, so did everyone on the code review that let it slide.


You're right, but I don't retract my statement, because we know the dev knew how dangerous this line was thanks to his "Scary!" comment.


I don't wholeheartedly disagree with you, but I think it's grayer than that.

Maybe there were time constraints. Maybe the coder explained to his boss that this code was dangerous but wasn't granted the time to fix it.

I think we agree that said code should never have been written, but there are any number of circumstances that place the blame squarely on management. If he explained the dangers of doing it that way but wasn't granted time to fix it (or no manifest was kept), there's little that johnny coder can do outside of their own time.

None of us write perfect code the first time, and we all had to start somewhere. What's important is how far you've come and what you've learned. I think.


You are right. Consider my original assertion (which is past its edit window) amended to read "engineer (or manager who prevented an engineer from fixing)".


Do we know if they do code review? Are pull requests blindly accepted?


If Valve doesn't do code reviews, and if pull requests are actually blindly accepted (is that actually a thing? Do people ever actually blindly accept pull requests?), I don't think I have the wherewithal to give a useful response.


>is that actually a thing? Do people ever actually blindly accept pull requests?

Speaking only for myself, I've seen it with internal git repos at companies that have some name recognition. Obviously they'd never want it known that this happens, but it does.

So why does it happen? In the cases I saw, it was usually when people responsible for code reviews had "other things on their plate", and rather than hand off the review to someone else, they just merged it. During heavy pushes to get a build out the door, I saw a lot of this happening. Since I never said anything, I suppose I was part of the problem.


Absolutely. Arguably, that's the main point of the code review -- you're sharing responsibility. If your team scapegoats, it's not going to be a team for very long.


Is there proof that the comment was committed with the line of code?

Of course, once it was identified, a concerted effort could have been made to rectify the situation.

Sometimes a problem is identified working on some unrelated aspect of the software and the developer does not have time or scope to change the offending piece of code but wants to place a red flag.

Of course this example does not appear to be a systematic approach to marking "fixme", unless of course their fixme tag is actually #scary.

Edit: Of course one would hope they have raised a rather high priority work item for this.


There's nothing wrong with that line as long as you know "$STEAMROOT" contains a directory that you wish to nuke.

The issue is that this particular script did not set up $STEAMROOT correctly.


The thing is, there are so many ways for STEAMROOT not to be set correctly in Bash. Just one typo in some future edit can bork everything if it's not tested thoroughly.

Sure, you could somehow check that STEAMROOT is set to something resembling what you want to delete. But manifests are much simpler to get right.

EDIT: Alternatively, pick a well-known UUID, and put everything under a directory with that name under STEAMROOT. Then, to remove:

  rm --one-file-system -rf "$STEAMROOT/d0a8936d-1faf-4b82-aac9-e5f104432b24"
This won't do the wrong thing, as it will require that unique string to be present in the pathname. (--one-file-system for good measure.)

(Don't put the UUID in a variable, or you're back to square one of having a potential bug!)


I'm always annoyed when I see UUIDs in my directory structure, it's the same feeling I get when I see a big mac wrapper littered in my yard, or MSOCache in Windows system root for that matter.


If you can't write a shell script and be sure your variables are set, then don't write it in shell.

A MANIFEST is not the right answer. Steam deals with any number of third-party games which may drop any number of files into these directories.

The best way to nuke a directory is 'rm -rf "$DIRECTORY"'. Any amount of "well what if $DIRECTORY points to the wrong place" has nothing to do with the removal operation itself.


How about using a better language? Shell scripting is an awful, awful language. An error like this wouldn't have happened if the program had been written in C or Python or Perl or whatever your choice might be.

Shell scripting seems tremendously overused. It makes some things a bit easier, but it's so crazy it makes PHP look like a pinnacle of good language design.


Using C or Python or Perl does not automatically keep people from failing to check whether some code that reaches out into the environment to prepare for later action actually succeeded.


I don't see any way to accidentally write code in C or Python (Perl may be a different beast as a sibling comment indicates) that deletes the user's home directory if an environment variable is unset. These languages don't keep you from failing to check, but they fail much better. An unset environment variable without a check means you'll probably crash, whereas with a shell script you just keep on going, with bad data.


It tends to fail much more noisily if you try to use an uninitialized variable - C will probably segfault (or end up removing a garbage string of probably-unprintable characters), Python and Perl will throw exceptions.


I don't disagree with you. (Although Perl exhibits exactly the same issue.) Unfortunately bash is the lowest common denominator on Linux and is often chosen on that basis.


If you want to blow away a directory, do this just to avoid wildcard games:

  rm -rf "$the_dir" && mkdir "$the_dir"
Yes, it's technically slower, but this is shell.


Should mkdir be first (so that it short-circuits if $the_dir is unset)? I haven't done anything in bash in a while, but that seems to be how it works on my computer (Debian Wheezy, bash 4.2.37).


The original broken code was using a wildcard to clear a directory. I proposed a safer way of doing that: delete the directory itself then recreate it.


[deleted]


No, I'm a developer who pays attention to his code, knows what he doesn't know, double-checks what he thinks he knows, and isn't too lazy to ask for help when he isn't sure about something.

Not to mention I've worked with someone before who wrote code exactly like this, despite my protestation, and after he managed to delete half the (thankfully backed-up) file share in an unrelated incident.


Well, I'm on the guy's side, but this is worth noting:

> Including my 3tb external drive I back everything up to that was mounted under /media.

Maybe it was just unfortunate and the drive happened to be mounted, or maybe the "backup" is always online. If it's the latter, that's a really bad idea: if your computer is compromised, you risk all of your backups. A proper backup should protect your data in exactly these situations.


Wow, an awful bug -- and brings back memories of a very similar bug that we had back in the late 1990s at Sun. Operating system patches on Solaris were added with a program called patchadd(1M), which, as it turns out, was actually a horrific shell script, and had a line that did this:

  rm -rf $1/$2
Under certain kinds of bad input, the function that had this line would be called without any arguments -- and this (like the bug here) would become "rm -rf /".

This horrible, horrible bug lay in wait, until one day the compiler group shipped a patch that looked, felt and smelled like an OS patch that one would add with patchadd(1M) -- but it was in fact a tarball that needed to be applied with tar(1). One of the first systems administrators to download this patch (naturally) tried to apply it with patchadd(1M), and fell into the error case above. She had applied this on her local workstation before attempting it anywhere else, and as her machine started to rumble, she naturally assumed that the patch was busily being applied, and stepped away for a cup of coffee. You can only imagine the feeling that she must have had when she returned to a system to find that patchadd(1M) was complaining about not being able to remove certain device nodes and, most peculiarly, not being able to remove remote filesystems (!). Yes, "rm -rf /" will destroy your entire network if you let it -- and you can only imagine the administrator's reaction as it dawned on her that this was blowing away her system.

Back at Sun, we were obviously horrified to hear of this. We fixed the bug (though the engineer who introduced it did try for about a second and a half to defend it), and then had a broader discussion: why the hell does the system allow itself to be blown away with "rm -rf /"?! A self-destruct button really doesn't make sense, especially when it could so easily be mistakenly pressed by a shell script.

So we resolved to make "rm -rf /" error out, and we were getting the wheels turning on this when our representative to the standards bodies got wind of our effort. He pointed out that we couldn't simply do this -- that if the user asked for a recursive remove of the root directory, that's what we had to do. It's a tribute to the engineer who picked this up that he refused to be daunted by this, and he read the standard very closely. The standard says a few key things:

1. If an rm(1) implies the removal of multiple files, the order of that removal is undefined

2. If an rm(1) implies the removal of multiple files, and a removal of one of those files fails, the behavior with respect to the other files is undefined (that is, maybe they're removed, maybe they're not -- the whole command fails).

3. It's always illegal to remove the current directory.

You might be able to imagine where we went with this: because "rm -rf /" always implies a removal of the current directory which will always fail, we "defined" our implementation to attempt this removal "first" and fail the entire operation if (when) it "failed".

The net of it is that "rm -rf /" fails explicitly on Solaris and its modern derivatives (illumos, SmartOS, OmniOS, etc.):

  # uname -a
  SunOS headnode 5.11 joyent_20150113T200918Z i86pc i386 i86pc
  # rm -rf /
  rm of / is not allowed
May every OS everywhere make the same improvement!


>A self-destruct button really doesn't make sense

Bryan--you mean Star Trek didn't get it right? Well, at least they didn't allow shell scripts.

It's an interesting philosophical question though. At what point do you decide that the user truly really can't want to do this even though they've said that they do?


rm -rf is not a self-destruct button. It's just a "take everything out of the closets, rip off the labels, and throw it on a heap".

A real self-destruct button would ensure you couldn't recover the data anymore.

So at least try "dd if=/dev/random of=/dev/sda" or, more elegantly where available, throw away the decryption key.


GNU rm defaults to failing on /, but this behavior can be overridden with the --no-preserve-root flag.


Note to all: start all your bash scripts with

"set -o errexit -o nounset -o pipefail"

It'll save you headaches.
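For instance, with nounset in effect the infamous line dies on the unset variable instead of silently expanding "$STEAMROOT/" to "/" (shown purely to illustrate the failure mode; don't run it):

  #!/bin/bash
  set -o errexit -o nounset -o pipefail
  # STEAMROOT is never assigned here, just like in the buggy path through steam.sh
  rm -rf "$STEAMROOT/"*    # aborts with an "unbound variable" error before rm ever runs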


Can you at least explain what this is doing, outside of saving me headaches?


-o errexit: exit the script when a command fails

-o nounset: fail when referencing an unset variable

-o pipefail: fail when any command in a pipeline fails, not just the last one

The last option is unfortunately harder to use, since some programs misbehave in pipelines.



set -o errexit (set -e): exit script when command fails

set -o nounset (set -u): exit script when it tries to use undeclared variables

set -o pipefail: returns error from pipe `|` if any of the commands in the pipe fail (normally just returns an error if the last fails)


I use the shebang "#!/bin/bash -e".

To get the same effect as the "set -o errexit -o nounset", I think you can use "#!/bin/bash -e -u". (There seems to be no option for pipefail.)


The "shebang" treats everything after the binary as a single argument. It only does one argument from the shebang and the file itself as the final argument. So it would run that kinda like

    bash '-e -u' $file
But you can do

    #!/bin/bash -eu


This breaks if your script is sourced by another shell. Best use 'set -eu' at the start instead.


Yep, my thought exactly. My standard bash header is

    #!/bin/bash
    set -eu
    IFS=$'\n\t'
(Wish there was a shorthand version of pipefail, then I'd always use that too.)


You can save a line by doing #!/bin/bash -eu


Which is not equivalent if the script is executed using "bash script.sh" ;).


What is this magic doing?

   $(cd "${0%/*}")


$0 is the path of the script. ${0%/*} takes $0, deletes the smallest substring matching /* from the right (the filename) and returns the rest, which would be the directory of the script. So this changes the directory to the directory the script is located in.


$0 is not the path of the script. $0 is the path passed to the shell used to execute the script. If run via the #! line, it will contain a path of some kind. But if it's passed directly using "bash myscript.sh", then it won't.

And yes, dirname is a way out of this. I'd do this:

    "$(cd "$(dirname "$0")"; pwd)"
if I wanted the path to the script. I would also sanity-check the path by testing for the existence of some files or directories that are expected to exist under it, before trying to delete it all.
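Something like this, perhaps (a sketch; the marker file is just an example of something that should exist in a real install):

  STEAMROOT="$(cd "$(dirname "$0")" && pwd)"

  # refuse to do anything destructive unless this actually looks like a Steam install
  if [ -z "${STEAMROOT:-}" ] || [ ! -e "$STEAMROOT/steam.sh" ]; then
      echo "refusing to continue: \$STEAMROOT does not look like a Steam directory" >&2
      exit 1
  fi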


Is there any good reason to use this approach over the more readable 'dirname $0'? Not that it would have made any difference to the bug in question of course.


Per a subsequent comment in the github thread:

"the ${0%/*} pulls the path out of $0, cd's into the specified directory, then uses $PWD to figure out where that directory lives - and all this in a subshell, so we don't affect $PWD"

So for example, that would take /user/amputect/seam_setup.sh and change directories into /user/amputect.


Assuming the shell script is for Bash, that incantation removes the shortest matching suffix which matches the pattern "/*". So it is attempting to remove the filename component from a pathname string in the "0" positional parameter.

Since $0 is usually the name of the shellscript itself, this would be trying to obtain the directory path of the shellscript.
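A quick way to see it in action (the path is just an example; bash -c sets $0 to the argument after the command string):

  $ bash -c 'echo "${0%/*}"' /home/user/.local/share/steam/steam.sh
  /home/user/.local/share/steam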


It's a bashism. Open up "man bash" and search for "Remove matching suffix pattern".


which, regrettably, is not something you can do with `man` if you find `${0%/*}` in a script you're reading that someone else wrote.


No, but you can find it by searching for "%", because that's exactly what I did.


The sad, sad part of all this is that Half-Life 1 had a similar bug in its Windows installer and would wipe your Program Files if you weren't careful: http://arstechnica.com/civis/viewtopic.php?t=479484



One update to Eve Online had a broken script that removed "boot.ini" in some circumstances. Customers were not amused.


I don't want to blame anyone, but maybe there should be foolproof default security measures that prevent something like this from happening. For example, rm -rf called on a home, documents, music, or photos directory could require an additional confirmation, perhaps through a GUI.


There are: they're called users, groups, and file permissions. Applications like Steam should really be running under a separate user so they can't write to your personal files (or maybe just have read permission on them). But of course proper application isolation and file permissions are something few people set up correctly on their personal machines, let alone know about.

Window managers don't make it any easier, and I put a lot of the blame on them for not making it easy to configure applications to start under different users.
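A rough sketch of what that could look like (account name made up; assumes a system-wide steam binary and an X session):

  sudo useradd -m steamuser                        # throwaway account just for Steam
  xhost +si:localuser:steamuser                    # let that user connect to your X display
  sudo -u steamuser env DISPLAY="$DISPLAY" steam   # a bug like this now only hits ~steamuser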


Steam shouldn't run as its own user. It's a user-level process, not a system process. It needs to have user-specific things (install directory, save games, etc.) that need to be accessible to the person using it. Separating processes into users is only one method of sandboxing, and not appropriate in this case. Sandboxing via mechanisms like SELinux is the correct solution.

One of the users in the Github thread even mentions how SELinux prevented the same thing from happening on his machine.


Yes you are of course correct about SELinux!

I actually separated Steam into a sandboxed "steam" user account. But maybe that's because I learned Unix on BSD and never picked up SELinux or how to use it (and it isn't obvious from a desktop user-accounts perspective). I should probably check it out.


SELinux seems like a case of hunting tweetie birds with 88s...


should get better once systemd has steam integration


It seems like the direction Linux is going (albeit slowly) is to use selinux instead of different users for this type of isolation.


Yes, however Linux supports this sort of security right now, has for many, many years, and properly used it would have prevented these mishaps. More than for not backing up their data, I blame the users for being incompetent users of computers in general.


Linux supports SELinux right now, the only issue is that applications ship with policy about as often as they create their own user (actually, maybe a bit more often).


I'm saying that people can set up this policy themselves. I'm not saying steam should do it, I'm saying anyone can do it for any application they install.


It reminds me of the Bumblebee Project bug: https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/issue...

Spoiler: due to a forgotten space the entire /usr folder was deleted


I've done something like this before, with my own build scripts. Except I was running as root (a requirement for some parts of the build).

Part of the scripts installed a bunch of files into what was supposed to be a fakeroot, however I did not have bash's 'set -u' configured and an incorrectly spelled path variable was null, meaning something like: "${FAKEROOT}/etc" was translated into "/etc". Before I realized it, it had clobbered most of my /etc directory.

When the build failed, I was puzzled. I only noticed there was an issue when I opened a new shell and instead of seeing "myuser@host ~]#" I got "noname@unknown ~]#". Uh oh...

Needless to say, I now do my development of those scripts from within a VM.
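In hindsight, besides 'set -u', bash's ${VAR:?} expansion at the offending line would also have caught it (a sketch; file names made up):

  # aborts with an error if FAKEROOT is unset or empty, instead of letting
  # "${FAKEROOT}/etc" silently collapse to "/etc"
  install -D -m 644 myapp.conf "${FAKEROOT:?FAKEROOT is not set}/etc/myapp.conf"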


Since this kind of thing keeps happening, isn't there a need for a safer tool than shell scripts? Maybe with a little bit more safety around null/empty variables and not as stringly typed?


Tools that are safer than shell scripts exist; that doesn't mean end users have them installed, and it doesn't mean people will choose to use them either.


Python?


Reminds me of a bug with the Myth II uninstaller immortalized by the following Penny Arcade cartoon:

http://www.penny-arcade.com/comic/1999/01/06

Basically, they called delete-tree. If the user installed or moved the game to a different location, say, the root, it would delete-tree away somebody's whole computer. Fun times.


Just wanted to remind everyone that deleted files can be recovered until their space is reclaimed for something else.

So if you notice something like this happening, shut down the computer ASAP so the files can't be overwritten. Plug the drive into another computer but do not mount it; instead, run a file-recovery program on it.

For an SSD it becomes murkier though, what with their trimming and automatic garbage collection.


Yeah if you have time to manually guess the filenames on half a million files. Great.


$ extundelete /dev/sda{n} --restore-all


We all make mistakes when coding. However, knowing an engineer at Valve did this in a way makes me feel a little bit better about my abilities as a software engineer. At the end of the day we are all human, and it makes something like working at Valve/Google/Big Name Corp seem a little less daunting.


When are we moving to the mobile security model on the desktop? I love knowing that nothing I install on my phone can ever do anything like this or access data that belongs to another program in general.


Pretty sure shellcheck ( http://www.shellcheck.net/ ) has a lot to say about this faulty script...


this is clearly programmer error, but it's multiplied by a random variable: the Degree of Misery of the programming language (in this case, bash).


This is an amateur mistake. Makes me wonder what horrors lurk inside Steam itself...


Thanks for reminding me to unmount my backup drive.


Steam in docker, anyone?


Why are we still running bash scripts? The only thing worse might be DOS batch files.

It's not like python isn't available on every major linux distro. It's a little harder to ensure it's on windows, but when Steam is installing every point release of the Visual C++ runtime that has ever existed on my system, why not bundle python in there too?


> Why are we still running bash scripts?

Because admins gotta admin. Also, adding python to a software suite just to work on the file system... gross.


And even if it isn't, modern perl would do as well as python for this case, with no language wars required.


Exactly. There are better options that have more sane string and path handling. Much as I grumble about using PowerShell on Windows, at least you've got the .NET framework underneath you, which you can drop into relatively easily.


Horrible life lesson. Saved by my own laziness.


This could be a quick hack; maybe the dev just forgot to put it on the issue list.


I've written a short guide on how you should safeguard `rm`: https://github.com/sindresorhus/guides/blob/master/how-not-t...


If you're curious about who the "Scary!" guy is, someone pointed to where the code was checked in: https://github.com/lrusak/steam_latest/commit/21cc14158c171f...


That looks like an unofficial repo. I don't think the guy who committed that actually wrote it.


some hilarious comments there now



