
One example from the document:

> Consider the naive attempt to clean out the /tmp directory.

> cd /tmp

> foreach f [glob *] {file delete -force $f}

> A file ~ or ~user maliciously placed in /tmp will have rather unfortunate consequences.






i once managed to create a directory named ~ using the mirror tool written in perl. then i naively tried to remove it using "rm -r ~" and started wondering why removing an empty directory would take so long, until it dawned on me...

i learned a few new habits since then. i almost never use rm -r and i avoid "*" as a glob by itself. instead i always try to qualify "*" with a path, remove the files first with "rm dir/*", and then remove the empty directory with "rmdir dir/".

if i do want to use rm -r, it is with a long path. e.g. in order to remove stuff in the current directory i may explicitly add a path: "rm -r ../currentdir/*" instead of "rm -r *"
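A minimal sketch of why the qualified path matters for a directory named ~, assuming a POSIX-style shell and a throwaway directory from `mktemp -d`. The variable names (`scratch`, `literal_name`, `tilde_gone`) are made up for the illustration. An unquoted `~` at the start of a word expands to the home directory, while `./~` does not start with a tilde and stays literal.

```sh
# Scratch directory so the experiment cannot touch anything real.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

mkdir './~'            # quoted: creates a directory literally named ~

# './~' does not begin with ~, so the shell leaves it alone.
literal_name=$(printf '%s\n' ./~)

rmdir ./~              # removes only the literal directory, never $HOME
if [ -e './~' ]; then tilde_gone=no; else tilde_gone=yes; fi

cd / && rmdir "$scratch"
```

Using `rmdir` instead of `rm -r` here follows the habit described above: it fails loudly if the target turns out not to be empty.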

related, i also usually run "rm -i", but most importantly, i disable any alias that makes "rm -i" the default, because in order to override the -i you need to use -f, but "rm -i -f" is NOT the same thing as "rm". rm has three levels of safety: "rm -i; rm; rm -f". if "rm -i" is the default the "rm" level gets disabled, because "rm -i -f" is the same as "rm -f"
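The claim that "rm -i -f" behaves like "rm -f" can be checked in a scratch directory. This is only a sketch: it assumes an rm that follows the POSIX rule that a later -f overrides an earlier -i, and it feeds the prompt from /dev/null so that "rm -i" gets no answer. The variable names are invented for the example.

```sh
scratch=$(mktemp -d) || exit 1
touch "$scratch/one" "$scratch/two"

# "rm -i": the prompt reads end-of-file from /dev/null, which counts as "no".
rm -i "$scratch/one" </dev/null 2>/dev/null || :
if [ -e "$scratch/one" ]; then kept_by_i=yes; else kept_by_i=no; fi

# "rm -i -f": the later -f cancels -i, so there is no prompt at all.
rm -i -f "$scratch/two" </dev/null
if [ -e "$scratch/two" ]; then kept_by_if=yes; else kept_by_if=no; fi

# Clean up the leftover file and the scratch directory.
rm -f "$scratch/one" && rmdir "$scratch"
```

The first file survives and the second does not, which is the collapse of the middle safety level described above.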


My main safety habit is to avoid slashless paths.

Bad:

    rm *
Okay:

    rm ./*
    rm /tmp/d/*
    rm */deadmeat
    rm d/*
Then again, I commonly use dangerous things like `mv somefile{,.away}` that are easy to get wrong, so maybe don't trust my advice too much.
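A small sketch of what the slash buys you, assuming a POSIX-style shell and a scratch directory from `mktemp -d` (the variable names are hypothetical). A bare `*` can expand to a name that starts with a dash, which the command then reads as an option. With `./*` every expanded word starts with `./`, so it can only be a file name.

```sh
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch ./-rf            # a file whose name looks like an option

bare=$(printf '%s\n' *)        # the glob expands to: -rf
prefixed=$(printf '%s\n' ./*)  # the glob expands to: ./-rf

rm ./-rf               # the ./ prefix makes rm treat it as a file name
cd / && rmdir "$scratch"
```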

  rm -rf "$TSTDIR"/etc
is pretty dangerous when you forget to set the env var

Fair! Upvoted.

I guess I'm not likely to type that into the shell, or if I do, I then tab-complete to expand it.

I could definitely see myself using that in a shell script, though. I tend to do validity checks there:

    if ! [ -d "$TSTDIR" ]; then echo "$TSTDIR not found, stupid" >&2; exit 1; fi
but that's kind of irrelevant, since if I need it to exist then I won't be removing it. Plus, I could totally see myself doing

    if [ -d "$TESTDIR" ]; then
      rm -rf "$TSTDIR"/etc
    fi

In bash, `set -u` or `"${TSTDIR:?Error: TSTDIR is required.}/etc"` protects from such errors.
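A hedged sketch of the `${VAR:?}` guard in action. The helper name `target_path` and the example value `/tmp/testroot` are made up for the illustration, and the helper only prints the path instead of removing anything. Without the guard, an unset TSTDIR would turn `"$TSTDIR"/etc` into `/etc`.

```sh
# Hypothetical helper: prints the path that would be removed.
# The ( ) body runs in a subshell, so a failed ${...:?} expansion
# stops the helper without terminating the calling script.
target_path() (
    printf '%s\n' "${TSTDIR:?TSTDIR is required}/etc"
)

TSTDIR=/tmp/testroot
target_path                      # prints /tmp/testroot/etc

unset TSTDIR
if ! target_path 2>/dev/null; then
    echo "refusing to build a path from an empty TSTDIR"
fi
```

The `:?` form also rejects a variable that is set but empty, which `set -u` on its own does not catch.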

My safety technique is to echo the commands before I do the actual commands as a sanity check, e.g.

    for i in $(find something); do echo "rm -f $i"; done

(bash example as my TCL is rusty)


Change your do block to `printf %q\  rm -f "$i" ; echo` and it won't lie about spaces. In case HN has "trimmed" my post in some way, as it often does, that's: percent q backslash space space. Works in bash/zsh, but not dash, probably not whatever your sh is. Can make a function of it trivially, but you have to handle the $# -eq 0 case, return whatever printf returns, etc.
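One way to package the echo-first habit is a tiny wrapper. This is a sketch only: the function name `maybe` and the `DRY_RUN` variable are invented here, and it brackets each argument instead of using `%q` so that it also runs in shells without that format. The brackets make embedded spaces visible, though the output is not meant to be pasted back into a shell.

```sh
# Hypothetical wrapper: with DRY_RUN=1 (the default) it only prints
# the command, one [bracketed] word per argument; with DRY_RUN=0 it
# runs the command unchanged.
maybe() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'would run:'
        printf ' [%s]' "$@"
        printf '\n'
    else
        "$@"
    fi
}

DRY_RUN=1
maybe rm -f "file with spaces.txt"
# prints: would run: [rm] [-f] [file with spaces.txt]
```

Defaulting to the dry run means a forgotten variable errs on the side of printing instead of deleting.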

When deleting, if it is more than a few specifically named files I will use a "find ... -delete" invocation.

I like it for two reasons. Find feels like it has more solidly defined patterns and recursion than shell globbing, and by leaving off the "-delete" it gives me a chance to inspect the results before committing to my actions.
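The inspect-then-delete flow might look like the sketch below. It assumes a find that supports `-delete` (GNU and BSD find do) and works entirely inside a scratch directory; the file names and the `remaining` variable are made up for the example.

```sh
scratch=$(mktemp -d) || exit 1
touch "$scratch/a.tmp" "$scratch/b c.tmp" "$scratch/keep.txt"

# Step 1: list what matches, without deleting anything.
find "$scratch" -type f -name '*.tmp' -print

# Step 2: the same expression with -delete appended at the end.
find "$scratch" -type f -name '*.tmp' -delete

remaining=$(ls "$scratch")       # only keep.txt should be left

# ${scratch:?} refuses to expand if the variable is somehow empty.
rm -r -- "${scratch:?}"
```

Keeping `-delete` as the last term matters, since find evaluates its expression left to right and a leading `-delete` would apply before the filters.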


Without testing, I wonder if find follows symlinks. I’m pretty sure rm doesn’t.

Edit: Just checked and find doesn’t by default.


Very cool of you to post this. Too many people won't post stories like this, but I've done very similar multiple times. I think it definitely helps reinforce proper habits, and is the best way to cut your teeth on technology. It's also great for anyone new to read something like this, and be able to avoid something so devastating, and maybe make lesser mistakes, but still learn from both!

when you get to be as old as i am, these are the war stories you share with your kids and grandkids around the campfire ;-)

like the one from my colleague who once fat fingered fsck into mkfs and i lost my personal homepage because of it. what makes me uncomfortable about that story is that it was not my fault. if it were it would have been easier to tell. but at the time i was quite frustrated and my colleague felt that despite me trying not to get angry at him. i still feel really bad about my reaction then, adding to his predicament, since he had to live with the guilt about losing our website and everyone's personal home directory. it's bad when a mistake causes you to lose something personal, but so much worse when you lose someone else's stuff.

talking about mistakes is how we learn from them. the important part is not to get embarrassed about them. however that requires an environment where we are not blaming each other when something goes wrong.

i could have made that mistake myself. and i applied this lesson to my own learning as if i had.

blessed be the pessimist, for he hath made backups...


Absolutely, the worst mistakes are the biggest opportunities for learning. Destroying your own files is painful. Destroying EVERYONE'S files is a lesson that is more painful and something you will be more careful not to repeat.

I try to tell as many of these stories in person as possible, to let everyone know that unless you have dealt with this kind of accident, you're either in the 1% or due for your turn. I'd like to think that sharing these stories might at least give someone some pause before they go ahead and throw caution to the wind.


> if "rm -i" is the default the "rm" level gets disabled, because "rm -i -f" is the same as "rm -f"

You can use "\rm" to invoke the non-aliased version of the command. I made "rm -i" the default using an alias and occasionally use "\rm" to get the decreased safety level you described. I think it is more convenient that way.


I love zsh auto completion for this stuff. It automatically escapes really messed up paths like paths with new lines or emojis and crazy characters like that. It's really rare but I still intentionally practiced removing these things just so I can do it safely if it ever happens.

I've long fantasized about a tool I call "expect" that safeguards against crazy stuff like that.

It has a syntax of your expectations, functionally existing as a set of boundaries, and you can hook it to always run as a wrapper for some set of commands. It essentially stages the wrapped command and if none of the boundaries are violated it goes through. Otherwise it yells at you and you need to manually override it.

For instance, pretend I'm ok with mv being able to clobber except in some special directory, let's call it .bitcoin or whatever. (chattr can also solve this, it's just an example). The tool can be implemented relying on things like bpf or preload

Originally I wanted it as a SQL directive ... a way to safeguard a query against doing like `update table set field=value expect rows=1` where you meant to put in the where clause but instead blew away an entire column. I think this would be especially useful surfacing it in frameworks and ORMs some of which make these accidents a bit too easy.


When it comes to SQL, I will often write a SELECT with very explicit search (WHERE) criteria for this very reason. Then copying that statement, commenting the original, and pasting to change into an UPDATE or DELETE statement seems to be a technique that works well for me. The SELECT tells me exactly what I'm going to UPDATE or DELETE, and once I have that, changing the syntax is very minimal. In the case of an ORM, you might have to write a tool that only listens on LOCALHOST to run these statements first.

I always write the where first. It's kinda like thinking in RPN or postfix. I put the parts in out of order in a way that prioritizes the minimization of error.

But this is stupid. These are computers, we can make whatever we want. Executing a delete or update should, if one desires, not have to be database knifeplay.


I know what you mean, I do the same. I agree, but at the same time, it's difficult to start building in protections for the user. Where do you start and where do you stop? I have been forced to do the extreme to protect the user, and then you are asked why things are so difficult to use. I think to make something for someone that concentrates in the technology, as well as a beginner, means you've got to give up so much power (or create a secondary syntax/interface for both audiences). It would be nice to be able to set modes, but then it's going to be database specific unless it has proven itself to be useful across engines. Like most standardization, then you play syntax games between vendors. It would be nice to at least be able to write an UPDATE or DELETE statement with a leading character or keyword to display affected rows.

These are completely optional safeguards. As long as it's optional, I advocate for having as many of those as people can imagine.

I understand, but with how much of a change to the language? Such a change would take an enormous amount of time to make it into the ANSI/ISO SQL standard, and what database would start to implement it first, and which would hold out as long as possible?

I don't disagree that it's possible, but how do you get the syntax standardized at this point? Do you get various dialects, or an agreement between vendors? Look how slowly the standard moves; when do we get this to where it's usable in most popular RDBMSs?


The venn diagram of query support between SQL vendors is much closer to a flower than you think.

Just implement it for one and if it works, the others will add it


I have upvoted you for each comment you've made, but I feel like it's not that simple. Even just getting a single vendor to implement it is a huge undertaking. I know that you and I see the value in it, but I don't feel like we're the first to see that. There's a reason behind not implementing this feature, and it's the complexity that lies behind such a feature, like most things. This seems like one of those recursive and interactive features that don't fit into SQL. Does it present the rows that will be updated or deleted, and then ask if you wish to perform the operation? That doesn't work like anything SQL based, and I feel that's why we don't have it. I appreciate the back and forth on this, and am curious as to how you think it should be handled, if there's a way to fit in the way SQL works.

If the expectation is not met then it rolls back and fails. I implemented a slipshod version of it years ago for a previous employer (it got the job done with a lousier syntax)

Here's a list of 1,000 postgres extensions, it's not a big deal: https://gist.github.com/joelonsql/e5aa27f8cc9bd22b8999b7de8a...

Things are way more modular than they used to be.

I can probably do it again and just try to get attention for it.


From the hip, maybe something like "UPDATE DRYRUN ...". It'd report how many rows would be updated.

Or... "DRYRUN UPDATE ...", which is more like "EXPLAIN UPDATE..."

Thoughts?


Sounds like a solid idea, but I feel like just replacing your UPDATE with a SELECT COUNT(whatever_column_is_indexed_from_your_where) would be a good practice. If your DBMS supports an external language, that might be the best idea, so you can write your own logic, while keeping mostly everything in the database itself.

I only mean this from a more ANSI SQL side of things, where you might want to build your skills up to use as much of the standard as possible, until it's no longer possible and make sense to dip into platform specifics. I used to build code around being cross-platform, but at the same time realized that it's more useful to learn the ANSI standard, and then break free with LOTS of useful comments where it makes sense to do things more efficiently and with safety that you don't get normally.


For sql specifically, “limit 2” is my default way to write “expect 1”; if it affects two rows, I know that I have screwed up, whereas “limit 1” can be wrong without my noticing.

That's not a terrible solution although expect sounds like a simple safety mechanism for feeble-minded people like me who do simple queries.

I actually know postgres people. I should probably ask them


just to clarify, this has nothing to do with the "expect" that is the other major application of tcl other than tk?

None.

I was just reminded of a good idea I never implemented


Oh yeah, re: SQL expect - I always wish joins had a "cardinality assertion", like a regex *?+ (or ! for exactly one)

You could also create a file named "-i" in your home dir.

Why not learn to enjoy life knowing you get a second chance instead?

https://github.com/rushsteve1/trash-d


because then i come to rely on stuff being in the trash and i'd be less careful when deleting, which means i have to double check when i clean the trash. that's extra work. and since the trash is one single folder for the whole desktop, that means the trash is full of stuff from all over the place, making a review extra hard. in most cases i know something needs to be gone, so i'd rather delete it on the spot.

besides that, the primary reason for deleting stuff is to gain space. moving things to trash doesn't help with that.

what i sometimes do though, when mass deleting, is to move stuff to be deleted into a new folder (usually called "del" or "delete"). then verify the contents of the folder before removing it.

what would be more useful is a kind of trash implementation that does not take space, in that it keeps files around but reports them as unused space that can be overwritten when space is needed. kind of like undelete is possible on some filesystems. so that gone is gone because i can't control when deleted space gets used up, but in a panic situation i can revert the most recent deletes.


Isn't that just tilde-expansion happening at the wrong moment?

I once created a file named *. Sweats were sweated that day.

I've heard that one of the Unix founding fathers had a directory with 125 files that all had single-byte names: one for each ASCII symbol except slash, dot and null. He would then test any new utility against this directory and chew the careless programmer out if it couldn't correctly handle every one of these names.


