Shell script mistakes (pixelbeat.org)
117 points by reinhardt on Nov 17, 2013 | 34 comments



The single biggest shell script mistake is not handling whitespace in file names correctly, and it's almost impossible to do correctly if you have weird file names: embedded newlines, leading and trailing spaces, embedded tabs. Embedded quotes can be tricky too, especially if you're writing a script that generates a script.
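
The standard workaround is to keep names NUL-delimited end to end; a sketch of the idea, assuming GNU/BSD find's -print0 and bash's read -d '':

    # NUL-delimit the names so embedded newlines, tabs and spaces survive
    find . -type f -print0 |
    while IFS= read -r -d '' file; do
        printf 'processing: %s\n' "$file"
    done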

That bit, writing a script that generates a script, happens surprisingly often in bash. It's cheaper to pipe a stream to sed that converts it into a shell command than it is to iterate over all the lines, and individually pluck out the arguments for the commands you want to execute. Leaving the script as something that outputs shell commands also lets you inspect what it does before committing to it (by piping it to bash).
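
Something like this, say (the .log to .txt rename is a made-up example):

    # generate the commands and eyeball them first...
    ls *.log | sed 's/\(.*\)\.log$/mv "&" "\1.txt"/'
    # ...then commit by piping them to the shell
    ls *.log | sed 's/\(.*\)\.log$/mv "&" "\1.txt"/' | sh

Which, as above, still breaks on names with embedded quotes or newlines.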



The problem with file names is the main reason I use zsh.

    for fic in **/*(.); do       # (.) glob qualifier: regular files only
        doSomethingWith $fic     # zsh doesn't word-split unquoted $fic
    done
It will take far more time than using find, but it will work even with files containing strange characters.


And another reason to use zsh is the incredibly nice looping syntax. For example, this is the same as the loop above:

  for fic (**/*(.)) doSomethingWith $fic
I find the fact that it fits on a single line makes me much more likely to use it in everyday shell use, though I'd still revert to bash for scripting.


I find that after a shell script gets to be over 3 or so lines, it's easier to switch over to python or perl. Do others feel the same?


I don't agree with this. I love Perl; it's up there as one of my favourite languages (despite its many shortcomings). But in my opinion, shell scripts make more sense if you're writing scripts that depend on a number of additional programs for the bulk of their processing.

For me, the point at which Perl makes more sense is when your script requires more internal logic than it depends on spawning other programs.

For example, if I'm writing a routine to auto-snapshot ZFS / Btrfs volumes and delete any over a certain age, the script would be dependent on your file system's CLI tools. So it makes more sense to have an 80+ line shell script than to write that in Perl / Python.
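
A skeleton of what I mean; the zfs subcommands are real, but the dataset name and retention policy are invented, and date -d is GNU:

    #!/bin/sh
    dataset="tank/home"   # hypothetical dataset
    keep_days=7           # hypothetical retention period

    # take a timestamped snapshot
    zfs snapshot "${dataset}@auto-$(date +%Y%m%d-%H%M%S)"

    # destroy auto snapshots older than the cutoff
    cutoff=$(date -d "-${keep_days} days" +%Y%m%d)
    zfs list -H -t snapshot -o name | grep "^${dataset}@auto-" |
    while read -r snap; do
        stamp=${snap##*@auto-}    # e.g. 20131117-020000
        day=${stamp%%-*}          # e.g. 20131117
        [ "$day" -lt "$cutoff" ] && zfs destroy "$snap"
    done

Nearly every line shells out to the zfs CLI, which is exactly why shell fits here.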

However, if I were writing a routine which requires users inputting details, where those details need to be sanity checked and then stored somewhere (such as a database), then the core logic of that program resides within your script (where you'd have to read inputs, do your sanity checks and then write to the database). So a Perl or Python script makes more sense.

Obviously you can do either of those examples in each of those languages (crudely speaking as I know shell scripts aren't technically a programming language); but that's just a basic example of where I personally draw the line.

I also think this is one of those occasions where it doesn't massively matter which approach you take, so long as the code works and is maintainable. (Though I draw the line at one maintenance script I saw last year: a Python script where every other line was os.system. It struck me as rather pointless to start a Python interpreter if you're just going to use it like a shell script; you might as well do the whole lot in the shell to begin with.)


> Though I draw the line at one maintenance script I saw last year: a Python script where every other line was os.system. It struck me as rather pointless to start a Python interpreter if you're just going to use it like a shell script; you might as well do the whole lot in the shell to begin with.
FWIW, one of my favorite features of Perl syntax is that you can do things like this. A lot of my quick-and-dirty sysadmin scripts end up being Perl scripts with lots of backticks. It's handy for when they grow (as they often do) into more full-featured scripts.


Great comment. One nitpick: Bash, at least, is Turing-complete. https://en.wikibooks.org/wiki/Bash_Shell_Scripting


> shell scripts aren't technically a programming language

Of course they are.


For me, it has nothing to do with the number of lines, but the amount of actual computation I am doing. If all I need to do is wire up the inputs and outputs of X commands, then shell script is the best tool for the job.

If I actually need to do something with an input or output before sending it to the next command, I'll switch to Python.

For CLI tasks in Python, http://docopt.org/ and http://shell-command.readthedocs.org/en/latest/ are invaluable tools. Yes, I know about Envoy, but shell_command is much nicer to use in my opinion. Envoy doesn't automatically escape arguments, which is annoying for all the reasons that SQL injections are annoying.


> If I actually need to do something with an input or output before sending it to the next command, I'll switch to Python.

If it's not something that can be done trivially with sed or awk, I'll consider whether it can be implemented in a generic way for use elsewhere; in that case I'll write that specific utility and insert it into the pipeline (with appropriate arguments). Only if it's fairly specific, or I need to package it for distribution, will I consider rewriting everything as a script in a single language.


That's more true of python than perl, I think - my recollection is that python adds slightly more syntax on top of running external commands. That said, I happily assemble shell scripts of tens (or occasionally hundreds) of lines when it's a good fit...


It entirely depends on what the script is doing. If it's orchestrating a pipeline of processes, shell script is ideal. If it needs much in the way of complicated temporary data structures, shell is the worst way to go.


I was gonna write this.

Shell script syntax has always been hard for me to remember (I find it less intuitive than lisp, perl, js, c, ruby, python...), it's subject to different behaviors between shells (csh vs ksh IIRC), string manipulation is close to non-existent, and types are even more difficult to use than in PHP...

Okay, it works, we can do stuff with it, we can hack a quick thingie here and there... I know the main appeal is "it just runs everywhere, no install required" (minus the csh/ksh differences), but is it really the tool we need for all we use it for?


The primary advantage of learning shell script syntax for me is that it's the same language I use on the command line.

I write for and "while read" loops on the command line a dozen times a day, plus () subshells, process substitution, sed, awk, etc. The more proficient I am on the command line, the more proficient I am with shell scripts, and vice versa.

Another advantage: the shell is a REPL in which you can prototype your script. Perform the operations manually, pull the commands out of your history, and adjust as necessary.
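
In bash, for example, fc will dump recent history into a file you can grow into the script:

    fc -ln -5 | sed 's/^[[:space:]]*//' > newscript.sh   # last 5 commands, without numbers
    ${EDITOR:-vi} newscript.sh                           # clean up and generalize from there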


> I write for and "while read" loops on the command line a dozen times a day

You earned my respect.

I like REPLs for small operations. When talking about a "full" script, I'm more at ease in a text editor with a script I'm gonna adjust as I run it. Best example is a syntax error: the script just won't run. Meanwhile, in a REPL, I may get myself stuck without knowing it.


Yep pretty much, but it can be up to 100 lines depending on the nature of the script.

As to the article: I've never used a shell that doesn't understand [[ (guaranteed to be a builtin, not sensitive to the hyphen issue) and {,} expansion.
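
For example:

    # [[ doesn't word-split, so an unquoted variable with spaces is safe:
    f="my file.txt"
    [[ -e $f ]] && echo exists      # plain [ -e $f ] would break here

    # {,} expansion saves retyping a common prefix:
    cp config.yaml{,.bak}           # expands to: cp config.yaml config.yaml.bak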

As for the "for file in *; ..." pattern, that is also susceptible to shell argument limits and is fairly hairy.

Lastly if you're concerned about performance -- well, it's definitely time to leave shell for something better.


> I've never used a shell that doesn't understand [[

That may be, but shells that don't understand them are pretty widespread. [[ and {,} are bashisms. However, they still work when bash is executed as /bin/sh, as is the case on RedHat and some other Linux distributions.

However, other distributions, including Debian, Ubuntu, and any embedded system using Busybox, point the /bin/sh symlink to a version of ash, the Almquist shell. It is used for scripts instead of bash because it is much smaller and faster (reportedly, Debian/Ubuntu boot speed improved significantly; dropping the shell entirely isn't a very good option for sysvinit).
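
Which is why, if a script has to run as /bin/sh on those systems, you need the POSIX spellings; for instance:

    # a bashism; ash/dash will choke on this under /bin/sh:
    [[ -n "$var" && -f "$file" ]]

    # POSIX equivalent that any sh understands:
    [ -n "$var" ] && [ -f "$file" ]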


[[ is from ksh88 or earlier; {,} is apparently from csh.

(Funny; I could've sworn I used {} in ksh under Solaris but apparently not!)

Your point stands though, they are bashisms for all intents and purposes on Linux.

As to speed -- well, again, I think if you want speed, don't use shell. For startup you're better off redesigning the whole damn thing (cue systemd). Changing shell is like overclocking your CPU when you should be choosing a better algorithm.


It depends what it's doing. Other languages are much better at data processing. They make it more complicated to start child processes, use pipes, set environment variables, etc. If you're doing system administration tasks, you'll end up with a lot of shell command lines embedded in the program. (Try editing a crontab in Python...)


> Try editing a crontab in Python...

  pip install python-crontab


It depends on the goal of the script. I tend to rewrite complicated shell scripts in Ruby, but working with subprocesses in Ruby is terrible, so those scripts are destined to end in `.sh` forever. bash has a very specific purpose where it excels over scripting languages like Python and Ruby: wiring input and output and communicating with subprocesses.


Yes. I just promised myself yesterday to stop writing bash scripts for anything other than setting up an environment and launching another program. What pushed me over the edge was that most of my script consisted of the kind of fixes from the article (forcing a variable to be treated as an int, adding dollar signs to access a variable, and so on).
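
The kind of fixes I mean, roughly:

    declare -i count=0   # force the variable to be treated as an int (bash)
    count+=1             # adds 1; without -i this would append the string "1"
    echo "$count"        # the dollar sign is needed to read the variable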


I've gotten in the habit of using Python or Haskell for my automation. Pretty happy with it.


Depends on the target. Python, and even Perl, is not always available. /bin/sh is.


What systems don't have an interpreter installed these days?


Well, Cygwin won't have it by default.


This example

  for file in *; do wc -l "$file"; done
could be reduced to

  for file in *; { wc -l "$file" ;}
in some POSIX-like shells.

Is the for loop even necessary?

    echo wc -l * | sh
But...

http://www.in-ulm.de/~mascheck/various/argmax/
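
The usual escape hatch from ARG_MAX is to let xargs batch the arguments, and -print0/-0 copes with odd file names as a bonus:

    # GNU/BSD find; xargs splits the list into batches that fit the limit
    find . -maxdepth 1 -type f -print0 | xargs -0 wc -l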


This is a great page in the same vein - bash-specific, but quite a bit more comprehensive.

http://mywiki.wooledge.org/BashPitfalls


I recommend the shell script static analyzer ShellCheck: https://github.com/koalaman/shellcheck


One subtle shell script mistake I was unaware of: if a shell script is modified while it is running, the running instance might fail [1].

1. http://stackoverflow.com/questions/2285403
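
A commonly cited workaround is to wrap the entire body in a brace group, so the shell parses the whole thing before executing any of it (do_stuff is a placeholder):

    #!/bin/bash
    {
        # entire script body goes here
        do_stuff          # placeholder for the real work
        exit              # ensures the shell never reads past this point
    }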


Let's not forget unit testing. After all, a shell script is code, and code should be unit tested.

https://code.google.com/p/shunit2/
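
A minimal shunit2 test looks something like this (assuming the shunit2 script is installed where the last line can find it):

    #!/bin/sh
    double() { echo $(( $1 * 2 )); }    # function under test

    testDouble() {
        assertEquals "double of 21" 42 "$(double 21)"
    }

    . shunit2    # source the framework; it runs the test* functions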


I prefer http://bmizerany.github.io/roundup/ . It works in fewer shells, but when you know which shell(s) you're targeting, that doesn't matter. It's far less magic than shunit2: writing the tests actually feels like writing shell.


No, just no.



