Yup. These days, if it's more than a few lines, I write it in Perl instead if it's for personal use, or in Python if it's going to be shared (Python still takes me more effort).
For all the vaunted Unix Philosophy, it's amazing how clunky some of it is. Countless shell scripts still trip over spaces in filenames, and of those that don't, nearly all will still be confused by unusual characters like newlines. If you want something done properly, you have to pull out a proper scripting language, where you get readdir() and the ability to pass arguments directly to a process.
Even Perl, which generally excels at such tasks, has weird lapses in convenience. You can run a command in one line of code without any possibility of confusion with `system("ls", "-l", $dir)`, but you can't get its output that way. There's no version of `system` that lets you both specify each argument to the process explicitly and capture its output. You either use backticks and risk quoting trouble, or reach for `open`, which is a lot more verbose.
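To make that dilemma concrete, a little sketch (the hostile directory name is invented for illustration):

```perl
use strict;
use warnings;

my $dir = 'My Files;rm -rf ~';    # contrived worst-case directory name

# Safe: each argument reaches the process exactly as written, no shell
# involved -- but the output goes straight to stdout, not to the script.
system('ls', '-l', $dir);

# Captures the output, but the interpolated string is handed to /bin/sh,
# so the space and the `;` in $dir get parsed as shell syntax.
my $listing = `ls -l $dir`;
```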
It's interesting that it took Microsoft to try a new approach in this regard. PowerShell has its share of weirdness that I really hate, such as the way the environment's escaping leaks into the program's command line, but it's really refreshing how it dispenses with the brittle grep, awk, and cut stuff.
Oh, I know. There's that, and File::Slurp, and a bunch of other stuff one would think would have existed from the start, but for some reason doesn't. But those are all external modules that somebody had to write, that didn't exist at some point in time, and that sometimes you can't use, because in some cases you can only rely on what's in core.
I just find it bizarre that Larry Wall (or somebody else) found it useful to provide a simple, convenient way to execute a command with exact arguments, but not one for when you need its output.
I must have written the same wrapper around open() several dozen times by now, because I keep needing scripts for situations where installing dependencies is undesirable.
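For what it's worth, the core-only version doesn't have to be long. Something like this sketch (`run_capture` is just my name for it, nothing standard):

```perl
use strict;
use warnings;

# Run a command with exact arguments and capture its stdout.
# Core Perl only; relies on the list form of piped open (fork + exec,
# no shell in between).
sub run_capture {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "can't run $cmd[0]: $!";
    my @output = <$fh>;
    close($fh);                   # waits for the child; status lands in $?
    die "$cmd[0] exited with $?" if $?;
    return wantarray ? @output : join('', @output);
}

my @lines = run_capture('ls', '-l', '/tmp');
print scalar(@lines), " lines of output\n";
```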
> I just find it bizarre that Larry Wall (or somebody else) found it useful to provide a simple, convenient way to execute a command with exact arguments, but not one for when you need its output.
It had backquotes from the start. So it was as convenient (and as unsafe) as the Bourne shell.
Of course you quickly want more safety, and backquotes are just legacy you can never use for serious stuff.
Bad example, yes. Plus it's a pointless thing to do to start with, because you run into trouble with special characters in filenames that way. Got to use readdir.
`open my $fh, '-|', 'ls', '-l', $dir` should do the trick I think, but apparently it only works on platforms with a "real fork" (so fine as long as you don't care about Windows, probably).
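And for the cases where a real fork isn't available, IPC::Open3 has been in core forever and, as far as I know, works on Win32 too, at the cost of extra ceremony. A sketch, with `ls -l /tmp` as a stand-in command:

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

# open3 passes each argument through exactly, like list-form system,
# and hands back pipes for the child's stdin/stdout/stderr.
my $err = gensym;    # stderr wants a pre-created handle, per the docs
my $pid = open3(my $in, my $out, $err, 'ls', '-l', '/tmp');
close $in;           # nothing to feed the child on stdin

# NB: slurping stdout fully before stderr can deadlock on chatty
# children; fine for small outputs like this one.
my @stdout = <$out>;
my @stderr = <$err>;

waitpid($pid, 0);    # reap the child; exit status lands in $?
print "got ", scalar(@stdout), " lines, exit status $?\n";
```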
The TOPS-20 command interpreter, with its regularized parameters, prompting command-line completion, inline help, and noise words, was so much better designed and more user-friendly than any of the pathetic Unix shells.
Typing the escape key says to the system, "if you know what I mean from what I've typed up to this point, type whatever comes next just as if I had typed it". What is displayed on the screen or typescript looks just as if the user typed it, but of course, the system types it much faster. For example, if the user types DIR and escape, the system will continue the line to make it read DIRECTORY.
TOPS-20 also accepts just the abbreviation DIR (without escape), and the expert user who wants to enter the command in abbreviated form can do so without delay. For the novice user, typing escape serves several purposes:
• Confirms that the input entered up to that point is legal. Conversely, if the user had made an error, he finds out about it immediately rather than after investing the additional and ultimately wasted effort to type the rest of the command.
• Confirms for the user that what the system now understands is (or isn’t) what the user means. For example, if the user types DEL, the system completes the word DELETE. If the user had been thinking of a command DELAY, he would know immediately that the system had not understood what he meant.
• Makes the system respond with any "noise" words that may be part of the command. A noise word is not syntactically or semantically necessary for the command but serves to make it more readable for the user and to suggest what follows. Typing DIR and escape actually causes the display to show:
DIRECTORY (OF FILE)
This prompts the user that files are being dealt with in this command, and that a file may be given as the next input. In a command with several parameters, this kind of interaction may take place several times. It has been clearly shown in this and other environments that frequent interaction and feedback such as this is of great benefit in giving the user confidence that he is going down the right path and that the computer is not waiting to spring some terrible trap if he says something wrong. While it may take somewhat longer to enter a command this way than if it were entered by an expert using the shortest abbreviations, that cost is small compared to the penalty of entering a wrong command. A wrong command means at least that the time spent typing the command line has been wasted. If it results in some erroneous action (as opposed to no action) being taken, the cost may be much greater.
This is a key underlying reason that the TOPS-20 interface is perceived as friendly: it significantly reduces the number of large negative feedback events which occur to the user, and instead provides many more small but positive (i.e. successful) interactions. This positive reinforcement would be considered quite obvious if viewed in human-to-human interaction terms, but through most of the history of computers, we have ignored the need of the human user to have the computer be a positive and encouraging member of the dialog.
Typing escape is only a request. If your input so far is ambiguous, the system merely signals (with a bell or beep) and waits again for more input. Also, the escape recognition is available for symbolic names (e.g. files) as well as command verbs. This means that a user may use long, descriptive file names in order to help keep track of what the files contain, yet not have to type these long names on every reference. For example, if my directory contains:
BIG_PROGRAM_FILE_SOURCE
VERY_LONG_MANUAL_TEXT
I need only type B or V to unambiguously identify one of those files. Typing extra letters before the escape doesn't hurt, so I don't have to think about the minimum abbreviation; I can type VER and see if the system recognizes the file.
“I liken starting one’s computing career with Unix, say as an undergraduate, to being born in East Africa. It is intolerably hot, your body is covered with lice and flies, you are malnourished and you suffer from numerous curable diseases. But, as far as young East Africans can tell, this is simply the natural condition and they live within it. By the time they find out differently, it is too late. They already think that the writing of shell scripts is a natural act.”
— Ken Pier, Xerox PARC
The Shell Game, p. 149
Shell crash
The following message was posted to an electronic bulletin board of a compiler class at Columbia University.
Subject: Relevant Unix bug
October 11, 1991
Fellow W4115x students—
While we’re on the subject of activation records, argument passing, and calling conventions, did you know that typing:
!xxx%s%s%s%s%s%s%s%s
to any C-shell will cause it to crash immediately?
Do you know why?
Questions to think about:
• What does the shell do when you type “!xxx”?
• What must it be doing with your input when you type “!xxx%s%s%s%s%s%s%s%s”?
• Why does this crash the shell?
• How could you (rather easily) rewrite the offending part of the shell so as not to have this problem?
MOST IMPORTANTLY:
• Does it seem reasonable that you (yes, you!) can bring what may be the Future Operating System of the World to its knees in 21 keystrokes?
Try it. By Unix’s design, crashing your shell kills all your processes and logs you out. Other operating systems will catch an invalid memory reference and pop you into a debugger. Not Unix.
Perhaps this is why Unix shells don’t let you extend them by loading new object code into their memory images, or by making calls to object code in other programs. It would be just too dangerous. Make one false move and—bam—you’re logged out. Zero tolerance for programmer error.
The Metasyntactic Zoo
The C Shell’s metasyntactic operator zoo results in numerous quoting problems and general confusion. Metasyntactic operators transform a command before it is issued. We call the operators metasyntactic because they are not part of the syntax of a command, but operators on the command itself. Metasyntactic operators (sometimes called escape operators) are familiar to most programmers. For example, the backslash character (\) within strings in C is metasyntactic; it doesn’t represent itself, but some operation on the following characters. When you want a metasyntactic operator to stand for itself, you have to use a quoting mechanism that tells the system to interpret the operator as simple text. For example, returning to our C string example, to get the backslash character in a string, it is necessary to write \\.
Simple quoting barely works in the C Shell because no contract exists between the shell and the programs it invokes on the users’ behalf. For example, consider the simple command:
grep string filename
The string argument contains characters that are defined by grep, such as ?, [, and ], that are metasyntactic to the shell. Which means that you might have to quote them. Then again, you might not, depending on the shell you use and how your environment variables are set.
Searching for strings that contain periods, or any pattern that begins with a dash, complicates matters. Be sure to quote your metacharacters properly. Unfortunately, as with pattern matching, numerous incompatible quoting conventions are in use throughout the operating system.
The C Shell’s metasyntactic zoo houses seven different families of metasyntactic operators. Because the zoo was populated over a period of time, and the cages are made of tin instead of steel, the inhabitants tend to stomp over each other. The seven different transformations on a shell command line are:
Aliasing                      alias and unalias
Command Output Substitution   `
Filename Substitution         *, ?, []
History Substitution          !, ^
Variable Substitution         $, set, and unset
Process Substitution          %
Quoting                       ', "
As a result of this “design,” the question mark character is forever doomed to perform single-character matching: it can never be used for help on the command line because it is never passed to the user’s program, since Unix requires that this metasyntactic operator be interpreted by the shell.
Having seven different classes of metasyntactic characters wouldn’t be so bad if they followed a logical order of operations and if their substitution rules were uniformly applied. But they don’t, and they’re not.
[...followed by pages and pages of more examples like "today’s gripe: fg %3", "${1+“$@”} in /bin/sh family of shells shell scripts", "Why not “$*” etc.?", "The Shell Command “chdir” Doesn’t", "Shell Programming", "Shell Variables Won’t", "Error Codes and Error Checking", "Pipes", "| vs. <", "Find", "Q: what’s the opposite of ‘find?’ A: ‘lose.’"]
My judgment of Unix is my own. About six years ago (when I first got my workstation), I spent lots of time learning Unix. I got to be fairly good. Fortunately, most of that garbage has now faded from memory. However, since joining this discussion, a lot of Unix supporters have sent me examples of stuff to “prove” how powerful Unix is. These examples have certainly been enough to refresh my memory: they all do something trivial or useless, and they all do so in a very arcane manner.
One person who posted to the net said he had an “epiphany” from a shell script (which used four commands and a script that looked like line noise) which renamed all his '.pas' files so that they ended with “.p” instead. I reserve my religious ecstasy for something more than renaming files. And, indeed, that is my memory of Unix tools—you spend all your time learning to do complex and peculiar things that are, in the end, not really all that impressive. I decided I’d rather learn to get some real work done.