Bit of questionable advice: I've found using jq for all the argument mangling/postprocessing/glue tasks instead of sed, awk etc to be pretty workable lately.
The -R switch reads in arbitrary text instead of json and presents it as a sequence of json strings internally.
The capture() filter applies regexes to the string and extracts subgroups (which will be represented as json objects).
The gsub() filter performs search-and-replace on a string.
Finally, the -r switch writes the result back to stdout as a raw value, removing any json quoting and escaping.
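For example, a quick sketch of how those pieces fit together (the "user:uid" input format here is just made up):

    # -R reads raw lines, capture() pulls out named groups as a JSON object,
    # -r prints the result without JSON quoting.
    printf 'alice:1000\nbob:1001\n' \
      | jq -R -r 'capture("(?<user>[^:]+):(?<uid>[0-9]+)") | "\(.user) has uid \(.uid)"'

    # gsub() does search-and-replace inside the string:
    printf 'error 42 in module 7\n' | jq -R -r 'gsub("[0-9]+"; "N")'
    # -> error N in module N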
By default, processing is linewise, but there are options to process the whole input at once instead. There is also supposed to be a way to do per-line processing with state tracking across lines (like sed's "hold space"), but I haven't tried that out yet.
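A quick sketch of both modes (the inputs below are made up, and I haven't battle-tested the stateful variant):

    # Whole-input processing: with -n, `inputs` yields every line (as strings,
    # because of -R), so you can aggregate across the whole stream:
    printf 'a\nb\nc\n' | jq -R -n '[inputs] | length'     # -> 3

    # Per-line processing with state carried across lines: number each line.
    printf 'a\nb\nc\n' | jq -R -n -r 'foreach inputs as $line (0; . + 1; "\(.): \($line)")'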
I've often found this approach more flexible than sed and easier to reason about. I also believe it may be faster, as more of the work happens in a single process.
The one exception: grep on files is still orders of magnitude faster than anything else. I believe there also was an article about this on HN a while ago.
So if you have to find some data in a larger collection of files and then have to turn the data into something structured, a good approach is to make a "coarse" query with grep or grep -R first, to narrow down your set of candidates, then postprocess your remaining candidates with jq.
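A rough sketch of that two-stage approach (the directory, pattern and field names are invented):

    # Coarse pass: grep does the fast filtering over many files; jq then turns
    # the surviving lines into structured data.
    grep -R 'status=' logs/ \
      | jq -R 'capture("^(?<file>[^:]+):.*status=(?<status>[0-9]+)") | {file, status}'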
Also, apropos single processes: another piece of questionable advice is that for/while loops in bash can be used as part of a pipe chain (when piped, they run in a subshell). This often lets you reduce the number of processes you need.
E.g. instead of
for f in $files; do foo "$f" | sed ...; done
You can often write:
for f in $files; do foo "$f"; done | sed ...;
Which will only launch a single sed process for all files instead of one per file.
But yeah, shell scripts do have many shortcomings, and in particular the error handling situation seems extremely hard to fix. So the actually sensible advice is still: use Python.
Python is a horrible language for system administration (or designing systems as in system engineering, for that matter): it's big, it's bloated, it's over-complicated, it's slow, it's hard to debug.
Shell is the native automation facility of the UNIX-like operating systems, and with 40+ years behind it, it is well understood, easy to write, easy to debug, and has no artificial dependencies (so, completely the opposite of Python).
Combine shell with AWK and you have an unbeatable combination for most computer automation and even very large scale data (pre)processing and number crunching.
Python is not really very slow overall unless you’re on windows. When you get to serious computing then breaking out of python is possible, but systems administration is not the place to do heavy computation.
As for debuggability, I find Python easier. set -/+x is great, but it doesn't do much more than print every line as it executes; there's no interactive debugger like Python has, afaik.
The killer thing for me is that python has modules for basically everything, which is awesome.
The thing that kills Python for me is that I can't be 100% sure what version will exist on a system; and worse: if I actually use any modules, then I have to somehow get them onto the system.
For systems administration, this is pretty close to a non-starter. Prevailing sysadmin wisdom is that admin tools should not drag in excessive dependencies, since sysadmin tools should work when the system isn't fully set up.
This is why go is so great. But go is harder to debug than python in my experience.
> The thing that kills python for me is that I can’t be 100% sure if what version may exist on a system;
pip handles that automatically - you set which version numbers of Python you can handle in your project specification.
> and worse: if I actually use any modules then I have to somehow get them on the system.
pip handles that too!
----
I'm not quite sure what the problems are, but between virtualenv/venv, pip, and modern package managers like poetry, this is really a solved problem - something you spend an hour setting up when you start a project and just never think of again.
I do this so much I have a shell function, `nenv`, which creates a new virtualenv with a specific Python and then loads its dependencies into the virtualenv.
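Roughly something like this (a simplified sketch, not the exact function):

    # Sketch of a `nenv`-style helper: pick a Python, make a fresh virtualenv,
    # activate it, and pull in the project's dependencies if a requirements
    # file is present.
    nenv() {
      local py=${1:-python3}            # which Python to use, e.g. python3.11
      "$py" -m venv .venv               # create the virtualenv
      . .venv/bin/activate              # activate it in the current shell
      [ -f requirements.txt ] && pip install -r requirements.txt
    }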
It gives me tremendous freedom. For example, I can heavily instrument other modules' Python code inside the virtualenv (to find their bugs, e.g.) and then throw it all away and recreate it fresh in a few keystrokes.
---
There are OK, well-known solutions to most of the common packaging and distribution problems with Python programs you might have. If you go back to it, you should spend a bit of time with this, it isn't that bad.
(Poetry seems really cool - when I start something new it'll be with that.)
It's weirdly naive to assume I don't know about pip as a Python user, given that it's a bit ubiquitous.
That said: this is very much a developer's take on the packaging situation in Python.
Running virtualenvs is "fine" until you're not connected to the internet, a less-and-less common scenario thankfully.
pip itself (without virtualenvs) will "dirty" the system, or you use the --user flag and make it work only for one user.
The job of a systems administrator is to do things that do not mess with the developers: if I install a package, especially with a version lock, and it directly conflicts with a developer's (less and less of a problem with containers!), then they're going to be very cross with me.
"Running virtualenvs is "fine" until you're not connected to the internet, a less-and-less common scenario thankfully."
Any high security environment (read: the financial industry) will have most of the servers purposely not connected to the InterNet, to prevent people from doing exactly the above and from attackers hacking in. Energy and pharmaceutical industries - same thing. That's the norm, not the exception, so pretty much any industry which is critical for society at large won't allow access to the InterNet.
This may be a silly question, but how would you think to install packages if not with pip?
I used to see python the same way, and coding on my megalithic single file would always take a couple hours to set up on a fresh computer. Then I discovered requirements.txt, where you simply list each package you need and the version (range) desired and then run:
pip install -r requirements.txt
This will install everything, unless it's fucky: for example, pip install kivy-garden.matplotlib will not work with pip, but everything else does.
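For example, a minimal setup might look like this (the package names and version ranges are just placeholders):

    # Write a small requirements.txt and install everything from it in one go.
    cat > requirements.txt <<'EOF'
    requests>=2.28,<3
    click==8.*
    EOF
    pip install -r requirements.txt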
"This may be a silly question, but how would you think to install packages if not with pip?"
Through the native software management subsystem: RPM, SVR4 packaging, pkgsrc, MSI... always use the native software management subsystem - that is what it is there for, for developers to deliver their software with.
One is not supposed to have more than one software management subsystem on a server because it leads to a hodge-podge mess: imagine you have 100'000 servers and it's RPM this, then pip that, then pear other, then npm whatever, then cargo something else... it would become a system administration nightmare in 0.1, and a system engineering nightmare in 0.01 seconds... you have to troubleshoot production, and do, what? Start firing different private packaging format commands in the hopes you find the correct one to find out what's been done to the system, what's available, then pray that particular private packaging system supports some sort of verification feature in order to find out who hacked which file with what tool? And that's okay by you? It certainly isn't okay by me and would have never worked in an environment that large (it doesn't even work in very small environments!)
...One is always supposed to use the native software management system of the operating system: cattle, not pets. And if there isn't a native OS package of the software one needs, or there isn't one of sufficient quality and standards compliance (like LSB FHS), then one makes one's own packages, because that is the high quality, professional way of doing things.
shell is ultra fast, and it's very easy to squeeze performance out of shell programs. Like I wrote, shell is a well understood programming language - it's been around for almost 50 years and generations and generations have grown up on it professionally.
I am not assuming, I do this daily so I know what I'm writing about. And as if that were not enough, I have formal education in shell programming as part of my degree.
Hah. Like the time you tried to tell people that grep -r was not real unix, and that they should use the vastly slower combination of find + xargs + grep?
To be fair, the startup speed of the shell is generally considered pretty slow/expensive in comparison to running everything in a single shell/executable.
That's only a problem in real life if you're running SmartOS's pkgsrc build farm. And even then, there are techniques: Jonathan Perkin described them at length on his blog. Search for it.
Python "peed in my teacup" many, many times: every time I run Mercurial, Python shits on me from Earth's orbit because the piece of crap language is so slow. And that is just one example among many.
"Bash"? Is that all you know, "bash"? I wasn't writing about that GNU crap of a shell, but of real, AT&T Bourne or Korn shell!
Debugging shell is trivial with set -x. I can find a problem in a shell program in seconds.
> The one exception: grep on files is still orders of magnitude faster than anything else. I believe there also was an article about this on HN a while ago.
I'd like to see that article! Faster than jq for sure, but my experience is that ack/ag/rg are all orders of magnitude faster than grep for typical uses.
Also I think there is a difference between grep on files, i.e. "grep ... file1 file2" or "grep ... -R somedir" and grep on stdin, i.e. "something | grep ...".
My understanding was that in the first case, grep can make use of random access APIs to control the size of the input chunks it processes or skip over parts that are known in advance not to match. That's not possible if the input is fed through stdin.
I haven't seen any actual comparisons though apart from "subjective" speed. So things might be wrong.
I think people are underestimating the power of `bash+curl+jq`. In pretty much every scenario where I want to consume an API, I just fire up my vim+shellcheck+shfmt setup, split my logic into functions (using `local` for private variables, mind you), then just spam `get_endpoint | jq filter`s in a `getopts` loop, and bam, suddenly you have a clean and concise CLI.
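A stripped-down sketch of that pattern (the endpoint and JSON field names are invented):

    #!/usr/bin/env bash
    set -euo pipefail

    get_user() {
      local user=$1    # `local` keeps the variable private to the function
      curl -fsS "https://api.example.com/users/${user}"
    }

    while getopts "u:" opt; do
      case $opt in
        u) get_user "$OPTARG" | jq -r '"\(.name) <\(.email)>"' ;;
        *) echo "usage: $0 -u USER" >&2; exit 2 ;;
      esac
    done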
Granted, Bash is quirky. So I should say I program in bash+shellcheck not just Bash.
And Jq. Jq is a functional programming language in itself! I do regex or string mangling inside jq filters.
Why should I use Python over this? The moment you want a clean and cool feature that Python has over Bash+jq, you have to incorporate third-party libraries. And adding just one non-stdlib dependency means you have to resort to `virtualenv`. And why not also add `click` or `requests` or .. and now you have ~10 moving dependencies. Let's not talk about Python's subtle differences between versions.
Compare it to Bash-as-glue-of-unix-tools approach. You want image processing? Just use `imagemagick`. You want web scraping? `pup` is your friend.
ProTip: do `my_command || exit 1` if my_command is bound to fail. Or `if ! my_command; then printf >&2 "[ERROR] reasons"; exit 4; fi` for more elaborate exception handling.
That exact sequence is possible in lots of desktop environments, with their own lists and exception handling. Many people here show that the result of those fast (and fun?) steps is something that other people can't change much, with a lot of non-obvious footguns. Lastly, if you like it, no judgement from me about that!
Perl seems to be fading away, but for a long time it was the obvious tool to use when your program got too big for a shell script.
> Python, Go and so on need more structure and often don't have quite as simple and convenient methods of doing various shell like things.
This is Perl's niche, or at least it was Perl's first niche before it started getting used for CGI. The things you'd do in a shell script are generally convenient in Perl too, the main inconvenient part is selecting from the many different ways to invoke a program in Perl, like system(), backticks, qx//, open(), etc.
Perl is also more commonly installed and has faster startup time than Python, in my experience.
It is now the niche of Next Generation Shell. Totally not humble and biased opinion... maybe. I'm the author. You are very welcome to make your own judgement - https://ngs-lang.org/
Yes, Perl is still used and it is fast and excellent for anything that becomes too big or complicated for a shell or awk script. As you write it is installed in the vast majority of *NIXes as default.
And if you're conflicted between thinking both this and "but go is nice and static and free of runtime", you can look at https://crystal-lang.org/ Ruby syntax with static compilation for your tools.
Definitely. I've been replacing some of my more janky & complex bash scripts with Ruby and it becomes a lot cleaner imo. Also, I recently 'discovered' Pathname; it takes away a lot of the pain of dealing with files and directories in Ruby and allows for pretty clean code imo.
Perl 5 was great, but it broke down as soon as you needed any depth or complexity of data structure. The number of times you had to man perldsc ... every single one had a different syntax for declaration and dereferencing ... AoAoHoHoAARGH!
I have the opposite problem... I find perl's data structures so ridiculously easy to write that I can quickly build up arrays of hashes of arrays of (ad infinitum) while in the "flow". Throw in the ability to actually get references to things, and control de-referencing of things, and I've never felt perl has stopped me (for better or worse!) from doing anything with data structures.
Yeah, now try iterating. What used to get me is that for some types, $ is used only when dereferencing. It was terrible trying to track those errors down. Basically I loved Perl, but the language has a terrible design, inexcusable really. There was essentially no point in having types; it took all the worst parts of C and then reinvented them slightly differently with new syntax, twice per datatype. Glad to be rid of it.
The “best practice” of preferring higher level languages is questionable when it comes to System level tasks such as this one.
If it’s something that needs to be part of a complex system, sure higher level langs can be powerful. If it’s a standalone thing, shell scripts are generally ok.
In fact higher level languages add complexity: their build process, library updates, breaking changes of dependencies all make it a pain to maintain. Shell scripts? “ls” is never gonna change.
IMO the biggest reason shell scripts are avoided is simply because most software developers aren’t very comfortable with the syntax (tbh it does feel archaic in certain places) or know how unixy stuff like signals, pipes etc really work. If you have to do system level tasks though it’s probably worth the effort to understand these things.
> IMO the biggest reason shell scripts are avoided is simply because most software developers aren’t very comfortable with the syntax (tbh it does feel archaic in certain places) or know how unixy stuff like signals, pipes etc really work. If you have to do system level tasks though it’s probably worth the effort to understand these things.
Hard disagree on that one. People aren't comfortable with the syntax because it's rather awkward and the subset of portable syntax is minuscule. The syntax itself has all sorts of variations (e.g. bash 3 vs 5 vs not-bash) and when you have to leverage external programs (e.g. sed or awk) things can get hairy quickly (especially if you're trying to support both BSD and GNU userland components).
From a security POV I'd much rather use not-sh style stuff. Error handling is much more accessible and robust in higher level scripting languages. Sure you can run "set -e" and hope for the best... until you have to use something that tries to stuff meaning into the exit code like grep. Handling filenames with spaces is significantly easier once you leave the sh baggage behind.
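A small illustration of the grep-under-"set -e" gotcha (the file name is a placeholder):

    # With `set -e`, grep exiting 1 ("no matches") aborts the script, even
    # though "no matches" may be a perfectly normal outcome.
    set -e
    matches=$(grep 'pattern' file.txt || true)   # `|| true` as the usual escape hatch
    echo "found: $matches"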
As for ls(1), are you talking about BSD or GNU ls?
If you don't standardize on a sane shell then you shouldn't use the unix shell, agreed. Total disagreement on security fitness of high level languages -vs- a shell script.
It has changed in the past and it will change in the future.
"ls" will always list files in some way, but it is the only thing you can take for granted. You don't really know what kind of decoration you will get, how spaces and non-ascii characters will appear, the language, the order, etc... You may also find yourself with a weird environment variable your user has set that makes ls look better in his interactive shell and breaks your script.
Generally, I avoid "ls" in scripts for that reason, but many commands have the same kind of problem. Your GNU script may break on busybox or BSD (happened to me many times). And good luck knowing what /bin/sh means, it may be bash, it is supposed to be POSIX, that's all we know.
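What I reach for instead is roughly this (a sketch; globbing or NUL-delimited find rather than parsing ls output):

    # Iterate over files without parsing ls, so spaces and other odd
    # characters in names survive intact.
    for f in ./*; do
        printf '%s\n' "$f"
    done

    # For recursive traversal, NUL-delimited find output is the safe variant:
    find . -type f -print0 | while IFS= read -r -d '' f; do
        printf '%s\n' "$f"
    done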
Totally agree, though I am not sure that all shells that pretend to be POSIX really are.
But it is a minefield: it is way too easy to sneak in a bashism because you are so sure it is POSIX that you didn't check, made worse by the fact that if you are writing shell scripts, you are probably using a much more liberal interactive shell all day long. And your script will work and no one will complain until you try to run it on another system.
It is not entirely unlike client-side web technologies, where you have to check for browser compatibility. For the web, it is a recognized problem and there are plenty of solutions to deal with it: jquery, babel, etc... I am not aware of such tools for shell scripts, or maybe just analyzers that check for posix compliance.
The problem with shell scripts is the same as automating a GUI.
It’s flakey and hard to maintain.
Grepping through output
No uniform handling of spaces, tabs. No data structures. Multiple dialects.
Shell scripts mainly work with files and process management. So many scripts don’t handle spaces correctly. This is not a problem of the creator, but because of the language.
Although not following best practices when writing shell scripts is the fault of the author. For example, not thinking about spaces and using $varname directly instead of adhering to best practice and enclosing it in "${...}". It is uninformed, and one should inform oneself if one is to write a shell script. Plus there is no shame in looking up how to do it correctly. There is little excuse for getting that one wrong. I mean, if this is one of the first shell scripts one writes, OK, but a seasoned engineer ... nah.
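A minimal illustration:

    # Unquoted expansion word-splits on the space; the quoted form keeps the
    # filename as a single argument.
    f="my file.txt"
    ls $f        # runs: ls my file.txt   (two arguments, both likely wrong)
    ls "${f}"    # runs: ls 'my file.txt' (one argument, as intended)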
Well sure..
Then everybody, please stop blaming C, C++, php or wordpress.
If it's the "wrong" way, remove it. Don't even deprecate it. Just hard throw an error.
There are 10 tiny buttons next to each other. You press the wrong one. Was that your fault or the fault of the designer?
Stop blaming "the user". In this case, not even a software engineer, but usually a sysop or power user.
No one (not that I have seen a comment like that anyway) is claiming, that shell scripts are not gnarly in some way or the other. However, this gnarliness of a language is of course a gradient, not an on-off, 1 or 0 kind of thing. Shell scripts are surely further on the gnarly end of things than some other things.
If you ask me personally, I would not start a new project in any of the languages or frameworks you mentioned.
I still do not agree though, that the user is completely not at fault when there are easily avoided traps (if that is what you are proposing?). Of course those should not even exist; I agree with that, obviously. Blaming every single mistake on the language design, which might be gnarly, is not what I think is the correct thing to do though. We do not have the perfect language yet (or might never have it). So some gnarliness is to be expected, and that means a realistic perspective is to educate oneself about those gnarly things and avoid mistakes. Mistakes are human, so there will be mistakes, but we should try to avoid cheap ones made out of laziness or being uninformed.
Also it is totally fine to criticize a language for its design flaws. I like to do that often even, with mainstream languages, because they can learn loads of stuff from less mainstream ones. This is the way to create awareness of those flaws and thus a way to improvement.
It's best practice for longer scripts. I think anything over 100 lines is better off rewritten in a general-purpose language, as it is bound to have the complexity they are designed to handle, and that will outweigh the drawbacks.
> most software developers aren’t very comfortable with the syntax
Yeah which is the same reason why sql is avoided, regex is avoided, javascript is avoided etc. People will go through enormous trouble to avoid learning anything other than the multi threaded c-style programming model from the 1970s, that they learned in their computer science degree.
> multi threaded c-style programming model from the 1970s, that they learned in their computer science degree.
Yep, most folks learn programming in high-level languages; lower-level and "glue" scripts feel "not right". I can attest to this… after using one shitty ORM after another, learning SQL and bash made me a much better programmer.
I think one of the things that still make shell scripts so alluring is that it's the same language you use for your shell.
Many of my scripts for personal use were basically just cleaned up and slightly extended versions of the command sequences I typed in anyway. This is different from writing a script in a proper language where you usually start from scratch.
I believe this was Perl's approach originally: let you use your shell commands as a starting point and then gradually extend the code in a safer language. However, this still leaves you with many of bash's syntax issues.
On the opposite side, many modern languages do have REPLs - but I have not seen anyone so far use them as actual day-to-day shells.
So I think to really get rid of shell scripting, you'd have to develop some syntax that would work well both for scripts and for shell use.
I think that's the void PowerShell tried to fill, but I loathe even trying to read such a script because everything is so obtuse. Unix-style shells only blindly pass around octet streams, implicitly delimited by \n characters, but really a blind pipe of bytes is the true interface. As it _can_ be anything, to get anything other than human text each program needs an extra flag that alters its output and thus documents the new format. There's also nothing that stops a program from accepting arguments that arrange for a nest of files and/or fifos (which can also be seen as a type of named pipe) as its interface instead.
Unix style shells start simple and get as complex as desired, rather than a set of typed objects which must all be kept in memory at the same time.
I think what PowerShell showed me is the next step above something the complexity of a Unix shell is a "real programming language", no matter if it's interpreted live (scripting) or compiled. There might be reason to revisit the exact syntax, but I'll wait to hold my breath until we've replaced keyboard layouts designed to slow down typing to a point which doesn't break a mechanical typewriter.
I'm always very mixed on PowerShell. It can be quite useful, and it's mature enough in Windows that it really does simplify a lot of Windows administration tasks to the point of being trivial, and can save a ton of time by giving you a nice and safe way to avoid working with the Windows UI, a goal that everyone should strive for. The verbosity is not lost on me and I do find it tiring to read scripts, and it's exacerbated when you read a PowerShell script written by someone who started with a "real" language and tries to make PowerShell work like those languages.
Windows apps that have PowerShell cmdlets are also a boon to administration, and I'm grateful when the option is there.
My biggest gripe with PowerShell though is that as soon as you get past the "I don't want to click around windows/.net apps" stage, PowerShell shows how shallow it is, and its quirks affect performance, design, etc. The PowerShell community encourages a lot of practices that end up trapping you in bad design, which makes maintenance and readability of scripts a challenge. (The handling of variables and their scope can be confusing, as PowerShell allows a lot of "cheating" about which data is accessible and the script will still work, but it's often very hard to tell the state of the data being worked on and where it came from. Use of += for adding to arrays is fastest to write at first blush, but you end up with awful performance because it copies the array; your average PowerShell script writer wouldn't know that and ends up wondering why the script struggles outside of simple lab-scale tests.)
The answer to this is getting into reflection and using .net calls, but to get the right performance and behavior from PowerShell you end up with a good portion of the script being just straight up .net code and at that point I wonder why bother with powershell at all and not just use .net and some faster API to get data from the app or going through wmi to work with windows instead.
You can get complex PowerShell programs for sure. But they're PowerShell only in name, the force powering the program is usually something else that PowerShell only hinders as you need to now deal with PowerShell-isms to make it execute.
So in a sense it's very successful in that I think this is exactly what PowerShell was supposed to do. But I find as I got past short scripts that PowerShell just can't really do anything it does better than another language would. It can't work anywhere near as fast as bash or any other *nix shell can, you can get a lot of depth into windows but mostly through .net calls, and for applications, a good rest API returns the same data, but faster and in a format more useful for other applications. (both program wise and just in terms of concept)
It feels like PowerShell was (is?) supposed to be a gateway drug for .net. But it's just too easy to get hooked on it and suffer the side effects and never really take that next step to a more mature language, so you get less of a high from using it but way more nasty side effects.
(Also the less said about PowerShell ISE the better. I wish MS would just remove it entirely as its difference in behavior from the actual shell wastes so much time in debugging scripts)
The power of shell scripts are the redirects and the pipes, nothing else is as close to the file system and has the same power and expressiveness for system administration tasks.
I completely agree. In fact I blogged about this problem myself. For what it’s worth there are a few indie shells out there that are trying to solve this problem:
- Elvish (non-POSIX)
- Oil (Bash superset)
- murex (non-POSIX)
Oil is probably the most interesting on this list because it aims to be 100% Bash compatible while supporting additional syntax to make it less ugly. Thus an upgrade path from Bash.
Elvish and Murex are similar in that they're typed shells (like PowerShell but less ugly, and they work with existing Linux / UNIX CLI tools) but have slightly different approaches to solving that similar domain.
My dream shell is not a bash superset, in fact it's as far from bash as possible (because I don't know bash, and won't ever know bash). I dream of a python/nodejs shell that can list files, and run and pipe programs.
I hear this opinion a lot but the reason such shells don’t take off is because most of the time in a shell you’re writing the same short snippets over and over. The burden of doing that becomes annoying after a while in some of the more verbose languages. Thus shells favour read once and write many syntaxes.
I also agree.
In fact, I have a feeling it is not simply a matter of allure, but one of the most decisive and important factors, if not the single most important one.
I think we tend to underestimate the importance of this kind of uniformity (aka homogeneity, consistency, sameness, equivalence etc) that allows us to freely shift back and forth and work across different environments (aka contexts, mindsets, etc) without any essential changes and translations.
i.e. in a sense, minimizing the boundary such that the difference effectively disappears and it feels as if it's all just one same.
In this case the environments being shell <-> script. But I think I see similar patterns in many other places, including outside of tech.
So, similar to as you also mentioned, as long as there is no language that beats the current shell as both a shell language and a scripting language, and one that sufficiently matures for real practical uses as both, I have a hard time imagining the most perfect scripting language alone still being able to make shell scripts go away. At least, I have a hard time imagining myself abandoning shell scripts otherwise.
Python is such a mess; most likely the script you wrote won't work on a coworker's machine. Either it accidentally needs some global package installed, and then of course the needed version interferes with something else. Or the script only works for certain versions of Python, etc.
For most of our big scripts at work, I've actually resorted to just running them inside a docker container, mounting my current folder. So tired of trying to get my env to match someone else's.
>>Python is such a mess; most likely the script you wrote won't work on a coworker's machine. Either it accidentally needs some global package installed, and then of course the needed version interferes with something else. Or the script only works for certain versions of Python, etc.
So true. You have to be very careful when writing Python, because cool language idioms (or function calls) that you use may not exist in, say, Python 3.5. So when someone runs it with a different interpreter it will not work.
Writing Python scripts that non-Python-knowing people can use is an art.
Instead, I work for a company whose core was written in bash and Perl, and those scripts still work today after 15 years.
I have problems with poetry all the time. Upgrade a minor version of Python, and suddenly there is no wheel available for some combination of platform+architecture. That forces you to build the stuff yourself, and thus to install a complete C++ build chain.
I haven’t come across this but I use only whatever versions of Python are provided by the python@3.10 and/or python@3.9 Homebrew packages and Debian testing or stable so am very likely not at the bleeding edge.
That or I don’t have interesting enough dependencies :)
It's definitely a pain not knowing if your python3 is 3.5, 3.6, or later. If you stick to using the older syntax and standard library, though, it's a fairly reliable replacement for bash. Though, in my situation the environments are pretty tightly controlled, so maybe I've got a biased set of experiences.
There are legitimate issues, but in most cases I encountered, the complaints are of the kind: "it doesn't behave exactly like the solution from another language that I happened to learn first, and therefore Python sucks."
What is the dependency management solution for bash that it so much better than anything in Python?
It just blows my mind that this, and some of the other idiosyncrasies of Python, are not sorted out.
Python3 was a big stain because of those issues.
Python 4 could literally just be a platform shift to sort out paths, libraries, dependencies, documentation in order to try to solve all those problems.
It's just amazing how nutty things can be for tech we all depend on.
We would have never designed something that way on purpose.
Bash - as a non-expert I also feel it is a bit weird.
I almost want a Computer Science God to come along and make a shell script language, a clean light language like python/js, maybe something like Swift/Java, and then something like Rust but a bit easier - and they all feel a bit similar and integrate, and the build tooling is fairly clean, and the whole world can get along.
If you don’t feel comfortable writing shell scripts, or even for people who do :-), shellcheck especially when used as a vscode plugin will hold your hand and catch any potential problems with your shell script before it’s even run.
I do agree with your overall points, but it's worth noting that Go has been pretty stable in that regard. In fact, backwards compatibility is one of the core missions and has led to various design decisions that have left many wishing Go would break its compatibility promise. But I do agree it still wouldn't compare against Bash's multi-decade consistency.
> Python and Go will probably change faster than Bash. So the maintenance cost of scripts in Python or Go will be higher.
This is something I'd like to see addressed by programming languages. Wouldn't it be nice if you could say "pythonversion 2.7" at the top of your program, and it would use that version of the language for the remainder of the file?
Apple shipped an "old" version of bash due to licensing issues which means if you're relying on any bash 5 extensions you're SOL. Now that zsh is the default is Apple even shipping bash anymore? Things get even more fun if you've multiple versions of bash installed at once.
Not sure if someone mentioned it already, but for such use cases Xonsh[1] is amazing.
It allows you to use shell stuff in a python script. Basically it gives you the best of both worlds, i.e. you can use Python for the logic flow while retaining all your quick shell one-liners when needed. And it makes this interaction of Python and shell super convenient, i.e. run your one-liner but process its output with Python. Much faster than coding everything in Python or in shell.
From a quick glance it seems that the biggest difference is that ABS is a whole new language while Xonsh is just a Python superset, i.e. Xonsh is Python which can also parse and execute shell commands.
What I see as a benefit is that it allows you to reuse your existing python and bash knowledge without having to learn (almost[1]) anything new.
[1]: Xonsh does add one or two new things to python. But they are pretty straightforward and aren't required. But are hella nice for more advanced interplay between shell and python.
Shell scripting is not bad these days. There’s a linter - Shellcheck - that will catch most gotchas in scripts, there are a lot more resources to learn shell online, and there are tools like Bats to make your life easier if your shell scripts grow too big.
Writing logic - functions, for loops, or if statements - is still very annoying in Bash. Likewise, writing a piece of code that glues together calls to external commands is extremely clunky in almost all other languages (except Perl and Ruby), notably Python, which is often treated as a Bash replacement.
Finding the right balance can be tricky, but choosing only Shell or only Python is not the right way to go.
Yeah I think PowerShell has a perfect balance there. Of course they were able to do so by taking lessons learned from 30 years and didn't have any legacy baggage.
Piping structured objects between apps without having to parse output is amazing. I hope Linux will come up with something similar. Of course PowerShell is open source but I don't trust MS enough to adopt it privately. I do use it at work though which makes sense as we're a MS shop.
Shell scripts require more direct systems knowledge than languages with an abstracted approach to _any_ system. As a rule of thumb if I am using python to drive (popen) system binaries I don't write in python and if I have significant data structures I avoid the shell. TBH the popular movement and younger crowd have kind of dismissed aptitude with the traditional unix shell/awk/sed as not worthy of significant effort due to any number of (mostly shallow) complaints. (gnu) Awk is a wonderful language in itself but sadly most users don't know anything about it and use it for trivial operations. These days if I am working with younger people in a team environment I default to python and avoid the bickering. That's the way wars are won - make it unpleasant enough that surrender is a better option.
Shells are old, classic human-computer UIs, so they are battle-tested, well known, and used daily by almost anyone who can write a script. That means scripting is natural for a vast cohort of users; it's natural because, while weird in language terms, it's a glue for a set of known and frequently used tools.
There are many more shell users than Go, Python, ... programmers. Also, those who happen to be programmers normally use certain languages, typically more than one and typically far more diverse than just the most-used shells, for a certain kind of activity, like programming a not-that-small project in a team. As a result their mindset is not "quickly do something / quickly automate something for me, myself and I only, on my own desktop" but to be part of a larger game, with the need to cooperate, produce code that is easy for others to read, create a set of entry points for integration, etc. Those programmers are not all, but mostly also, shell users, and they tend to use the shell not just for specific large projects but also for quick, simple & dirty personal stuff. So, like non-programmers who use the shell, they know it and use it daily.
In the past we perhaps had better tools, like Smalltalk on Xerox desktops or Emacs/zmacs Lisp on Lisp Machines, BUT their actual public is next to zero, and modern Emacs-ers are on modern desktops, not LispMs; they were typically raised on modern desktops, so they know the shell, and in Emacs they'll wrap countless shell-centric tools anyway, so in the end a script is often quicker and simpler...
Some people on this website keep insisting that you should use the 20 or so Unix DSLs to solve small problems, and then graduate to a general-purpose language like Python when you have a big one. I have pointed out one problem with this, which is that it means you have to learn a ton of arcana for each Unix DSL, and you end up with a Rube Goldberg machine of DSLs invoking DSLs. This post points out another set of problems. I now just use Xonsh (a Python dialect with some syntax sugar for shell things). Some people say that this might stop working after some updates, but I don't do this for a living, and those of you who likewise don't should consider following me.
There's an escape sequence for shell commands, and an escape sequence within that for ordinary Python code. It works fine.
The reality is that most Unix DSLs are similar to subsets of Python. Yes, Python is not the global optimum of programming languages forever more amen, but most Unix DSLs are effectively just ad-hoc procedural languages, and therefore special cases of Python. And if you're going to use Python aggressively for other things, then there's a benefit to using it as a shell as well, and thereby get some consistency.
If the project is just for "glue", then any reasons it shouldn't be used? I'm currently creating a "glue" project with shell scripts, and the biggest issue is that I'm creating a lot of environment variables.
I wrote a few scripts in Deno this week. In a single file I can import packages like markdown for parsing. It's TypeScript. I can handle arg parsing. Shebang works. This will be my go-to for small local scripts.
The only time I use shell is if I expect a pure list of commands with no loops or significant logic.
Actually, I'm not sure I've ever done a loop in bash at all.
The few times I've tried to write even trivial programs in bash resulted in me eventually encountering something ugly and unpleasant, then quitting and starting over in Python, and having a working solution extremely fast.
I had this problem, so I wrote a script. Then, I had 2 problems...
- old saying
Seriously, though, treating shell script as a prototyping tool can be really powerful. Don't be afraid to rewrite it again as a 'proper' program if you have concerns about maintainability, etc.