GNU Parallel is the first thing I usually install on top of a standard Unix userland. It's almost a superset of xargs, with many interesting features. It depends on perl, though.
htop is also pretty much a great replacement for top. And ripgrep a great replacement for the find | xargs grep pattern.
Aside from that, I'm pretty content with the Unix userland. It's remarkable how well tools have aged, thanks to being composable: doing one thing and communicating via plain text.
I'm less happy with the modern CLI-ncurses userland. Tools like e.g. mutt are little silos and don't compose that well. I've migrated to emacs, where the userland is much more composable.
I don't feel the plain-text portion has aged well at all with regard to composability. It leads to a lot of headaches as the complexity of the task grows, because of the in-band signaling and the lack of a universal format. I think it is high time the standard Unix tools were replaced with modern equivalents that had a consistent naming scheme and pipelined typed object data instead of plain text.
Plain text was chosen to interface the various programs of Unix OSes because it's the least common denominator all languages share. It also forces all tools to be composable with each other. You can take output text that was obviously not formatted for easy consumption by another program and still use all the information it contains as input to another program. Programs that were only thought to have users handling their input and output (ncurses apps) can also be forced to be used by programs through things like the expect Tcl program or Ruby's expect library.
If programs used typed data, they'd still need the option to output text to present results in a format the user can understand. To do this, a negotiation protocol could be established, like skissane said. This, in my opinion, is BAD, because then there's the possibility or probability that there will be differences in the information conveyed by the different formats.
I believe that the use of plain text as the universal format of communication between programs is one of the greatest design decisions for Unix and CLI.
How hard is it to come up with an object format (more like a data structure format, since I wouldn't want logic/code being passed around) and then come up with a standard text serializer for it? Not that hard, in my opinion.
You'd standardize it once via an RFC and you'd be done with it.
I don't think your problem is as big as you say it is.
The real problem is that this pile of code we already have kind of works and it's already making trillions for its users. Changing the whole ecosystem would cost millions and millions, for only a very long-term and unclear benefit.
It's not about backwards compatibility. It's about the fact that text is what we read as humans and if commands parse the same format there is only one output format to implement.
I rarely want text as output format, I want structured data that I can explore in a structured fashion.
As oblio said, you can come up with a standard conversion of structured data to text. The other way round, you need to write a parser for every textual output format, and typically people come up with fragile ad-hoc parsers that don't deal with edge cases properly.
I don't believe that. Text for human consumption, in a well-designed UI (and I mean even a CLI one!), should be different from text for machine consumption. Human consumption generally optimizes for characteristics almost diametrically opposed to those for machine consumption.
Of course, who am I kidding, in real life we have some sort of crappy text interface which is half-baked both for humans and for machines. But we've been using it for almost half a century and it's too widespread to redo, so there we are, plowing through it daily.
Let's imagine the OS thought so, too, and had programs require implementation of both UIs, one for machines which is hard to look through by humans, and one for humans which is automatically presented in GUI form, meaning it's hard to control and it's hard to parse the information it's presenting in a bitmap window (IOW an unautomatable interface). Now, I see 2 reasons to prefer the scenario we have now with Unix and text-based communication:
1) We don't need to depend on each individual program's programmer to present every control and information consistently between the 2 interfaces.
2) Automation matches normal, manual use. Just put what you normally do on the command line in a file and you're done. There's no need to look through documentation on how to do what you so frequently do, only in a manner that you rarely do.
I think we have enough formats that fit the requirement of serializing data structures, no need for a new one (xkcd ref goes here). You still need to be able to tell the other tool what to do with that data though. In essence instead of a series of greps and seds and awk you need a bunch of options for the next tool in the chain to tell it how to treat your serialized object. That's merely shifting the complexity around.
Also there is no need really to change anything (as in, breaking existing scripts). Selecting a different output format can simply be a command line option. Many tools already offer JSON, XML or CSV output. But since development of those tools is so decentralized, you'd be hard pressed to get them all to agree on one. But theoretically you can pick any tool you want right now, add --json support and submit a patch.
You've misunderstood me. It's not that it's hard; it's that, however nicely you do it, the result sucks.
Are TUIs like htop, tmux, vim, emacs, less, etc. going to be impossible now, or will you do the negotiation protocol? Both options suck.
When programs have both normal output and errors intermixed, are the objects going to be intermixed in the output that's presented to the user? For example, if you do a `find /etc/pacman.d`, instead of:
You could have every function incorporate its errors into its normal output, but that means giving up a standardized way of working with errors and warnings. I don't know if you know this, but when you do substitution or piping, by default, only stdout is used. That means that when you do piping, the programs in the pipeline normally do not see the errors in their inputs, and the errors of the multiple concurrently running programs are shown to you intermixed while the pipeline is working. That's a friggin' incredible effect that came from simple design, but if each one output objects, you could get syntax errors or a completely different object, like what happened in the above example. You could say, "well, only make stdout an object and let stderr be text," but the fact that they're both the same type means that you can work with the errors, or only with your errors, in pipelines and other shell constructions. For example, `find /etc 2>&1 >/dev/null` will output the directories in /etc you can't read for whatever reason. You might want to pipe that to `xargs chmod` (for whatever reason) after preparing the output to only include the paths.
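To make that concrete, here is a rough sketch of that last stderr-only pipeline (the sed expression assumes GNU find's "find: 'PATH': Permission denied" message format, and the chmod target is purely illustrative):
$ find /etc 2>&1 >/dev/null        # swap the streams: only find's error messages reach the pipe
$ find /etc 2>&1 >/dev/null | sed -n "s/^find: '\(.*\)': Permission denied$/\1/p" | xargs -r echo chmod o+rx   # echo keeps this a dry run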
Right now, programs can strike a good balance in presenting their output in a format that is readable both to humans and to other programs. By forcing their output to be structured as objects, and not giving them the option of presenting 2 formats (because we don't want that either), you're removing their ability to present the output in a manner that is readable to humans.
Take for example, rspec's output (a unit testing framework):
$ rspec spec/calculator_spec.rb
F
Failures:
1) Calculator#add returns the sum of its arguments
Failure/Error: expect(Calculator.new.add(1, 2)).to eq(3)
expected: 3
got: nil
(compared using ==)
# ./spec/calculator_spec.rb:6:in `block (3 levels) in <top (required)>'
Finished in 0.00131 seconds (files took 0.10968 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/calculator_spec.rb:5 # Calculator#add returns the sum of its arguments
Mind you, that's full of colors in the terminal. It's output that's easy to read by eye and parse with a bit of awk. Can you imagine that being output as JSON with a generic pretty printer? How would it compare when read by eye?
The main thing is, though, that, in the question of what the universal format of communication between programs written in different languages should be, text is the simpler, more natural choice over objects. Take note, I don't mean easier. The fact that it's easier is merely coincidence. Simplicity leads to good design because it means fewer arbitrary choices to make. Fewer controversial choices to make. Choosing objects leads to more questions: What should the primary types be? Should arrays/lists allow multiple types of elements? Floating types or decimals? Precision restriction on the decimals? Should integers and numbers that allow fractional parts be the same type or different? Should we have a null type? Should we have a date primary type? What about a time primary type? What about a datetime primary type? Whatever answers you give, there will always be groups of people that will dislike them. When you choose text, the only question really is, what encoding? UTF-8. Done. Natural, simple design is what we want to be the foundation that myriads of programs and languages can base themselves on and depend on.
There's only one way I'd agree with you that structured output would be nice, and that's with mono-language OSes, like a lisp OS or some other OS where all code is in the same language, and there would be no concept of programs or shared / dynamically loaded libraries or such. In an OS like that, every function is a program, and your shell is the language's REPL. This is bliss when the OS is done in your favorite language. The problem with these kinds of OSes is that we don't all like the same languages and so it'd lead to ridiculous situations where we'd translate a new language into the high level language of the OS. That's what we do in the OS known as the web browser and why we're coming up with WebAssembly.
In conclusion, multi-language OSes like those that are Unix based are awesome, and text as the basis of communication in multi-language OSes is awesome. Therefore, text as the basis of communication in Unix is awesome. :)
> You'd standardize it once via an RFC and you'd be done with it.
If you think something like this is easy, much less that "you standardize it once and you're done", then you are only cheating yourself out of an essential life lesson.
Is there any filesystem that does not allow spaces in filenames? I find the idea ridiculous. Would you design a programming language that allowed spaces in its variable names? Because that is the same level of atrocity.
If Unix pipes gained support for exchanging some kind of out-of-band signalling messages, then CLI apps could tag their output as being in a particular format, or even the two ends of a pipe could negotiate about what format to use. If sending/receiving out-of-band messages was by some new API, then it could be done in a backwards compatible way. (e.g. if other end starts reading/writing/selecting/polling/etc without trying to send/receive a control message first, then the send/receive control message API returns some error code "other end doesn't support control messages")
(But I don't really care enough about the idea to try to implement it... it would need kernel changes plus enhancements to the user space tools to use it... but, hypothetically, if PTYs got this support as well as pipes, your CLI tool could mark its output as 'text/html', and then your terminal could embed a web browser right in the middle of your terminal window to display it.)
I am not sure how this would work. rsh/ssh may be involved. I wouldn't even know how to express:
ssh carthoris ls /mnt/media/Movies | grep Spider
(this is just an example). Note that in this example, we have two processes running on two different machines. Indeed, the OSs and systems on these machines may be, um... different. Indeed, I routinely include "cloud" machines in pipelines. Indeed, with ssh, the -Y (or -X) option can introduce a GUI to a part of the command.
I have wished that shar was part of SUS. Also, I find that "exodus" is useful (across Linux anyway -- the systems have to be "reasonably" homogeneous). https://github.com/intoli/exodus
> I am not sure how this would work. rsh/ssh may be involved.
In order for this to work over SSH, the SSH client and server would need to be enhanced to exchange this data, and also an SSH protocol extension would need to be defined to convey it across the network.
One might define IOCTLs that work on pipes and PTYs to send/receive control messages. So sshd would read control messages from the PTY and pass them over the network, and the SSH client would receive them and then pass them on to its own stdout using the same IOCTLs. (Alternatively, one might expand the existing control message support that recvmsg/sendmsg supply on sockets to work on pipes and PTYs as well.)
Any program supporting such an out-of-band signalling mechanism would have to gracefully degrade when it is absent. If your SSH client or server, or some program in your pipeline, or your terminal emulator, etc, doesn't support them, just fall back on the same mechanisms used today to determine output/input formats.
(rsh is such a deprecated protocol, there would be no point in trying to extend it to support something like this.)
This would be excellent.
Finally we get to view images on the remote side (except via the iTerm hack) and view a diff in a local GUI in the middle of a session?
>your CLI tool could mark its output as 'text/html', and then your terminal could embed a web browser right in the middle of your terminal window to display it.
Ha ha, I had dreamed up something like this in one of my wilder imaginings, a while ago: A command-line shell at which you can type pipelines, involving some regular CLI commands, but also GUI commands as components, and when the pipeline is run, those GUIs will pop up in the middle of the pipeline, allow you to interact with them, and then any data output from them will go to the next component in the pipeline :) Don't actually know if the idea makes sense or would be useful.
Sure it makes sense. Just put `gvim /dev/stdin` in the pipeline you desire and write to stdout when you're done. You can add to your configuration to remap `ZZ` (which usually saves the file and exits) to `:wq! /dev/stdout` when stdout is not a terminal. I'm going to do that when I get back to my computer.
It does. For example, you could pipe the output of diff into a GUI diff viewer that starts up, making it a whole lot easier to see or merge the changes in context; you might even be able to launch a 'system default' app for that instead of a predefined one.
Another obvious one is an image viewer or editor.
> A command-line shell at which you can type pipelines, involving some regular CLI commands, but also GUI commands as components, and when the pipeline is run, those GUIs will pop up in the middle of the pipeline, allow you to interact with them, and then any data output from them will go to the next component in the pipeline
There seems to be some precedent for this sort of thing. For example, DVTM can invoke a text editor as a "filter", where the editor UI is drawn on stderr and result saved to stdout.
> where the editor UI is drawn on stderr and result saved to stdout
I did some experimenting recently and found you can open a new /dev/tty file descriptor and tell curses to use this, the rest of the application can continue reading stdin and writing stdout as normal.
This made me chuckle, because it's incredibly accurate. I started with bash and am still naturally more comfortable there for general purpose work. PowerShell is annoyingly verbose sometimes and has its own WTF moments, but there are waaay fewer surprises when working with complex scripts and variables.
It’s not included anywhere, but PowerShell is open source and you can install it on *nix now! Obviously the things that integrate with Windows aren’t there, but the object-oriented pipelining sure is.
I'm actually a fan of Powershell, but unix people take it as a personal insult if you try and tell them their 70s-era tooling is inferior in some way to something designed with 30 years of hindsight.
You're right that a shell alone can't calculate an MD5 and a separate md5 binary does it for you, but the question still stands when the common answer on the internet seems to be to write that cryptic code, as that's apparently the easiest way PowerShell provides.
The best alternative I could find is a community-maintained PowerShell extension (with just 177 GitHub stars now), which is far better, but the lack of interest in making PowerShell more straightforward is weird.
Still, all that stops PS from being a better shell in linuxland is some people writing PS-equivalents of all the small executables that ship with your linux distribution.
First of all, the two examples you present do not do the same thing. The Powershell version preserves paths and excludes more patterns than your find | xargs example. The right way to do this is to just use robocopy.
This example has targeted a very specific deficiency in the way Copy-Item works. I could similarly point out that the following Powershell command would be much more difficult in standard unix tooling:
Notmuch, which is IMHO superb and totally underrated.
I like it a lot due to its clever architecture. It never ever touches your email. It operates on a separate tag database.
Then, it's the task of a backend to translate tag changes into maildir actions before and after syncing email. Keeping the tag-to-action translation decoupled from the GUI is extremely clever, because it allows implementing basically any email workflow you can imagine.
For simple workflows, calling a one liner notmuch command is sufficient. You don't really need to implement anything.
Mu4e is an alternative client to Notmuch. Quite similar to Mutt. Gnus is the other big alternative. It's quite old, and complex to configure. Besides, the codebase is overcomplicated as it tries to do email in a news-like fashion. Still, it has lots of great ideas on how to deal with email from many sources. E.g. using predictive scoring.
+1. I spent a while trying to persuade Apple Mail to let me receive notifications only for threads I was 'watching'. I never did come up with a sane answer, and finally decided "well, I use Emacs for almost everything else. Might as well see how email pans out..."
With notmuch and mbsync I have the best email setup I've ever had. I wish I'd done it years ago.
The check-mail script is just a wrapper around mbsync (http://isync.sourceforge.net) that invokes 'notmuch new' after running, so notmuch can index and process new messages.
My notmuch post-new hook does a bunch of tagging for me, so I have to actually look at as little email as feasible, but the main thing it sounds like you're interested in is the notification setup:
With that, I can batch-process email a few times a day, while staying responsive to any discussions I actually want to be interrupted for.
The missing piece is reasonable logic for knowing when I should be notified of new threads. I'm currently in a job where people don't expect insta-responses to email, thank God, but I've been in ones where they do and I'd have to think about how to handle that more.
My one annoyance when reading email is that large inline images aren't auto-resized to fit. They should be, but the Emacs build I use doesn't have ImageMagick support compiled in.
My custom.el probably has some notmuch settings too.
Thanks, I'm in the process of migrating from Thunderbird to mutt/neomutt and am wondering if emacs might have advantages. I already use Emacs for Org Mode.
Fresh mu/mu4e user here. The advantages over mutt/neomutt are that your e-mail now becomes the part of the same consistent UX of Emacs, and then every improvement to any of your workflows in Emacs is automatically inherited by your e-mail workflow. This depends on how much you like customizing Emacs and/or writing Emacs Lisp to solve your problems. Some practical examples include:
- org-mu4e & org-notmuch will let you link directly to your e-mail messages from your org-mode files; potentially useful if you're using org-capture to quickly add TODOs and notes.
- it's trivial to add any kind of template responses, template subresponses, etc.
- you can make Emacs automatically do things in response to particular e-mails, or you can compose/send e-mails directly from any elisp code
Emacs is a fully programmable environment with lots of existing software packages and the best interoperability story I've ever seen.
As you're already using Emacs for org-mode, you could take a look at org-feed [1] for text-based RSS. It places all RSS feeds into a normal org file with headings per source and article. As it's Emacs + org, it's naturally scriptable.
I usually just run `parallel --citation` once on a new system and never see the notice again, and I have never had the notice cause any problems, as it only shows when the output goes to the screen.
Is the Perl dependency actually relevant? I've never seen a Linux distro that doesn't install Perl 5 by default, and it's also installed by default on macOS and OpenBSD. The other BSDs all have Perl in their ports, and you almost always end up installing it anyway due to the sheer amount of stuff that depends on it.
I feel like these tools very much go against the Unix philosophy of "Write programs that do one thing and do it well". They try to do the pretty user interface and the underlying operation in a single tool.
I prefer PowerShell in this respect where the output of each command is not text streams (as in the Unix world) but objects which can be operated on in a more object oriented way. You spend less time thinking about text parsing and more time thinking about the data you're working with.
I think the command line is great as a way of manipulating data streams but it is incredibly lacking as a user interface. There is very little consistency of the interface between commands and new commands and options aren't easily discoverable.
"One thing" has never been very well defined - basically every common Linux shell tool could be said to do more than one thing. GNU `grep` supports four different pattern styles (fixed strings, basic regex, extended regex and Perl regex), has lots of options for output formatting (counting, line numbers, whether to display file names, etc.) and has a bunch of rarely-used (but useful!) options for various corner cases, such as line buffering. Even GNU `cat`, the canonical "do one thing" tool, has several formatting options. If `grep` returned match objects rather than a semantically void binary stream it would be really easy to hand over formatting to another tool, but that just isn't how *nix tools work yet.
> Even GNU `cat`, the canonical "do one thing" tool, has several formatting options.
Huh, TIL. It's just never occurred to me to even bother checking the help for cat.
$ cat --help
Usage: cat [OPTION]... [FILE]...
Concatenate FILE(s) to standard output.
With no FILE, or when FILE is -, read standard input.
-A, --show-all equivalent to -vET
-b, --number-nonblank number nonempty output lines, overrides -n
-e equivalent to -vE
-E, --show-ends display $ at end of each line
-n, --number number all output lines
-s, --squeeze-blank suppress repeated empty output lines
-t equivalent to -vT
-T, --show-tabs display TAB characters as ^I
-u (ignored)
-v, --show-nonprinting use ^ and M- notation, except for LFD and TAB
--help display this help and exit
--version output version information and exit
Examples:
cat f - g Output f's contents, then standard input, then g's contents.
cat Copy standard input to standard output.
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/cat>
or available locally via: info '(coreutils) cat invocation'
The thing I would like to see with grep is to have it optionally give me a 0 return code if it doesn't find anything.
I know why the authors chose to use a non-zero return code as the default, but when I'm using grep deep in a pipeline and doing my own checking of the results to see if nothing was found, I don't need grep bombing out the whole pipeline with a non-zero return code.
The alternative of being forced to use "(grep pattern||true)" instead of a plain "grep -q" is kinda painful.
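For anyone hitting the same thing, the usual workaround today looks something like this ("some_command" is just a stand-in; the `|| true` masks grep's exit status 1 for "no matches" so pipefail doesn't kill the script):
$ set -o pipefail
$ some_command | { grep pattern || true; } | wc -l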
I'm not at my computer now, but doesn't "grep -v" solve your issue? If I remember correctly, it inverts the query, so it would return 0 if there wasn't a match.
$ cat > foo
here's
some
random
text
$ grep -v some foo; echo $?
here's
random
text
0
$ grep -v bar foo; echo $?
here's
some
random
text
0
$ grep -vE "(here's|some|random|text)" foo; echo $?
1
ripgrep definitely goes against the Unix philosophy. This is intentional. The Unix philosophy is a means to an end, and not an end unto itself. The key way that ripgrep violates the Unix philosophy is that it couples the filtering of what to search with the act of searching. You hit the nail on the head with styling, because the output of ripgrep is itself the thing that prevents composability.
However, from my own observations (and my own use), folks seem to appreciate having some amount of coupling and integration here. I've often seen folks claim that a legitimate use case (for them) for ripgrep is to "replace crazy 'find ./ ... | xargs' concoctions." The degree to which the aforementioned is "crazy" or not varies based on the individual, but there's a not insubstantial number of people who appreciate the succinctness that coupling brings.
As the maintainer, the coupling is annoying, because it means ripgrep needs to implement more stuff. For example, ripgrep provides a `--sort-files` option, which is something that standard grep doesn't need because you can fairly easily compose its output with 'sort'. You could do the same with ripgrep (ripgrep supports the standard composable grep output format), but then you lose the "pretty styling" that users appreciate. So you have no choice.
In terms of giving more structure to output, I am mostly unconvinced by your argument, but I've been happily using text streams as my user interface for a very long time, and I like its balance.
With that said, the next release of ripgrep will come with a --json flag, which enables structured output. :-)
I wonder if there is or could be a good standard way of providing decoupled tools and their coupled interface.
Last year I wrote a friendly script to search content on YouTube, which prints human-readable content if plugged to a terminal and URLs otherwise, and another one which applies a pattern to turn such YouTube URLs to the underlying video or audio streams using youtube-dl. (I don't think this plays along nicely with YouTube's ToS but whatever I'm the only user.)
The obvious use case is to look at the output of the first command, and then pipe it to the second after choosing what to watch, and find a way to feed that to mplayer. So you'd want to provide a shortcut script, maybe make it interactive… but when does it go from an alias to a new program altogether? This interaction is very intuitive in the browser and I find trying to reproduce it with a CLI tool is quite challenging.
Btw I love ripgrep, it made me look very hip in front of colleagues a couple of times.
Which is a pretty abominable tool TBH. It breaks on just about every little edge case and isn't very useful unless you like to see pretty colors and nested output, plus you have to learn its terrible output formatting spec. When using JSON I still use jansson for quick one-offs.
I've never understood complaints like this. If you are proficient in the unix environment any sort of gymnastics can be handled via generation of some 'object' from plaintext and command generation on the fly.
find $PATH -name '*.c' -exec grep -l socket {} \; | awk '{printf "mv %s %s\n",$0,sprintf("%s.old",$0)}'
find $PATH -name '*.c' -exec grep -l socket {} \; | awk 'BEGIN {n=0} {printf "{\"items\": %s",sprintf("[\"%d\",\"%s\"]}\n",n++,$0)}'
If you aren't proficient or have an aesthetic or religious aversion to unix userland and traditional tools you'll play some other game I guess. Reinventing the wheel without understanding the model and power is a next-gen game. I don't have time for it.
Just because I built ripgrep doesn't mean I've reinvented a wheel without understanding the existing model/power, so your criticism feels a bit disingenuous to me.
To be clear, with the current release of ripgrep, you cannot create structured objects from its output as easily as you might think. I get that it's fun to show how to do it with long shell pipelines for simple cases, but the current release of ripgrep would actually require you to parse color escape sequences in order to find all of the match boundaries in each line. This is what tools like VS Code do, for example. The --json output format rectifies that. There are other solutions that might be closer to the text format, but they're just more contortions on the line oriented output format that aren't clearly useful for human consumption, and it's much simpler to just give people what they want: JSON.
Wasn't referring specifically to you... but to the gist of the article and the post previous to yours, I believe. On the ANSI escape sequences to find matches, etc... yes, I get what your tool does, but having to tokenize against ANSI escape codes and other ad-hoc env artifacts is something I'm glad to leave to authors who enjoy it... not that it is terribly difficult unless you decide to reinvent the wheel and optimize everything.
This is correct. The thing is, PowerShell is not for humans. It is purposed more towards configuration and system scripting, and thus should be compared to something like Ansible. Using it as a shell is counter-productive unless you have a very specific mindset.
Unix shell, on the other hand, is a trade-off: it offers you options to process and automate and reasonable convenience when working interactively. None of those aspects is perfect, of course, but it allows gradual learning curve (i.e. being able to do quick and dirty things from the beginning), is much more versatile and ubiquitous.
At work, I always keep a PowerShell instance open. I like the fact that many common commands have Unix-like aliases, plus a lot of server software offers PowerShell interfaces for administration (e.g. Exchange, VSphere, SharePoint, ...).
On Windows, I strongly prefer PowerShell for interactive work, although I cringe a little each time I see how much memory it uses.
I disagree about objects being preferable to text in a shell. Text is very easy to reason about, and you can quickly determine what transformations you need to make based on visual feedback. Working with objects means spending a lot more time in the documentation learning what properties you have to work with.
In a programming language, it's obviously no contest. But in a shell, I want to spend more time doing and less time reading.
> Text is very easy to reason about, and you can quickly determine what transformations you need to make based on visual feedback. Working with objects means spending a lot more time in the documentation learning what properties you have to work with.
Or not. Take `ps`. You'll need to spend time in documentation anyway, figuring out what process properties can be shown with what flag, and then you'll be bitten later by things like the difference between `ps ux` and `ps aux` including process names in square brackets, etc. Contrast with PS equivalent, `Get-Process`. Type `Get-Process | Get-Member` to list properties of the objects returned by `Get-Process`, and you can quickly see both properties you can inspect (with descriptive names, not "VSZ" or "RSS") and what methods you can call directly (instead of extracting properties and piping to other programs).
This is IMO much cleaner, easier to work with interactively (properties instead of constant parsing and unparsing of text), better for interoperability (you're limited by what actual objects expose, not by what pieces of them a CLI program wishes to print, and if it so happens that objects somehow print more than they expose in properties, you can still call ToString() on them and get that data "the UNIX way"), and correctly separates presentation from content.
The only real drawback I've seen of Powershell is the lack of quality-of-life scripts and executables in the system. Like the md5 example elsewhere in this thread.
A shell is supposed to be an interactive user interface and a very thin glue layer for some simple tasks. Once you move beyond a certain level of complexity, a shell is just the wrong tool for the job.
Powershell is too complex to be a good shell, and there are too many bizarre idiosyncrasies about it to hold its own as a programming language. It just doesn't really have a place.
Yes, the shell can be seen as a subset of OS REPL that's been optimized for efficiently performing simple system tasks and gluing things together. The problem is, ever since Lisp Machines died off and UNIX won, we've lost the REPL. We're missing a tool for doing complex tasks interactively, so people naturally started to repurpose shell for that.
How should ls coloring work in your view? ls just does the file system reading and then some other tool parses the ls output matches some expressions and adds color? Or something like that?
I suppose it's just a generic colorizer tool that takes a configuration file of expressions and arbitrarily colors data fed through it?
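Crude as it is, plain grep can already be bent into a generic colorizer of sorts, since an alternation with `$` matches every line but only highlights the interesting part (just a sketch, not a real stand-in for ls coloring):
$ ls -l | grep --color=always -E '\.c$|$'    # pass every line through, highlight .c files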
I'm looking for more patterns like that, mainly to explore how to make them work in the terminal and see whether that actually has an impact on everyday actions people perform in the terminal many times per day.
The goal is not to save time, but to reduce mental friction.
With tools like these or ripgrep it is useful to know whether the authors try to replace the old tool or not. This is usually stated somewhere in their readme or faq.
If they choose to approach some problems with different semantics, I would not recommend aliasing the new tool over the old one. Just treat it as a separate tool; in most cases the new names are as comfortable to type as the 'legacy' tools'. The one exception that comes to my mind is 'exa' vs. 'ls'. Typing 'ls' is a single action for both hands while 'exa' has to be typed with the left hand alone (on QWERTY/Z layouts).
Yes, I was thinking about exa and ripgrep recently, since I am going to write a couple of tools (very loosely) like ls/find and grep in a while (for learning and as tutorials). I thought the same as you: for a command that is typed as often as ls, exa is a bit longer (even though just one letter). Of course it can be aliased to a shorter name, or one can write a shell script wrapper for it. The point about single hand vs. both is a good one; that had not occurred to me.
Edited to change: like them
to: like ls/find and grep
There's an exa fork that adds an --icons option, but you need to patch your font to display the icons since they're just text. (Search for Nerd Fonts. Some are pre-patched.)
Is anyone else impressed with the quality (and speed!) of some of the tools written in Rust? I'm an avid user of fd and bat, the former being ridiculously fast. Often I find something on github, I'm impressed by the quality of the documentation, features, UI etc, then lo and behold it's written in Rust.
Another one potentially for this list is tokei[1]
I was trying to count the code in our repos at work and used the venerable 'cloc' utility. It took over 5 minutes. Looking around I found tokei, written in rust. Same-ish results (more accurate actually) took 10 seconds.
I had a case where tar was too slow for my needs. Rust did let me cobble the right syscalls and threads together to make it more HDD-friendly and faster while forcing me to handle all the gritty filesystem error cases from the start. Lo and behold, about 4 times faster on an idle system and orders of magnitude faster on a busy system. It just would not have been possible to do it by combining shell utilities and too finicky in C.
That's the thing with Rust, it makes you work hard right from the start to make everything right, but when you have something working it's usually fast, too.
IIRC, loc (in rust) might be even faster than tokei at times? I always forget, and I’m sure I saw an old benchmark...
We have a whole working group this year focused on making the experience of writing CLIs awesome, so hopefully we’ll see even more great tools in the future!
Loc (mine) is usually faster in my tests but doesn't handle comments that start in strings, so if there's something like x="/*" it can be way off, so I usually still point people at tokei.
I need to try implementing the string thing. It'd be a lot more fun to try to compete on speed if we were the same on accuracy.
>We have a whole working group this year focused on making the experience of writing CLIs awesome
Can you explain what you mean by this? Something like the people are going to focus on language and library features that help with writing CLIs? Or something else?
We've tried to dig into various problematic areas of writing CLIs in Rust [0], worked to create or improve libraries [1], and are working on writing a "book" for Rust-based CLIs [2].
Well luckily I recently published a new comparison benchmark[0]. :)
The TL;DR is that loc is faster by a few hundred milliseconds depending on repository size, but as cgag mentions it doesn't have comment-in-string detection, so it can be quite off in its metrics; for example, on the Rust repo Tokei says it has 643,754 lines of code whereas loc says it's 635,849.
New and not-yet-popular languages tend to have more senior people learning and developing software with them; this leads to people (read: recruiters) looking for the people with experience. People equate the quality of the software with something innate to the language, or assume the developers are exceptional. Then, after being popularly adopted, it slumps in perceived elegance and goes on a downward spiral until it ends up like PHP, Ruby or Java. (Not to undersell the folks who predominantly use those languages; I'm just picking languages that I've seen follow this pattern.)
A more generous interpretation might be that Rust has sparked a CLI renaissance, since it lets developers write CLIs that have that satisfying zip that previously was only possible in C. None of PHP, Ruby or Java is responsive enough for a good CLI tool (also, static compilation is a must for wide deployment).
FreePascal has been around for a long time and supports CLI programming. Maybe not very much support in the basic language and stdlib, but the essentials are there (CLI arg handling and file I/O). Third-party libraries may have more and you can always write your own. Both speed and size of binaries are good. I had compiled some simple CLI programs and they were under 100K, maybe under 50K. It's also supposed to be quite cross-platform, though I have not checked that out.
Also, I don't know Rust (but have read that it is somewhat difficult to learn); for the basics, FreePascal (FP) may be easier than Rust to learn and start using, because although it (FP) has advanced language features, for many basic CLI programs, the simpler procedural features should be enough.
Not to mention you get a language that is enjoyable but also compiles to native code. My first Rust project was a Haml parser and I created a CLI for it too that is similar to the Ruby version. It is nice being able to ship the 8Mb executable and not have to worry about users having the right runtime.
Cargo is awesome too. Cargo reminds me of Mix (from the Elixir ecosystem). The one thing that Mix has on Cargo is that it is super simple and straight forward to extend Mix. Cargo may be the same way but from what I've seen it seems more complicated.
Sure! Take a look at Mix.Task [0]. You basically name your module Mix.Tasks.X where X is whatever you want the command to be, then include "use Mix.Task" and implement the run function, and you have now extended Mix. You call the task by running "mix X" and it will run the code in the run function. I created a small task that generates ORM models for Ecto (Elixir's de facto ORM library) using Mix tasks [1]. So the user types mix plsm and they can generate the code.
Ah ha! Thanks. So yeah, if you have an executable in your PATH named cargo-foo, then "cargo foo" will execute it. But we're planning on having that style of functionality in the future as well; the code name is "tasks", but it still doesn't have an RFC.
I'll definitely want to check out that RFC when/if it opens up. I didn't realize that cargo did that with the executables, that's good to know. Thanks!
No, not like any other compiled language. Just like any other native compiled language, maybe, but Java and C# are also compiled yet require a large runtime in order to run any application. An 8Mb executable written in Rust is going to be 8Mb total. An 8Mb executable written in C# or Java is going to weigh much more than that when you take into account the framework. Now, one could argue that it is almost a given for the JVM to be installed on a given computer but you can't say the same for .NET when doing cross-platform development.
tokei is fast but isn't the smartest tool (at least for Python). I compared it to pygount (and looked at individual files). Total lines is correct, but it's grossly miscalculating code vs. comments, reporting almost 2x the actual SLOC for a semi-large project I manage.
I manually counted a few files for comparison; I believe tokei doesn't understand Python doc comments and thinks they're code.
Since we're on the subject, I would ask this question I've been meaning to ask for a while. I love writing command line tools and I've written a few in Python but the performance isn't there.
Would you suggest me to learn Rust in order to write CLI programs? I'm also looking at haskell for the same but after this thread, I'm really thinking of going the rust way. Ideas?
I personally would strongly recommend going the Rust route, having spent a decent amount of time learning Haskell. I think you'll spend more time working on real problems and learn more about low-level programming, in addition to the code being much faster.
A very informal analysis of bat and ccat is that ccat (written in Go) has choked on large files multiple times (which resulted in panics), while bat hasn't crashed on me yet. Of course I've only tried bat on two files so far ;)
tokei is actually the very first Rust program that I found 'in the wild' while searching for a tool. I had run many Rust programs prior to then, but they had always been in the context of experimenting with Rust.
I like these kinds of articles, but it's rare I need that many CLI tools on my MacBook.
I interact with hundreds of servers on a day-to-day basis, and I don't want to go around installing random tools on my servers. But I guess it's an idea for some tools to add to my Ansible server provisioning script :-)
"I'm not sure many web developers can get away without visiting the command line."
IIS has significant market share. I'll bet most web developers who deploy on it get away without having to use the command line interface very often if at all.
I have co-workers who work on a big intranet web application that runs on IIS, and they never touch it.
They recently finally switched to git for source control, and they want to do everything via GUI. I've tried to show them how some things are just easier/better from the command line, but if something isn't doable via GUI, they won't do it. One of them keeps committing line endings differently than everyone else, and he literally won't run git config --global core.autocrlf true because... I dunno why. He just doesn't want to, because it's the command line.
Git has such a terrible command line that most things are easier in (good) GUIs than on the command line.
The only exceptions I can think of are interactive rebase which has a weirdly good command line interface and is really confusing in every GUI I've tried; and continuing/aborting rebase and cherry picks when there is a conflict. And that's only because most GUIs don't bother to actually implement that properly and mislead you into thinking that you should make a new commit.
I find git to be one of the best CLI programs I've ever used.
There are a few things I don't like, like how `git blame` requires me to put my terminal in fullscreen to see the output properly. I wish there was an option to make the output group changes of a commit together. It could put the commit details on a line before and add a single character prefix to all lines to differentiate between file lines and commit lines, just like how ag/ack/rg improve grep's output format for human viewing.
There's also a long-standing bug in `git log --graph` that causes the lines of the graph to sometimes move back to the previous line at the end of a commit description.
I love the -p/--patch option that's available in many subcommands. It's really quick to work with.
What specific things do you not like about git's CLI?
The CLI for Git is one of the most confusing CLIs I've ever seen. I can describe all the warts here, but other did that better than me:
http://stevelosh.com/blog/2013/04/git-koans/
Besides the inconsistencies, there are many gotchas which keep tripping developers. For instance, git pull always tries to merge in the remote branch, but you almost never want that. You can do: git pull --ff-only, but most developers I know don't know that and end up with a mess they have to spend time cleaning up.
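(For what it's worth, that preference can also be made the default, so a plain `git pull` refuses anything that isn't a fast-forward:)
$ git config --global pull.ff only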
The main issue is that git commands mix up so many concepts and the defaults are almost always useless. For instance, git add manages tracking files and staging commits. git rebase both deals with rebasing and history clean-up (rebase -i).
A better CLI would just have consistent clear verbs that expose the git model properly instead of mixing up concepts:
> What specific things do you not like about git's CLI?
The biggest for me is staging lines. In GitHub desktop you click the line numbers, in the command line (iirc) you navigate through chunks as they appear in the file, maybe splitting them into smaller chunks and hopefully can get it down to what you want.
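For reference, the closest the stock CLI gets is interactive add, which is hunk-based but can be narrowed down to individual lines:
$ git add -p    # step through hunks; 's' splits a hunk, 'e' opens it in your editor to stage individual lines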
I just went through this with most of the engineers at our company as they're mostly .NET devs. It's an ongoing process but I basically gave a big talk open to questions on how to use git, all the commands, and the benefits of using the CLI. Thankfully they have a willingness to learn and become better developers and so it's been going pretty smoothly outside of a couple hiccups where I had to step in and perform git-surgery and explain to them how things work a bit better.
>One of them keeps committing line endings differently than everyone else, and he literally won't run git config --global core.autocrlf true because... I dunno why. He just doesn't want to, because it's the command line.
It sounds like someone needs to have a conversation with this engineer. I personally wouldn't want a single engineer on my team or company that has a resistance to learning new skills. Learning new things and better ways to do those things is basically the job description. Hopefully they have a good reason other than being obstinate otherwise they might need to find a new company that tolerates mediocrity :/
Configuring IIS is much better on the command line. You either have a complex and slow process of going through menus and ticking boxes, or you put all the settings in one semicolon-separated string and it's done in one command.
The "learn+fuzz" part seems to always produce weird results due to my navigational habits, so I have a zero-dependency very short {ba,z}sh function that allows me to jump to preset locations (kd for "quicK Dir" or "worK Dir"):
$ cd ~/go/src/github.com/my/project
$ kd awesome_project $PWD # create bookmark
Then, anywhere:
$ kd aw # BAM
$ # yay, straight to my project!
Also, if I'm in a project with a "root", like a Makefile, Gemfile, or anything:
$ cd app/controllers/whatever/deeply/nested
$ kd # back to the project root!
$ make
I'll probably integrate it with fzf some day but for now the "prefix thing+match last entry" works well enough.
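For anyone curious, a function in that spirit can be tiny. This is only a sketch of the idea, not the actual script described above (the bookmark file path and the root markers are made up here):
kd() {
  local book="$HOME/.kd_bookmarks" dir
  if [ $# -eq 2 ]; then
    printf '%s\t%s\n' "$1" "$2" >> "$book"     # kd name /some/path -> save a bookmark
  elif [ $# -eq 1 ]; then
    dir=$(awk -F'\t' -v p="$1" 'index($1, p) == 1 { hit = $2 } END { print hit }' "$book")
    [ -n "$dir" ] && cd "$dir"                 # kd prefix -> jump to the last matching bookmark
  else
    while [ "$PWD" != / ] && [ ! -e Makefile ] && [ ! -e Gemfile ] && [ ! -d .git ]; do
      cd ..                                    # kd with no args -> climb to a project "root"
    done
  fi
}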
I installed z which is great. Then I started thinking about all the available one letter commands in bash/zsh. So I went through them. This is the result.
$d lists recent dirs
$g a git shortcut
$l shorthand for ls -l
$t It's just after a quarter past twelve.
$w Show who is logged on and what they are doing.
$x X - a portable, network-transparent window system
$z zsh fast finder
I'm surprised that no-one has mentioned mtr yet. I like it much better than ping/traceroute and I don't really see anything in prettyping that makes me willing to switch.
At the end of the article csvkit is given an honourable mention. I’m a big fan and I’ve used it a lot in the past, but these days I’d say that xsv > csvkit for working with CSV files on the command-line.
I don't know how it compares to xsv, but you might be interested in Miller/mlr (http://johnkerl.org/miller/doc/index.html), I've used it because csvkit was too slow for what I needed to do and was pleasantly surprised by its features.
Cat is just a tool to dump files to stdout and maybe concatenate them. If you want paging, syntax highlighting, etc, you probably want a tool to replace `less`, not `cat`.
Isn't "view" just an alias to invoke vim on read-only mode (i.e. you can still edit the text, do anything else you can do on vim and then save the contents to another file instead of the original file)?
I keep a running list in my dotfiles repo[1] of the Unix replacements that I use. One downside is that I don't know the standard tools as well as I would like because I almost never use them.
Rust is compiled, is fast and it's a modern language. It's also a demanding language so people who are willing to learn it are more likely to be craftsmen.
Noti looks nice, but typically I just do `whatever-long-command && tput bel`. On Macs, the Terminal dock icon bounces and gets a badge whenever the console bell rings and the terminal is in the background.
This is one of my favorite tricks. The nice thing is that it still works if you’re ssh’ed into a remote host. If I’m feeling nostalgic I’ll do ‘... && say “files done”’ (which only works locally)
There's the convenience of these tools, then there's convenience of having a small number of non-standard tools to carry along with you, however easy that may be.
For example, for any diff'ing or cat-like need, I find it SUPER handy to just pipe stdout to vim like so:
$> thing-with-output-that-needs-navigating | vim -
I'm not an Emacs user, but my impression is that over in that world, the shell integrations are crazy good.
The inclusion of Ponysay over cowsay is dubious at best.
Let me be clear: Ponysay has its niche. But with the Ansible stack providing great out of the box integration with Cowsay, it's clear which is the tool of choice for modern workflows.
On a serious note, tldr is amazing in some cases. The man page for "tar" compared to its tldr page is much more unwieldy for 99% of use cases, for example.
I love ripgrep and htop, and I think that a lot of people new to *nix cli's would love tree as well!
>The talk reviews reasons for UNIX's popularity and shows, using UCB cat as a
primary example, how UNIX has grown fat. cat isn't for printing files with line
numbers, it isn't for compressing multiple blank lines, it's not for looking at
non-printing ASCII characters, it's for concatenating files.
>We are reminded that ls isn't the place for code to break a single column into
multiple ones, and that mailnews shouldn't have its own more processing or joke
encryption code.
Nice list, but I tend to keep my stuff as close to default as possible. The feeling of not being at home on a new box outweighs the benefits of customizations, in most of the cases.
I do have a huge .vimrc and some bash niceties, but I found it better for me to exercise some customization discipline overall.
I'm so relieved that this blog post isn't about a Bash library. It's definitely a great collection of tools; I'm going to have to add some of these to my recommendations.
Filewatcher does the same as entr. In addition it exports environment variables. It makes it possible to send the name of the updated file as a parameter to commands like this:
> The bat command also allows me to search during output (only if the output is longer than the screen height) using the / key binding (similarly to less searching).
You can do this with `less -F <file>`. It will show the file on the screen like `cat` unless it is longer than one screen, then it will go into paging mode and allow search with `/`.
Side note, how do you make his nice-looking prompt?
He mentions using ccat previously, I've been using it for a while (aliased as `cat`) but I've been meaning to switch away from it since it sometimes panics. `bat` looks very nice, just installed it and played around with it a bit.
atop is a replacement for both top and htop. By default it displays CPU frequencies, disk I/O (read/write bandwidth, average I/O time), per-process disk I/O, and basic network load. And it wastes no space as htop does.
Oh this was definitely a bug already, and a nasty one at that, as it froze the whole computer in a few minutes while the kernel was racing against itself and starving for some resource (which didn't happen while being run as root). It was fixed in macOS 10.13.4 / htop 2.2.0
"bat" is not a better version of cat, it's a completely different tool. cat is short for concatenate. If you want to view a file use less, and if you want syntax highlighting etc. then use view (comes with vim).
Most tools mentioned are completely different tools from the ones they're suggested to be better versions of. I understand the comparison is for specific use-cases. For example, csvkit is far better than trying and failing to parse CSV with awk, but it's very much useless for any other type of text format. Same thing with jq (a JSON query tool) vs grep (a general text search tool). The only ones I can agree with in the general case are htop > top and ag || ack > grep (though I still prefer grep sometimes).
If you don't mind me asking, in roughly what cases do you reach for grep instead of ag || ack? Is it a familiarity thing, or specific features or something else?
Well there's the portability of grep which is better for scripts. Besides that, most if not all times I do a simple `ag pattern file`, I'm interested only in the results and not in the location. I'm probably preparing the pattern to output something to provide to another command via stdin or substitution and at those times (which is also in the interactive command line), I want to see exactly what I'm going to pass in. To remove the numbers ag adds by default, I have to use --nonumbers, which is weirdly long given that one of the main benefits of ag is the brevity of the commands (ag is short and easy to type, no need to include options -r, -i, -P, or -n (when needed)). Instead of using that option, I just switch to grep. I really wish ag changed that default to be consistent with grep and ack or at least change the way its options are negated, like making it a short option and negating like +N.
EDIT: I just saw your comment where you stated you're the author of rg. I honestly hadn't tried it before now. It seems you also chose to output line numbers by default in that case, but it's nice that you have -N. If you don't mind me asking, do you have an option like ag's undocumented -W which allows specifying a max line width beyond which a matching line is truncated? Regularly, when searching a hierarchy, a match happens in a generated file where everything is on a single line, and so my terminal prints the million-character line. -W displays a bracketed ellipsis once the max width is reached on a line.
> It seems you also chose to output line numbers by default in that case, but it's nice that you have -N.
Right. When you run ripgrep with its output connected to a tty, then its output is "prettified." That means results are grouped by file, colorized and include line numbers. But if you aren't connected to a tty, then ripgrep reverts to the standard grep format (e.g., no line numbers). This means you should be able to use ripgrep in pipelines pretty much exactly like you would use grep. It just does the right thing. You can test this easily by comparing the output of `rg <pattern>` with `rg <pattern> | cat`. ag also does this to some extent (by disabling grouping and colors), but does still include line numbers in most cases.
> do you have an option like ag's undocumented -W which allows specifying a max line width for which a matching line is displayed?
Yes. That's the -M/--max-columns option. I have that set by default in my config:
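(For anyone who hasn't used ripgrep config files: ripgrep reads whatever file RIPGREP_CONFIG_PATH points at, one flag per line. A hypothetical example, not necessarily the author's actual settings:)
# e.g. export RIPGREP_CONFIG_PATH="$HOME/.ripgreprc"
--max-columns=150
--smart-case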
> When you run ripgrep with its output connected to a tty, then its output is "prettified." That means results are grouped by file, colorized and include line numbers. But if you aren't connected to a tty, then ripgrep reverts to the standard grep format (e.g., no line numbers). This means you should be able to use ripgrep in pipelines pretty much exactly like you would use grep. It just does the right thing.
Yes, I knew that. What I meant to say is that in the particular case of "rg pattern file", I feel it doesn't do the right thing. I understand that my usual usage of that kind of invocation may not be like the majority's, so I understand that other people might prefer the default as it is right now. In my usual usage, though, I feel the numbers are burdensome, because I have to imagine the output without them to know what I'm passing to the next command I've yet to type in the pipeline. I can't remember the last time I used that kind of invocation to look for the line number a match was on. I only ever use the line numbers when matching a directory or multiple files.
> Yes. That's the -M/--max-columns option. I have that set by default in my config:
That's awesome, and thanks for sharing your config.
> When you run ripgrep with its output connected to a tty, then its output is "prettified." That means results are grouped by file, colorized and include line numbers. But if you aren't connected to a tty, then ripgrep reverts to the standard grep format (e.g., no line numbers).
That sounds both good (for direct use) and bad (for developing scripts) at the same time. And it's a general pattern. I wonder, is there a standard UNIX/Linux way of saying "run this, but pretend I'm not connected to a tty"?
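(There's no single standard switch, but the usual tricks are piping through cat to lose the tty on stdout, or going the other way with script(1) to fake one; the script invocation below is the util-linux form:)
$ rg pattern | cat                       # stdout is now a pipe, so rg falls back to the plain grep-style format
$ script -qec 'rg pattern' /dev/null     # the opposite: run a command that thinks it has a tty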
Scripts tend to be shared, so portability matters. grep alternatives, while great for interactive use, are generally not worth being an extra dependency to install.
I use rg on all my scripts that I won't be sharing with others. Anything that needs to be used by other people / needs to be run on any machine that I don't control, uses grep instead.
ag is made to search code specifically. grep is a more general-purpose text search tool and better suited (imho) for handling data (in contrast to code). Also, it's always installed. I've never been on a machine with ag unless I've installed it myself.
Errmm, yes, I'm familiar with the difference in motivations behind the tools. I guess what I'm looking for is specific examples of things that cause you to use grep instead of ag. Ubiquity is a good one. But let's say you have both ag and grep. When do you reach for one and why?
(I should have said this in my first comment, but I'm the author of ripgrep, and I'm just generally interested in learning more about the differing use cases for these tools, in the words of users. I certainly have my own ideas about the question!)
Ah yes! That is a good one too. That's because grep uses "basic" regexes by default, and this sort of use case is one area where they work nicely. The downside is needing to remember which meta characters need to be escaped.
For ripgrep, you can enable literal search the same way you do it in grep: with the -F flag.
They're really meant for the same thing. You can also search the filesystem with grep by using -R and providing a path explicitly like `grep -R pattern .`.
EDIT: The real differences (at least the most important ones for me) between grep and these tools are 1) they format filesystem searches better for human viewing, 2) they provide more useful defaults, 3) in the case of ag, you can specify a max line width to avoid the terminal being filled by a matching line that has a ridiculous length, which typically happens with generated files, 4) they're supposed to be faster although I personally don't have searches big enough to notice the difference, but I imagine it's a big deal to others.
I'm starting to use vimpager after realizing I don't feel good seeing two different colorings in the terminal and in vim; this way, whatever your vim supports (colors, utilities, key bindings) can be carried over to the pager.
The project is still rough around the edges but mostly good.
> if you want syntax highlighting etc. then use view
I'd like to recommend pygmentize also for syntax highlighting. Works like cat and supports a lot of languages. But since pygmentize is written in Python, it may not run as fast as bat. (I haven't tried bat though, because pygmentize is fast enough for my daily use cases.)
Can we use the term text user interface (TUI) and not CLI for the stuff talked about here? I don't like the confusion, and I think TUI is appropriate; just because something is launched from the command line doesn't make it a CLI. A CLI is good for pipes; these things are interactive after they start.
I was wondering if any of you use alternatives or hacks for the cd command?
I've been testing out a couple of different ones, like xd, fcd, wcd and pushd/popd, but I'm not quite sure which I should commit to or if there are better ways :)
alias ..="cd .."
alias ...="cd ../.."
alias gcd="cd (git rev-parse --show-toplevel)"
The first two should be self explanatory.
The third one will take you to the "git root" of a directory structure, i.e. the top-level folder where the .git directory is (I find this grounds me for when I'm doing git commands.)
Nothing fancy. I simply use the Zsh dirstack (with auto_pushd), so 'cd -', instead of being limited to the previous path only like in Bash, can be expanded with Tab to one of the last DIRSTACKSIZE paths.
The dirstack can also be saved to a file and reused on the next session.
Then I've stolen the "up" function somewhere to cd to a specific level using a part of the path when I'm deep in a FS tree.
Finally, not limited to cd only, Zsh can also expand paths with only the initials of the dirs, so 'cd /v/c/a/a' followed by Tab is expanded to 'cd /var/cache/apt/archives/', usually with less typing than Bash requires...
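(The "up" function mentioned above can be approximated in a line or two; this is a generic sketch, not the original:)
up() { cd "${PWD%/"$1"/*}/$1"; }    # e.g. from .../src/a/b/c, `up src` jumps back to .../src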
I use z [1], it builds up a list of folders you visit based on frequency and lets you quickly jump into them by fuzzy-searching it. Z also has tab completion allowing you to cycle through the matches. See project's home for examples of its work at the link below.