For me, bash and jq are, literally, the opposite of riding a bicycle. It doesn't matter how much time I spend working with them in a given week: a month later, I am gonna have to skim through my bookmarks and Kagi results (and now also ChatGPT) to figure out how to do stuff I was easily doing a month ago.
I also observed this when using most cli tools… I think it’s a common problem for tools you have to reach for a couple times a month/quarter (versus programming language when you’re coding almost everyday)
My solution was literally to create Anki cards every time I discover a neat feature that I might not remember but that would be useful to know. I just go through them once a day for 10 minutes and it works like a charm. My memory for various cli tools has drastically improved. Rarely do I need to reach for Google, man pages or ChatGPT for most cli tool usage. I’d recommend spaced repetition for cli tools.
I get Bash, but for jq I found that my small fusillade of Anki flash cards was more than enough to get a fingertip feel for its syntax. Amazing what 50 flashcards of jq (or awk, or sed, or regexes, or any DSL really) gets you in the long run.
It's more important to understand the possibilities than remember the details. Details can always be quickly looked up, as long as you know what to look for and can conceptualize which tools to combine to achieve a goal.
If I find myself struggling with a task I’ve done a handful of times, I just make a page for it in obsidian with the snippet I need and an explanation of how it works.
I always struggled with understanding JQ. Each time I was just googling things. But actually it does make a lot of sense if you understand the building blocks. I wrote it all down [1] but here is my summary:
jq lets you select elements like it's a JavaScript object using dot notation and array indexing.
jq '.key.subkey.subsubkey'
jq '.key[].subkey[2]'
You can wrap things in array constructors or object constructors to create new lists and objects:
jq '[ .[].key ]'
jq '{key1: .key1, key2: .key2}'
You can combine filters with pipes (|) to build complex transformations. Built-ins like map() and select() are useful for transforming arrays.
This query fetches GitHub issues, transforms them into a simplified structure, filters out unlabeled issues, sorts them, and wraps the results in an array - demonstrating how you can chain together jq's query language to wrangle JSON data.
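The query itself is not shown in this comment as posted. For anyone trying to picture the steps, here is a rough sketch of them in Python, with a jq filter of about the right shape in the comments. The sample data and the field names (`title`, `number`, `labels[].name`) are assumptions loosely modelled on GitHub's issue JSON, and the jq filter is a guess at the shape, not the article's actual query.

```python
# Hypothetical sample data, loosely shaped like a GitHub issues API response.
issues = [
    {"number": 3, "title": "Crash on empty input", "labels": [{"name": "bug"}]},
    {"number": 1, "title": "Add docs", "labels": []},
    {"number": 2, "title": "Support YAML",
     "labels": [{"name": "feature"}, {"name": "help wanted"}]},
]

# A jq filter of roughly this shape (an assumption, not the article's query):
#   [ .[]
#     | {title, number, labels: [.labels[].name]}
#     | select(.labels | length > 0) ]
#   | sort_by(.number)


def simplify(issue):
    # Object constructor: keep only the fields we care about.
    return {
        "title": issue["title"],
        "number": issue["number"],
        "labels": [label["name"] for label in issue["labels"]],
    }


simplified = (simplify(issue) for issue in issues)          # .[] | {...}
labelled = (i for i in simplified if len(i["labels"]) > 0)  # select(...)
result = sorted(labelled, key=lambda i: i["number"])        # [...] | sort_by(.number)

for item in result:
    print(item)
```

Each Python line maps onto one stage of the pipe, which is most of what there is to "chaining" in jq.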
I was curious so looked up how it works before reading the summary at the end, and that led me to find another user of aioli.js jq implementation: https://jiehong.gitlab.io/jq_offline/ (featured https://news.ycombinator.com/item?id=28627172 two years ago); jqplay.org still sends all the data on every modification so they should learn from it...
Anyway, this article is neat! Good work!
If I were to nitpick: one of the last examples, the one with path, has no explanation and flew over my head (I would have to open the documentation), and a reset button for each example might be nice after messing with it a bit, but it was a nice play.
Regarding the reset button: I think that's a great suggestion and now it bugs me so much that I can't reset it. I'll add a reset button later tonight when I'm off work.
Regarding the confusing example: Yes, some of the examples are missing explanations (mainly because I spent more than a month on this post and I just did not want to put off putting it out any longer). Sorry haha. I'll try to improve the explanations and add more.
JQ is an insanely powerful language, just to put to rest any of your doubts about what it is capable of here is an implementation of JQ... in JQ itself:
Also, for prototyping, you can use it to tailor the output of APIs to what you need in a pinch, using JQ as a library, especially with something like Python:
JQ + journald is great too, but 20 years of muscle memory writing bash / python / perl / awk / sql / ruby / JS / CSS selectors / xpath / xmlstarlet one-liners keep getting in my way. I keep long notes on both with examples of common tasks. I still dislike yaml (significant whitespace is my “ick” as the kids say) too much to learn whatever the equivalent is for that and still find CSV/TSV easier to slice and dice at will due to my own personal history.
I’m sure at this point that many ETL jobs in notebooks we run at $BigCo today could be reduced to jq expressions that run 100x faster and use 1/10th the memory.
The ‘nearly’ Turing complete is something I wonder about. It feels like jq might have some limitations - transformations it can’t do, due to some inherent limitation of how it handles scope or data flow. The esoteric syntax makes it hard to determine sometimes whether what you are attempting is actually possible.
As soon as jq scripts reach a certain level of complexity I break out to writing a node script instead.
And given how rapidly jq scripts acquire complexity, that level is pretty low. One nested lookup, and I’m out.
jq does often feel like a code golf language. I would say it does have some of those Perl one-liner vibes, that is to say that it is often a write-only language.
Also the ‘nearly’ part is because I don’t remember if it has infinite loops or if it is more like Starlark and thus decidable. I do have vague recollections of causing infinite cycles in JQ, so it could quite well be entirely Turing complete.
So far I have not found a single task that JQ was incapable of. And I have abused it pretty badly in my spare time =], for the intellectual challenge.
jq lacks coroutines, which means some tasks can be hard to accomplish in jq. It's still a very powerful language, and it is Turing complete, not just nearly.
Thank you so much for piecing together a great example (jqjq) to help open everyone’s eyes that JQ is not just a JSONpath implementation with weird syntax! I often reference it to drive home the fact that JQ is a full blown language.
The brainfuck one is also gonna be going into my notes. That implementation is quite terse.
Great to hear and that was one of my hopes! but honestly it initially came to be because i was fiddling with some jq AST-tree stuff for fq :) weirdly it was much easier to implement than i expected. Hardest part was how to handle infix operators, +/- etc, parsing without infinite recursion. But once i found and managed to implement precedence climbing things got a lot easier, it's still a bit of magic to me how well it works :) the eval part had some difficulties but was mostly straightforward when you can piggy-back on the "host jq", but i tried to stay away from piggy-backing too much; to not piggy-back at all probably requires implementing a VM somehow.
BTW you're very welcome to help improve the jq documentation. Some other maintainers and I have been talking about how it probably needs an overhaul to be more approachable and to better document some nice hidden features. Join the Discord if you want!
For whatever reason jq is one tool that I simply can never remember the syntax for. It's ChatGPT every time for me. I just can't remember the specifics of how it differs from jsonpath vs jmespath (used by AWS) .... I wish there was a way for every tool to just use jsonpath instead.
i get that it's more powerful but I almost consider it an anti-feature. I have very good tools for doing transformations (sed,awk,perl, etc etc) the problem is they all want line oriented format and JSON breaks that. So all I want is a tool to go from json => line oriented and I will do the rest with the vast library of experience I already have at transformations on the command line.
> So all I want is a tool to go from json => line oriented and I will do the rest with the vast library of experience I already have at transformations on the command line.
The tool for that is likely https://github.com/tomnomnom/gron
It's probably the best tool to go back and forth easily between json and line oriented.
The first and foremost thing to know about jq is that it's built on path expressions, so the first thing to learn is how to write path expressions. Fortunately path expressions are easy in jq!
.a       # Get the value of the "a" key in the current input object
.[0]     # Get the value of the first element in the current input array
.a[0]    # Get the value of the first element in the array at the key named "a" in the current input object.
         #
         # I.e., path expressions chain:
.a[0].b  # Get the value of the "b" key in ...
Things get more interesting when you see that `.[]` is the iterator operator, and that you can use it in path expressions.
Things get really interesting when you see that `select(conditional expression)` can be used in path expressions joined with `|`.
Just this can be very useful. It's also useful to know about the magic `path()` function, and `paths`, which I often use to just list all the paths in an input JSON text. Try applying `jq -c paths` to a `kubectl get -o json pods` command's output!
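If you have not seen what `paths` prints, here is a minimal Python sketch of the idea: walk the document and emit the path to every nested element. This only loosely models jq's behaviour, and the input is a made-up, heavily trimmed stand-in for `kubectl get -o json pods` output, not real cluster data.

```python
import json


def paths(value, prefix=()):
    """Yield the path to every element nested inside value, depth first."""
    if isinstance(value, dict):
        children = value.items()
    elif isinstance(value, list):
        children = enumerate(value)
    else:
        return  # scalars have nothing nested inside them
    for key, child in children:
        path = prefix + (key,)
        yield list(path)
        yield from paths(child, path)


# Hypothetical, heavily trimmed stand-in for `kubectl get -o json pods` output.
doc = json.loads(
    '{"items": [{"metadata": {"name": "web-0"}, "status": {"phase": "Running"}}]}'
)

for p in paths(doc):
    print(json.dumps(p))
```

Each printed line is an array of keys and indexes, which is why the listing is handy for finding out where a field lives before writing the real filter.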
It's great for building large complex queries that will eventually live in scripts, but your zsh plugin seems to hit a real sweet spot of fast feedback for ad-hoc queries too! Huge props!
Yeah, I tend to use that one as well, but for me it just feels 'right' as a line editor plugin. I'm running a lot of kubectl commands and for me this plugin proved to be invaluable.
zshbuiltins(1): Unlike parameter assignment statements, typeset's exit status on an assignment that involves a command substitution does not reflect the exit status of the command substitution. Therefore, to test for an error in a command substitution, separate the declaration of the parameter from its initialization.
Nice plugin; I got it and will be using it. Browsing the code, I saw a couple of small errors; not too serious, but some error handling is incorrect. In your `jq_complete()` function, for instance, you have
local query="$(__get_query)"
local ret=$?
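Unless the `local` assignment to `query` fails, `ret` will always be 0 regardless of the return value of `__get_query`. To fix this, you would need to declare the variable first and assign it separately, i.e. `local query` on its own line followed by `query="$(__get_query)"`, so that `$?` reflects the command substitution.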
Nice, esp reading Calzifier’s comment above and remembering how many times I’ve cursed the JQ syntax because of quoting issues…another “trick” I’ve been using is for any non-trivial JQ filter, stick it in a file or at least a heredoc and feed it to JQ using -f for much less quote-escaping malarkey.
nice, but... I'd written something like this (as a program you pipe to, not autocomplete) before, but when there's an error, I try to show the error then the last-good-output. The reason for this is that when you're typing a complex command you want to have the json visible to guide your thinking, just displaying the error hides it.
The way I did this was to store both the last working query and the last working output, I'd only reuse it if the last working query was a prefix of the current query - that avoids the awkward case where you are deleting letters from the output, so you need an output further back in history (which I didn't store, wasn't worth the hassle)
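In case the prefix rule is hard to picture, here is a minimal sketch of it in Python. The names `LastGoodOutput`, `render` and `fake_run` are made up for illustration, and `fake_run` is only a stand-in for actually invoking jq; this is not the code from the program described above.

```python
class LastGoodOutput:
    """Remember the most recent query that worked, and what it printed."""

    def __init__(self):
        self.query = None
        self.output = None

    def render(self, query, run):
        """Run `query` with the caller-supplied `run` function.

        `run` is a hypothetical callable that returns the output text or
        raises ValueError when the query is invalid.
        """
        try:
            output = run(query)
        except ValueError as err:
            # Only fall back when the last good query is a prefix of what is
            # being typed now; otherwise the saved output would be misleading.
            if self.query is not None and query.startswith(self.query):
                return f"error: {err}\n{self.output}"
            return f"error: {err}"
        self.query, self.output = query, output
        return output


def fake_run(query):
    # Stand-in for invoking jq: treats a trailing "|" or "." as incomplete.
    if query.endswith(("|", ".")) and query != ".":
        raise ValueError("incomplete query")
    return f"<output of {query}>"


view = LastGoodOutput()
print(view.render(".items", fake_run))    # works, gets remembered
print(view.render(".items |", fake_run))  # error, plus the output of ".items"
print(view.render(".item.", fake_run))    # error only: ".items" is not a prefix
```

Storing a single (query, output) pair keeps it simple; the cost is exactly the case mentioned above, where deleting characters leaves you with no saved output to show.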
For me jq is my epitome of "When faced with a problem a programmer says 'I know I can use X' and now they have two problems"
I continually bounce off the "language/philosophy" of jq in quite embarrassing ways. Every time I go "Ah, I can use this as a reason to learn jq" and half an hour later I've written a python script to extract the data instead.
x1000 this. I find I have similar reasoning that I apply to awk. I _know_ some people get massive benefits out of using it, I just don't need it often enough to actually pick it up... GPT to the rescue I suppose
Great article. Nice to have it interactive. How does it work? Do you have a terminal running somewhere or does it run in the browser?
One thing I noticed, and where I stopped continuing, is that the jump from Filtering Nested Arrays to Flattening Nested JSON Objects, is WAAAY too big. From a simple filter to triple nested filters with keywords that had no introduction in a simpler example, isn’t working for me
It seems like jq is getting a nice boost due to how useful it is getting JSON into and out of OpenAI and LLM environments that understand jq. The big new release/relaunch shows the project is up and running again so maybe we see even more integration with Agent/Function type use cases or some pydantic-ish guardrails. Thanks for the Bookmark !
Out of any language I’ve worked with, Ruby was #1 overall in being able to solve problems closest to how I thought about them conceptually if that makes sense. In other words, getting some way of doing something out of my brain and translating it to Ruby was the most seamless (I haven’t tried any real lisp in earnest), with ES6 a pretty close second now.
Even though I use it a lot professionally, I really don’t like Python much, and I like Java / Gradle less. The whole “There should be one-- and preferably only one --obvious way to do it” thing never held water in my opinion, and things like making an HTTP request then doing something useful with the output are really not fun in Python without using things like the Requests lib… even closing files or having to care where a cursor is, like what year is it again? Python doesn’t feel like a high-level language many times since it keeps forcing me to deal with minutia and doing the wrong thing by default. Then there’s the whole disappointment with PEP 582 getting rejected (check out the PDM project to see what Python packaging could have been) and I just can’t help but really despise it despite how useful it continues to be.
Sorry, just hearing the desire for anything else to be more pythonic in any way takes my brain to a dark place of PyTSD.
Hey, at least we don’t have to write TS types for JQ (yet).
If you have trouble remembering jq syntax (or any other weird CLIs) I'd recommend increasing the number of lines of history stored in your shell and finding a way (I use FZF) to search through that history.
I do a quick ctrl+r, type jq, and I can find all of my JQ snippets I've used in the past couple of years. If I then type "select" I can find all of the times I've used that function, etc.
I also use it to find while loops, kubectl snippets, environment variables I exported to run a script, etc.
This made me think: If you wanted to make an 'inverted bottom-up' introduction to the suite of Unix command line tools, you could go in the direction of more-to-less-structured text formats and the common tools we use with them quite easily.
1. JSON: `curl` to get interesting JSON APIs, `gron` and `grep` to explore what's inside them, `jq` to process them into interesting formats.
2. CSV: Lots of good choices here. `xsv` is very popular but I think development ended a while back; I like the `csvkit` just because I like tabbing through the options you have here. `miller` I've heard good things about. Or go to the total opposite end of the direction, use Simon Willison's excellent `csvs-to-sqlite` in conjunction with `datasette`, and then do a foray into the many interesting things you can do in SQL.
3. Bespoke text formats - `sed`, `awk`, and possibly even Vim macros reign supreme here, along with the rest of the "standard" Unix text kit. The big benefits of introducing these last is that these tools work as a superset of many of the previous ones for added flexibility.
Jq has one of the worst, most non-intuitive, non-self-evident syntaxes ever devised on planet Earth. Bash's if constructs are a walk in the park compared to general jq syntax. And people try to sort out that mess... somehow people always want to climb a mountain when it's in their way or someone says that would be an achievement of some sort...
I found Jq to be difficult to use which is why Oj, https://github.com/ohler55/ojg is based on JSONPath. There still are a lot of options but it only takes a couple of help screens to figure out what the options are.
If you can't recall the syntax 3 days later it's terrible. Same applies to awk, sed and the like. No matter how many times I used them for something, I always have to resort to the documentation, chatgpt or the like because I just can't remember the contrived syntax. Maybe I'm retarded somehow... ¯\_(ツ)_/¯
Python (dead simple, easy to recall):
if monkey == 'fat':
  print('happy')
vs. bash (horrible syntax pitfalls):
if [ "$monkey" = 'fat' ]; then
  echo 'happy'
fi
you gonna forget about the semicolon, the spaces around the condition and it will error out. not to mention integer comparison operators...
python also has some terrible syntax, but those are advanced things, like list comprehensions.
jq...it's the same as awk, sed, bash... hard to remember for the reasons mentioned above
I think it's quite nice and intuitive to describe a pipe of filters as: filter | filter | filter. Also note that a single filter in jq is 1:N, 1 input N outputs, where N can be zero, which is a very useful feature. To express this in python etc you would probably need nested loops (for i in ...: for j in ...) and/or nested yield from somehow, which results in a not very CLI/ad-hoc-query-friendly syntax.
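To make the 1:N point concrete, here is a small sketch of what the same thing looks like with Python generators. The helper names (`iterate`, `field`, `select`, `pipe`) are made up for this example and only loosely mirror jq's `.[]`, `.name`, `select(...)` and `|`; the data is invented too.

```python
def iterate(value):
    """Like jq's `.[]`: one input, one output per element."""
    yield from (value.values() if isinstance(value, dict) else value)


def field(name):
    """Like jq's `.name`: one input, exactly one output."""
    def run(value):
        yield value[name]
    return run


def select(predicate):
    """Like jq's `select(...)`: one input, zero or one output."""
    def run(value):
        if predicate(value):
            yield value
    return run


def pipe(*filters):
    """Like `f | g | h`: feed every output of one filter into the next."""
    def run(value):
        outputs = [value]
        for f in filters:
            # Eagerly collect each stage's outputs; fine for a small sketch.
            outputs = [out for item in outputs for out in f(item)]
        yield from outputs
    return run


# Roughly `.users[] | select(.age >= 18) | .name`, on made-up data.
data = {"users": [{"name": "ann", "age": 31},
                  {"name": "bob", "age": 12},
                  {"name": "cy", "age": 45}]}
adults = pipe(field("users"), iterate, select(lambda u: u["age"] >= 18), field("name"))
print(list(adults(data)))
```

It works, but the jq one-liner in the comment is a lot less to type at a prompt than the four helper definitions it takes to get there.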
Aside, your post looks funny because 2-space indentation triggers code formatting (so only the body of those if statements are formatted as such, instead of the whole thing).
i think the problem is it doesn't have easy to find documentation, if any at all, that introduces it as a formal language so everything is just a very hard guessing game of easily forgettable syntax.
You know, a list of reserved words, what their functions are, how the structure works, etc ... the kind you'll see if you pick up some "intro to <language X>" book from no-starch or o'reilly or the kind that GNU Awk/sed/dc have.
I suspect Stephen made a trade off between terseness and power and it being intuitive. Part of the value of jq is that it's effectively a "small program" that can easily be piped in a one liner rather than a program that has clear English keywords as an instruction.
Reading through these comments here: some praise jq, others claim it is not possible to actually remember the syntax. There seems to be a consensus that it is reminiscent of the complexity of awk, bash, sed... While I appreciate the magic behind jq, from an intellectual point of view and also as a tool, it is indeed impossible for me to remember a reasonable part of it.
Interestingly I still remember most Perl5 syntax, even the crazy stuff, quite vividly, after some 6-7 years of not writing Perl code. I wonder why - perhaps because Perl is not so complex (even the PCRE), and perhaps because one needs jq now and then, while Perl can be a primary tool for many things. Sadly, Perl is past its prime now, and there are no indications it'll ever make a comeback.
One thing that has helped me write simple/intermediate jq code is this: Imagine what the context is for your filter. Most importantly, update that context at each pipe character '|'.
On an empty command, the context is the top-level of your JSON. As you add filter stages, that context evolves.
(this really requires more explanation and diagrams than I have room for in this margin)
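Short of diagrams, one way to see the context evolve is to print it after every stage. Below is a rough Python sketch that does this for a hand-translated version of the filter `.pods[] | .containers[] | .image`; the input document, the `stages` list and the `trace` helper are all made up for illustration.

```python
import json

# Hypothetical input and a hand-translated version of the jq filter
#   .pods[] | .containers[] | .image
doc = {"pods": [{"containers": [{"image": "nginx"}, {"image": "envoy"}]},
                {"containers": [{"image": "redis"}]}]}

# Each stage maps one context to a list of new contexts.
stages = [
    (".pods[]", lambda ctx: ctx["pods"]),
    (".containers[]", lambda ctx: ctx["containers"]),
    (".image", lambda ctx: [ctx["image"]]),
]


def trace(value, stages):
    """Return, for each stage, the list of contexts the next stage will see."""
    contexts = [value]
    history = [("(start)", contexts)]
    for label, step in stages:
        contexts = [out for ctx in contexts for out in step(ctx)]
        history.append((label, contexts))
    return history


for label, contexts in trace(doc, stages):
    print(f"after {label}:")
    for ctx in contexts:
        print("   ", json.dumps(ctx))
```

The whole document is the only context at the start, then one context per pod, then one per container, and finally one per image string, which is the "update the context at each pipe" habit spelled out.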
At some point in every declarative language’s life, so many features get bolted on to make it useful that it loses its declarative nature, at which point you might as well just use a more standard imperative language. In this case, just plain JS.
This. The number of times I’ve had to mess with or write yaml and gotten some ridiculously unhelpful error because I forgot a space somewhere or needed dashes in my sub-whatever is equal to the number of times I’ve had to write yaml. I’d seriously rather work with XML than YAML. It’s worse to read and 10x worse to write than JSON for anything nontrivial. It’s the TAI64 of serialization. It’s like JSON that you have to type the spaces out to format visually yourself for each thing. I don’t understand the mindset of folks who prefer it. Quotes in yaml are as much fun as they were with mysql_real_escape_string. The only way I work with any significant amount of YAML is by converting it to json. Terrible format, 0/10, let’s just write everything in Acme::Bleach since significant whitespace is so darn cool right?
I'm aware that yq has a lot more editing features, so it's not apples-to-apples, but I'll also draw one's attention to gojq's --yaml-input flag <https://github.com/itchyny/gojq/blob/v0.12.13/README.md?plai...> (they have yaml out, also, but I use that a lot less often) allowing leveraging the same language for both input formats
Carting your request over to chatgpt rather than learning the basics of jq doesn't make much sense.
I basically know enough jq to traverse a document, which took a few days of muscle memory. Totally worth it. One of the best tools I've used in the last decade and I'm barely scratching the surface.
not saying you shouldn't know `curl some.json | jq -r '.someField'`, but anything beyond that is overkill unless you do hardcore bash scripting all the time. My annual bash script quota is like a couple hundred lines at most so I don't really want to learn this stuff deeply.
I feel like learning something like jq is rendered almost obsolete with the onset of GPT. Spending time learning the syntax of a tool I will use once every so often seems like a waste of mental energy when I can just rely on GPT to spit out whatever I need on demand