Hey, author here.
Christmas Eve, 10 PM, after having eaten with my family, I was browsing HN (because what else would you be doing then?), and stumbled upon this comment: https://news.ycombinator.com/item?id=21860107
It inspired me to create an alternative to jq with more of a Lispy syntax, as I think the original is awesome but also fairly cryptic for anything more advanced than single field selection.
Overall it was a fun few-day project, and it was also very educational in terms of writing a parser in Go. (I had used goyacc before in the SQL parser of OctoSQL[1]; however, that one is copied from Vitess, so I'd never built one from scratch, only customised an existing one. It's really pleasant overall and the code is very simple, so I encourage you to take a look[2].)
I'd love to hear any feedback, comments or potential improvements you can think of.
To me, pipe feels far more intuitive than callbacks as a way to write complex queries. Using it right away might make the first few examples much easier to read.
Hi, awesome work! Love the idea of using S-expressions for the query syntax! Fun fact: we share the same project name - even though the core language and the query syntax are different - see https://github.com/yamafaktory/jql.
No worries regarding the naming; it's what makes open source friendly! You can see some benches here, e.g. https://travis-ci.org/yamafaktory/jql/jobs/618374730#fold-st... but those are actually more like regression tests, triggered on pull requests to compare them against master (and I should probably use GitHub Actions here now instead of Travis). Thanks for your reply!
Looks good, but there seems to be an explanation gap between `(elem "countries")` and dropping the `elem`s to just `("countries")`. Why can `elem` be dropped? Is it the default function or something? I didn't follow that big jump in the readme.
This feature can probably be thought of similarly to Clojure's map access via keywords: if you write `(:something map-var)`, you'll get the value of the field ‘:something’ in the ‘map-var’ variable. I.e. keywords can be ‘executed’, and such calls get translated to map access.
It's explained right below, I think I messed up the order there.
Is this understandable?
"You can see that elem is the most used function, and in fact it's what you'll usually be using when munging data, so there's a shortcut. If you put a value in function name position, it implicitly converts it to an elem."
I scratched a similar itch by creating Jowl[1], which uses JavaScript one-liners with the Lodash library to transform JSON.
Like you, I wanted syntax that was more familiar to me than jq's, and like you, I wanted to use a language that was built for data structure transformation.
However, I offloaded all of the actual hard work to an existing JavaScript runtime, and to an existing library for data structure transformation, meaning I could punt on the parser and language design; the hard parts that you've taken on. I have this bookmarked to read through in more detail later this week.
My approach came from wanting a minimal language which contains only what's really necessary and is as regular as possible, so that it stays very simple as a result.
Yup, I felt the irony myself creating it, as I actually love Clojure!
But I'm most comfortable with Go, and it produces a single binary for each of the major platforms with very quick startup time (as illustrated by the small benchmark), so those were huge advantages.
I love the idea of a Lispy version of jq, but as a Lisper, the examples feel bewildering to me. The fact that (elem) takes an optional second argument, whose default value is 'identity, seems familiar, but then I try to understand this example, and it just doesn't make sense to me:
(elem "countries" (elem (keys) (elem "name")))
I think part of the problem is, IIUC, this is constructing a pipeline, like (->) in Clojure, but is using function call syntax rather than pipeline syntax, and also mixing in a function call with (keys). I feel like it's a very non-Lispy language masquerading as Lisp by wearing parens.
Instead, if you wrote it as real Lisp function calls, and provided a real pipeline macro, it would be much more Lispy and, I think, much easier to understand than the nested-continuation form above.
Have you checked out the type cheatsheet?
(keys) is a function call!
The last function you pass to elem is a kind of continuation: a function which transforms the output before returning.
All you're doing when writing a jql query is composing one big function.
With values in function call position, it's just that (keys) gets evaluated to a value (the keys of the map), and that value is used to index the JSON. That's why a bare keys wouldn't work; you need the parentheses to make it a function call.
This may sound complex at first but I find it gets intuitive quickly.
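A minimal Python sketch of that continuation idea (illustrative only - the semantics here are assumed for the sake of the example and this is not jql's actual implementation):

```python
# Illustrative sketch of continuation-style composition: elem builds a
# function that indexes into the JSON and hands the result to a
# continuation. Assumed semantics, not jql's actual implementation.

def identity(value):
    return value

def elem(key, cont=identity):
    """Return a function that indexes by `key`, then applies the continuation."""
    return lambda data: cont(data[key])

doc = {"countries": [{"name": "Poland"}, {"name": "Germany"}]}

# Equivalent in spirit to (elem "countries" (elem 0 (elem "name"))):
query = elem("countries", elem(0, elem("name")))
print(query(doc))  # prints Poland
```

The whole query is just one composed function that you finally apply to the JSON document.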
I'm afraid that that does not help me understand how (elem) works at all. From my perspective, I don't need a "type cheatsheet," I need elem's docstring that explains what arguments it expects and what it returns. I'm not even thinking about value types yet, and I don't know what grammar that cheatsheet is written in, nor why I would need to read one to be able to use what seems like the most basic function jql has, the elem function.
To write clear, useful documentation, you need to put yourself in the shoes of a user who has never seen your project before, who has not walked the miles you have to arrive at the solutions you have, and take them quickly to the same destination.
The readme, in general, is another issue. It's...not ideal.
The first part tries to sound cute with stuff like, "Hey there!" and, "remember? That's explicitly not why we're here. But it aids understanding of the more complex examples, so stay with me just a little bit longer!".
But I'm not a little kid who's bored in math class, so writing like that turns me off quickly. Having to read through long prose like, "Ok, let's check it out now, but first things first, you have to install it:" instead of simply a heading titled "Install:" feels like I'm wasting my time, and I quickly skip ahead or move on to the next HN article.
What I'm really looking for is a short intro and a table of contents with sections like: Intro, Examples, Tutorial, Reference, FAQ.
You might think of it like this: the project is your baby, but just like in real life, no one thinks your baby's as cute as you do, and, no, we don't want to see the baby pictures. ;)
I actually don’t think like that. I even have another fairly successful open source project whose readme is probably more to your liking, and is nothing like this one: https://github.com/cube2222/octosql
Everybody has different taste. I like READMEs like that, as they add entertainment value. Overall I've got two kinds of feedback about the README:
1. I really like the project and really liked the readme, great fun to read!
2. The readme was very confusing on top of the already confusing query language.
And based on that I'm planning to keep it unchanged for now. I added the type cheatsheet to make case 2 at least a little bit better.
As for the quick examples, you can just scroll through the code blocks (as each one contains input with its corresponding output).
Since you're discarding feedback kind #2, and all of the potential users it represents, in favor of entertainment value, I'd say that you do think like that. ;) Comedians write lots of jokes, but when they stand up and tell them in front of an audience, they find out which ones work and which ones don't. How they respond to that feedback determines their success. Hey, it's your project.
I'm definitely not discarding it! I'm open to writing additional clarifying paragraphs. (That's why I wrote the type cheatsheet.) I'm also not discarding feedback kind #1 :)
I'm thinking of adding another one which shows how standard map/filter notation maps to jql queries.
Anyways, thanks for the feedback again, I do appreciate it!
Regarding CPS: I think that expecting users to write queries in CPS is going to be very limiting. It's missing a big opportunity to use simple funcall and/or pipeline syntax, with which many more users are familiar. Even among Lispers, CPS isn't that popular. But Lisp, in general, could make jq-like syntax much more regular and usable.
So I guess we all know https://jsonnet.org, right? It's advertised as a data templating language. Interestingly enough, you can also use it as a querying tool. The first four examples on the jql page can be translated to Jsonnet as follows:
$ jsonnet -e "(import 'test.json').countries[0]"
$ jsonnet -e "(import 'test.json').countries[0:2]"
$ jsonnet -e "[x.name for x in (import 'test.json').countries]"
$ jsonnet -e "std.objectFields((import 'test.json').countries[0])"
With that in mind, I never bothered to learn how to use a tool like jq, rq, jql, etc.
Shameless plug but I've also written a language (murex) for shell scripting which can be used for this too:
open test.json -> [ countries ]
open test.json -> [[ /countries/0 ]]
open test.json -> [ countries ] -> [ 0 2 ]
Single brackets only go one level deep but can retrieve multiple items (and accept negative integers to count from the end of an array). Double brackets specify a path. So [[/foo/bar]] is literally the same as [foo]->[bar]
Lastly the 4th example:
open test.json -> [ countries ] -> foreach c { echo $c[name] }
Unfortunately this one doesn't output it in JSON. If you needed that you could chain another command to reformat it:
open test.json -> [ countries ] -> foreach c { echo $c[name] } -> cast str -> format json
This works because murex can auto-convert between lists, JSON, YAML, CSV and a few other structured formats. However it's fair to say it is a lot more verbose than jsonnet and jql in that last example.
I've been using this as my primary shell for a few years now and, like yourself, I've never bothered to learn jq because of that.
NB all of the above examples are run from inside the murex interactive command line (like bash). If you wanted to run it like jq/jql, then you'd need to do the same sort of thing as with bash.
Yeah. It’s a pretty sweet tool. What’s also pretty awesome about it is that its interpreters (one C++ and one Go implementation) can also easily be embedded into custom applications.
For an Open Source project of mine (https://github.com/buildbarn, a remote build cluster implementation for the Bazel build system), we are using go-jsonnet as the config file parser library. This means that people can use JSON, but if they need something that’s more flexible, they can add Jsonnet statements immediately. It’s also possible to have that as a preprocessing stage (init scripts), but in a containerized world it’s easier to have it embedded.
One of the nice things about doing that is that I also don’t need to write code to load TLS certificate data from disk. People can either embed their secrets directly into the config file, or they can use Jsonnet’s ‘importstr’ keyword to load separate PEM files from disk.
Somewhat related to that: Jsonnet also allows people to write libraries for constructing more complex JSON files. For example, the Grafana folks have released a library (https://github.com/grafana/grafonnet-lib) that allows you to easily build Grafana dashboards from code.
Wow. No, we didn't all know jsonnet.org, and, TBH, it was always a problem for me to remember jq syntax every time I needed to do something. This really looks perfect, thanks!
I haven't used jsonnet before but is it possible to read via stdin?
for eg to parse a json response from a rest endpoint in jq I would do something like:
$ curl "..." | jq .abc
Looks like it is not ready for prime time yet...
(Not to mention that it seems to have no support for stdin input, multiple input files, or the JSON Lines format - which are my main use cases for jq)
Yeah -- I've ended up defaulting to jq just because of its ubiquity, but many of these other tools have similar attributes without the esoteric nature of the jq DSL. It'd be great if something more natural came along and actually started to displace it.
I've never felt it was an unnatural API, but it can be hard to grok/remember. I was primarily using jq either to extract data from large JSON files or to concatenate a bunch of smaller JSON files into one, large file.
I think it can get a little weird as far as error handling goes, but I feel it's been pretty approachable.
Nice. However, part of my problem with jq is that changing the data structure also gets awkward pretty quickly. So I hoped to hijack some small Lisp that's still likely to have a suite of functional tools - `map` and such. This would also rely on the fact that I can represent JSON's maps and arrays in most such Lisps. AFAICT jql doesn't have that.
So far I'm planning to just fire up Hy the next time I need anything in this vein. Since it's on Python, both JSON parsing/encoding and functional stuff should be there out of the box. Alternatively, I still can try using ClojureScript with Lumo. But I don't expect too fast startup with either of them, compared to, say, Fennel/Lua.
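For comparison, the same sort of munging in a host language is short too. A plain-Python sketch of the approach described above (Hy and ClojureScript would look similar, just with Lisp syntax):

```python
# Plain-Python version of the "use a general-purpose language" approach:
# the stdlib json module plus ordinary comprehensions cover
# map/filter-style queries out of the box.
import json

doc = json.loads('{"countries": [{"name": "Poland"}, {"name": "Germany"}]}')
names = [country["name"] for country in doc["countries"]]
print(json.dumps(names))  # prints ["Poland", "Germany"]
```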
BTW, there's a Lisp that translates to Go: https://github.com/jcla1/gisp. Though not too popular, and I've heard an opinion that it's toy-like in its capabilities.
Nice project. I'm a heavy jq user and still find myself regularly tinkering around with queries on https://jqplay.org. It would be nice if there were a similar page for jql, even if it is just a local server.
Can you find many Lisps, or even any Lisps, that evaluate like that? A familiar syntax that actually works differently from any Lisp in existence may be more of a hindrance than a help.
Working inside out, (elem 0) has to return a function, since it doesn't have enough information to do anything on its own. I'm not sure why 0 couldn't be a key, necessitating a different function to differentiate, but whatever.
(elem "countries") has to return JSON, and one can imagine using the JSON in the first position like a map in Clojure, taking a fn in the following position. This would mean that on net it would be exactly the same.
It really ends up being called like this: ((elem "countries" (elem 0)) JSON)
Yours is more akin to (elem 0 (elem "countries" json))
and you can see that the hierarchy of the query is now inverted in respect to the json. This is what I like about the continuation based approach, a big query will be readable because it fits the data very well.
This looks very readable and matches the structure of the json too.
To me it's basically the same readability wise with proper indentation.
Though with deeply nested data and a lot of map functions I think you'd have a lot of unnecessary (->> JSON ...). But those can probably be eliminated with another macro.
Also, there's the pipe function in jql which basically lets you do this.
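To illustrate the difference between the two orderings, here is a Python sketch; `elem` and `pipe` are illustrative stand-ins with assumed semantics, not jql's actual API:

```python
# Sketch contrasting nested (continuation) order with pipeline order.
# Both `elem` and `pipe` are hypothetical stand-ins, not jql's implementation.

def identity(value):
    return value

def elem(key, cont=identity):
    return lambda data: cont(data[key])

def pipe(*steps):
    """Compose steps left to right, matching the order of the data hierarchy."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run

doc = {"countries": [{"name": "Poland"}]}

nested = elem("countries", elem(0, elem("name")))        # reads outside-in
piped = pipe(elem("countries"), elem(0), elem("name"))   # reads top-down
print(nested(doc), piped(doc))  # prints Poland Poland
```

Both spellings end up composing the same function; they only differ in how the query reads relative to the shape of the data.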
I'm super excited to sink my teeth into this. I usually have to use jq in intense bursts, and no matter what, I end up back in the manual because it just gets so damn confusing.
From a quick look on the repo, this seems so much simpler. Thanks!!!
Documentation suggestion: I'm not familiar with fzf, so having the first example in the README use fzf but not subsequent examples is a little confusing. Maybe start with a couple of small examples and then use fzf towards the end?
In all seriousness, thank you for writing this. I've had trouble wrapping my small brain around some of jq's more advanced syntax, so being able to look at things from a different direction seems like a great idea to me.
It was very well laid out and understandable. I found myself wishing for a parenthesis-matching capability in my zsh though! I also liked the informal/joking conversational style of the tutorial/walkthrough. It helped to put me at ease while learning something new. And having it written in Go, and with Go's simple installation method, makes me way more likely to use it in the future :) Nice job!
I tend to struggle with this sort of Lispy syntax, as I use a regular shell (rather than one in Emacs), which makes balancing parens hard, and the syntax is otherwise verbose. If I'm using a query tool like this, it will be to interactively write a one-liner, so I want syntax that's easy to get right (balancing parens by hand is hard), easy in the usual case (e.g. I think it should accept some way to use symbols to name fields in a record, because typing and balancing double quotes all the time is a waste of time), and easy to append to (I will run a query, press up, move to the end of the query, and modify it; in this case I think moving to the end involves counting parens).
Perhaps I would want some implicit pipe or threading at the top level (which could be turned off with some option for scripts I guess).
Another helpful feature could be a magic closing paren, ], which closes all open parens up to the top level. E.g. these two lines are equivalent:
(foo (bar (baz) whizz))
(foo (bar (baz) whizz]
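A small sketch of how such an expansion could work as a preprocessing step (a hypothetical helper, not an existing jql feature; it naively ignores strings, so a ] inside a quoted field name would also be expanded):

```python
def expand_magic_bracket(query: str) -> str:
    """Replace each ']' with enough ')' to close all currently open parens.

    Hypothetical preprocessing sketch; jql does not implement this.
    """
    depth = 0
    out = []
    for ch in query:
        if ch == "(":
            depth += 1
            out.append(ch)
        elif ch == ")":
            depth -= 1
            out.append(ch)
        elif ch == "]":
            out.append(")" * depth)  # close everything still open
            depth = 0
        else:
            out.append(ch)
    return "".join(out)

print(expand_magic_bracket("(foo (bar (baz) whizz]"))  # prints (foo (bar (baz) whizz))
```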
A completely different (and less powerful) paradigm I like is one of match-format where the pattern looks a lot like what it’s trying to match. In this case most functions and fancy transformations are impossible. So you could write something like this to extract the list of countries:
j '{countries: %%}'
Or to extract each country’s name:
j '{countries: [ .*, { name: %% }, .*]}'
(Presumably you would have syntax for trying to match against each array element). (I also would make commas optional but I’m trying to make this syntax obvious).
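A toy Python sketch of the match-format idea for the object case only (the names, the HOLE placeholder standing in for %%, and the semantics are all assumptions; array patterns like [ .*, {...}, .* ] are not handled):

```python
# Hypothetical sketch of match-format querying: the pattern is shaped
# like the data, with a placeholder marking the value to extract.
# Everything here is invented for illustration; no such tool exists.

HOLE = object()  # stands in for the %% placeholder

def match(pattern, data):
    """Return the value at the HOLE position, or None if nothing matches."""
    if pattern is HOLE:
        return data
    if isinstance(pattern, dict) and isinstance(data, dict):
        for key, sub_pattern in pattern.items():
            if key not in data:
                return None
            found = match(sub_pattern, data[key])
            if found is not None:
                return found
    return None

doc = {"countries": [{"name": "Poland"}]}
print(match({"countries": HOLE}, doc))  # prints [{'name': 'Poland'}]
```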
An alternative query tool which I haven't really seen before would be one which is interactive rather than repeatedly run: you pipe your JSON (or whatever) into it and it spins up some kind of (curses) GUI showing you your data, with a nice interface to interactively write your query and see its results. It could finish by outputting a command line to run the query non-interactively.
The closest thing to this I know of is a small programming by example demo that came out of Microsoft research. The goal was to figure out how to extract some desired data from some big blob of json (or maybe html, I don’t remember). The user would see the json in some web gui and could click on the data they wanted and the app would try to figure out the “most obvious” query to get it. If they clicked on the country name, I think it would probably suggest a query for all the countries’ names (but maybe that would be the second option). I think it would also try to be clever about grouping, so if the user picked name and population it would hopefully figure out that they should come in pairs rather than being two independent lists that might not be the same length.
There's already an issue to automatically add trailing parens and I definitely think it's a good idea, as balancing parens really is annoying in-shell. I didn't think of it before.
I decided that adding interactivity would be too much work, but you could definitely put together a oneliner (or alias) which achieves that using fzf: https://paweldu.dev/posts/fzf-live-repl/
Overall I encourage you to try writing a tool that has the syntax you described, as it sounds plausible! It looks a little bit like GraphQL, so maybe that would fit?
I wonder if it’s worth writing some readline macros (and instructions for setting them up) for, e.g., inserting ( ) <move-cursor-backwards> (bound to M-( in Emacs); I think they help for simple sexp editing. I also wish I had something like “run this command, but when it’s done show me a prompt with the same command and my cursor in the same place”, but I have no idea if that’s easy.
[1]:https://github.com/cube2222/octosql
[2]:https://github.com/cube2222/jql/tree/master/jql/parser