Jq – A Command-line JSON processor (stedolan.github.io)
451 points by geekrax on April 27, 2015 | 91 comments



jq is awesome, a real achievement.

At the same time it demonstrates everything that is wrong with traditional shells and what PowerShell gets so right.

jq is not a "Unixy" tool in the sense that it should do one thing and do it right: jq implements its own expression language and command processor, complete with internal "pipelines". Why would a tool need to do that? find is another utility that does many, many things other than "find" items: it executes commands, deletes objects, etc.

Consider this challenge that included parsing json, filtering, projecting and csv-formatting output: https://news.ycombinator.com/item?id=9438109

Several solutions use jq - to good effect. But the PowerShell solution uses PowerShell expressions to filter, sort and project items.

The problem is - at the core - that traditional command-line tools are severely restricted by being able to rely only on a text pipeline convention. You cannot parse json and send the "objects" along to another tool. Well, you can, if the json tree is extremely basic - like 2 levels.

PowerShell also has a json parser tool: https://technet.microsoft.com/en-us/library/hh849898.aspx. It is called ConvertFrom-Json (follows the verb-noun convention) and is distributed as part of PowerShell (built-in if you want - but that's a misnomer for PowerShell modules).

It is extremely simple - it doesn't even take any parameters - it just converts a string (parses it as json) and outputs the object/objects which can be arbitrarily complex. It truly does only one thing. If you want to select specific properties, sort, group etc you do it with other PS tools like select (alias for Select-Object), sort (alias for Sort-Object), group (alias for Group-Object) etc. If you want to convert back to json you use ConvertTo-Json.
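
For instance, a minimal sketch of that composition (the JSON string here is just a made-up example):

    '{"users":[{"name":"ann","age":34},{"name":"bob","age":28}]}' |
        ConvertFrom-Json |                    # parse JSON text into objects - nothing else
        Select-Object -ExpandProperty users | # project out the nested array
        Sort-Object age |                     # sort on any property
        ConvertTo-Json                        # only at the end, serialize back to JSON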


I don't entirely agree. With traditional shells you can still use objects; you just have to serialize them to text. jq is perfect for this: you can use it to store a serialized object in a variable and then pull things out of it for use with other commands. Example:

    example="{}"
    example=$(echo "$example" | jq '.foobar = 10')
Later on you can pass example.foobar into another program like so:

    grep "$(echo "$example" | jq '.foobar')" ...
You can write simple helper functions that make this more concise. The beauty of it is that this works with any data format - plaintext, json, csv, etc - you just need a tool to serialize and deserialize it.


My understanding is that PowerShell is passing around specialized objects that have a few required methods, and some source specific ones. The beauty of this is that you can pass massive amounts of data without serializing and unserializing it at every step, and also that they can contain links to other information, such as links to parent objects which may or may not be part of the same list (I have no idea if this is allowed, but it seems a logical extension of the concept). Additionally, dates passed around can be complex types that don't have to lose relevant info such as time zone, regardless of what the default display format is.

The downside is that tools need to buy into this and work with this data stream rather than text, which makes the set of available tools necessarily much smaller.


PowerShell is a nice concept and a pretty nice shell, but it's also complex because of this.

Serializing to text just means "everyone must speak the same language" - and that brings its own kind of complexity.

And in fact jq does serialize. I'd argue jq is a "one tool, one job" tool and thus Unixy. It's the way the "one job" is seen that differs.

After using PowerShell and bash/etc. for years, I'm still unconvinced which one is the most powerful in the real world. Sometimes (most often) it's faster/easier to do things with bash; sometimes the convenience of PowerShell objects is nice (that's where I'd basically use Python on Linux because of bash limitations).

As long as I can parse, grep, etc. with jq, and pass it to another tool via a text stream output, I'm happy. As a human, I find it easy to deal with unstructured or very lightly structured text, and hard to dig through complex structures.


I understand this, I myself often trade off between bash and Perl. Sometimes I'm using a for loop on a shell expansion to tar directories, sometimes I'm piping an ls to xargs to parallelize a workload, but often I'm piping SQL output to Perl, selecting the records I care about and storing (and/or altering) the relevant fields in a data structure, and outputting the data in a formatted way (such as SQL update/insert statements) or reporting some aggregate information.

What I see in PowerShell, which to be truthful I never use (it's not in my common environment), is a less powerful/extensible version of Perl, but optimized for a specific workload, as there's less parsing and manipulation of data required if it's easily available in the formats you need already. Honestly, I still wouldn't use it much even if it were installed by default on all UNIX boxes, because Perl fills that need for me, but I recognize what it's trying to do on a conceptual level, and it's similar to what I end up doing every day, so I support that.


You have it backwards. Even though they are similar on a conceptual level to what you do every day, doing it in PowerShell requires buying development tools. And there hasn't been a need to support that monetarily for a while.


What development tools do you need to buy for Powershell? (unless you mean Windows itself).


Hmm, I do PowerShell stuff with "just" Windows and gvim.exe.


just windows


> My understanding is that PowerShell is passing around specialized objects that have a few required methods, and some source specific ones.

I don't believe that PS objects need to have any specific methods. Maybe only ToString() - but that is always auto implemented.

> The beauty of this is that you can pass massive amounts of data without serializing and unserializing it at every step, and also that they can contain links to other information, such as links to parent objects which may or may not be part of the same list (I have no idea if this is allowed, but it seems a logical extension of the concept).

Well, that's part of it. The fact that there is no need for serialization/deserialization between the commands has some obvious performance benefits (and a few drawbacks). But it is also what enables perhaps the most overlooked but most important characteristic of PowerShell: you can use it in-process to manipulate in-memory objects of the hosting process. This means that you can implement the management functions as PS cmdlets and still build an admin GUI on top of those cmdlets, where the GUI is a rich process whose in-memory objects are passed straight to the cmdlets. Think of how a task manager could be written to list the processes on the machine by invoking Get-Process. The objects returned represent the actual processes - complete with methods and properties. The GUI can hold on to those objects, and when you pick one and choose the "Stop" action (for instance), it can pass that object as input to Stop-Process (kill). This is actually what a number of product-specific GUIs from Microsoft do, e.g. Exchange.
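
For instance, a rough sketch of that idea (notepad is just an assumed example process name, and -WhatIf keeps it harmless):

    $procs = Get-Process notepad          # live System.Diagnostics.Process objects
    $procs[0].StartTime                   # rich, typed properties - not text columns
    $procs[0] | Stop-Process -WhatIf      # the very same object is handed to Stop-Process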

> The downside is that tools need to buy into this and work with this data stream rather than text, which makes the set of available tools necessarily much smaller.

Yes - it is a new model, and the commands need to support it to take full advantage. There is a "bridge" from the old world to the new in that the "old" command types are simply seen as commands that consume and produce sequences of strings. But yes - there still is some friction. The model works at its best when you can do all of your tasks using cmdlets.

At some level I suspect that PowerShell has been blessed with a rather poor CLI ecosystem on Windows preceding it. It didn't have to knock anything existing from the throne; rather it hit a vacuum and was almost immediately adopted. The same wouldn't have been the case on Linux/Unix.

There is an impressive number of cmdlets available now, however. On my machine "gcm|measure" tells me that there are 1484 commands available. Those are only the ones distributed by Microsoft, and they cover all sorts of administration: firewall, SQL Server, file transfer, disk management, low-level network management, print management, start screen, user management, etc. I don't think it is on par with what the Linux/Unix community has available now, but it is getting there, quickly.


While jarring when I first used PowerShell, once I realized what it was doing, I immediately understood it as a powerful concept. It allows shell pipeline processing to work in much the same way you can filter lists of hashes/dicts/arrays in more traditional scripting languages. I haven't used it much, and not at all in the last few years as I'm not often on Windows boxes, but I've always remained impressed.


If you want to see another approach, jshon is much more unixy. (disclaimer, I am the author)

homepage: http://kmkeen.com/jshon/

Jshon is also a year older. I've never been able to figure out why/how Jq got so much larger so much faster.


> I've never been able to figure out why/how Jq got so much larger so much faster.

I offer this in the sense of constructive criticism: for me, it's because your examples are scary. Here's the command you give for reading a URL of a photo uploaded to reddit (in file foo.json):

    $ jshon -e data -e children -e 0 -e data -e url < foo.json
Here's the jq equivalent:

    $ jq -r '.data.children[0].data.url' < foo.json
which looks an awful lot like every other tree parser I've used. Now suppose I want to get all the URLs with jq? That's just:

    $ jq -r '.data.children[].data.url' < foo.json
In Jshon, I have to rewrite the query as:

    $ jshon -e data -e children -a -e data -e url -u < foo.json
That is, replacing the "-e 0" (get item at index 0) with "-a" (do the rest for all items). The jq equivalent changes ".children[0]" to ".children[]", which to my poor muddled brain more clearly represents what I'm asking for. For me, jq's query language maps well onto how I think about the problem. Jshon's does not. That's why I use one and not the other.

Again, I offer this sincerely. Even if I don't personally use it, I love when cool ideas cross-pollinate from one project to another and I'm glad there's more than one player in the space. Thanks for your work!


I have only recently realized the effect a well-designed project website can have on a user's impression of your project.

I'm sure the code/design quality of jshon is great, but the website design/layout doesn't reflect this.

The current look of your website is that of a site from the late '90s, and may well suggest that the project is no longer maintained.


Hey, thanks for jshon. I've been using it a lot lately!


> This cmdlet is introduced in Windows PowerShell 3.0.

So... I can only guarantee it's included on my customers' computers if they have Windows 8 or higher. I'm happy most of them moved from XP to 7. Maybe I can start shipping PowerShell instead of batch files in 2020.


I don't think that's really the point. The point is that the unix command line has some serious shortcomings in its fundamental design. By extension he may be suggesting that maybe we should get off our asses and do something about it because windows has actually managed to ship something that's not only good, but better than any of the comparable options on *nix.


Agreed, GP's post was about the technical merits of PS. PowerShell may be the greatest thing since virtual desktops, but that doesn't help me until PS 3.0 is included in the OS by default. Since my customer base is decidedly conservative on OS upgrades, it's going to be a while until I can try out any of those great PS features.


I don't see how it's better. You can send objects over pipes if you want to (in, for example, json format).

If PowerShell wants to be the bloated Word-equivalent of shells that's certainly a design objective. I'm sure someone will eventually do a Mono or systemd of it, saving UNIX from the tragedy of not working like Windows. But it smells of Taligent.


Does it smell like people, places or things? ;)



But it is not installed by default, which means you'll have to tell your users to find and install it first.


Package a bootstrapping batch script that installs the MSI, maybe? (I don't know your clients' requirements, obviously.)


But Powershell builds on top of .NET, which means you can't write cmdlets in non-.NET languages. And writing them even in C# is quite painful. Which is why there is virtually no ecosystem for Powershell cmdlets and there is a gigantic, ever expanding, ecosystem of Unix shell tools.

At the end of the day you have to choose a primitive and history has shown that text is the right one.


PowerShell cmdlets can be written in PowerShell. PowerShell modules (the unit of distribution) can be written entirely in PowerShell. You can also write them in a .NET language, like C#, F# or VB.NET.

I disagree that it is painful to write cmdlets in C#. A cmdlet is just a class, and the parameters are just properties annotated with the [Parameter] attribute. The shell does all of the parameter parsing, matching and type conversion/coercion. If I wrote a traditional command-line tool I'd have to do that parsing myself. Yes, I could use a library function like getopt for that - but the code and complexity would still be there.
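
For the script-based route mentioned above, the same idea looks roughly like this (a sketch of an advanced function; Get-Greeting is just a toy name):

    function Get-Greeting {
        [CmdletBinding()]
        param(
            # the shell handles parsing, validation and type coercion of this parameter
            [Parameter(Mandatory, ValueFromPipeline)]
            [string]$Name
        )
        process { "Hello, $Name" }
    }

    'world', 'jq' | Get-Greeting

Same idea as the [Parameter] attribute on a C# property, without the compile step.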

But my point was not to get anyone to use PowerShell on Unix/Linux. PowerShell draws much of its strength on Windows from the fact that there are already well-established object models on that platform (COM, .NET and WMI/CIM).

The point was that the age-old text-only view in command shells is beginning to show some shortcomings. XML and JSON (among other formats) challenge the tabular view of the world. Sometimes you have to experience the alternative to realize that you have been living in a restricted world. Like swallowing the red pill. Who knows, maybe an object-oriented shell that mixes well with the Unix tradition will emerge.


PowerShell's "pipeline" is largely analogous to chained methods in a scripting language's REPL, and therefore is not the same as Unix pipes, which are used to share data streams between processes.

Unix already has the object functionality you see in PowerShell: Python, Ruby, the JVM, etc, and all of these have existed for far longer than PowerShell. The only thing PowerShell has is a bunch of shortcuts for system operations, which Unix already has in the form of POSIX shells and utilities, and which many languages provide language-native abstractions for.


> Who know, maybe an object-oriented shell that mixes well with the Unix tradition will emerge.

This doesn't make any sense though. Unix has a tradition of language diversity where Windows has a tradition of using whatever Microsoft says to use.

Unix has language-specific shells but like Powershell they haven't really caught on.


You can write them in PS itself, and it is quite simple and minimal.


Depends how you look at it.

jq is a "Unixy" tool in the sense that it should do one thing and do it right - and that 'one thing' is parse json data to output it in the desired format.

Awk, arguably one of the oldest unix tools, also has its own language. Would you consider it non-Unixy or is it grandfathered in due to age?


let's see:

jq does JSON parsing. And does it really well. Check.

jq does filtering.
jq does projection (. operator, .[] operator, select etc).
jq builds objects (the {} operator).
jq does arithmetic (addition, subtraction, division, multiplication).
jq does database-like "lookups" in collections (keys, has, length etc).
jq does pipelining of the in-tool "objects".
jq does generation of new objects, like range operators etc.
jq does data type conversions.
jq does sorting, grouping, extracts min/max etc.
jq does string interpolation and string formatting.
jq does equals and relational comparisons, tests and other boolean tests.

awk does many of these too.

bash has some boolean tests, but typically very filesystem or envvar centric.

In general, when using shells and tools:

Sorting should be handled by sort. Except when it is handled by jq or by ls or by find or by ps or ....

Filtering should be handled by grep. Except when it is handled by ls or by find or by ps or by jq or by awk or ...

Formatting is always handled by each and every command. Which is understandable given that the format of each line is so important.

Now consider PowerShell, where ConvertFrom-Json does one thing: converts text from JSON to objects. Where-Object (aliases ?, where) does filtering. On any type of object. Sort-Object (alias sort) does sorting. On any type of object. Select-Object does projections. On any type of object. Format-Table, Format-List and the other Format- cmdlets format objects into human-readable output.
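
Put together, a sketch of that (people.json is just an assumed example file holding an array of {name, age} objects):

    (Get-Content .\people.json -Raw | ConvertFrom-Json) |  # parse only
        Where-Object { $_.age -ge 18 } |                    # filter, on any kind of object
        Sort-Object name |                                  # sort, on any kind of object
        Select-Object name, age |                           # project
        Format-Table                                        # present for humans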

The text protocol does not scale well with structural complexity. It is really only good for lines of text. It has been somewhat enhanced to be able to delimit fields which then makes for an informal tabular-like protocol.

The object protocol scales with structural complexity. That's the difference.


It's "Unixy" in that it has a short cryptic name!

Some times "one thing" happens to be a big thing that you have to do 100% or not at all, and doing it well requires a hell of a lot of complexity.

Does an XSLT processor do "one thing": processing XML? Does a DSSSL processor do "one thing": processing SGML? (Hint: it has a Scheme interpreter built into it!)


You raise some interesting points.

As I said, jq is impressive. It is a JSON processor that implements its own functional language, complete with variables, function definitions, arithmetic and recursion. There is no question that it is useful and will be the absolutely right tool for a number of jobs.

Likewise with awk. I really, really like awk. It is such an elegant solution to a common problem (text parsing and productions). It is also a functional language.

XSLT is also a functional language. An XSLT processor is expected to implement a standard (as opposed to jq and awk) - but that's a minor difference.

It is when you are required to blend the disciplines that the limitations sneak up on you. In PowerShell it works the same regardless of whether I process data from JSON, from XML, from line-oriented text, from CSV, fixed columns or from database queries. Sorting, grouping, comparing, mapping, presenting, converting - it is the same skill set.
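
For example, the downstream cmdlets neither know nor care which format the objects came from (people.csv and people.json are just assumed example files with name/age fields):

    Import-Csv .\people.csv | Where-Object { [int]$_.age -ge 18 } | Sort-Object name
    (Get-Content .\people.json -Raw | ConvertFrom-Json) | Where-Object { $_.age -ge 18 } | Sort-Object name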

That is not to say that something like PowerShell can or should replace tools like awk, XSLT or jq. It should not. Period.

But for tasks where one has to use several formats, the lack of a sufficiently strong inter-command protocol causes the tools to bloat and include functionality that has nothing to do with the core task. Consider, for example, why jq needs a "@csv" function (called a "format string") - and @html, @sh, @base64!

My claim is that those functions have nothing to do with JSON processing.

The equivalent in PowerShell is handled by ConvertTo-Csv, ConvertTo-Html, Out-String or the [Convert] class.

It really comes down to an n*m problem. jq is just one tool that implements csv or html formatting (m formats). There are n tools (jq, awk, xslt, ...). Should they all use the same rationale as jq and implement csv formatting? Or should they stay away from that and just concentrate on the core task?

In PowerShell the equivalent of jq - the ConvertFrom-Json cmdlet - just converts from JSON. Regardless of whether the objects were converted from JSON, line-oriented text, database queries, fixed columns, XML - the output formatting is handled by tools that are optimized for that.
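
Concretely, a sketch of that split (the input JSON is again just a made-up example):

    $data = '[{"name":"ann","age":34},{"name":"bob","age":28}]' | ConvertFrom-Json
    $data | ConvertTo-Csv -NoTypeInformation   # csv formatting lives here, not in the JSON parser
    $data | ConvertTo-Html -Fragment           # html formatting lives here
    $data | Out-String                         # plain-text rendering lives here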

If I'm dissing anything it is the limited inter-command protocol of traditional shells. In this day and age we process complicated structures that are not easily expressed in line-oriented plain text.


Awk can be used as a unix pipeline tool, or it can be used more as a stand-alone that you make scripts/programs in to do larger tasks.

Similarly, Perl falls under the same category, being an awk/sed replacement. But considering Perl just a unix pipeline tool is also severely underselling it.


I'd like it better if it was more like csvkit, with a separate jqgrep, jqfmt, jqsort... that you combine using the shell. Because of course everybody already knows the shell.


Awk does one thing well, it runs Awk.


PowerShell is a .NET CLI and is more comparable to a JVM CLI (or the Python, Ruby, or Node REPLs) than it is to POSIX shells.

Unix pipelines are for streaming data between processes using standard input and output. PowerShell pipelines are for sharing data between .NET classes called cmdlets and are more like chaining methods within a programming language. There are multiple programming languages with this functionality available in the *nix ecosystems.

Working with the PowerShell "pipeline" is like working in a CLI/REPL for Python, Ruby, Node, not like working with Unix pipelines.


I'm wondering how PowerShell's object model compares to other scripting languages. What's a good introduction to PowerShell's object model?


type

    help about_objects
(or go to the online version at https://technet.microsoft.com/en-us/library/hh847810.aspx)

Follow the pointers to the other topics from there.


Well, JSON is a serialization format -- so I'm genuinely curious what you mean. Can you better describe the dichotomy you're talking about (text vs objects) and give a sample problem that demonstrates a weakness of text?


The http protocol is just text when you look at it. But I cannot just send "text" to a web browser or server. I have to follow a protocol. When I do so, I can anticipate how the receiver will understand the "text" - and plan for it.

The same way with shell tools - they input/output byte streams, but by convention we expect those to be (mostly) text. But there is no *protocol* there. One tool cannot build a text stream in anticipation of how the next tool will understand it.

Hence you can only standardize on a very basic level. In traditional shells that basic level is text lines and sometimes space or tab delimited fields of a line.

There is no common way to describe a tree of nodes or a graph of objects that you can expect all of the tools to understand.

Sure, you can use xml - that's just text. Or you can use json - that's also just text. At least they can describe compound values and trees. But other common tools simply do not understand those formats.

PowerShell outputs objects - i.e. the pipelines are delimited so that the elements are objects. Each object can be arbitrarily complex. Each tool understands that it may receive complex objects - but can carry out its task regardless. Example: select (projection akin to the map function), group and sort operate on objects. Even if you sort the objects - the output will be the same objects. If you select properties using the select cmdlet, the objects that are the properties are passed on - and they can be arbitrarily complex as well.
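
A small sketch of what "the output will be the same objects" means in practice (run in any directory that has a few files):

    $files = Get-ChildItem -File | Sort-Object Length
    $files[0].GetType().FullName              # still System.IO.FileInfo after sorting
    $picked = $files | Select-Object Name, LastWriteTime
    $picked[0].LastWriteTime.Year             # the selected property is still a real DateTime, not text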

To sum up: PowerShell has a higher-level protocol for pipelines and standardizes an object model (duck typing) so that tools can produce and consume trees, graphs, dates, customers, services etc.


PowerShell is cool, I wish it was more popular. I also wish that XProc was more popular too, but XML is a big no-no on the Web nowadays.


I've been using this for a couple months now. With httpie[1], it's become as essential to my work day as Grep.

[1] https://github.com/jakubroztocil/httpie


Wow, thanks for that link. I hadn't heard of httpie, but it looks pretty slick.


Isn't that the same as curl(1)?


Nope. HTTPie is much, much more focused than cURL (the CLI tool). It is exclusively an HTTP client designed primarily for interactive use. cURL, on the other hand, is meant to be used mostly non-interactively and is more of a general client for "URLs" (it also supports DICT, FILE, FTP, FTPS, GOPHER, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, TELNET and TFTP).

HTTPie's goal is user-friendliness, so it comes with features like built-in JSON support, output formatting, syntax highlighting, extensions, etc. Comparison of the two in one screenshot can be seen in the README:

https://github.com/jakubroztocil/httpie#readme

(Disclaimer: I created HTTPie)


I use and love httpie but I'm starting to regret that it's written in Python, as the Python platform is a very, very late adopter of modern TLS standards. Even today, in 2015, it's still a hurdle to use httpie with an SNI endpoint, and forget about making it talk to an ECDSA-based endpoint like those exposed by the free tier of Cloudflare (so-called "modern TLS"). I wonder how long it will take for HTTP/2 to land.

I tried a random rewrite of httpie in Go and it worked perfectly with those endpoints, as expected, and Go is getting full HTTP/2 with 1.5 in June. Too bad the rewrite was still feature-incomplete when I tried it.


The situation is not ideal but more recent versions of Python (2.7.9 and 3.4) have a much improved TLS support and things like SNI work out of the box.

For older Python versions there are some additional dependencies to be installed manually (will happen automatically in next version of HTTPie):

https://github.com/jakubroztocil/httpie#sni-server-name-indi...

As for HTTP/2, there is an experimental extension for now:

https://github.com/jakubroztocil/httpie-http2


I had the same feelings but gave it a try .. it works great, and you can construct http(s) requests more naturally, like..

    http POST example.org name=jhon pass=12345
which results in ..

    POST / HTTP/1.1
    Accept: application/json
    Accept-Encoding: gzip, deflate, compress
    Content-Length: 33
    Content-Type: application/json; charset=utf-8
    Host: example.org
    User-Agent: HTTPie/0.8.0

    {
        "name": "jhon", 
        "pass": "12345"
    }
..everything beautifully printed in nice colors

thanks for sharing!


Serves the same purpose, for sure. However, I think the interface is more intuitive/clean for JSON REST APIs.

I usually end up using HTTPie for manual work / ad-hoc checks. And then when I script it I tend to end up using curl, because curl is available everywhere and I never need to worry whether it's installed everywhere I'll want the script to run.


Except much nicer - especially if you work with json APIs


I found httpie last week when I started to think I should write a replacement for curl with a much better UI.


What is your work about?


I do DevOps for a medium-sized SaaS company (~500 employees), so I'm often working with scripts, and JQ is a lifesaver any time a bash script needs to grab data from a REST API. Before, it was grep/sed/etc. and the scripts would turn ugly quickly. Now you get an at-a-glance understanding of what the script is trying to grab from the endpoint.


Love, love jq... after a while, I realized I was twisting my brain too much to get bash to do what it wasn't meant to do, but I got a lot of mileage out of just playing with APIs because of how jq made it so easy. I came up with some fun lessons on how to use jq and other CLI tools to tour the Spotify API [1] and to mash up images from Instagram [2]

[1] http://www.compciv.org/recipes/data/touring-the-spotify-api/

[2] http://www.compciv.org/recipes/data/api-exploration-with-gma...

(note: while I say that I like the command-line, I'm not die-hard enough to have stopped to learn awk, so take that fwiw)


I'm using it for parsing AWS CLI responses; it's been a sweet tool for those JSON responses from AWS. Previously I'd fire up some Python script to parse the data, and now with jq I can keep most of the one-time scripts limited to bash only. If I need something more maintainable or longer-living, then I'll drop a script in Python or Ruby.


JmesPath is built into the AWS CLI.

You don't need separate JQ. See the --query option:

http://docs.aws.amazon.com/cli/latest/userguide/controlling-...

Check out http://jmespath.org for details.


Only saw your comment today, thanks so much for this, I really overlooked this part from the docs.


That's exactly what I did with Pouch (https://github.com/bramgg/pouch).


Lots of previous discussions about this on HN (for instance [0]) but still a sweet tool.

[0] https://news.ycombinator.com/item?id=5734683


jq is a truly amazing Swiss Army knife for JSON viewing/processing on the command line.

Btw, I've seen people posting this github link at least 10 times in the past. I thought HN filters dupes, what gives?

https://hn.algolia.com/?query=http:%2F%2Fstedolan.github.io%...


I think it filters dupes within a certain time frame. You don't want important front-page news at a site being considered a dupe because something was big on the front page a year ago.


Gotcha. In this particular case though, it does feel like every few months someone will remind us there is this great tool some of us have been already using for 2 years now:)


FWIW, I hadn't heard of it before.

(actually that's a lie, I saw it mentioned in the CSV challenge thread. But that was just a few days ago, so the point still stands)


For a similar tool written in javascript, check out json (https://github.com/trentm/json). I use both.


Love JQ, been using it for the past year. I'm very much looking forward to the regex support in 1.5.


I love that it is written in C and not Go, Rust, or Java.


I know it is right on the linked page, but it is useful enough I think it deserves a direct link in any conversation:

https://jqplay.org/


Compare: rl_json -- Extends Tcl with a json value type and a command to manipulate json values directly. The commands accept a path specification that names some subset of the supplied json entity. The paths used by [json] allow indexing into JSON arrays by the integer key.

https://github.com/RubyLane/rl_json


For those on OS X, check out JSON Query: https://itunes.apple.com/us/app/json-query/id953006734?mt=12

It's not nearly as powerful as jq, but for simple querying and downloading JSON (with POST body and HTTP headers), I find it quite useful.

Disclaimer: I built it.


"brew install jq"


I recently had to consume an XML service; it would've been great to have something like jq for XML. For some reason I wasn't really interested in XML -> JSON conversion as a first step. I also found a bunch of XML CLI tools, but they are older and require learning things like XQuery (which actually few tools today seem to support).


I have found `xmlstarlet` to be really useful for working with XML. You can achieve a lot with some basic XPath expressions. No need to learn XSLT. (I should actually update my book Data Science at the Command Line to include this tool.)


For me, that "some reason" is normally the overlap between children and parameters:

    <foo bar="123" baz="456"><qux>789</qux></foo>
There's not a great round-trip mapping of that to JSON, like:

    {"foo": {"bar": 123, "baz": "456", "children": {"qux": 789}}}
(which also illustrates why I dislike XML being untyped unless you have a schema file to interpret it with).


Well there's XProc. But as with everything XML - you shouldn't really edit it by hand without proper editor support at the very least (preferably DTD/XML Schema aware).


I understand XML has namespaces, but I'm not sure what "edit by hand" means in this context. The tools I found previously were xqilla and xmlstarlet, but they seem pretty old and just not as convenient as jq.

I still think there is space for an XML CLI tool that is more convenient for the majority of common tasks working with XML - one that simply treats the data as a hierarchical structure, without requiring an interface that is fully qualified / needs XSLT / etc.


Nice. I made jipe[0] a while back to handle common troubleshooting tasks with streams of json objects. I still use it quite a bit. I didn't see anything about streaming in the jq docs.

[0] https://github.com/dokipen/jipe


It does streaming. You can pass a stream of JSON objects through standard input and jq will process them as a stream.

The master branch also has experimental support for JSON sequences.


This appears to be in FreeBSD ports already, so just

  pkg install jq
and you can have it :)


We are using jq to parse JSON output from various APIs in shell scripts. It has served us well even for large JSON strings. If you want to write small scripts and parse some JSON in them, then jq is the best fit.


I love it so much.

I've used it for things like migrating between JSON schemas, and even converting normal JSON into Elasticsearch bulk imports on the fly.


How weird - I just started using this for some simple scripting against a REST interface. It's a great tool.


I love JQ. I use it along with Ansible's AWS dynamic inventory script to manage my instances.


No mention of jsawk in here? We use it quite a bit for shell scripts, works great.


I appreciate jq and use it almost every day. At the same time, I recently had a task that was easier to do in Node.js. I had a file containing a JSON array with a structure like so: [ { }, { }, ... ] and I wanted to turn each object into a separate file.


That task is trivially accomplishable in jq. I have no clue why you say it's "easier" in nodejs.

    i=1
    for line in $(echo '[{"a":"b"},{"c":"d"},{"e":"f"}]' | jq -c '.[]'); do
      echo "$line" | jq '.' > "jq_out_$i.json"
      i=$((i+1))
    done
The above prints each element of the array to a file named by its index, including pretty-printing it.

If you use it every day, the above should have come to you easily enough.


Looks interesting. Given this json, [{"name":"john", "grades":[1,2,3]}, {"name":"jane", "grades":[4,5,6]}], can it print it out as this: john,1 john,2 john,3 jane,4 jane,5 jane,6


Yes, kind of like so:

echo '[{"name":"john", "grades":[1,2,3]}, {"name":"jane", "grades":[4,5,6]}]'| jq '.[]| "\(.name), \(.grades|.[])"'


Thanks! Now, it looks more interesting...


Based on @aquassaut's answer, I tweaked it a little, and this should do it.

echo '[{"name":"john", "grades":[1,2,3]}, {"name":"jane", "grades":[4,5,6]}]' | jq -r '.[] | "\(.name),\(.grades|.[])"' | paste -sd' ' -


One of the most useful command-line tools in my toolbox these days - thank you to all who have contributed to this masterpiece.


I'm using it in bash scripts - it's a very good tool.



