Select, put and delete data from JSON, TOML, YAML, XML and CSV files (github.com/tomwright)
200 points by wyuenho on March 7, 2022 | 31 comments


Reminds me of https://github.com/johnkerl/miller which is also a Go-based tool cloning features from tools like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON


I'm a big fan of miller (mlr) -- it's the tool I landed on when I needed to "graduate" from awk to look at CSV data. But when I read "Go-based" in your comment, I thought "nope, it's written in C". But no! It was ported to Go -- very interesting!

The developer wrote a comprehensive document explaining the rationale behind the porting that answered all my questions and a lot more: https://github.com/johnkerl/miller/blob/main/README-go-port.....

Thought other miller/mlr fans (that don't follow its development) might find this interesting as well.

(The dasel tool looks very cool, too -- looks like a good complement to mlr and similar tools!)


It is nice, but I wonder if it's also a full-fledged query compiler?

Also IMHO modules with user-defined (higher order?) functions are a big plus of a query language.


I think I prefer yq's[1] syntax for manipulating and updating fields.

Combined with yj[2] you can also get TOML and HCL support, but not CSV.

It's certainly not because there's a lack of these tools :p

[1]: https://github.com/mikefarah/yq

[2]: https://github.com/sclevine/yj


I remember looking through XPath and wondering why no one had converted as much of it as possible to a generic path language for any tree-based data format, which JSON, TOML, YAML, and XML are.

XPath was the best of the XML standards. Well, it helps that the language itself wasn't XML, unlike XSLT and others.
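A small taste of that already exists in Python's standard library: xml.etree.ElementTree implements a limited XPath subset (descendant paths, attribute predicates). A minimal sketch, with a made-up document:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, just for illustration.
doc = """
<config>
  <servers>
    <server name="web1"><port>80</port></server>
    <server name="web2"><port>8080</port></server>
  </servers>
</config>
"""

root = ET.fromstring(doc)

# ".//server" is XPath-style descendant search;
# "[@name='web2']" is an attribute predicate.
ports = [s.findtext("port") for s in root.findall(".//server")]
web2 = root.find(".//server[@name='web2']")

print(ports)                  # ['80', '8080']
print(web2.findtext("port"))  # 8080
```

That path-style addressing of a tree is essentially what tools like dasel and yq generalize to JSON/YAML/TOML.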


Regarding XQuery we just added JSON querying on top in Brackit[1] / SirixDB[2].

Brackit is a retargetable query compiler that performs a lot of optimizations at compile time, for instance optimizing joins and aggregations. It is usable as an in-memory processor or as the query processor of a database system.

The Ph.D. thesis of Sebastian:

Separating Key Concerns in Query Processing - Set Orientation, Physical Data Independence, and Parallelism

http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/publicatio...

[1] http://brackit.io

[2] https://sirix.io


XPath was a great system. However, I feel like most of its features required an XSLT implementation to support them.


Can this or anything else flatten arrays of JSON objects to CSV?


I tend to use Python (usually in a Jupyter notebook) and pandas, with lots of experimentation with pandas' json_normalize() function. https://towardsdatascience.com/all-pandas-json-normalize-you...

Then just call to_csv() on the resulting dataframe. (Edited to add the note on exporting to CSV, as per the original question.)
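For the flat case (an array of non-nested objects), the standard library alone is also enough; a minimal sketch with invented sample data:

```python
import csv
import io
import json

# Hypothetical input: a flat JSON array of objects.
records = json.loads('[{"name": "Tom", "age": 27}, {"name": "Jim", "age": 41}]')

# Use the first record's keys as the CSV header.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)

print(buf.getvalue())
# name,age
# Tom,27
# Jim,41
```

json_normalize earns its keep once the objects are nested; for flat records the two approaches produce the same table.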


I also use Jupyter for this, but my go-to is tablib. For anyone who hasn't used it: it's super easy to switch between tabular data formats. You create an instance of their Dataset class, assign your data to the appropriate property, and all of the other properties are your data in the respective format.

for instance:

  from tablib import Dataset

  json_array_of_objects = '[{"header": "data1"}, {"header": "data2"}]'
  ds = Dataset()
  ds.json = json_array_of_objects
  
  ds.csv # data formatted as a csv
  ds.xlsx # excel, only useful on a binary read or write
  ds.dict # list of dictionaries
  ds.json # list of dictionaries converted to json
  ds.jira # table formatted for Jira's markup
  ds.html # html table
  # and more
They used to vendor dependencies, so everything worked out of the box, but now some features need to be installed separately (or via pip install tablib[all]), which is kind of annoying. I suspect they started doing this when they added support for pandas dataframes, because they didn't want to vendor all of pandas or force it to install as a requirement.


In addition to the already mentioned jq, there's https://github.com/jehiah/json2csv


Try this: https://flatterer.opendata.coop/

There is no binary yet, but there is a Python CLI and library, even though it is written in Rust.

It is the only tool that I know of that deals with nested JSON and converts it into relational tables.

Here is a notebook of the python library usage.

https://deepnote.com/@david-raznick/Flatterer-Demo-FWeGccp_Q...
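To illustrate what "nested JSON to relational tables" means (this is just a sketch of the idea with invented data, not flatterer's actual output): each nested array becomes its own table, with a foreign key back to the parent row.

```python
import json

# Hypothetical nested input.
doc = json.loads("""
[
  {"id": 1, "name": "alice", "pets": [{"kind": "cat"}, {"kind": "dog"}]},
  {"id": 2, "name": "bob", "pets": [{"kind": "fish"}]}
]
""")

people, pets = [], []
for person in doc:
    # Split each child array off into its own table,
    # keyed back to the parent by its id.
    for pet in person.pop("pets", []):
        pets.append({"person_id": person["id"], **pet})
    people.append(person)

print(people)  # [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]
print(pets)    # [{'person_id': 1, 'kind': 'cat'}, {'person_id': 1, 'kind': 'dog'}, {'person_id': 2, 'kind': 'fish'}]
```

Both lists are now flat and can be written out as two CSVs (or loaded into two SQL tables) directly.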



I mostly go from CSV to CSV, but nushell handles tabular data in files really well [1]. I've done a little bit of JSON manipulation with it, but not a ton, so I can't say it's a silver bullet.

1. https://www.nushell.sh/book/


Here's a Python library with a CLI:

Pypi: https://pypi.org/project/flattentool/

Docs: https://flatten-tool.readthedocs.io/en/latest/

It's maintained by Open Data Services Coop, where we use it as a component in several of our web & data pipeline tools for working with data that is published in a Data Standard.


https://github.com/stedolan/jq can manipulate arrays of objects and format the result as CSV


Of course you can do it with http://brackit.io.

What exactly is the "transformation" you envision?


Something like this (very simplified):

  let $array := [{"foo":0,"bar":"tztz"},{"foo":"hello","bar":null},{"foo":true,"bar":"yes"}]
  let $value := for $object in $array
                return
                  let $fields := bit:fields($object)
                  let $len := bit:len($fields)
  
                  for $field at $pos in $fields
                  return if ($pos < $len) then (
                    $object=>$field || ","
                  ) else (
                    $object=>$field || "\n"
                  )
  return string-join($value,"")
will output:

  0,tztz
  hello,null
  true,yes


You could do something like this in pure Python, without the JSON loading boilerplate, with jello[0]. An interactive TUI for jello, called jellex[1], is also available. (I am the author.)

[0] https://github.com/kellyjonbrazil/jello

[1] https://github.com/kellyjonbrazil/jellex


JSON loading boilerplate? Is it also a compiler with sophisticated optimizations during compile time? :)


Of course, you can simply create a module with something like the above snippet as a function for reuse. Or I could add a built-in function for flattening like this, or as in pandas.


If you use JavaScript: Papa Parse. Or, as others have pointed out, it's straightforward to do in Python, bash, Perl, etc…


Excel can do this.


jq


The "delete" example on the readme is a bit odd, since it seems to suggest that you need to specify the value to delete as well as the path. The equivalent in the documentation does not have this problem.


Looks to be a mistake in the readme. The example in the docs doesn't pass a value in.

> echo '{"name": "Tom"}' | dasel -p json '.name'

https://daseldocs.tomwright.me/usage/delete


Can confirm this was a mistake I made when mass-compiling the docs


That's a fascinating concept. Is there anything similar for INI? Such a tool could help automate changes to configuration files for software that is most often configured via INI files.


I haven't used it, but augeas seems to have support for INI files [1].

1: https://augeas.net/docs/references/1.4.0/lenses/files/inifil...


Not a generic tool as such, but Winter CMS has an INI parser / renderer built into the core: https://wintercms.com/docs/services/parser#ini-parser


Hey, author of dasel here. I have an issue open to add INI support - just haven't gotten round to it yet. Life has been busy!



