It's interesting to see how they introduce a new binary format in their catalogue. I expected to find a domain-specific language for defining the grammar of binary bitstreams, maybe as a context-free grammar. Instead, they built a nice library of routines that helps them hand-write custom parsers for each new format.
Hi, fq actually does support JSON, so other similar text formats could work the same way. But it's currently implemented in a hacky way: it's just a big blob that happens to work as normal JSON. I've made some attempts at implementing it as a normal fq decoder, but it's hard to figure out a way to represent the whitespace between values etc.; it ends up very clunky or not very user friendly. Any suggestions are very welcome.
It’s interesting how people work. Seeing that section of the readme with a laundry list of alternatives made me want to try fq even more. It tells you that the author actually cares about the problem space.
Hi! Yes, I'm very interested in binary analysis and decoders in general, and fq was not built with the intention to compete with or replace anything. I usually use fq together with lots of other tools; they all serve different purposes. The more the merrier!
Listing alternatives in the README really should be standard practice for open source projects. OTOH, some maintainers don't like to do that when they haven't evaluated the projects. Perhaps they could still add them with a disclaimer, though.
Can't blame him. ASN.1 is one of the most complicated binary formats, with so many encoding rules that there's no free decoder that can process them all.
Shameless plug, but you may be interested in my library (which is MIT/Apache-2.0) that offers decoding of BER/DER/CER, all from a single model in code. There's no UPER/APER support at the moment, but it's coming in the next few months. :)
There's a reason why all the cool companies invented their own serialization formats (Google's Protobuf, Facebook's Thrift, etc.) even though ASN.1 had been an international standard for years: it's too complicated.
A big part of the reason is a combination of NIH and a bad reputation, mostly related to X.509 and such, rather than anything else. It's hard to advocate for it when the main library you can point to is OpenSSL, and the most commonly known encoding is DER (which has real implementation complexity, being effectively sorted BER, a property that has important value in cryptography).
Both Protobuf and Thrift evolved from RPC systems that possibly started out too simple to need ASN.1, combined with the above issue that good tools were probably commercial and expensive. (FWIW, my experience also suggests that Thrift is a shitty RPC system compared even to Sun/ONC RPC, but maybe things have changed.)
Hi, here is an issue related to this where I explain a bit of what would be required and how protobuf support currently works: https://github.com/wader/fq/issues/20
That’s what I thought until recently, but it turns out that PEM refers to just base64 wrapped with -----BEGIN----- and -----END----- lines, and the encapsulated data does not have to be DER.
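A quick way to see this: the PEM envelope is just base64 between header and footer lines, independent of what the bytes inside are. A minimal Python sketch (the helper names and label are mine, not from any particular library):

```python
import base64
import textwrap

def pem_wrap(label: str, data: bytes) -> str:
    """Wrap arbitrary bytes (not necessarily DER) in a PEM envelope."""
    b64 = base64.b64encode(data).decode()
    body = "\n".join(textwrap.wrap(b64, 64))
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"

def pem_unwrap(pem: str) -> bytes:
    """Recover the encapsulated bytes, whatever they happen to be."""
    lines = [l for l in pem.splitlines() if l and not l.startswith("-----")]
    return base64.b64decode("".join(lines))
```

Round-tripping any byte string through `pem_wrap`/`pem_unwrap` works fine even when the payload is not DER.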
I’d like to see support for FIT files as emitted by Garmin fitness devices. It’s a clever binary format that defines the format of its records in-stream; the records then contain the actual measurements, which may be scaled for a more compact representation. These multiple layers make the format not obvious to parse, but the tool already supports an impressive list of formats that probably use similar techniques.
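As a rough illustration of that "definition records describe later data records" idea, here is a toy Python sketch (this is NOT the real FIT wire format, just the general shape of a self-describing stream):

```python
def parse_stream(records):
    """Toy self-describing stream: a "definition" record declares field
    names and a scale factor; following "data" records are decoded
    using the most recent definition."""
    definition = None
    out = []
    for kind, payload in records:
        if kind == "definition":
            definition = payload  # e.g. {"fields": ["speed"], "scale": 10}
        else:
            scale = definition["scale"]
            out.append({name: raw / scale
                        for name, raw in zip(definition["fields"], payload)})
    return out
```

The parser can't know how to read a data record until it has seen the definition that precedes it, which is exactly what makes such formats awkward for one-pass grammar-style descriptions.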
I am alternating between WOW and wtf. Pretty cool stuff.
I just wonder how on earth you intend to support all the binary formats out there. I mean, jq supports JSON, not all structured text data like JSON, XML, CSV, INI, ...
Honestly, it all makes sense: the plugin system and open source nature make it really easy to write a definition for the file format you want to work on, which will not just leverage the whole ecosystem, but benefit everyone.
This is one of those seriously great ideas where I'm thinking: how did no one come up with this before?
Hi, I can give some background on how I ended up with Go instead of using something more declarative. Maybe 1.5 years ago I started to prototype different approaches for what query language to use (SQL, JSONPath, my own basic jq version and a few more) and what language to implement decoders in (Lisp, Kaitai, Tcl, "scripted" Go, normal Go and some more).
What I found was that for my use cases, detailed parsing of big media files, anything scripted was just too slow. I did look into translating Kaitai etc. into something compiled, which would probably be fast, but the next thing on my list was that I wanted to be able to select and decode subformats in quite complicated ways (like MP4 samples), with flexible ways of demuxing and joining blobs to decode, calculating checksums, counting samples in various ways. It all felt clunky or hard to fit into a purely declarative description.
But I was also biased towards Go, as I had good experience using it and knew that it would probably be fast enough (it turns out smart memory usage is probably the main speed factor for fq when you keep track of lots of things). It also provides good tooling like IDE support and refactoring (gopls, gofmt -r, rf), and it's a reasonably strongly typed language, I think. Last but not least, the quick build times really fit my way of working; I usually use lots of watchexec etc.
For the query language I didn't prototype much; I knew I really wanted jq, as I had already used it extensively and knew it was very powerful and had a terse syntax for working with structured data. I had some ideas of maybe using the C version of jq via bindings, or somehow letting fq be a tool that you use like 'fq file | jq ... | fq', but it just felt strange and not very user friendly. Then I found gojq and I just felt that I had to make it work somehow, even if it would require lots of hard work and changes to it (see https://github.com/wader/gojq/commits/fq; the JQValue change is probably the most interesting, and support for custom iterators/functions has been merged). And it turned out much better than I would have expected, in large part because gojq's code is very nice and its author has been very helpful.
There are more things I would like to talk about, but I think this is long enough for now :)
But all that said, I think you could use Kaitai or something similar together with fq's decode API if you want. I also have some ideas and plans for supporting writing decoders in jq; hopefully I'll get some time for that next year.
I wish this were only the binary front end so I could pick my parser (e.g. PowerShell). I see fq seems to support sending the whole JSON to stdout; I wonder if there's a way to make this the default behavior.
I wrote a small script to convert CSVs to JSON strictly to use jq on the output. Querying things like your GCP bill with jq is quite enjoyable.
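For reference, such a CSV-to-JSON shim is only a few lines of Python with the standard library (the function name is mine; the example rows are made up):

```python
import csv
import io
import json

def csv_to_json(text: str) -> str:
    """Convert CSV text (with a header row) into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows)

print(csv_to_json("name,cost\nbucket-a,1.50\nbucket-b,2.25\n"))
```

From there, piping the output into jq gives you all of jq's filtering and aggregation on what started as CSV.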
gojq is also nice. I work with a lot of structured logs and wrapped jq with a little bit of format-understanding and output sugar to make looking at and analyzing such logs an enjoyable experience: https://github.com/jrockway/json-logs
Does this support out-of-tree format decoders? From an initial glance it looks like all decoders are in-tree and written in golang. We have a lot of internal binary formats at $WORK that I would like to use this on...
Hi! Yes, it's kind of supported, but in a very Go-ish way at the moment. You can use fq as a submodule, import/register your own format decoders and then run cli.Main. More or less what https://github.com/wader/fq/blob/master/fq.go does.
I have a private version of fq for work with some proprietary formats that does this, and it works great. One issue is that the decoder and format API might change; I'm not sure I can give any stability guarantees atm, and I want to evolve it a bit more. Also, it would be great to be able to hook into existing formats more in some way.
In the future I hope to support writing decoders in jq and/or some declarative format like Kaitai.
Seems unlikely, since the decoders are defined not just in-tree but in "host" code; the definitions are neither data-driven nor a DSL.
So it would require some sort of native (Go) plugins system, which I understand is about as bad as in Rust owing to there being no standard ABI (or plugins system for that matter).
Therefore the way to have bespoke/internal formats would be to maintain an internal fork of the tool.
:) I didn't choose it to be provocative or anything; apologies if it comes across that way. I've always pronounced jq "yay-queue", so fq is "eff-queue" to me. Also, f and q can be typed with one hand on QWERTY, which is nice and quick.
If it were 'fk' sure, but the Q on the end makes me think of all the English words that come from French and end in 'ique', like technique. 'fq' looks like 'feek' to me.
This is quite an interesting project! Combining Kaitai structs or similar with the command line.
However, I am a little disappointed that the jq syntax was chosen. jq has a very unintuitive syntax; there are more intuitive query syntaxes out there (LINQ or even basic SQL come to mind).
Yes, I can empathize with finding jq hard to understand; it's quite different and took a while to grasp. The reason I chose it anyway was that after prototyping some common types of queries I would like to do (basic value access in deep structures, multiple recursive traversals with filtering, transforming objects and arrays) in various languages, jq was more or less the only one that felt terse enough.
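To give a feel for the terseness gap: recursive traversal with filtering is roughly `.. | select(.type? == "frame")` in jq, while a general-purpose language needs an explicit walker. A small Python sketch of the equivalent (the sample document is made up):

```python
def walk(value):
    """Yield every value in a nested JSON-like structure, like jq's `..`."""
    yield value
    if isinstance(value, dict):
        for v in value.values():
            yield from walk(v)
    elif isinstance(value, list):
        for v in value:
            yield from walk(v)

doc = {"a": [{"type": "frame", "n": 1}, {"type": "header"}],
       "b": {"type": "frame", "n": 2}}

# Equivalent of: .. | select(.type? == "frame")
frames = [v for v in walk(doc)
          if isinstance(v, dict) and v.get("type") == "frame"]
```

The jq version is one short pipeline; the explicit version needs a helper plus type checks, which is roughly the trade-off being described.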
Also, I think it's quite nice that you can output to JSON and then load it into whatever language or environment you want.
Maybe there are some alternatives i should look at?
This looks incredible. I'm on my phone so I haven't tried this, but it looks like this supports slicing into MP3 bitstreams? That would have saved me a month of research and tons of development back in 2013.
Hi, it depends a bit: if the MP3 stream uses the bit reservoir it might be tricky to do a "pure" remux with any tool. fq's mp3_frame decoder does try to know which bits are part of the current frame and which are part of a future frame, but I'm not sure how much that helps. If the stream does not use the reservoir you should be able to slice using fq '.frames[100:200][]' file.mp3 > sliced.mp3 or something similar.
Interesting project. Unfortunate that its name conflicts with one of nq’s executables (https://github.com/leahneukirchen/nq), but I’m not sure anything can be done about it.
IMO projects that have only one, non-prefixed executable take precedence over ones that ship several, even when the project with multiple non-prefixed executables is older.
Hi, currently the protobuf support can either decode the wire format, or in some cases a format decoder uses protobuf as a subformat and passes it a "schema" so it can do some fancier decoding. But yes, it would be interesting to add support for reading protobuf schemas somehow.
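Schema-less wire-format decoding is possible because each field key carries a wire type alongside the field number. A minimal Python sketch for the two most common wire types (this is an illustration of the encoding, not fq's actual implementation):

```python
def read_varint(buf: bytes, i: int):
    """Decode a protobuf base-128 varint starting at offset i."""
    shift = result = 0
    while True:
        b = buf[i]
        i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def read_fields(buf: bytes):
    """Yield (field_number, wire_type, value) without any schema."""
    i = 0
    while i < len(buf):
        key, i = read_varint(buf, i)
        field, wire = key >> 3, key & 7
        if wire == 0:    # varint
            val, i = read_varint(buf, i)
        elif wire == 2:  # length-delimited (bytes/string/submessage)
            n, i = read_varint(buf, i)
            val, i = buf[i:i + n], i + n
        else:
            raise ValueError(f"wire type {wire} not handled in this sketch")
        yield field, wire, val
```

Without a schema you only learn "field 1 is the varint 150", not that it means e.g. a sample count; that's exactly the extra information a passed-in schema provides.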
Interestingly, fq works by being kind of a superset of JSON/jq. It has types that can behave as jq values when needed, but via special functions or key accessors can be something else.
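One way to picture that "behaves as JSON but carries more" idea, as a toy Python sketch (not fq's actual implementation; the metadata fields are hypothetical): a dict subclass that works anywhere a plain JSON object works but also exposes decode metadata:

```python
import json

class DecodeValue(dict):
    """Toy stand-in: acts like a plain JSON object, but also carries
    the byte range it was (hypothetically) decoded from."""
    def __init__(self, data, start, stop):
        super().__init__(data)
        self.start = start  # extra, non-JSON accessors
        self.stop = stop

v = DecodeValue({"type": "header", "size": 4}, start=0, stop=4)
print(json.dumps(v))  # serializes like an ordinary object
```

Generic JSON/jq-style code sees an ordinary object, while format-aware code can still ask where in the file the value came from.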