That's what our first stage does, pretty much. I would imagine we do it way faster than re2c would do it.
Parsing the entire document lock stock and barrel is an easier thing to write about and benchmark. The problem is with skipping around and pulling out bits of JSON from a benchmarking framework is that attempting to present such data often amounts to "hey, we asked ourselves a question and then we got a really good answer for it!". It's hard to picture what a 'typical' query for some field over a JSON document would look like. Conversely, it's pretty easy to know when you finished parsing the Whole Thing.
> It's hard to picture what a 'typical' query for some field over a JSON document would look like.
Exactly. A "query" would have to define not only the path, type of the field in the source data but also the type/interface of where you want to put that data. Combining dynamic queries and typed data you get a fairly tricky problem, which is why I said this is tricky. I worked on a similar thing for protobuf and jitting was a solution I looked into (in that project libllvm was too unwieldy to use).
Parsing the entire document lock stock and barrel is an easier thing to write about and benchmark. The problem is with skipping around and pulling out bits of JSON from a benchmarking framework is that attempting to present such data often amounts to "hey, we asked ourselves a question and then we got a really good answer for it!". It's hard to picture what a 'typical' query for some field over a JSON document would look like. Conversely, it's pretty easy to know when you finished parsing the Whole Thing.