A parser-combinator approach to parsing JSON

aufreak2 · on June 19, 2009

It looks like modeling your parser as fn(str) -> (list-of (pair-of value remaining-str)) might work out as a more compact (in terms of code) and powerful representation. I did something similar when writing a combinator library for querying xml-like trees and it ended up quite powerful. (you can replace "str" with any sequence type).

For one thing, you can write parsers that return more than one value - i.e. parsers that are ambiguous about how they parse the string they are given. That lets you express "longest" and "shortest" as parser combinators rather than some special operation for each parser.

CatDancer · on June 19, 2009

I didn't quite follow, perhaps an example to help me? Is there a specific place where the code would become shorter?

aufreak2 · on June 20, 2009

Each element of the returned list represents one way that the parser can "succeed". For example, if your parser is for the regexp "a+" and you give the string "aaabc", then the parser can return (("a" . "aabc") ("aa" . "abc") ("aaa" . "bc")] to say that it can interpret "aaabc" in 3 ways. You can transform this into the "longest" parse by filtering this list for the shortest "remaining string".

If your parser returns [] it means it cannot parse the string, so it all fits together. I don't know if I can find the time to rewrite the code to show you explicitly how this could help, but looking at Parsec (in Haskell) will give you a good idea of what I'm talking about

http://legacy.cs.uu.nl/daan/download/parsec/parsec.html

CatDancer · on June 20, 2009

Very interesting! Would you use this for backtracking?

aufreak2 · on June 20, 2009

My comment wasn't JSON specific. After all, the JSON spec was designed to support simple parsing logic from the start. My comment was more oriented towards generalization.

jongraehl · on June 20, 2009

I believe you can abstract single match vs. set of matches using monads.

aufreak2 · on June 20, 2009

Indeed. Parsec is a monadic combinator library.

aufreak2 · on June 19, 2009

A well documented effort, definitely!