CL-PPCRE turns a PCRE string into a parse tree, as a native data structure, and the programmer has full access to this interface. For example, the sample regex:
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/u
could be specified in your program as something like:
(Whether this is easier to read depends on your relative familiarity with CL and PCRE. It's a lot easier to generate or manipulate with native CL functions, though, if you ever need to do that.)
Most HLLs today wrap an existing C regular expression library, and I don't know any C regular expression library that provides a public interface to its parse tree, so it's unlikely that other languages will be able to do something similar without a lot of work.
Of course, regular-expressions-as-strings are still strings, so if you only need to write them, you can get most of the benefit by using your language's native string facilities: https://news.ycombinator.com/item?id=241373
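To make that concrete, here is a minimal sketch in Python (my choice for illustration, not something from the linked thread). The helper names `named` and `digits` are hypothetical and not part of any library. It assembles the date regex from the example above out of ordinary strings; note that Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`.

```python
import re


def named(name: str, pattern: str) -> str:
    """Hypothetical helper: wrap a pattern in a named capture group."""
    return f"(?P<{name}>{pattern})"


def digits(n: int) -> str:
    """Hypothetical helper: match exactly n decimal digits."""
    return rf"\d{{{n}}}"


def date_pattern() -> str:
    """Build the year-month-day regex by joining plain strings."""
    parts = [
        named("year", digits(4)),
        named("month", digits(2)),
        named("day", digits(2)),
    ]
    return "-".join(parts)


if __name__ == "__main__":
    pattern = date_pattern()
    print(pattern)  # (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})
    match = re.fullmatch(pattern, "2008-07-09")
    print(match.groupdict())  # {'year': '2008', 'month': '07', 'day': '09'}
```

This only helps with writing regexes. Nothing here can inspect or rewrite a pattern once it has been built, which is what a real parse tree gives you.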
> Most HLLs today wrap an existing C regular expression library, and I don't know any C regular expression library that provides a public interface to its parse tree, so it's unlikely that other languages will be able to do something similar without a lot of work.
I don't think I know of one either. But Go's regexp library provides access to the syntax[1], and so does rust/regex[2]. In the case of [2], it provides both an AST and a high-level IR for regexes. It's not as convenient to build expressions as in your Lisp example, though there's nothing stopping someone from building such a convenience. :-)
Yup! There are performance reasons not to write a regular expression engine in a language like Python or Ruby. I'm not surprised that other natively compiled languages like Go and Rust are the ones that come closest.