Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Unfortunately, in practice, it's not really the case that "it’s really easy to parse Lisp code". Yes, you can do it in some cases, but to do it correctly in general, at least in CL, is the somewhat infamous "code-walker" problem, which needs to do all sorts of strange things:

Do you handle all varieties of lambda lists? recognize and descend into all the special forms? what do you do with macros? expand them (a mess)? try to walk into special-cased standard ones like 'loop' and ignore user-defined ones?

The closest you can come to doing it sanely is to use a code-walking library like the one in arnesi: http://common-lisp.net/project/bese/docs/arnesi/html/A_0020C...



These are all issues of compilation, not reading. Reading Lisp code into cons-pairs, symbols, numbers, etc. is not that hard at all: you can do it in about 100 lines of code. Lisp can be trivially parsed by a recursive-descent parser. "'Recursive-descent'," as the UNIX-HATERS Handbook says, "is computer science jargon for 'simple enough to write on a liter of Coke.'"


No, you don't READ Lisp. You READ S-Expressions. The Lisp syntax is defined on top of that. To parse Lisp you need to understand the complex Lisp syntax - sometimes done with a code-walker. Trivial parsing can be done with primitive Lisp functions, but a complete parser is not that easy.


Usually resolving things like which identifiers refer to which kinds of things is considered part of the parsing step, not compilation; for example, the is-it-a-typedef-or-not resolution problem is frequently cited as the reason C's grammar isn't context-free. And you can't do even that level of parsing---determining which symbols are function calls and which aren't---for Lisp without expanding macros, or special-casing how to descend into known ones.


Macro-expansion is sometimes considered part of compilation, sometimes a separate stage between parsing and compilation. I've never seen it referred to in Lisp as a part of parsing, though I guess it might be possible to do that with C's text-based macros.

`read` does not perform macro-expansion: that would break data reading and the reading of quoted forms. macroexpand expands macros at a later stage. Once expanded, macros either refer to primitive special forms or function calls, and it's trivial to determine which. Primitive special forms can have their components macroexpand'd as appropriate. Function calls can be left as-is to be compiled (well, the arguments can be macroexpand'd). Eventually there will just be primitive special forms and raw function calls left, ready to be handed to the compiler.


I can buy that in terms of definitions. My objection is more based on the common use-case of "parsing" Lisp in this fashion to do some kind of source-to-source transformation, like the example in this post, which I've also wanted to do. The post saves itself by only replacing the first symbol in a form. But what if you wanted to do something fairly simple like replace every call to foo with some other bit of code?

In idiomatic Lisp code, you either miss lots of the calls, or you have to complicate your code-walker significantly. This is especially the case if you write CL in the (common, but not universal) style that makes significant use of the loop macro, because you either ignore it as a macro, and consider anything inside it opaque until macro-resolution time (because you don't know what it does to its arguments), or you special-case it as a new bit of CL syntax, in which case your parsing is now fancier. Usually you want something like the latter, because source-to-source transformations expect to also replace things inside loops. Same with, say, special-casing setf forms, if you want source-to-source transformations to "do what I mean" in a large number of cases.

It's true that it's very easy to literally get the list representing the code, but there's precious little sensible you can do with that list unless you're willing to descend into some of the more commonly used built-in macros that most CLers treat as de-facto syntax, which requires knowing something about the syntax they in effect define.


I agree with you and lispm about the complexity of code walking in CL.

For your specific example of replacing calls to foo with another bit of code you may be able to get away with macrolet. (Example: http://letoverlambda.com/index.cl/guest/chap5.html#sec_4 )


That's all non-trivial. Common Lisp tried to have a very small Lisp syntax, but it still has about 25 special operators, function calls, macro calls, symbol macros, lambda expressions, lexical and dynamic binding, ...

A code walker then is mildly complex:

   http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/codewalk/walk/new_walk.cl


With reader macros, reading can be made arbitrarily complex:

  (read-from-string "#.(print :foo)")




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: