
Right now there is a divide among programmers. On one side you have people like the author who crave the power of code-as-data more than they care about nice syntax and therefore love Lisp. On the other side you have people who like more-conventional syntax more than they care about code-as-data and therefore don't love Lisp.

Neither side can understand the other: one side says "why do you resist ultimate power?" and the other side says "how can you possibly think that your code is readable?"

My belief (and what I am starting to consider my life's work) is that the gap can be bridged. Lisp's power comes from treating code as data. But all code becomes data eventually; turning code into data is exactly what parsers do, and every language has a parser. The author says "it's about read," but "read" (in his example) is just a parser.

The author asks "How would you do that in Python?" The answer is that it would be something like this:

  import ast
  
  class MyTransformer(ast.NodeTransformer):
    pass  # Implement transformation logic here.
  
  node = MyTransformer().visit(ast.parse("x = 1"))
  print(ast.dump(node))
This works alright, but what I'm after is a more universal solution. With syntax trees there's a lot of support functionality you frequently want: a way to specify the schema of the tree, convenient serialization/deserialization, and ideally a solution that is not specific to any one programming language.

My answer to this question might surprise some people, but after spending a lot of time thinking about this problem, I'm quite convinced of it. The answer is Protocol Buffers.

It's true that Protocol Buffers were originally designed for network messaging, but they turn out to be an incredibly solid foundation on which to build general-purpose solutions for specifying and manipulating trees of strongly-typed data without being tied to any one programming language. Just look at a system like http://scottmcpeak.com/elkhound/sources/ast/index.html that was specifically designed to store AST's, and see how similar it is to .proto files.
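
To make that concrete, here is a minimal sketch of how a fragment of an expression AST might be specified in a .proto file. This is my own illustration (proto2-era syntax), not Elkhound's format or any existing schema:

    // Hypothetical schema: an expression tree as protobuf messages.
    message Expr {
      // Exactly one of these should be set, emulating a sum type.
      optional Literal literal = 1;
      optional BinOp binop = 2;
      optional Var var = 3;
    }

    message Literal { required double value = 1; }
    message Var     { required string name  = 1; }

    message BinOp {
      required string op    = 1;  // e.g. "+", "*"
      required Expr   left  = 2;
      required Expr   right = 3;
    }

Given a schema like this, the typed in-memory representation, serialization/deserialization, and bindings for multiple languages all fall out of the standard protobuf toolchain.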

(As an aside, programmers have spent the last 15 years or so attempting to use XML in this role of "generic, language-independent, tree-structured serialization format," but it wasn't the right fit because most data is not markup. Protocol Buffers can deliver on everything people wanted XML to be.)

Why should manipulating syntax trees require us to write in syntax trees? The answer is that it shouldn't, but this is non-obvious because of how inconvenient parsers currently are to use. One of my life's goals is to help change that. If you find this intriguing, please feel free to follow:

  https://github.com/haberman/upb/wiki
  https://github.com/haberman/gazelle


On one side you have people like the author who crave the power of code-as-data more than they care about nice syntax and therefore love Lisp.

I crave both the power of code-as-data and nice syntax, which is why I love Lisp.


Some people like putting salt on grapefruit. I'm not saying that it's impossible to like Lisp's syntax, but empirically most people prefer the ALGOL-like syntax, which is why I referred to it in the next sentence as "conventional syntax."

I could attempt to prove to you that "conventional syntax" is inherently superior to Lisp syntax, but that would be a waste of both of our time.


Some people like putting salt on grapefruit.

You sound as if you think that putting salt on grapefruit is inherently strange, while in actuality there's a very good reason to do so: it reduces the perception of bitterness.

I'm not saying that it's impossible to like Lisp's syntax, but empirically most people prefer the ALGOL-like syntax

Empirically, most people prefer what they are already familiar with, so I'm not sure what this is supposed to prove, other than that most people are already more familiar with Algol-like syntax.

For me, Lisp syntax has the definitive advantage that the first identifier in every expression tells me what to expect. I.e., I don't have to scan to the right to figure out what kind of expression this is. For me, this makes code much more readable. And this makes Lisp syntax more "nice".


Presumably salt on grapefruit is thought to be weird because grapefruit is considered good for your heart, while salt is perceived as quite the opposite.


(+ 1 2 3) =

add 1 and 2 and 3

The syntax is a bit terse, but if you teach people a good way to read it, it becomes much more readable than 1+2+3.

The only reason we prefer that way is that we are taught that syntax when we do math in school. I have found it much easier to teach Lisp to people who have no or very little formal education in math.


This argument is based on too shallow an analysis and doesn't stand up to closer examination.

  (/ (+ (- b) (sqrt (- (* b b) (* 4 a c)))) (* 2 a))
Yeah, so it divides (the addition of (-b and the (sqrt of (the difference between (the product of b and b) and (the product of 4, a and c))) by (the multiplication of 2 and a))

Right, that's much easier than

  (-b + sqrt(b*b - 4*a*c)) / (2*a)
(-b plus the sqrt of ((b times b) - (4 times a times c))) divided by (2 times a)


New lines in the Lisp, PLEASE :)

I see you omitted some parentheses in the "conventional" expression, relying on the fact that multiplication takes priority over subtraction. Making this fact explicit is exactly what makes Lisp better, especially for more complex domains: delegating priorities to the notation, freeing brain capacity for the actual problem.


If math is a problem for the user, there is the option of using a modified parser -- for example, an infix parser invoked by a reader macro:

    (defun foo (a b c)
      #I( 

          (-b + sqrt(b*b - 4*a*c)) / (2*a)

        ))

    CL-USER 8 > (foo 1 2 3)
    #C(-1.0 1.4142135)


Sure, there are solutions and it's awesome that they're both possible and easy to use. I'm not arguing against Lisp; I just don't agree that its syntax is better. I agree it is not worse, if you survey a sufficiently large variety of cases.

It may be bikeshedding, but I would not let 'blue is better than red, because the sky is blue' pass either.


Once you add indentation, and know the simple rule that args line up vertically (unless they're so short that you'd rather leave them), the following is pretty easy to read:

    (/ (+ (- b)
          (sqrt (- (* b b)
                   (* 4 a c))))
       (* 2 a))
It tells me:

* there's a quotient of 2 things

* the first thing is a sum of -b and a sqrt

* the second thing is a product

and so on. Pretty nice. Of course, mathematical notation is more terse.


Weak argument. So rather than form a language around what we already learn (math, as you say), we should change all our existing learning to suit a particular language... then it's more readable... riiiiight, good luck with that. :)


Syntax is the Vietnam of programming languages.


> I could attempt to prove to you that "conventional syntax" is inherently superior to Lisp syntax, but that would be a waste of both of our time.

Yes, trying to prove falsehoods is a waste of time.

Conventional syntax is neither conventional nor suited to humans. (If it's "conventional", why isn't there more agreement as to what it is? If it's suited to humans, why aren't there more than 100 who actually know it for any given language?)


Lisp is effectively what other languages treat as the parse tree. In the gap between human and machine, Lisp is closer to the machine. Some people like that, but the majority prefer a language that is closer to the human (and perhaps the closer you get to the human, the less nice it gets, depending on how you define "nice").


> the majority prefer a language that is closer to the human

Closer to "the human"? Do you know more than 3 people who know C++ operator precedence?

Humans don't handle operator precedence very well.


Why then has math (which is read and written only by humans) used infix notation for hundreds of years, whereas prefix/postfix notation were only developed in the 20th century and today are used only by Computer Scientists?

You don't have to know an entire operator precedence table to read and write idiomatic infix-notation code. Precedence is defined such that common expressions evaluate as people intuitively expect (a notable counterexample is "x & y == z" in C, which parses as "x & (y == z)" because == binds tighter than &). Parentheses are always available to clarify more complicated expressions.


Humans who do a lot of math switch notations when convenient. For example, for addition we'll sometimes put a summation sign in prefix notation. For division we like to put the numerator above the denominator, a notation that's inconvenient in a programming language.

Come to think of it, humans usually add and subtract by stacking numbers vertically. I don't think you can point at infix notation as "the" human-friendly notation.


Have you seriously seen (non-computer) people write simple arithmetic as + 1 3 ?

This feels like a discussion based in fiction...


I've seen lots of people doing 1 ENTER 2 + on their HP calculators, using RPL (Reverse Polish Lisp).


Postfix notation is popular in the financial industries.


One could have made the same argument in favour of Roman numerals, in the face of the less familiar Arabic numerals.


Math was written on paper long before there were computers. As a result, the use of infix notation was an act of necessity, not a calculated decision. Now that we have computers and keyboards, we should use prefix notation.


So you think we should teach in schools (from an example above):

(/ (+ (- b) (sqrt (- (* b b) (* 4 a c)))) (* 2 a)) ?


Seems like a great idea to me.

I take it you think there is something intrinsically wrong with that idea?


Certainly, I would hope it would have been obvious, but we might each have different opinions of obvious!

I find it very hard, without bracket counting, to see exactly what the '+' and '/' bind to. With the more traditional:

(-b + sqrt(bb-4ac)) / (2a)

I find in only a glance I can tell what everything is binding to.


> I find in only a glance I can tell what everything is binding to.

It must be nice to live in a world with only 4 infix operators and expressions that have only 3 infix operators.

For example, lots of folks think that sqrt should be a prefix operator, not yet another function. I suppose you're going to assume that the top bar will serve as parentheses.

BTW "-b + sqrt(bb-4ac) / 2a" is the interesting expression. Is it "(-b + sqrt(bb-4ac)) / (2a)" or "-b + (sqrt(bb-4ac)) / 2a)" And, are you certain what "bb-4ac" means? (There's at least one major language where it doesn't mean "(bb)-(4ac)".)


We aren't talking about programming languages. We are talking about teaching math in school. In most of school mathematics, there are only 4 infix operators (well, and also the comparison operators).


> We are talking about teaching math in school. In most of school mathematics, there are only 4 infix operators (well, and also the comparison operators).

And that's how the exceptions swallow the rule. And, it's also how we get infix programming languages where that's definitely not true, and so on. Where should we make the switch?

Also, only four? What about set operations?


I would teach it like this:

  (/ (- (sqrt (discriminant a b c)) b)
     (* 2 a))


Why exactly is it a necessity to use infix on paper? And what exactly is the argument for using prefix just because we have computers? I'd argue quite the opposite: computers give us even more convenience to use whatever we like. I think it is a rather arbitrary choice, but it may relate to the prevalence of subject-verb-object in (spoken) languages; i.e., operators act like verbs.


If you look at math on paper, that's definitely not 'infix'.


It's even stronger, in that mathematics generalized arithmetic algebra into groups, fields, rings, and other things I don't understand. Examples of specific algebras include Boolean, relational, and Kleene (aka regular expressions).

Other notations are used, but with a frequency similar to pre-fix (lisp) and post-fix (forth). "Associativity" (not affected by order of evaluation) only makes sense for in-fix.

But it really could just be familiarity, I guess. I can't see how to determine it either way. But regardless of the cause, there's overwhelming evidence that people, in fact, prefer in-fix.


> But regardless of the cause, there's overwhelming evidence that people, in fact, prefer in-fix.

How many people have seen anything other than in-fix? Of those, how many got a fair shot at an alternative?


> You don't have to know an entire operator precedence table to read and write idiomatic infix-notation code.

If it's "idiomatic", why is there such disagreement?

> Why then has math (which is read and written only by humans)

Convention has a lot of value. That said, mathematicians don't have to worry about getting things wrong. It's just paper, and they're happy to let humans fix up the errors.

> Parentheses are always available to clarify more complicated expressions.

Unnecessary parentheses are how humans deal with the fact that they can't handle infix.


Closer to a virtual machine which treats most things as functions, nouns, and lists; fairly close to human thought. Naked and abstract and systematic, hence easier to process; besides the car/cdr names, I don't see too much machinery here.

Here's IPL, an influence on Lisp and also a list-processing language (copied from Wikipedia):

  IPL-V List Structure Example
  Name    SYMB    LINK
  L1      9-1     100
  100     S4      101
  101     S5      0
  9-1     0       200
  200     A1      201
  201     V1      202
  202     A2      203
  203     V2      0
How human does LISP feel now ;)?


Yes, I don't see how mainstream languages can possibly be considered aesthetically pleasant, with the possible exception of Python. I wouldn't visit a gallery that shows them off.

I find Lisp (particularly Clojure) much more aesthetically pleasant, in that it communicates better with me. With Paredit, it's even better to the touch.

(If there one day came to exist something even better on these metrics, then I'm sure I would start to prefer it aesthetically.)


I agree; I don't get the syntax argument. I am sure, as one of the other posters mentioned, that one is more favored than the other, but once you know the limited syntax of a Lisp it becomes fairly readable. For me it's all about figuring out scope: once I know what denotes scope, it is fairly easy to format the code into a readable form in my mind. The thing about the Lisp dialects is that the rules for syntax are so simple that, once understood, they make the code perfectly readable, at least to me.


> I agree; I don't get the syntax argument

Doubly agreed! I learned Clojure just over a year ago and will never look back. My attitude to people complaining about parents is that they should just get over it. That one hang-up is actually holding them back.


That should have been 'parens', not 'parents', of course. Although, in another context, it's completely valid!


I'll explain it to you: where is the bug in...

(defun substitute-in-replacement ($-value replacement) (cond ((null $-value) replacement) ((null replacement) ()) ((eq (car replacement) '$) (cons $-value (cdr replacement))) (T (cons (car replacement)) (substitute-in-replacement $-value (cdr replacement)))))

From: http://www.csc.villanova.edu/~dmatusze/resources/lisp/lisp-e... with one paren moved.

I remember, when I was taking a class on AI, looking for some sort of style guideline that would help me get through the learning curve, but the FAQ (I want to say it was comp.lang.lisp) just had "coming soon." This would have been the Allegro editor in 2001 or 2002. It may be obvious to an experienced hand, and perhaps if there had been some sort of best-practices guide when I was learning I wouldn't have had the same problem, but I just remember the frustration of my mind playing tricks on me and (even with syntax highlighting) trying to match parens that I thought were there.


1 paren moved and all line breaks and indentation removed. This is intentionally obfuscated. Is there any language that is easy to read when you put five lines of code together like this?

I haven't read a Lisp style guide; Emacs just takes care of indentation - it is immediately clear when a paren is wrong because the shape of the function is wrong. If you are writing Lisp with an editor that doesn't do this, get a better editor; don't blame the language.


Oy. The mistake was actually in posting too late and not using the proper code tag. Don't drink and post, kids. ;) When I brought the code into vim I used the same indentation as the example, I saw that the indentation changed, but you're saying "The shape of the function is wrong."

It's quite possible my experience as a programmer today would be different than when I started -- I mean, I made it through a few chapters of SICP without such troubles, but in the back of my head was the memory of trying to figure out my logic error in a bit of code when it was really a misplaced paren.


>>Is there any language that is easy to read when you put five lines of code together like this?

Arguably Python -- but to get that, Python sacrificed both usable anonymous functions and the ability to cut-and-paste a code fragment and just ask the editor to reindent it.

Hardly worth the price.


The Lisp compiler tells you:

    (defun substitute-in-replacement ($-value replacement)
      (cond ((null $-value) replacement)
            ((null replacement) ())
            ((eq (car replacement) '$)
             (cons $-value (cdr replacement)))
            (T (cons (car replacement))
               (substitute-in-replacement $-value (cdr replacement)))))

    CL-USER 5 > (compile 'substitute-in-replacement)
    ;;;*** Warning in SUBSTITUTE-IN-REPLACEMENT: CONS is called with the wrong
    ;;;     number of arguments: Got 1 wanted 2
    SUBSTITUTE-IN-REPLACEMENT
Lisp compilers able to present these error messages have been in use for more than 40 years. Common Lisp has had them since day one.


Took me less than a minute to find it. Just pasted the code into my Lisp editor (I use CCL), added a few line breaks in the obvious places, hit TAB a few times, and it was immediately obvious.

Actually, it's pretty obvious even without doing all that. CONS always takes two arguments.


My code doesn't look like that. For 30-50% of my code, I try to use a style more like:

  (defn thingies [id]
    (->> id
         fetch
         read-json
         :rows
         (map :thingy)))
(Of course, my code is often more complex and messier than that, even when using ->>, but some fairly significant percentage of my code does look that simple.)

I'm sure there's stuff to criticize about Clojure, but we can look at real-world code in another mainstream language (Javascript+node.js? PHP? Java?) and point out readability problems too. (Python maybe being an exception in terms of readability-in-the-small, for things that fit in the mainstream style. Though as someone pointed out, there's maybe some problems with manipulability.)


You can use the protocol buffer schema language to define your ASTs if you want, but I think that addresses only a relatively small part of the problem.

There are two larger problems in adding Lisp-style macros to non-Lisp languages, one social and one technical.

The social problem is that language designers must be persuaded to publish a specification of the internal representation of the AST of their language. This makes the AST a public interface, one which they are committed to and can't easily change. People don't like to do this without a good reason.

The technical problem is more difficult, though. To make a non-Lisp language as extensible as Lisp would require making the parser itself extensible. This is not too hard to implement, but perhaps not so easy to use. If you've ever tried to add productions to a grammar written by someone else, you know it can be nontrivial. You have to understand the grammar before you can modify it.

And if you overcome the difficulties of having one user in isolation add productions to the grammar, what happens when you try to load multiple subsystems written by different people using different syntax extensions which, together, make the grammar ambiguous?

I don't know that these problems are insurmountable, but a few people have taken a crack at them, and AFAIK no one has produced a system that any significant number of people want to use.

It's worth taking a look at how Lisp gets around these problems. Lisp has not so much a syntax as a simple, general metasyntax. Along with the well-known syntax rules for s-expressions, it adds the rule that a form is a list, and the meaning of the form is determined by the car of the list -- and if it's a macro, even the syntax of the form is determined thereby.
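
To see how little machinery that rule needs, here is a sketch in Python, with s-expressions as nested lists; the names and the "unless" macro are invented for illustration:

    # Sketch: s-expressions as nested Python lists. The head symbol of a
    # form determines whether it is a macro call; expansion is a tree walk.
    MACROS = {}

    def macroexpand(form):
        # Expand while the head names a macro, then recurse into children.
        while isinstance(form, list) and form and form[0] in MACROS:
            form = MACROS[form[0]](form)
        if isinstance(form, list):
            return [macroexpand(f) for f in form]
        return form

    # (unless test body) => (if test nil body)
    MACROS["unless"] = lambda f: ["if", f[1], "nil", f[2]]

    print(macroexpand(["unless", ["null", "x"], ["print", "x"]]))
    # => ['if', ['null', 'x'], 'nil', ['print', 'x']]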

Add a package system like CL's, and you get pretty good composability of subsystems containing macros. You can get conflicts, but only when you explicitly create a new package and attempt to import macros from two or more existing packages into it.

Applying these ideas to a conventional language gives us, I think, the following:

(*) While the grammar is extensible, all user-added productions must be "left-marked": they must begin with an "extension keyword" that appears nowhere else in the grammar.

(*) Furthermore, those extension keywords are scoped: they are active only within certain namespaces; elsewhere they are just ordinary names. This requires parsing itself to be namespace-relative, which is a bit weird, but probably workable.

I think that by working along these lines it might be possible to add extensible syntax to a conventional language in a way that avoids both the grammatical difficulty and the composition problem. And if you do that, maybe you can then get the relevant committees or whoever to standardize the AST representation for the language.

I've never taken a crack at all this myself, though, because I'm happy writing Lisp :-)


My goal is not to add Lisp-like macros to every language. That would be a bit presumptuous; not all languages want Lisp-like macros.

My goal is to make AST's as available and easy to traverse/transform as they are in Lisp. This is the foundation that makes things like Lisp's macros as powerful as they are. And easy access to AST's enables so many other things like static analysis, real syntax highlighting, and detecting syntax errors as you type.

In a way, Lisp-like macros are just a special case of tree transformation that puts the tree transformer inline with the source tree itself. But this is not the only possible approach. You could easily imagine an externally-implemented tree transformer that implemented GCC's -finstrument-functions. This tree transformer could be written in any language; there's no inherent need to write it in C just because it's transforming C.
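
As a sketch of what such an external transformer could look like, here Python's ast module stands in for a C toolchain; the transformer below is my own illustration and only mimics the spirit of -finstrument-functions:

    import ast

    class InstrumentFunctions(ast.NodeTransformer):
        # Splice an "on enter" trace into every function body.
        def visit_FunctionDef(self, node):
            self.generic_visit(node)  # also instrument nested functions
            trace = ast.parse("print('enter %s')" % node.name).body[0]
            node.body.insert(0, trace)
            return node

    tree = InstrumentFunctions().visit(ast.parse("def f(x): return x + 1"))
    ast.fix_missing_locations(tree)  # new nodes need line/col info
    env = {}
    exec(compile(tree, "<ast>", "exec"), env)
    env["f"](41)  # prints: enter f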

It's true that a compiler/interpreter could be reluctant to expose its internal AST format. But there's no reason that the AST being traversed/transformed has to use the same AST schema that is used internally; if you can translate the transformed AST back to text, it can then be re-parsed into a completely different format. And with a correctly implemented AST->text component, this would not be a perilous and fragile process the way pure-text substitution is.
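
In Python, for instance, that round trip is two library calls (ast.unparse requires Python 3.9+), so a transformed tree never has to expose one parser's internal node format to another consumer:

    import ast

    tree = ast.parse("x = 1 + 2")   # text -> this parser's AST
    # ... transform the tree here ...
    text = ast.unparse(tree)        # AST -> text again
    fresh = ast.parse(text)         # text -> any consumer's own AST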


The author of Magpie has some interesting ideas about designing a language with extensible syntax. Sorry I can't find you a more specific link right away.

  http://magpie.stuffwithstuff.com/index.html


Have you looked at Nemerle?


I don't know Lisp; I've only tried Clojure, but I think this applies. I don't think it's about code-as-data per se so much as the philosophy that there are no "special cases" beyond the few special forms needed to bootstrap the language. This simplicity allows you to shape the language the way you want without having to consider the impact of your modifications on existing code.

Consider the hell C# or Java teams have to go through when they introduce things such as async, lambdas, LINQ, etc., and how those features interact with the existing language. Consider implementing pattern matching for C# and all its edge cases; even if you had an open compiler it would be difficult. Python is no better: it has a rigid class/type model that has been abused more than once, e.g. to provide metadata for ORMs. I once tried to extend ORM functionality that was built on metaclasses and multiple inheritance, and the hell you go through with metaclass ordering is insane compared to writing a declarative DSL in Clojure with a macro and leveraging an existing ORM library. And when libraries overload operators to implement DSLs, you immediately run into problems with operator precedence. These problems don't go away even if you can redefine the precedence, because the result is not consistent with the rest of the language.

So what I'm saying is that complexity is significantly reduced when you have a small, consistent core. As for readability, I think Clojure improves things by providing distinct literals for vectors and maps, and those literals have consistent meaning in similar situations, so they provide nice visual cues. But immutability by default, clear scoping, and functional programming make things like significant whitespace and pretty conditional-expression syntax bikeshedding-level details.


Protocol buffers are a reimplementation of some whizzy stuff we did at a messaging start-up circa 1994.

They were great for messaging... but we found ourselves using them /everywhere/. And since our stuff worked in many different environments (C++, Java, and Visual Basic were the ones we directly supported), you could have your choice of language.

It's flattering to see this rediscovered, several times over :-)


Yep, I believe that it will continue being "rediscovered" until some well-executed, open-source embodiment of them becomes a de facto standard. My goal is to make Protocol Buffers just that.

Another way of putting it is that I'm trying to beat Greenspun's Tenth Rule by making that "half of Common Lisp" separable from Common Lisp so that C programs (and high-level programs too) don't have to keep re-inventing it. As a bonus, this will help make languages more interoperable too.


I wrote a (batshit crazy) Python library to do exactly what you're talking about. It converts Python ASTs into S-expressions (just Python lists) and allows you to apply 'macros' to methods just by using decorators.


I guess it would've been helpful to actually link the project! https://github.com/daeken/transformana


I don't agree that "the other side" cares so much about conventional syntax. Syntax is a very superficial thing; I think it's rather that most of the time you don't want to have to deal with the power and complexity of something like Lisp. If you compare it with natural language, most of the time we don't speak in poetic or literary language, even though that might be the most beautiful; rather, we naturally strive for efficiency and simplicity, using a much smaller vocabulary, re-using common expressions, being redundant, etc.


I'm also in some kind of middle ground. I like the power Lisp provides and don't mind the syntax, but I also like and appreciate what languages with rich syntax provide. On the other hand, I don't particularly love either one. I think the only way I'll truly be happy is if the gap is bridged without really giving up either side's advantages.

But there are other factors that would make me happier with a language than closing the gap between expressive power and great syntax. For example, I would love a language with nice syntax and good metaprogramming (e.g. Python) that also had an unambiguous visual representation (something like Max, say) that you could switch between at will. (I don't know how realistic that would be without adding complexity or ambiguity or ruining code formatting.)


TXL is the best I've seen for transforming ASTs (sample: http://www.txl.ca/tour/tour9.html). As you can see, it's a little complex, which is because the problem is complex. I think they do a really good job. (TXL homepage http://www.txl.ca/)

Also: I recall Matz said Ruby was Lisp with friendlier syntax (but I can't find the quote right now, so maybe he didn't).


I also think there is a way to have the best of both worlds, but I have so far taken a rather different approach. What do you think of what I've got here?

  https://github.com/andrewf/fern
It's very much a prototype, and my ideas have evolved a lot (towards Lisp), but I'm at least curious what you think of the ideas in the README.


In your example you say "Once in memory, this structure can be easily outputted as CSS." I think you may be underestimating how hard this will be. CSS may look like it's just a bunch of maps, but there is more to it than that. Consider something funky like:

  #id > p a.red:visited {
    background: url(foo.png) white;
    margin: 0 3px 5em 80% ! important;
  }
There's a lot going on here. CSS isn't just key/value maps.

Also, I don't think you want a data language to be Turing-complete. PostScript was Turing-complete but PDF is not; this makes PDF easier to deal with because it's easier to analyze and there's no risk of it getting into an infinite loop.


True, I'll confess I didn't know much about the more arcane CSS selectors when I wrote that (still don't). Complicated properties, though, are not too hard, and the important thing is that it starts off in a form that's easy to process.

I don't intend it to be just a data language. What I've been moving toward in my daydreams is a DAG of, for lack of a better word, function calls (some interesting data doesn't really fit in a tree), some of which are generators. If Turing-completeness is a problem in your context, you can reject some or all generators and/or just not evaluate them, i.e. take them as pure data. But I don't want to limit myself. I would have no problem if it turned into a general purpose language with a nice data-oriented subset.



