Why concatenative programming matters (2012)

layer8 · on July 17, 2022

The ergonomic drawback of concatenative languages is that when you see `a b c d e`, you don’t know whether it means `e(d(c(b(a))))`, or `e(b(a), d(c))`, or `e(a, b, c, d)`, or `b(a), e(c, d)`, etc. The answer depends on the types of a, b, c, d, and e. The syntax `a b c d e` is a linearization of the call tree, whose node structure can be recovered from the types of the symbols (assuming the language is statically typed). As a human reader, it means you have to know the types to understand the structure of the expression. This imposes a cognitive load that other types of languages don’t.

It also removes redundancy that can be helpful to prevent programming errors. While the type checker verifies that the types match a structuring of the expression, that derived structure is not directly visible to the human programmer, and the programmer cannot easily see whether it actually matches the programmer’s understanding.

carapace · on July 17, 2022

In practice, this rarely comes up. You're almost never looking at ambiguous expressions. Instead the code is written as semantic factoring of the desired "bridge" (program) between the problem domain and solution space. (See "Starting Forth" and especially "Thinking Forth".)

whartung · on July 17, 2022

The other issue, specifically in Forths, is that it's not immediately obvious WHEN an argument is processed.

For example, you could have a construct such:

    ( read first block from FILE.SCR )
    OPEN FILE.SCR 1 BLOCK

But if you try to define a word to do that:

    : get-block OPEN FILE.SCR 1 BLOCK ;

You will find the second one doesn't work because OPEN is special, reading from the input stream for the filename at RUN TIME, not COMPILE time.

The OPEN command is more suited to the immediate "shell" experience, than for use in code. But it's not obvious, certainly not immediately apparent at first sight. Need to dig into the documentation is understand the difference.

astrobe_ · on July 17, 2022

Not the best example. The lack of quotes around file.scr is an extra-giant-huge hint that OPEN does some parsing, and therefore you should check the docs before using it in a definition (actually it is so very likely that it "won't work" in a definition that you won't even bother checking the docs).

Example aside, this type of word is usually used in a DSL-y way and with parsimony. Confusing uses are to be filed in the misuse/abuse category.

vitiral · on July 17, 2022

Like the author said, that is optional. I'm writing a concatenative language with a C-similar syntax

GitHub.com/civboot/fngi

layer8 · on July 17, 2022

It’s not optional if you want partial application/implicit composition. You can certainly have multiple syntaxes within the same language, but you can’t get the benefits of concatenative languages without also getting the drawbacks.

vitiral · on July 17, 2022

Interesting, I hadn't actually considered partial composition, since my language is very "at the metal" and partial composition isn't possible (or rather would have to have a macro and data-structure built around it). That's a good point for a "functional-style" language though.

adastra22 · on July 17, 2022

Partial compilation is pretty core to concatenative languages. I don’t know how you can call yours one without it.

vitiral · on July 17, 2022

You mean partial _application_. The author called FORTH a concatenative language, and FORTH definitely doesn't have partial application. In fact, FORTH doesn't even have a type system. Perhaps this is a miscategorization?

fngi does have a type system which modifies a stack. However, it will be lacking many high-level type system features, being similar (but smaller and lighter-weight) than golang. Not totally sure if it qualifies or not.

js8 · on July 17, 2022

But this is true for ML syntax too, isn't it? And ML-likes are widely considered to be readable.

There are also features of concatenative languages, like quoting in Joy or Factor, which can help alleviate the problem.

layer8 · on July 17, 2022

> But this is true for ML syntax too, isn't it?

In ML, function application is left-associative, and therefore `a b c` always means `(a b) c`. It can’t mean `a (b c)`, even if that would fit the types.

The expression structure is therefore independent from the types in ML. You can parse the expression tree without a symbol table. In concatenative languages, on the other hand, you need to look up the types, or if the language is dynamically typed, the expression structure only manifests at runtime.

js8 · on July 17, 2022

What I mean in ML syntax, every function is a curried function of only one argument. So you don't know how many arguments a function has without looking up its type. In concatenative languages, this is similar, every function (with exception of special forms) is a function from stack to stack, and to see how many arguments it has you also have to look at its type.

I think concatenative syntax is in fact more elegant than ML syntax, since it also "curries" return values, and is evaluated in the same order as written.

GrumpySloth · on July 17, 2022

If b is declared to be infix by default, then it will be an application of function b to arguments a and c though. So you need to know the symbol table anyway it seems.

IHLayman · on July 17, 2022

Haskell coders have to do this often enough that there is a token that does this grouping… f $ g x = f (g x), so the expression `a b c` is still a source for ambiguity.

avgcorrection · on July 17, 2022

`$` is an operator. There is no ambiguity (you might not know the precedence table but that doesn’t make it ambigious).

pklausler · on July 17, 2022

`$` = id

zetalyrae · on July 17, 2022

I think currying by default is an antifeature and ML would be better off with f(x,y,z) call syntax. It interacts poorly with type inference: forget a parameter, or swap a parameter, and get an incomprehensible type error message about some function type that has little to do with the actual fault. This is prevented by parenthesizing expressions and using tuples as the sole argument.

GrumpySloth · on July 17, 2022

> But this is true for ML syntax too, isn't it? And ML-likes are widely considered to be readable.

I don't see how that's true for ML. Could you give an example showing the ambiguity?

Edit: I think I can see how the presence of infix syntax may cause that. As in: when you see "a b c" it will read differently based on whether "b" is a function that's declared to be infix by default. Yeah, I think infix notation is a mistake. The more I think about syntax, the more appeal I see in S-expressions.

js8 · on July 17, 2022

In ML syntax, you don't know how many arguments a function will take, because of currying, and it's a (useful) feature, not a bug.

Similarly, in concatenative languages, there is ambiguity how many parameters will a function take/leave on the stack.

GrumpySloth · on July 17, 2022

> In ML syntax, you don't know how many arguments a function will take, because of currying, and it's a (useful) feature, not a bug.

Currying doesn't change the picture here from the situation in, let's say, C. If it wasn't for "infixr", there would only be one way to parenthesise an expression. Just like in C, you could apply a function of N arguments to N+2 arguments, and the function call would get interpreted by the compiler as function application to N+2 arguments. It's just that there would be a type error, but gramatically, there is no ambiguity about which symbols are used as called functions, and which as arguments to the functions (and to which functions)... as long as you don't use "infixr". Same with a function of N+2 arguments getting N arguments. At some point you get a type error probably, because you're going to get a function, where you expected a non-function; but the type system doesn't influence the building of AST (in this place at least). Looking at the source you know how many arguments are getting applied to what, without even knowing the definitions of functions, as long as you don't play with infix functions.

To make an example, again ignoring infix stuff, given "a b c d e", I know to read the whole expression as application of function a to arguments b, c, d, e. To be even more literal, SML doesn't have multiple-argument functions, all functions take one argument, and I can write it as "((((a b) c) d) e)". Unlike in concatenative languages, the function application doesn't suddenly stop in the middle of this expression, just because a takes fewer arguments than it was given, e.g. this expression can never be interpreted as "((a b) c) (d e)", even if it made more sense from the point of view of types of a, b, c, d, e.

But when you do add infix syntax, then yeah, there is ambiguity, and now AST building is driven by the symbol table a bit.

js8 · on July 17, 2022

> To make an example, again ignoring infix stuff, given "a b c d e", I know to read the whole expression as application of function a to arguments b, c, d, e.

Yes, but what does it really tell you? It doesn't tell you that the final result is a primitive value, it might as well be a function again. (We need to be careful how we define an "return argument", whether it is only primitive types, or also function types, and based on the chosen definition, the ambiguity is present in both types of syntax, or in neither.)

Similarly, if I have a program "a b c d e" in concatenative language, it is a function from stack to stack, and we don't know, how many arguments it either took from or left at the stack when it's done, without looking at the types.

The point is, ambiguity regarding number of processed arguments is there in both cases. Except in concatenative languages, it also applies to the returned arguments (i.e. a "function" in a concatenative language can kinda take negative number of arguments).

I think what mainly confuses you is that you really don't need an AST with concatenative languages. And it cannot be created, it's not a tree, more like "abstract syntax sequence".

nmz · on July 17, 2022

Its seen as bad practice to use anything but one argument per word.

There's also quotations, retroforth[1] tends to use a prefix based system for indicating the type of the argument, c:put will print a character, s:put will print a string.

https://github.com/crcx/retroforth/blob/master/doc/NamingCon...

https://github.com/crcx/retroforth/blob/master/doc/QuickRef....

zozbot234 · on July 17, 2022

Yes, but this is exactly parent's point. Once you allow for the possibility of procedures taking multiple arguments, one has to deal with call trees which cannot be represented unambiguously and compositionally in linearized form. And if this is extended to multiple return values, the "tree" is replaced by a dataflow or control-flow graph (both are possible, and not directly compatible). The change from an ordinary concatenative language is quite substantial.

MaxBarraclough · on July 17, 2022

> Its seen as bad practice to use anything but one argument per word.

I've never heard this before. Many of the words in the ANS Forth standard [0] consume more than one element from the data stack, and it's hard to see how else they could be defined.

The '+' word consumes 2 elements, as we'd expect. So does '!'.

[0] http://lars.nocrew.org/forth2012/alpha.html

qgc · on July 18, 2022

> Many of the words in the ANS Forth standard consume more than one element from the data stack

ANS generally frowned upon by Forthers, the idea of having a standard, cross-platform Forth is something of an insult the Forth philosophy. If you check out something like the Silicon Valley Forth Interest Group guys, they all make their own Forths, or use one specific to the platform they're using (such as Brad Nelson's Forth for the ESP32, if they're using that device.)

Additionally, I don't agree you should never use more than one, but the majority should only take one, the cases where you do take 2 or 3 should be either primitive words or fit the style and form of primitives that take in 2 or 3. (for instance we might see `cells!` take in 2 variables just like `!`, but it's styled like said primitive so it's behavior is clear and follows known patterns.)

It would be more helpful/accurate, to say that one should only have a maximum of 3 items on the stack at a time (which is what Chuck suggests) and so any time where a word takes in 3, it should be consuming the stack and replacing it with either zero or one items in return.

qgc · on July 19, 2022

If you want to see Chuck talk about these kinds of things "1x Forth" [1] is a great talk from him on these kinds of things.

[1] https://youtu.be/pSnNy7IpVMg

MaxBarraclough · on July 19, 2022

> ANS generally frowned upon by Forthers

That sounds like an overstatement. Opinions vary, yes, but many of the major Forths for desktop and server target ANS Forth: pForth, gforth, iForth, SwiftForth, and I think also VFX Forth.

qgc · on July 23, 2022

It's really not.

This isn't a difference of opinion that looks like "well some thing x is better at doing y than z" It's a fundamental disagreement on the purpose and practice of the language. For regular Forth users, the high and low level sit together with Forth, you don't experience a sense of any kind of "layers" of abstraction over the fundamentals of what the machine is and does. One can dispute whether Forth has this property, but it's really not an overstatement to say that regular Forth users feel this sense deeply and share this subjective experience.

The capabilities of processors change a ton over time, which is the most damning aspect of ANS Forth, it is simply outdated!

--

The "major forths" you speak of are exactly what is antithetical to the general forth philosophy of implementing only what is necessary above the lowest level. Even the term "major forth" really speaks to the bias that popular programming languages and their users have. And there's a really good reason for this, standard Forth words really just don't make any sense unless they give you access to the hardware in a way that cross platform software is largely antithetical to. Cross platform components in software fundamentally act as either a service to a platform layer or otherwise provide an abstracted interface to a varying implementations.

The whole point of a language like Forth is to turn the lights on, to play the audio and to drive the car. Chuck was wholly aware of higher level systems like Lisp when he created Forth and purposefully developed a tool that made it easy to start from first principles and develop a Forth system to suit your needs on a domain specific level. That is, the domain of the device and it's capabilities.

The closest thing to a de-facto standard is more like eForth.

> "eForth allows me to make a complete Forth system with about 30 very simple machine code routines. With so few words to code, I could do the coding by hand, that is, without the need to write an assembler first. After this simple model is running, it is desirable to code much of the rest in assembly." http://www.forth.org/eforth.html

Notably it is reasonable to have portable forths, in that some of the same patterns may emerge, (some primitives are obviously going to be the same, you'll always need dup), but they aren't cross platform in that the code simply doesn't translate. The words needed just aren't the same. You don't want to write reusable code, you should just rewrite it.

It's actually hard to overstate how disliked ANS Forth is, not only by it's creator but by the active community (there aren't many avid, regular users who aren't a part of some kind of interest group.)

I really do recommend 1x Forth which I posted in response to your first comment to get at the kind of simplicity that the Forth philosophy targets.

"The ANS Forth standard is at least one, maybe two orders of magnitude more complex than Mr. Moore's approach to Forth. He says that code should be so simple that most type of errors simply can't happen. In the late eighties and early nineties Chuck quit writing code in Forth and experimented with sourceless programming. His first versions of his VLSI CAD software, OKAD, were constructed without source using his tools in his OK operating system. Later he return to Forth programming and rewrote OKAD II under his new colorForth. [Moore, 2000] In the chat session Chuck was asked, 'How did you come to the conclusion that Forth was too complex, and that sourceless programming was your next move?'

His reply was, 'Maybe by reading the Forth Standard.'" http://www.ultratechnology.com/levels.htm

In Chuck's conception, it's the kind of simplicity that's necessary if you want to start putting computers in your eye. https://youtu.be/0PclgBd6_Zs

astrobe_ · on July 17, 2022

"One" would be troublesome... Two is actually common, three is in the red zone, four you refactor. Except in specific cases where 4+ arguments doesn't cause stack juggling.

Chuck Moore (inventor of Forth) said once that he don't use stack comments anymore. He claims that the arguments a definition takes should be obvious from the name or the definition (one-liners are common in Forth) or be documented somewhere else than in the code. Moore's claims are often bold and hardly believable. But when you put it to the test yourself, they sometimes do work for you too. This one does work for me.

The answer to the "cognitive overload" argument is kind of trivial: "if it hurts, don't do it". The long answer is that concatenative languages have their idiomatic ways of doing things like any other language. Don't be stubborn; practice and embrace it.

BaseballPhysics · on July 17, 2022

I completely disagree with this analysis:

> The reason “Why Functional Programming Matters” was necessary in the first place was that functional programming had been mischaracterised as a paradigm of negatives—no mutation, no side-effects—so everybody knew what you couldn’t do, but few people grasped what you could.

The issue with functional programming, and the reason for that referenced piece, aren't because people don't know what you can do with functional programming languages.

It's because people don't know why they should care.

Things like "immutability, referential transparency, mathematical purity" are pretty, and it's neat to see the machinery applied in practice, but in the end people pick their tools because they help them get their jobs done more effectively, and it's exceedingly difficult to explain to a working programmer why "referential transparency" is gonna meaningfully improve their lives.

This piece, then, starts from this basic misunderstanding and proceeds to make precisely the same mistake.

Yes, what the author demonstrates in this piece is pretty neat. But why should I care? What would motivate me to move off of the imperative languages I'm already familiar with?

avgcorrection · on July 17, 2022

You say that people know what they can do with FP. Then you say that it is “pretty” and it is “neat” to see the machinery in action. I.e. it’s just superficial stuff.

Seems your precise disagreement with the author is: the author thinks that FP is useful and you don’t. Because people who argue for FP don’t do it because it’s a neat and pretty exercise.

BaseballPhysics · on July 17, 2022

> Seems your precise disagreement with the author is: the author thinks that FP is useful and you don’t.

I'm afraid you've misunderstood my point. I had no intent to make a value judgement about FP whatsoever, one way or the other.

My criticism is about the messaging.

IMO this piece misses the mark because it starts off with a faulty premise, and it's the same mistake I've seen made by every other article about FP.

My comment was inspired by the fact that I got to the end of the piece thinking "that's pretty cool, but I still don't understand why I should care about this in practice", and then I looked back at the thesis and realized the author never intended to explain that because they misunderstood the motivation for writing "Why Functional Programming Matters" in the first place.

avgcorrection · on July 17, 2022

Do you think that FP is useful? Specifically pragmatic?

BaseballPhysics · on July 17, 2022

Honestly I'm not terribly interested in participating in that particular debate (I've been around long enough to know to avoid religious discussions), but I will say this:

Step 1: define FP.

Does that mean purity (meaning no side effects)?

Does that mean immutability?

Does that mean powerful and novel type systems?

Does that mean code as data?

Does that just mean a specific approach and practice of software construction and composition?

IMO a huge part of the battle when discussing FP is settling on the terms of debate.

avgcorrection · on July 17, 2022

Define it in the way that you want. You have read the Why FP Matters article.

BaseballPhysics · on July 17, 2022

I think some things that people throw under the FP umbrella are valuable and pragmatic and some aren't.

My point is FP isn't a monolith and the term is a bit of a Rorschach test.

I remember a time when Lisp was considered the quintessential FP language and was taught in schools as an example of the paradigm.

Today, Haskell adherents would scoff at the idea that Lisp is anything but a procedural language dressed up with some casual nods to the lambda calculus.

So your question is somewhat meaningless without qualification, and asking me to set the terms of debate is odd given you're the one who posed the question in the first place.

avgcorrection · on July 17, 2022

Let’s remember that you are the one who quoted the part about FP and how you took issue with it. But in a very strange way: the quote is wrong, not because people don’t know about the benefits of FP but because they don’t… know why they should “care”. But this is not a value judgement, you said (being apparently useless to the working programmer is not a value judgement). So you have some kind of opinion about FP. It felt natural to ask exactly what that opinion is, since it’s stated so indirectly (care). But now FP is a “Rorschach test” and anyone who wants to discuss it with you needs to tediously lay out exactly what kind of FP they mean by FP. Even though you are the one who prominently brought up FP in your comment (even though perhaps the point you were driving at was why one should care about conc. programming). And even though you had a handy reference in “Why FP Matters” that you referenced yourself.

FP is certainly not a Rorschach test compared to your obfuscated and indirect style of writing on this specific topic.

BaseballPhysics · on July 17, 2022

> (being apparently useless to the working programmer is not a value judgement)

Except I never said that.

You seem to be very confused about what I wrote and frankly seem to have an axe to grind. I suggest you read it again.

Mine was a point about articles advocating for FP, not about FP itself.

> So you have some kind of opinion about FP.

No, I don't, which is why I was reluctant to being baited into this discussion in the first place. Clearly I should've followed my instincts.

I have an opinion about the author's interpretation of the Why FP Matters essay, and I have an opinion about the structure and delivery of their message. I have an opinion about articles advocating for FP. I don't have a strong opinion about FP itself.

That's the difference between critiquing the content of an argument vs the structure of an argument.

Yes that's a subtle distinction.

> And even though you had a handy reference in “Why FP Matters” that you referenced yourself.

Did you actually read the article? Because I did.

The article cited that essay, not me. My criticism is in the author's interpretation of that essay.

> FP is certainly not a Rorschach test compared to your obfuscated and indirect style of writing on this specific topic.

And now it's getting personal. This will be my last response in this conversation. Carry on arguing without me if you like.

MaxBarraclough · on July 17, 2022

> it's exceedingly difficult to explain to a working programmer why "referential transparency" is gonna meaningfully improve their lives.

I think the Simple Made Easy talk, by Rich Hickey (of Clojure fame) does a pretty good job on this front.

https://www.youtube.com/watch?v=SxdOUGdseq4

Discussed:

* https://news.ycombinator.com/item?id=4173854

* https://news.ycombinator.com/item?id=23905051

Hickey did another similar talk called The Value of Values.

moomin · on July 17, 2022

I can understand why people get excited about the concatenative paradigm, but I’ll concede that my personal take is “Sounds like some bits of LISP married to the point free obsession of some Haskell programmers”

iostream25 · on July 17, 2022

actually.... FORTH...

https://en.wikipedia.org/wiki/Forth_(programming_language)

astrobe_ · on July 17, 2022

Except perhaps that Forth doesn't "obsess" over theory, it is rooted in pragmatism: making a Forth interpreter is easy.

voxl · on July 17, 2022

Most if not all concatenative languages are linear. Lisps can also be linear, and languages like Forth can also be non-linear, but there is a cultural divide there.

zetalyrae · on July 17, 2022

If you're interested, this article by Henry Baker draws a connection between permutation stack machines and linear types/GC-free memory management: "Linear Logic and Permutation Stacks--The Forth Shall Be First"

https://www.plover.com/~mjd/misc/hbaker-archive/ForthStack.h...

carapace · on July 17, 2022

I didn't see it linked from the article, Jon Purdy has a great talk about this too: "Concatenative Programming: From Ivory to Metal" (2017) https://www.youtube.com/watch?v=_IgqJr8jG8M

There's also "A Conversation with Manfred von Thun" http://archive.vector.org.uk/art10000350 which is worth reading (IMO) if you're interested in concatenative languages. He's the creator of Joy.

I've been working with Joy over the last few years now and I really think there's something there. It seems to combine the best features of both Forth and Lisp.

adastra22 · on July 17, 2022

It’s still possible to work with Joy today?

carapace · on July 17, 2022

The C source on the old site compiled and ran when I tried it: https://www.kevinalbrecht.com/code/joy-mirror/joy.html (mirror)

My own project is here: https://joypy.osdn.io/ It includes interpreters in Python, Prolog, and Nim (and a start on Rust) and some explorations of compilers and type inference/checking written in Prolog.

avgcorrection · on July 17, 2022

Author ended up doing a lot of work on Kitten https://github.com/evincarofautumn/kitten

vitiral · on July 17, 2022

I'm writing a concatenative language with a C-similar syntax. As a first language (to write) I've found it very pleasant. Following the forth models, I start with an extremely lean (~2000 lines of C) VM and assembly and that builds a macro-based language in a few thousand lines -- all in a single build+execute step.

I'm currently implementing function syntax, and the type system will be next (also C like with a bit of interface-like "Role" objects and type extension)

GitHub.com/civboot/fngi

iostream25 · on July 17, 2022

https://www.concatenative.org/ might be interesting for some folks

amelius · on July 17, 2022

Sounds like something you'd rather not work in directly, but could be useful as an intermediate representation for a compiler (?)

pxeger1 · on July 17, 2022

Some compilers do indeed use a stack-based IR (which is a concatenative system). Also mentioned in the blog post are JVM and CPython bytecode.

More common, though, is Static Single Assignment.

amelius · on July 17, 2022

I was thinking more in terms of supercombinators, lambda lifting, etc.

iostream25 · on July 17, 2022

Great stuff, this is a different approach than the FORTH books provide, and I suspect it may open other doors.

adastra22 · on July 17, 2022