
The elegant minimalism of Forth is inspiring: how many other languages can fit their REPL and development environment into a few kilobytes?

However, I find implicit arity to be the largest barrier in reading concatenative programs.

To understand what a line of Forth code does, you need to understand both what each function does, and how it manipulates the stack to accomplish it. That's more memory pressure than in an imperative language like C, where the flow of data through a computation is explicit. It's exacerbated by the tendency to factor a single function into many simpler ones.
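To make the reading burden concrete, here is a minimal Python sketch of a Forth-like evaluator (the word set and the example line are mine, purely illustrative). Following the line means replaying exactly the stack states printed below in your head:

```python
# Minimal Forth-like evaluator: the source text "2 3 dup * +" never
# shows each word's arity, so the reader must track the stack state
# after every word, as printed here.
words = {
    "dup": lambda s: s.append(s[-1]),               # ( a -- a a )
    "*":   lambda s: s.append(s.pop() * s.pop()),   # ( a b -- a*b )
    "+":   lambda s: s.append(s.pop() + s.pop()),   # ( a b -- a+b )
}

def run(line):
    stack = []
    for tok in line.split():
        if tok in words:
            words[tok](stack)
        else:
            stack.append(int(tok))   # literals push themselves
        print(f"{tok:>4} -> {stack}")  # the state a reader holds in their head
    return stack

run("2 3 dup * +")   # computes 2 + 3*3, leaving [11]
```

In C the same computation reads `2 + 3*3`, with the data flow visible in the source; here it only exists in the trace.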

In a team, the increased memory burden also requires more communication for shared understanding of code.

Many concatenative languages eventually grow support for local variables, since it's _incredibly_ awkward to implement certain algorithms when you have to do stack juggling to use a variable in a computation.



This is true, but it’s also important to realise that Forth is not the only direction. ;)

It’s been a while since I wrote this article, and there are a number of things it doesn’t address, such as the practical concerns you mention. It’s much the same issue as comes up when dealing with large assembly projects, or dynamically typed ones for that matter. Knowing what’s going on can be difficult to discern without a deep understanding of the overall system, and bugs have a tendency to hide in obscure places.

And speaking of assembly, Forth is essentially assembly code for a virtual stack machine. Or an actual stack machine.

I’ve been idly working on a statically typed concatenative language for a long time now, partly because I intend it to be taken seriously sometime in the next decade, but mostly as a learning experience. It’s designed to address many of these concerns—particularly, enabling programmers to focus on data flow by framing it as a functional language. (E.g.: don’t shuffle stacks—use combinators and/or locals.)

However, I continually run into problems with the type system. Typing of concatenative languages is not well researched and is surprisingly difficult. I’m not a type theorist by any means, but at this point I’m probably the world expert on concatenative type systems, simply because there are so few people working on the problem. If anyone would like to help out, I’d love to chat by email (username@gmail).


Have you looked at J. Paul Morrison's "Flow-Based Programming" book? I recommend it since it touches upon a lot of similar topics. As your article points out, the concatenative style can be visualized as a flow; in the Morrison book, flow is expanded upon with a more detailed, explicit connectivity system and an emphasis on visual language.

Regardless of syntax this is a very interesting language design space, since it seems to hold some of the most promise for lowering average software complexity.


Cat is concatenative and statically-typed. http://www.cat-language.com/


Sure is. Its type system just suffers from some limitations that I’d like to remedy in my own language.


Did you know Robin Popplestone designed a type system for Forth? It involved what I think he called 'regular types' (like regular expressions), IIRC, to handle (disciplined use of) words like ?DUP.


It's a very interesting article. I wonder if you've heard of Push (a stack language) and PushGP (an evolutionary automatic programming system for which Push is designed)?

http://faculty.hampshire.edu/lspector/push.html


You should look into Factor, a higher-level concatenative language: http://factorcode.org/

There is a package for lexical variables for those instances where stack shuffling is absolutely not the way to implement your algorithm. This page http://docs.factorcode.org/content/article-locals-examples.h... compares the stack-shuffling vs. lexical-variable approaches. Oh, and those lexical variables have essentially no computational overhead.
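For readers without Factor handy, the contrast can be mimicked in Python (a hedged sketch: the shuffle words mirror Forth's dup/swap/rot, and the discriminant example is mine, not taken from the linked page):

```python
# b*b - 4*a*c computed twice: once by pure stack shuffling
# (the concatenative style), once with named locals.

def dup(s):  s.append(s[-1])                     # ( x -- x x )
def swap(s): s[-1], s[-2] = s[-2], s[-1]         # ( x y -- y x )
def rot(s):  s.append(s.pop(-3))                 # ( x y z -- y z x )
def mul(s):  s.append(s.pop() * s.pop())         # ( x y -- x*y )
def sub(s):  y = s.pop(); s.append(s.pop() - y)  # ( x y -- x-y )

def disc_shuffled(s):                     # ( a b c -- b*b-4*a*c )
    rot(s); s.append(4); mul(s); mul(s)   # b c a -> b 4*a*c
    swap(s); dup(s); mul(s)               # -> 4*a*c b*b
    swap(s); sub(s)                       # -> b*b-4*a*c

def disc_locals(a, b, c):
    # with lexical variables the data flow is immediate
    return b * b - 4 * a * c

stack = [1, 5, 6]
disc_shuffled(stack)
print(stack)                 # [1]
print(disc_locals(1, 5, 6))  # 1
```

Same answer both ways; the shuffled version needs a running mental model of the stack at every step, the locals version reads off the page.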

I disagree that the tendency to factor a single function into many simpler ones hurts understandability. Done properly, you get fantastically readable functions, such as this gem from https://github.com/JohnEarnest/Mako/blob/master/games/Deep/D... :

    move-player fire waves bounce think draw-score spawn-crab storm
This snippet of a high-level function is self-describing: 1) check whether a movement key is pressed and move the player, 2) check whether the bomb key is pressed and move an existing bomb, 3) simulate the ocean waves, etc.

Factoring gives you a chance to mitigate complexities in stack shuffling and allows you to build a semantically meaningful representation of your problem. Concatenative languages lend themselves to creating DSLs; every solution you write should be a small DSL for that particular problem or subproblem.


This snippet reads to me as if it says:

(1) move the player unconditionally (2) do something involving fire, or perhaps involving firing, unconditionally (3) maybe the player waves? maybe there are waves? (4) ??? (5) ??? (6) update the score (7) create a new crab unconditionally (8) storm the fortress? there is a storm?

Nothing is self-describing absent context. (I have sufficient context for "draw-score" and "spawn-crab" that I don't think they mean, say, "check scores to see if there's a draw" and "create a crab of the 'spawn' type".)


"This snippet of a high level function is self describing. 1) Check to see if movement keyboard entry is pressed and move the player"

I disagree. I do not see anything there that can even remotely be explained to imply the 'check to see if movement keyboard entry is pressed' part. Something like 'handle-player-movement' would be better.

Similarly, 'fire' could fire a gun, simulate fire in the way 'waves' (apparently) simulates waves, etc.


> It's exacerbated by the tendency to factor a single function into many simpler ones.

That's an interesting comment. I often wonder about where the "sweet spot" of expression lies. If you break things up into too many small things, it's confusing. But a huge monolithic entity is also confusing. What's the "right level of decomposition?"


I think this is closely related to why 'naming things' is hard. Small functions do one thing and are easy to name; larger things do more than one thing and are harder to name. But very small functions, smaller than is useful, lead to harder names as well!

This does not directly answer your question, but I think a good sign that you're on the wrong track is when naming things gets harder: that's when you've decomposed too far.


Great answer, thanks. It makes me think about sudden domain shifts. "Fetch the list of users. Locate the user matching the description. Push the user object onto the stack." The last item shifts from the business domain into implementation details, and it probably shouldn't be "close" to the others.


> I think this is closely related to why 'naming things' is hard.

I find this especially problematic in exploratory and experiment-driven programming because naming things often gets in the way until code has stabilized somewhat.

For this reason, I've found working in Max/MSP extremely pleasant, because you don't need to name things unless it makes sense to. Yes, a lot of Max code you see online is a mess of lines, but the key is that if you do split related things into their own named components, you hit a sweet spot: you aren't required to name things that don't have natural names (or can at least defer doing so until you're done experimenting), while still maintaining good software practices, maintainability, and abstraction.


I've always thought of a hyper-abstracted codebase being like a binary tree, lazy code being like a linked list, and optimal code being like a B-tree tuned for the page size and seek latency of your particular brain.


>If you break things up into too many small things, it's confusing.

Not if you're disciplined about it.

>What's the "right level of decomposition?"

The key is not to decompose your whole problem into a network of tightly coupled parts. It's far better to build up your language with layers of abstraction until you have a new language which makes your problem trivial to express.


I think a decent IDE with some preprocessing of words might solve the Forth-family stack problem. A lot of Forth words are deterministic about how they use the stack (no looping). It would be interesting to have an IDE show a stack-usage diagram for each word. If it cannot determine the effect, it could take it from the stack-effect comment or display a '?'.
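The core of that IDE feature can be sketched in a few lines of Python (the effect table and function are hypothetical, not from any real tool): for straight-line definitions, a word's net stack effect follows mechanically from the effects of its parts, and anything the analyzer can't handle, like ?DUP, degrades to a '?'.

```python
# Infer a word's net stack effect ( consumed -- produced ) from the
# known effects of the words in its straight-line definition.

EFFECTS = {            # word: (items consumed, items produced)
    "dup":  (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "+":    (2, 1),
    "*":    (2, 1),
}

def infer_effect(definition):
    consumed = depth = 0
    for word in definition.split():
        if word.lstrip("-").isdigit():    # a literal pushes one item
            inn, out = 0, 1
        elif word in EFFECTS:
            inn, out = EFFECTS[word]
        else:
            return "?"                    # unknown word (or conditional
                                          # effect like ?DUP): give up
        depth -= inn
        consumed = max(consumed, -depth)  # deepest reach below the start
        depth += out
    return (consumed, consumed + depth)

print(infer_effect("dup * +"))   # ( a b -- a+b*b ), i.e. (2, 1)
print(infer_effect("?dup"))      # ?
```

Loops and conditionals with unbalanced branches are exactly where this breaks down, which is why falling back to the programmer's stack-effect comment seems like the right move.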

PostScript has support for local variables, and I have found it to be a nicer language than most Forths.

disclaimer: I can read Forth/PostScript easily, but for some reason Lisp always messes with me. I also don't see much of a difference between Forth words, API calls, and domain-specific languages. They are all of a piece.


>To understand what a line of Forth code does, you need to understand both what each function does, and how it manipulates the stack to accomplish it.

I don't understand this. I don't need to know how add works to know it pops two numbers off the stack and pushes their sum. Likewise for my own words: I don't need to know how they do it, just what they do.


I think the point was that you can't tell if `foo bar` is equivalent to `bar(foo())` or `bar(foo(value_from_stack))` or `bar(foo(val1, val2))` or `foo(); bar()` or what unless you know the stack effect of the words.
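That ambiguity is easy to demonstrate in Python (foo and bar are hypothetical words; the interpreter is a toy): the same source text `foo bar` produces different call shapes depending on arities that only the dictionary knows.

```python
# "foo bar" is textually identical in both cases below, but the
# call structure it denotes depends entirely on each word's arity.

def run(stack, program, dictionary):
    for word in program.split():
        fn = dictionary[word]
        arity = fn.__code__.co_argcount      # how many items fn consumes
        args = [stack.pop() for _ in range(arity)][::-1]
        result = fn(*args)
        if result is not None:
            stack.append(result)
    return stack

# Case 1: foo is a constant, bar is unary -> reads as bar(foo())
d1 = {"foo": lambda: 7, "bar": lambda x: x + 1}
print(run([], "foo bar", d1))        # [8]

# Case 2: foo is unary, bar is binary -> reads as bar(10, foo(3))
d2 = {"foo": lambda x: x * 2, "bar": lambda x, y: x - y}
print(run([10, 3], "foo bar", d2))   # [4]
```

Nothing in `foo bar` itself tells you which case you're in; that's the implicit-arity reading cost from the top comment.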



