The article itself seems to raise the issue in a new way... yet it reads like "I like hammers, and this is how I want to solve this problem with my favourite hammer".
Even if he defends the design decisions, I have a problem with them:
> also is unable to support nearly a million nested stack frames, then execution of this query will fail
Meanwhile, if you get to sets with a million elements, I'm pretty sure you do not want to test for membership by traversing a million elements in O(n) fashion. Additionally, instead of running in a single environment, each of the `contains()` calls is a new closure (killing the cache).
Then he wants to support many types of AdjonObject implementations in a single list, and to destroy the trace of execution on every call... good luck debugging this project if any of the functions fails semi-randomly.
Yet this is only one specific case where TCO is needed and expected. The set objects are not general enough to implement anything other than... sets (that's what IntSet's interface defines). So if TCO is that crucial in this case, let's go: a stack-free implementation in a language without TCO which doesn't violate the "don't inspect the elements" constraint. `contains(x)` is the main abstract function: it initialises a local `cf` with the local `__contains(x)`, then does `cf = cf(x)` until the result is a boolean and not a closure. Now every object can construct whatever chain of calls it needs, there is no relying on `cf` coming from an IntSet interface (or `.contains()` being available), and there is no call stack. `.adjoin()` can modify the local object's checking function in any way it needs. Still not really doable in Java, because it cannot return a "closure or boolean" type, but Python will do just fine (see the sketch below).
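To make that concrete, here's a minimal Python sketch of the driver-loop idea (my own code, not the article's; the class names `Empty` and `Adjoin` are just placeholders):

    class IntSet:
        def contains(self, x):
            # Driver loop: keep calling the current checking function
            # until it yields a boolean instead of another closure.
            cf = self._contains
            while not isinstance(cf, bool):
                cf = cf(x)
            return cf

    class Empty(IntSet):
        def _contains(self, x):
            return False

        def adjoin(self, y):
            return Adjoin(self, y)

    class Adjoin(IntSet):
        def __init__(self, rest, element):
            self.rest = rest
            self.element = element

        def _contains(self, x):
            # Either answer now, or hand the next link's checker back
            # to the driver loop; the call stack never grows.
            return True if x == self.element else self.rest._contains

        def adjoin(self, y):
            return Adjoin(self, y)

    s = Empty()
    for i in range(1000000):
        s = s.adjoin(i)
    print(s.contains(0))   # True; walks a million links with no stack growth

(The million-link walk above is of course exactly the O(n) scan complained about earlier, but it never overflows the stack.)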
Isn't your rebuttal obliquely preempted by "Trying to satisfy the typecase without using an else clause would then point up the need to handle all the other possible implementations of IntSet"? More explicitly, I think the commenter explains the advantage of TCO over your solution: "And trampoline provide you full PTC at the cost of a (local) encoding that must be followed by every single bit of software that you want to use PTC with. [...] Now why should the user have to be doing manually what the computer could automatically do a better job at?" [http://projectfortress.sun.com/Projects/Community/blog/Objec...]
Your use of dynamic dispatch is better for extensibility than the suggested typecase, but you're still mucking about with a trampoline that a compiler could write for you.
But why would you ever want to run an FSM which doesn't return the new state as a result? You cannot test a single transition, because it will just keep running. If you need callbacks, you have to pass them inside the machine. You cannot easily implement an external halting condition. FSMs usually do one step on demand (at least in all the network applications), or the whole thing at once in regexes and parsing. Those implementations are either trivial or can be handled with simple objects extending a State class.
An FSM is typically defined as `S * E -> S`, not `S * E * (your implementation details and callbacks) -> (result)`. Do you see any gains in the second formulation? What exactly are the existing problems that TCO would solve?
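For illustration, here is the step-on-demand style I mean, as a Python sketch (the states and events are made up, vaguely TCP-flavoured; this is just the `S * E -> S` shape, not anyone's real protocol code):

    def step(state, event):
        # Pure transition function: S * E -> S, nothing else.
        transitions = {
            ("closed", "open"): "listening",
            ("listening", "syn"): "syn_received",
            ("syn_received", "ack"): "established",
            ("established", "fin"): "closed",
        }
        return transitions.get((state, event), state)

    # A single transition is trivially testable...
    assert step("closed", "open") == "listening"

    # ...and the caller owns the loop, the callbacks and the halting condition.
    state = "closed"
    for event in ["open", "syn", "ack"]:
        state = step(state, event)
    print(state)   # established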
I am not nearly as knowledgeable as Guy Steele, but I'll attempt an answer.
(I will be using Haskell-like notation, because it's most natural for me at the moment. I hope currying does not confuse you.) I agree that in a lot of scenarios you can be happy with a step function
step :: State -> Input -> (Output, State)
and a loop that feeds inputs to the step function, then does something with the output (like displaying it) and repeats with the new state. Here I could argue that this loop is best described as a tail-recursive function, but I guess the loop is bearable. However, what if you have a more complex flow of control? Like a non-deterministic state machine: think of a parser. There, loops won't suffice and you will have to build your own control structures out of function calls. (Or emulate them with an explicit stack. But why re-implement language features?) Composability also benefits from tail calls.
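A rough Python rendering of the loop I mean, since the thread started in Python (all names are illustrative):

    def run(step, state, inputs):
        # Feed each input to the step function, do something with the
        # output, then repeat with the new state.
        for inp in inputs:
            output, state = step(state, inp)
            print(output)          # "does something with the output"
        return state

    def run_rec(step, state, inputs, i=0):
        # The same loop as a tail-recursive function; with proper tail
        # calls this form costs no stack space either.
        if i == len(inputs):
            return state
        output, state = step(state, inputs[i])
        print(output)
        return run_rec(step, state, inputs, i + 1)   # a tail call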
I do not know if the above paragraph contains an argument that's good enough. What do you think? I do know that parsing with combinators is pleasant to program and benefits (at least in its natural expression in FP) from proper tail calls.
I also toy with the idea of using the following definition instead of separating the (deterministic) stepping function from the state in the interface:
data StepState = State (Input -> (Output, StepState))
(It might be a good idea to then separate state and stepping function in most implementations. In Haskell I'd also, probably, use typeclasses. But that doesn't add to the point here.)
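A rough Python analogue of that `StepState` shape, with a made-up toggle switch as the machine (just a sketch): the "state" is the stepping function itself, which consumes one input and returns the output together with the next stepping function.

    def off(inp):
        return ("was off", on if inp == "flip" else off)

    def on(inp):
        return ("was on", off if inp == "flip" else on)

    state = off
    for inp in ["flip", "flip", "wiggle"]:
        output, state = state(inp)
        print(output)   # was off, was on, was off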
Interesting. Parser combinators might actually be a "proper" example that can't be transformed into either a loop or a single trampoline chain... I've never tried to implement one, but I can imagine some parts being easier when done directly with closures (try one path; if it fails, just drop it and try another one). On the other hand, which parser combinator needs more than 50 or so levels of depth...
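For what it's worth, the "try one path, drop it, try another" part can be sketched in a few lines of Python (a parser here is a closure from (text, pos) to (value, new_pos), or None on failure; all names invented):

    def char(c):
        def parse(text, pos):
            if pos < len(text) and text[pos] == c:
                return (c, pos + 1)
            return None   # failure
        return parse

    def alt(p, q):
        def parse(text, pos):
            # Backtracking: if p fails, retry q from the same position.
            return p(text, pos) or q(text, pos)
        return parse

    ab = alt(char("a"), char("b"))
    print(ab("banana", 0))   # ('b', 1)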
I don't think it matters. There's no value-add in LtU's blog post on this other than the equivalent of "hey guys, this is cool". Granted, that has a lot of weight coming from that blog, but it's still just a link to a link.
It depends on the language and the programming paradigm you're trying to approach, but I feel that TCO is important only in functional languages, and it doesn't matter much in OO.
Haskell is a language that absolutely depends on TCO: the functional paradigm lends itself to recursion, since the preferred style is that there be no mutation of state within functions. Recursion is evidently the better form for traversing data structures based on recursively defined types.
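For instance (a made-up Python sketch, with a node encoded as (value, left, right) or None): a recursively defined structure practically writes its own traversal, though note that this direct form is not a tail call; it needs the stack, or an accumulator/CPS rewrite.

    def tree_sum(node):
        if node is None:
            return 0
        value, left, right = node
        return value + tree_sum(left) + tree_sum(right)

    print(tree_sum((1, (2, None, None), (3, None, None))))   # 6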
Having said that, in practice very few OO languages use immutability. The very purpose of OO is to model entities as objects, and such objects provide interfaces for changing state, which is about as impure as it gets. Whether you feel it is better or worse than a functional approach, it is the modus operandi in OO languages, including Smalltalk, Ruby, Python, etc.
My own reality is that I rarely work with recursive data structures in production code (not the case at the library level, where trees and lists are the norm), hence I find very little use for TCO. TCO is also very kludgy on arbitrary data structures with lots of twists (undirected graphs, meshes, etc.). Worst of all, it is very easy to get TCO wrong and end up actually using the stack.
The nice thing about implementing a function iteratively is that it is easy to see where state is being altered at every step. This is irrelevant in a language like Haskell, whose wonderful syntax for pattern matching makes end cases and control flow very explicit, but that is decidedly very un-OO. Most pure OO languages, on the other hand, have ugly and expensive pattern matching (instanceof and friends are pretty much all you have). I'm not convinced that TCO works very well in practice in the current OO languages. And if you're doing something that alters state all the time, it's very easy to structure a function so that it can be tail-called without having to preserve state.
In short, Ruby, Python, and Smalltalk don't have much place for TCO. TCO is not idiomatic there, libraries are not made for it, and it is counterproductive to use TCO in languages that expect you to alter state all the time to model entities (unless you like recursing with 10-ary functions, which happens a lot in functional languages and is not a pretty sight). Modern iterator patterns are pretty and work well, so it makes sense for languages that prioritize iterators over recursion to get one of them right when the other hardly gets much use and can easily be done wrong. In my own experience (and it may not be yours) recursion is nice for 5-line functions, but try to get a 50-line function to do TCO properly and you're probably in a world of hurt.
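To illustrate the n-ary point with a made-up example: state that a loop simply mutates has to be threaded through parameters in the tail-recursive version (Python won't eliminate the call, but the shape is the same):

    # Iterative: state mutated in place, easy to follow step by step.
    def summarize(items):
        total, count, smallest = 0, 0, None
        for x in items:
            total += x
            count += 1
            smallest = x if smallest is None else min(smallest, x)
        return total, count, smallest

    # Tail-recursive: every piece of state becomes another parameter.
    def summarize_rec(items, i=0, total=0, count=0, smallest=None):
        if i == len(items):
            return total, count, smallest
        x = items[i]
        return summarize_rec(items, i + 1, total + x, count + 1,
                             x if smallest is None else min(smallest, x))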
I do have to admit that, generally speaking, once you get the TCO to work, more likely than not the function is correct, and modern compilers can optimize it very aggressively.
Edit (to my earlier comment above): that article really bothers me for some reason... probably because I expected better from LtU, so I implemented http://bitbucket.org/viraptor/ootr/src/tip/intset.py