There is actually a way to produce CPS without lots of administrative redexes, which uses the same main trick that "higher-order transform" utilizes: tracking the whether the expression being transformed is in tail context or general context, and translating accordingly — but without using meta-continuations because boy are those mind-boggling.
Instead, the normal loops are used, and the subresults are accumulated in some builder structure which is then finalized to produce a complete tree of a CPS expression. The main trick is to figure out just what this builder structure should be, and it turns out it's a tree zipper of sorts!
The margins of this comment are not big enough to sketch the whole thing but it's actually not as formidable as it sounds ("tree zipper", wooooh): it's mostly just a stack of tree-building instructions which can be easily extended/completed. An alternative would be to have a mutable CPS expression as the result, and keep appending things into it, but it requires lots of tree traversal and lot of care to not accidentally leave the thing in not quite constructed state which is way too easy in my experience.
I think I'll make and publish a gist with it when I get home because I don't believe I've ever seen it done this way anywhere.
Sounds like this is for a syntactic or metaprogramming transformation over source code? I assume this implies you are either working with a homoiconic language or you have already converted the source to AST?
However it sounds like a very portable technique, would be interested to see it.
Okay, there it is: [0]. It mostly implements a version of CPS transform from "Compiling with Continuations, Continued" by A. Kennedy [1], figure 8. Two main differences in the output language from the one in your post is that, first, this version binds everything to temporaries (including literals and continuations), and second, it uses "let" (with multiple definitions, so it's actually "let*") as the primary name-binding feature instead of immediate application. And I use name "$ret" instead of "$call-cont".
The main point of interest is the use of CpsExprBuilder as an accumulator and explicit looping over sub-expressions instead of using meta-continuations, as is popular in most papers on CPS transformations. This is not the only possible way to implement such CpsExprBuilder: another option would be to have a mutable CPS expression with a pointer into it, and gradually grow it. But since you can't just take a pointer to a list's element in Python, you would probably have to keep a whole path (as, say, an array of indexes), and re-traverse the tree every time you append to it, like this:
result = {
'expr': ['let', [['#t0', 2], ['@k0', ['cont', ['#t1'], None]]], ['f', '#t0', '@k0']],
'path': [1, 1, 1, 2],
}
cursor = result['expr']
for index in result['path'][:-1]:
cursor = cursor[index]
cursor[result['path'][-1]] = ['g', '@halt', '#t1']
I've tried this approach, and even when it's encapsulated in a class, it's way too easy to accidentally end up with a random None somewhere deep inside your resulting expression. So instead, the expression's tree is turned inside out, so to speak. It's similar to using an explicit stack instead of using recursion.
Just following the paper's convention (their justification being that now they only need to substitute variables for variables which is simpler), plus this gist is a trimmed down version of my hobby project which had full-blown algebraic data types: the integers were handled by that case as well. You absolutely can merge "int" case with the "var" case above, and use ints as themselves:
if isinstance(expr, str) or isinstance(expr, int):
result = CpsExprBuilder()
return maybe_tail(result, expr)
The second argument in maybe_tail would need a better name though...
> Why not support lambda expressions?
They were missing in your write-up as well :) I just went by your text, and look at the code snippets in it. But adding them looks like this:
if what == 'lambda':
args, body = rest
result = CpsExprBuilder()
tmp_name, ret = gen_tmp(), gen_cont()
fun = ['fun', [*args, ret], cps(body, ret)]
result.add_let_without_body([[tmp_name, fun]])
return maybe_tail(result, tmp_name)
Some test cases I've tried look like this then:
['lambda', ['x'], 'x'] =>
let
#t0 = fun(x, @k1):
$ret @k1(x)
in $ret @halt(#t0)
[['lambda', ['x'], 'x'], 2] =>
let
#t0 = fun(x, @k1):
$ret @k1(x)
#t2 = 2
in #t0(#t2, @halt)
['let', ['f', ['lambda', ['x'], 'x']], ['f', 2]] =>
let
@k0 = cont(f):
let
#t3 = 2
in f(#t3, @halt)
#t1 = fun(x, @k2):
$ret @k2(x)
in $ret @k0(#t1)
By the way, if you don't like creating contiunations just to immediately invoke them, you can adjust make_LET adjusted to flatten that pattern as well: check if the body is '$ret' with the continuation name that's in the defns list, replace body with this continuation's body (with variables substituted!) and throw its definition away.
[['lambda', ['x'], 'x'], 2] =>
let
#t0 = fun(x, @k1):
$ret @k1(x)
#t2 = 2
in #t0(#t2, @halt)
['let', ['f', ['lambda', ['x'], 'x']], ['f', 2]] =>
let
#t1 = fun(x, @k2):
$ret @k2(x)
#t3 = 2
in #t1(#t3, @halt)
But this also throws the programmer-supplied names away during the substitution, sadly. Preferring to keep programmer-supplied names is possible, of course, but it requires both tracking them and making sure they don't collide with temporaries or other programmer-supplied names so... this simplification step is better performed in the optimizer after the CPS-conversion.
Instead, the normal loops are used, and the subresults are accumulated in some builder structure which is then finalized to produce a complete tree of a CPS expression. The main trick is to figure out just what this builder structure should be, and it turns out it's a tree zipper of sorts!
The margins of this comment are not big enough to sketch the whole thing but it's actually not as formidable as it sounds ("tree zipper", wooooh): it's mostly just a stack of tree-building instructions which can be easily extended/completed. An alternative would be to have a mutable CPS expression as the result, and keep appending things into it, but it requires lots of tree traversal and lot of care to not accidentally leave the thing in not quite constructed state which is way too easy in my experience.
I think I'll make and publish a gist with it when I get home because I don't believe I've ever seen it done this way anywhere.