Compiling an expression to a tree of closures, and a list of statements to a slice of closures, is exactly how I optimized [gomacro](https://github.com/cosmos72/gomacro), my Go interpreter written in Go.
There are more tricks available there, for example unrolling the loop that calls the list of closures, and having a `nop` closure that is executed when there is nothing left to run but execution has not yet reached the end of the unrolled loop.
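A minimal sketch of the unrolled loop with a `nop` sentinel, to make the trick concrete. This is not gomacro's actual code: `Env`, `Stmt`, `Compile`, `Run` and `AddConst` are hypothetical names, and the unroll factor of 4 is an arbitrary assumption.

```go
package main

import "fmt"

// Env is a hypothetical interpreter state (illustrative, not gomacro's).
type Env struct {
	Acc  int  // a single accumulator register
	Done bool // set once the program has finished
}

// Stmt executes one statement and returns the index of the next one.
type Stmt func(env *Env) int

// Compile appends two sentinels to the body: a halt statement that sets
// Done, and a nop that keeps returning its own index.
func Compile(body []Stmt) []Stmt {
	code := append([]Stmt{}, body...)
	nopIdx := len(code) + 1
	code = append(code,
		func(env *Env) int { env.Done = true; return nopIdx }, // halt
		func(env *Env) int { return nopIdx },                  // nop
	)
	return code
}

// Run calls four statements per iteration and tests Done only once per
// iteration. After halt, the remaining slots of the iteration run nop.
func Run(code []Stmt, env *Env) {
	ip := 0
	for !env.Done {
		ip = code[ip](env)
		ip = code[ip](env)
		ip = code[ip](env)
		ip = code[ip](env)
	}
}

// AddConst returns a statement that adds n to the accumulator and then
// continues at index next.
func AddConst(n, next int) Stmt {
	return func(env *Env) int { env.Acc += n; return next }
}

func main() {
	code := Compile([]Stmt{AddConst(1, 1), AddConst(2, 2), AddConst(3, 3)})
	env := &Env{}
	Run(code, env)
	fmt.Println(env.Acc)
}
```

The point of the sentinel is that the loop body never needs a bounds or termination check between calls: running past the end is harmless because it only ever lands on `nop`.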
For optimal speed, you should move as much code as possible outside the closures.
In particular, you should do the `switch op` at https://github.com/skx/simple-vm/blob/b3917aef0bd6c4178eed0c...
outside the closure, and create a different, specialised closure for each case. Otherwise the "fast interpreter" may be almost as slow as a vanilla AST walker.
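To illustrate the difference, here is a rough sketch with a made-up three-operation instruction set. The `Op` constants, `Expr`, `CompileSlow` and `CompileFast` are assumptions for illustration and do not mirror the simple-vm code linked above.

```go
package main

import "fmt"

// Op is a hypothetical opcode type.
type Op int

const (
	Add Op = iota
	Sub
	Mul
)

// Expr is a compiled binary operation over a register file.
type Expr func(regs []int) int

// CompileSlow keeps the switch inside the closure, so the dispatch is
// repeated on every single call.
func CompileSlow(op Op, a, b int) Expr {
	return func(regs []int) int {
		switch op {
		case Add:
			return regs[a] + regs[b]
		case Sub:
			return regs[a] - regs[b]
		case Mul:
			return regs[a] * regs[b]
		}
		panic(fmt.Sprintf("unknown op %d", op))
	}
}

// CompileFast runs the switch once, at "compile" time, and returns a
// closure specialised for that one operation.
func CompileFast(op Op, a, b int) Expr {
	switch op {
	case Add:
		return func(regs []int) int { return regs[a] + regs[b] }
	case Sub:
		return func(regs []int) int { return regs[a] - regs[b] }
	case Mul:
		return func(regs []int) int { return regs[a] * regs[b] }
	}
	panic(fmt.Sprintf("unknown op %d", op))
}

func main() {
	regs := []int{6, 7}
	fmt.Println(CompileSlow(Mul, 0, 1)(regs), CompileFast(Mul, 0, 1)(regs))
}
```

Both versions compute the same results; the second one simply pays for the dispatch once instead of on every execution.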
The core idea is simple: do a type analysis on each expression you want to "compile" to a closure, and instantiate the correct closure for each type combination.
Here is a pseudocode example, adapted from gomacro sources:
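What follows is a simplified sketch of that idea rather than the actual gomacro sources: `Env`, `Expr`, `ConstInt`, `IntVar`, `ConstFloat` and `Add` are hypothetical names, and only three operand types are handled.

```go
package main

import "fmt"

// Env is a hypothetical runtime environment holding int variables.
type Env struct {
	Ints []int
}

// Expr wraps a typed closure. Fun is assumed to be one of:
// func(*Env) int, func(*Env) float64, func(*Env) string.
type Expr struct {
	Fun any
}

// ConstInt compiles an int constant.
func ConstInt(v int) Expr { return Expr{Fun: func(*Env) int { return v }} }

// IntVar compiles a read of the int variable at index i.
func IntVar(i int) Expr { return Expr{Fun: func(env *Env) int { return env.Ints[i] }} }

// ConstFloat compiles a float64 constant.
func ConstFloat(v float64) Expr { return Expr{Fun: func(*Env) float64 { return v }} }

// Add inspects the operand types once, at compile time, and returns a
// closure specialised for that type combination. The returned closure
// contains no type checks at all.
func Add(x, y Expr) (Expr, error) {
	switch xf := x.Fun.(type) {
	case func(*Env) int:
		if yf, ok := y.Fun.(func(*Env) int); ok {
			return Expr{Fun: func(env *Env) int { return xf(env) + yf(env) }}, nil
		}
	case func(*Env) float64:
		if yf, ok := y.Fun.(func(*Env) float64); ok {
			return Expr{Fun: func(env *Env) float64 { return xf(env) + yf(env) }}, nil
		}
	case func(*Env) string:
		if yf, ok := y.Fun.(func(*Env) string); ok {
			return Expr{Fun: func(env *Env) string { return xf(env) + yf(env) }}, nil
		}
	}
	return Expr{}, fmt.Errorf("cannot add %T and %T", x.Fun, y.Fun)
}

func main() {
	e, err := Add(IntVar(0), ConstInt(2))
	if err != nil {
		panic(err)
	}
	f := e.Fun.(func(*Env) int) // the result type is known statically
	fmt.Println(f(&Env{Ints: []int{40}}))
}
```

Type errors such as adding an int to a float64 are reported while compiling, so the closures that actually run never need to check types.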
This works best for "compiling" statically typed languages, and while much faster than an AST interpreter, the "tree of closures" above is still ~10 times slower than natively compiled code. It is usually also slower than JIT-compiled code.