Looks interesting. I'm curious how much of an issue GC latency is for you and how you're handling it. I tried something vaguely similar, and it caused me a few problems with larger compositions.
Thanks. I ran a rudimentary benchmark with a filter bank implemented in C++ versus LuaJIT/protoplug.
With VC++'s default compiler settings, the performance was exactly the same. Enabling SSE and using floats instead of doubles is what made the difference: it makes the C++ version 1.5 to 2 times faster.
So yeah, a float version of LuaJIT would even things out completely, but I don't think that's coming anytime soon.
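For anyone who wants to try a similar comparison, here is a minimal sketch of how the float-versus-double side could be set up in C++. This is not the benchmark code described above: the `FilterBank` class, the one-pole low-pass filter, and the coefficients are all assumptions made for illustration. Templating on the sample type lets the same code be built once with `float` and once with `double`, so the only variables left are the type and the compiler's SSE setting (for example `/arch:SSE2` in a 32-bit VC++ build).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical filter bank (not the one from the benchmark above):
// a set of independent one-pole low-pass filters,
//   y[n] = y[n-1] + a * (x[n] - y[n-1]),
// with one coefficient `a` per band.
template <typename Sample>
class FilterBank {
public:
    explicit FilterBank(const std::vector<Sample>& coeffs)
        : coeffs_(coeffs), state_(coeffs.size(), Sample(0)) {}

    // Feed one input sample to every band and return the sum of the band outputs.
    Sample process(Sample x) {
        Sample sum = Sample(0);
        for (std::size_t b = 0; b < coeffs_.size(); ++b) {
            state_[b] += coeffs_[b] * (x - state_[b]);
            sum += state_[b];
        }
        return sum;
    }

    // Process a whole buffer; `out` must already have the same size as `in`.
    void processBlock(const std::vector<Sample>& in, std::vector<Sample>& out) {
        assert(in.size() == out.size());
        for (std::size_t i = 0; i < in.size(); ++i) {
            out[i] = process(in[i]);
        }
    }

private:
    std::vector<Sample> coeffs_;  // per-band smoothing coefficient
    std::vector<Sample> state_;   // per-band previous output y[n-1]
};
```

Instantiating `FilterBank<float>` and `FilterBank<double>` with the same coefficients and timing `processBlock` over a long buffer would give the two C++ numbers to compare. LuaJIT numbers are doubles, so the `double` instantiation is the like-for-like baseline.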
[0] http://terralang.org/