It's faster to hand-generate machine code straight from an interpreter than to invoke a C compiler. But that is not the only issue. As with everything else, this is a trade-off, and I'm eager to see how it works out. I can see some positive reasons to do this:
1. The Ruby developers get highly-optimized machine code, with relatively little effort on their part. Many, many man-years have been spent to make C compilers generate highly optimal code.
2. The C language, as an interface, is extremely stable, so once it works it should just keep working. Compare that to the constantly-changing interfaces of many alternatives.
3. Debugging is WAY easier. If there's a problem in generated code, it's way easier to read intermediate C code (especially after going through a pretty-printer) than many other kinds of intermediate formats, and millions of people already know it.
In short, this approach means that they can very rapidly produce a system that can run tight loops very quickly, one that resists interface instability (so the approach should keep working), and one that's easy to debug (so it should be reliable). For many applications, the fact that it takes a little more time to do the compilation may be unimportant, especially since that work is embarrassingly parallelizable.
I'm very interested in seeing how this plays out. If this works well for Ruby, I suspect some other language implementations will start considering using this approach. I'm sure it's not the best approach in all circumstances, but it might work very well for Ruby - and maybe for some other languages like it.
> The Ruby developers get highly-optimized machine code, with relatively little effort on their part. Many, many man-years have been spent to make C compilers generate highly optimal code.
Not for machine generated code. C compilers work well on human generated code, and not as well on Ruby -> C "translations".
> Not for machine generated code. C compilers work well on human generated code, and not as well on Ruby -> C "translations".
That depends on the machine generated code. C compilers are optimized for whatever the C compiler authors perceive as a common construct. If the generated C code uses constructs similar to what humans do, it's often quite good. If not, you can change the code that generates C, or in some cases you can convince the C compiler authors to optimize that situation as well.
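To make the distinction concrete, here is a rough sketch of the same computation written two ways. Everything in it is made up for illustration: `sum_human`, `sum_generated`, and the `vm_stack` parameter are hypothetical names, and the second function is only my guess at what a naive bytecode-to-C translator might emit, not the output of any real Ruby implementation.

```c
#include <stddef.h>

/* Human-style: a plain counted loop, the kind of construct
 * C compiler authors see constantly and tune for. */
long sum_human(const long *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Hypothetical generator-style output (an assumption, for illustration):
 * the same sum, but every operand is pushed to and popped from an
 * explicit VM operand stack in memory, and control flow is labels and
 * gotos. vm_stack must have room for at least two entries. */
long sum_generated(long *vm_stack, const long *v, size_t n) {
    size_t sp = 0;      /* operand stack pointer */
    size_t i = 0;
    long total = 0;
L_head:
    if (!(i < n)) goto L_exit;
    vm_stack[sp++] = total;                      /* push total */
    vm_stack[sp++] = v[i];                       /* push v[i]  */
    sp -= 2;                                     /* pop both operands */
    total = vm_stack[sp] + vm_stack[sp + 1];     /* add */
    i = i + 1;
    goto L_head;
L_exit:
    return total;
}
```

Both functions return the same result. Because `vm_stack` is a pointer the compiler cannot assume is private, the stores in the second version may be harder to optimize away than the register-friendly loop in the first. How much that costs with a given compiler and flags is something to measure, not assume. If it does cost something, the fix is usually on the generator side: emit locals instead of stack slots where possible.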