I was reminded of something Dan Weinreb recently said:
[A]t the time Common Lisp was designed, it was assumed that many common use cases and patterns could and would be optimized by a "sufficiently clever compiler". [...] Unfortunately, nobody demonstrated such a compiler at the time, and it turned out to be far harder to do than the designers had anticipated.
http://ilc2009.scheming.org/node/7
I think the SSC is nigh-on inevitable, and its absence so far is more a function of the fact that, for quite a while, people were dicking around with new languages and implementations. I think you've got a few emerging cases, like GHC, OCaml, and maybe the Java bytecode optimizer, where the collection of optimizations is converging to create a true SSC.
A significant portion of OCaml's performance happens because it infers types, including a big-picture "understanding" of operations with higher-order functions, and all types are hard-wired at compile time. The downside is that until the compiler can infer a consistent set of types for your program, it refuses to compile it at all. Once it does, though, you've already jumped to something like the 90th percentile of C performance (and eliminated several categories of bugs), and once you're familiar with the idioms of its type system, writing the actual code is pretty easy. It's a very concise language. (I didn't really get the type system until I read _The Little MLer_, though. Same series as _The Little Schemer_.)
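A minimal sketch of what that inference buys you (my_map and the bindings here are just illustrative names):

    (* With no annotations, the compiler infers the fully polymorphic type
       ('a -> 'b) -> 'a list -> 'b list for this higher-order function. *)
    let rec my_map f = function
      | [] -> []
      | x :: rest -> f x :: my_map f rest

    let doubled = my_map (fun x -> x * 2) [1; 2; 3]   (* [2; 4; 6] *)

    (* This is the refuses-to-compile part: the line below, if uncommented,
       is rejected at compile time, because a string list can never flow
       into an int -> int function.
       let bad = my_map (fun x -> x + 1) ["a"; "b"]
    *)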
Also, if you're thinking of his recent article comparing Markdown-style text processors, that's a problem domain for which OCaml is particularly well-suited.
OCaml's optimizations aren't really an SSC effect, though: the compiler is actually designed to communicate what it understands, and thus what optimizations it's able to make. The real danger with an SSC is that it does deep optimization magic in a way that makes you lose sight of what actually made the optimization possible, so making informed design decisions becomes harder in the long run.
This is almost assuredly out of my league, but shouldn't it be fundamentally simple to have the compiler explain the optimizations it made to the code, and why, regardless of how deep the magic goes?
I mean, I suppose the onus then lies on the programmer to understand why or how the optimizations work, but (and again, I'm going out of my league here) don't compilers fundamentally just perform a series of transformations once they've gotten the AST ready? Is it really that hard to have them, as an artifact of the compilation process, produce a plan showing the source, the generated byte code or intermediate representation, and the steps taken in getting from one to the other?
Not that any of this makes the SSC easier to build. But maybe some of the concern in relying on one could be relieved.
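As a rough sketch of the kind of artifact I'm picturing (demo.ml is a made-up file name; -dlambda and -dcmm are existing ocamlopt flags that dump two of its intermediate representations, though they don't explain the why):

    (* demo.ml -- compiling with
         ocamlopt -dlambda -dcmm demo.ml
       prints the lambda IR and then the Cmm-level code for this function,
       which is at least the raw material for a step-by-step plan from
       source to generated code. *)
    let sum_of_squares xs =
      List.fold_left (fun acc x -> acc + x * x) 0 xs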
That would be helpful, but the problem is not so much that you don't know when an optimisation will happen as that it's often a fragile condition. Seemingly trivial changes can wreak havoc on your performance. When using GHC I often end up with code that I just can't change without turning a 1MB process into a swap-eating monster.
The best thing about Haskell is that it encourages really clear, high-level code. Having to tiptoe around hard-to-predict performance bombs is definitely against the spirit of declarative code.
According to Jane Street's article (http://www.haskell.org/sitewiki/images/0/03/TMR-Issue7.pdf, page 23), the OCaml compiler doesn't really do much optimisation at all. The performance comes down to data representation, GC, and native code generation.
This has the nice side effect of making it really easy to reason about the performance of OCaml code, at the cost of wasted optimisation opportunities.
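A small sketch of what that trade looks like in practice, assuming the mainline compiler of this era (slow and fast are just illustrative names): composed maps stay two traversals, and fusing them is something you do by hand rather than something the compiler does behind your back.

    (* Compiled essentially as written: two traversals and one intermediate
       list.  The compiler will not fuse the maps for you. *)
    let slow xs = List.map (fun x -> x + 1) (List.map (fun x -> x * 2) xs)

    (* The single-pass version is something you write yourself, which is
       exactly why the cost model stays easy to predict. *)
    let fast xs = List.map (fun x -> x * 2 + 1) xs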
Equality Saturation is a really, really neat idea. That paper is full of gold, especially when they use a modified Rete algorithm to generate the different equivalent forms. Do you know of any mainstream compilers that are considering implementing this approach?
I'm reminded of VMware's x86-to-x86 binary translator. In some degenerate sense, this thing is a "compiler" whose input is supervisor-level x86 code and whose output is sandboxed x86 code. You might imagine a layer like this performing all kinds of fascinating transformations on the input binary, à la the JVM or Transmeta. VMware's BT, however, is more concerned with preserving the presumed-good properties of the input binary than with introducing new goodness of its own. This lack of ambition also had the nice side effect of making performance relatively easy to reason about: there were various things the guest could do to incur virtualization-induced overhead, but those overheads were close to constant per occurrence, so you could just multiply frequency by cost to get a swag at a workload's overhead.
This actually reminds me of the part of "In the Beginning Was the Command Line..." where Stephenson points out that more powerful tools are more difficult to use but also give you more control. Using a less smart compiler is harder, but it gives you more control, and is less likely to blow up on you.