> Meanwhile, writing a (non-trivial, i.e. something that does more than MOV, JMP, CALL, CMP, ..) backend and optimizer requires a few decades and an investment of a few million dollars.
I think that's massively over-estimating it, based on practical experience of such compilers being built.
> most languages nowadays either emit C (which I find is a very effective way to learn code generation, by the way) or interface with LLVM
I think this is also not grounded in reality. I rarely see a language emitting C. LLVM is popular but not as universal as you're making out.
> I think that's massively over-estimating it, based on practical experience of such compilers being built.
Who still writes backends from scratch nowadays (i.e. after ~2010)? The only real examples from recent times I'm aware of are Go and Cranelift: the first is backed by Google and still (arguably) generates worse code than GCC and LLVM, the second is still very experimental.
> LLVM is popular but not as universal as you're making out
If you exclude JITs (such as the CLR, JVM and JS engines, all of which have massive corporate backing), almost 100% of _all_ languages that came out in the last decade use LLVM for code generation, even those coming from big players:
- Swift (LLVM)
- Rust (LLVM)
- Zig (LLVM)
- Julia (LLVM)
- Pony (LLVM)
- Kotlin Native (LLVM)
The only other recent languages that come to mind are Nim and Vala, and they both generate C. Outside of these, no new language has rolled out its own backend unless there was a huge (Google-class) corporation behind it.
The only compilers I know which have their own backends and are still relevant (i.e. they are used in the real world) are:
- FreePascal
- GHC
- OCaml
- DMD (now largely supplanted by GDC and LDC)
These were all started in the '90s or earlier, and thus already had their own backends when LLVM became popular. Everything else is a JIT, is backed by a very big company (such as Microsoft), or uses GCC.
Why would anyone waste time writing a compiler backend nowadays? It took LLVM almost 20 years to reach performance parity with GCC, and it is still sometimes problematic because it doesn't support as many architectures as GCC.
> Who still writes backends from scratch nowadays (i.e. after ~2010)?
Didn't V8 just get a new custom backend, like, literally last week? And wasn't B3 a completely new backend? And what about Graal? Much of its development, and several of its backends, came after 2010. Not everyone is using LLVM like you think they are.
> Why would anyone waste time writing a compiler backend nowadays?
Because LLVM isn't great at everything! It's not designed for dynamic compilation, and while it can be used for that, you have to put a pretty powerful IR in front of it.
V8 has Google and Microsoft behind it. If there is something that's definitely fuelled by big bucks, it's that.
If you read my post above, I explicitly excluded JIT backends from my argument - I was mainly talking about "classic" compilers, which in general don't have a stringent need to also compile fast.
But anyway... why exclude JITs? If you want to know about how compilers are written in practice, why discount JITs?
And fundamentally, the answer to the question 'who still writes backends from scratch nowadays' is: quite a lot of people, once you stop excluding everyone who does.