Hacker News

Precisely made for "creating a new language, interpreter, and compiler from scratch": http://craftinginterpreters.com



This does not cover compilers. I know because I specifically asked munificent (the author) about this before and he mentioned some other books to look at.


He's being modest :) Sure, the book doesn't cover many traditional compiler topics such as register allocation and code optimization. However, it does cover parsing and translating to an intermediate representation, which is 90% of what you need to get your own programming language off the ground. And to be honest, many university compiler courses won't have time to go much further than that anyway.


What do you mean exactly by "compilers"? Because the book features a chapter on how to compile source code to virtual machine byte code.
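For anyone who hasn't read it: the book's second half compiles source code to bytecode for a stack VM and then executes it. Here's a toy illustration of that compile-then-run pipeline; the AST encoding and opcode names are invented for this sketch and don't match the book's actual clox design:

```python
# Compile an expression AST to bytecode for a tiny stack VM, then run it.
# Opcodes are (name, operand) pairs; PUSH is the only one with an operand.

PUSH, ADD, MUL = "PUSH", "ADD", "MUL"

def compile_expr(node):
    kind = node[0]
    if kind == "num":                        # ("num", 7)
        return [(PUSH, node[1])]
    lhs, rhs = compile_expr(node[1]), compile_expr(node[2])
    op = {"add": ADD, "mul": MUL}[kind]      # ("add", lhs, rhs)
    return lhs + rhs + [(op, None)]

def run(code):
    stack = []
    for op, arg in code:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# (2 + 3) * 4
code = compile_expr(("mul", ("add", ("num", 2), ("num", 3)), ("num", 4)))
print(run(code))  # 20
```

The compiler is a post-order walk of the tree, which is why a stack VM is such a natural first target: operand order falls out for free.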


Most of the time when people talk about compiled languages/compilers, they mean languages that compile to native machine code, not languages that compile to a bytecode/run on a virtual machine.


Compiling to machine code rather than VM bytecode is a difference in degree rather than kind, and doing it well rather than naively is a deep topic.

Simple machine code generation, with basic register allocation and an evaluation strategy that eagerly spills to the stack, is not super hard. Producing fast code, on the other hand, is a bottomless well.
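A minimal sketch of that eager-spill strategy: each subexpression's result is pushed onto the machine stack before the next one is evaluated, so two fixed registers suffice and no real register allocation happens. The AST encoding and the x86-64-flavored text output are invented for illustration:

```python
# Emit x86-64-style text assembly that leaves the node's value in rax.
# Left operands are eagerly spilled with push/pop rather than kept in
# allocated registers -- slow, but trivially correct.

def emit_expr(node, out):
    kind = node[0]
    if kind == "num":                        # ("num", 42)
        out.append(f"  mov rax, {node[1]}")
    elif kind in ("add", "sub", "mul"):      # ("add", lhs, rhs)
        emit_expr(node[1], out)
        out.append("  push rax")             # eagerly spill left operand
        emit_expr(node[2], out)
        out.append("  mov rbx, rax")
        out.append("  pop rax")              # reload the spilled operand
        op = {"add": "add", "sub": "sub", "mul": "imul"}[kind]
        out.append(f"  {op} rax, rbx")
    else:
        raise ValueError(f"unknown node {kind!r}")

# (1 + 2) * 3
lines = []
emit_expr(("mul", ("add", ("num", 1), ("num", 2)), ("num", 3)), lines)
print("\n".join(lines))
```

Every "producing fast code" technique (keeping values in registers across expressions, instruction selection, scheduling) is an improvement over exactly this baseline.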


Is that true? I think of C# and Java as compiled languages and they run on the CLR/JVM which are both virtual machines.


It's just the nomenclature I'm used to: compiled languages are so called because there is no VM, just machine code being run, while C# and Java still require the CLR/JVM to do JIT compilation at runtime. But it's mostly splitting hairs; there is no agreed-upon definition of what makes a language compiled or interpreted.


Both have JITs. So someone wrote a compiler from CIL/JVM bytecode into native code.


There's literally no difference in the theory or technique of compiling for a virtual machine or a real one. Real machines are just messier and have a bunch of edge cases and weirdness that require you to read those multi-kilopage manuals to figure out. Having one or more intermediate steps is useful for portability and optimization though, so it's probably a good idea to target a virtual machine unless you only intend to ever target one specific architecture that you know inside and out.


Plot twist - the developer then runs the code in qemu for rapid iteration and ease of debugging... ;)


Yes, but it's still the most beginner-friendly book on implementing programming languages that I've ever seen. It covers VMs in the second part, and going from that to asm is not a huge leap.


Compiling to machine code is not that hard once you get past the IR or bytecode phase; check out this toy compiler:

https://github.com/byo-books/pretty_laughable_lang/


Is that really true? I would expect going from IR to native code to be the hardest part, because you have to worry about the architecture and supporting different ISAs. IR can be simple because it is abstract and hides lots of the dirty details of the hardware.


Generating code is not all that hard. Generating fast code is a challenge. Decoding instructions on some architectures is hard. If you have a nice general IR, naively generating native code can be just a table lookup with a register tracking/allocation routine.
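A rough sketch of that "table lookup plus register tracking" idea: each IR op maps straight to a mnemonic through a table, and a tiny allocator hands out registers from a pool. The IR tuples, opcode table, and register names are all invented for this example, and spilling is omitted entirely:

```python
# Naive instruction selection over a three-address IR.
# Assumes each temporary is used exactly once (SSA-ish single use),
# so its register can be returned to the pool right after that use.

OP_TABLE = {"add": "add", "sub": "sub", "mul": "imul"}  # IR op -> mnemonic
FREE_REGS = ["r8", "r9", "r10", "r11"]

def select(ir):
    """ir: list of ("const", dest, imm) or (op, dest, src1, src2)."""
    regs = list(FREE_REGS)
    loc = {}                                  # IR temp -> hardware register
    out = []
    for inst in ir:
        if inst[0] == "const":
            _, dest, imm = inst
            loc[dest] = regs.pop(0)
            out.append(f"  mov {loc[dest]}, {imm}")
        else:
            op, dest, a, b = inst
            loc[dest] = regs.pop(0)
            out.append(f"  mov {loc[dest]}, {loc[a]}")
            out.append(f"  {OP_TABLE[op]} {loc[dest]}, {loc[b]}")
            regs.extend([loc.pop(a), loc.pop(b)])  # temps are dead: free them
    return out

# t1 = 6; t2 = 7; t3 = t1 * t2
ir = [("const", "t1", 6), ("const", "t2", 7), ("mul", "t3", "t1", "t2")]
print("\n".join(select(ir)))
```

Real backends replace the single-use assumption with liveness analysis and add spill code when the pool runs dry, but the shape of the loop stays the same.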



