> The parser being stand-alone means it is much simpler to understand and unitte...

Dylan16807 · 2025-01-12T07:47:28 1736668048

> A gigantic advantage: a single-pass-compilable language is simpler. By definition.

That's only "by definition" if you take a language that needs multiple passes, then remove the features that need multiple passes, and don't replace them with anything else to compensate.

The "by definition simpler" version of C would not only disallow forward references, it would have no forward declarations either. As-is, forward declarations add some complexity of their own.

(Also, if you can figure out a way to emit jump instructions in a single pass, you can probably figure out a way to call unknown functions in a single pass.)

WalterBright · 2025-01-12T18:36:18 1736706978

Jump instructions are handled in a single pass by creating a patch list and, when compilation is done, walking the patch list and "fixing them up".
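To make the patch-list idea concrete, here is a minimal sketch. It is not taken from any real compiler: the instruction names (`jmp`, `push`, `halt`), the `(op, arg)` input format, and the `assemble` helper are all made up for illustration.

```python
def assemble(source):
    """Single pass over (op, arg) pairs; forward jumps are fixed up from a patch list."""
    code = []      # emitted instructions, each a mutable [opcode, operand] pair
    labels = {}    # label name -> index of the instruction it points at
    patches = []   # (index into code, label name) for jumps whose target is not known yet

    for op, arg in source:
        if op == "label":
            labels[arg] = len(code)           # labels emit nothing, they only record a position
        elif op == "jmp":
            if arg in labels:
                code.append(["jmp", labels[arg]])   # backward jump: target already known
            else:
                patches.append((len(code), arg))    # forward jump: remember where to patch
                code.append(["jmp", None])          # placeholder operand
        else:
            code.append([op, arg])

    # Compilation is done: walk the patch list and fix up the placeholders.
    for index, name in patches:
        if name not in labels:
            raise ValueError(f"undefined label: {name}")
        code[index][1] = labels[name]
    return code


program = [
    ("jmp", "end"),      # forward reference, goes on the patch list
    ("push", 1),
    ("label", "loop"),
    ("push", 2),
    ("jmp", "loop"),     # backward reference, resolved immediately
    ("label", "end"),
    ("halt", None),
]
result = assemble(program)
```

The input is read exactly once; the only extra state is the label table and the list of unresolved jumps.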

Doing this with functions is a lot more difficult, because one cannot anticipate the argument types and return types, which influence the code generation downstream. Of course, early C would just assume such forward references had integer arguments and integer return types, but that has long since fallen by the wayside.

EuAndreh · 2025-01-12T18:31:52 1736706712

I have the impression you're mixing single-pass compilation and O(1) memory use of the compiler.

As is, C already is single-pass compilable, modulo some unnecessary syntax ambiguities.

As the compiler reads the text, it marks character strings as tokens, groups those tokens into fragments of code, and turns some of those fragments into machine code. A simple 100-line function doesn't need to be parsed to the end before the compiler starts emitting machine code.

Like the parser, this requires memory to keep track of information and doesn't work for all kinds of constructs, such as a jump to a label defined later in a function. The code emitter buffers input until emission is possible, for example once the label is known and can be jumped to.
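As a rough illustration of emitting code while the text is still being read, here is a small sketch of a recursive-descent parser that appends stack-machine instructions as soon as each fragment is recognized, without building a syntax tree. The instruction set (`push`, `add`, `sub`, `mul`, `div`) and the `compile_expr` helper are hypothetical, and for brevity the tokenizer runs up front rather than lazily.

```python
import re

def compile_expr(text):
    """Parse an arithmetic expression and emit stack-machine code in the same pass."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0
    out = []   # instructions are appended the moment a fragment is complete

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        else:
            out.append(("push", int(tok)))   # operand code is emitted immediately

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            out.append(("mul",) if op == "*" else ("div",))

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append(("add",) if op == "+" else ("sub",))

    expr()
    return out


emitted = compile_expr("1 + 2 * (3 - 4)")
```

The parser's call stack is the only bookkeeping; nothing about the expression is retained once its instructions have been appended.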

WalterBright · 2025-01-12T18:40:32 1736707232

You cannot do any optimization when generating machine code that way. That's fine for a primitive compiler built for a school project, but not much else. (Even "no optimization" switch settings on a compiler do a lot of optimizations, because otherwise the code quality is execrable.)

EuAndreh · 2025-01-12T21:54:39 1736718879

> That's fine for a primitive compiler built for a school project, but not much else.

Not true.

On the one hand, just see how many non-compiled languages are used outside of primitive school projects.

On the other hand, this simpler approach is actually faster for writing actually fast compilers. Many modern compiled languages have compilers that take on the order of ~100ms on a simple file with 1k LoC, when they could (and arguably should) take on the order of ~1ms, IOW, imperceptible given the syscall overhead.

A 100x faster compiler that generates meh code is more useful 99% of the time: when one is recompiling all the time during development.