Compilers can go straight to output without building an AST, which is just one form of output. I should have more generally written about output.
Output can be produced procedurally in a single pass. However, it will still require refactoring to work that in. For instance, say we go to a register machine (one that has an unlimited number of registers). The expression() function needs to know the destination register of the calculation. If it returns void as before, then it has to take it as an argument. Or else, it has to come up with the register and return it.
I think that if the target is a stack-based machine, that would be the easiest to work into the parsing scheme. This is because without any context at all. Let's use term() as an example:
Original recognizer skeleton:
static void
term(void)
{
factor();
while (type == TOK_MULTIPLY || type == TOK_DIVIDE) {
next();
factor();
}
}
Stack-based output:
static void
term(void)
{
factor(); // this now outputs the code which puts the factor on the stack
while (type == TOK_MULTIPLY || type == TOK_DIVIDE) {
next();
factor(); // of course, likewise outputs code.
output_stack_operation(type); // we just add this
}
}
What is output_stack_operation_type:
static void
output_stack_operation_type(int type)
{
switch (type) {
case TOK_MULTIPLY: output_line("mul"); break;
case TOK_DIVIDE: output_line("div"); break;
// ...
}
}
For instance if we see
3 * 4 / 6
we have the grammar symbols
factor TOK_MULTIPLY factor TOK_DIVIDE factor
the term function calls factor() first, and that function produces output, which might be the line:
push 3
the term function calls factor() again, so this time the output:
push 4
is produced. Then it calls output_stack_operation_type(type) where type is TOK_MULTIPLY. This outputs:
mul
The loop continues and further produces
push 6
div
Because stack code implicitly accesses and returns operands using the stack, the syntax-directed translation for it doesn't have to pass around environmental contexts like register allocation info and whatnot.
The stack-based operations do not take any argument, and therefore a parser whose functions don't return anything and don't take any arguments can accommodate stack-based code generation.
Stack-based code can be an intermediate representation that is parsed again, and turned into something else, like register-based.
If the plan is to go to stack-based output, the code can work.
Then what will be going on after parsing? What will be compiled?