> [building an AST is] a declared non-goal. Then what will be going on after par...

kazinator · on Aug 16, 2021

Compilers can go straight to output without building an AST, which is just one form of output. I should have more generally written about output.

Output can be produced procedurally in a single pass. However, it will still require refactoring to work that in. For instance, say we go to a register machine (one that has an unlimited number of registers). The expression() function needs to know the destination register of the calculation. If it returns void as before, then it has to take it as an argument. Or else, it has to come up with the register and return it.

I think that if the target is a stack-based machine, that would be the easiest to work into the parsing scheme. This is because without any context at all. Let's use term() as an example:

Original recognizer skeleton:

  static void
  term(void)
  {
    factor();

    while (type == TOK_MULTIPLY || type == TOK_DIVIDE) {
      next();
      factor();
    }
  }

Stack-based output:

  static void
  term(void)
  {
    factor(); // this now outputs the code which puts the factor on the stack

    while (type == TOK_MULTIPLY || type == TOK_DIVIDE) {
      next();
      factor(); // of course, likewise outputs code.

      output_stack_operation(type); // we just add this
    }
  }

What is output_stack_operation_type:

  static void
  output_stack_operation_type(int type)
  {
     switch (type) {
     case TOK_MULTIPLY: output_line("mul"); break;
     case TOK_DIVIDE: output_line("div"); break;
       // ...
     }
  }

For instance if we see

  3 * 4 / 6

we have the grammar symbols

  factor TOK_MULTIPLY factor TOK_DIVIDE factor

the term function calls factor() first, and that function produces output, which might be the line:

  push 3

the term function calls factor() again, so this time the output:

  push 4

is produced. Then it calls output_stack_operation_type(type) where type is TOK_MULTIPLY. This outputs:

mul

The loop continues and further produces

  push 6
  div

Because stack code implicitly accesses and returns operands using the stack, the syntax-directed translation for it doesn't have to pass around environmental contexts like register allocation info and whatnot.

The stack-based operations do not take any argument, and therefore a parser whose functions don't return anything and don't take any arguments can accommodate stack-based code generation.

Stack-based code can be an intermediate representation that is parsed again, and turned into something else, like register-based.

If the plan is to go to stack-based output, the code can work.