I have never written code to translate templates. Do you actually build a syntax tree for the template itself? I always assumed you would just store a simpler representation of the template (e.g., just a string of lexemes) and only build syntax trees when the template is instantiated.
Of course you still need to "parse" the template when it is encountered but you have to do it without semantic information (e.g., you don't know what `a` will expand to until instantiation) -- I guess that is the problem.
It is funny that Lisp's defmacro doesn't have this problem because the code itself is a syntax tree.
Both of the options you mention were used by implementations in the early days of C++ (before it became an ISO standard). Your "token string" approach is the route that Microsoft took with MSVC, whereas other compilers went with what was later standardized as "two-phase lookup".
In short: a token stream alone is not enough. You need to decide whether `T::A * b;` is a pointer declaration or a multiplication immediately when you parse the template. If `T::A` is a dependent name (i.e., if `T` is a template parameter), it is assumed not to name a type, so the statement parses as a multiplication. If that's not correct, the programmer must say so with `typename` (or with `template`, when the dependent name is a member template).
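To make the ambiguity concrete, here is a minimal sketch. The names `HasValue`, `HasType`, `multiply` and `deref` are made up for illustration and don't come from any real codebase. The same token sequence `T::A * x` shows up in both templates, and the parser has to commit to one reading before it knows what `T` is:

```cpp
#include <cassert>

// Hypothetical types, for illustration only.
struct HasValue { static constexpr int A = 6; };  // here T::A names a value
struct HasType  { using A = int; };               // here T::A names a type

// No `typename`: the dependent name T::A is parsed as a non-type,
// so `T::A * b` is a multiplication expression.
template <typename T>
int multiply(int b) {
    return T::A * b;
}

// With `typename`: T::A is parsed as a type,
// so `typename T::A * p` declares a pointer named p.
template <typename T>
int deref(typename T::A value) {
    typename T::A * p = &value;
    return *p;
}
```

For example, `multiply<HasValue>(7)` multiplies 6 by 7, and `deref<HasType>(5)` declares an `int*` and reads back 5. Instantiating `multiply<HasType>` should be rejected by a conforming compiler, because the template was already parsed with `T::A` as a value.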
MSVC has only recently completed their implementation of two-phase lookup, some twenty years after it was defined as the correct option in the ISO C++ standard. They have an excellent writeup here: https://devblogs.microsoft.com/cppblog/two-phase-name-lookup...