I for one find this separation so useful that I wrote my parser combinator library to be polymorphic over the token stream. The first layer (à la lex) can parse characters into lexer tokens, for instance, and the next layer (à la yacc) then parses those tokens into an AST. That also makes it easier to separate orthographic (“syntax”) errors from grammatical errors.
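To make the idea concrete, here is a minimal Python sketch of that layering. It is not my actual library; every name in it (`satisfy`, `many1`, `lex`, `parse`, `LexError`, `GrammarError`) is made up for illustration, and the toy grammar only handles sums of integers. The point is that the same combinators run over a string of characters in the first layer and over a list of tokens in the second, and each layer raises its own kind of error.

```python
# Illustrative sketch only: all names here are hypothetical.
# A parser is a function (tokens, position) -> (value, new_position) or None.
# "tokens" can be any indexable sequence: a str of characters or a list of tokens.

class LexError(Exception):
    """Raised by the character layer (orthographic errors)."""

class GrammarError(Exception):
    """Raised by the token layer (grammatical errors)."""

def satisfy(pred):
    """Consume one token if it satisfies pred."""
    def run(toks, i):
        if i < len(toks) and pred(toks[i]):
            return toks[i], i + 1
        return None
    return run

def many1(p):
    """Apply p one or more times and collect the results."""
    def run(toks, i):
        out = []
        r = p(toks, i)
        while r is not None:
            value, i = r
            out.append(value)
            r = p(toks, i)
        return (out, i) if out else None
    return run

def alt(*parsers):
    """Try each parser in order and return the first success."""
    def run(toks, i):
        for p in parsers:
            r = p(toks, i)
            if r is not None:
                return r
        return None
    return run

def fmap(f, p):
    """Transform the result of p with f."""
    def run(toks, i):
        r = p(toks, i)
        return None if r is None else (f(r[0]), r[1])
    return run

# Layer 1 (a la lex): characters -> tokens.
digit = satisfy(lambda c: c in "0123456789")
number = fmap(lambda ds: ("NUM", int("".join(ds))), many1(digit))
plus = fmap(lambda c: ("OP", c), satisfy(lambda c: c == "+"))
lexeme = alt(number, plus)

def lex(text):
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        r = lexeme(text, i)
        if r is None:
            raise LexError(f"unexpected character {text[i]!r} at offset {i}")
        tok, i = r
        tokens.append(tok)
    return tokens

# Layer 2 (a la yacc): tokens -> AST, built from the same combinators.
num_tok = fmap(lambda t: t[1], satisfy(lambda t: t[0] == "NUM"))
plus_tok = satisfy(lambda t: t == ("OP", "+"))

def expr(toks, i):
    """expr := NUM ('+' NUM)*, left-associative."""
    r = num_tok(toks, i)
    if r is None:
        return None
    left, i = r
    while True:
        op = plus_tok(toks, i)
        if op is None:
            return left, i
        rhs = num_tok(toks, op[1])
        if rhs is None:
            return None
        right, i = rhs
        left = ("add", left, right)

def parse(text):
    toks = lex(text)  # may raise LexError
    r = expr(toks, 0)
    if r is None or r[1] != len(toks):
        raise GrammarError(f"cannot parse token stream {toks!r}")
    return r[0]
```

With this sketch, `parse("1 $ 2")` fails in the first layer with a `LexError`, while `parse("1 + + 2")` lexes fine and fails in the second layer with a `GrammarError`, which is the separation of error kinds I mean.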