
Most (if not all) languages are compiled or interpreted in two stages: first the code is parsed (a.k.a. lexed, tokenized) to an internal representation (usually called "Abstract Syntax Tree", or AST), which just means translating something like ">" to "GREATER_THAN_SYMBOL", "5" to NUMBER, etc. and storing this in some data structure. This is also where you usually detect syntax errors like forgetting to use " to close a string and such.

In Go, you can do this fairly easily with the go/parser and go/ast packages (go/parser does the parsing, go/ast defines the tree it returns).
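As a rough illustration of that first stage, here is a minimal sketch using only the standard library. The helper name binaryOps and the sample inputs are made up for this example; parser.ParseExpr and ast.Inspect are the real standard-library calls. It parses an expression, collects the operator of every binary expression in the tree, and shows that a syntax error such as an unclosed string is reported by the parse step itself.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// binaryOps (a hypothetical helper for this example) parses a Go
// expression and returns the operator of every binary expression
// found in its AST, in traversal order.
func binaryOps(expr string) ([]token.Token, error) {
	node, err := parser.ParseExpr(expr)
	if err != nil {
		return nil, err // syntax errors are detected at this stage
	}
	var ops []token.Token
	ast.Inspect(node, func(n ast.Node) bool {
		if b, ok := n.(*ast.BinaryExpr); ok {
			ops = append(ops, b.Op)
		}
		return true // keep descending into child nodes
	})
	return ops, nil
}

func main() {
	ops, err := binaryOps("x > 5")
	fmt.Println(ops, err)

	// An unclosed string literal fails before any AST is produced.
	_, err = binaryOps(`"unterminated`)
	fmt.Println(err)
}
```

Note that nothing here needs type information: "x" does not have to be declared anywhere for the expression to parse.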

Then you take the AST and do something with it. That something can be compiling the code, but also writing some sort of tooling like a linter or an identifier rename tool, generating documentation from it, or whatnot. When compiling the code you need the type information; for a lot of other purposes you don't really care.
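To make the tooling case concrete, here is a small linter-style sketch that works purely on the AST, with no type checking. The function name undocumentedExported, the file name "example.go", and the sample source are all invented for this example; it reports exported functions that have no doc comment.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// undocumentedExported (a hypothetical helper) returns the names of
// exported functions in src that have no doc comment attached.
func undocumentedExported(src string) ([]string, error) {
	fset := token.NewFileSet()
	// ParseComments is needed so doc comments are attached to declarations.
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // not a function declaration
		}
		if fn.Name.IsExported() && fn.Doc == nil {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	src := `package demo

// Documented has a comment.
func Documented() {}

func Missing() {}

func private() {}
`
	names, err := undocumentedExported(src)
	fmt.Println(names, err)
}
```

A real linter would do more (methods, types, comment wording), but the shape is the same: parse once, then walk the tree.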

It's pretty valuable to keep the parsing as simple as possible; it makes it easier to detect errors, improves the quality of the error messages, and makes it easier to write tooling. It also keeps the code a lot simpler, easier to understand and modify, etc.




> Most (if not all) languages are compiled or interpreted in two stages: first the code is parsed (a.k.a. lexed, tokenized) to an internal representation (usually called "Abstract Syntax Tree", or AST)

Lexing/tokenization doesn't produce an AST; it produces a token stream. Parsing (which may or may not be preceded by lexing) produces an AST.
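Go's standard library happens to keep the two steps in separate packages, which makes the distinction easy to see. The sketch below uses go/scanner to turn source text into a flat list of tokens; there is no tree at this point. The helper name tokenKinds is made up for the example.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenKinds (a hypothetical helper) lexes src and returns the kind of
// each token in order. The result is a flat stream, not a tree.
func tokenKinds(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil error handler, default mode

	var kinds []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		kinds = append(kinds, tok.String())
	}
	return kinds
}

func main() {
	// Prints the token kinds for the expression; the Go scanner may also
	// emit an automatically inserted semicolon at the end.
	fmt.Println(tokenKinds("x > 5"))
}
```

Feeding the same input to go/parser would instead give a single binary-expression node with "x" and "5" as its operands, which is the AST.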



