
Traditionally the C preprocessor is a separate dialect, distinct from C itself, and it outputs C. So historically it ran before lexing, as far as C is concerned. It has also been used to process non-C text, but that's fairly obscure. The hint is in the name: "Preprocessor".

I'm sure some modern compilers blur the stages nowadays.




Yeah, I totally hear what you are saying. At the same time, I do at least know that Clang and GCC do the preprocessing inline with parsing, rather than forking out to an external preprocessor. In fact, I assume that these days your cpp binary is basically the compiler with everything after the preprocessor disabled, so it just dumps the preprocessed token stream as text.

That said, I have not read a lot of compiler source code to back this thought up. I have read some, but mostly in the C++ backend side, usually to resolve curiosities about vtable implementation details.


The #line directive essentially requires that you either handle macros directly in the compiler or add a syntax extension to carry that info over the wall.
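For anyone who hasn't used it, here is a minimal sketch of what #line does to the compiler's idea of "where am I". The file name "grammar.y" and the function names are made up for illustration; the only thing being shown is that __LINE__ and __FILE__ follow the directive rather than the real source position.

```c
/* Pretend the text below came from line 100 of a hypothetical "grammar.y".
 * The directive sets the line number of the line that follows it. */
#line 100 "grammar.y"
int reported_line(void) { return __LINE__; }          /* counted as line 100 */
const char *reported_file(void) { return __FILE__; }  /* "grammar.y" */
```

Printing reported_file() and reported_line() with "%s:%d" would give grammar.y:100, and compiler diagnostics for this code would point at the same location. That is the information that has to survive the trip from preprocessor to compiler.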


> or add a syntax extension to carry that info over the wall.

Isn't that why #line exists in the first place? A lot of online literature, including the clang and GCC manuals, is worded in a way that makes it seem like the purpose is consumption by the preprocessor, but I'd be surprised if the original purpose wasn't to communicate line numbers from the preprocessor to the compiler. The #line directive isn't mentioned in K&R; not even in the "ANSI C" second edition, despite C89 standardizing #line.

Of course, it's a preprocessor directive, but leveraging the syntax this way is a nice hack:

1) It's very simple to identify C preprocessor directives (see point 2), and cheap to add a pre-pass to a C compiler that identifies directives, accepting #line but throwing an error on (or complaining about and discarding) any other directive.

2) There's no way to generate valid, non-preprocessor C code that might look like a directive. '#' is not and never will be an operator in C or have any other lexical role; it can only exist within comments, character literals, or string literals.

3) You can gainfully feed #line to the preprocessor, for example from the output of a different transformer like M4.
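To make point 1 concrete, here is a rough sketch of such a pre-pass, not anything taken from a real compiler. The names (classify_line, enum line_kind) are invented for the example. It assumes the input is already-preprocessed text, so comments are gone and no string literal spans a line, which means a line whose first non-blank character is '#' can only be a directive. It also accepts the short "# 42 "file.c"" form that GCC and Clang emit in their preprocessed output, in addition to the spelled-out #line.

```c
#include <ctype.h>
#include <string.h>

enum line_kind { LINE_CODE, LINE_MARKER, LINE_OTHER_DIRECTIVE };

/* Skip spaces and tabs, which may surround the '#'. */
static const char *skip_blanks(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Classify one line of already-preprocessed C (assumption: no comments,
 * no line continuations). Hypothetical helper for illustration only. */
enum line_kind classify_line(const char *line)
{
    const char *s = skip_blanks(line);

    if (*s != '#')
        return LINE_CODE;            /* '#' elsewhere is inside a literal */

    s = skip_blanks(s + 1);

    /* Short linemarker form: "# 42 "file.c"" */
    if (isdigit((unsigned char)*s))
        return LINE_MARKER;

    /* Spelled-out form: "line" must be the whole directive name,
     * so "#linear" is not accepted. */
    if (strncmp(s, "line", 4) == 0 &&
        !isalnum((unsigned char)s[4]) && s[4] != '_')
        return LINE_MARKER;

    return LINE_OTHER_DIRECTIVE;     /* caller errors out or discards */
}
```

A compiler front end could call this per input line, record the position from LINE_MARKER lines, and reject LINE_OTHER_DIRECTIVE. A line like char *s = "#line"; comes back as LINE_CODE because the '#' is not the first non-blank character.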



