
Another great piece by Eli, on how Clang solves this problem and handles functions defined inside C++ classes, is https://eli.thegreenplace.net/2012/07/05/how-clang-handles-t....

Clang basically moves the semantic processing to another pass, and tokenizes both types and variables as ‘identifiers’.
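To make that split concrete, here is a minimal sketch of the idea. It is not Clang's actual code: `lex`, `classify`, `Token` and `TokenKind` are hypothetical names, and it only handles the `T * x;` shape. The lexer never consults a symbol table, and the decision between declaration and multiplication is made in a separate step that knows which names are types.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical token kinds; types and variables share a single Identifier kind.
enum class TokenKind { Identifier, Star, Semicolon, Unknown };

struct Token {
    TokenKind kind;
    std::string text;
};

// Step 1: purely lexical. No symbol table is consulted here.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char ch = static_cast<unsigned char>(src[i]);
        if (std::isspace(ch)) { ++i; continue; }
        if (std::isalpha(ch) || ch == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            out.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (ch == '*') {
            out.push_back({TokenKind::Star, "*"});
            ++i;
        } else if (ch == ';') {
            out.push_back({TokenKind::Semicolon, ";"});
            ++i;
        } else {
            out.push_back({TokenKind::Unknown, std::string(1, src[i])});
            ++i;
        }
    }
    return out;
}

// Step 2: semantic. Only here do we ask whether the first identifier names a type.
std::string classify(const std::vector<Token>& toks,
                     const std::set<std::string>& type_names) {
    bool shape_ok = toks.size() == 4 &&
                    toks[0].kind == TokenKind::Identifier &&
                    toks[1].kind == TokenKind::Star &&
                    toks[2].kind == TokenKind::Identifier &&
                    toks[3].kind == TokenKind::Semicolon;
    if (!shape_ok) return "unsupported";
    return type_names.count(toks[0].text) ? "declaration" : "multiplication";
}
```

The token stream for `T *x;` is the same in both cases; only the set of known type names passed to `classify` changes the answer.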

I’d agree that C/C++ have some sort of context sensitivity. However, I think this is not an ambiguity. You can deduce a single logical solution when you see T *x; . If the T in scope is a variable, you do a multiplication; if the T in scope is a type definition, you make a variable declaration.
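A small sketch of both readings of the same statement, using two hypothetical namespaces (`as_type` and `as_value`) so that each one sees a different `T` in scope:

```cpp
namespace as_type {
    typedef int T;  // here T names a type

    int run() {
        int value = 42;
        T *x;        // declaration: x is a pointer to int
        x = &value;
        return *x;
    }
}

namespace as_value {
    int run() {
        int T = 6;   // here T names a variable
        int x = 7;
        T * x;       // expression statement: a multiplication whose result is discarded
        return T * x;
    }
}
```

A compiler will typically warn about the discarded result in `as_value::run`, which is itself a hint that it parsed the line as an expression.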

Real ambiguity happens when there are two logical solutions to the same statement and the language specification has to prefer one to remove ambiguity and undefined behaviour.

For example, here is a C# code example from the spec that is ambiguous and therefore doesn’t compile:

   static void F(bool a, bool b) { Console.WriteLine($"{a} and {b}"); }

   static void Main(string[] args) {
       int G = 2; int A = 1; int B = 2;
       F(true, false);
       F(G<A, B>(7));
   }

The compiler prefers to interpret the last line as a function call with one argument, which is a call to a generic method G (that doesn’t exist in the code above) with two type arguments and one regular argument, instead of one function call with two boolean arguments.
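For comparison, here is a rough C++ analogue of the same call (a sketch, not a translation of the spec example; `two_arg_call` is a made-up helper name). Because a C++ compiler looks up G before deciding what `<` means, and G is a plain variable here, the line is read as two comparisons:

```cpp
#include <string>

std::string F(bool a, bool b) {
    return std::string(a ? "true" : "false") + " and " + (b ? "true" : "false");
}

std::string two_arg_call() {
    int G = 2;
    int A = 1;
    int B = 2;
    // G is not a template, so this is F((G < A), (B > (7))).
    return F(G<A, B>(7));
}
```

Both comparisons are false for these values, so the call produces "false and false".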



It can be complicated even when it's not ambiguous. Consider:

   #include <cstddef>

   template<std::size_t N = sizeof(void*)> struct a;

   template<> struct a<4> {
       enum { b };
   };

   template<> struct a<8> {
       template<int> struct b {};
   };

   enum { c, d };

   int main() {
       a<>::b<c>d;
   }
This is not ambiguous, but the meaning of the line inside main() depends on the compiler and the target - it can be either a declaration:

   a<>::b<c> d;
or an expression:

   a<>::b < c > d;
depending on which b gets selected. This in turn affects future references to d etc.


Note that this is only an issue in a non-generic context. For dependent names you have to disambiguate with "typename" and "template".
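A short sketch of those two disambiguators; `first_element`, `Converter` and `to_double` are hypothetical names chosen for the example:

```cpp
#include <vector>

template <typename Container>
typename Container::value_type first_element(const Container& c) {
    // Container::const_iterator is a dependent name; `typename` marks it as a type.
    typename Container::const_iterator it = c.begin();
    return *it;
}

struct Converter {
    template <typename U>
    U to(int v) const { return static_cast<U>(v); }
};

template <typename C>
double to_double(const C& conv, int v) {
    // `template` tells the parser that `<` opens a template argument list;
    // without it, `conv.to < double > (v)` would be read as comparisons.
    return conv.template to<double>(v);
}
```

In both templates the parser cannot look inside `Container` or `C` yet, so the keywords supply the information that name lookup would otherwise provide.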

But yeah, it looks like a C++ compiler has to instantiate templates in lock-step with parsing to make this work, but possibly there are some tricks that are applicable here. In any case, this looks like a hard problem.


I am not familiar with C++ templates, but isn’t this a similarly ambiguous pattern? If the compiler doesn’t evaluate the sizeof before parsing, there is still more than one equally logical way to parse it.

Nevertheless I think the C# way of choosing generics over comparison without semantic processing is strange. The example above should run as intended (as a simple function call with two arguments) and not fail because of some spec rule.


I vaguely remember C++ template crocodiles (angle brackets) being whitespace-sensitive some time ago when templates were nested in a variable declaration.


Yes, the lexer considered >> as a single token, so you had to write > >. The standard was changed in C++11 to allow the syntax; that definitely complicated parsing, but at least now heavily templated code looks 100% less ugly.
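A small illustration of both spellings, plus the flip side of the C++11 rule (the names `GridOld`, `GridNew`, `Constant`, `shifted` and `grid_cells` are made up for this sketch):

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Accepted by every C++ standard: the space keeps the lexer from forming `>>`.
typedef std::vector<std::vector<int> > GridOld;

// Accepted since C++11: here `>>` is treated as two closing angle brackets.
typedef std::vector<std::vector<int>> GridNew;

static_assert(std::is_same<GridOld, GridNew>::value, "both spellings name the same type");

template <int N>
struct Constant {
    static const int value = N;
};

// The flip side: a real right shift inside a template argument list
// now needs parentheses, or the first `>` would close the list.
static const int shifted = Constant<(16 >> 2)>::value;

std::size_t grid_cells() {
    GridNew g(2, std::vector<int>(3, 0));  // 2 rows of 3 zeros
    return g.size() * g[0].size();
}
```

The parenthesized shift is the price of the nicer syntax: inside a template argument list, an unparenthesized `>` always ends the list.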



