Clang basically moves the semantic processing to another pass, and tokenizes both types and variables as ‘identifiers’.
I’d agree that C/C++ have some sort of context sensitivity. However, I think this is not an ambiguity. You can deduce a single logical solution when you see T *x; . If the T in scope is a variable, you do a multiplication; if the T in scope is a type definition, you make a variable declaration.
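To make the two readings concrete, here is a small C++ sketch. The function names as_expression and as_declaration are made up for illustration; the only point is that the same tokens T * x resolve differently depending on what T names in scope.

```cpp
#include <cassert>

// Case 1: T names a variable in scope, so `T * x` is a multiplication.
int as_expression() {
    int T = 6;
    int x = 7;
    return T * x;  // binary operator*
}

// Case 2: T names a type in scope, so `T *x;` declares a pointer named x.
int as_declaration() {
    typedef int T;
    int value = 5;
    T *x;          // declaration: x has type int*
    x = &value;
    return *x;     // reads value back through the pointer
}
```

In both functions the parser needs the result of name lookup for T before it can pick a parse, which is the context sensitivity being discussed, but each case has exactly one valid reading.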
Real ambiguity happens when there are two logical solutions to the same statement and the language specification has to prefer one to remove ambiguity and undefined behaviour.
For example, here is a C# snippet from the spec that is ambiguous and therefore doesn’t compile:
“
static void F(bool a, bool b) {
    Console.WriteLine($"{a} and {b}");
}
static void Main(string[] args) {
    int G = 2;
    int A = 1;
    int B = 2;
    F(true, false);
    // Parsed as F called with one argument, G<A, B>(7), not as two comparisons.
    F(G<A, B>(7));
}
”
The compiler prefers to interpret the last line as a function call with one argument, which is a call to a generic method G (that doesn’t exist in the code above) with two type arguments and one regular argument, instead of one function call with two boolean arguments.
Note that this is only an issue in non-generic context. For dependent names you have to disambiguate with "typename" and "template".
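A minimal sketch of that disambiguation for dependent names follows. Widget, scaled and use are hypothetical names invented for this example.

```cpp
#include <cassert>

// Hypothetical type with a nested type and a member template.
struct Widget {
    typedef int value_type;
    template <int N>
    int scaled(int v) const { return v * N; }
};

template <typename T>
int use(const T& t) {
    // `typename` tells the parser that T::value_type names a type.
    typename T::value_type v = 3;
    // `template` tells the parser that `<` opens a template argument list
    // rather than being a less-than comparison.
    return t.template scaled<2>(v);
}
```

Inside use, the parser cannot look into T, so without the two keywords it would have to guess; with them, the template body can be parsed before any instantiation happens.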
But yeah, it looks like a C++ compiler has to instantiate templates in lock-step with parsing to make this work, but possibly there are some tricks that are applicable here. In any case, this looks like a hard problem.
I am not familiar with C++ templates, but isn’t this a similar ambiguous pattern? If the compiler doesn’t evaluate the sizeof’s before parsing, there is still more than one equally logical way to parse it.
Nevertheless I think the C# way of choosing generics over comparison without semantic processing is strange. The example above should run as intended (as a simple function call with two arguments) and not fail because of some spec rule.
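For contrast, here is a rough C++ analogue of the same call. It is only a sketch: F is changed to return an int so the result can be checked, and the values of G, A and B differ from the C# snippet so that both comparisons come out true. Because G names a plain variable, a C++ compiler reads the < as less-than and passes two boolean arguments.

```cpp
#include <cassert>

// Encodes the two booleans so the caller can see which ones were true.
int F(bool a, bool b) {
    return (a ? 2 : 0) + (b ? 1 : 0);
}

int two_comparisons() {
    int G = 0;
    int A = 1;
    int B = 9;
    // G is a variable, not a template, so this is F((G < A), (B > (7))).
    return F(G<A, B>(7));
}
```

Here name lookup on G settles the parse, which is the behaviour the comment above argues C# should also have.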
Yes, the lexer considered >> as a single token, so you had to write > >. The standard was changed in C++11 to allow the syntax; that definitely complicated parsing, but at least now heavily templated code looks 100% less ugly.
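A short sketch of what changed, assuming a C++11 or later compiler. Box, cells and shifted are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical template that exposes its integer argument.
template <int N>
struct Box {
    static const int value = N;
};

std::size_t cells() {
    // Since C++11 the closing `>>` is split into two `>`.
    // Before C++11 this had to be written std::vector<std::vector<int> >.
    std::vector<std::vector<int>> grid(2, std::vector<int>(3, 0));
    return grid.size() * grid[0].size();
}

int shifted() {
    // The flip side: a real right shift inside template arguments
    // now needs parentheses.
    return Box<(8 >> 1)>::value;
}
```

The parenthesized shift shows the cost of the change: the meaning of >> now depends on whether the parser is inside a template argument list.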