This has been posted before[1], and the "spiral rule" is a load of hooey.
The correct rule is "follow the C grammar". An easier to remember and also correct rule is "start at the identifier being declared; work outwards from that point, reading right until you hit a closing parenthesis, then left until you hit the corresponding open parenthesis, then resume reading right..." (this is sometimes called the "right-left rule"[2]).
The "spiral rule" dances around the truth without actually being precise enough to be useful.
I used to complain about re-posts too. But this is the first time I've seen a reference to this website, and it looks interesting to me. So I'm happy to have a new URL on my to-read list.
Way simpler: from inside out, read any subpart of the type as an expression. (Arrays have precedence over pointers, as usual.) The type that remains is that expression's type. So e.g. given the type:
const char *foo[][50]
the following expressions have the following types:
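For instance (i and j being arbitrary indices):

foo[i] has type const char *[50] (an array of 50 pointers to const char)
foo[i][j] has type const char * (a pointer to const char)
*foo[i][j] has type const char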
That const applies only to the bar symbol itself, not to anything it points to. So once bar is dereferenced, the const doesn't matter. The beauty of this method is that it predicts that correctly without having to think about it.
Nope, * const means that the identifier (i.e. thing to the right of the star) is const. That is, in this example, the symbol "bar" is const, not anything that it points to. So once you dereference it, the const no longer matters.
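For example, assuming the declaration under discussion was something along these lines (the char type and the names here are just for illustration):

void demo(void) {
    char buf[16];
    char * const bar = buf;   /* bar itself is const; the chars it points to are not */
    *bar = 'x';               /* fine: the pointed-to char is writable */
    /* bar = buf + 1; */      /* error if uncommented: bar itself cannot be reassigned */
}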
The real rule is that the type construction operators mirror the unary and postfix family of operators (declaration follows use). For instance, unary * declares a pointer, mimicking the dereference operator, and postfix [] and () declare arrays and functions, mimicking array indexing and function call.
To follow the declaration you make use of the fact that postfix operators have a higher precedence than unary ones, and that of course unary operators are right-associative whereas postfix ones are left-associative (necessarily so, since both have to "bind" with their operand).
So given
int ***p[3][4][5];
we follow the higher-precedence postfixes first, reading left to right: [3], [4], [5]. Then we run out of those and follow the lower-precedence * * * in right-to-left order. So p is an array of 3 arrays of 4 arrays of 5 pointers to pointer to pointer to int.
If there are parentheses present, they split this process. We go through the postfixes, and then the unaries within the parens. Then we do the same outside those parens (perhaps inside the next level of parens):
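For instance, take this made-up declaration:

int (*(*q)[3])(void);

Inside the inner parentheses there is no postfix, just the unary *, so q is a pointer. One level out, the postfix [3] comes first, then the unary *: a pointer to an array of 3 pointers. Outside the outer parentheses comes the postfix (void), and finally the base type: ...to functions taking no arguments and returning int. Altogether: q is a pointer to an array of 3 pointers to functions taking void and returning int.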
The spiral rule would state "arr is an array of pointers to arrays of 10 ints", where actually it would be "arr is an array of arrays of 10 pointers to int".
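The declaration in question is presumably something like this (the outer dimension of 5 is arbitrary):

int *arr[5][10];   /* arr: array of 5 arrays of 10 pointers to int */

The correct reading consumes both []s before the *, whereas a literal spiral interleaves the * between them, which is what produces the wrong answer.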
Instead, when you write declarations, do it from right-to-left, e.g.:
char const* argv[];
"argv is an array of pointers to constant characters"
Declaration-follows-usage is much easier to follow than the artificial spiral rule; IMHO, with a few typedefs, declaration-follows-usage can make things pretty simple.
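For example (the names here are made up):

typedef int (*handler_t)(void);   /* handler_t: pointer to a function taking no arguments, returning int */
handler_t handlers[10];           /* instead of: int (*handlers[10])(void); */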
It's the difference between "declare a constant integer" and "declare an integer constant" and to me the former more accurately represents what you're doing since `const` is modifying `int`, `int` isn't modifying `const`.
Putting const on the right makes more sense when you have pointers or references. Then you just always read from right to left: `int const *` is a pointer to constant integer, whereas `int * const` is a constant pointer to integer.
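Concretely (a made-up snippet):

void demo(void) {
    int x = 0, y = 0;
    int const *p = &x;    /* pointer to const int: reassigning p is fine, writing *p is an error */
    int * const q = &x;   /* const pointer to int: writing *q is fine, reassigning q is an error */
    p = &y;               /* OK */
    *q = 1;               /* OK */
}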
Also your argument about which modifies which is strongly anglocentric: there are plenty of people whose native language puts modifiers after the things they modify.
Looking at "idiomatic modern C++", I am often at a loss for words at what lengths they've gone to in order to reinvent things while greatly obfuscating them in the process. Is there a std::pointer_to<T> too? I don't know, but something like this
std::array<std::pointer_to<byte>, 10> str;
certainly does not look any more readable to me than
byte *str[10];
(Disclaimer: I mainly work with C, but find some C++ features genuinely useful, although the majority of the time they seem more like absurd complexity for the sake of complexity.)
I've never seen nor heard of pointer_to ever being used to declare a pointer to something. I believe it's used inside custom allocators for a generic type that might not use a normal pointer as its pointer type, but it would never be used for ordinary declarations like this.
std::array is useful for avoiding array-to-pointer decay, getting value semantics, and actually keeping the array length in the type of a function parameter.
std::array does not exist because it is easier to read. It exists because C arrays behave strangely. Two examples: decay to pointer and no value semantics.
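The C-side behaviour being referred to, sketched with made-up names:

/* despite the [10], the parameter really has type int *, so the length is lost */
void f(int a[10]) { (void)a; }

void g(void) {
    int a[10], b[10];
    f(a);         /* a decays to &a[0] */
    /* b = a; */  /* error if uncommented: arrays cannot be assigned (no value semantics) */
    (void)b;
}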
I am not familiar with Go, and have heard many praises of its declaration syntax, but is its dereference operator postfix? That would make sense in such a case.
On the other hand, IMHO the whole "make declarations read left-to-right" idea is misguided --- plenty of other constructs exist in programming languages which simply can't be read left-to-right, but are nested according to precedence. I mean, you might as well make 3+4*3 evaluate to 21 if you want to try making everything consistently left-to-right, but I don't really see anyone complaining about not being able to understand operator precedence...
Go's dereference operator `*` is prefix, like in C.
The point here is that type declarations read regularly, and those tend to be the tricky ones. Expressions tend not to be so difficult, and are more commonly factored out if they become complex. For various reasons, type declarations are not so practically factorable.
When I came to Go, I hadn't used C or C++ for over a decade, only Java and C# in between. Using explicitly written pointers came flooding back, but the new "C for expressions, Pascal for declarations" syntax still takes getting used to.
Declaring `v * T` means we can write `* v` as an expression, so the use of token * is synchronized for both these uses, but I must vocalize the * in my head differently:
`*T` vocalizes as "pointer to something of type T"
`*v` vocalizes as "that pointed to by variable v"
`&v` vocalizes as "pointer to variable v"
So my thought process when I see * goes: If it's in a type, say "pointer to", otherwise say the opposite of "pointer to", i.e. "that pointed to by". It feels like an inconsistent use of * whenever I'm writing Go code -- even though I know it's a natural result of Go using Pascal-style declaration syntax but C-style tokens.
But then you lose C's nice property that declaration and use are the same syntax.
For example, D also uses a similar type syntax, so in D if you declare:
int[10][20] x;
x[19][9] // is legal
In C:
int x[10][20];
x[9][19] // is legal
I think the correct solution would have been to make pointer syntax post-fix like the arrays and functions, so that you get the best of both worlds. Go-like declarations and C-like matchup between use and declarations.
I read a paper somewhere by Dennis Ritchie where he explained the development of the C language, its pros and cons. In it he mentioned that reading complex declarations is a problem in C, and said that if the * operator had been placed to the left of the type it qualifies, it would have been easier to write and understand more complex declarations.
(PS: Golang has the right idea, since it's developed by people who contributed to C.)
The type syntax exactly matches the expression syntax used to destruct values of the type. It is very intuitive once you realize this.
The alternative would be for the type syntax to mirror the expression syntax used to construct values of the type. Functional languages tend to do this, particularly ones which prefer pattern matching over destructors.
Yeah, the rule is intuitive when I am writing the code, but I think the type declarations in other languages like Go are easier to read correctly when I am skimming through the code even though I am much more used to C. I am not sure how useful this mirroring of usage is in practice.
Seems overly complex. The way I learned it, and now teach, is to read the type backwards (int const * is 'pointer to const int') for const correctness, but anything that requires more complex parsing by a human should just be typedef'd into submission.
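The classic case is signal() from the standard library. Declared directly, it is

void (*signal(int sig, void (*handler)(int)))(int);

but typedef'd into submission (the typedef name here is made up), it becomes

typedef void (*sig_handler_t)(int);
sig_handler_t signal(int sig, sig_handler_t handler);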
Notably, one of the exercises in K&R (with a solution provided) is to write a mostly complete version of cdecl, which I think is great for dispelling much of the "magic" and increasing the understanding of how declarations are actually parsed.
Once I tried to figure out how to parse complex C declarations just by reading the specification (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf), that is, without consulting guides for laymen like this one. But I gave up. I looked at what seemed to be a BNF-like description of the C grammar, but I had no idea what it told me about the parsing rules. So I ended up using this guide: http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html With this I managed to implement an imitation of cdecl.
What is the value of bending ourselves to fit a confusingly designed old language, rather than bending the language to fit us? The very fact that articles like this have to be written indicates a failure of user interface design, which we needn't forever perpetuate.
The “spiral rule” is just an approximation of the actual rule as defined in the standard: declaration follows usage.
Even with typedefs, that declaration means “when you call baz with a bing and a pointer (named bratz) to a function of type boff(biff), then you get back a pointer to a function of type foo(buff).”
It’s an extremely concise notation for expressing type information without (much) special type syntax, and I think it’s quite elegant in that way.
In C, the statement “type declarator;” is an assertion that “declarator” has the type “type”. In other words, if you read “declarator” as an expression (more or less), then it should have the type “type”. So here:
foo (*baz(bing, boff (*bratz)(biff)))(buff);
“foo” is the type, and the rest is the declarator. Then you just break it down according to the usual precedence rules:
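Roughly: baz(bing, boff (*bratz)(biff)) says baz is a function taking a bing and a bratz, where bratz is a pointer to a function taking a biff and returning a boff; the * says the result of calling baz is a pointer; the trailing (buff) says that pointer points to a function taking a buff; and foo says that function returns a foo.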
The first red flag is that the rule says "clockwise" when there is clearly no way to distinguish clockwise from anticlockwise inside the code. Only the completely arbitrary choice of up/down direction in the drawing affects clockwiseness.
It's been 20 years(!) Why is this incorrect advice still up at the c-faq?
Strings with equal numbers of open and close parentheses, balanced in order, are counted by the Catalan numbers:
(())()(()())(())
These count the different arrangements of parentheses for function application. This guy is describing something like contour integration for computer programs.
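(For reference, the number of balanced strings made of n pairs of parentheses is the nth Catalan number, C_n = (2n choose n) / (n + 1): 1, 1, 2, 5, 14, 42, ...)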
This is the reason why golang declares the identifier before the type, and the return value after the function parameters. This allows parsing any declaration from left to right.
Just don't make complex declarations in C, it's almost never useful and won't help anyone out. It'll confuse people and make your code write-only. Just put in a couple extra lines of code somewhere if you have to. It won't be the end of the world.
It'll certainly confuse people, but only those who aren't qualified to be doing anything with the code anyway.
"complex" is subjective. It reminds me of stupid "rules" like "don't use the ternary operator", "every function must be less than 20 lines" (I am not exaggerating --- this was on a Java project, however); and you could easily extend that to "every statement must have a maximum of one operator", "you must not use parentheses", "you must not use more than one level of indirection", etc. Where do you stop? To borrow a saying from UI, "if you write code that even an idiot can understand, only idiots will want to work on it." I don't think we should be forcing programmers to dumb-down code at all.
That said, I'm not advocating for overly complex solutions, and will definitely prefer a simpler solution, but you should know and use the language fully to your benefit.
>> It'll certainly confuse people, but only those who aren't qualified to be doing anything with the code anyway.
If the complexity can be avoided, why not avoid it? Removing complexity is not the same as dumbing down code. It will improve readability and maintainability.
This mindset is definitely applicable to declarations as well as other code constructs.
Note the last sentence of my comment. I am not advocating unwarranted complexity at all, but just saying that there are cases where an increase in local complexity can reduce overall complexity of the system, and you should not be afraid of using the language to the best of your ability.
[1] https://news.ycombinator.com/item?id=5079787 [2] http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html