
The way I got this to stick in my head was to always think of * as dereferencing, and tell myself that

    int *x;
is declaring that the type of *x is int.



It's not just a memorization trick; that's exactly what the statement means. If you do

    int *x, y;
You're saying that both *x and y are integers.
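
A minimal sketch of how to check this yourself, assuming a C11 compiler (for `_Generic`). The `TYPE_NAME` macro and the function name are made up for illustration:

```c
#include <assert.h>
#include <string.h>

/* Map an expression to a printable name for its type (C11 _Generic). */
#define TYPE_NAME(e) _Generic((e), int: "int", int *: "int *", default: "other")

/* Hypothetical helper: returns 1 if the declaration reads as described. */
int star_binds_to_declarator(void)
{
    int *x, y;   /* "*x is an int" and "y is an int" */

    y = 7;
    x = &y;

    assert(strcmp(TYPE_NAME(x), "int *") == 0);   /* x itself is a pointer */
    assert(strcmp(TYPE_NAME(*x), "int") == 0);    /* *x is an int */
    assert(strcmp(TYPE_NAME(y), "int") == 0);     /* y is a plain int */
    return *x == y;
}
```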


... which is why I never understood why this is the convention rather than

    int* x, y;
Does somebody know?


Because that declares an int pointer and an int, not two pointers. The star binds to the name on its right, not to the type on its left.

I prefer to use the variant you described though, because it feels more natural to associate the pointer with the type itself. As far as I know, the only pitfall is in the multiple declaration thing so I just don't use it.

IMO, it's also more readable in this case:

    int *get_int(void);
    int* get_int(void);
The second one more clearly shows that it returns a pointer-to-int.
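
Both spellings are the same declaration to the compiler, so the choice is purely visual. A small sketch, where the `stored` variable and `read_through_pointer` are hypothetical names added for illustration:

```c
#include <assert.h>

int *get_int(void);   /* star next to the name */
int* get_int(void);   /* star next to the type: an identical redeclaration */

static int stored = 42;   /* hypothetical backing value */

int *get_int(void)
{
    return &stored;
}

/* Hypothetical caller: reads the int that the returned pointer refers to. */
int read_through_pointer(void)
{
    int *p = get_int();

    assert(p != 0);
    return *p;
}
```

If the two prototypes differed in type, the compiler would reject the second one as a conflicting declaration.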


Multiple declaration is generally frowned upon, because you declare the variables without immediately setting them to something.

If you always set new variables in the same statement you declare them, then you don't use multiple declarations, which means there is no ambiguity putting the * by the type name.

So convention wins out for convention's sake. And that's the entire point of convention in the first place: to sidestep the ugly warts of a decades-old language design.


Whitespace is ignored (except to separate tokens where no punctuation such as * or , does the job), and * binds to the variable on the right, not the type on the left. I actually got this wrong in an online test, but I screenshotted every question so I could go over them later (a dirty trick, I admit, but I learned things like this from it, and I still did well enough on the test to get the interview).

    int*x,y; // x is pointer to int, y is int
    int x,*y; // x is int, y is pointer to int

And the reason I got it wrong on the test is it had been MANY years since I defined more than one variable in a statement (one variable defined per line is wordier but much cleaner), so if I ever knew this rule before, I had forgotten it over time.

I keep wanting to use slash-star comments, but I recall // is comment-to-end-of-line in C99 and later, something picked up from its earlier use in C++.

Oh yeah, in my experience C99 has become the de-facto "official" C language, regardless of more recent changes/improvements, as not all newer changes have made it into every compiler, and most code written since 1999 seems to follow the C99 standard. gcc and clang, for example, take a -std= option (such as -std=c99) to select which standard to compile against.
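
If you want to see which standard a build is actually using, the predefined `__STDC_VERSION__` macro reports it. A rough sketch, with both function names invented for illustration:

```c
#include <assert.h>

/* Hypothetical helper: map a __STDC_VERSION__ value to a publication year. */
int c_standard_year(long version)
{
    if (version >= 202311L) return 2023;   /* C23 */
    if (version >= 201710L) return 2017;   /* C17 */
    if (version >= 201112L) return 2011;   /* C11 */
    if (version >= 199901L) return 1999;   /* C99 */
    if (version >= 199409L) return 1995;   /* C90 plus Amendment 1 */
    return 1990;                           /* C89/C90 */
}

/* Year of the standard this translation unit is being compiled against. */
int current_c_standard_year(void)
{
    int year;

#ifdef __STDC_VERSION__
    year = c_standard_year(__STDC_VERSION__);
#else
    year = c_standard_year(0L);   /* plain C89/C90 does not define the macro */
#endif
    assert(year >= 1990);
    return year;
}
```

Compiling the same file with `-std=c99` and then `-std=c11` should change what `current_c_standard_year()` returns.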


I think the question is why it binds to the variable rather than the type. It's obviously a choice that the designers have made; e.g. C# has very similar syntax, but:

    int* x, y;
declares two pointers.

I think the syntax and the underlying "declaration follows use" rule are what they got when they tried to generalize the traditional array declaration syntax, with square brackets after the array name, which they inherited directly from B and ultimately all the way from Algol:

    int x, y[10], z[20];
In B, though, arrays were not a type; when you wrote this:

    auto x, y[10], z[20];
x, y, and z all have the same type (word); the [] is basically just alloca(). This all works because the type of element in any array is also the same (word), so you don't need to distinguish different arrays for the purposes of correctly implementing [].

But in C, the compiler has to know the type of the array element, since it can vary. Which means that it has to be reflected in the type of the array, somehow. Which means that arrays are now a type, and thus [] is part of the type declaration.

And if you want to keep the old syntax for array declarations, then you get this situation where the type is split in two by the array name in the middle. If you then try to formalize this somehow, the "declaration follows use" rule feels like the simplest way to explain it, and applying it to pointers as well makes sense from a consistency perspective.
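
A small sketch of the array case in C, where the element type is written once on the left and each name carries its own `[]`. The `COUNT_OF` macro and the function name are assumptions added for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: total element count of the two arrays declared below. */
size_t total_elements(void)
{
    int x, y[10], z[20];   /* x, y[i], and z[i] are all int */

    x = 1;
    y[0] = x;
    z[0] = y[0];

    /* Using an element the way it was declared gives back an int. */
    assert(sizeof y[0] == sizeof x);
    assert(z[0] == 1);

    return COUNT_OF(y) + COUNT_OF(z);
}
```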


You must've misunderstood. Your statement looks like both x and y are `int *`, but in fact only x is an `int *`, while y is an `int`.


I don't know for certain, but I suspect it simplified the language's grammar, since C's "declaration follows use" rule means you can basically repurpose the expression grammar for declarations instead of needing new rules for types. This is also why the function pointer syntax is so baroque (`int (*x)();` declares a variable `x` containing a pointer to a function returning an int; before C23 the empty parentheses leave the parameters unspecified, and `int (*x)(void);` is the spelling that says "no parameters").
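
A quick sketch of reading a function pointer declaration the same way, using the explicit `(void)` parameter list. The function names `forty_two` and `call_through_pointer` are made up for illustration:

```c
#include <assert.h>

static int forty_two(void) { return 42; }   /* hypothetical target function */

/* Hypothetical helper: calls forty_two through a pointer declared "by use". */
int call_through_pointer(void)
{
    /* The expression (*x)() is an int, so x is a pointer to a function
       that takes no arguments and returns int. */
    int (*x)(void) = forty_two;

    assert(x() == (*x)());   /* both call forms are accepted */
    return (*x)();
}
```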


I like that a lot! However, it makes things like

    int *x = &a;
a bit more confusing/inconsistent.


Not at all!

a is int; &a is pointer to int; x is pointer to int; *x is again int.
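
The part that trips people up is that the initializer applies to x, not to *x. A minimal sketch, with the function name chosen only for illustration:

```c
#include <assert.h>

/* Hypothetical helper: increments its argument through a pointer to it. */
int round_trip(int a)
{
    int *x = &a;   /* initializes x (not *x) with the address of a */

    assert(x == &a);   /* x is a pointer to int */
    assert(*x == a);   /* *x is again an int */

    *x += 1;           /* writing through the pointer changes a */
    return a;
}
```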


Gotcha, so it's kind of like:

    int (*(x = &(a)));
    i    i p   p i   // i means int, p means pointer


I prefer to think of it as

    (int *) x = &(a);
     i   p  p   a i // a means address       
    
Which is why I prefer to write

    int* x = &a;
"integer pointer" named "x" set to address of integer "a".

---

As a sibling comment pointed out, this is ambiguous when using multiple declaration:

    int* foo, bar;
The above statement declares an "integer pointer" foo and an "integer" bar. It can be unambiguously rewritten as:

    int bar, *foo;
But multiple declaration sucks anyway! It's widely accepted good practice to initialize your variables in the same statement that you declare them. Otherwise they hold indeterminate values: reading bar might give you whatever data was lying around on the stack, or worse, dereferencing foo touches whatever random memory address it happens to hold.
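
A sketch of the one-declaration-per-line style with every variable initialized. The function name and the `unused` variable are hypothetical, added for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: every variable gets a value where it is declared. */
int initialized_declarations(void)
{
    int bar = 0;          /* one declaration per line, each with an initializer */
    int* foo = &bar;      /* never holds an indeterminate address */
    int* unused = NULL;   /* no target yet, so say so explicitly */

    *foo = 3;             /* safe: foo is known to point at bar */
    assert(unused == NULL);
    return bar;
}
```

With only one name per declaration, it no longer matters which side of the space the star sits on.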


Thanks :)



