Hacker News
Massacring C Pointers (2018) (wozniak.ca)
201 points by goranmoomin on Sept 4, 2020 | 149 comments



Fortunately, 18 years ago, I began learning C from the book The C Programming Language, 2nd ed. written by Brian Kernighan and Dennis Ritchie.

The subject of pointers was generally considered scary among my fellow students, and many of them bought pretty fat books dedicated solely to the topic of pointers.

However, when I reached Chapter 5 of The C Programming Language (K&R), I found that it did a wonderful job of teaching pointers in merely 31 pages. The chapter opens with this sentence:

> A pointer is a variable that contains the address of a variable.

The exact point at which the whole topic of pointers became crystal clear was when I encountered this sentence in § 5.3 Pointers and Arrays:

> Rather more surprising, at first sight, is the fact that a reference to a[i] can also be written as * (a+i).

Indeed, it was easy to confirm that by compiling and running the following program:

  #include <stdio.h>
  
  int main() {
      int a[] = {2, 3, 5, 7, 11};
      printf("%d\n", *(a + 2));
      printf("%d\n", a[2]);
      printf("%d\n", 2[a]);
      return 0;
  }
The output is:

  5
  5
  5
C was the first serious programming language I was learning back then, and I don't think I could have come across a better book than K&R for this subject. Like many others, I too feel that this book is a model for technical writing. I wish more technical books were written like this, with clear presentation and concise treatment.


K&R is easily one of the best CS/programming books. Short, concise and to the point.

Another way to learn pointers is via assembly. Either learn assembly or just look at the disassembled code. Look under the hood and it'll become clearer.

Of course the best way is to write your own C compiler... but that may be overkill for most.


> Rather more surprising, at first sight, is the fact that a reference to a[i] can also be written as * (a+i)

I did not find that all that surprising, but the first time I saw that a[i] can also be written as i[a], I was very irritated.


I don't know. I read the pointers chapter at least 3 times, and I still get confused when I see pointer stuff in real code. The other day, I read the famous regex post by Russ Cox[1], and it took me a couple of hours to understand the pointer parts of the code even though the algorithm behind it was relatively simple. I guess pointers are just one of those things that for some people make a lot of sense, whereas others struggle with.

[1] https://swtch.com/~rsc/regexp/regexp1.html


>> A pointer is a variable that contains the address of a variable.

Well, it is a little bit incomplete, I think. A pointer has both the address of a variable and a type. That is why pointer arithmetic works. Without the pointer's type, p++ cannot be computed.


The pointer variable does not contain the type; it only stores the address. sizeof will return whatever the machine address width is.

The type is implicitly stored as the pointer's type. Casting a pointer can change how a piece of memory is interpreted, to any type you want, although the cast might not make logical sense or may even cause undefined behavior in C.

Pointer arithmetic works off whatever the current type is, as the compiler needs to know how to compute the memory offsets.


I actually meant that the pointer's type is also important to understanding the concept. I may not have stated it well. Yes, it does not store the type of the variable it points to, but the type of the pointer matters, and it is significant because many operations depend on it.


C beginner here. I think you stated it well: "and a type". The post you responded to, on the contrary, misses your point that in memory a pointer is a variable that stores both an address and type information about what it points to.


No, it does not. The compiler knows the pointer type so that it can generate correct assembly, but the pointer type is not something stored in memory past the compilation phase.


> A pointer is a variable that contains the address of a variable.

Given the way the C-standard is relatively carefully written, that explanation is a useful lie, but not necessarily true in every implementation of C.


Sure. When you study physics you first learn Newtonian gravitation; and later, once you really master it, you learn that it is actually false and proceed to learn general relativity. In a similar way, it is very useful for people who learn C to start with the idea that "a pointer is an int". Then, they can discover that pointers are not really ints, but not before they have mastered working with pointers as if they were ints.


Could you explain (or link to a good resource about) what pointers really are? I'm surprised that they aren't just ints that point to the address of a variable.


In all modern implementations that I know of, pointers are effectively ints. But the formal definition of the C language takes great care in not saying that, allowing implementations to choose how to implement pointers. In some older systems, for example in 16-bit MS-DOS with "high memory", pointers were not really ints, but pairs of ints (called the segment and the offset, if I recall correctly) whose mapping to memory positions was overlapping, i.e., non-unique: you could have very different representations for the same memory position. I guess the gcc port to MS-DOS by DJ Delorie had to deal with these issues somehow.

But it is possible to imagine completely exotic CPU architectures, where there is a separate type of register that can only be used for pointers and it is impossible to convert to int; you could still write C compilers for that architecture and be 100% conforming to the standard.


If I remember correctly, 16-bit C compilers let you choose between a couple of different memory models. I remember three: data and code both fit in 16 bits; data is 16 bits and code is segment and offset; and both pointers are segment and offset. I think there were more, especially for systems with 'high memory'.

On most processors a pointer just gets stuffed into a register, and that's then used by an assembly instruction to read/write a memory location. But it's certainly possible to have systems where it's not so simple.


They are variables that store an address. Usually they are an integer-ish thing internally, but this is in no way required, just as C has no need to run on a machine with linear memory. You could have a variable-to-memory hash table backing all your objects, for example, with a pointer that is a bucket or something.


This is the kind of thing that you need a significant other to straighten out.

"Hey!"

(smacks on head)

"It's an int that points to a variable."

"Now get off the computer and take out the trash without talking about garbage collection!"

"but it could be an int32_t and sometimes..."

(smack)


This is what you want to read: https://blog.regehr.org/archives/1621


No, it’s correct. The statement makes no claims about how the pointer is stored, but semantically it is a variable with the information in it to locate another variable.


This is true, as long as your definition of "another variable" stretches far enough to include memory regions allocated by malloc. And (leaving the C standard), also things allocated through other means, like memory-mapped files, and things like like memory-mapped peripherals.

In other words, the statement is incorrect.


I was specifically responding to the comment above, which had clearly conflated the standard not specifying the representation of a pointer with “anything anyone says about pointers isn’t actually correct”. Keeping the standard in mind the definition of “variable” generally extends to “any typed memory regardless of whether it has a name or not”, and the pointers K&R are talking about are fairly clear to mean this in the sense of the pointer being dereferenceable to a specific type. That being said, you are correct that there are other pointers not mentioned in that statement that are an entirely separate class from normal data pointers: untyped buffers, usually specified by void * but (for legacy reasons) often char * as well. And of course, there’s also function pointers that you dereference only to call them.


Um, why wouldn't it be correct to call any/all of those examples a variable?


They’re not variables until they’ve been given a type, which happens at the point you first assign to the memory in a typed fashion, I believe. It’s significantly more pedantic than the first line in an introductory text will go into, though ;)


Nah, that's wrong. A variable doesn't need a type. Case in point: void.


All variables have types in C. You can't declare a "void" variable. You can declare a void function. Try assigning it to a variable. You can't. void pointers (void*) are a different story.


So just the opposite of what the comment you replied to suggested:

'void' is a type, but in C values of the type void are not first class. Ie a function can return a value of type void, but you can't assign it to a variable.


I'm not sure what you mean, could you explain a bit more?


Are you saying void is not a type?


It is a good book. But to be fair, not many modern languages are as simple (low number of concepts) as C.


> A pointer is a variable that contains the address of a variable.

Doesn't it technically also contain its own address? So it contains the address of another variable and the address of itself. Pointers are hard...


A variable "contains" only its (current) value. That's the only use of "contains" that makes sense to me.

You can "get" the address of a variable (or more generally, of any l-value) with the & operator. That operator is nothing else but an "escape" character that says "Dear C, I don't want you to load the current value of that variable. Give me just the address where it lives so I can load it later".

(Same applies to storing instead of loading)


> Doesn't it technically also contain its own address

Huh? Then you probably don't understand the meaning of "contains" here. Or what variables are in general. Variables are hard...


Well, the way I see it is this: you, as a person, contain your thoughts and possessions and so on - call it your value. But you also exist in space and time, you have a defined location. You don't carry it in your pocket, it's something you implicitly contain.


Nope. And they're not really. Imagine a simple computer with, say, 32 bytes of ram.

All our bytes of ram are numbered. We read a value from byte number 8 and that value is 17.

"Byte number 8" is a pain to remember or calculate so we give it a label and let the computer keep track. That's a "variable".

What does that 17 mean though? Is it a value we want to use later on? It might be. Or it could be referring to Byte 17 in our ram. Byte 17 happens to have the value 42 so if we use Byte 8 as a pointer that's the value we end up with.

But it's entirely our choice. It's just a number and we can treat it as a value or a pointer to another memory address as we like.


No, that would be a pointer to a pointer.


For a while I had a gig writing HR software, and I read a bunch of books on conducting technical interviews. One particular gem of a book -- the title escapes me -- had example interview questions for some popular programming languages. I don't know about the COBOL, XML and SQL chapters, but the C and Java chapters were terrible:

Q: "How do you get the size of a file?"

A: "Call open, and then sizeof."

... is the only one I recall, but they all ranged from "awful" to "not even wrong".

Over my years of job hunting I'm slightly disappointed that I've never been given a question from it. :-)

This is all fun and games until you run into this kind of code in production. I once had to do an emergency rewrite of some firmware by someone who did not appreciate the benefit of functions. That one was so bad that I stopped looking at the original code for any reason because I was afraid I'd accidentally adopt one of the author's many, many misinterpretations of the hardware, or of base reality.


I've met programmers who didn't understand functions. For one of them, pretty well everything was one giant code block. He also didn't understand the need for compiler warnings and switched them all off.

Come to think of it, I refactored some code from a guy who didn't understand procs. He just cut & pasted code 'snippets' (a snippet being about a screenful) a couple of hundred times.

I find it difficult to understand such people.


During my first few years as a professional I worked on COBOL systems. COBOL doesn’t have functions, only procedures (no parameters, no return values, everything comes from the global state). People were writing horrible programs made of one giant procedure that would go on for thousands of lines with crazy levels of indentation.

Everything was done through a small terminal emulator for the AS/400, 24x80 characters. Every program was copy-pasted from other places.

I eventually found a way to transfer the file out via FTP and then upload it back; that way I could at least use vim and see what I was writing, but I was alone in doing that.

I’ve been reprimanded after documenting my code via comments because I was using some vertical space that should have been instead reserved to code (when you only have 24 lines at a time, people become really petty about things).

That was around 2009 :p


I suspect some of these programming habits might have come from working in MATLAB. I think experience with MATLAB could be considered positively harmful: it fosters a mentality of quickly bashing out a piece of code with hardly any thought given to structuring it in a systematic, organized manner.

Hopefully the adoption of Python-based tooling in many fields will help alleviate this in coming years.


In Matlab's defense, it has become a much better language over the years.

They even introduced first class functions more than a decade ago.


Was the book intended for people who didn't know how to program but were for some reason interviewing a programmer?


It’s not that unusual of a situation. You might have other technical teams but your lead developer(s) leave.

These days you can outsource technical tests to online businesses that specialise in that kind of stuff but that wasn’t always an option, particularly if you travel far back enough in time, and hiring contractors to support you during the recruitment process is often seen as too expensive for some businesses.


I once had to rewrite legacy firmware where the original author didn’t believe in .c files. You can still compile with only header files (but only if it’s all headers; you can't mix what he did with the normal structure), but I still wonder why he thought that was a good idea. He works for Facebook now.


That is a legitimate strategy for libraries so you don’t have to package everything. Not sure why you’d do it for firmware though.

https://en.m.wikipedia.org/wiki/Header-only

Maybe the author was cargo-culting that?


Often having everything in one compilation unit results in a smaller and/or faster binary. I’ve seen this happen even when using LTO. It’s a common practice when developing firmware for small micros.


At work I do amalgamation builds where I `#include` 10s of files into 4 .cpp files so I can do `make -j 4` and end up with 4 compilation units.

It's stupid. It messes up `using`. It doesn't do incremental rebuilds. But it does clean builds fast for the release script, and it saves me from setting up cross-compilation on a faster computer.


It's not just stupid, it breaks the language. Things will be valid when compiled together that are invalid C++ when compiled separately. Your #includes will become more and more wrong over time, and if you ever change the grouping of files all kinds of random unrelated stuff will break.

It's also a really great idea, for the reasons you give. This is how the Firefox tree is built, and it gives massive compile and link time speedups. (We call it "unified builds".)

Some runtime speedups too. I guess that's from better cross-file inlining?


WebKit does it too, and there’s a certain style you need to write your code in for it to work. (Generally, namespace everything.)


It wasn’t a small micro (16 bit dspic) and the code was far from fast. I think it was just cargo culting. He also removed all floats because “integer math is faster” but never tested it. Put floats back in and sped things up a couple orders of magnitude.

Another highlight, no main loop. Everything was in an interrupt. This allowed the code to run despite a very broken i2c driver because it was given a low enough priority that other tasks could preempt it while it was hanging indefinitely.


I recall encountering compilers that required this - there was no linker essentially. Sometimes that was all you had available.

Most of the manufacturers eventually just wrote gcc backends for their architectures, but the era of crappy proprietary compilers went on far too long in many cases.


Microchip XC16, broken in a few places but functional and definitely allowed for .c files.


A lot of embedded compilers are bad or at least old. You frequently only get C and not C++, so you can use header files as a sort of template. Write all your real code to use a peripheral in the header, but wrap all the function names, global variable names, and peripheral register names in special macros. Then, in the .c file, you can include the header file multiple times as long as before each include you define a handful of macros that rename all the function, global variable, and register names in the header so they refer to the correct peripheral instance.


From the detailed notes:

> "A pointer to a function serves to hide the name and source code of that function. Muddying the waters is not normally a purposeful routine in C programming, but with the security placed on software these days, there is an element of misdirection that seems to be growing." (p. 109)

It’s like a book by Calvin’s dad.


Ha!

More like a book by someone who fell for his tall tales!


So why is it that an editor would let these statements pass?

If, say, the publisher was Addison-Wesley or Prentice Hall, would they have made it into print?

I am sure that this is not the only programming book that made it to publication with a lack of meaningful editorial oversight.


The worst part is that the foreword of the second edition says the publisher hired a C programmer to review the book. That reviewer said "this book should not be published". They published it anyway!


I just downloaded a free sample of the Kindle edition. Here's an excerpt from the preface:

"Prior to publication of the first edition, the manuscript was reviewed by a professional C programmer hired by the publisher. This individual expressed a firm opinion that the book should not be published because “it offers nothing new -- nothing the C programmer cannot obtain from the documentation provided with C compilers by the software companies.”"

The author's claim is that there was a pressing need for a book that just explains C pointers rather than covering the whole language. (It would be interesting to see the rest of what the reviewer wrote.)


And that's how we get flat-earthers.

There is a line between crackpottery and groundbreaking work that sometimes is not visible even to deep experts in the subject (though to be fair, most of the time you only need high-school science to debunk most stuff).

The reviewer should have been more explicit: "this guy doesn't know what he's talking about" rather than "don't publish this book".


It sounds like a scam ad that tries to make it sound like they are revealing a well guarded secret.

"Professional C programmers hate him! A BASIC programmer discovers a clever way to use pointers..."


That's not a bad interpretation. People forget that there were other programming models floating around at that time. In my youth I constructed a whole system of indirect referencing with register variables on an HP-41C calculator; when I got to C a few years later I found it doing many of the same things. If I had been more invested in the calculator programming I might have ended up like this author, but I quickly moved to the C programming model.


> Another sticking point in this interpretation is Traister’s incomprehensible approach to writing functions that take a variable number of arguments. He does this by passing an arbitrary number of arguments to a function (the first being the number of arguments) and accessing them using offsets from the address of the first argument.

This is one of those things that works by accident for some compilers for some platforms, but because it works for the developer, they think it's a brilliant idea.

From what I remember, and I'm taxing my memory here, some of the early Turbo compilers did lay out arguments in this fashion, for DOS, when all optimisations were disabled. Most of the time.

I do recall seeing similar layouts in some of the programs I played around with at the time, and thinking it was a genius idea. For context, I was about seven or eight years old.


By "some" you mean "every stack-based right-to-left calling convention", which happens to be the norm for 16 and 32-bit x86. In contrast, varargs on 64-bit x86 is a mess: https://blog.nelhage.com/2010/10/amd64-and-va_arg/


That convention is nowhere near as common as you present it to be. I was working on systems 30 years ago where the first few arguments were passed in registers. Many such systems. "Everything on the stack" was the rule for register-poor x86, which is why people weaned on DOS/Windows remain attached to it, but even back then it was far from a universal standard.


A mess, but also fast.


Varargs was not part of the language originally. If you look at the source code for printf() from Unix v7, I think you'll find it accesses its arguments by indexing off the address of the first one. I don't remember when varargs came along, but it was some time in the 1980s, and this book was written in 1990.

Which is not to excuse the author. His description of how this works and why you would want to do it is just wrong.


A particular implementation of printf used that technique. There probably weren't a whole lot of different implementations at the time, but it's not part of the C specification.

<varargs.h> was a pre-Standard header that provided a consistent interface for accessing variable arguments. It was superseded by <stdarg.h> in the 1989 ANSI C standard.

gcc in particular dropped support for <varargs.h> a number of years ago.


It's for sale on Amazon. 48% of reviewers have given it 5 stars: https://www.amazon.com/Mastering-Pointers-Tools-Programming-...


48% of 7 reviews....

One of which only seems to praise the delivery time and condition of the book.


I looked at the vote breakdown, since 48% seems an impossible figure for 7 reviews. 29%, 43%, 57% – 2/7, 3/7, 4/7 – yes, but not 48%. It says "we don’t use a simple average. Instead, our system considers things like how recent a review is and if the reviewer bought the item on Amazon. It also analyzes reviews to verify trustworthiness."


Shouldn't we all go bomb the book, give it one-star ratings, and write reviews telling everyone not to buy it? After all, Kernighan called it “malpractice”.


My contribution to the genre of reviews of terrible books on C: https://www.amazon.com/gp/aw/reviews/B00NYBRH30/ref=cm_cr_un...

(“This is a terrible book ...”)


Wow, the part about hash tables is quite shocking. I've just thrown my copy of the book (which I've never read) into the trash bin. Thanks for writing your review!


> His terminology is all over the map and typically inaccurate, if not plainly wrong.

> Expressions “return a value.”

> “A pointer is a special variable that returns the address of a memory location.”

I don't see how these are clearly wrong? I guess the 'returning' is problematic?

Somehow I feel this was written with an air of superiority; the author doesn't expand on anything. I get that feeling a lot with articles about C.


> Expressions “return a value.”

Yes, "returning" is problematic, because that has a very specific meaning in the context of a function call. But even if we forgive the author the inaccurate terminology, it is also an incorrect statement. You can evaluate an expression. But in C in particular, an expression does not necessarily evaluate to a value.

A somewhat better statement might be:

"Expressions can be evaluated. Some expressions produce a result value."

This statement can probably be further improved upon; it definitely still does not capture everything that expressions are in C, such as being able to cause side effects! That might not be evident, but it can certainly be important. Then again, I am not writing a book about C.

> “A pointer is a special variable that returns the address of a memory location.”

Indeed "returning" is problematic here. A pointer does not return anything, it points to something.

> Somehow I feel this was written with an air of superiority, the author doesn't expand on anything. I get that feeling a lot with articles about c

If you're going to write a book about C in order to teach programmers how to work with pointers, better explain things with the correct terminology.

If this were a journal, then of course you can forgive the inaccuracies. But it's a published book that supposedly teaches certain concepts to readers. The author of the blog post says as much in their notes:

"This provides some insight into the mind of the author: he's just picking up concepts and terms as he learns about them and tossing them in without any regard for the reader. This book is pretty much his journal — that somehow became a book with two editions."

> the author doesn't expand on anything

The author expands on a lot more than I would have the patience for; see the notes [0]. Why certain things are incorrect, misleading or dangerous does require some C knowledge. The article + notes won't be a place to learn more about C; but neither is the book that's being reviewed.

[0]: https://wozniak.ca/blog/2018/06/25/1/notes.html


There's an extensive list of everything wrong with the book but not a single point on how to do it better.

If you show the wrong way to do things, it's always good to give a few pointers to a better path so part of your reader base doesn't feel excluded.


Although, if you are writing a book to teach programming, you can't suddenly drop stuff like "evaluating expressions" on readers either. It might be meaningless to them.

For example. If they come from a 'toy' programming background where they just call functions that return values, set variables, jump to labels, and do some math... Even if they're experts at that task, the words "evaluating expressions" might likely have no meaning to them at all. You'll have to explain first with concepts they already understand.

In this case, it seems the writer might've come from such a background. (And assumed his contemporaries were in a similar mindset.) :)


Sure, it's sloppy, but in my experience at least it's very widespread and normal terminology.

Colloquially, an expression is something on the right hand side of `=` or in parentheses, and that returns a value. Most importantly, "returns" implies that some kind of computation may go on, maybe with side effects. I'm aware that the C standard uses the terms differently, but there is usually a difference between the mental model and the C standard, or the parser.


An expression evaluates to a value. That value may be an lvalue, and it can be present on the left hand side of `=`.


The first one is a bit odd, and in C expressions specifically don’t always return values: void functions.

The second one reads really strange to me, and variables don’t return things.


I've been writing C since the 1980s. Sorry, this book is not just bad but astonishingly bad.

Your examples are just wrong. "Returning" has a very specific meaning in programming. An expression does not "return a value". A pointer very very much does not "return the address of a memory location".

The code example given was utterly awful. If you cannot see why that code example is almost guaranteed to cause problems, you might not want to continue in this discussion...


It would be a great help if you use the correct terms instead of just pointing out the wrong ones.


> If you cannot see why that code example is almost guaranteed to cause problems, you might not want to continue in this discussion...

Why not? Wouldn't it be especially important to educate these people? Overly ignorant people sure can be annoying, but at least give people some information on the right concepts instead of only derision.


What's the verb in this case then? "An expression ___ a value."


Evaluates to?


Ah, yes, that's the one. Thanks. :)


An expression yields a value.


As someone who's not very well versed in C, can someone please explain how I should interpret this function signature?

    char *combine(s, t)
    char *s, *t;
    {


This is an old way of writing the type of parameters to a function. I believe the modern equivalent is

  char* combine(char* s, char* t) { ... }
which means combine takes in two pointers-to-char and returns a pointer-to-char.

Opinions differ on whether the * should be next to the type or next to the identifier. I prefer putting it next to the type.


When I was learning C a couple decades ago, pointer syntax never made any sense to me until something suddenly clicked with “put it directly next to the type name and not next to the variable name”. Ever since then I have not been able to figure out why it is commonly taught next to the variable name.

I mean, short of enabling declarations like:

    char *a, *b;
But I have long since found the tradeoff to be worth the syntactic clarity.


The best argument against this, and for leaving the * by the variable name, is this declaration:

  char* a, b;
Now a has type char * but b is just char. It’s probably not what the author meant and it’s definitely ambiguous even if it was intentional. Better to write:

  char b, *a;
Or, if you meant it this way:

  char *a, *b;
“Well, don’t declare multiple variables on the same line,” you respond. Sure, that’s good advice too. But in mixed, undisciplined, or just old code, it’s an easy trap to fall into.


The question is, why doesn't C make

    char* a, b;
apply the char* type to both? (That is, why didn't they design it that way?)

I assume there was some reason originally, but it's made everything a bit more confusing ever since for a lot of people. :/

Edit: Apparently it's so declaration mirrors use. Not a good enough reason IMO. But plenty of languages have warts and bad choices that get brought forth. I'm a Perl dev, so I speak from experience (even if I think it's not nearly as bad as most people make out).


In the olden days of C, pointers were not considered types on their own (you cannot have just a pointer, a pointer must point ‘to’ something, grammatically speaking). The type of the object of that declaration is a char. So it’s not really read as ‘I’m declaring a char-pointer called a’, it’s more along the lines of ‘I’m declaring an unnamed char, which will be accessed when one dereferences a’. Hence the * for dereferencing.


This is why a lot of C people do one definition per line for any non-trivial variables.


I think this is why I abandoned a hobby project. I could not figure out why it crashed!


Another good argument is trying to define a function pointer without typedefs.


> Ever since then I have not been able to figure out why it is commonly taught next to the variable name.

C does this very cute (read: horrifyingly unintuitive) thing where the type reads how it's used. So "char ⋆a" is written so because "⋆a" is a "char", i.e. pointer declarations mimic dereferencing, and similarly, function pointer declarations mimic function application, and array declarations mimic indexing into the array.


I found it much easier to understand than Pascal. It just clicked. Later there was a moment of confusion with "char* a, b".

It helped that I learned from the K&R C book. The Windows API and its code examples are horrific, like another language.

https://docs.microsoft.com/en-us/windows/win32/learnwin32/wi...

https://docs.microsoft.com/en-us/windows/win32/learnwin32/ma...


The type of a is

  char *
. Some people (me included) find it clearer to write the type, followed by the variable name. So just as you’d write

  int a
, you’d write

  char* a
.

The fly in this ointment is that C type syntax doesn’t want to work that way. It’s designed to make the definition of the variable look like the use of the variable. A clever idea, but unlike nearly every other language, which BTW is why I think you should really use typedefs for any type at all complicated in C.

For example, the type-then-variable style falls down if you need to declare an array

  int foo[4]
or a pointer to a function returning a pointer to an int

  int *(*a)(void)
(...right?).

So I’m perfectly willing to do it the “C way”, I just find it more readable to do it the other way unless it just won’t work (and then prefer to use typedefs to make it work anyway).

Note that this was rethought for Go syntax.


It teaches declaration-mirrors-use

  char *a         => *a is a char
  char a[3]       => a[i] is a char
  char f(char)    => f(c) is a char
  char (*f)(char) => (*f)(c) is a char (short form: f(c))


But that doesn't mean it's understandable!


With the minor problem that `a[3]` isn't valid.


Depends how you use it (&a[3], sizeof(a[3])). But its type is still char.


IIRC, both of those are undefined.


&a[3] is allowed, it's a one-past-the-end-of-the-array pointer. &a[4] would be UB (if it were evaluated).

sizeof(a[3]) is not evaluating a[3], so it also isn't UB.


In my own projects, I like to put the * on its own, like so:

  int const * const i;
This is nice because you can naturally read the signature from right to left. "i" is a constant pointer to a constant integer. It's a little unconventional, but I think it's a really clear way to convey the types.


It's that way because in C "declaration follows usage".

In the above, you're declaring the types of *a and *b to be char, making a and b pointers to char.

(EDIT: how do you escape * properly inline with other text?)


I find it to be clear to have the * with the type. Otherwise it can be confused (not by the compiler but the reader) that you are dereferencing the variable a or b.


Take a common idiom like

   main(int argc, char **argv)
These all produce the same result

   main(int argc, char* *argv)

   main(int argc, char** argv)

   main(int argc, char* argv[])


In C,

    char * a;
declares a variable, `a`, that, when dereferenced using the `*` operator, will yield a char.

In C++, the same line declares a variable, `a`, of type `pointer-to-char`.

C cuddles the asterisk up to the variable name to reflect use. C++ cuddles it up to the type because it's a part of the type. Opinions differ on whether C-style or C++-style is better, but a lot of cargo-cult programmers don't bother adjusting the style of the code snippet they paste out of Stack Exchange, so you see a lot of mixtures.


Is it equivalent? Because that seems to be the only difference between the two editions for the newprint function, and the author says one of them happens to compile.

https://wozniak.ca/blog/2018/06/25/1/code.html


Yes, the declarations are equivalent.

The difference is the function prototype `void newprint(char *, int);` at the start, which is missing in the second example. With the forward declaration, the compiler knows what arguments newprint takes and errors out if you pass something else. C is compiled top to bottom, so in the older version of the example the compiler has no way of knowing what number of arguments the function takes at the point where it is called. In (not so) old versions of C that implicitly declared a function taking whatever you passed it.


To add some flavor to the other answers, in K&R C the parameter passing was very uniform: everything (save doubles) got promoted to word-size and pushed on the stack. chars (and shorts) became ints. You couldn't pass aggregates. doubles were a problem, but were rare.

Because the parameter passing was uniform, you didn't need to inspect anything at the call site. All functions get called the same, so just push your params and call it. Types were for the callee and were optional. This is what powered printf, surely the highest expression of K&R C.

In modern C-lineage style, we enumerate our formal parameters, and variadic functions are awkward and rare. But LISP and JavaScript embrace them; perhaps C could have gone down a different path.


As far as I know, the promotion to word size (now 32 bit) still happens. Also if you have more than a fixed number of params (defined by the platform ABI), parameters are still pushed on the stack. You can't push 8 or 16 bit values on the stack. The stack pointer is always a multiple of 4.

The interesting thing is that with K&R C a function declaration/prototype is optional. That means you can call a function that the compiler has not even seen. Mismatches in parameter/return types (which are optional and default to int in declarations as well) are normally not a problem, because of the aforementioned promotion. If you have the declaration, then the compiler will at least let you know about wrong number of arguments.


As the others already said, it's the old way of declaring the arguments type.

But I think it's good to know other pitfalls:

- The "default" return type of a function is int

- A function that does not use void to say that it takes no arguments just has an unspecified number of parameters

See this example: https://ideone.com/GfwS4O

You could in theory use the address of a local variable (allocated on the stack) to access the arguments passed to the function directly on the stack.

But this is just madness... Or is it? Isn't C just assembly with a "nicer" syntax?


This is the ugly old-style C function signature, obsolete for a long time now (more than 20 years).

Here is the "modern" equivalent:

char combine(char s, char* t) {


Ugly? I rather miss the traditional style of argument passing. My first real paid job used Microware's OS-9 compiler which only supported the traditional syntax, did absolutely no optimisation that I could discern, and made you use "register" if you wanted a local variable assigned to a register instead of always being spilled to the stack. (In fact looking back at it now I wonder if it was just a rebadged PCC).

As an aside it's not always more verbose because you can group parameters by type, eg:

  int foo (c1, i1, c2, i2)
    char c1, c2;
    register int i1, i2;
  {
  ...


  char *combine(char *s, char *t)


That is pre-ANSI C. The parameter types are declared between the end of the argument list and the start of the body, instead of inside the argument list.


Was it the ANSI Std that did away with this style? Can't recall.


ANSI adopted function prototypes from C++ in C89. Originally C had no type checking on function parameters. All arguments were promoted to word width and pushed to the stack. If you gave a function fewer arguments it would just read garbage off the end of the stack. If you gave it excess arguments they were silently ignored.


Sort of. ANSI introduced prototypes, but old-style function declarations and definitions, though they were declared obsolescent, remained valid (i.e., any conforming C compiler is still required to support them).

As of the 2011 standard, that's still the case. I think that C2X will finally remove them.


that's a K&R style function definition.

modern syntax is:

    char *Combine(char *S, char *T) {

        ...
    }
which means the Combine function returns a string (char pointer), and takes as arguments two strings, S and T.


In the same vein we have "The Annotated Annotated C Standard": https://www.davros.org/c/schildt.html

A blast from the comp.lang.c past.


Schildt is one author who triggers deep revulsion in me. Some time around 1988 I bought one of his books on C, and found it was mostly generic padding about programming in general. Later I somehow managed to buy another book by Schildt and found it contained 80% the same text.


And "C: The Complete Nonsense", a review of another Herbert Schildt book.

https://www.seebs.net/c/c_tcn4e.html


Actually somewhat interesting is The New C Standard: An Economic and Cultural Commentary - http://www.knosof.co.uk/cbook/cbook.html


> If you browse search results for other books by Traister you’ll find a lot of questionable sounding titles ... [snip] ... Cave Exploring (1983)

Yikes, that could kill somebody if he takes the same approach... otoh it looks like he was a scuba instructor?

https://trackbill.com/bill/virginia-house-resolution-92-comm...


The single Amazon review of Cave Exploring has a familiar echo: “Recommends that female cavers don't wear bras. So bad it is good. I bought it only because the National Speleological Society had it withdrawn from sale because it encouraged cavers to mark their route with a ball of string.”

Bob actually sounds like a guy who had many talents. Writing books perhaps wasn’t one of them. https://www.findagrave.com/memorial/29007415/robert-joseph-t...


Apparently he also wrote a book about hand loading ammunition,[0] and I wonder if that’s full of dangerous advice too.

[0] https://www.thriftbooks.com/w/complete-reloading-guide_john-...


Instead of worshipping "The C Programming Language", why don't we appreciate how Kernighan only used the bad code example but didn't single out the book and author?

Honestly, I feel like I am witnessing a classroom teasing act, with all the comments just resonating. Do we really feel such an urge that if we don't tease this old author (not just the book or code) loudly enough, it will lead new generations of young programmers astray?


Previous discussion from a couple years ago: https://news.ycombinator.com/item?id=17397823


> In the preface of the second edition it says that the first edition was reviewed “by a professional C programmer hired by the publisher.” That programmer said it should not be published. That programmer was right, but the publisher went ahead and published it anyway.

Reminds me of https://m.xkcd.com/1096/


These books are available on libgen if you want to see them for yourself:

https://b-ok.xyz/book/2368114/60fa20 https://b-ok.xyz/book/2368115/f2ecc8


Is Robert Traister an actual human being? Or just a pen name that the publisher puts on quickie books? Or a name, like Alan Smithee, applied to books that have gone wrong?


I'd love to write a version of this book for Rust with the examples translated as close as possible within the constraints of the two languages.


I wonder if the author should feel honored or dejected that Woz took the time to read the book and write a detailed article explaining why it's a terrible book.

As for how the author should feel: I'm sure that if he found out about Woz's review, he would dismiss it by saying "who is this Wozniak guy anyways, he probably knows nothing".


It’s not that Wozniak.


Hmmm...

> With BASIC, the key thing to know about most implementations at the time is that there were no functions and no scope aside from the global scope.[2]

Take a look at DEF FN in the Applesoft Basic Manual:

https://www.landsnail.com/a2ref.htm

(To be fair, it wasn't used that much, but it was there.)


I don't understand; you quote the reference to the footnote but haven't read it?


I think you do understand.


No. Apparently you want me to spell it out; okay. Your comment is entirely unnecessary because what you write is already in the footnote. Why did you include the "[2]" in your quote, indicating that you saw there was a footnote, without reading it? Or did you read it and post anyway? Why you would do that is even harder to understand.

Also, making suggestive remarks is not very helpful.


Author mentions in the footnote


Is the author still alive? These books were published in the early 90s, which wasn’t _that_ long ago. I’m not advocating a lynch mob on an old man, of course, but his story/rebuttal/repentance might be interesting.


"At the bottom of the heap in a pointer stack is the ultimate object."


Hah, when I read this, I envisioned a movie poster of a B-class "cyber" action movie and this line under a picture of people holding keyboards and pistols.


Damn. Those code examples are complete insanity. Half won't even compile. It's like if I tried to write a textbook on how to do neurosurgery.


I mean, how hard can it be? You just cut open their head, mess around it a bit…


C beginner here. C abstractions don't make sense if you don't grok what the stack and the heap are and how they work.


This stuff is just amazing. Was this guy, like an adult when he wrote this book? I knew more C when I was 12.


this book is a travistery



