K&R C has a beautiful, elegant simplicity to it, which later ANSI C has lost --- the idea that everything is a machine word unless specified otherwise means that you get to do things like this:
add(a, b) { return a+b; }
fib(a) { return (a <= 1) ? 1 : (fib(a-1) + fib(a-2)); }
That's totally valid K&R C. The lack of boilerplate makes abstraction really cheap, conceptually --- Forth programmers will recognise the advantage here. You can write real programs out of tiny one-line definitions like the above.
Of course, ANSI C is better in just about every other way, but I still regret losing the stripped-down simplicity of K&R C.
I wouldn't call that C code, I would call that a language defined by the C macro system that also happens to accept C code if you choose to sprinkle that in.
ANSI C could be cleaner than it is. I regularly lament that I must type
int func(long_type_name a, long_type_name b, long_type_name c) { ...
when I really, really should be able to type
int func(long_type_name a, b, c) { ...
It would be nice to get this back and it would take away some useless noise. Sadly, this is now seen as a feature and languages like D insist on the repeated type name.
I'm sure there are other places where ANSI C could be made cleaner, but perhaps at the cost of backwards compatibility.
I don't know about the later versions of the standard, but in C89, using the old style meant you did NOT get the type checking of the prototype-style declarations. If there is no semantic difference these days, then yes, that would be a solution. I'm on the road and don't have the C11 standard handy, so I'll check later what the deal is.
Sorry, I didn't mean that literally, I meant getting equivalent functionality. I'd be fine with semicolons, Renderman Shading Language uses that syntax.
> In both cases, the compiler supposedly looks up the name on the right to figure out which struct type you intended. So foo->bar if foo is an integer would do something like ((Foo)foo)->bar where Foo is a struct that contains a member named bar, and foo.bar would be like ((Foo)&foo)->bar. The text doesn’t say how ambiguity is resolved (i.e., if there are multiple structs that have a member with the given name).
This is because in pre-ANSI C the names of struct fields were not internal to their enclosing struct but global identifiers that simply represented a type and offset. There was no ambiguity because different structs could not have the same member names. As raimue pointed out this is why some *nix struct members are still prefixed with a namespace like st_ for struct stat.
I can't find a reference, but I was under the impression that the global struct field thing was a pre K&R C feature, and that actual K&R C had a per-type namespace for struct fields, just like modern C.
Can anyone verify this? I can't find a copy of K&R C online to check with.
I think in the original Microsoft C compiler (maybe others?) bar could be a direct member or any nested sub-member of Foo (so you could say foo->bar instead of foo->middle.bar).
And even with all these warts, K&R C is a good start for learning or re-learning C, IMO. To grok C you don't need to know the exact syntax. You only need to realize that everything is either a byte sequence or a pointer to a byte sequence... itself represented by a byte sequence. This includes obvious things like arrays, "strings", structs, and unions, but also functions themselves. Once you internalize this, you know C. Of course you'd also have to learn another language, the C preprocessor, but it's relatively small and only a few of its directives are needed to make productive use of it. So don't get hung up on the outdated syntax; just read the book. You can probably buy a used copy for a few bucks or borrow it from your local library, then spend an evening reading it.
Then learn to use make (Makefile) and how to write a ./configure script, or better yet learn to use autotools. Oh, you don't want to write all your code yourself, so you'd better use some external libraries. For those you'd better understand your preprocessor, compiler, assembler, and linker. Code not working? Get yourself familiar with a debugger. Static or runtime code analysis? Yes please.
Find an open source project to support. Get familiar with the code base and the community. Write a patch, send it, get it refused, work with the community, and finally get your patch in.
See, it's not that hard learning "C" and becoming a "C developer"... 21 days at most!
I've used autotools and I am definitely not a fan. A 10 line Makefile is probably sufficient for most projects and it'll take you 20 minutes to get one working if you take a 15 minute break.
External libraries are a reality in any programming language. The popular ones that come with your OS of choice are likely very well documented. You don't need to understand the preprocessor, compiler, assembler, and linker to use them. In fact, I'd argue that for most programming you only need to know how to invoke the compiler as the rest of the steps are already automated. K&R C will teach you enough about how to use header files to be able to use external libraries.
A debugger is useful, but something like gdb is not necessary to write C. In fact, I'd say Valgrind is far more useful, and once again, it'll take 20 minutes to learn.
As for supporting open source projects, why start there? Do you learn Python and immediately jump to submitting patches to Django? No, you write your own code and learn from that. Same here, as you are presumably learning C/JavaScript/Ruby/Erlang/CL/etc. to use it, not to add a line to your resume that you are a contributor to some open source project.
No, you won't become an expert developer in 21 days. But since C is such a small language with so few features I'd argue that learning how to use it is more about getting the memory structure right in your head than about learning syntax. By comparison, Python (my primary language lately) has many more types and constructs to worry about.
The complexity of a language is not proportional to the number of builtin types it has. If there is a builtin to do one particular thing, and you just use it for that, that's simple. You don't have to build it up yourself by tricky combinations of tricky components (easily exposing you to severe mistakes like buffer overflows unless you already have a good vocabulary of idioms down cold). You don't have to worry about each available type unless you are using it, so it is comparable to a separate library in C.
Autotools has a steep learning curve and makes a good many project maintainers nervous because it's not obvious how it works, and it's difficult to cut through all the abstraction layers to figure out what goes where. CMake is usually a pretty good alternative to puzzling out m4 scripts with autotools. CMake is usually faster, too.
Valgrind is good but it doesn't work on all platforms and it again needs more examples and howtos.
DTrace could also use some more blog howtos.
Powerful tools need to be both clearly and succinctly explained and sufficiently configurable in order to be widely usable. The clear conveyance of system behavior should be a top priority for code and supporting documentation.
"void didn’t exist in the book (although it did exist at some point before C89; see here, section 2.1)"
-> There are many occurrences of void in the document above.
"It appears that stdio.h was the only header that existed"
-> From the document above many other headers are available (string.h, math.h, etc.)
"Note that return 0; was not necessary." (on the main function)
-> return 0 is not necessary in standard C either (even if the semantics of omitting it differ between C89 and C99).
My paper copy of K&R1 (1978) does not mention void, nor does it mention standard headers other than stdio.h. The linked document may be a later draft (and it's almost certainly a copyright violation).
Prior to C99, reaching the closing "}" of main() caused the program to return an undefined status to the calling environment. In C99 and later (as in C++), reaching the closing "}" does an implicit "return 0;".
I had some trouble believing that C99 says main() implicitly returns 0, so for future reference: it says so in 5.1.2.2.3, even including wording about the closing "}" (which seems to me a slightly quirky way to describe what it is describing).
As for stdio.h being the only header: it makes sense, as headers like stdlib.h and string.h contain mostly function declarations, which you don't strictly need in K&R C. Some 1980s UNIX C code examples in books I've seen don't #include anything (and there are even some that declare things like FILE and errno directly without including them from anywhere).
The author's amazement at the oblique reference to teletype outputs is hilarious. Kids these days forget that much of Unix (and most other things) were created without the benefit of "glass teletypes" (aka screens).
As always, "kids" == younger than me. But no, I've never used a teleprinter for anything serious.
Still, the evidence of their recent departure is all around us. vi (short for "visual", because the idea of a visual editor was still pretty novel) didn't even exist until 1976.
IIRC, "backspace" as an overstrike compose character continued to be supported for quite a while after the death of teleprinters. Some BBSes depended on it, for example.
>In those days, a directory in UNIX was a file that contains a list of file names and some indication of where they are located (p. 169). There was no opendir or readdir; you just opened the directory as a file and read a sequence of struct direct objects directly. Example is given on page 172. You can’t do this in modern Unix-like systems, in case you were wondering.
What's the most recent file system where this is possible?
Actually, I believe you can still do this with some of the BSDs. I currently don't have a BSD system around to test it on but I remember doing it before, and search results agree that it's possible (and a feature, not a bug):
I don't know the most recent, but it's probably been a while since it was usable.
The first UNIX I programmed on (SVR2, circa 85-86) still worked that way (no opendir/readdir) since it had the classic V7 filesystem. Each directory entry was just a fixed-size 16 byte chunk; 2 bytes for the inode number and 14 for the filename.
Of course, this meant a limit of 64K files per filesystem -- that was still OK for us (40MB would be a large disk) but would already have caused problems for bigger servers. It was also the cause of the "classic" UNIX 14-character filename limit.
By that time BSD 4.2's FFS filesystem had already removed both of these limits (which I assume is when opendir() and friends arrived). According to Wikipedia, that made it to the AT&T world with SVR4 in the late 80s, which sounds correct.
As someone else mentioned some UNIXes may still let you open() a directory. However, since we're long past the days of there being just one "UNIX filesystem" it would be impossible to interpret its contents in a portable manner.
I find it interesting to see what C was like even earlier. Notably there was an incompatible break between the 6th and 7th editions, when the compound assignment operators were flipped. It used to be "=+" instead of "+=". I guess Ritchie figured it'd be better to avoid ambiguity with expressions like "x=-1;".
Just imagine those changes in a modern language. Breaking changes? Oh my god, I have to rewrite my software; no way I'll ever use that language again. People were more patient back then.
I would think it was more because it was way simpler to adjust software. People mostly used their own software; they didn't have millions of lines of code hanging around (they could probably grep all existing source code in a few hours, if needed), and the authors of any software that needed changes were all available, often in the same room. Even if the languages hadn't changed, they would have had to update their programs anyway, because they made breaking changes to their OSes every day (that may be slightly exaggerated).
Also, what language would you switch to on your Unix installation?
I actually really like the old style function definitions:
foo(a, b)
int a, b;
{
}
I was mildly disappointed when this was [relatively recently - maybe 10 years ago?] taken out of g++. (Not sure what its status has been in the language standard.) Sure, it's a WTF to most people. But if you have really long type names used more than once, it actually saves typing.
> It also notes that the equal sign before the initializer in a declaration was not present, so int x 1; would define x and initialize it with the value 1.
That isn't so scary; the parenthesized version,
int x(1)
is still seen all the time, in places like C++ constructor member initializers. It's how you actually declaratively initialize a type with that value, rather than initializing the memory with a default constructor and then getting the assignment operator called on the resulting object.
The real issue is that the parentheses in the above used to just be part of the expression, rather than required, so
int x 1
would have worked just as well. I'm fairly certain I'm glad of the change, but the previous state of affairs wasn't crazy.
I heard a story that the original indentation style that K&R wanted to use was the same as Allman, but then the printers chose to do the current K&R style to save paper.
Does any one know if that's true, or just an urban legend?
Looks as valid to me today as when I first read K&R C 25 years ago. In short, K&R is terrible. If you hew to it, first you'll end up with code that is unsafe and sketchy. Second, you won't learn about the modern features in C, which make writing safe, non-sketchy code a lot easier.
I learned this dialect of C first (though I switched to ANSI C very shortly afterward) so none of this seems all that new or surprising. Things were different in those days. Man, I feel old.
I wasn't aware of that last point. Is this the answer why we have uniquely prefixed struct members in POSIX? Such as st_* in struct stat or d_* in struct dirent? Is that just a coincidence?
Another thing it allows is to #define a "structure member" to be something more complex. For example, on Linux st_atime (as in struct stat) is #defined to st_atim.tv_sec (and likewise for ctime and mtime; st_atim/st_ctim/st_mtim are struct timespec, with nanosecond precision).
I don't think so. I'm going from memory here, but I think in pre-K&R C it wasn't assumed that every struct/union would be given its own separate namespace.
Depends on your definition of 'needed' --- if you don't return 0 from main, your program will exit with an undefined status code, which means that if you use it from a shell script or makefile weird things might happen because your program may be considered to have failed.