K&R C (2014) (spin0r.wordpress.com)
136 points by bluetomcat on Sept 3, 2015 | 66 comments



K&R C has a beautiful, elegant simplicity to it, which later ANSI C has lost --- the idea that everything is a machine word unless specified otherwise means that you get to do things like this:

    add(a, b) { return a+b; }
    fib(a) { return (a <= 1) ? 1 : (fib(a-1) + fib(a-2)); }
That's totally valid K&R C. The lack of boilerplate makes abstraction really cheap, conceptually --- Forth programmers will recognise the advantage here. You can write real programs out of tiny one-line definitions like the above.

Of course, ANSI C is better in just about every other way, but I still regret losing the stripped-down simplicity of K&R C.
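
For comparison, here is a sketch of the same two definitions with the types written out the way ANSI C expects. The `int` annotations are what K&R C assumed by default; `example_usage` is a hypothetical helper added only to show the calls.

```c
#include <assert.h>

/* Same functions as above, with the implicit `int` made explicit. */
int add(int a, int b) { return a + b; }

/* Keeps the original's convention that fib(0) == fib(1) == 1. */
int fib(int a) { return (a <= 1) ? 1 : (fib(a - 1) + fib(a - 2)); }

/* Hypothetical usage helper, not part of the original comment. */
void example_usage(void)
{
    assert(add(2, 3) == 5);
    assert(fib(5) == 8);   /* sequence: 1, 1, 2, 3, 5, 8 */
}
```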


> The lack of boilerplate makes abstraction really cheap, conceptually

Well, as long as you don't care about abstracting over anything other than int.


    >   fib(a) { return (a <= 1) ? 1 : (fib(a-1) + fib(a-2); }
    >
    > That's totally valid K&R C.
Apart from, presumably, the missing closing parenthesis ;)


* cough *

Fixed. Thanks.


Ever seen C code by APL programmers?



There could be a movie about the steps needed to reach that state.

ps: I have some javascript bits that are almost as dense... Hum.


I wouldn't call that C code, I would call that a language defined by the C macro system that also happens to accept C code if you choose to sprinkle that in.


ANSI C could be cleaner than it is. I regularly lament that I must type

  int func(long_type_name a, long_type_name b, long_type_name c) { ...
when I really, really should be able to type

  int func(long_type_name a, b, c) { ...
It would be nice to get this back and it would take away some useless noise. Sadly, this is now seen as a feature and languages like D insist on the repeated type name.

I'm sure there are other places where ANSI C could be made cleaner, but perhaps at the cost of backwards compatibility.


> int func(long_type_name a, long_type_name b, long_type_name c) { ...

The K&R style variable declarations are still also part of ANSI C, so you could write:

    int func(a, b, c)
      long_type_name a, b, c;
    { ...


I don't know about the later versions of the standard, but in C89, using the old style meant you did NOT get the type checking of the prototype style declarations. If there is no semantic difference these days, then yes, that would be a solution. I'm on the road and don't have the C11 standard handy, so I'll check later what the deal is.
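
A minimal sketch of the difference in question, assuming a hypothetical function `half`. With a prototype in scope the compiler converts each argument to the declared parameter type. With only an old-style definition visible, the arguments get the default promotions and nothing more, and a mismatch is undefined behavior. The old-style form is shown as a comment because C23 no longer accepts it.

```c
#include <assert.h>

/* Prototype-style definition: the parameter type is part of the interface. */
double half(double x)
{
    return x / 2.0;
}

/*
 * Old-style equivalent, for comparison only:
 *
 *     double half(x)
 *     double x;
 *     { return x / 2.0; }
 *
 * With just that form visible, half(3) would pass an int where a double
 * is expected: no conversion, and no diagnostic required.
 */

/* Hypothetical usage helper. */
void example_usage(void)
{
    /* With the prototype in scope, the int 3 is converted to 3.0. */
    assert(half(3) == 1.5);
}
```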


Actually that's how Go does it,

   func(a, b, c long_type_name) int {...


> get this back

That never existed. Syntactically it could work; you would need semicolons in there:

  int func(long_type_name a, b, c; int d);
similar to struct/enum member declarations.

Speaking of which: don't have long type names in C programs, and simplify APIs with structs rather than large numbers of parameters. :)
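
As a rough illustration of the struct suggestion (the `rect` type and `rect_area` function are made up for this sketch), member declarations already allow several names per type, so bundling parameters into a struct gives the terse form for free:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: one struct instead of four separate parameters. */
struct rect {
    int x, y;            /* top-left corner */
    int width, height;   /* one declaration names several members */
};

int rect_area(const struct rect *r)
{
    assert(r != NULL);
    return r->width * r->height;
}
```

A caller would build a `struct rect` once and pass its address, rather than repeating the type for every argument.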


It exists with K&R parameters:

    void foo(a,b,c)
    long_type_name a,b,c;
    {
    ...
    }


Sorry, I didn't mean that literally, I meant getting equivalent functionality. I'd be fine with semicolons, Renderman Shading Language uses that syntax.


It is actually valid in ANSI C89 too, although it does look rather odd to those accustomed to seeing the types.

http://stackoverflow.com/questions/5885156/no-defined-type-o...


Sounds like you might enjoy Haskell or similar :P


> In both cases, the compiler supposedly looks up the name on the right to figure out which struct type you intended. So foo->bar if foo is an integer would do something like ((Foo)foo)->bar where Foo is a struct that contains a member named bar, and foo.bar would be like ((Foo)&foo)->bar. The text doesn’t say how ambiguity is resolved (i.e., if there are multiple structs that have a member with the given name).

This is because in pre-ANSI C the names of struct fields were not internal to their enclosing struct but global identifiers that simply represented a type and offset. There was no ambiguity because different structs could not have the same member names. As raimue pointed out this is why some *nix struct members are still prefixed with a namespace like st_ for struct stat.
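
A sketch of the "type and offset" idea in modern C, using hypothetical structs `proc` and `dev`. Today both may use the name `flags` because each struct has its own member namespace. The helper `int_at_offset` imitates what a global member name amounted to: read an int at a fixed offset from whatever address is given.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two unrelated structs sharing a member name, legal in modern C. */
struct proc { int pid;   int flags; };
struct dev  { int flags; int unit;  };

/* Roughly what a global member name meant: "an int at this byte offset". */
int int_at_offset(const void *base, size_t offset)
{
    int value;

    assert(base != NULL);
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

Calling `int_at_offset(&p, offsetof(struct proc, flags))` reads `p.flags`. The two `flags` members sit at different offsets, which is exactly the case a single global name could not express.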


I can't find a reference, but I was under the impression that the global struct field thing was a pre K&R C feature, and that actual K&R C had a per-type namespace for struct fields, just like modern C.

Can anyone verify this? I can't find a copy of K&R C online to check with.


I think in the original Microsoft C compiler (maybe others?) bar had to be a member or any sub-member of Foo (so you could say foo->bar instead of foo->middle.bar).


Interesting. Sort of similar to Haskell record fields (which are really just global functions).


And even with all these warts, K&R C is a good start for learning or re-learning C IMO. To grok C you don't need to know the exact syntax. You only need to realize that everything is either a byte sequence or a pointer to a byte sequence... represented by a byte sequence. This includes obvious things like arrays or "strings" or structs or unions, but also functions themselves. Once you internalize this you know C. Of course you'd also have to learn another language: the C preprocessor, but it's relatively small and only a few of its instructions would be needed to make productive use of it. So don't get hung up on the outdated syntax, and instead just read the book. You can probably buy a used copy for a few bucks or borrow it from your local library, then spend an evening reading it.
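
To make the "everything is a byte sequence" view concrete for data objects, here is a small sketch with a hypothetical `point` struct. It copies the object's bytes out and rebuilds an equal object from them. The byte order inside the buffer depends on the machine, so the sketch only relies on the round trip.

```c
#include <assert.h>
#include <string.h>

struct point { int x; int y; };   /* hypothetical example type */

/* Copy the raw bytes of a struct into a caller-supplied buffer
 * of at least sizeof(struct point) bytes. */
void point_to_bytes(const struct point *p, unsigned char *out)
{
    assert(p != NULL && out != NULL);
    memcpy(out, p, sizeof *p);
}

/* Rebuild a struct from those same bytes. */
struct point point_from_bytes(const unsigned char *in)
{
    struct point p;

    assert(in != NULL);
    memcpy(&p, in, sizeof p);
    return p;
}
```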


Then learn to use make (Makefile) and how to write a ./configure script, or better yet learn to use autotools. Oh you don't want to write all your code by yourself, so better use some external libraries. For that you better understand your preprocessor, compiler, assembler and linker. Code not working?... get yourself familiar with a debugger. Static or Runtime code analysis?... yes please.

Find an open source project to support. Get familiar with the code base and the community. Write a patch, send it, get it refused, work with the community, finally get your patch in.

See, it's not that hard learning "C" and becoming a "C developer"... 21 days at most!


I've used autotools and I am definitely not a fan. A 10 line Makefile is probably sufficient for most projects and it'll take you 20 minutes to get one working if you take a 15 minute break.

External libraries are a reality in any programming language. The popular ones that come with your OS of choice are likely very well documented. You don't need to understand the preprocessor, compiler, assembler, and linker to use them. In fact, I'd argue that for most programming you only need to know how to invoke the compiler as the rest of the steps are already automated. K&R C will teach you enough about how to use header files to be able to use external libraries.

A debugger is useful, but something like gdb is not necessary to write C. In fact, I'd say Valgrind is far more useful, and once again, it'll take 20 minutes to learn.

As for supporting open source projects, why start there? Do you learn Python and immediately jump to submitting patches to Django? No, you write your own code and learn from that. Same here, as you are presumably learning C/JavaScript/Ruby/Erlang/CL/etc. to use it, not to add a line to your resume that you are a contributor to some open source project.

No, you won't become an expert developer in 21 days. But since C is such a small language with so few features I'd argue that learning how to use it is more about getting the memory structure right in your head than about learning syntax. By comparison, Python (my primary language lately) has many more types and constructs to worry about.


The complexity of a language is not proportional to the number of builtin types it has. If there is a builtin to do one particular thing, and you just use it for that, that's simple. You don't have to build it up yourself by tricky combinations of tricky components (easily exposing you to severe mistakes like buffer overflows unless you already have a good vocabulary of idioms down cold). You don't have to worry about each available type unless you are using it, so it is comparable to a separate library in C.


Lots of good points.

Autotools has a steep learning curve and makes a good deal of project maintainers nervous because it's not obvious how it works and it's difficult to cut through all the abstraction steps to figure out what-goes-where. CMake is usually a pretty good alternative to figuring out m4 scripts with autotools. CMake is usually faster too.

Valgrind is good but it doesn't work on all platforms and it again needs more examples and howtos.

DTrace could also use some more blog howtos.

Powerful tools need to be both clearly and succinctly explained and sufficiently configurable in order to be widely usable. The clear conveyance of system behavior should be a top priority for code and supporting documentation.


Yay... my first grey comment. Why U no undarstend sarcazm?

All I'm saying really: Learning C is easy, becoming a halfway decent C programmer is hard.


Well, HN is not reddit; people here are not used to that.


Some very good observations.

Otherwise a few points, from the copy here (clearly a 1st ed): http://cs.upm.ro/_users/cursuri_on_line/Alte_documentatii/C%...

"void didn’t exist in the book (although it did exist at some point before C89; see here, section 2.1)" -> There are many occurrences of void in the document above.

"It appears that stdio.h was the only header that existed" -> From the document above many other headers are available (string.h, math.h, etc.)

"Note that return 0; was not necessary." (on the main function) -> return 0 is not necessary in Standard C either (even if the semantics of omitting it differ between C89 and C99).


My paper copy of K&R1 (1978) does not mention void, nor does it mention standard headers other than stdio.h. The linked document may be a later draft (and it's almost certainly a copyright violation).

Prior to C99, reaching the closing "}" of main() caused the program to return an undefined status to the calling environment. In C99 and later (as in C++), reaching the closing "}" does an implicit "return 0;".


I had some trouble believing that C99 says that main() implicitly returns 0, so for future reference: It says so in 5.1.2.2.3, even including wording concerning the closing "}" (which seems to me a slightly quirky way to describe what it is describing).

As for stdio.h being the only header: it makes sense, as headers like stdlib.h and string.h contain mostly function declarations, which you don't strictly need in K&R. Some of the 80's UNIX C code examples in books I've seen don't include anything (and there are even some that declare things like FILE and errno directly without including them from anywhere).


(clearly a 1st ed)

Heh. Don't believe everything you read, kid.


The author's amazement at the oblique reference to teletype outputs is hilarious. Kids these days forget that much of Unix (and most other things) were created without the benefit of "glass teletypes" (aka screens).


Kids? You can be over 30 and never have used a teleprinter :)


As always, "kids" == younger than me. But no, I've never used a teleprinter for anything serious.

Still, the evidence of their recent departure is all around us. vi (short for "visual", because the idea of a visual editor was still pretty novel) didn't even exist until 1976.


But I hope you wouldn't be amazed that a book written in 1978 made reference to common technology at the time.

The whole point of this post was the original K&R C, so why wouldn't they reference/assume teletypes?


IIRC, "backspace" as an overstrike compose character continued to be supported for quite a while after the death of teleprinters. Some BBSes depended on it, for example.


Overstriking with backspace is something I actually used recently:

https://github.com/TazeTSchnitzel/BBCode630

(to compensate for the limitations of a budget 80's printer)


I was going to make a "kids these days" remark after looking at that article, too.


>In those days, a directory in UNIX was a file that contains a list of file names and some indication of where they are located (p. 169). There was no opendir or readdir; you just opened the directory as a file and read a sequence of struct direct objects directly. Example is given on page 172. You can’t do this in modern Unix-like systems, in case you were wondering.

What's the most recent file system where this is possible?


Actually, I believe you can still do this with some of the BSDs. I currently don't have a BSD system around to test it on but I remember doing it before, and search results agree that it's possible (and a feature, not a bug):

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=96191

Also found others claiming that Solaris has the same behaviour.


I don't know the most recent, but it's probably been awhile since it was usable.

The first UNIX I programmed on (SVR2, circa 85-86) still worked that way (no opendir/readdir) since it had the classic V7 filesystem. Each directory entry was just a fixed-size 16 byte chunk; 2 bytes for the inode number and 14 for the filename.

Of course, this meant a limitation of 64K files per filesystem -- that was still OK for us (40MB would be a large disk) but would already have caused problems for bigger servers. It also was the cause of the "classic" UNIX 14-character filename limit.

By that time BSD4.2's FFS filesystem had already removed both of these limits (which I assume is when opendir() and friends arrived). According to Wikipedia, that made it to the AT&T world by SVR4 in the late 80s, which sounds correct.

As someone else mentioned some UNIXes may still let you open() a directory. However, since we're long past the days of there being just one "UNIX filesystem" it would be impossible to interpret its contents in a portable manner.
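
The 16-byte entry layout described above can be sketched as a decoder over a raw directory image. All names here (`v7_dirent`, `v7_decode`, the `V7_*` constants) are made up for illustration, and the little-endian inode number is an assumption based on the PDP-11's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define V7_DIRSIZ   14   /* filename bytes per entry */
#define V7_ENTRY_SZ 16   /* 2-byte inode number + 14-byte name */

struct v7_dirent {
    unsigned int ino;           /* 0 marks an unused slot */
    char name[V7_DIRSIZ + 1];   /* extra byte so the name is always terminated */
};

/* Decode the entry at `index` from a raw directory image.
 * Assumes little-endian storage of the inode number. */
struct v7_dirent v7_decode(const unsigned char *dir, size_t index)
{
    const unsigned char *e;
    struct v7_dirent out;

    assert(dir != NULL);
    e = dir + index * V7_ENTRY_SZ;
    out.ino = (unsigned int)e[0] | ((unsigned int)e[1] << 8);
    memcpy(out.name, e + 2, V7_DIRSIZ);   /* names may fill all 14 bytes */
    out.name[V7_DIRSIZ] = '\0';
    return out;
}
```

A two-byte inode number tops out at 65535, which is where the 64K-files limit comes from, and the fixed 14-byte name field is the filename limit.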


Enclosing the return expression in parentheses is stylistic and also to allow instrumentation like:

    #define return(value) return debug_return(value), value
where a debugger couldn't be deployed. Or even something more fun like:

     #define return(value) return 0
     /* Dr. Evil pinky laugh here */
This could even be used as a basis for code testing fuzzer like so (missing seeding management and iterations, of course):

     #define return(value) return (int)arc4random()
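
A sketch of the instrumentation idea that compiles as ordinary C. It uses an uppercase `RETURN` macro instead of redefining the keyword, which is an assumption of this example and not what the comment proposes. `debug_return`, `return_count` and `clamp_to_byte` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

int return_count = 0;   /* how many instrumented returns have run */

/* Hypothetical hook: log the value and count the call. */
int debug_return(int value)
{
    return_count++;
    fprintf(stderr, "returning %d\n", value);
    return value;
}

/* Comma operator: call the hook, then yield the value itself.
 * `value` is evaluated twice, so it must have no side effects. */
#define RETURN(value) return debug_return(value), (value)

int clamp_to_byte(int n)
{
    if (n < 0)
        RETURN(0);
    if (n > 255)
        RETURN(255);
    RETURN(n);
}

/* Hypothetical usage helper. */
void example_usage(void)
{
    assert(clamp_to_byte(300) == 255);
}
```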


K&R corresponds to C in 7th Edition Unix. The source code to v7 and other early Unix versions is browsable at: http://minnie.tuhs.org/cgi-bin/utree.pl

I find it interesting to see what C was like even earlier. Notably there was an incompatible break between the 6th and 7th editions, when the compound assignment operators were flipped. It used to be "=+" instead of "+=". I guess Ritchie figured it'd be better to avoid ambiguity with expressions like "x=-1;".


Yes, that is the primary reason they were flipped. Typing

  x=-42;
when you meant

  x= -42;
was just too easy. I know. I did it.

Similarly, as I recall, variable initialization of the form

  int i 17;
created some sort of parsing problem and was changed to the current form

  int i = 17;
I learned Unix on a PDP-11 running v6 and I had to learn all of the changes when we got 32V for the new VAX.
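
The `=-` ambiguity can be sketched with two hypothetical functions. The first contains the characters exactly as typed and shows what a modern compiler does with them. The second spells out what the 6th Edition reading would have been, using today's `-=`.

```c
#include <assert.h>

/* Modern C: `x=-42` lexes as `x = -42`, a plain assignment. */
int modern_meaning(int x)
{
    x=-42;
    return x;
}

/* The old reading of the same characters: `=-` was the compound
 * operator now spelled `-=`, so 42 is subtracted from x. */
int old_meaning(int x)
{
    x -= 42;
    return x;
}

/* Hypothetical usage helper. */
void example_usage(void)
{
    assert(modern_meaning(100) == -42);
    assert(old_meaning(100) == 58);
}
```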


Just imagine those changes in a modern language. Breaking changes, oh my god, I have to rewrite my software, no way I'll ever use that language again. People were more patient at that time.


I would think it was more because it was way simpler to adjust software. People mostly used their own software, they didn't have millions of lines of code hanging around (they probably could grep all existing source code in a few hours, if needed), the authors of any software that needed changes all were available, often in the same room, and even if the languages didn't change, they would have to update their programs because they made breaking changes to their OSes every day (that may be slightly exaggerated)

Also, what language would you switch to on your Unix installation?


I actually really like the old style function definitions:

    foo(a, b)
    int a, b;
    {
    }
I was mildly disappointed when this was [relatively recently - maybe 10 years ago?] taken out of g++. (Not sure what its status has been in the language standard.) Sure, it's a WTF to most people. But if you have really long type names used more than once, it actually saves typing.


Verilog which is loosely based on C also blindly followed this trend, but created a big mess.

In Verilog you can say:

    module foo(a);
    parameter width = 20;
    input [width-1:0] a;
But "ANSI" Verilog is "improved":

    module foo(input [19:0] a);
Oops.. no place to put the parameter.. so later on they amended:

    module foo #(parameter width=20) (input [width-1:0] a);
They should have just removed the argument list:

    module foo;
    input a;
This would have worked for C as well:

    int foo()
    int a, b, c;
    {


> It also notes that the equal sign before the initializer in a declaration was not present, so int x 1; would define x and initialize it with the value 1.

That isn't so scary; the parenthesized version,

    int x(1)
is still seen all the time, in places like C++ constructor member initializers. It's how you actually declaratively initialize a type with that value, rather than initializing the memory with a default constructor and then getting the assignment operator called on the resulting object.

The real issue is that the parentheses in the above used to just be part of the expression, rather than required, so

    int x 1
would have worked just as well. I'm fairly certain I'm glad of the change, but the previous state of affairs wasn't crazy.


The text doesn’t say how ambiguity is resolved (i.e., if there are multiple structs that have a member with the given name).

The resolution was that struct didn't introduce a new namespace - it wasn't legal to have the same member name used in two structs in the same scope.


I heard a story that the original indentation style that K&R wanted to use was the same as Allman, but then the printers chose to do the current K&R style to save paper.

Does anyone know if that's true, or just an urban legend?


The book was typeset by the authors, so it seems unlikely.

K&R style is clearly in evidence in source code from 1973 such as https://github.com/dspinellis/unix-history-repo/blob/Researc...



Naturally, reading this just makes me want to know what his critique actually was.


https://web.archive.org/web/20141205223016/http://c.learncod...

Looks as valid to me today as when I first read K&R C 25 years ago. In short, K&R is terrible. If you hew to it, first you'll end up with code that is unsafe and sketchy. Second, you won't learn about modern features in C, which make writing safe non-sketchy code a lot easier.


I learned this dialect of C first (though I switched to ANSI C very shortly afterward) so none of this seems all that new or surprising. Things were different in those days. Man, I feel old.


>It appears that it was necessary to dereference function pointers before calling them

Hmm.. I don't remember ever having to do this, even on very old systems.
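
For reference, a sketch of the two call forms side by side, using hypothetical names. ANSI C treats them the same; the explicit dereference is the form the article attributes to the book.

```c
#include <assert.h>
#include <stddef.h>

int twice(int n) { return 2 * n; }

/* Explicit form: dereference the pointer, then call. */
int call_explicit(int (*fp)(int), int n)
{
    assert(fp != NULL);
    return (*fp)(n);
}

/* ANSI C also accepts calling through the pointer directly. */
int call_direct(int (*fp)(int), int n)
{
    assert(fp != NULL);
    return fp(n);
}
```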


I wasn't aware of that last point. Is this the answer why we have uniquely prefixed struct members in POSIX? Such as st_* in struct stat or d_* in struct dirent? Is that just a coincidence?


Another thing that it allows you is to #define "structure member" to be something more complex, for example on Linux you get st_atime (as in struct stat) #defined to st_atim.tv_sec (and so on for ctime, mtime, st_a/c/mtim is struct timespec, with nanosecond precision).
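
A sketch of that macro trick with hypothetical stand-in types (`my_timespec`, `my_stat`, and the `my_` prefix are invented here to avoid clashing with the real headers). The unique prefix is what makes it safe to `#define` the old member name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct timespec and struct stat. */
struct my_timespec { long tv_sec; long tv_nsec; };

struct my_stat {
    struct my_timespec my_atim;   /* full-resolution access time */
};

/* Old code that says `s.my_atime` keeps compiling: the macro rewrites
 * the member name into the seconds field of the newer member. */
#define my_atime my_atim.tv_sec

long access_seconds(const struct my_stat *s)
{
    assert(s != NULL);
    return s->my_atime;   /* expands to s->my_atim.tv_sec */
}
```

An unprefixed macro such as `atime` would rewrite every identifier with that name in the program, so the prefix is doing real work here.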


Yes, that's why struct members tend to have unique prefixes.


I don't think so. I'm going from memory here, but I think in pre-K&R C it wasn't assumed that every struct/union would be given its own separate namespace.


That also makes grepping for all occurrences of "flags" more useful, when it's p_flags or d_flags.


return 0 in main hasn't changed, it is still not needed.


Depends on your definition of 'needed' --- if you don't return 0 from main, your program will exit with an undefined status code, which means that if you use it from a shell script or makefile weird things might happen because your program may be considered to have failed.


No, if you don't return anything from main, return 0 is added for you by the compiler, at least in C++ and recent versions of C.



