
So why is EOF in C defined as -1 (when it is)?


If I had to guess:

prematurely ending a stream is a problem for the current stream, and for the next stream as well, so it is more desirable to keep other bytes from being corrupted into an EOF than into anything else.

corrupting an EOF into something else is probably not as bad, and/or maybe a 1->0 corruption is more likely than a 0->1 bit flip (perhaps when a 1 is stored/transmitted as the higher voltage), and thus

01111111 10111111 11011111 11101111 11110111 11111011 11111101 11111110

are less likely to be corrupted into an EOF, if 11111111 is the least likely state for a byte to be in.

Also, none of these bytes is a printable ASCII character, so they may be (or have been) much less frequent in data streams, further reducing the probability of such a corruption happening.


While a nifty idea, corruption of this sort is so rare and so unbounded (that is, there's no reason to believe it'll strike in your incoming data; it could just as well strike the CPU instructions themselves or who knows where) that there's not much you can do about it from inside the code. It's all but impossible to deal with corruption rates on the order of 1 in 10^18 (or better! properly functioning hardware is obscenely reliable at doing what it was designed to do [1]) instructions on properly functioning hardware, and all but impossible to deal with instructions failing at a much higher rate on nonfunctioning hardware, except by replacing it with functioning hardware.

[1]: If anyone wants to pop up with complaints about that statement, remember that properly functioning hardware is also doing a lot of things very quickly, so it has a lot of chances to fail. ECC RAM is important, for instance, because something that only happens every few billion accesses may still happen several times a day. But this is still an absolutely obscene degree of reliability. Most disciplines would laugh at worrying about something at that rate of occurrence... they wouldn't even be able to detect it.


EOF is returned by functions that can also return any valid character. That rules out every value in 0x00-0xFF (0 would have been a nice choice for EOF, but then how would you read NUL bytes?). -1 is the logical solution (note that getchar returns an int, and a byte comes back as an unsigned char converted to int, so the byte 0xFF is returned as 255 and -1 != 0xFF).


Not sure why this was downvoted. I'm curious too.



