
Excellent writeup. Computer systems are discoverable. That attitude, along with some of the basic tools (such as strace, ltrace, man pages, debuggers and a compiler) and a willingness to dive into system source code, goes a long way. If your tracing leads you to a library call and you don't know what's going on inside, find the source code. If it's the kernel, load up lxr (http://lxr.linux.no/).



Very interesting. I spotted two minor problems with the posted code.

Doing this: #define BUF_SIZE 1024 * 1024 * 5

to define a numerical constant is a bit scary, since depending on how the symbol is used it can break due to operator precedence. It's better to enclose the value in parentheses. Personally I would probably write the right-hand side as (5 << 20).
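To make the precedence problem concrete, here is a minimal sketch. The macro and function names are made up for illustration and are not from the posted code. Dividing by the bare macro expands to nbytes / 1024 * 1024 * 5, which is evaluated left to right.

```c
#include <assert.h>

/* Hypothetical names: the same constant with and without parentheses. */
#define BUF_SIZE_BARE 1024 * 1024 * 5
#define BUF_SIZE_SAFE (1024 * 1024 * 5)

/* Expands to ((nbytes / 1024) * 1024) * 5, not nbytes / 5242880. */
long chunks_bare(long nbytes)
{
    assert(nbytes >= 0);
    return nbytes / BUF_SIZE_BARE;
}

/* Expands to nbytes / (1024 * 1024 * 5), which is what was meant. */
long chunks_safe(long nbytes)
{
    assert(nbytes >= 0);
    return nbytes / BUF_SIZE_SAFE;
}
```

For a 10 MiB input, chunks_safe gives 2 while chunks_bare gives 52428800. A plain declaration such as char buf[BUF_SIZE] happens to work either way, which is why the bug can stay hidden for a long time.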

The first time the modification to skip entries with inode 0 is mentioned, the code is wrong:

I did this by adding if (dp->d_ino == 0) printf(...);

This should use !=, not == (it's correct the second time, but this adds confusion).
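For reference, a rough sketch of the corrected check. It assumes POSIX opendir/readdir, which may not be the call the article itself uses, and print_live_entries is a hypothetical helper name.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: prints each entry with a nonzero inode number.
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. */
int print_live_entries(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *dp;
    while ((dp = readdir(dir)) != NULL) {
        /* != 0 keeps the live entries; == 0 would print only the
         * entries that were meant to be skipped. */
        if (dp->d_ino != 0) {
            printf("%s\n", dp->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```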


On second thought (too late to edit), it's quite likely that I would not even bother with a define for this, but instead just set the size directly in the code, e.g. char buffer[5 << 20], then rely on sizeof buffer in the call where the buffer's size is needed. I prefer sizeof whenever possible.
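Roughly what I have in mind, as a sketch only. copy_into_buffer is a hypothetical helper, and the buffer is made static here so that 5 MiB does not land on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The size appears exactly once, in the declaration. */
static char buffer[5 << 20];

/* Hypothetical helper: copies at most sizeof buffer bytes from src
 * and returns how many bytes were actually copied. */
size_t copy_into_buffer(const void *src, size_t n)
{
    assert(src != NULL);
    if (n > sizeof buffer)
        n = sizeof buffer;   /* clamp to the array's real size */
    memcpy(buffer, src, n);
    return n;
}
```

The catch is that sizeof buffer only gives the array size where the array itself is in scope. Once the buffer is passed to a function as a pointer, sizeof yields the size of the pointer, so the length has to travel along as a separate argument.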


Using (5 << 20) instead of (5 * 1024 * 1024) is premature optimization. Modern C compilers will evaluate the constant expression (5 * 1024 * 1024) at compile time, so both forms produce the same code.
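Both spellings are integer constant expressions, so a C11 compiler can compare them at build time. A small sketch, where the function names are hypothetical and only there to give the two forms something to live in:

```c
#include <assert.h>

/* Checked by the compiler; no code is generated for this line. */
static_assert((5 << 20) == (5 * 1024 * 1024), "both spell 5 MiB");

/* Hypothetical accessors, one per spelling. */
long size_by_multiply(void) { return 5 * 1024 * 1024; }
long size_by_shift(void)    { return 5 << 20; }
```

The choice between the two is therefore about readability, not speed.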

And I definitely prefer defined constants for two reasons. One, it's likely you'll have to declare multiple such buffers in different places. Two, if I want to tune the parameter, I'd rather do it at the top of a source file with other such defined constants than hunt for the declaration in the code. I do agree that sizeof() is preferable when it's an option.


I don't think he meant it as an optimization. It's easy to understand as 5 * 2^20, and quicker to type.


I much prefer 5 * 1024 * 1024. I have to stop and do some reasoning with 5 << 20.


Correct, to me it's idiomatic. I realize it's not for everyone, so the ultimate best code is probably the way it was written ... On the other hand, sometimes it's fun to push the envelope just a tiny bit, and hope that readers will actually learn something from the code. But that's a thin edge to be balancing on.



