
I suspect the author is incorrect in his claim that reading in 32K chunks is responsible for the slowness. Due to read-ahead and buffering, Unix-like systems tend to do reasonably well on small reads. Yes, big reads are better, but small reads are not unreasonably slow.

To test this, he should try "ls | cat". On large directories that often runs many orders of magnitude faster than "ls". This is because, I believe, ls on most people's systems wants, by default, to display information such as file modes or types via coloring or otherwise decorating the file names, and getting that information requires looking at the inode for each file.

It's all those inode lookups that slow things down, as the inodes are likely to be scattered all over the place. When you do "ls | cat", ls notices its output is not a terminal, turns off the fancy stuff, and just lists the file names. The names can be determined entirely from the directory itself, so performance is much better.
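
A quick way to see the cost of those per-entry inode lookups is to time the name listing and the stat() pass separately. A minimal sketch in Python, assuming a large test directory at /tmp/bigdir (the path is made up, pick your own):

    import os, time

    path = "/tmp/bigdir"  # assumed large test directory

    t0 = time.time()
    names = os.listdir(path)  # names come straight from the directory blocks
    t1 = time.time()
    for name in names:
        os.lstat(os.path.join(path, name))  # one inode lookup per entry
    t2 = time.time()

    print(f"names only: {t1 - t0:.2f}s, with lstat: {t2 - t1:.2f}s")

On a cold cache, the lstat pass is usually the expensive half, which is the behavior "ls | cat" avoids.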



The original impetus for the post was Python's os.listdir(), which as far as I know doesn't stat(). ls just made the blog post more interesting :-).

I was surprised that the 32K reads were taking so long. It's possible that, since it was on a virtualized disk ("in the cloud"), something else (like Xen) was slowing down disk IO.

But I can assure you that a larger read buffer performed much better in this scenario.

I'd welcome more tests though.
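
For the curious, here's a minimal sketch of that kind of test: calling getdents64 directly through ctypes with a large buffer instead of relying on readdir's small default. It assumes Linux on x86-64 (where the syscall number is 217); everything else here is illustrative, not the blog post's exact code:

    import ctypes, os

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    SYS_getdents64 = 217          # x86-64 syscall number (assumed platform)
    BUF_SIZE = 5 * 1024 * 1024    # 5MB buffer instead of readdir's ~32K

    def list_big_dir(path):
        # struct linux_dirent64 layout: d_ino (8 bytes), d_off (8),
        # d_reclen (2), d_type (1), then the NUL-terminated name.
        fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
        buf = ctypes.create_string_buffer(BUF_SIZE)
        names = []
        try:
            while True:
                nread = libc.syscall(SYS_getdents64, fd, buf, BUF_SIZE)
                if nread < 0:
                    err = ctypes.get_errno()
                    raise OSError(err, os.strerror(err))
                if nread == 0:
                    break
                pos = 0
                while pos < nread:
                    reclen = int.from_bytes(buf.raw[pos+16:pos+18], "little")
                    name = buf.raw[pos+19:pos+reclen].split(b"\0", 1)[0]
                    if name not in (b".", b".."):
                        names.append(name.decode())
                    pos += reclen
        finally:
            os.close(fd)
        return names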


This is just a hypothesis based on very little actual knowledge, but perhaps a very long scheduling interval is responsible for the slowness with smaller reads? Consider this scenario: the virtualization hypervisor is on a reasonably loaded system, and decides to block the virtual machine for every single read. Since the physical system has several other VMs on it, whenever the VM in question loses its time slice it has to wait a long time to get another one. Thus, even if the 32K read itself happens quickly, the act of reading alone causes a delay of n milliseconds. If you increase the read size, your VM still gets scheduled 1000/n times per second, but each time it gets scheduled it reads 5MB instead of 32K.
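
A back-of-envelope calculation makes the point. With an assumed 5 ms rescheduling delay per read (the number is made up), the per-read payload completely dominates throughput:

    # Back-of-envelope for the hypothesis above; all numbers assumed.
    sched_delay_s = 0.005  # 5 ms wait to get rescheduled after each read

    for read_size in (32 * 1024, 5 * 1024 * 1024):  # 32K vs 5MB reads
        reads_per_sec = 1 / sched_delay_s
        throughput = reads_per_sec * read_size
        print(f"{read_size:>9} B reads -> {throughput / 1e6:.1f} MB/s")

That works out to roughly 6.6 MB/s for 32K reads versus about 1 GB/s for 5MB reads, even though the scheduling rate is identical.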


I just tried this on non-virtualized hardware with 10 million files in a single directory, using the native zfsonlinux ZFS implementation. It took a little over 4 minutes to do a "\ls -f | wc -l", so this might very well be something to do with virtualization.

I'll try an ext3 file system just for giggles and post the results.

Edit:

Didn't have an ext3 file system with enough free inodes handy, so I used an ext4 file system. It takes too long to create 10 million files, so I cut the test off early. It took about 7 seconds to complete a "\ls -f | wc -l" with 6.2 million files in a single directory.


You are right that 32K buffers should be more than enough to read 5MB in a reasonable time, regardless of disk/VM architecture, but I don't think stat was the problem either. My guess is that readdir, or possibly getdents, is O(n^2) somewhere.

[just noticed it was 500M (oh wow), but same difference]
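
One way to probe that guess is to time the listing at a few doubling file counts: roughly 4x growth in time per doubling would point at something quadratic, while 2x growth suggests linear. A rough sketch (the file counts are arbitrary, and it leaves its temp directories behind):

    import os, tempfile, time

    # Create directories of doubling size and time the listing of each.
    for count in (10_000, 20_000, 40_000):
        d = tempfile.mkdtemp()
        for i in range(count):
            open(os.path.join(d, f"f{i}"), "w").close()
        t0 = time.time()
        os.listdir(d)
        print(f"{count} files: {time.time() - t0:.3f}s")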


> To test this, he should try "ls | cat".

Running /bin/ls will bypass the alias.


Prefixing the command with a \ will also disable the alias (e.g. \ls).


Do you know where this is documented? (I just skimmed the bash docs and couldn't find it.) thx.


At first glance what you say makes sense, but then why did find and the Python call both have similar issues?



