Why open(2) and close(2) all the time? If I hit this problem—and hacking on Nginx itself were an option—then I'd make the following Nginx changes:

1. at startup, before threads are spawned, find all static-file dirs referenced in the config, walk them to find every file inside, open handles to all of those files, and put them into a hash map keyed by path that will then be visible to all spawned threads;

2. in the code for reading a static file, replace the call to open(2) with a lookup against the shared pool, and then read from that shared FD with pread(2)/preadv(2), which take an explicit offset, so each thread can read wherever it wants without a per-thread seek position and without having to hit the disk or even the VFS lookup path (a rough sketch of steps 1 and 2 follows the list);

3. (optionally) add fs-notify logic to discover new files added to the static dirs, and—thread-safely!—open them, and add them to the shared pool.
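For concreteness, a minimal sketch of what steps 1 and 2 could look like in plain C (this is not nginx code; fd_pool, preload_static_dir, serve_chunk, and the /srv/static root are all made-up names, and the hsearch table here is fixed-size). It walks the tree with nftw(3) at startup, opens everything read-only, and serves reads from the shared descriptors with pread(2):

    #define _GNU_SOURCE              /* for nftw(3) and reentrant hsearch_r(3) */
    #include <fcntl.h>
    #include <ftw.h>
    #include <search.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* path -> fd map, filled once before any worker threads are spawned */
    static struct hsearch_data fd_pool;

    static int add_to_pool(const char *path, const struct stat *sb,
                           int type, struct FTW *ftw)
    {
        (void)sb; (void)ftw;
        if (type != FTW_F)
            return 0;                            /* only regular files */

        int fd = open(path, O_RDONLY);
        if (fd < 0) {
            perror(path);
            return 0;                            /* skip unreadable files */
        }

        ENTRY e = { .key = strdup(path), .data = (void *)(intptr_t)fd };
        ENTRY *slot;
        hsearch_r(e, ENTER, &slot, &fd_pool);    /* note: table is fixed-size */
        return 0;
    }

    /* step 1: walk the static root and open everything before threads start */
    static void preload_static_dir(const char *root, size_t expected_files)
    {
        hcreate_r(expected_files, &fd_pool);
        nftw(root, add_to_pool, 64, FTW_PHYS);
    }

    /* step 2: look up the shared fd and read at an explicit offset */
    static ssize_t serve_chunk(const char *path, void *buf, size_t len, off_t off)
    {
        ENTRY key = { .key = (char *)path };
        ENTRY *found;
        if (!hsearch_r(key, FIND, &found, &fd_pool))
            return -1;                           /* would fall back to open(2) */

        int fd = (int)(intptr_t)found->data;
        return pread(fd, buf, len, off);         /* no lseek, no shared offset */
    }

    int main(void)
    {
        preload_static_dir("/srv/static", 1 << 20);    /* made-up root and sizing */

        char buf[4096];
        ssize_t n = serve_chunk("/srv/static/index.html", buf, sizeof(buf), 0);
        printf("read %zd bytes\n", n);
        return 0;
    }

Because pread(2) takes the offset as an argument, the shared FD's file position is never touched, so the same descriptor is safe to read from every worker thread concurrently.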
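And a rough sketch of the optional step 3, using Linux inotify(7) to pick up files added after startup; pool_add() here is a hypothetical stand-in for inserting into the shared map above, which in a real implementation would have to take the pool's lock:

    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    /* Hypothetical hook into the shared pool from the sketch above; a real
     * implementation would take the pool's lock (the "thread-safely!" part). */
    static void pool_add(const char *path, int fd)
    {
        printf("would add %s (fd %d) to the shared pool\n", path, fd);
        close(fd);
    }

    /* Watches only the top-level dir for brevity; a real version would add
     * watches recursively for each subdirectory. */
    static void watch_static_dir(const char *root)
    {
        int in = inotify_init1(IN_CLOEXEC);
        inotify_add_watch(in, root, IN_CREATE | IN_MOVED_TO);

        char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
        for (;;) {
            ssize_t len = read(in, buf, sizeof(buf));
            if (len <= 0)
                break;

            for (char *p = buf; p < buf + len; ) {
                struct inotify_event *ev = (struct inotify_event *)p;
                if (ev->len > 0 && !(ev->mask & IN_ISDIR)) {
                    char path[PATH_MAX];
                    snprintf(path, sizeof(path), "%s/%s", root, ev->name);
                    int fd = open(path, O_RDONLY);
                    if (fd >= 0)
                        pool_add(path, fd);
                }
                p += sizeof(struct inotify_event) + ev->len;
            }
        }
    }

    int main(void)
    {
        watch_static_dir("/srv/static");         /* same made-up static root */
        return 0;
    }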

This assumes there aren't that many static files (say, less than a million.) If there were orders of magnitude more than that, in-kernel latency of modifying a huge kernel-side FD table might become a problem. At that point, I'd maybe consider simply partitioning the static file set across several Nginx processes on the same machine (similar to partitioned tables living in the same DBMS instance); and then, if even further scaling is needed, distributing those shards on a hash-ring and having a dumb+fast HTTP load-balancer [e.g. HAProxy] hash the requested path and route to those ring-nodes. (But at that point you're somewhat reinventing what a clustered filesystem like GlusterFS does, so it might make more sense to just make the "TCP load-balancing" part be a regular Nginx LB layer, and then just mount a clustered filesystem to each machine in read-only-indefinite-cache mode. Then you've got a cheap, stateless Nginx layer, and a separate SAN layer for hosting the clustered filesystem, where your SSDs now live.)
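As an aside on the sharding variant: if the load-balancer tier ends up being Nginx anyway, it can do the path-hashing itself via the upstream hash directive; a rough config sketch (the shard hostnames and ports are made up):

    # Sketch of the "hash the requested path, route to ring nodes" idea
    # done in a plain Nginx LB layer instead of HAProxy.
    upstream static_shards {
        hash $uri consistent;        # ketama-style consistent hashing on the path
        server shard1.internal:8080;
        server shard2.internal:8080;
        server shard3.internal:8080;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://static_shards;
        }
    }

The consistent parameter gives ketama-style hashing, so adding or removing a shard only remaps a small fraction of paths instead of reshuffling the whole ring.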




I think you are underestimating Cloudflare's scale here. Obviously we do shard across many machines, but each one still has many more files than is reasonable to keep open all the time.


This will not scale at CF and is not compatible with their current architecture.


The use case here isn't static files, it's an HTTP cache.



