> Except that each of those billions of processes will have its own file descriptor with its own offset!

The offset is only a tiny part of how POSIX is stateful. The very fact that each read or write is associated with a particular fd, therefore with a particular authorization and lock context, is more of an issue at the servers. Even more of an issue is the possibility of still-buffered writes, which POSIX does require be visible to reads on other fds.
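To make the per-fd state concrete, here is a minimal local sketch, assuming a POSIX system and Python's `os` module. The function name `demo_fd_state` is made up for illustration. It shows two read fds on one file keeping separate offsets, and a write through a third fd being seen by a later read with no fsync or close in between. A distributed filesystem that claims POSIX semantics has to reproduce both behaviors across clients.

```python
import os
import tempfile

def demo_fd_state():
    # Hypothetical helper: returns what two independent read fds observe.
    wfd, path = tempfile.mkstemp()            # writer fd, opened read/write
    try:
        os.write(wfd, b"hello world")
        a = os.open(path, os.O_RDONLY)        # each open() gets its own offset
        b = os.open(path, os.O_RDONLY)
        try:
            first = os.read(a, 5)             # advances only a's offset
            off_a = os.lseek(a, 0, os.SEEK_CUR)
            off_b = os.lseek(b, 0, os.SEEK_CUR)
            # Overwrite the start of the file through the writer fd.
            # No fsync and no close before the read below.
            os.lseek(wfd, 0, os.SEEK_SET)
            os.write(wfd, b"HELLO")
            seen = os.read(b, 5)              # read on another fd sees the write
        finally:
            os.close(a)
            os.close(b)
        return first, off_a, off_b, seen
    finally:
        os.close(wfd)
        os.unlink(path)
```

On a local filesystem this is cheap because everything goes through one kernel's page cache. The hard part is getting the same answer when the writer and the reader are on different machines.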

> At least in principle, there's no reason that remote filesystems should know or care about file descriptors

Untrue, and please don't try to "correct" others with your own inaccurate information. As I just said, each file descriptor (or file handle in NFS) has its own authorization and lock context, which must be enforced at the server(s) so knowledge of them can't be limited to the client.
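A small sketch of the authorization part, again assuming a POSIX system. `read_after_chmod` is a name I made up for the example. Permission is checked when the file is opened, so the fd keeps working after every permission bit on the file has been cleared. Something has to remember that the open already succeeded, and in a remote filesystem that something includes the server.

```python
import os
import tempfile

def read_after_chmod():
    # Hypothetical helper: read through an fd opened before permissions were revoked.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"secret")
        os.close(fd)
        rfd = os.open(path, os.O_RDONLY)   # the permission check happens here
        os.chmod(path, 0)                  # clear all permission bits on the file
        try:
            return os.read(rfd, 6)         # still allowed: access was granted at open
        finally:
            os.close(rfd)
    finally:
        os.unlink(path)                    # unlink depends on the directory, not the file mode
```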

> POSIX access bits are literally 15 bits per file. uid and gid are a few bytes.

There's also mtime and atime, plus xattrs, which can add up to kilobytes. More importantly, what the author was really talking about was namespace information rather than per-file metadata. It's a common mistake. Even as someone who writes code to handle both of these separate concerns, I'm not enough of a pedant to whine every time an application programmer gets my domain's terminology wrong.

> the only one of the three POSIX file timestamps - access time - that actually can cause big scalability issues (if left enabled).

Untrue yet again. Mtime can be a problem too, as can st_size and st_blocks. In an architecture where clients issue individual possibly-extending writes directly to one of several data servers for a file but other clients can then query these values through a separate metadata server, that creates a serious aggregation problem. That's why I think the separate OSS/MDS model (as in Lustre) sucks. People resort to it because it makes the namespace issue easier, but it makes metadata issues harder. In the particular use cases where people have to stick with a filesystem instead of switching to an object store, it's a net loss.
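Here's a toy model of that aggregation problem. It is not Lustre's actual protocol, and `DataServer` and `stat_file` are hypothetical names. It assumes each data server tracks only the highest file offset and latest write time it has seen for a file, so an accurate stat has to ask every server that might hold part of it.

```python
from dataclasses import dataclass, field

@dataclass
class DataServer:
    # Per-file state only this server knows (toy model, not a real protocol).
    end: dict = field(default_factory=dict)      # highest file offset written here
    mtime: dict = field(default_factory=dict)    # time of the latest write here

    def write(self, name, offset, length, now):
        # Clients write directly to a data server; no metadata server is involved.
        self.end[name] = max(self.end.get(name, 0), offset + length)
        self.mtime[name] = max(self.mtime.get(name, 0), now)

def stat_file(name, servers):
    # st_size and mtime must be aggregated across all data servers.
    size = max((s.end.get(name, 0) for s in servers), default=0)
    mtime = max((s.mtime.get(name, 0) for s in servers), default=0)
    return size, mtime

servers = [DataServer(), DataServer()]
servers[0].write("f", offset=0, length=4096, now=1)
servers[1].write("f", offset=4096, length=100, now=2)   # extends the file
result = stat_file("f", servers)
```

Every stat here costs a fan-out to all the data servers. The alternative is to cache the values at the metadata server, and then every extending write has to update or invalidate that cache.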
