Hacker News

Latency is an issue. Especially when traversing history in operations like log or blame, it is important to have an extremely low-latency object store. Usually that means a local disk.



Yuup. Latency is a huge issue for Git. Even hosting Git at non-trivial scale on EBS is a challenge. Fetching individual objects from S3 is going to take forever for even the simplest operations.
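A rough back-of-envelope shows why: walking commit history is a serial chain of lookups (you only learn a commit's parent after reading it), so per-object latency multiplies rather than amortizing. All the latency figures below are assumptions for illustration, not measurements:

```python
# Walking history is serial: each commit's parent is only known after
# reading it, so per-object latency adds up. Figures are assumed.

commits = 10_000          # history depth to traverse

latency = {               # assumed per-object read latency, in seconds
    "local SSD": 0.0001,  # ~100 microseconds
    "EBS":       0.001,   # ~1 ms
    "S3 GET":    0.03,    # ~30 ms to first byte
}

for store, secs in latency.items():
    total = commits * secs
    print(f"{store:>9}: {total:8.1f} s for {commits} serial lookups")
```

Under these assumed numbers, the same 10,000-commit walk goes from about a second on local disk to minutes against S3.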

And considering the usual compressed size of commits and many text files, you're going to have more HTTP header traffic than actual data if you want to do something like a rev-list.


I'm trying to think of the reason why NFS's latency is tolerable but S3's wouldn't be. (Not that I disagree, you're totally right, but why is this true in principle? Just HTTP being inefficient?)

I would imagine any implementation that used S3 or similar as a backing store would have to lean heavily on an in-memory cache (exploiting Git's content-addressable object model) to avoid repeated lookups.

I wonder how optimized an object store's protocol would have to be (HTTP/2 to compress headers? Protobufs?) before it starts converging on something with latency and overhead similar to NFS.
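My guess is that header compression attacks the wrong term: it shrinks bytes on the wire, but the round trips are what dominate, which is why a batched protocol (one request returning many objects, as Git's smart protocol does with packfiles) wins regardless of header size. A sketch with assumed figures:

```python
# Header compression vs. batching, under assumed figures. HPACK-style
# compression saves bytes but not round trips; one batched request
# saves almost all the round trips.

objects = 1_000
rtt = 0.03                 # seconds per round trip (assumed)
hdr_h1 = 700               # header bytes per request, HTTP/1.1 (assumed)
hdr_h2 = 70                # header bytes after compression (assumed)

print(f"header bytes saved by compression: {objects * (hdr_h1 - hdr_h2)} B")
print(f"time waiting, one GET per object:  {objects * rtt:.0f} s")
print(f"time waiting, one batched request: {rtt:.2f} s")
```

Under these assumptions, compressing headers saves well under a megabyte while the serial GETs still cost tens of seconds; the batched request collapses that to a single round trip.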





