Hacker News

Latency is an issue. Especially when traversing history in operations like log or blame, it is important to have an extremely low-latency object store. Usually that means a local disk.



Yuup. Latency is a huge issue for Git. Even hosting Git at non-trivial scale on EBS is a challenge. Fetching individual objects from S3 is going to take forever for even the simplest operations.
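A rough back-of-envelope shows why: walking commit history is a serial chain of lookups (you only learn a commit's parent after reading it), so per-object latency multiplies rather than amortizing. All the latency figures below are assumptions for illustration, not measurements:

```python
# Walking history is serial: each commit's parent is only known after
# reading it, so per-object latency adds up. Figures are assumed.

commits = 10_000          # history depth to traverse

latency = {               # assumed per-object read latency, in seconds
    "local SSD": 0.0001,  # ~100 microseconds
    "EBS":       0.001,   # ~1 ms
    "S3 GET":    0.03,    # ~30 ms to first byte
}

for store, secs in latency.items():
    total = commits * secs
    print(f"{store:>9}: {total:8.1f} s for {commits} serial lookups")
```

Under these assumed numbers, the same 10,000-commit walk goes from about a second on local disk to minutes against S3.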

And considering the usual compressed size of commits and many text files, you're going to have more HTTP header traffic than actual data if you want to do something like a rev-list.


I'm trying to think of the reason why NFS's latency is tolerable but S3's wouldn't be. (Not that I disagree, you're totally right, but why is this true in principle? Just HTTP being inefficient?)

I would imagine any implementation that used S3 or similar as a backing store would have to lean heavily on an in-memory cache (exploiting Git's content-addressable object model) to avoid repeated lookups.

I wonder how optimized an object store's protocol would have to be (HTTP/2 to compress headers? Protobufs?) before it starts converging on something with latency and overhead similar to NFS.
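My guess is that header compression attacks the wrong term: it shrinks bytes on the wire, but the round trips are what dominate, which is why a batched protocol (one request returning many objects, as Git's smart protocol does with packfiles) wins regardless of header size. A sketch with assumed figures:

```python
# Header compression vs. batching, under assumed figures. HPACK-style
# compression saves bytes but not round trips; one batched request
# saves almost all the round trips.

objects = 1_000
rtt = 0.03                 # seconds per round trip (assumed)
hdr_h1 = 700               # header bytes per request, HTTP/1.1 (assumed)
hdr_h2 = 70                # header bytes after compression (assumed)

print(f"header bytes saved by compression: {objects * (hdr_h1 - hdr_h2)} B")
print(f"time waiting, one GET per object:  {objects * rtt:.0f} s")
print(f"time waiting, one batched request: {rtt:.2f} s")
```

Under these assumptions, compressing headers saves well under a megabyte while the serial GETs still cost tens of seconds; the batched request collapses that to a single round trip.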





