
Kinda thinking it might be beneficial to add this to the libgit2 project itself (e.g. via a GitHub PR).

Any objections to that?




You could certainly try. I believe the core contributors moved away from the idea of pluggable backends once they realized the performance limitations. It still works great for some use cases, but I think the folks at GitHub quickly realized it wouldn't work for them.


I'd be interested to hear more about the performance limitations.

My naive assumption was that it would perform extremely well, since I thought Postgres scales extremely well across multiple cores.

Is there anything I can read about these performance limitations? Am I correct in understanding that you ran into them yourself, I assume when compared to the file system?

Any pointers to info on where GitHub tried this?


Once you understand how git works under the hood, it's actually fairly easy to predict that performance will be poor. A simple checkout involves reading hundreds if not thousands of objects. You also can't fetch them all at once, because the objects you need are determined by a nested tree: you have to walk the tree all the way down, fetching each nested tree or blob based on the previous tree's contents. So ultimately you're doing hundreds to thousands of sequential queries for any given git command. Each query is fast, but even at 1-2 ms per query it adds up quickly.
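To make the round-trip argument concrete, here's a minimal sketch that counts lookups during a checkout-style tree walk. It is a toy in-memory model, not libgit2 or a real Postgres backend: `CountingStore`, `build_repo`, `checkout`, the string object ids, and the 1.5 ms latency are all made up for illustration. The point it shows is the dependency: a child's id is only known after its parent tree has been read.

```python
from dataclasses import dataclass


@dataclass
class CountingStore:
    """Hypothetical stand-in for a database-backed object store."""
    objects: dict   # object id -> ("tree", {name: child_id}) or ("blob", text)
    queries: int = 0

    def get(self, oid):
        self.queries += 1   # each lookup models one round trip to the database
        return self.objects[oid]


def build_repo(num_dirs, files_per_dir):
    """Build a synthetic repo: one root tree, num_dirs subtrees, blobs below."""
    objects = {}
    root = {}
    for d in range(num_dirs):
        entries = {}
        for f in range(files_per_dir):
            path = f"dir{d}/file{f}.txt"
            blob_id = f"blob:{path}"   # illustrative id, not a real hash
            objects[blob_id] = ("blob", f"contents of {path}")
            entries[f"file{f}.txt"] = blob_id
        tree_id = f"tree:dir{d}"
        objects[tree_id] = ("tree", entries)
        root[f"dir{d}"] = tree_id
    objects["tree:root"] = ("tree", root)
    return CountingStore(objects)


def checkout(store, root_id):
    """Walk the tree from the root, returning {path: blob contents}."""
    files = {}
    pending = [("", root_id)]
    while pending:
        path, oid = pending.pop()
        kind, payload = store.get(oid)   # child ids are unknown until this returns
        if kind == "blob":
            files[path] = payload
        else:
            for name, child_id in payload.items():
                child_path = f"{path}/{name}" if path else name
                pending.append((child_path, child_id))
    return files


store = build_repo(num_dirs=10, files_per_dir=20)
files = checkout(store, "tree:root")
latency_ms = 1.5   # assumed per-query latency, within the 1-2 ms range above
print(f"{len(files)} files, {store.queries} queries, "
      f"~{store.queries * latency_ms:.1f} ms at {latency_ms} ms per query")
```

With 10 directories of 20 files each, the walk does 211 lookups (1 root tree, 10 subtrees, 200 blobs), so even this tiny repo spends a few hundred milliseconds on round trips alone under the assumed latency. A backend could presumably batch the children of a single tree into one query, but it still needs at least one round trip per level of nesting.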



