
Git works fine with petabyte-scale projects and handles 100+ gigabyte files perfectly well with some modest scripting.

For one of the projects I oversee, a typical file is about 8 GB, and a handful are around 140 GB. What we version in Git is a list of the files, written in Ion (a superset of JSON), so the 8 GB files aren't "in" Git; they're just referred to by hash from what is committed. The files themselves are accessed via CephFS and rsync, and we do "server side" hashing of them over SSH. All of the relevant scripts live in Git hooks.
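To make that concrete, here is a minimal sketch of what such a hook could look like. It is not our actual script; the host name, mount path, and manifest file name are placeholders, and it writes plain JSON (which is also valid Ion). The idea is just: hash each big file where it lives over SSH, record path-to-hash mappings in a small manifest, and stage only the manifest.

    #!/usr/bin/env python3
    # Hypothetical pre-commit hook: hash large files on the storage server
    # over SSH and record them in a JSON manifest (valid Ion, since Ion is
    # a superset of JSON). Host, paths, and manifest name are illustrative.
    import json
    import subprocess
    import sys

    STORAGE_HOST = "ceph-gateway.example.com"   # assumption: SSH-reachable storage host
    STORAGE_ROOT = "/mnt/cephfs/project"        # assumption: CephFS mount point
    MANIFEST = "manifest.json"                  # the only thing Git actually tracks

    def remote_sha256(relpath):
        """Hash the file where it lives instead of copying 8-140 GB locally."""
        out = subprocess.run(
            ["ssh", STORAGE_HOST, "sha256sum", f"{STORAGE_ROOT}/{relpath}"],
            check=True, capture_output=True, text=True,
        )
        return out.stdout.split()[0]

    def main(paths):
        manifest = {p: {"sha256": remote_sha256(p)} for p in paths}
        with open(MANIFEST, "w") as f:
            json.dump(manifest, f, indent=2, sort_keys=True)
        # Stage the manifest so the commit records hashes, not the data itself.
        subprocess.run(["git", "add", MANIFEST], check=True)

    if __name__ == "__main__":
        main(sys.argv[1:])

A checkout-side hook would do the inverse: read the manifest, rsync any missing files from the storage host, and verify their hashes.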

Someone (external to our team) once commented that what I just described is "too complicated" and "not friendly for developer speed". Yeah, well, we don't hire stupid people, so: "thank you for your interest in this opportunity, we have decided to pursue other candidates at this time".



Interesting! So you basically implemented something similar to Git LFS, correct?

Can I ask what your use case is, and why you didn't use LFS instead?


I just looked at git-lfs, and yeah, the approach is pretty similar. There may be a good reason to prefer one over the other; as far as I know, it hasn't been explored.



