This was the first thing I noticed about Visual Studio Team Services when I looked at integrating my search and code analytics engine with VSTS. It was quite apparent that they wanted to make third-party developers first-class citizens.
Anybody who has ever worked in the enterprise knows feature requirements are heavily driven by politics, and if you can't support the weirdest edge cases, resistance to adoption can become insurmountable. Having looked at VSTS, you could easily tell they wanted to reduce pushback as much as possible.
It's a lofty goal to create tools that support the direction of one of the world's largest software companies while supporting single-person dev shops just as seamlessly. I don't know if it makes sense (would Google's monorepo scale down like that? Is Microsoft hamstringing itself this way?), but I applaud the effort.
I think they sort of gave up too soon on splitting up their repos. We've been through this before and made BitKeeper support a workflow where you can start with a monolithic repo, have ongoing development in it, and have another "cloud" of split-up repos, sort of like submodules except with full-on DSCM semantics. There's a write-up with some Git vs BK performance numbers.

We actually made BK pretty pleasant in large repos, even over NFS (which has to be slower than NTFS, right?). And BK is open source under the Apache 2 license, so there are no licensing issues.

I get it, Git won, clearly. But it's a shame that it did; the world gave up a lot for that "win".
Great to see MS working on this, and also posting the code!
"As a side effect, this approach also has some very nice characteristics for large binary files. It doesn’t extend Git with a new mechanism like LFS does, no turds, etc. It allows you to treat large binary files like any other file but it only downloads the blobs you actually ever touch."
It seems every day I see another attempt to scale Git to support storing large files. IMHO, lack of large-file support is the Achilles' heel of Git. So far I'm reasonably happy with Git LFS despite some pretty serious limitations, mainly the damage a user who doesn't have Git LFS installed can inflict when they push a binary file to a repo.
I'm curious what other folks on HN use to store large files in Git without allowing duplication?
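One common guard against the "user without LFS installed" problem is a server-side check that refuses raw binaries which should have been LFS pointers. Here's a minimal sketch in Python; the 10 MB threshold is arbitrary, and a real pre-receive hook would inspect the pushed objects rather than files passed on the command line. (LFS pointer files are small text files that start with the spec version line checked below.)

    import sys

    LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"
    SIZE_LIMIT = 10 * 1024 * 1024   # arbitrary 10 MB cutoff for this sketch

    def is_lfs_pointer(data: bytes) -> bool:
        # LFS pointers are tiny text files beginning with the spec version line.
        return data.startswith(LFS_POINTER_PREFIX)

    def check(path: str) -> bool:
        with open(path, "rb") as f:
            data = f.read()
        if is_lfs_pointer(data):
            return True                 # fine: only the pointer lives in the repo
        if len(data) > SIZE_LIMIT:
            print(f"refusing {path}: {len(data)} bytes and not an LFS pointer")
            return False
        return True

    if __name__ == "__main__":
        sys.exit(0 if all(check(p) for p in sys.argv[1:]) else 1)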
The common case I see is binaries being versioned through the VCS: large binaries, be they libraries or the application itself, are stored in the VCS, which is then used as the source of truth from that point on.
The Git community has explicitly called this out as bad practice. Other vendors, like Perforce, never did an amazing job with it either, but it worked, and on top of that it created more reliance on the vendor's system. Now that most people see the sheer productivity gains of Git over centralized VCS systems, everyone wants to move, but that comes with a catch: many of these companies have large workflows built around the way their old VCS works, and some even have that methodology written into things like their SOX compliance rules.
It's this set of people that generally has this issue. For everyone else, boring old disks are fine for packages and built product, since those can be recreated from the SCM. Now, with LFS in Git, you can maintain the same workflow you had in your old VCS without restructuring the entire organization around it.
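For the "boring old disks" case, even a plain content-addressed directory gives you deduplication for free, since identical artifacts hash to the same path. A toy sketch (the store location is made up; assume your build metadata records the returned digest):

    import hashlib
    import shutil
    from pathlib import Path

    STORE = Path("/var/artifacts")   # hypothetical shared disk location

    def put(path: str) -> str:
        """Copy a build artifact into the store, keyed by its content hash.

        Publishing an identical artifact twice is a no-op, so nothing is
        ever stored more than once.
        """
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        dest = STORE / digest[:2] / digest
        if not dest.exists():
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)
        return digest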
This article covers the end-to-end approach, whereas the other article and discussion are more focused on the GVFS filesystem driver used to support scaling Git to repositories with hundreds of thousands of files and hundreds of gigabytes of history.
Good story. It'd be interesting to see a portable version (which I guess would have to either run on Mono or be rewritten in something else); or maybe Google will release some of theirs. I'm impressed that Microsoft had the courage to scale mostly-vanilla git instead of hacking Mercurial.
During the presentation at Git Merge, Microsoft mentioned that they are hiring Linux and macOS driver experts. This suggests that they plan to release the FUSE driver themselves.