Git is a tool for collaboration, but is sometimes a poor fit for the reality of centralised, corporate software development. I'd like to see a product that fills these gaps, and I sense that this is part of what Diversion is about. For example, large file support in git depends on clunky hacks (that I respect) like LFS, but Diversion aims to make it actually work seamlessly. Companies awkwardly split their projects across many repos just to allow for access control, but Diversion aims to make this easy by letting you set access per directory. Those are great, "no-brainer" advances. There are a lot of other places where clunky hacks appear in popular git usage. A lot of those are also areas where big tech has introduced its own internal solutions. Maybe Diversion can cover some of these? Some examples:
- git has a very useful hooks mechanism, but each user has to manage their installed hooks in their own clone. This should be made more team-compatible. Make hooks a part of the history that can be patched by anyone, and then others can update their hooks simply by pulling. A less-clunky, built-in version of https://pre-commit.com/ , basically.
- Keeping with use cases for hooks, it should be possible to apply code formatting transparently. Nobody should have to think about code formatting, ever. It's a common practice to ensure code conforms to the company's standard format. However, why can't a developer also work in their preferred format? It should be possible for the system to handle this. I work in my preferred format, but all of the committed code is actually in the company's standard format. A bidirectional transformation.
- Break down the barrier between the repo and other channels of company communication. Often companies have a wiki, a document drive, a chat app, a ticket system. The ability to include real-time collaboration in the repo makes it possible to unify all of these things, ideally while keeping a full history. I basically want to version control my entire company, including what all of the non-devs are producing. Fossil has some of this capability, but it hasn't captured the mainstream: https://fossil-scm.org/home/doc/trunk/www/whyallinone.md
- Make repos composable. It should be possible to include a repo in another repo in a way that isn't clunky. Submodules, subtrees and subrepos are clunky. A use case for this would be including a third party library as source. It should also be possible to take one piece of a repo, and distribute that "view" while allowing bidirectional contribution. An example use case for this would be for maintaining a public open source project simultaneously in the company monorepo and on a public forge. There are (very nice) hacks for this like https://github.com/google/copybara and https://josh-project.github.io/josh/ , but these have considerable clunk-factor. When repos can be sliced and glued like this, a lot of the monorepo vs polyrepo debate becomes unnecessary; we can have both.
- Make it easy for everyone to use a patch stacking / "branchless" workflow with Diversion. There are numerous projects to enable stacking in git, and it's ubiquitous in big tech, yet it hasn't gone mainstream. It's only a matter of time until some company brings this to the masses.
- Allow history to be viewed and maintained at varying granularity. As I'm sure you know, a common debate you will see unfolding online is between keeping a clean, manicured history, and keeping a full history of what you actually did. You see git forges try to split the difference by offering squash merges. You see people recommending --first-parent for viewing the logs without noise. To me, this all points to a pointless limitation of the tooling. Why can't I take the raw, messy, actual commit log of what I did, and then bundle those commits up into a non-destructive "summary" commit that appears in the log? That way, people can see the clean, summarised version of my change, but can also dive deeper and see all of the twists and turns I took while producing that change.
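To make that last point a bit more concrete, here is a rough Python sketch of the kind of data model I have in mind. Everything in it is hypothetical: the `Commit` and `Summary` types and the `log` function are names I made up for illustration, and this is not how git or Diversion actually store history. The only point is that a summary can reference the raw commits instead of replacing them, so the same history can be shown at two levels of detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A raw commit, exactly as the developer made it (hypothetical type)."""
    id: str
    message: str


@dataclass(frozen=True)
class Summary:
    """A non-destructive grouping: it points at raw commits, it does not replace them."""
    message: str
    commits: tuple  # the raw Commit objects covered by this summary, oldest first


def log(history, expand=False):
    """Return log lines, oldest first.

    expand=False shows one line per summary (the clean view);
    expand=True also lists the raw commits underneath each summary.
    """
    lines = []
    for entry in history:
        if isinstance(entry, Summary):
            lines.append(f"* {entry.message}")
            if expand:
                lines.extend(f"    {c.id} {c.message}" for c in entry.commits)
        else:
            lines.append(f"* {entry.id} {entry.message}")
    return lines


# Made-up example history: one plain commit, then three messy commits
# bundled under a single summary.
raw = (
    Commit("a1", "wip: parser skeleton"),
    Commit("a2", "fix typo"),
    Commit("a3", "parser handles nested lists"),
)
history = [Commit("9f", "initial import"), Summary("Add list parser", raw)]

print("\n".join(log(history)))               # clean view
print("\n".join(log(history, expand=True)))  # full view, nothing was lost
```

The clean view is what squash merges give you today, except that here the messy commits are still there when someone wants to dig in.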
There is so much room to improve version control. Much like build systems, I feel like the industry standard falls short of what we know is possible. I am happy to see new developments in this space.
Thank you very much for the thoughtful comment! You're exactly right - Diversion's goal is to improve the kinds of things you mentioned. Moreover, we're aiming to build a flexible system that can be extended and improved further.
We'll definitely implement at least some of the things you mentioned!