
There's a lot of love for monorepos nowadays, but after more than a decade of writing software, I still strongly believe it is an antipattern.

1. The single version dependencies are asinine. We are migrating to a monorepo at work, and someone bumped the version of an open source JS package that introduced a regression. The next deploy took our service down. Monorepos mean loss of isolation of dependencies between services, which is absolutely necessary for the stability of mission-critical business services.

2. It encourages poor API contracts because it lets anyone import any code in any service arbitrarily. Shared functionality should be exposed as a standalone library with a clear, well-defined interface boundary. There are entire packaging ecosystems like npmjs and pypi for exactly this purpose.

3. It encourages a ton of code churn with very low signal. I see at least one PR every week to code owned by my team that changes some trivial configuration, library call, or build directive, simply because some shared config or code changed in another part of the repo and now the entire repo needs to be migrated in lockstep for things to compile.

I've read this paper, as well as watched the talk on this topic, and am absolutely stunned that these problems are not magnified by 100x at Google scale. Perhaps it's simply organizational inertia that prevents them from trying a more reasonable solution.




Context: Staff Eng @ Google for 7+ years

1) This is solved by 2 interlocking concepts: comprehensive tests & pre-submit checks of those tests. Upgrading a version shouldn’t break anything because any breaking changes should be dealt with in the same change as the version bump.

2) Google’s monorepo allows for visibility restrictions and publicly-visible build targets are not common & reserved for truly public interfaces & packages.

3) “Code churn” is a very uncharitable description of day-to-day maintenance of an active codebase.

Google has invested heavily in infrastructural systems to facilitate the maintenance and execution of tests & code at scale. Monorepos are an organizational design choice which may not work for other teams. It does work at Google.
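To make point 1 a bit more concrete, here is a minimal sketch of a pre-submit check, assuming a toy build graph. The target names, `DEPS`, and the `run_tests` callback are all hypothetical; this is not how Google's tooling is actually implemented, just the general idea of "find everything downstream of the change and test it before the change lands".

```python
from collections import defaultdict, deque

# Hypothetical build graph: each target lists the targets it depends on.
DEPS = {
    "//third_party/jslib": [],
    "//libs/auth": ["//third_party/jslib"],
    "//services/checkout": ["//libs/auth"],
    "//services/search": [],
}

def affected_targets(deps, changed):
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the graph so we can walk from a dependency to its dependants.
    reverse = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            reverse[dep].add(target)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependant in reverse[current]:
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen

def presubmit(deps, changed, run_tests):
    """Return the affected targets whose tests fail; an empty list means OK to submit."""
    return sorted(t for t in affected_targets(deps, changed) if not run_tests(t))
```

With this toy graph, bumping `//third_party/jslib` would trigger tests for `//libs/auth` and `//services/checkout` but not `//services/search`, so a regression in checkout would block the version bump instead of surfacing at the next deploy.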


> any breaking changes should be dealt with in the same change as the version bump

Does this mean that some things will never get updated, as the effort required is impossibly high?


Effort to update something is high because there's a lot of code, not because it's in a monorepo. Updating the same code scattered across multiple repositories takes as much work in the best case. More realistically, some copy of the same code will stay unupdated because the cost to track down every repository in the company is too much.


Can definitely feel this pain personally. Need to upgrade tooling across some dozen or so services and we're investigating how to migrate with potentially incompatible upgrades. So just suffer outage while we merge PRs across some 20 repos? The atomic changes of a monorepo are very beneficial in these cases, removing the manual orchestration of GitOps practices segmented across individual services.


When you say it's "as much work" there's an assumption the code is still used. This was years ago, but when I was doing migrations at Google we sometimes had to deal with abandoned or understaffed and barely maintained code. (Sometimes by deleting it, but it can be unclear whether code by some other team is still useful.)

If you're not responsible for fixing downstream dependencies then you don't need to spend any time figuring that out.


Sounds great to me because you are forced to delete code that's not in use anymore. Without the monorepo, that code would still be there with old libraries that are potentially insecure.

Deleting code that is not being used anymore happens way too rarely in my opinion.


The downside is that if a product no longer has maintainers, you are now encouraged to shut it down, even if it still works and it doesn't cost much to run.


If a product no longer has maintainers, it's probably because it's not worth it for the company. So it makes sense to delete it, from the company's point of view.


In that case the product should still have maintainers. Even if only part-time, no software project should be completely unsupervised.


But with a multi-repo, it's possible to e.g. upgrade the dependency just for a single service that has an immediate need for the upgrade, isn't it?


The flip side is that services with an immediate need will get upgraded, and others won't, and six months later you will be saying "Why am I still seeing this bug in production, I already fixed it three times!"

Of course, the problem can be mitigated by a disciplined team that understands the importance of everybody being on the same page on which version of each library one should use. On the other hand, such a team will probably have little problem using monorepo in the first place.

Whether you have a monorepo or multiple repos, a good team will make it work, and a bad team will suck at it. But multiple repos do provide more ropes for inexperienced devs to tie themselves up, in my opinion.


I don't think that's quite true. In my experience multi-repos have the edge here.

If you have one key dependency update with a feature you need, but you need substantial code updates and 80 services depend on it, that may be impossible to pull off no matter what. Comparatively, upgrading one by one may not be easy, but at least it's possible.

The importance of everyone being on the same page with dependencies might just be a limitation of monorepos rather than a generally good thing. Some services might just not need the upgrade right now. Others may be getting deprecated soon, etc.


There are languages / runtimes where there could not be two different versions of the same thing in one binary (and they eagerly fail at build time / immediately crash upon run). That is not the case for JavaScript, Rust, etc. But it is the case for C++, Java, Go, Python and more.

Everyone claims different needs if they can. Nothing could be linked together anymore if you just let everyone use whatever they want.

Or maybe people start to try to work around this by ... reinventing the wheel (and effectively forking and vendoring) to reduce their dependency graph.

There is a genuine need for a single instance of every third-party dependency. It is not unique to monorepos. A monorepo (with corresponding batch change tooling) just makes this feasible, so you don't hear about this concept for manyrepos, and mentally bind it to monorepos.


> But it is the case for C++, Java, Go, Python and more.

It certainly isn't for Java, hence why multiple classloaders exist.

For C and C++ it depends on the OS, on Windows (AIX, and similar OSes) this isn't an issue thanks to how symbol visibility works.

Two different libraries are free to have whatever versions they feel like.


Thanks. I'm not familiar with Java. I thought multiple classloaders are more like dlmopen (which doesn't help much - symbol visibility is hard) cause I saw people struggling on classpath conflict etc.


That is basically how application servers are implemented: every EAR/WAR file gets its own classloader, and there is a hierarchy that allows overriding search paths.

That is how I managed back in the day to use JSF 2.0 on Websphere 6, which officially did not have support for it out of the box.


> There are languages / runtimes where there could not be two different versions of the same thing in one binary

But I'm not talking about one binary here. I'm talking about multiple, separate services.


How many internal libraries do your "separate services" contain? Your service A depends on library alpha@1, your service B depends on library alpha@2. All happy now. Introduce another layer: your service A depends on library alpha@1 and beta@1, alpha@1 depends on gamma@1, and beta@1 depends on gamma@2. What to do now? It does not even matter how many services you have now.

With Javascript it does not apply, alpha@1 can have its own gamma@1, beta@1 can have its own gamma@2. But the same does not hold for most languages.
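The diamond above can be sketched in a few lines. This is a toy resolver under made-up assumptions (the `INDEX` dict and function names are hypothetical, and real package managers do far more), but it shows why service A ends up needing two versions of gamma in one build:

```python
# Hypothetical package index: (name, version) -> list of (name, version) deps.
INDEX = {
    ("alpha", 1): [("gamma", 1)],
    ("beta", 1): [("gamma", 2)],
    ("gamma", 1): [],
    ("gamma", 2): [],
}

def resolve(index, roots):
    """Collect every version of every package reachable from the roots."""
    versions = {}
    visited = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in visited:
            continue
        visited.add(pkg)
        name, version = pkg
        versions.setdefault(name, set()).add(version)
        stack.extend(index[pkg])
    return versions

def conflicts(index, roots):
    """Return packages that would need more than one version in a single build."""
    return {name: sorted(v) for name, v in resolve(index, roots).items() if len(v) > 1}
```

Calling `conflicts(INDEX, [("alpha", 1), ("beta", 1)])` reports gamma with versions 1 and 2. A runtime that bundles a private copy per dependant can live with that; a runtime that needs one copy per binary cannot.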

left-pad is both amazing and sad. It's amazing because JS's "bundle entire dependency closure" approach, combined with npm infrastructure, successfully drove the usability of software reuse to the point that people even bother to reuse left-pad. This is beyond what a well-regulated corporate codebase can achieve (no matter whether single instance is strongly encouraged or not, no matter manyrepo or monorepo), and it happens in the open. It is sad because without being regulated people tend to do so too aggressively, causing, well, left-pad.


> How many internal libraries does your "separate services" contain? You service A depends on library alpha@1, your service B depends on library alpha@2. All happy now. Introduce another layer, your service A depends on library alpha@1, beta@1, and alpha@1 depends on gamma@1, beta@1 depends on gamma@2, what to do now? It does not even matter how many services you have now.

Got several thoughts on this one. First, let's look at how bad the issue really is:

To start using beta@1 you need to upgrade alpha@1 to alpha@2 that depends on gamma@2. What's the problem with that?

The same situation can arise with 3rd party dependencies, except there it's much worse: you have zero control over those. Here you do have the control.

Now let's look at what this situation looks like in a monorepo: you can't even introduce gamma@2 and make beta@1 at all without

1. upgrading alpha@1 to alpha@2

2. upgrading all services that depend on alpha@1

3. upgrading all libraries that depend on gamma@1

4. upgrading all services that depend on gamma@1, if any

So you might even estimate that the cost of developing beta@2 is not worth it at all. Instead of quasi-dependency-hell ("quasi" because your company still controls all those libraries and has the power to fix the issue, unlike real dependency hell) you have a real stagnation hell due to a thousand papercuts.

My second comment is about building deep "layers" of internal dependencies - I would recommend avoiding it for as long as possible. Not just because of versioning, but because that itself causes stagnation. The more things depend on a piece of code, the harder it is to manage it effectively or to make any changes to it. The deeper the dependency tree is, the harder it is to reason about the effect of changes. So you better be very certain about their design / API surface and abstraction before building such dependencies yourself.

Major version bumps of foundational library dependencies are an indication that you originally had the wrong abstraction. No matter how you organize your code in your repos, it's going to be a problem. (Incidentally, this is also why despite the flexibility of node_modules, we still have JS fatigue. At least with internal dependencies we can work to avoid such churn.) It should still be easier with separate services, however, as you can do it more gradually.

Last note on left-pad and similar libraries. They are a different beast. They have a clear scope, small size and most importantly, zero probability of needing any interface changes (very low probability of any code changes as well). That makes them a less risky proposition (assuming of course they cannot be deleted)

Those are my (hopefully nuanced) 2 cents.


> To start using beta@1 you need to upgrade alpha@1 to alpha@2 that depends on gamma@2. What's the problem with that?

The problem is the team maintaining alpha does not want to upgrade to gamma@2 because it's an extra burden for them, and they don't have an immediate need.

The debate is not about teams owning separate services, it's about teams owning libraries.


I'm assuming a customer-driven culture where you work for your customers' needs. In the case of libraries, teams using the libraries are customers. If you're the maintainer of alpha and your customer needs beta, your customer needs you to upgrade to gamma.


But then another customer still wants gamma@1, they are allowed to do that! But they also want your new features. So now you have to maintain two branches, which I hope we can agree: it is an extra burden.

This is unavoidable if we are talking about FOSS, people should be able to do whatever they want, and they do. A company has an advantage here: you can install company-wide rules and culture to make sure people don't do this. Which, in this case, happens to be: let's keep a single version of everything unless you have really good reasons.


> But then another customer still wants gamma@1, they are allowed to do that! But they also want your new features.

In this case, you still have the option of working with them to help them migrate to gamma@2, if the cost of maintaining gamma@1 is indeed too high and would negatively impact you in serving other customers. This was the original premise, wasn't it - upgrading all your dependants when you upgrade your library? That's still an option. The point is - you have more choices. And you can also help customers one by one - you don't have to do it all at once

I will agree though, restricting choices helps when the company is finding difficulty in aligning incentives through communication. But you do give up a lot for it - including ability to move fast and avoid stagnation.


Ah, ok. I see where you are coming from :)

From what I saw I'd say it's exactly the opposite: allowing multiple versions actually means "make teams able to choose for stagnation". And because we are lazy, we certainly do! There is a non-trivial number of people who believe "if it ain't broken don't fix it". I can work with them to migrate them over, but they might not want to do so! In this case, a hard "bump versions or die" rule is a must.

Maybe if you work in a small group of great engineers you don't need to set such rules and you can move even faster, but I unfortunately haven't found such a workplace :(

> you don't have to do it all at once

Yes. Nobody should do it all at once. Making "bump versions or die" compatible with incremental adoption is slightly harder (see sibling threads for how it's done). Still worth it I'd argue.


Note: steps 1-4 will likely be done in somewhat reverse order (3,4 first, then 1,2, then building `beta`)


Lots of code under the assumption that all the code needs to use the same version*

A big bang always sucks versus some migration over time


No, you use automated systems to do the change. https://mobile.twitter.com/obeattie/status/10804969557537505...


Even so, the cost is often outrageously high.

Not to mention that if you're the first team to import a third_party library, you own it and other teams can add arbitrary cost to you updating it. You have to be very aggressive with visibility and SLAs to work around this.


You're basically just describing all the pain with pulling in a dependency regardless of monorepo or not.

If the third party dependency does not add enough value to justify the cost then don't add it.


In a multi-repo setup you can upgrade gradually though, tackling the services that need the upgrade the most first. Can you do that in a monorepo setup?


> In a multi-repo setup you can upgrade gradually

This also means services can be left to rot for years because they don't need to be upgraded, while all the infrastructure changes around them, which is a giant pain when you do eventually need to change something.

If you have a multi repo architecture you absolutely need both clear ownership of everything and well planned maintenance.


The total pain is the same though.


With multirepo setups, you don't necessarily need to update the package for all code at all.

Instead, some newer package completely replaces an old one, with no relation to the old dependency package, or with a dependency on some future one, and both can run at the same time while turning the old one off


Almost. We had a UI library on Android that was stuck on an alpha version of the library for three or so years after the library had shipped.

Upgrading the library broke many tests across the org, and no one wanted to own going in and getting each team to fix it. Eventually, the library had a v2 release, and people started to care about being able to use it.

Ultimately, they just forked the current release and appended a v2 to the package name.

Not the norm, but it happens. The monorepo works for Google, but I wouldn't recommend it for most organizations; we have a ton of custom tooling and headcount to keep things running smoothly.

From the mobile side, it makes it super easy for us to share code across the 50+ apps we have, manage vulnerabilities quicker, and collaborate easily across teams.


> The monorepo works for Google

Does it? Or is it stopping Google from supporting products which only make millions in revenue because of the massive burden of continually updating?


Oh geez, that's an entirely different can of worms that isn't related to the monorepo.

Most products at Google are not dropped because the monorepo makes it difficult for them to support - and I'm not sure how it would or how you got to that association. Also, plenty of products that are killed are not in the monorepo.

They are usually dropped due to a mix of things, but a big part is just better product management.


Better project management as in, somebody politicked their way into owning a replacement for a currently running thing?

The implemented product, as well as the vision for something like inbox or Google music is still way better than Gmail and YouTube music as the end user


Fixed that:

>They are usually dropped due to a mix of things, but a big part is just worse product management.

Google is now at the point where their new projects fail (like Stadia) because they killed old products. Killing products has second-order effects.


> Ultimately, they just forked the current release and appended a v2 to the package name.

Hmm, does that explain the golang module versioning requirement where v2 must have a different name?


Yes.


Google's software mostly uses dependencies already in the Google monorepo, so these issues don't crop up. The person/team working on library changes has to ensure that nothing breaks, or that the downstream users are notified early on. Don't think this would apply to many companies.


That sounds like a huge amount of effort unrelated to your current project, both for those being forced to upgrade, and those organizing the upgrade


It’s not really even a true monorepo. Little known feature - there is a versions map which pins major components like base or cfs. This breaks monorepo abstraction and makes full repo changes difficult, but keeps devs of individual components sane.


This was done away with years ago. Components are no more.

There are still a couple of things that develop on long lived dev branches instead of directly at head, but my personal opinion is the need for those things to do that is mostly overstated (and having sent them cls in the past, it's deeply annoying).


>> 3. It encourages a ton of code churn with very low signal.

> 3) “Code churn” is a very uncharitable description of day-to-day maintenance of an active codebase.

Also implicit in the discussion is the fact that Google and other big tech companies performance review based on "impact" rather than arbitrary metrics like "number of PRs/LOCs per month". This provides a check on spending too much engineer time on maintenance PRs, since they have no (or very little) impact on your performance rating.


> based on "impact" rather than arbitrary metrics

Umm, from whatever I have seen in big tech "impact" is also fairly arbitrary. It all is based on how cozy one is with one's manager, skip manager, and so on. More accurate is "perception of impact".

Especially as it gets more and more nebulous at higher levels.


How do you deal with wanting to see the history, graph etc of just one sub-project? Does the tooling handle this?


I believe everything is tracked at the folder/file level and not a project level. I'm not sure there even is a concept of a project. But maybe someone can correct me.


There is a concept of a project. Though viewing change history is more organized around packages and files.


History for folders is visible in code search, it’s basically equivalent to what GitHub or Sourcegraph would give you. You can query dependencies from the build system. Anything beyond a couple levels deep is unlikely to load in any tools you have ;)


git log <directory> accomplishes this already.


Google uses Piper, and Perforce (well, g4) before that.


Is monorepo an important reason for Google to kill products? Or is it just my imagination?


Hi, unrelated to this, but since you are working at Google, were there actually "code red" meetings at Google concerning chatgpt?


  > The single version dependencies are asinine. We are migrating to
  > a monorepo at work, and someone bumped the version of an open
  > source JS package that introduced a regression.
There's no requirement to have single versions of dependencies in a monorepo. Google allows[0] multiple versions of third-party dependencies such as jQuery or MySQL, and internal code is expected to specify which version it depends on.

  > It encourages poor API contracts because it lets anyone import any
  > code in any service arbitrarily.
Not true at Google, and I would argue that if you have a repository that allows arbitrary cross-module dependencies then it's not really a monorepo. It's just an extremely large single-project repo with poor structure. The defining feature of a monorepo is that it contains multiple unrelated projects. At Google, this principle was so important that Blaze/Bazel has built-in support for controlling cross-package dependencies.
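For anyone who hasn't used a build system with visibility rules, here is a rough sketch of the idea. It is a toy model only: the `TARGETS` layout, the `"PUBLIC"` marker, and the prefix matching are assumptions made for illustration, not Bazel's actual syntax or implementation.

```python
# Toy model: target -> its deps and the package prefixes allowed to depend on it.
# "PUBLIC" means any target may depend on it. All names are hypothetical.
TARGETS = {
    "payments/internal/ledger": {"deps": [], "visibility": ["payments/"]},
    "payments/api": {"deps": ["payments/internal/ledger"], "visibility": ["PUBLIC"]},
    "search/frontend": {"deps": ["payments/api"], "visibility": []},
}

def is_visible(targets, dep, consumer):
    """True if `consumer` is allowed to depend on `dep`."""
    visibility = targets[dep]["visibility"]
    return "PUBLIC" in visibility or any(consumer.startswith(p) for p in visibility)

def visibility_errors(targets):
    """Return (consumer, dep) pairs that violate the declared visibility."""
    errors = []
    for consumer, spec in targets.items():
        for dep in spec["deps"]:
            if not is_visible(targets, dep, consumer):
                errors.append((consumer, dep))
    return errors
```

In this model `search/frontend` may use `payments/api`, but adding `payments/internal/ledger` to its deps would be rejected at build time, so "anyone can import anything" stops being true even though the code lives in one repo.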

  > I see at least one PR every week [...] because some shared config
  > or code changed in another part of the repo and now the entire repo
  > needs to be migrated in lockstep for things to compile.
That really doesn't sound like a monorepo to me. If all the code has to be migrated "in lockstep", then that implies a single PR might change code across different parts of the company. At which point it's not independent projects in a monorepo, it's (merely) a single giant project.

[0] Or allowed -- I last worked there in 2017.


I never worked at Google, but this post sums up everything I had to say about the matter. GP has a sh-tty monorepo experience at one company and decides to make a statement about another company where they never worked (so I presume). HN absurdism at its best!

I second your point about monorepo versus ball of mud. They are so different. And managing all of this is about social/culture, less science-y. If you don't have good culture around maintenance, well then, yeah, duh, it will fall apart pretty quickly. It sounds like Google spends crazy money to develop tools to enforce the culture. Hats off.


It generally gives the sense that mono-repo is actually irrelevant, and the more detailed processes across the whole experience are what matters.


Yep, the top comment in this thread is a fantastic example of typical HN comments. Naïveté masked as expertise.


There's always been a very strong one version policy, multiple versions are usually only allowed to coexist for weeks or months, and are usually visibility restricted.

This prevents situations where "Gmail" ends up bundling 4 different, mildly incompatible versions of MySQL or whatever, and the aggravation that would cause. Or worse, in C++ you get ODR violations due to a function being used from two versions of the same library.


I think the catch is that it isn't just third-party dependencies that are of concern. In particular, at a certain size, you are best off treating every project in the company as a third-party item. But that is typically not what you want with source dependencies.

You can see this some with how obnoxious Guava was, back in the day. It seems a sane strategy where you can deprecate things quickly by getting all callers to migrate. This is fantastic for the cases where it works. But, it is mind numbingly frustrating in the cases where it doesn't. Worse, it is the kind of work that burns out employees and causes them to not care about the product you are trying to make. "What did you do last month?" "I managed to roll out an upgrade that had no bearing on what we do."


There’s a policy against multiple versions of third party dependencies. Though there is a mechanism for exceptions.


Sounds sane, usually you don't want multiple versions just because people were too lazy to keep code up to date. But in some instances it's probably worth supporting several major versions across the whole org.


I guess the question then becomes: Is it worth all the extra tooling required to manage a monorepo properly?


There is a lot of extra tooling required to manage a large non-monorepo org too.


The answer of course is no, but since they have a stable search product that supplied unlimited money, they were able to stubbornly stick to that decision.


https://opensource.google/documentation/reference

The third party documentation is public; one-version policies exist, but there are exemptions.


> There's no requirement to have single versions of dependencies in a monorepo. Google allows[0] multiple versions of third-party dependencies such as jQuery or MySQL, and internal code is expected to specify which version it depends on.

Sure, but this is unsustainable. If service Foo depends on myjslib v3.0.0, but service Bar needs to pull in myjslib v3.1.0, in order to make sure Foo is entirely unchanged, you'd have to add a new dependency @myjslib_v3_1_0 used only by Bar. After two years you'd have 10 unique dependencies for 10 versions of myjslib in the monorepo.

At this point you've basically replicated the dependency semantics of a multi-repo world to a monorepo, with extra cruft. This problem is already implicitly solved in a multi-repo world because each service simply declares its own dependencies.


  > Sure, but this is unsustainable. [...] After two years you'd have
  > 10 unique dependencies for 10 versions of myjslib in the monorepo.
This is a social problem, and needs to be solved by a dependency management policy. Your org might decide that the entire org is only allowed to use a single version of each third-party dependency (which IMO is harsh and unhelpful), or might have a deprecation period for older versions, or might have a team dedicated to upgrading third-party deps.

Note that this need for a policy exists for both mono-repo and multi-repo worlds. Handling of third-party dependencies ought to be independent of how the version control repository is structured.

  > At this point you've basically replicated the dependency semantics of
  > a multi-repo world to a monorepo, with extra cruft. This problem is
  > already implicitly solved in a multi-repo world because each service
  > simply declares its own dependencies.
The problem with the multi-repo solution is that there's no linear view of the changes. Each repo has its own independent commit graph, and questions like "does the currently deployed version of service X include dependency commit Y" become difficult or impossible to answer.

That's why monorepos exist. They're not a way to force people to upgrade dependencies, and they aren't a get-out-of-jail-free card for thinking about inter-project dependencies. A monorepo lets you have a linear view of code history.

Phrased differently: many people approach monorepos as a way to force their view of dependency management on other people in their organization. The successful users of monorepos (including Google) take great efforts to let separate projects in the same repo operate independently.
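To illustrate the "linear view" point: with a single trunk, "does the deployed build of service X include dependency commit Y" reduces to comparing two positions in one history. A minimal sketch, assuming each build is cut from a single trunk revision (the commit ids and function name are made up):

```python
# Hypothetical linear history: commit ids in the order they landed on trunk.
HISTORY = ["c100", "c101", "c102", "c103", "c104"]

def includes_commit(history, deployed_at, fix_commit):
    """True if a build cut at `deployed_at` contains `fix_commit`."""
    position = {commit: i for i, commit in enumerate(history)}
    return position[fix_commit] <= position[deployed_at]
```

So a service deployed at `c103` includes a library fix that landed at `c101`, and one deployed at `c101` does not include a fix from `c103`. In a multi-repo setup the same question means walking each service's pinned dependency versions and then each dependency's own commit graph.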


> Sure, but this is unsustainable. If service Foo depends on myjslib v3.0.0, but service Bar needs to pull in myjslib v3.1.0, in order to make sure Foo is entirely unchanged, you'd have to add a new dependency @myjslib_v3_1_0 used only by Bar. After two years you'd have 10 unique dependencies for 10 versions of myjslib in the monorepo.

you're imagining a situation and speculating up a problem that might occur in that imaginary situation. in reality, no one does the thing you said - you don't add random deps on random external javascript libraries that don't have sane versioning stories.


> Sure, but this is unsustainable.

Not exactly unsustainable considering Google has been very successful with this approach!


I suspect Google spends more on developer tooling than any organization on the planet. Probably worth considering that whenever trying to see whether something would work for you.


Is this a good comparison? Not everyone is Google-size. In fact, very few businesses are. What is manageable for Google or a good practice for Google, might be unsustainable for another business.


I think the lesson to draw from bigorgs isn't what to do as a smallorg, but what directions you can grow and what the pitfalls are on those roads.

Any smallorg probably wants a bare monorepo, git or what have you. If you grow to the point that becomes unwieldy, you can either invest in tooling the way Google has, or be prepared to split the repo into library and project repos in a way that makes sense for what your mediumorg has grown into.


What is a "bare monorepo" in contrast to simply a "monorepo"?


Google is a monorepo with sixteen layers of tooling to make it searchable, to not require you to spend 3 days making a local copy before editing, to manage permissions across orgs, etc etc etc.

A small organization of 1-20 people should not emulate the layers of tooling; just have a single git repo somewhere and call it a day.


Google isn't successful because of their tech decisions. They just happen to make an infinite amount of ad money; everything else they do is mainly getting their engineers to play in sandboxes to distract them so they won't leave to start other ad markets. It works though, since everyone is in love with their complex makework ideas.


I want to think they have. But... this is also why they kill older products. The cost of keeping the lights on is greatly elevated when keeping the lights on means keeping up with the latest codes.

This is absolutely no different from buildings. If you had to keep every building up to date with the latest building codes, you would tear them down way way way more often.


> this is also why they kill older products. The cost of keeping the lights on is greatly elevated when keeping the lights on means keeping up with the latest codes.

This is a really good point and I think accurate when it comes to smaller Google endeavors. I don't think this killed Stadia, for example, but maybe Google Trips (an amazing service that I don't think many folks used and likely had few development resources assigned, or none).


Yeah, I would not mean this to include Stadia. That said, if it adds costs to the smaller Trips and such, it has to add cost to the larger things, too. That is, if it makes the cheap things expensive, it probably makes the expensive things even more so.


So why is this a problem for a monorepo but not the multi-repo? It seems to me that the major difference is that in a multi-repo, you'd be more likely to be oblivious to the multitude of dependency issues than you are in the monorepo... and to be honest, that actually sounds like a bad thing to me, because it means you're sweeping legitimate issues underneath the rug.


In a multi-repo, you don't build source dependencies between projects.

You can do this with a mono, as well. However, the conceit is that "in the same repo" means you can "change them together." It is very very tempting that "went out as a single commit" means that it went out fine. Which, just isn't something you see in a multi world.


> However, the conceit is that "in the same repo" means you can "change them together."

In a monorepo you shouldn't be making changes to independent components in a single commit. That's how you end up being forced to roll back your change because you broke someone else's service.

If you're making a backwards-incompatible change to an API then you need to:

1. Make a commit to your library to add the new functionality,

2. Send separate commits for review by other teams to update their projects' code,

3. Wait for them to be approved and merged in, then merge a final cleanup commit.

If your repository is designed to enable a single commit to touch multiple independent projects then it's not a monorepo, it's just a single-project repo with unclear API and ownership boundaries.
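
For illustration, a minimal Python sketch of that three-step sequence. The function names (`fetch_user`, `fetch_user_v2`) are hypothetical and not taken from any real library; the point is only that the old and new APIs coexist until every caller has migrated.

```python
import warnings


def fetch_user_v2(user_id: int, *, include_email: bool = False) -> dict:
    """Step 1: the new API lands next to the old one, in its own commit."""
    user = {"id": user_id, "name": f"user-{user_id}"}
    if include_email:
        user["email"] = f"user-{user_id}@example.com"
    return user


def fetch_user(user_id: int) -> dict:
    """Old API, kept as a thin shim while other teams migrate (step 2)."""
    warnings.warn(
        "fetch_user is deprecated; call fetch_user_v2 instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_user_v2(user_id)


# Step 3 (the final cleanup commit) deletes fetch_user once no callers remain.
```

Because the shim delegates to the new function, each downstream team can switch in its own reviewed commit, and none of those commits has to land at the same time as the library change.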


This is clearly correct. But even in a multi world, I've seen far more attempts at atomic commits than makes sense.

I'd love for it to be a strawman. But I do keep finding them.

You do get me to question what a mono repo is. I've never seen one that wasn't essentially an attempt at treating a company as a large project. Akin to a modular codebase with a single build. Could be a complicated build, mind you. Still, the goal has always been a full repository build.


> In a multi-repo, you don't build source dependencies between projects.

In my experience with software projects, this is very much not the case. It's one of the main reasons I'm such a big fan of monorepos--I have been burned way too many times by the need to make atomic commits involving separate repositories.


If you have multiple repos, you can't have an atomic commit between them. Pretty much period. I'm scared to hear what you mean on that.

Ideally, all tooling makes the separate nature of the projects transparent. They should test separately. They should deploy separately. If that is not the case, then yes, they should be in the same repo.


I've worked at places where they would "solve" this problem by letting the build break while all the commits in the various repositories land at once. It's really bad.


After more than a decade of having tiny repos, I strongly believe that monorepos are the right way to go.

When you're pinning on old versions of software it quickly turns into a depsolving mess.

Software developers have difficulty figuring out which version of code is actually being deployed and used.

When dealing with major version bumps and semver pins around different repositories that creates a massive amount of make-work and configuration churn, and creates entire FTE roles practically dedicated to that job (or else grinds away at the time available for devs to do actual work and not just bump pins and deal with depsolving).

In any successful team which is using many dozens of repos, there's probably one dev running around like fucking nuts making sure everything is up to date and in sync, and that dev is keeping the whole thing going. If they leave because they're not getting career advancement then the pain is going to get surfaced.

The ability to pin also creates and encourages tech debt and encourages stale library code with security vulnerabilities. All that pinning flexibility is engineering to make tech debt really easy to start generating and to push all that maintenance into the future.


> The next deploy took our service down.

How would multi-repo change this? A dependency updated, and code broke, and the new version was broken—but you update dependencies in multi-repo anyway, and deployments can be broken anyway. I don’t see how multi-repo mitigates this.

> It encourages poor API contracts because it lets anyone import any code in any service arbitrarily.

This has nothing at all to do with monorepos. Google’s own software is built with a tool called Bazel, and Meta has something similar called Buck. These tools let you build the same kind of fine-grained boundaries that you would expect from packaged libraries. In fact, I’d say that the boundaries and API contracts are better when you use tools like Bazel or Buck—instead of just being stuck with something like a private/public distinction, you basically have the freedom to define ACLs on your packages. This is often way too much power for common use cases but it is nice to have it around when you need it, and it’s very easy to work with.

A common way to use this—suppose you have a service. The service code is private, you can’t depend on it. The client library is public, you can import it. The client library may have some internal code which has an ACL so it can only be imported from the client library front-end.
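
A toy Python model of that service/client/internal layout, purely as a sketch. This is not Bazel's or Buck's actual syntax (in Bazel the rule lives in a target's `visibility` attribute); the package names and the `may_depend` helper are made up for illustration.

```python
PUBLIC = "public"

# Hypothetical packages, each mapped to the set of packages allowed to depend on it.
VISIBILITY = {
    "billing/service": set(),                       # private: nobody outside the package
    "billing/client": {PUBLIC},                     # anyone may import the client library
    "billing/client/internal": {"billing/client"},  # only the client front-end
}


def may_depend(consumer: str, target: str) -> bool:
    """Return True if package `consumer` is allowed to depend on `target`."""
    if consumer == target:
        return True  # a package can always use its own code
    allowed = VISIBILITY[target]
    return PUBLIC in allowed or consumer in allowed
```

With these rules, `may_depend("search/frontend", "billing/client")` is allowed, while the same consumer is refused access to both `billing/service` and `billing/client/internal`. A build tool that enforces this fails the build instead of relying on convention.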

Here’s how we updated services—first add new functionality to the service. Then make the corresponding changes to the client. Finally, push any changes downstream. The service may have to work with multiple versions of the client library at any time, so you have to test with old client libraries. But we also have a “build horizon”—binaries older than some threshold, like 90 days or 180 days or something, are not permitted in production. Because of the build horizon, we know that we only have to support versions of the client library made within the last 90 or 180 days or whatever.
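
A rough Python sketch of how a build horizon bounds the set of client versions a service has to keep working with. The 90-day value, the version labels, and both function names are assumptions for illustration; the real policy and tooling are not described here.

```python
from datetime import date, timedelta

BUILD_HORIZON = timedelta(days=90)  # assumed threshold; could just as well be 180


def allowed_in_production(built_on: date, today: date) -> bool:
    """A binary may run in production only if it was built within the horizon."""
    return today - built_on <= BUILD_HORIZON


def client_versions_to_support(releases: list[tuple[str, date]], today: date) -> list[str]:
    """Client versions that a still-permitted binary could have been built against."""
    cutoff = today - BUILD_HORIZON
    ordered = sorted(releases, key=lambda release: release[1])
    supported = []
    for i, (version, _released) in enumerate(ordered):
        # A version stays at head until the next release replaces it.
        superseded = ordered[i + 1][1] if i + 1 < len(ordered) else None
        # Conservative: keep a version that was still at head on the cutoff day.
        if superseded is None or superseded >= cutoff:
            supported.append(version)
    return supported
```

Note that a version released before the cutoff still counts if it was the head version at any point inside the window, since a binary built in that window would have linked it.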

This is for services with “thick clients”—you could cut out the client library and just make RPCs directly, if that was appropriate for your service.

> It encourages a ton of code churn with very low signal.

The places I worked at that had monorepos, you might filter out the automated code changes there to do automated migrations to new APIs. One PR per week sounds pretty manageable, when spread across a team.

Then again, I’ve also worked at places where I had a high meeting load, and barely enough time to get my work done, so maybe one PR per week is burdensome if you are scheduled to death in meetings.


> How would multi-repo change this? A dependency updated, and code broke, and the new version was broken—but you update dependencies in multi-repo anyway, and deployments can be broken anyway. I don’t see how multi-repo mitigates this.

In a multi-repo world, I control the repo for my own service. For a business-critical service in maintenance mode (with no active feature development), there's no reason for me to upgrade the dependencies. Code changes are the #1 cause of incidents; why fix something that isn't broken?

We would have avoided this problem had we not migrated to the monorepo simply because, well, we would have never pulled in the dependency upgrade in the first place.

> In fact, I’d say that the boundaries and API contracts are better when you use tools like Bazel or Buck

I'm familiar with both of these tools, and I agree with this point. However, you are making an implicit assumption that 1. the monorepo in question is built with a tool like Bazel that can enforce code visibility, and 2. that there exists a team or group of volunteers to maintain such a build system across the entire repo. I suspect both of these are not true for the vast majority of codebases outside of FAANG.

> The places I worked at that had monorepos, you might filter out the automated code changes there to do automated migrations to new APIs

Sure, this solves a logistical problem, but not the underlying technical problem of low-signal PRs. I would argue that doing this is an antipattern because it desensitizes service owners from reviewing PRs.


Not updating old libraries is how you end up getting known security vulns years after they are patched.


You should ask your colleagues who work in critical industries like banking and healthcare how much of their software stack depends on things that haven't been patched in more than 20 years ;)



    "critical industries like banking and healthcare".
What a red herring. This comment reads like ChatGPT was trained on Reddit forums. 99% of the software in those industries runs "inside the moat" where security doesn't matter. I am still running log4j from 10 years ago in lots of my stack, and it is the swiss cheese of software security! Who cares! It works! I'm inside the moat! If people want to do dumb black hat stuff, they get fired. Problem solved.

Also what does "banking" mean anyway? That comment is so generic as to be meaningless. If you are talking about Internet-facing retail banks in 2023, most are very serious about security... because of regulations, and giant fines when they get it wrong. And if the fines aren't large enough in your country, tell your democratically elected officials to 10x the fines. It will change industry behaviour instantly -- see US investment banks' risk taking after the Volcker Rule/Dodd-Frank regulations.


> If people want to do dumb black hat stuff, they get fired.

Firing people doesn’t get you un-hacked. When your risk model involves threats coming from the inside (and at sufficient scale and value it definitely should) then you want to harden things internally too.


I don't think it matters much if you're inside the moat. Running vulnerable software inside the moat makes it very easy for an attacker to move laterally once they're in. Patching everything where possible reduces the blast radius of an attack massively.


Both are industries notorious for poor security hygiene, so I'm not sure this is the coup you were looking for.


Healthcare developer here, developing for German market, we've used Java preview features and unstable React versions many times before. And we literally have two different roles on our team for upgrading vulnerable dependencies whenever we get an alert.


I know devs in several and you can’t even deploy to QA if a dependency has a known vulnerability.


> In a multi-repo world, I control the repo for my own service. For a business-critical service in maintenance mode (with no active feature development), there's no reason for me to upgrade the dependencies. Code changes are the #1 cause of incidents; why fix something that isn't broken?

This is now the “what is code rot?” discussion, which is an incredibly deep and nuanced discussion and I’m not going to do it justice here.

Just to pick an example—if you have an old enough version of your SSL library, it won’t be able to connect to modern web servers, depending on the configuration (no algorithms in common). If you have old database software, maybe it won’t work with the centralized backup service you have. If your software is stuck on 32-bit, maybe you run out of RAM, or maybe the vendors stop supporting the hardware you need. If you need old development tools to build your software, maybe the developers won’t be able to make changes in the future when they actually become necessary. What if your code only builds with Visual Studio 6.0, and you can’t find a copy, and you need to fix a defect now?

As much as I like the idea of building software once and then running the same version for an eternity, I prefer the idea of updating dependencies and spending some more time on maintenance. I advocate for a certain minimum amount of code churn. If the code churn falls too low, you end up with getting blind-sided by problems and don’t have any expertise to deal with it. My personal experience is that the amount of time you spend on maintenance with changing dependencies doesn’t have to be burdensome, but it’s project-dependent. Some libraries you depend on will cause headaches when you upgrade, some won’t. Good developers know how to vet dependencies.

If you really need an old version of a dependency, you can always vendor an old version of it.

If you can’t afford to put developers on maintenance work and make changes to old projects, then maybe those projects should move from maintenance to sunset.

> that there exists a team or group of volunteers to maintain such a build system across the entire repo

Bazel doesn’t require a whole team. My personal experience is that a lot of teams end up with one or two people who know the build system and know how to keep the CI running, and I don’t think this changes much with Bazel.

Bazel is actually very good at isolating build failures. You can do all sorts of things that break the build and it will only affect certain targets. It is better at this than a lot of other tools.

> underlying technical problem of low-signal PRs

I honestly don’t see a few low-signal PRs as a problem. PRs are not there to provide “signal”, they are just there to logically group changes.

The kind of PRs that I see go by in monorepos are things like “library X has deprecated interface Y, and the maintainers of X are migrating Y to Z so they can remove Y later”. Maybe your experience is different.

I do think that owners should not feel that they need to carefully review every PR that touches the code that they own. This is, IMO, its own anti-pattern—your developers should build processes that they trust, monitor the rate at which defects get deployed to production, and address systemic problems that lead to production outages. Carefully reviewing each PR only makes sense in certain contexts.

If you’re working in some environment where you do need that kind of scrutiny for every change you make, then you are probably also in an environment where you need to apply that scrutiny to dependencies. Maybe that means your code has fewer dependencies and relies more on the stdlib.


You’re describing bad habits as if they’re a foregone conclusion. Repository-level separation between code makes certain bad habits impossible, so a sloppy team will be more effective with many repos because they physically can’t perform an entire class of fuck-ups. But there are lots of organisations where these fuck-ups… just don’t happen, and so co-locating code in a monorepo isn’t a concern.

If your organisation can’t work effectively within a monorepo then you should absolutely address the problem, either by fixing the problematic behaviour or by switching away from a monorepo. The problem isn’t monorepos, the problem is monorepos in your organisation.


While the 2nd and 3rd points are not really unique to monorepos, the first point is actually valid. This is why a monorepo usually should be packaged with a bunch of other development practices, especially comprehensive tests combined with a presubmit hook.

IMO, it's more of a development paradigm than a mere technology. You cannot simply use a monorepo in isolation, since its trade-offs are strongly coupled with a lot of other tooling and workflow. For this reason, I usually don't recommend migrating to a monorepo unless there's strong organizational-level support.


> 1. The single version dependencies are asinine. We are migrating to a monorepo at work, and someone bumped the version of an open source JS package that introduced a regression

Is it the convention for monorepos to all share the same dependencies? Does monorepo imply monolith? Surely one could have dependencies per "service", for example a Python app with its own Pipfile per directory.


> It encourages poor API contracts because it lets anyone import any code in any service arbitrarily.

Perhaps that might be the default case, but the build system has a visibility system[1] that means that you can carefully control who depends on what parts of your code.

Separately, while some might build against your code directly, a lot of code just gets built into services, and then folk write their code against your published API, i.e. your protobuf specification.

[1]: https://bazel.build/concepts/visibility


I agree with every single point you made. Unfortunately, it's one of those discussions that is never going to be resolved because like so much else, it's difficult to find common ground when there are competing priorities.

My point is that in reality, we use what best matches our knowledge, experience and perception and prioritisation of the problems. I, for one, believe that a monorepo is dangerous for small teams because it encourages coupling - not only do I believe it, but I saw it with my own eyes. It also creates unnecessary dependency chains. Monorepos contribute to a fallacy that every dependent on an object must be immediately updated or tech debt happens. But that's not even remotely given.

In any case, companies like Google and Amazon have more than enough resources to deal systematically with the problems of a monorepo. I'm sure they have entire teams whose job it is to fix problems in the VCS. But for small teams I remain unconvinced that it is a good idea. We shouldn't even be trying to do the things the big guys do, unless we want to spend all our time working on the tools instead of our businesses.


Personally, I am looking forward to switching to a monorepo as it makes things a lot easier. It makes testing a lot easier when you don’t need to deal with 70 repositories to test something. It’s also easier to ensure dependencies such as API libraries are up to date in each service, and you get quicker feedback on whether code changes break things. Right now I have to wait at least 24 hours to find out if a PR that I merged breaks things.


I've been saying this for half a decade. The solution to having to constantly update dependency version numbers is to ensure that dependencies are more generic than the logic which uses them. If a module is generic and can handle a lot of use cases in a flexible way, then you won't need to update it too often.

One problem is that a lot of developers at big companies code business logic into their modules/dependencies... So whenever the business domain requirements change, they need to update many dependencies... Sometimes they depend on each other and so it's like a tangled web of dependencies which need to be constantly updated whenever requirements change.

Instead of trying to design modules properly to avoid everything becoming a giant tangled web, they prefer to just facilitate it with a monorepo which makes it easier to create and work with the mess (until the point when nobody can make sense of it anymore)... But for sure, this approach introduces vulnerabilities into the system. I don't know how most of the internet still functions.


> The single version dependencies are asinine. We are migrating to a monorepo at work, and someone bumped the version of an open source JS package that introduced a regression. The next deploy took our service down

You're doing it wrong.

The point of monorepo is that if someone breaks something, it breaks right away, at build time, not at deployment time.

You're not really using a monorepo.


I find 1) to be a good property, assuming you have some safeguards or a rollback procedure. At a cultural/code-ownership level it moves the effort of shared-code changes onto the person making them rather than onto the people depending on the shared code, which reduces communication overhead and frustration points and increases responsibility.

For instance, in multi-repo environments I've often seen this pattern: own some code, bump an internal dependency to a new version, see it break, ask the person maintaining it what's up, realize this case wasn't taken into account, then go back and forth a few times before reaching an agreement.

On the other hand, in mono-repo environments it's usually more difficult to introduce wide changes, as you face all the consequences immediately. But the difficulty is mainly a technical/engineering one rather than a social one, and the outcome is better than the series of compromises made left and right after a big multi-repo change.


That sounds like a good argument for monorepos. Bumping a JS package that is used in several places should break the build; that's how you test it. It sounds like the fallout of the version bump was already caught on the next build, so hopefully it didn't make it into the master-equivalent branch.

Compare that with hundreds of tiny repos, each with their own little dependency system. Testing a version bump across the board before mainlining it is much more involved and you are more likely to hit stuff in production which should have been caught in test.

The other two points sound more like cultural issues, which may touch on branch strategies, code review, and what's expected of a developer. Cultural issues that overlap with technical ones are hard in a way that repository strategy isn't.


> 2. It encourages poor API contracts because it lets anyone import any code in any service arbitrarily. Shared functionality should be exposed as a standalone library with a clear, well-defined interface boundary. There are entire packaging ecosystems like npmjs and pypi for exactly this purpose.

I don't believe this is true, except in the short term. Unless the writing party is guaranteeing you forward compatibility, your consuming code will break when you update.

This is (almost) the only reason API contracts are worth having; the reason doesn't go away just because you can technically see all the code.


Context: happy monorepo user.

1, 2 and 3: Use separate dependencies for each package, so this doesn't happen. Use file filtering in your CI/CD (e.g. GitHub Actions) wisely: if a file is needed by two packages, tests for both packages need to run whenever it's changed, before merging, in addition to the usual end-to-end tests. Set up alerting for vulnerable dependencies and make sure to upgrade them everywhere they occur.

2: Also have some guidelines on that and enforce it either automatically or manually in PRs.


I have worked at Google and have built multi-language monorepos outside Google.

1. Have some concept of visibility restriction, e.g. the Go language has internal packages.

2. Ensure that every single package has a command to build the code.

3. Ensure that CI builds all the packages that changed or are impacted by the change in a given pull request.

These three steps are mostly sufficient in having a monorepo. What you get in return is high code consistency and code visibility for the whole team.
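
Step 3 comes down to a reverse-dependency walk. Below is a minimal Python sketch, assuming the dependency graph is already available as a plain dictionary; the package names and the `affected_packages` helper are hypothetical, and real build tools compute this for you.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on.
DEPS = {
    "services/api": ["libs/auth", "libs/http"],
    "services/worker": ["libs/queue"],
    "libs/auth": ["libs/http"],
    "libs/http": [],
    "libs/queue": [],
}


def affected_packages(changed: set[str], deps: dict[str, list[str]]) -> set[str]:
    """Return the changed packages plus everything that transitively depends on them."""
    # Invert the graph so we can walk from a package to its dependents.
    dependents: dict[str, set[str]] = {pkg: set() for pkg in deps}
    for pkg, requirements in deps.items():
        for req in requirements:
            dependents.setdefault(req, set()).add(pkg)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

A pull request touching only `libs/http` would need CI to build `libs/http`, `libs/auth`, and `services/api`, while `services/worker` can be skipped.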


1 and 2 could be solved by using proper Gradle multi-module projects and tests. So I would say this is a problem with the tooling of the language you're using. This is one of the reasons why I still can't understand how people operate with inferior ecosystems like Node in the backend, and I also wish Go had these things.


Code Monoliths make just about as much sense as Runtime Monoliths, that is to say, if you are splitting your project into different micro-services, you can split your code base into different repositories too.


1) You can have several independent projects in a monorepo

2) Private/public/internal modifiers

3) Independent builds/projects in a monorepo



