How does the architecture block major refactors or improvements? Are you running a single codebase for all your tenants, albeit with separate schemas for each?
Edit: on reading the link you included, it seems like a lot of the problems are on the Rails implementation of the architecture with ActiveRecord and Apartment rather than with the architecture itself.
Here I'll give you one: if you want to change a property of the database that will give your specific use case improved performance, you have no way to transactionally apply that change across all the tenant databases at once. Rolling back becomes a problem of operational scale, and so does rolling out.
What if you need to release some feature, but that feature requires a database feature to be enabled? Normally you enable it once, hopefully in a transaction, and then roll out your application. With this style you have to wait for N database servers to connect, enable, validate, and go live before you can even attempt to deploy the application, never mind recover if you get it wrong.
For this architecture to work you'd likely have to be able to operate more than one release at a time and move customers along with their database. If you're able to do that then this becomes much less of an issue.
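To make the coordination cost concrete, here's a rough sketch of what "enable and validate on N tenant databases before deploying" might look like. Everything in it (the env var, the pg_trgm extension, the deploy gate) is an invented example rather than anything from the thread:

```ruby
# Sketch only: enable a Postgres extension on every tenant database and verify
# it before the application release is allowed to proceed.
require "pg"

tenant_urls = ENV.fetch("TENANT_DATABASE_URLS").split(",")
failures = []

tenant_urls.each do |url|
  conn = nil
  begin
    conn = PG.connect(url)
    conn.exec("CREATE EXTENSION IF NOT EXISTS pg_trgm")
    enabled = conn.exec("SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm'").ntuples == 1
    failures << url unless enabled
  rescue PG::Error => e
    failures << "#{url}: #{e.message}"
  ensure
    conn&.close
  end
end

# Gate the application deploy on every tenant database being ready.
abort("Feature not enabled everywhere: #{failures.inspect}") unless failures.empty?
puts "All #{tenant_urls.size} tenant databases ready; safe to deploy."
```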
I’m not a database architect, but can’t you just add the column or whatever as a nullable field in one migration, update the application to write to the column, and then make the field non-nullable in a further migration?
As I say I’m no expert but this is what I’ve done previously.
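For illustration, the three-step pattern as it might look with ActiveRecord migrations; the table and column names are invented, and this is only a sketch of the general approach:

```ruby
# Step 1: add the column as nullable so old and new application versions both work.
class AddBillingEmailToAccounts < ActiveRecord::Migration[7.0]
  def change
    add_column :accounts, :billing_email, :string, null: true
  end
end

# Step 2 happens in application code and data: new releases start writing
# billing_email, and a backfill task populates the existing rows.

# Step 3: once every row is populated, tighten the constraint.
class MakeBillingEmailNotNull < ActiveRecord::Migration[7.0]
  def change
    change_column_null :accounts, :billing_email, false
  end
end
```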
Sure, this problem can be solved, as can any other -- but there's a cost to it in development time. For every feature, the dev working on it has to do it in 3 steps to ensure backwards compatibility and handle partial migration failures gracefully. Imagine doing this for a small feature - the ops effort dwarfs the actual feature! Many small features would probably be dropped.
Multi-single-tenant actually makes it easier to do transactional schema upgrades because these operations only impact one customer at a time. No fiddling with middle-of-the-migration state.
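A minimal sketch of what that can look like with the Apartment gem discussed elsewhere in the thread, assuming PostgreSQL (whose transactional DDL is what keeps a failed tenant clean). The DDL itself is an invented example:

```ruby
# Apply a schema change tenant by tenant; each tenant either gets the whole
# change or none of it, and a failure only affects that one tenant.
Apartment.tenant_names.each do |tenant|
  Apartment::Tenant.switch(tenant) do
    ActiveRecord::Base.transaction do
      ActiveRecord::Base.connection.execute(<<~SQL)
        ALTER TABLE invoices ADD COLUMN purchase_order_ref text;
      SQL
    end
  end
rescue StandardError => e
  warn "Migration failed for #{tenant}: #{e.message}"
end
```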
It becomes a non-issue if you're able to go down for a few moments.
Not necessarily viable for some software though, it just depends on what you're running.
(Especially because you can apply the migration on slave and subsequently switch master, making this downtime as quick as the time your system needs to do a database switch.)
This doesn't apply to big services with millions of users, however.
> You roll the feature out incrementally to the users who are ready. You build backwards compatible features. Basic software engineering.
The parent mentioned having to support ~150 B2B customers, so the effort is amplified 100x or more: over 100 individual customer databases have to be coddled as you've described, while they are stuck with poor tooling to manage changes across their unusual architecture.
While not a web app, we too have ~200 B2B customers running our application, and each one has their own DB. Some self-host; most of the others are hosted by us.
We have written our own tool to upgrade the DB schema. It's nothing fancy: it takes an XML description, compares it with the current DB, and makes the changes. It does, however, ensure the upgrade process is automated.
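The tool itself isn't shown here, but the general idea might look something like this sketch (the XML layout, additive-only diffing, and connection details are all assumptions, not the real tool):

```ruby
# Read the desired schema from XML, compare it with what the database actually
# has, and apply the missing pieces. A real tool would also handle new tables,
# type changes, indexes, dropped columns, and so on.
require "pg"
require "rexml/document"

conn = PG.connect(ENV.fetch("DATABASE_URL"))
desired = REXML::Document.new(File.read("schema.xml"))

desired.elements.each("schema/table") do |table|
  table_name = table.attributes["name"]
  existing_columns = conn.exec_params(
    "SELECT column_name FROM information_schema.columns WHERE table_name = $1",
    [table_name]
  ).map { |row| row["column_name"] }

  table.elements.each("column") do |column|
    name = column.attributes["name"]
    type = column.attributes["type"]
    next if existing_columns.include?(name)

    # Only additive changes here, which also keeps the upgrade backwards compatible.
    conn.exec("ALTER TABLE #{table_name} ADD COLUMN #{name} #{type}")
  end
end
```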
New versions of our application come with a new DB schema, and we also have an auto-updater for our software. We have a small launcher application which allows users to select a few older versions of our software, so upgrades are almost always safe in the sense that if the user encounters a bug in the new version, they can try the previous one.
If the DB schema upgrade tool fails for some reason, the application upgrade will be held back and we'll get a notification.
Combined, this allows us to be quite aggressive about pushing out new versions, and to do so without manual intervention.
A limitation with this setup is that DB changes have to be backwards compatible. So we do have a few cases of "if this table exists do this, else do that" type logic and similar, but mostly it's not a big deal.
For example, I recently had to refactor an old table where they had used five columns for additional references (ref_1, ref_2, etc.) into a new table of additional references. To handle this I just made a new view which returns either the refs from the new table, or the data from the old columns in case the new table doesn't have any data for that record.
I then changed the application to use a grid instead of five separate edit controls, and to use this new view. If the user had to use an older version, that version would just write to the five fields and not populate a row in the new table, and the view would then return this data if the user later viewed it in the new version of our application.
So the "overhead" of backwards compatibility in this case was just making sure this would be sufficient and writing that view: a few minutes of work.
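For illustration, a compatibility view along those lines might look like the following (assuming PostgreSQL here); the table and column names are invented, since the real schema isn't shown:

```ruby
# Rows that have entries in the new additional_references table use those;
# everything else falls back to the old ref_1..ref_5 columns.
require "pg"

conn = PG.connect(ENV.fetch("DATABASE_URL"))
conn.exec(<<~SQL)
  CREATE OR REPLACE VIEW record_references AS
  SELECT r.id AS record_id, ar.position, ar.reference
  FROM records r
  JOIN additional_references ar ON ar.record_id = r.id
  UNION ALL
  SELECT r.id, refs.position, refs.reference
  FROM records r
  CROSS JOIN LATERAL (VALUES (1, r.ref_1), (2, r.ref_2), (3, r.ref_3),
                             (4, r.ref_4), (5, r.ref_5)) AS refs(position, reference)
  WHERE refs.reference IS NOT NULL
    AND NOT EXISTS (SELECT 1 FROM additional_references ar WHERE ar.record_id = r.id)
SQL
```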
Doing the same thing to 100 databases is usually easy to automate.
I managed a group that had a dozen customers on a single multi-tenant database and some migrations took more than a day - we needed to set maintenance windows on weekends (and once we had to roll back after an error).
Having a dozen databases we could roll out updates to on a client-by-client basis would have saved us from some enormous headaches.
For me it falls into the category of decisions that are easy to make and difficult to un-make. If for whatever reason you decide this was the wrong choice for you, be it for tech needs (e.g. Rails) or business needs, merging your data and going back through the codebase to add this level of filtering is a massive undertaking.
Indeed, but if you are making a B2B enterprise/SMB SaaS, I think you are most likely to regret the opposite choice [1][2]. A lot of companies run a single instance, multitenant application and have to develop custom sharding functionality down the road when they realize the inevitable: that most joins and caches are only needed on strict subsets of the data that are tenant specific.
If you get successful enough in this type of application space, you reach a mature state in which you need to be able to:
* Run large dedicated instances for your largest customers, because either their performance or security requirements mandate it.
* Share resources among a large number of smaller customers, for efficiency and fault tolerance reasons.
You can get there in two ways:
* You start with a massive multi-tenant application, and you figure out a way to shard it and separate it into pieces later.
* You start with multiple small applications, and you develop the ability to orchestrate the group, and scale the largest of them.
I would argue the latter is more flexible and cost efficient, and requires less technical prowess.
You can always take a multi-tenant system and convert it into a single-tenant system far more easily than the reverse. First and foremost, you can simply run the full multi-tenant system with only a single tenant, which if nothing else enables progressive development (you can slowly remove those now-unnecessary WHERE clauses, etc.).
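A small illustration of what those tenant-scoped WHERE clauses typically look like in ActiveRecord (names invented):

```ruby
# With a single tenant, the scope still works; it just always matches the one
# tenant, so call sites can be simplified gradually rather than in one
# big-bang refactor.
class Invoice < ActiveRecord::Base
  scope :for_tenant, ->(tenant_id) { where(tenant_id: tenant_id) }
end

# Multi-tenant call site today:
#   Invoice.for_tenant(current_tenant.id).where(status: "open")
# Once only one tenant exists, this can eventually become:
#   Invoice.where(status: "open")
```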
See my sibling comment with the Hubspot example. Even though the system might work internally, other things will break if you start having duplicate account IDs because other systems don't think of the account ID as a cluster-local identifier, but as a global one.
Just thinking through this, but if it's an entirely separate environment, just host it on a separate subdomain and the account id becomes irrelevant. If you have a functioning staging environment, you already have the ability to run an isolated environment without issue, this is just a different application of the same idea.
You can probably run the environment itself, but other systems (release management, monitoring, billing, etc) probably rely on the account_id being unique.
I think you are talking about problems going from multiple single-tenant systems to a single multi-tenant system. Your parent is talking about the opposite.
In the simple, YAGNI implementation of this, when you create a new HubSpot account, most likely that will insert a new row into the accounts table, and the auto generated ID of that row will be your account ID. Therefore you need uniqueness to be enforced at that level.
If you want to start running a separate copy of the system, you need to refactor the system to move that sequence out of the database so that two different customers running on different clusters don't end up with the same account ID. This is just an example, but there are many problems like this that are caused by the assumption that the production system is a single unique system.
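As one illustration of the kind of change involved (this is not how HubSpot does it, just a sketch with invented numbers), IDs can be derived from a cluster identifier plus the local sequence instead of the raw auto-increment value, or replaced with UUIDs altogether:

```ruby
# Each cluster owns a non-overlapping range, so local sequences can keep
# incrementing independently without ever colliding across clusters.
CLUSTER_ID = Integer(ENV.fetch("CLUSTER_ID"))  # e.g. 1, 2, 3 per deployment
ID_RANGE_PER_CLUSTER = 1_000_000_000           # assumed range size

def next_account_id(local_sequence_value)
  CLUSTER_ID * ID_RANGE_PER_CLUSTER + local_sequence_value
end
```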
Everything has a solution, but want to bet that at least 20 different internal systems at HubSpot assume that the account ID in that URL is globally unique?
In my experience by the time you reach this point you have a lot of operational complexity because you and your team are used to your production cluster being a single behemoth, so chances are it's not easy to stand up a new one or the overhead for doing so is massive (i.e. your production system grew very complex because there is rarely if ever a need to stand up a new one).
Additionally, a multi tenant behemoth might be full of assumptions that it's the only system in town therefore making it hard to run a separate instance (i.e. uniqueness constraints on names, IDs, etc).
One of the issues I see in one of my projects is high interactivity between accounts. E.g. if account 1 'sends' something to account 2, both of the shared/separate DB instances need to be up, or there'll need to be some kind of queueing mechanism.
That's hard enough, and then add to it that most clients want to BYOK (bring their own keys) to those instances.
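One common shape for that queueing mechanism is an outbox table in the sender's own database; a hedged sketch with invented names:

```ruby
# Writing locally always succeeds even if the destination instance is down;
# a background worker drains the outbox and delivers each message to the
# destination account's instance, retrying until it is reachable.
class OutboxMessage < ActiveRecord::Base
  # assumed columns: destination_account_id, payload (jsonb), status
end

def send_to_account(destination_account_id, payload)
  OutboxMessage.create!(
    destination_account_id: destination_account_id,
    payload: payload,
    status: "pending"
  )
end
```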
High interactivity between accounts is a good reason to not adopt the proposed multi-single-tenant architecture. The scenarios discussed are B2B enterprisey apps in which the environments are essentially independent from each other.
Yeah, it's far from a black and white issue. I am by no means against single tenant database structures, and for our initial use case it was likely the correct decision. I have not run into anything that stopped us in our tracks.
My pushback focuses on two things:
1: The Rails implementation of this, specifically with off-the-shelf gems and integration with ActiveRecord. It presents a lot of unknown framework complexity down the road. We are currently running on a fork of the Apartment gem as it breaks under newer versions of Rails.
2: Long term support for evolving business case. Now, this is very much a problem unique to our company and how our business/product has evolved. We started out with a very clear separation of account concerns, and as we've grown and pivoted things have changed. You are unlikely to experience the same evolution we have. HOWEVER, all startups face these unknowns, and giving yourself as much flexibility early in the game will pay off down the road. There are no right answers here, you just have to think it through and weigh the costs/benefits. Maybe for you that flexibility is found in single tenant. For us it appears in hindsight that multi-tenant would have been the best long term choice.