Meaning the cloud may go down more frequently than small-scale self-hosted deployments, but downtimes are on average much shorter on the cloud. A lot of money is at stake for cloud providers, so GitHub et al. have far more resources to throw at fixing a problem than you or I do when self-hosting.
On the other hand, when things go down self-hosted, it is far more difficult or expensive to have on-call engineers who can actually restore services quickly.
The skill needed to understand and fix a problem is in short supply, so it takes semi-skilled talent longer to do so, even though the failure modes are simpler (but not simple).
The skill required to set up something locally that works is vastly different from the skill required to make it work reliably. Talent with the latter is scarce to find or retain.
Well, just a few weeks ago we weren't able to connect to RDS for several hours. That's way more downtime than we ever had at the company I worked for 10 years ago, where the DB was just running on a computer in the basement.
Most software doesn’t need to be distributed. But the growth paradigm has us building everything on principles that can scale to world-wide, low-latency accessibility.
A UNIX pipe gets replaced with a $1200/mo. maximum IOPS RDS channel, bandwidth not included in price. Vendor lock-in guaranteed.
“Your own solution” should be that CI isn’t doing anything you can’t do on developer machines. CI is a convenience that runs your Make or Bazel or Just (or whatever you prefer) builds, and your production systems work fine without it.
I’ve seen that approach keep critical stuff deployable through several CI outages first-hand, and it also has the upside of making “CI issues” trivial to debug, since you can run the same target locally.
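To make that concrete, here's a minimal sketch of what such a shared entry point could look like: a hypothetical `build.py` at the repository root that both developers and the CI job invoke with the same arguments. The task names and commands are illustrative placeholders, not any particular project's setup.

```python
#!/usr/bin/env python3
"""Hypothetical shared build entry point, run identically by CI and by developers."""
import subprocess
import sys

# Illustrative task table; a real project would list its own commands here.
TASKS = {
    "lint": [["ruff", "check", "."]],
    "test": [["pytest", "-q"]],
    "build": [["python", "-m", "build"]],
}

def run(task: str) -> None:
    """Run every command registered for a task, stopping on the first failure."""
    for cmd in TASKS[task]:
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    requested = sys.argv[1:] or list(TASKS)
    unknown = [t for t in requested if t not in TASKS]
    if unknown:
        sys.exit(f"unknown task(s): {', '.join(unknown)}")
    for task in requested:
        run(task)
```

The CI configuration then reduces to "check out the repo and run `python build.py`", which is exactly what a developer types locally when the CI service is unavailable.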
> should be that CI isn’t doing anything you can’t do on developer machines
You should aim for this, but there are some things CI can do that you can't do on your own machine, for example running jobs on multiple operating systems and architectures. You also need CI to block PRs from merging until it passes, and for merge queues/trains to prevent races.
Every Linux desktop system has a keychain implementation. You can of course use your own system if you don't like that. You can use separate keys, and your developers don't need access to the real key unless all the CI servers are down.
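For illustration, here's a sketch of what reading such a key could look like on a developer machine, assuming the third-party Python `keyring` package (which talks to the desktop's keychain via the Secret Service API) and made-up service/key names:

```python
# Minimal sketch: fetch a signing secret from the desktop keychain using the
# third-party `keyring` package. The service and key names below are made-up
# placeholders, not a real project's names.
import keyring

SERVICE = "example-ci"        # hypothetical keychain "service" entry
KEY_NAME = "release-signing"  # hypothetical key identifier

def get_signing_secret() -> str:
    secret = keyring.get_password(SERVICE, KEY_NAME)
    if secret is None:
        raise RuntimeError(
            f"No secret stored for {SERVICE}/{KEY_NAME}; "
            "store one with keyring.set_password() or the keychain UI."
        )
    return secret

if __name__ == "__main__":
    # Developers would normally keep only a throwaway key here; the real
    # release key stays on the CI hosts until an outage forces a local release.
    print("secret found:", bool(get_signing_secret()))
```

`keyring` picks up whatever backend the desktop provides (GNOME Keyring, KWallet, etc.), so the same code runs unchanged across distributions.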
Yes. I've quite literally run a self-hosted CI/CD solution, and yes, in terms of total availability, I believe we outperformed GHA when we did so.
We moved to GHA b/c nobody ever got fired ^W^W^W^W leadership thought eng running CI was not a good use of eng time. (Without much questioning of how much time was actually spent on it… which was pretty close to none. Self-hosted stuff has a high initial setup cost… and then just kinda runs.)
Ironically, one of our self-hosted CI outages was caused by Azure — we have to get VMs from somewhere, and Azure … simply ran out. We had to swap to a different AZ to merely get compute.
The big upside to a self-hosted solution is that when stuff breaks, you can hold someone's feet to the fire. (Above, that would be me, unfortunately.) With GitHub? Nobody really cares unless it is so big, and so severe, that they're more or less forced to, and even then the response is usually lackluster.
It's fairly straightforward to build resilient, affordable, and scalable pipelines with DAG orchestrators like Tekton running in Kubernetes. Tekton in particular has the benefit of being low-level enough that it can just be plugged into the CI tool above it (Jenkins, Argo, GitHub Actions, whatever) and is relatively portable.
I mean, yes. We've hosted internal apps with four nines of reliability for over a decade without much trouble. It depends on your scale of course, but for a small team it's pretty easy. I'd argue it's easier than it has ever been, because now you have open-source software that is containerized and trivial to spin up and maintain.
The downtime we do have each year is typically also on our terms, not in the middle of a work day or at a critical moment.
With a build system that can run on any Linux machine and is only invoked by the CI configuration? Even if all your servers go down, you just run it on any developer's machine.
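As a rough sketch of that fallback, assuming a hypothetical `make release` target and a couple of illustrative tool names, an "outage drill" a developer could run on any Linux box might look like this:

```python
#!/usr/bin/env python3
"""Hypothetical outage drill: check that a developer machine can reproduce the CI build.

`make` and the `release` target stand in for whatever build system the project
actually uses; nothing here depends on the CI service being reachable.
"""
import shutil
import subprocess
import sys

REQUIRED_TOOLS = ["make", "git"]  # illustrative; list your real toolchain here

def main() -> int:
    missing = [tool for tool in REQUIRED_TOOLS if shutil.which(tool) is None]
    if missing:
        print(f"missing tools: {', '.join(missing)}", file=sys.stderr)
        return 1
    # The exact command the CI configuration runs -- and nothing more.
    return subprocess.run(["make", "release"]).returncode

if __name__ == "__main__":
    sys.exit(main())
```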