
In the limit, there is a hard tradeoff between efficiency and reliability.

Failovers, redundancies, and backups are all important for building systems that are resilient in the face of problems, for reasons you've pointed out.

However, failovers, redundancies, and backups are inefficient. Solving a problem with one thing is always going to be more efficient than solving the same problem with ten things.

It's interesting to see this tradeoff play out in real life. We see people coalescing around one or two services because that's the most efficient path, and then we see them diversifying across multiple services once bad things happen to the centralised services.



This is a very important point, and one often misunderstood at both a business and a societal level. Reliability has a cost. If you optimize all redundancy out of a system, you find that the system becomes brittle, unreliable, and prone to failure. Companies like 3M and Boeing have found that in the pursuit of higher profits they lost their focus on quality, and they suffered the resulting loss of trust and brand damage. The developed world discovered this with COVID: our just-in-time efficiency meant that a hiccup anywhere in the supply chain caused mass shortages of goods.


> In the limit, there is a hard tradeoff between efficiency and reliability.

Yes, but notice that most things in the GP's comment have an exponential impact on reliability (well, on 1 - reliability), so they are often no-brainers as long as they follow that simple model (which they stop doing at some point).
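A minimal sketch of that exponential effect, assuming replica failures are independent (the key and often-violated assumption): if each of n replicas is unavailable with probability p, the system is down only when all n fail at once, so unavailability is p**n and shrinks exponentially with n while cost grows only linearly.

```python
def system_unavailability(p: float, n: int) -> float:
    """Probability that all n independent replicas are down at once.

    Assumes failures are independent -- correlated failures (shared
    power, shared deploy pipeline, shared cloud region) break this model,
    which is the point at which redundancy stops being a no-brainer.
    """
    return p ** n

# Each extra replica multiplies unavailability by p (here, by 0.01):
for n in range(1, 5):
    print(n, system_unavailability(0.01, n))
```

This is also why the GP's real-world examples diverge from the model: the services people coalesce around tend to fail together, not independently.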


Imho, the problem is that trade-offs are hard to estimate. Optimizations (not just in computer systems, but in general) are often seen as risk-free, when in reality they are not. More often than not, one will be celebrated for optimization, and rarely for resilience (which gets dubbed duplicate, useless work).



