This reminds me of a particularly devious C preprocessor trick:
#define if(x) if ((x) && (rand() < RAND_MAX * 0.99))
Now your conditionals work correctly 99% of the time. Sure it's possible for them to fail, but unlikely.
Now you might object that C if() statements are far more commonly executed than "apt-get install". This is true, but to account for this you can adjust "0.99" above accordingly. The point is that there is a huge difference between something that is strongly reliable and something that is not.
Things that are unreliable, even if failure is unlikely, lead to an endless demand for SysAdmin-like babysitting. A ticket comes in because something is broken, the SysAdmin investigates and found that 1 out of 100 things that can fail but usually doesn't has in fact failed. They re-run some command, the process is unstuck. They close the ticket with "cron job was stuck, kicked it and it's succeeding again." Then go back to their lives and wait for the next "unlikely but possible" failure.
Some of these failures can't be avoided. Hardware will always fail eventually. But we should never accept sporadic failure in software if we can reasonably build something more reliable. Self-healing systems and transient-failure-tolerant abstractions are a much better way to design software.
That difference goes away at the point where other risk factors are higher. How high is my confidence that there isn't a programming bug in Nix? Above 99%, perhaps, but right now it's less than my confidence that apt-get is going to work.
Most of us happily use git, where if you ever get a collision on a 128-bit hash it will irretrievably corrupt your repository. It's just not worth fixing when other failures are so much more likely.
The point of my post wasn't "use Nix", it was "prefer declarative, self-healing systems."
Clearly if Nix is immature, that is a risk in and of itself. But all else being equal, a declarative, self-healing system is far better than an imperative, ad hoc one.
Other risk factors don't make the difference "go away", because failure risks are compounding. Even if you have a component with a 1% chance of failure, adding 10 other components with a 0.1% chance of failure will still double your overall rate of failure to 2%.
This is not to mention that many failures are compounding; one failure triggers other failures. The more parts of the system that can get into an inconsistent state, the more messed up the overall picture becomes once failures start cascading. Of course at that point most people will just wipe the system and start over, if they can.
File hashes are notable for being one of the places where we rely on probabilistic guarantees even though we consider the system highly reliable. I think there are two parts to why this is a reasonable assumption:
1. The chance of collision is so incredibly low, both theoretically and empirically. Git uses SHA1, which is a 160 bit hash actually (not 128), and the odds of this colliding are many many orders of magnitude less likely than other failures. It's not just that it's less likely, it's almost incomparably less likely.
2. The chance of failure isn't increased by other, unrelated failures. Unlike apt-get, which becomes more likely to fail if the network is unavailable or if there has been a disk corruption, no other event makes two unrelated files more likely to have colliding SHA1s.
Now you might object that C if() statements are far more commonly executed than "apt-get install". This is true, but to account for this you can adjust "0.99" above accordingly. The point is that there is a huge difference between something that is strongly reliable and something that is not.
Things that are unreliable, even if failure is unlikely, lead to an endless demand for SysAdmin-like babysitting. A ticket comes in because something is broken, the SysAdmin investigates and found that 1 out of 100 things that can fail but usually doesn't has in fact failed. They re-run some command, the process is unstuck. They close the ticket with "cron job was stuck, kicked it and it's succeeding again." Then go back to their lives and wait for the next "unlikely but possible" failure.
Some of these failures can't be avoided. Hardware will always fail eventually. But we should never accept sporadic failure in software if we can reasonably build something more reliable. Self-healing systems and transient-failure-tolerant abstractions are a much better way to design software.