HA is definitely super tricky, and not many products do it well. One of the last NoSQL databases I used, for instance, took longer to detect a failure and fail over than it took to restart, so during an upgrade the DBAs would just restart the cluster instead of waiting for failover to happen.