Generally speaking, redundancy/availability could also be achieved through replication rather than automatic rescheduling, where you deploy multiple replicas of the service across multiple machines. If one of them dies, the other one still continues service traffic. Like in good old days when we didn't have k8s and dynamic infra.
This trades off some automation for simplicity. Although, this approach may requires manual intervention when a machine fails permanently.
This trades off some automation for simplicity. Although, this approach may requires manual intervention when a machine fails permanently.