> The AWS container services, including Fargate, ECS and EKS, experienced increased API error rates and latencies during the event. While existing container instances (tasks or pods) continued to operate normally during the event, if a container instance was terminated or experienced a failure, it could not be restarted because of the impact to the EC2 control plane APIs described above.
This seems pretty obviously false to me. My company has several EKS clusters in us-east-1 with most of our workloads running on Fargate. All of our Fargate pods were killed and were unable to be restarted during this event.
Strong agree. We were using Fargate nodes in our us-east-1 EKS cluster and not all of our nodes dropped, but every coredns pod did. When they came back up their age was hours older than expected, so maybe a problem between Fargate and the scheduler rendered them “up” but unable to be reached?
Either way, it was surprising to us that already-provisioned compute was impacted.
Saw the same. The only cluster services I was running in Fargate were CoreDNS and cluster-autoscaler; thought it would help the clusters recover from anything happening to the node group where other core services run. Whoops.
Couldn't just delete the Fargate profile without a working EKS control plane. I lucked out in that the label selector the kube-dns Service used was disjoint from the one I'd set in the Fargate profile, so I just made a new "coredns-emergency" deployment and cluster networking came back. (cluster-autoscaler was moot since we couldn't launch instances anyway.)
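For anyone trying to picture why that workaround works, here is a rough sketch of the selector logic in Python. The label keys and values are assumptions for illustration, not the actual ones from my cluster: I'm assuming the kube-dns Service selects on `k8s-app: kube-dns` and that the Fargate profile selects on a hypothetical `eks.amazonaws.com/component: coredns` label. The `matches` and `placement` helpers are made up too. They only mimic equality-based label matching and are not a real Kubernetes or AWS API.

```python
# Minimal model of equality-based label selection.
# All label keys/values below are illustrative assumptions.

def matches(selector: dict, labels: dict) -> bool:
    """True when every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())

# Assumed selector on the kube-dns Service.
SERVICE_SELECTOR = {"k8s-app": "kube-dns"}

# Assumed (hypothetical) label selector in the Fargate profile.
FARGATE_PROFILE_SELECTOR = {"eks.amazonaws.com/component": "coredns"}

def placement(labels: dict) -> str:
    """Pods matching the Fargate profile land on Fargate; others go to the node group."""
    return "fargate" if matches(FARGATE_PROFILE_SELECTOR, labels) else "node-group"

# Original CoreDNS pods carry both labels, so they are pinned to Fargate.
original_pod = {"k8s-app": "kube-dns", "eks.amazonaws.com/component": "coredns"}

# The emergency deployment carries only the label the Service selects on.
emergency_pod = {"k8s-app": "kube-dns"}

for name, labels in [("original", original_pod), ("emergency", emergency_pod)]:
    print(name, placement(labels), "served by kube-dns:", matches(SERVICE_SELECTOR, labels))
```

Because the two selectors don't overlap, the emergency pods get picked up by the Service without being claimed by the Fargate profile. If the profile had selected on the same label as the Service, this trick would not have been available.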
I was hoping to see something about that in this announcement, since the loss of live pods is nasty. Not inclined to rely on Fargate going forward. It is curious that you saw those pod ages; maybe Fargate kubelets communicate with EKS over the AWS internal network?