We have, and as far as we can see it's a command-line tool that needs to be run manually. That model doesn't work for us: we scale instances in and out by the dozens throughout the day. I could be missing something, but anything that requires manual command-line interaction isn't viable.
For instance, if we see that our worker cluster has reached some consumed-capacity threshold (CPU, memory, etc.), then we need to add more worker nodes. The way we do this is via Auto Scaling Groups. Auto Scaling Groups work on launch configs, which are essentially an AMI plus instance user data. How to get an AMI that has just the Kubernetes pieces, can be started at will, and can be told to join a cluster at runtime has not been clear from the documentation I've read.
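For what it's worth, the rough shape we'd want is a launch config whose user data joins the node to an existing cluster at boot. This is only a hypothetical sketch, assuming a kubeadm-provisioned cluster and an AMI with kubelet/kubeadm preinstalled; the endpoint, token, and CA hash are placeholders:

    #cloud-config
    # Hypothetical EC2 user data for an ASG launch config. Assumes the AMI
    # already has kubelet and kubeadm baked in; all values are placeholders.
    runcmd:
      - kubeadm join 10.0.0.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>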
Thanks, I have looked at this as well. There are a few issues:
- It only scales worker nodes
- It only scales nodes if a pod can't be scheduled. This is too late for us, as our requirement is to scale when resource utilization crosses a threshold (i.e., reserved memory is over 70%).
- It still doesn't solve the baked AMI problem. I may be wrong about this, but the docs for CA on AWS are lacking. For instance, when does the ASG/launch config get created? Is that something CA does, or does it need to be done ahead of time? Which AMI does it use? Etc.
1) Yes, but I'm not sure why you would want to scale other nodes (I assume you're referring to the masters?)
2) Yes, the cluster-autoscaler will give you more node capacity. If you want to scale Pods on a CPU basis, you can use a Horizontal Pod Autoscaler (https://kubernetes.io/docs/tasks/run-application/horizontal-...); there's a sketch of one after this list. I believe it now supports custom metrics as well, so you can scale on any resource threshold.
3) The ASG/launch config is created by kops when you create the cluster, and is also manageable through the kops tooling. It will default to the kops default AMI (which includes the kubelet and everything you need for k8s to run), but you can also override that with a different AMI if you have a k8s-compatible AMI that better fits your needs (see the instance-group sketch below).
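To make point 2 concrete, here's a minimal HPA sketch (the names are hypothetical; a Deployment called "worker" is assumed):

    # Minimal HPA sketch: scales the hypothetical "worker" Deployment
    # between 2 and 20 replicas, targeting 70% average CPU utilization.
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: worker
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: worker
      minReplicas: 2
      maxReplicas: 20
      targetCPUUtilizationPercentage: 70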
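And for point 3, the AMI override lives on the kops InstanceGroup, which you can edit with `kops edit ig nodes`. A rough sketch (the image value, machine type, and sizes here are just placeholders):

    # Rough InstanceGroup sketch; "image" is where you'd point at your
    # own k8s-compatible AMI instead of the kops default.
    apiVersion: kops.k8s.io/v1alpha2
    kind: InstanceGroup
    metadata:
      name: nodes
    spec:
      role: Node
      image: <your-ami-or-kops-default>   # placeholder
      machineType: m4.large
      minSize: 2
      maxSize: 10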
To expand on the sibling comment by nrmitchi: in Kubernetes you don't scale your worker nodes by CPU or memory. Instead you scale your pods by CPU or memory. Running the cluster autoscaler ensures that when you run out of space to schedule a new pod, new worker nodes are brought up. It also works the other way: when your nodes have been underutilized for a while, the autoscaler will kill some nodes (and the scheduler will automatically handle rescheduling the pods from those nodes onto others).
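For context, "space to schedule" is determined by pod resource requests, not live utilization. A minimal sketch (the image name is a placeholder):

    # The scheduler reserves capacity based on these requests; if no node
    # has 500m CPU / 256Mi unreserved, the pod goes Pending and the
    # cluster autoscaler reacts by adding a node.
    apiVersion: v1
    kind: Pod
    metadata:
      name: worker
    spec:
      containers:
      - name: app
        image: example.com/worker:latest   # placeholder image
        resources:
          requests:
            cpu: "500m"
            memory: "256Mi"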
Right, I see that in the CA docs. This doesn't work with our business requirements, since our workloads are spiky and we don't want to wait for pods to fail to schedule before more nodes are started.
A contrived example: imagine that for every user who logs into Gmail, Google spins up a pod for that user. Now imagine there isn't any capacity to schedule a pod for a user who is logging in. That user would either be denied or have to wait for an instance to come up and then for the pod to be scheduled. Not ideal.
Yes, this is my biggest problem with the autoscaler as well. I opened an issue about this almost a year ago, looks like they're finally getting around to addressing it: https://github.com/kubernetes/autoscaler/pull/77