Manage Kubernetes Clusters on AWS Using Kops (amazon.com)
164 points by betahost on July 7, 2017 | 68 comments



The Kubernetes team has been doing a lot of work to make these admin tools less and less necessary. More and more pieces of it can be run from within K8s itself. For example, etcd used to need to be set up and managed externally; now it's just inside. And extensions are growing up too; see CRDs in 1.7.

And unlike setup & management tools, it appears that we have a clear "winner" for K8s app management: helm. And there's more overlap than you'd expect. For instance, I recently typed `helm install prometheus` and not only did it install Prometheus, it installed it with all the hooks necessary to monitor the K8s cluster.
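For reference, the incantation was roughly this (the chart path is from the then-default stable repo, and the release name is just an example):

    # Install the community Prometheus chart; it comes preconfigured to scrape
    # the cluster's nodes, pods and apiserver.
    helm install stable/prometheus --name prometheus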

I'm not sure why I can't do the same to get an elasticsearch, logstash & kibana stack (or similar competing stack) set up as a cluster logging solution. AFAICT right now you have to have the right flags set on your kubelet startup script to do this, but that's the sort of thing that I hope & believe that K8s is making better.

And setting up a glusterfs cluster to use as a storage provider also did a surprising amount of its setup in k8s.

Obviously K8s setup can't quite be reduced to a simple `apt-get install kubelet`, but hopefully eventually it isn't much more.
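To be fair, the kubeadm flow already gets fairly close; a sketch, assuming the official Kubernetes apt repo is already configured:

    # Install the core components, then bootstrap a single-node control plane.
    apt-get install -y kubelet kubeadm kubectl
    kubeadm init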


100% agreed and looking forward to that - kops is a big part of that effort, actually. If you look under the hood you'll see that kops uses a lot of the kubernetes apimachinery. There's obviously a tricky chicken-and-egg situation with creating the cluster in the first place, but the hope is that a lot of the post-installation activities you do with kops could be done through kubectl (e.g. adding groups of nodes of a different instance type), talking to those same kops API objects on the k8s apiserver. The apimachinery team has been doing great work to enable this, which seems to have come to fruition in k8s 1.7.


It's such a pity that AWS don't offer a managed Kubernetes cluster as a service.

Our team has "wasted" a fair bit of time researching and implementing all the bits and pieces to build pre-prod and prod Kubernetes environments, whereas Azure's and (obviously) GCE's out-of-the-box solutions make it so much easier.


It would be nice, but it's easy to see why they don't.

For the set of customers that wants a supported orchestration tool, it would allow those AWS customers to more easily leave for Azure or GCE. Perhaps more importantly, it would allow existing customers to have credible on-prem dev, test, or disaster environments.

When you have the sort of market share AWS has, there's no real incentive to open those doors. I suspect this won't change until/if this market is more balanced.


Perhaps for some AWS services (ECS, etc.). But I would still stay on Amazon for all the other surrounding services that are crucial: SQS, S3, occasionally DynamoDB, Lambda.

I would love to have managed k8s on amazon. It wouldn't move me away because of all those surrounding technologies.


That's a good point, though there are on-prem and "other vendor" api compatible services for some of those, like S3.

Minio, for example: https://github.com/minio/minio or Google's cloud storage. Both are compatible with the S3 api.


It's only a matter of time until these AWS primitive replacements are mature within Kubernetes, which is definitely a good thing.

AWS is wonderful, but you always have to have tooling ready to not be held hostage by a vendor. And any vendor will hold you hostage once they have enough market dominance or enough of your business.


A great many of those (but not all, I admit) have equivalents on Google Cloud. I know I am somewhat biased (as I work there) but the technology is really great. BigQuery is untouchable. Spanner is magical. CloudML is hands-down the best offering. Etc.

Have you ever considered "what if I didn't presuppose AWS?"


Is anyone here running k8s in production with kops? Are there any missing pieces that require "manual" work, like rotating certificates? How feasible is it to run, say, 30 clusters out of the box with a 2-person team?


I'm one of the kops authors, and I will say that a lot of people run k8s clusters created by kops in production - I don't want to name specific companies, but feel free to join the sig-aws or kops channels in the kubernetes slack and ask there, and I'm sure you'll get lots of reports. In general kops makes it very easy to get a production-suitable cluster; there shouldn't be any manual work required other than occasionally updating kops & kubernetes (which kops makes an easy process).

But: we don't currently support rotating certificates. There used to be a bug in kubernetes which made "live" certificate rotation impossible, but that bug has now been fixed, so it's probably time to revisit it. We create 10-year CA certificates, so it isn't something you strictly have to do, other than as good security practice.

If you file an issue (https://github.com/kubernetes/kops/issues) for certificate rotation and any other gaps / questions we'll get to them!


I am curious if you might share your thoughts on kops vs kubeadm for standing up a Kubernetes cluster.


There's no need to choose: kops uses kubeadm (not a lot of it, but more with each release), so choose kops and get kubeadm for free!

kubeadm is intended to be a building block that any installation tool can leverage, rather than each building the same low-level functionality. It isn't primarily meant for end-users, unless you want to build your own installation tool.

We want to accommodate everyone in kops, but there is a trade-off between making things easy vs. being entirely flexible, so there will always be people who can't use kops. You should absolutely use kubeadm if you're building your own installation tool - whether you're sharing it with the world or just within your company. luxas (the primary kubeadm author) does an amazing job.


Thanks, I wasn't aware that it was leveraging kubeadm. This is good to know. I have been really impressed by my limited exposure to Kops so far. Cheers!


How do you handle that kubernetes requires the eth0 ip in no_proxy? Do you set that automatically?

How do you handle that DNS in a corp net can get weird and for instance in Ubuntu 16.04 the NetworkManager setting for dnsmasq needs to be deactivated?

How do you report dying nodes due to the kernel version and docker version not being compatible?

Do you report why pods are pending?

Does kops wait until a successful health check before it reports a successful deployment (in contrast to helm, which reports success when the docker image hasn't even finished pulling)?

Do you run any metrics on the cluster to see if everything is working fine?

Edit: Sorry to disturb the kops marketing effort, but some people still hope for a real, enterprise-ready solution for k8s instead of just another layer of fluff added on a shaky foundation.


kops is an open source project that is part of the kubernetes project; we're all working to solve these things as best we can. Some of these issues are not best solved in kops; for example we don't try to force a particular monitoring system on you. That said, I'm also a kubernetes contributor, so I'll try to quickly answer:

* no_proxy - kops is getting support for servers that use http_proxy, but I think your issue is a client issue with kubectl proxy and it looks like it is being investigated in #45956. I retagged (what I think are) the right folks.

* DNS, docker version/kernel version: if you let it, kops will configure the AMI / kernel, docker, DNS, sysctls - everything. So in that scenario everything should just work, because kops controls everything. Obviously things can still go wrong, but I'm much more able to support or diagnose problems with a kops configuration where most things are set correctly than in a general scenario.

* why pods are pending: `kubectl describe pod` shows you why. Your "preferred alerting system" could be more proactive though.

* metrics are probably best handled by a monitoring system, and you should install your preferred system after kops installs the cluster. We try to only install things in kops that are required to get to the kubectl "boot prompt". Lots of options here: prometheus, sysdig, datadog, weave scope, newrelic etc.

* does kops wait for readiness: actually not by default - and this does cause problems. For example, if you hit your AWS instance quota, your kops cluster will silently never come up. Similarly if your chosen instance type isn't available in your AZ. We have a fix for the latter and are working on the former. We have `kops validate` which will wait, but it's still too hard when something goes wrong - definitely room for improvement here.
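Concretely, the two checks mentioned above look something like this (the names are placeholders):

    # Why is a pod pending? The Events section at the bottom of the output
    # usually says (insufficient CPU/memory, unbound volume, selector mismatch...).
    kubectl describe pod <pod-name> -n <namespace>

    # Check that the instance groups, nodes and system components are healthy.
    kops validate cluster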

In general though - where there are things you think we could do better, do open an issue on kops (or kubernetes if it's more of a kubernetes issue)!


Nice, thanks. My feeling is that this is about 75% of what we want, and may thereby really be the best solution there is right now. I'll bring your responses into my next team meeting.


I sympathize, but HN isn't a support forum. A decent reply would have to be a huge wall of text in the middle of the conversation.


Thanks for the feedback. I agree that a huge wall of text is not desired. I think a single-sentence answer is fine.

For instance: "Yes, we can. We considered most of that and also have some enterprise customers with similar setups. Check out "googleterm A", "googleterm B", "googleterm C". If you don't find all of that join our slack chat to get more details."

And a more likely answer, also single line: "WTF are these questions? We thought docker+k8s already solves that." (I would've also expected solutions from there but don't hope for it anymore.)

PS (actually an edit to the previous post, but it's already too old): For instance, OpenShift, as I just found, addresses the docker-version/kernel-version problem via "xxx-excluder" meta packages: https://docs.openshift.com/container-platform/3.4/install_co...

A step in the right direction!


We've been running a small Kubernetes cluster of < 30 nodes that handles a variety of workloads using kops for almost a year now. kops is a significant improvement over other provisioning tools like kube-up.sh and kube-aws and has simplified infrastructure management a great deal. We can provision a brand new cluster and a couple dozen services across multiple namespaces in less than an hour - kops helps a lot with making that process smooth and reliable.

We have run into some issues with kops. Customizing the Kubernetes executables, e.g. using a particular Docker version or storage driver, has been buggy pre-1.5. Upgrading clusters to later Kubernetes versions has left some of the kube-system services, like kube-dns, in a weird state. Occasionally we encounter issues with pods failing to schedule/volumes failing to mount - these are fixed by either restarting the Kubernetes (systemd) services on the problem nodes or by reprovisioning nodes entirely. On one occasion, a bad kops cluster update left our networking in an unrecoverable state (and our cluster inaccessible).

I don't think there are any missing pieces; the initial configuration is what usually takes the most time to set up. You'll have to become familiar with the kops source, as not everything is documented. As for running 30 clusters with a 2-person team, it's definitely feasible, just complicated when you're constantly switching between clusters.


Definitely some great feedback there - I think most of those are known issues, and not all of them are technically kops issues, but we'll be figuring out how to work around them for kops users. (Switching Docker versions is tricky because k8s is tested with a particular version, so we've been reluctant to make that too easy, and the kube-dns 1.5 -> 1.6 upgrade was particularly problematic). Do file issues for the hung nodes - it might be k8s not kops, but one of the benefits of kops is that it takes a lot of unknowns out of the equation, so we can typically reproduce and track things down faster.

And it is way too hard to switch clusters with kubectl, I agree. I tend to use separate kubeconfig files, and use `export KUBECONFIG=<path>`, but I do hope we can find something better!
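e.g. something like this (the paths are just an example):

    # One kubeconfig file per cluster; switch by re-exporting.
    export KUBECONFIG=~/.kube/config-prod-us-east-1
    kubectl config current-context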


Right, the hung nodes issue is probably least related to kops (though it'd be great if in the future, kops could leverage something like node-problem-detector to mitigate similar issues). Of the other issues, the incorrectly applied cluster config (kops decided to update certs for all nodes and messed them up in the process, then proceeded to mess up the Route53 records for the cluster) is the most serious one, and also not likely easy to reproduce. Apart from that, kops has been an excellent tool and we've been very pleased with it.


I run kops in production, and while we've had issues, the authors are responsive and super helpful. The problems we've encountered have more frequently been with k8s itself than with kops; it's mostly been fire-and-forget, except when I've had to debug which experimental feature I tried to enable broke kops' expectations (or just broke). Ping me in the channels @justinsb mentioned if you want advice.

We're at three live and two dead (decommissioned) clusters with a two-person team, and while we regret some decisions, most of the time it just works.


What decisions do you regret?


Using the default networking stack. Basic AWS networking on k8s relies on route tables, which are quite limited - they only support up to 100 routes. We had to use bigger nodes than I'd planned to stay under that limit.
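If I were doing it again I'd pick a CNI overlay at create time instead of the kubenet default; a rough sketch, with the cluster name and zone as placeholders (the available networking options depend on your kops version):

    # An overlay network avoids the one-route-per-node VPC route table limit.
    kops create cluster --name my.cluster.example.com \
      --zones us-east-1a \
      --networking weave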


> they only support up to 100 routes.

I don't know if AWS has the disclaimer up anymore, but the default limit is 50 with limit increases available to 100 with "no guarantee that performance will remain unaffected"... or something like that.

What network type are you using, out of curiosity?


What is the concern with using bigger nodes than planned?

I agree the basic networking has a lot of limitations. But compared with adding more networking layers, I'd rather have a simpler setup with fewer nodes, even if they are larger.


I've been using it in production for a couple of my clients (Y Combinator companies). Except for a few hiccups it has been pretty great. The only thing is that for HIPAA and PCI compliance environments there need to be some additional changes.

We are slowly open sourcing some of that and more here:

https://github.com/opszero/auditkube


There is support for automatic certificate rotation in the recently released 1.7. Pretty sure this was also in 1.6, albeit as an experimental alpha feature:

http://blog.kubernetes.io/2017/06/kubernetes-1.7-security-ha...


Thanks for the heads up - looks like we'll be adding support very soon then :-)


We (https://www.ReactiveOps.com) run a lot of clusters for our clients in AWS (and also GKE, but...) using kops. It's definitely possible to run a lot of clusters, but kops is only one piece of the puzzle. Autoscaling, self-healing, monitoring and alerting, cluster logging, etc. are all other things you have to deal with, and they're non-trivial (the workload scales per cluster, so...)

We open-sourced our kube generator code, called pentagon, which uses kops and terraform: https://github.com/reactiveops/pentagon


I saw you mentioned autoscaling first. How do you handle this? Do you just install the autoscaler pod by hand? (edit: just saw the link you provided, not sure if you edited your post to add it or not, but thanks! I followed the link through to https://github.com/reactiveops/pentagon/blob/d983feeaa0a8907... and it looks like I would be interested in #120 and #126, but they don't seem to be GH issue numbers or Pull Requests. Where can I find more?)

It seems like a lot of the work like this just "isn't there yet" when it comes to orchestrators like Tectonic or Stackpoint, or Kops, in making this easy for you. (So there's surely a market for people who know how to do this stuff, but it seems like this would be the first feature that every tool supporting AWS would want to have. Unless there are hidden gotchas, and it seems like there would be a lot of blog posts about it if that were the case.)


Based on your experience, would you recommend one vendor over the other? (aws vs gcp)


Last I checked, support was missing for the newer ALB load balancer on AWS. That is a hold-up for some, as the older ELB doesn't scale as well and needs "pre-warming".


kops can set up an ELB for HA apiservers, but I think you're primarily talking about Ingress for your own services. We don't actually take an opinion on which Ingress controller you use, so you can just choose e.g. https://github.com/zalando-incubator/kube-ingress-aws-contro... when your cluster is up.

Maybe kops should use ALBs for the apiserver, and maybe k8s should support ALB for Services of Type=LoadBalancer. Neither of those are supported at the moment, if they should be then open an issue to discuss. (Even if you're going to contribute support yourself, which is always super-appreciated, it's best to start with an issue to discuss!)


Yes, it's not specifically a criticism of kops. Supporting ALB as an ingress controller seems to be the direction, with the CoreOS-contributed code the likely winner.[1]

I thought it worth mentioning though: the older ELB+k8s combination isn't great, and because ALB support hasn't shaken out yet, a cluster created with kops could be suboptimal unless you address it afterwards.

I assume once it all shakes out, kops would support whatever the direction is.

[1] https://github.com/coreos/alb-ingress-controller


Managing ALB outside of k8s works pretty well if you don't deploy new services that often. Map ALB to the ASG and do host/path routing.


I've set up a handful of K8s clusters on AWS, some of them using kops. I would say it's the easiest time I've had, and it's what I chose for production. My only real complaint is that it requires extremely broad IAM permissions. Great tool, and a good community.


I know scoping the permissions is something they've been working on: https://github.com/kubernetes/kops/issues/1873


If you're using (or want to use) Terraform and are considering running k8s on AWS, take a look at tectonic-installer[1] and its `vanilla_k8s` setting. My opinion is that it's far better than kops' `-target=terraform` output. It's also using CoreOS rather than Debian, which seems reasonable.

[1] https://github.com/coreos/tectonic-installer


You can use CoreOS with kops. Add something like:

--image 595879546273/CoreOS-stable-1298.5.0-hvm
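i.e. the full invocation would look roughly like this (the cluster name and zone are placeholders):

    kops create cluster --name my.cluster.example.com \
      --zones us-east-1a \
      --image 595879546273/CoreOS-stable-1298.5.0-hvm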


Isn't Tectonic also based on kubeadm and Terraform? Kops and Tectonic felt very similar to me.


Tectonic is built on Terraform but is not based on kubeadm. However, a Tectonic vanilla k8s cluster may be similar to a kubeadm one since they both leverage Bootkube to provide self-hosted k8s.


That does seem like a nice stack. Do you have an idea of the actual pricing past 10 nodes? I can't seem to find it anywhere.


I think the pricing starts at $19k/yr for a 10-node cluster (so $1900/yr/node). That's what I was given as "list price" when I asked. We are not using 10 nodes yet, but I anticipate that we'll need paid support before we are, and Tectonic is one of the only games in town.

I'd guess you can negotiate that down somewhat, especially if your shop has some expertise in-house already (if you're not anticipating a lot of need or demand for their support team's help on a regular basis). We compared this to RedHat+AWS OpenShift Dedicated, which is priced significantly higher on a per-node basis at $12,000 for 4 nodes, though OpenShift Dedicated also has the cost of hosting the nodes built in.


I recently started testing kubernetes using kops and was pleasantly surprised how explicit it was about exactly what changes it would make in AWS. If you get past the testing phase and want to use it in conjunction with your existing production infrastructure, I found AWS's VPC peering to be a handy way to let your k8s cluster talk with existing services while keeping the kops-created VPC separately managed.


The dedicated VPC approach definitely works. But, in case you weren't aware, kops can launch a cluster in an existing VPC pretty trivially.

1. Create your cluster specifying the existing VPC ID and CIDR. Do not use the --yes flag. (I'm not sure the VPC CIDR is really even used with this approach.)

2. Edit the cluster definition (`kops edit cluster xxxxxxxxx`) and just change the subnet CIDRs to whatever nonexistent ones you would like kops to create and use.
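Roughly, that looks like this (the VPC ID, CIDRs and cluster name are placeholders):

    # 1. Build the cluster config against the existing VPC (no --yes yet).
    kops create cluster --name my.cluster.example.com \
      --zones us-east-1a \
      --vpc vpc-1234abcd \
      --network-cidr 10.0.0.0/16

    # 2. Adjust the subnet CIDRs in the spec, then apply.
    kops edit cluster my.cluster.example.com
    kops update cluster my.cluster.example.com --yes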


We wrote a similar guide a few months back at Datawire.io (https://www.datawire.io/guide/infrastructure/setting-kuberne...)

It's a little bit out of date now but we're working on an updated version that deals with managing upgrades, CI/CD for infrastructure and adding other pieces of cloud infrastructure in a way that the Kube cluster can access it seamlessly.


can you comment on why you suggest using a public route53 hosted zone? Using a public rather than a private zone always felt pretty dirty to me.


Reducing friction. At the time, private zones in kops were more of a PITA to set up and I would have had to explain a bunch more stuff in the article.

I might look at that again when I redo the guide. Also considering switching to the new Gossip based discovery mechanism so that Route 53 isn't even a requirement. I was recently setting up a cluster for a customer and they complained about having to use Route 53.


just learned about the gossip support, pretty interested in it too.


How are people doing autoscaling clusters on AWS?

I just want to create a simple, single-AZ cluster with one master, and one autoscaling group for workers. Ideally when there are no jobs to run, the ASG scales down to zero.

(I have a Jenkins service that can spawn pods, and it works great on the single-master-only cluster with no workers. There's even room for it to spawn one pod alongside it on the master node; I just removed the dedicated taint annotation. But as soon as I schedule one more pod, I wish I had an ASG for worker nodes, because I'm requesting more CPU and RAM than is available, so I'm stuck waiting for the first pod to complete and resources to free up.)

If I had more nodes (or the possibility of more nodes), I'd likely put the dedicated taint back and give my Jenkins master server pod a toleration so that it can land there when the resources are available, or use a label selector in the pod spec to ensure it winds up on the stable master, or both. I think I know enough to put this combination together, but I am sure I can't be the first, and I don't want to miss the "golden path" if there is one; this thread seemed like a good place to ask about it.
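Something like this is what I have in mind for that combination (a sketch; the taint/label keys below are the upstream defaults from around 1.6/1.7 and may well differ on a given cluster, and the image is just an example):

    # Tolerate the master taint and pin the pod to the master via a node label.
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: jenkins-master
    spec:
      nodeSelector:
        kubernetes.io/role: master
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
    EOF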

This seems like it would be a really common configuration, but I haven't been able to find a lot of documentation or even blogs about setting up ASGs this way.

I found kube-aws[1], which seems to be the only K8S orchestrator that actually mentions creating AutoScaling Groups for worker nodes in the setup docs (??). I found cluster-autoscaler[2] which appears to be the piece you need to get the cluster to scale up an ASG when unsatisfied K8S resource requests have left pending scheduled pods waiting, or scale down when nodes are idle for too long. I'm totally mystified why this does not appear to be a supported configuration of Tectonic or Mantl or Stackpoint, or... what's your favorite K8s orchestrator and how does it handle this?

[1]: https://github.com/kubernetes-incubator/kube-aws

[2]: https://github.com/kubernetes/autoscaler/tree/master/cluster...


kops creates an autoscaling group for the worker nodes as part of the cluster creation process. Adding the cluster autoscaler is as simple as deploying the autoscaler (as mentioned in your linked docs) pointing to the correct ASG. The IAM permissions for the autoscaler can be added with `kops edit cluster $CLUSTER`.
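The relevant autoscaler args end up looking roughly like this (a sketch; the worker ASG that kops creates is typically named `nodes.<cluster-name>`, and the min/max values are just examples):

    # Format is min:max:ASG-name.
    cluster-autoscaler \
      --cloud-provider=aws \
      --nodes=0:10:nodes.my.cluster.example.com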


Thanks! I have been using kubeadm for my "cluster" with some home-made ansible playbooks so I hadn't just gone ahead and tried it out yet. Good to know I'm headed down a path that others have already been down.


I've spent a solid amount of time reading docs and looking into this. It sounds like this might not be an issue for you, but the major gotcha is that k8s does not yet support rescheduling pods to re-balance things when new nodes join the cluster (there was some stuff in the proposal phase to address it, though).

So, if your workload is made up of lots of short-lived ephemeral stuff you are good to go, but otherwise you may have to manually step in to rebalance stuff onto new nodes.

The autoscaler addon might have addressed some of this, but I'm not seeing anything obvious after a cursory overview of the docs.


I think I heard about that. Does sound like a pretty big gotcha, but I think you're right that it won't be a problem for my use case. Hopefully I can spend some time this weekend to try it out!


The autoscaler will add minions when there is a pod that cannot be scheduled. It doesn't help in balancing existing things, but at least the new pod should be able to go there.


I saw that and was confused; I thought that would work out of the box with k8s since (I think?) it continually tries to keep scheduling pods. I must be wrong though.

If balancing gets implemented, am I correct that it would probably happen in k8s core?


I think I wouldn't assume that, myself; at least not after my stint in VietMWare (PTSD from a prior vSphere/vSAN experience)

We had VMTurbo/Turbonomic and VMWare's internal DRS that would sometimes compete against each other to decide where VMs should be scheduled.

You could handle balancing from inside of k8s core, or not. All you need is to evict the pod on the over-provisioned node, and to arrange for provisioning of the replacement node before the original pod is fully evicted. It should be the same for Kubernetes, if I had to guess.


If you need a management system that's not tied to AWS, I'd recommend folks look at Kubo. We (Pivotal) and Google have been working on it for a while now.

https://www.youtube.com/watch?v=uOFW_0J9q70 for the motivation (tl;dr Kubernetes solves container orchestration, but doesn't solve its own management).

https://github.com/cloudfoundry-incubator/kubo-deployment for the main repo.

Disclosure: I work for Pivotal.


kops isn't actually tied to AWS; there's early support for GCE and very early support for VMWare, with more on the way.


I didn't realise! That's good.


Another possibility is kismatic[1]. That's what we use and it's worked well for us. Their focus seems to be on private deployments, so they have good support for bare metal, although they also support cloud deployments.

1: https://github.com/apprenda/kismatic


Running K8s on AWS isn't a new concept to me. Obviously lots of people are doing it with kops and other tools (e.g. Tectonic). What I find curious is that Amazon is advertising this on their blog. K8s is essentially a direct competitor to their own ECS service. Arguably K8s is much better, has much larger community adoption, and is improving significantly faster.

Maybe AWS is gonna give up the ECS ghost? Or at least offer K8s as an option. After all, GCE offers it in GKE, and so does Azure.


We (www.startopsgroup.com) have been rolling out a lot of Kops clusters lately for clients that need COLO + Cloud or multi-cloud environments. So far we haven't found a better tool for the purpose, even if there are a few rough edges.


Wish there was some solution for not using AWS Route53 - https://github.com/kubernetes/kops/issues/794


We should update that issue. There is support for private route53 domains, which means you don't need a DNS domain name, and there is early experimental support for gossip discovery (i.e. no Route53 at all) - just create a cluster with a name of `<name>.k8s.local`.
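i.e. something like this (the zone is a placeholder):

    # A cluster name ending in .k8s.local switches kops to gossip-based discovery,
    # so no Route53 hosted zone is needed.
    kops create cluster --name mycluster.k8s.local --zones us-east-1a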


Unsolicited feedback: I know you guys beefed up the documentation with one of the most recent releases, but this is an area that probably needed some work circa v1.5. Most of it is pretty self-explanatory if you are familiar with Route 53, but it would be nice to have some more examples of stuff like "here's what a reference cluster using a private route53 hosted zone and private network topology looks like, and here's why you might want to do that".

is there documentation on the gossip support anywhere? I'm not seeing much aside from "just name your cluster with a k8s.local suffix and you are good to go"


I believe that is not strictly a requirement anymore: https://github.com/kubernetes/kops/blob/master/docs/aws.md#c...



