Is it me, or does it seem weird to encrypt your secrets by uploading the secret key to GCP (contained in the config .yaml file)? I assume the controller instances are operated by Google in this[1] example.
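For context, the .yaml in question looks roughly like this in the tutorial (a sketch; the aescbc data-encryption key is generated locally and then pushed to the controller instances, which is the part that feels odd):

    cat > encryption-config.yaml <<EOF
    kind: EncryptionConfig
    apiVersion: v1
    resources:
      - resources:
          - secrets
        providers:
          - aescbc:
              keys:
                - name: key1
                  secret: $(head -c 32 /dev/urandom | base64)
          - identity: {}
    EOF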
Moreover, is there any sensible way at all to encrypt secrets without baking the secret key into your image? I can’t think of any.
I want to deploy an app that makes use of one or more fairly important secrets, but I haven’t found a sensible way to make it auto-scale while keeping the secrets on-premise.
As far as I can see, the only sensible solution is to create in-cloud/off-premise secret keys that can only be accessed by images signed with an on-premise secret key.
So,
1. Create secret key on an offline, on-premise machine
2. Produce application image, transfer to offline machine, sign with on-premise secret key
3. Create off-premise (in-cloud) secret, which can only be accessed by images signed with the on-premise secret key
4. Upload app image and signature to the cloud, allowing only this image access to the in-cloud secret
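Roughly, with plain openssl as a stand-in (a sketch of steps 1, 2 and 4; step 3, actually binding the in-cloud secret to the signature check, depends on whatever policy mechanism your provider offers and isn't shown; all names are made up):

    # 1. on the offline, on-premise machine: generate the signing keypair
    openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:4096 -out onprem-signing.key
    openssl rsa -in onprem-signing.key -pubout -out onprem-signing.pub

    # 2. sign the digest of the built application image
    #    (the RepoDigest exists once the image has been pushed to a registry)
    docker inspect --format '{{index .RepoDigests 0}}' myapp:1.0 > image.digest
    openssl dgst -sha256 -sign onprem-signing.key -out image.sig image.digest

    # 4. in the cloud, the gatekeeper in front of the secret verifies the signature
    #    against the trusted public key before releasing the secret
    openssl dgst -sha256 -verify onprem-signing.pub -signature image.sig image.digest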
To do this sort of encryption at rest you need a root of trust. There are essentially three common solutions:
1. Keep the secret key in memory. This can be annoying though, as a system administrator needs to unlock every process as it comes up; automation can help but is rarely deployed. This is what Vault does, for example. There is a long design doc that we all worked on with this as an eventual option[1], but it is unclear if there is a ton of utility for users to be unlocking API servers.
2. Hide the secret key inside of hardware using something like an HSM or a TPM. This is really the state of the art however it doesn't really work well in many environments and requires good inventory management.
3. Give machines a stable cryptographic identity and exchange that identity for a secret key that is kept in memory. This solution sort of shifts the problem around but is generally a good tradeoff between automation and security. The idea is, similar to SSH host keys, on first boot a machine generates an identity and then a system administrator approves that identity and later gives that identity authorization to do tasks. Kubernetes has a lot of the groundwork in place for this and it is still underway[2].
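To make option 3 concrete, the groundwork referenced here is kubelet TLS bootstrapping: a new node generates a key and submits a certificate signing request on first boot, and an administrator (or an automated approver) grants it. A minimal sketch of the manual approval step:

    # list pending node CSRs, then approve the one for the new machine
    kubectl get csr
    kubectl certificate approve <csr-name>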
We are working on a solution to this problem in the Clevis project[0]. It is a basic FUSE filesystem that will transparently decrypt your secrets/configuration. It will evaluate your decryption policy on each open and log the attempt.
You can see the initial proof of concept[1]. It isn't secure yet, for a variety of reasons. But it is enough to play around with. Moving to a better encryption scheme will give us the ability to do locks and per-block validation.
In AWS the combination of IAM roles + Parameter Store gives us a pretty decent version of this. We have the app talk to Parameter Store to get secrets, then just keep them in memory.
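A minimal sketch of that startup lookup, assuming the instance's IAM role is allowed to read the (made-up) parameter path:

    # fetch a SecureString parameter at process start; the value stays in the
    # process environment / memory only, never on disk
    DB_PASSWORD="$(aws ssm get-parameter \
        --name /myapp/prod/db_password \
        --with-decryption \
        --query 'Parameter.Value' \
        --output text)"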
I can highly recommend OpenShift and its Ansible deployment scripts. Documentation is very well-written and complete.
It takes care of all the annoying parts of Kubernetes and even ships services like a full-featured Docker registry with ACLs, a Docker build system, and a centralized logging mechanism (all optional, of course).
Going through the previous version of this tutorial really helped me, even though we're doing IBM Cloud private on-prem + Bluemix Container Service (don't ask.)
It works pretty well with Cloud Shell, in case you have corporate firewall issues. If your session is interrupted, run the commands that set the compute region and zone again.
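For reference, those commands are along these lines (the region/zone values are just the tutorial's defaults):

    gcloud config set compute/region us-west1
    gcloud config set compute/zone us-west1-c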
I can confirm that it costs about $6/day while the machines are provisioned, and it is well worth it, but remember to run all the clean-up steps in the last chapter when you're done, or if you're not going to finish it right away.
This is great. I have been looking at Kubernetes for some time and have struggled with adapting it to our deployment model. A lot of the tools and tutorials want someone to sit and run commands in order to start controllers and worker nodes, but that doesn't make sense in our automated environment. What we really want is a way to bake AMIs etc. that have everything ready to go, so that when we do a deployment or scale out it is as simple as starting an instance. This collection of labs lays a lot of that out and I think this is something we can work with.
We have, and as far as we could see it is a command line tool that needs to be run manually. This model doesn't work for us as we scale instances in and out by the dozens throughout the day. I could be missing something but anything that requires command line interaction isn't viable.
For instance, if we see that our worker cluster has reached some consumed-capacity threshold (CPU, memory, etc.) then we need to add more worker nodes. The way we do this is via Auto Scaling Groups. Auto Scaling Groups work on launch configs, which are essentially an AMI plus instance user data. How to get an AMI that just has the Kubernetes stuff and can be started at will and told to join a cluster at runtime has not been clear from the documentation I've read.
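For what it's worth, one way to get there is a pre-baked AMI with the kubelet and container runtime already installed, plus user data that joins the cluster at boot. A hypothetical kubeadm-based sketch (the API server address, token, and CA hash are placeholders):

    #!/bin/bash
    # instance user data: everything heavy is already baked into the AMI,
    # so joining the existing cluster is a single command at boot
    kubeadm join 10.0.0.10:6443 \
        --token <bootstrap-token> \
        --discovery-token-ca-cert-hash sha256:<ca-hash>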
Thanks, I have looked at this as well. There are a few issues:
- It only scales worker nodes
- It only scales nodes if a pod can't be scheduled. This is too late for us, as our requirements are to scale when resource utilization crosses a threshold (i.e. reserved memory is over 70%).
- It still doesn't solve the baked AMI problem. I may be wrong about this, but the AWS docs for CA are lacking. For instance, when does the ASG/launch config get created? Is that something it does, or does it need to be done ahead of time? Which AMI does it use? etc.
1) Yes, but I'm not sure why you would want to scale other nodes (I assume you're referring to the masters?)
2) Yes, the cluster-autoscaler will give you more node capacity. If you want to scale Pods on a CPU basis, you can use a Horizontal Pod Autoscaler (https://kubernetes.io/docs/tasks/run-application/horizontal-...). I believe it now supports custom metrics as well, so you can scale on any resource threshold.
3) The ASG/Launch Config is created by kops when you create the cluster, and is also manageable through the kops tooling. It will default to the kops default AMI (which includes the kubelet and everything you need for k8s to run), but you can also override that with a different AMI if you have a k8s-compatible AMI that better fits your needs.
To expand on the sibling comment by nrmitchi, in Kubernetes you don't scale your worker nodes by CPU or memory. Instead you scale your pods by CPU or memory. Running the cluster autoscaler ensures that when you run out of space to schedule a new pod new worker nodes are brought up. It also works the other way; when your nodes are being underutilized for a while, the autoscaler will kill some nodes (and the scheduler will automatically handle placing any pods on those nodes onto other nodes).
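A minimal sketch of that split, with the pod side handled by an HPA (the deployment name and thresholds are made up); the cluster autoscaler then adds or removes nodes as the pod count pushes against capacity:

    # scale pods on CPU; the cluster autoscaler takes care of node capacity behind it
    kubectl autoscale deployment myapp --cpu-percent=70 --min=3 --max=20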
Right, I see that in the CA docs. This doesn't work with our business requirements, since our workloads are spiky and we don't want to wait for pods to become unschedulable before more nodes are started.
A contrived example: imagine that for every user who logs into Gmail, Google spins up a pod for that user. Now imagine there isn't any capacity to schedule a pod for a user who is logging in. That user would either be denied or have to wait for an instance to come up and then for the pod to be scheduled. Not ideal.
Yes, this is my biggest problem with the autoscaler as well. I opened an issue about this almost a year ago, looks like they're finally getting around to addressing it: https://github.com/kubernetes/autoscaler/pull/77
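One workaround people use in the meantime is to keep explicit headroom: run a placeholder deployment of do-nothing pause pods with real resource requests, so the autoscaler brings nodes up ahead of actual demand; when real workloads need the room you scale the placeholder down (or, once pod priorities are available, let it be preempted). A rough sketch (replica count and request sizes are made up):

    # a deployment of pause containers that only exists to reserve capacity
    kubectl run headroom --image=gcr.io/google_containers/pause-amd64:3.0 \
        --replicas=5 --requests='cpu=500m,memory=512Mi'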
We built a Kubernetes platform installer[1] using Terraform which works fairly well and I think is more white box because it uses Terraform directly. However, we are still dealing with some of the necessary Terraform language repetition in our modules.
I did something similar off of CoreOS's tutorial. So while I'm missing a lot of understanding of the newer functionality, going through this was worth it.
Another vote for CoreOS's repo; it was how I _really_ loaded k8s into my head, because the vagrant setup is multi-machine but _local_ and thus one can see what's going on and destroy-recreate it at will.
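The destroy-recreate loop being, from the directory with the Vagrantfile:

    # tear the local multi-machine cluster down and bring it back up from scratch
    vagrant destroy -f && vagrant up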
[1] https://github.com/kelseyhightower/kubernetes-the-hard-way/b...