Kubernetes Production Patterns and Anti-Patterns

atombender · on June 9, 2017

Good news about zombies: Kubernetes will soon solve this by having the pause container (which is automatically included in every pod) automatically reap children. [1]

Note that this change depends on the shared PID namespace support, which a larger, still-ongoing endeavour [2].

[1] https://github.com/kubernetes/kubernetes/commit/81d27aa23969...

[2] https://github.com/kubernetes/kubernetes/issues/1615

shykes · on June 10, 2017

The zombie reaping problem is fixed in Docker. You can simply use `docker run --init` with no argument, and it will spawn a tiny init that will reap children processes correctly. We don't enable the flag by default to respect backwards compatibility.

dvdgsng · on June 10, 2017

Sidenote: this Docker native implementation uses Tini (https://github.com/krallin/tini#using-tini)

atmosx · on June 10, 2017

Could you explain a bit further?

What does backwards compatibility mean in this context. The API?

shykes · on June 10, 2017

In this context it means preserving the default behavior of running the command specified in the image as PID 1, and not injecting any extra processes in the namespace. The difference in the runtime environment is subtle, but some applications rely on it. If we changed this default those applications would behave differently after upgrading to Docker 1.13 or later.

atombender · on June 10, 2017

This would require Kubernetes to do the same, which I don't think it does?

shykes · on June 10, 2017

It's up to the Kubernetes developers to decide if they want to use that feature of Docker, yes.

lemoncucumber · on June 10, 2017

I believe shared pid namespaces are landing in kubernetes 1.7.0 and therefore should be available relatively soon, no?

atombender · on June 10, 2017

Not sure. The changelog doesn't include PID namespaces yet, not sure if it will be merged: https://github.com/kubernetes/kubernetes/blob/master/CHANGEL...

lemoncucumber · on June 10, 2017

It's in there, under "Changelog since v1.7.0-alpha.3"

atombender · on June 10, 2017

Oops, I was on mobile and didn't see that the list wasn't linear with respect to version numbers. That's awesome.

I hope GKE is planning to migrate to a newer Docker version soon (they've been on 1.11.2 for a long time) so we can benefit from this.

CurtMonash · on June 11, 2017

The zombie problem is solved by reaping children?

I guess Craster was right after all.

:)

web007 · on June 9, 2017

This is an excellent check-list of both kubernetes and docker gotchas to avoid.

Coming into the k8s ecosystem with very little container experience has been a steep learning curve, and simple, concrete suggestions like this go a LONG way to leveling it out.

twakefield · on June 9, 2017

We've also published some other workshops for Docker and Kubernetes that we take customers through when onboarding (if needed): https://github.com/gravitational/workshop

Feel free to take them for a spin and feedback welcome and appreciated.

mino · on June 10, 2017

Thanks, great work!

I browsed it and immediately bookmarked to have a ready "here, read this first" answer :)

outworlder · on June 10, 2017

I would like to have seen more "patterns" regarding configuration.

Right now, we have a bunch of microservices. Most of them talk to our shared infrastructure. We started with single configuration file, which has grown to monstrous proportions, and is mounted on every pod as a config map.

What would be the correct approach? Multiple configmaps with redundant information are just as bad, if not worse.

treffer · on June 10, 2017

Start by using done service discovery. That way service names stay the same but service implementations and locations can move.

Then ship the static list of names (should be short) and per-service credentials (highly highly recommended).

Another pattern is co-locating a proxy with your app. See e.g. linkerd on how to do that. This will also unify the handling of circuit breakers and connection pools across services - even without any shared code!

majewsky · on June 10, 2017

You could make a Helm chart that renders the configuration files of the separate services from a single values.yaml

https://github.com/kubernetes/helm

lclarkmichalek · on June 9, 2017

Might be worth mentioning about Docker's native support for multi stage builds: https://docs.docker.com/engine/userguide/eng-image/multistag... (still quite a new feature, plenty of people won't have it yet I guess)

Edit: oh, you kind of do. Well, it's not upcoming any more, it's in the latest Docker CE :)

moondev · on June 9, 2017

Only in edge, not stable

twakefield · on June 9, 2017

Thanks - will update that link now that it's out.

bryanlarsen · on June 9, 2017

Have you tried out istio yet? It's the packaging of Lyft's Envoy that Google and IBM are putting together to handle your last two points, circuit breaking and rate limiting and much more.

thesandlord · on June 9, 2017

Word of caution: Istio isn't production ready yet, still a lot of missing stuff and bugs. Definitely worth playing with though, once it's ready I think it's going to make managing microservices much easier.

Langhalsdino · on June 10, 2017

Awesome github repo! I think i need to incorporated some of your patterns into my work ;)

If some of you are interested in Kubernetes GPU cluster for deep learning, this article might be good to read as well. https://news.ycombinator.com/item?id=14526807

pooktrain · on June 9, 2017

The presentation of patterns here is quite helpful. Is anyone aware of other resources for container design patterns?

The k8s blog has some as well: http://blog.kubernetes.io/2016/06/container-design-patterns....

throwaway34802 · on June 9, 2017

Have you tried using Habitat? It pairs nicely with Kubernetes and solves alot of these antipatterns I feel like.

https://habitat.sh

https://www.youtube.com/watch?v=-yTeXCY3iM0

old-gregg · on June 9, 2017

Some background on these workshops: we (Gravitational) help SaaS companies package their applications into Kubernetes, this makes them deployable into on-premise environments [1]. This in itself is an unexpected and quite awesome benefit of adopting Kubernetes in your organization: your stack becomes one-click portable.

[1] http://gravitational.com/telekube

the_common_man · on June 9, 2017

Interesting. What SaaS products are available in kubernetes today?

brianwawok · on June 10, 2017

You mean like rent a pod or like a random SaaS app that uses k8s? I know of many of the latter but none of the former. Maybe some cloud logger guys have sidecar deployments you can use.

nunez · on June 10, 2017

what are everyone's thoughts on building containers for running one time binaries? like building a container to run jq or awk or something like that.

i've seen this pattern before and it didnt make me feel very good. it reeks of unnecessary complexity.

majewsky · on June 10, 2017

Depends on the tool in question. jq or awk are such common-place and light on dependencies that it's indeed unnecessarily complex.

The benefit is when a one-time tool is heavy on dependencies. For example, with OpenStack Swift (an S3-like object storage), a common one-time task is swift-ring-builder, which takes an inventory of available storage and creates a shared configuration file which describes how data is shared between storages. That's something you would run on a sysadmin's notebook, but it's included with Swift itself, so you would have to install a bunch of Python modules into a virtualenv.

In that case, it's probably easier to just use the Docker image for Swift that you have anyway, and run swift-ring-builder from there.

nunez · on June 11, 2017

Thanks for the response. I agree that long-running or very heavily-polled services make the most sense for being containerised and that builtins usually don't.

m0rganic · on June 10, 2017

we use kubernetes, helm and gitlab.. runtime configuration lives in each repo next to code - values.yaml, dev.yaml, test.yaml, prod.yaml to store applications runtime configuration -- each environment is host to 40+ redundant services.. its working quite well but has required a pretty big upfront investment... surprised there was much discussion about monitoring- prometheus and grafana work well for that

eldios · on June 10, 2017

Awesome article!

humanfromearth · on June 9, 2017

> Anti-Pattern: Direct Use Of Pods > Kubernetes Pod is a building block that itself is not durable.

Kind of.. but you can set `restartPolicy: Always` and will always restart in case of failure.

GauntletWizard · on June 10, 2017

No. If the node the pod is scheduled to goes away, the pod will not be restarted. 'restartPolicy: Always' applies to the containers in the pod, not the pod itself. Deployments (Or daemonsets, jobs, replicasets or replication controllers) actively maintain the pods, to assure in the event that the pods are deleted they're replaced.

outworlder · on June 10, 2017

Other posters have already commented on this but, if you are not using deployments (or at least, replication controllers), stop what you are doing right away and fix that. Otherwise you lose one of the biggest advantages of k8s.

brianwawok · on June 10, 2017

Deployments are key. One annoyance is now some of the lower units feel more clunky. Is there roadmap to say let you do a stateful set as a Deployment? Obviously a bit tricky and rolling deploy could never surge...

TheSwordsman · on June 10, 2017

WRT to StatefulSets I assume you are talking about the fact they are immutable (minus a few fields)?

If so, the hack we've applied for StatefulSets is to delete the StatefulSet with `--cascade=false`. This keeps the Pods of the StatefulSet online, but removes the StatefulSet itself. We can then deploy the StatefulSet with the new configuration vlaues, and manually delete the Pods one-by-one to have the changes be applied.

Graceful: Nope. Needs Improvement: For sure Gets the job done: Yep!

brianwawok · on June 10, 2017

Hum I have deployed a SS with just delete and create. No need for a cascade flag. Maybe that is the default?

But yah like I said, not as nice as deploying a deployment.