An update on container support on Google Cloud Platform

jbeda · on June 10, 2014

We are particularly excited about Kubernetes. We are taking the ideas for how we do cluster management at Google and creating an open source project to manage Docker containers.

https://github.com/GoogleCloudPlatform/kubernetes

boundlessdreamz · on June 10, 2014

Slightly offtopic - I haven't used docker yet but can docker be used to run Ubuntu on GCE? That seems to be possible by using a ubuntu base image?

wmf · on June 10, 2014

Ubuntu can run directly on GCE. http://doit-intl.com/blog/2014/5/31/how-to-install-ubuntu-se...

nivertech · on June 10, 2014

Note that some Google's Debian Wheezy images are specifically optimized for their Andromeda SDN [1], especially when they running on us-central1-b or europe-west1-a zones, where Andromeda is enabled [2].

Also Ubuntu 14.04 LTS has much worse performance than Ubuntu 12.04 LTS, so it's better to stick to 12.04 LTS for now.

In my opinion it's better to port to Debian or to run Ubuntu image inside container hosted on Debian, until they will have native Ubuntu support.

[1] http://gigaom.com/2014/04/02/google-launches-andromeda-a-sof...

[2] http://googlecloudplatform.blogspot.co.il/2014/04/enter-andr...

sciurus · on June 11, 2014

"Also Ubuntu 14.04 LTS has much worse performance than Ubuntu 12.04 LTS"

Citation needed.

nivertech · on June 11, 2014

http://www.phoronix.com/scan.php?page=article&item=ubuntu_14...

tonfa · on June 10, 2014

Wow, why is it so complicated? Can't you do it from the console?

martythemaniak · on June 10, 2014

Because they don't have ready-made ubuntu images, so you have to make your own first.

The simpler solution is to use their ready-made Debian images, which can be easily configured via CLI or API.

patrickaljord · on June 10, 2014

Yes, it is.

michaelmior · on June 10, 2014

"Everything at Google, from Search to Gmail, is packaged and run in a Linux container." Was this something which Google had previously disclosed? Seems a bit surprising to me.

jbeda · on June 10, 2014

Yeah -- I talked about it a couple of weeks ago at GlueCon. Also shared that we launch over 2 billion containers every week.

My slides from the talk: https://speakerdeck.com/jbeda/containers-at-scale PDF: http://slides.eightypercent.net/GlueCon%202014%20-%20Contain...

zmanian · on June 10, 2014

I went to a talk on Omega on Box.com about a year ago. At that point Omega was still about managing Google's giant statically linked binaries.

Did Google switch to containers in a year? Maybe that answer is in your slides? If so crazy...

zeroxfe · on June 10, 2014

The statically linked binaries run inside the containers. Static linking gives you a certain kind of portability (no need for library dependencies on the machine.) The containers give you isolation, resource management, etc.

zmanian · on June 11, 2014

The way it has been explained to me is that Google's fat binaries have no dependencies beyond libc. The task scheduler would just have to deploy that one file to one or more machines to get it to run.

Containers are much more flexible that statically linked binaries. You could have multiple binaries in a container sharing a common set of dynamically linked files.

Fat binaries inside containers sounds a bit like the worst of both worlds...

menage · on June 11, 2014

If you want unrelated jobs to share library binaries, you need to stick to explicit release schedules for those libraries, which means more coordination between the various app teams and the teams that own the libraries.

When the size of the libraries (megabytes) is compared with the typical heap size of running jobs (gigabytes - a small number of large instances per job is typically a lot more efficient than a large number of small instances) the space savings of shared libraries become pretty negligible.

Back then, one of the main bottlenecks in the system was the central scheduler, in particular the amount of work it had to do tracking what binary packages were installed on each machine and what were needed for the candidate jobs it might plan to run on those machines. Having many packages per job just makes the scheduler bottleneck worse.

There were two places where it did actually make sense to share packages between jobs:

- Java Virtual Machine and supporting libraries - libc

These were external (so changed much less frequently), quite large, and were needed by very large numbers of jobs, so the space savings of having one copy of each needed version per machine outweighed the extra scheduling load required for them.

thockingoog · on June 11, 2014

Static binaries are congruent to containers, in a way. Both bundle an app and dependencies together. There are advantages and disadvantages to both approaches, but Docker-style containers are more elegant in some ways.

menage · on June 11, 2014

Google had been using primitive kernel containers (based on cpusets and fake NUMA emulation) since early 2007 - this was quite some time prior to getting cgroups into the mainline kernel.

michaelmior · on June 10, 2014

Thanks for the link!

SEJeff · on June 11, 2014

I'm guessing you weren't aware that google is the company that wrote the overwhelming majority of the cgroup subsystem and much of the namespace bits that lxc/docker use. Paul Menage and Rohit something were the two biggest kernel guys on it if memory serves. I used to read the LKML firehose actively and gave up eventually.

menage · on June 11, 2014

Yes, we contributed the core cgroups system, and a good chunk of the memory and I/O cgroup subsystems. Google didn't have much need for namespacing though - when you have complete control over all the code that's running, it's possible to have the common libraries co-ordinating with the cluster management system to assign things like network ports and userids, so there's no need to virtualize IP addresses or userids. Google's containers are (or at least were - I left a few years ago) just for resource isolation.

michaelmior · on June 11, 2014

Thanks for the info Jeff! No, I wasn't aware of this.

thockingoog · on June 10, 2014

We've been talking about containers for years. I presented some of our problems with them back in 2011, and I know we've been talking about it before that. :)

planckscnst · on June 10, 2014

This post was written by Eric Brewer, the author of CAP theorem.

mp99e99 · on June 10, 2014

Is that the same Eric Brewer from Inktomi?

michaelmior · on June 10, 2014

Yes, that's him :)

seiji · on June 10, 2014

I imagine he's done some noteworthy things since? That's like introducing Elon Musk as "Founder of X.com."

Essentially it comes across as "I'm a fan of your early work, but nothing you've done since matters."

planckscnst · on June 10, 2014

He's done a bunch of stuff, but CAP is probably his most popular work. Even Wikipedia says that's what he's "known for".

PanMan · on June 10, 2014

I know Google runs at a huge scale, but isn't even for them 2 billion containers / week a LOT? I assume a lot of these only run for a really short time? Are containers the new scripts?

jbeda · on June 10, 2014

I can't give specifics but a lot of these are short lived. For example, if you launch a MapReduce it'll typically launch containers for each of the workers and then take them down when the MR is done.

This also doesn't speak to the number of long running containers. There are plenty that don't stop/start during the week I grabbed that number.

derefr · on June 10, 2014

If docker images are fancy static binaries, then docker containers are fancy OS processes. Going through two billion OS PIDs in a week doesn't seem that hard.

martinp · on June 10, 2014

Still seems like a lot, 2 billion a week is over 3300 every second. Says a lot about the scale at which Google operates.

planckscnst · on June 10, 2014

_The Datacenter as a Computer_ [1] is a free book written by Googlers that helps get your head around that scale.

[1] http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y2...

cmelbye · on June 11, 2014

> Are containers the new scripts?

They certainly seem to work well for that. Heroku, for example, uses containers for not just persistent processes (application servers, workers) but also short-lived processes. Tasks that run on a schedule (hourly, daily, etc.) are run by, you guessed it, starting up a container running a processes which exits when it's finished. One-off commands like maintenance scripts or REPLs work the same way.