Docker - the Linux container runtime (docker.io)
273 points by radimm on March 20, 2013 | 204 comments



I recently changed our Jenkins CI infrastructure to use something like this: all jobs run in their own LXC container, using BTRFS and its snapshots/clones instead of AUFS. Works like a charm!

The script is available at https://github.com/Incubaid/jenkins-lxc (no docs for now, and several improvements are possible). It expects CI job scripts to be contained in the job repository, e.g. https://github.com/Incubaid/arakoon/tree/1.6/jenkins
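
Roughly, the per-job flow is: snapshot a pristine rootfs, run the job inside the clone, then throw the clone away. A minimal sketch of the idea (paths, names and the exact LXC flags here are illustrative, not taken from the actual script):

    # Clone a pristine base rootfs for this build (cheap, copy-on-write).
    sudo btrfs subvolume snapshot /containers/base-rootfs /containers/job-$BUILD_NUMBER

    # Run the job's own CI script inside an ephemeral container on that clone.
    sudo lxc-execute -n job-$BUILD_NUMBER \
        -s lxc.rootfs=/containers/job-$BUILD_NUMBER \
        -- /workspace/jenkins/run.sh

    # Discard the writable clone once the build has finished.
    sudo btrfs subvolume delete /containers/job-$BUILD_NUMBER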


That's pretty awesome.

At FrozenRidge.co, we work on our own CI server called Strider and have actually integrated Docker directly:

You can read about it at: http://blog.frozenridge.co/next-generation-continuous-integr...


This is great, I've been wanting to do the same thing for our jenkins setup.


Wow! Did not expect this to show up on HN before actual release! (I work at dotCloud).

We're still polishing a few rough edges. If you want early access add your github ID to this thread and we'll add you right away!


I've also been building something similar to docker by plugging into Puppet, Jenkins and ActiveMQ. Would love to take a look at Docker. I'm http://github.com/georgebashi


Also: there's an irc channel: #docker on freenode. Come hang out!


After seeing all the github ID's posted, it reminds me how much I'd like to see a PM feature added to HN.



github/twitter: zimbatm

I've built a similar tool using Go. Wondering how you guys get around the lack of clone(2) in the stdlib :)


I want to hear more about your go version of Docker. Link?

You have a lot of projects in your github repository and I was not able to identify it from scanning the list.


It's not open source (yet?) but it looks more like systemd's nspawn. I just had a quick look at Docker, and our solution is much less complicated (and also less powerful).

Send me an email if you want to discuss: jonas@pandastream.com


I'd love to have access, thanks. My github username is dwahler.


github: thoward

How does this compare with warden? https://github.com/thoward/vagrant-warden


BTW: If you're doing beta access, I could use this immediately for http://riakon.com .. Currently using something I hacked together w/ warden, but Docker looks like a more elegant solution (and I'd rather be part of a community than using my own one-off hack).


I'm daxelrod on GitHub, and I would be stoked to get an invite. I've been in the early design stages of something similar to this. Thank you!


Looking forward to testing this out. https://github.com/LeeSaferite


Yes please and thank you! Github: schell


Looks very interesting, look forward to testing

https://github.com/da-n


wereHamster (don't be scared, I won't bite and infect you)


There was a lightning talk about Docker at Pycon; I'd assume that's where OP got the info from :)


I've posted a video of the lightning talk here: https://plus.google.com/photos/115695491015706412558/albums/...

It should be public so you can view it even without a G+ account (I think)


That's possible. When we submitted the lightning talk, we pictured a poorly lit backroom with maybe 15 fellow kernel geeks. Instead, we presented in the main room to at least 500 people... So much for discreetly asking for feedback :)


Would love to have a look - @pbogdan



We've been working on a minimal LXC manager for a week now, but docker seems to be exactly what we need. Can't wait to check this out. https://github.com/cpra-lcoffe


Please add me, github.com/janne


github id: stormbrew

I've been wondering when something like this would come around and if I'd have to try to write it myself. I've made smaller-scale, less isolated versions of this idea before, but this looks snazzy.


Github: noplay

Thanks, that sounds interesting.


Wow, really looking forward to such a solution. github.com/reeze please :)


I am working with cpra-lcoffe, and looking forward to testing this out. https://github.com/cpr-mbelarbi


Awesome! Could you add me too? Github id 'dedene'. Thanks!!


At first glance I thought, 'I already have vagrant'. After watching the lightning talk my only thought is, 'Want!'

Looks awesome and crazy fast. Great work.

github id: natejenkins


Github: jwhitlark

Looks really useful.


github/fsniper Docker seems to be making old technologies reappear with new implementations.

Is this linux-vserver or OpenVZ reimplemented with LXC and cgroups?


this looks awesome. would love to check it out.

github: brianbrunner


Absolutely brilliant. GH: @JeanSebTr thanks! :)


github: pjr

looked at the pycon demo ... looks awesome!


Sounds promising. Github: dvbportal


Thanks for the early access. From what I've seen so far, Docker is topnotch.


Thanks! Glad you like it. We have a few cool features coming up... Can't wait to show them.


amazing! congratz! github: luisbebop


Would REALLY love to take a look at this.

Github: 198d


Oooh! Tobu. edit: building, thanks!


GH: joonathan


been waiting for something like this,

github: madisp


Very interesting, guys! GH: themgt


Yay, here too! Github: buster


github: rafiss

Thanks! This might help a lot for an idea that my friends and I are working on.


Cool! My github id is 'malbin' -- would love to take it for a test drive.


github: jzawodn


github: garnieretienne


Looks awesome. I love containers/lxc. github id: wkharold


great! ... chrisfarms


github: jackinloadup


kolektiv on GH too.


github: tomjohnson3


I'd love early access

github: mzupan


github: sudorandom


Hey hope I'm not late, my github ID is: jmsduran


github: wiredfool


github: warf

Built a similar system in-house at my workplace. Would definitely be interested in migrating/contributing to a wider effort!


I'd love to check it out. Github id: cespare


github.com/peterkeen please and thank you :)


github: robertfw


github.com/radim


thanks - trotsky


github: jrsmith


alexchamberlain


github: yebyen


github: Contra


github: avidal


github: fgrehm



This is awesome. GH: leourbina


github: nwg


github: billbradford (thanks)


mrud


ottbot

thanks!


nnutter


dpaola2


Github: deepakprakash


github: visualphoenix


github: donspaulding


wow. github: seletz


github/gonzopancho


github: mharris717


github/tageorgiou


github: HashNuke


github: tdmackey


github: pnathan


github: rennhak


github: baransn


github: cookrn


github: silasb


github: ayosec


github: doda


github: pau


huski


andykram

Thanks!


tsabat


swdunlop


oh yeah :) Github: eins78


github: dennisferron


danellis on GitHub.


github/twitter: xetorthio

thanks!


github: naelyn


pepijndevos


stevvooe


aaronfriel


I'm not familiar with any of the technologies used in this. Anybody care to comment on how strong the isolation would be, security-wise, compared to normal virtualization?

If the security is almost on par and the isolation is good enough that one bad process can't bring the whole system down, might this be a good alternative to virtualization? I imagine it would definitely use fewer resources.


Container-based virtualization can provide an impressive amount of isolation while improving density dramatically over hardware virtualization on light-duty loads. Solaris zones are very well regarded and are used for multi-tenancy by Joyent, and many Linux hosts provide multi-tenant solutions based on Virtuozzo, which predates Linux containers by a good number of years.

The main theoretical difference between hypervisor isolation and container isolation is that the hypervisor sits outside the guest kernel, so a kernel-level exploit only applies to a single virtual machine. With containers you're relying on the kernel itself to provide the isolation, so you are still subject to (some) kernel-level exploits.

In practice, Linux containers (the mainline implementation) have only provided full isolation in recent patches and probably shouldn't be considered fully shaken out for something like in-the-wild, root-level multi-tenant access.

They are great for application isolation and for delivering multiple single-tenant workloads on one machine, though - something people use hypervisors for quite a bit. The resources used can be a small fraction of what you're committing to with a hypervisor.


As trotsky mentions, we at Joyent are fervent believers in OS-based virtualization -- to the point that in SmartOS, we run hardware virtualization within an OS container. There are many reasons to favor OS-based virtualization over hardware-based virtualization, but first among these (in my opinion) is DRAM utilization: with OS-based virtualization, all unused DRAM is available to the system at large, and in the SmartOS case is used as adaptive replacement cache (ARC) that benefits all tenants. Given that few tenants consume every byte of their allocated DRAM, this alone leads to huge efficiencies from both the perspective of the cloud operator and the cloud user -- a higher-performing, higher-margin service. By contrast, for hardware-based virtualization, unused DRAM remains with the guest and is simply wasted (kludges like kernel samepage mapping and memory ballooning notwithstanding).

DRAM isn't the only win, of course: for every other resource in the system (CPU, network, disk), OS-based virtualization offers tremendous (and insurmountable) efficiency advantages over hardware-based virtualization -- and it's great to see others make the same realization!

For more details on the relative performance of OS-based virtualization, hardware-based virtualization and para-virtualization, see my colleague Brendan Gregg's excellent blog post on the subject[1].

[1] http://dtrace.org/blogs/brendan/2013/01/11/virtualization-pe...


Solaris zones use similar concepts to LXC/namespaces, but actually provide secure isolation.

Recent patches DO NOT provide "full isolation" and never did. What they add is user-mode containers. Those have been broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general


> Those have been broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general

Funny you should say that. The latest virtualization-related CVEs there are actually in KVM -- a trio including two host memory corruptions, which usually enables completely owning the host. http://permalink.gmane.org/gmane.comp.security.oss.general/9...

And on the other hand, I don't see any container-related CVEs at all from 2013 in the CVE database: http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel (The KVM issues I mentioned don't show up yet either, because they're from today.) What vulnerabilities are you referring to?

Maybe you mean kernel vulnerabilities in general, some of which could be usable by a user inside a container. Everyone should stay on top of kernel updates in any event. If you hate the rebooting, Ksplice is free for Ubuntu (and Fedora.)


The Linux namespace stuff is evolving pretty fast, and I personally wouldn't trust it as the main line of defense for anything important.

With virtualization, a buggy or malicious guest is still limited to its sandbox unless there's a flaw in the hypervisor itself. With containers/namespaces, the host and guest are just different sets of processes that see different "views" of the same kernel, so bugs are much more likely to be exploitable. Plus, if you enable user namespaces, some code paths (like on-demand filesystem module loading) that used to require root are now available to unprivileged users.

There's already been at least one local root exploit that almost made it into 3.9: https://lkml.org/lkml/2013/3/13/361


> The Linux namespace stuff is evolving pretty fast, and I personally wouldn't trust it as the main line of defense for anything important.

If I recall, Heroku uses cgroups (EDIT: and namespaces) exclusively for multitenant isolation (and by the looks of this, dotCloud does too), so that's two big votes in the "if it's good enough for them" category.


Sure, but cgroups and namespaces are kind-of-orthogonal features that both happen to be useful for making container-like things. cgroups are for limiting resource usage; namespaces are for providing the illusion of root access while actually being in a sandboxed environment.

And as far as I'm aware (speaking as an interested non-expert, so please correct me if I'm wrong) cgroups have no effect on permissions, whereas UID namespaces required a lot of very invasive changes to the kernel.


That's correct: cgroups have no effect on permissions. They only enforce resource usage limits.

Shameless plug: I work at dotCloud, and I wrote 4 blog posts explaining namespaces, cgroups, AUFS, GRSEC, and how they are relevant to "lightweight virtualization" and the particular case of PAAS. The articles have been grouped in a PDF that you can get here if you want a good technical read for your next plane/train/whatever travel ;-) http://blog.dotcloud.com/paas-under-the-hood-ebook


Fundamentally, the cgroups framework is just a way of creating some arbitrary kernel state and associating a set of processes with that state. For most cgroup subsystems, the kernel state is something to do with resource usage, but it can be used for anything that the cgroup subsystem creator wants. At least one subsystem (the devices cgroup) provides security (by controlling which device ids processes in that cgroup can access) rather than resource usage limiting.
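
To make that concrete, here is roughly what both uses look like from a shell, assuming a cgroup v1 hierarchy mounted under /sys/fs/cgroup (group names and limits are just examples):

    # Resource limiting: cap a group of processes at 256 MB of RAM.
    mkdir /sys/fs/cgroup/memory/demo
    echo $((256 * 1024 * 1024)) > /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
    echo $$ > /sys/fs/cgroup/memory/demo/tasks    # move the current shell into the group

    # Access control via the devices subsystem: deny all devices, then whitelist /dev/null (char 1:3).
    mkdir /sys/fs/cgroup/devices/demo
    echo 'a' > /sys/fs/cgroup/devices/demo/devices.deny
    echo 'c 1:3 rwm' > /sys/fs/cgroup/devices/demo/devices.allow
    echo $$ > /sys/fs/cgroup/devices/demo/tasks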


Personally I think the biggest value these days with para-virtualization like this is in development. I can be running twenty or so different applications on the same physical machine, and for the most part (as long as they're idle since I'm only working with one) I don't even notice that they're running.


Yes, you probably don't want to run untrusted code with root privileges inside a container if anything valuable is running on the same host.

However if that code is trusted, or if you're running it as an unprivileged user, or if nothing else of importance is sharing the same host, then I would not hesitate to use them.

Containers are awesome because they represent a logical component in your software stack. You can also use them as a unit of hardware resource allocation, but you don't have to: you can map a container 1-to-1 to a physical box, for example. But the logical unit remains the same regardless of the underlying hardware, which is truly awesome.


Barring kernel bugs, it should protect against the resource monopolization issues mentioned. Normal virtualization is pretty resource-wasteful, especially if the guests are not hypervisor-aware.

Getting away from huge per-VM block devices is a step in the right direction.


This is still technically a virtualization technique, known as "operating system-level virtualization". http://en.wikipedia.org/wiki/Operating_system-level_virtuali...

Here are some of the technologies explained:

cgroups: Linux kernel feature that allows resource limiting and metering, as well as process isolation. The process isolation, also called namespaces, is important because it prevents a process from seeing or terminating other running processes.

lxc: this is a utility that glues together cgroups and chroots to provide virtualization. It helps you easily set up a guest OS by downloading your favorite distro and unpacking it (kind of like debootstrap). It can then "boot" the guest OS by starting its "init" process. The init process runs in its own namespace, inside a chroot. This is why they call LXC a chroot on steroids. It does everything that chroot does, with full process isolation and metering.

aufs: this is sometimes called a "stacked" file system. It allows you to mount one file system on top of another. Why is this important? Because if you are managing a large number of virtual machines, each with a 1GB+ OS, that uses a lot of disk space. Also, the slowest part of creating a new container is copying the distro (it can take up to 30 seconds). Using something like AUFS gives you much better performance.
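
For the curious, a union mount of that kind looks roughly like this (branch paths are made up, and the aufs module has to be available on the host):

    # A read-only base image shared by many containers, plus a small writable layer per container.
    mkdir -p /containers/c1/rw /containers/c1/rootfs
    mount -t aufs -o br=/containers/c1/rw=rw:/images/ubuntu=ro none /containers/c1/rootfs

    # All writes land in /containers/c1/rw; the shared base image is never modified.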

So what about security? Well, like every (relatively) new technology LXC has its issues. If you use Ubuntu 12.04 they provide a set of Apparmor scripts to mitigate known security risks (like disabling reboot or shutdown commands inside containers, and write access to the /sys filesystem).


I am familiar with Microsoft App-V, VMWare ThinApp, and Symantec Workspace Virtualization. They can help you as a security sandbox, but not as full protection. A virtual machine will be much more secure (and theoretically very strong), although there are security bugs there that enable you to escape it.

Those products work at two levels: using filtering drivers for registry and the filesystem, and hooking into the Windows operating system API.


Virtual machines are not more secure. In fact, there have been more documented attacks where root access on a guest VM has gained shell access on the host than there have been against containers.

This doesn't mean that containers are more secure than VMs either. Attacking VMs attracts more security researchers from what I've seen (but I may be wrong on that point). However, whether you're running a container or a virtual machine, you still need some shared processes (e.g. the 'ticks' of a system clock), and any sufficiently complicated code WILL have bugs that can potentially be exploited.

However the crux of the matter is regardless of whether you're running containers or full blown virtual machines, you cannot escape out of the sandbox without having elevated privileges on the guest to begin with. And if an attacker has that, then you've already lost - regardless of whether the attacker can or cannot escape the sandbox.

Lastly, I'm not sure if you're aware of this or not, but this is a Linux solution and has nothing to do with Windows (I only say this because your post seemed tailored towards Windows-hosted virtualisation)


Are you saying that both approaches have the same level of security or probable insecurities? Or that you can't currently estimate the difference?

Even though I'm aware that this is a Linux solution, I mentioned the Windows technologies that I know technically.


> Are you saying that both approaches have the same level of security or probable insecurities? or that you can't currently estimate the difference?

A bit of both, but mostly the former. In practical terms, they both have the same level of security. But - as with any software - something could be published tomorrow exposing some massive flaw that totally blows one or the other out of the water. However, neither offers any technical advantage over the other from a security standpoint, and from a practical perspective, the real question of security is whether your guest OSs are locked down to begin with (e.g. it's no good arguing which home security system is the most effective if you leave the front door open to begin with).

> Even being aware that this is a Linux solution I mentioned the Windows technologies that I know technically.

That's fair enough and I had suspected that was the case. I just wanted to make sure that we were both talking about the same thing :)


Back in January I got a new laptop, installed Arch on it, got it all nicely set up. And I decided it was high time to start playing with LXC because container virtualization seems extremely promising to me. Created an Ubuntu container, seemed to work fine, and then used lxc-destroy, which took some time.

It destroyed my entire file system. I have no clue how the hell it happened--it floors me that something like that would be possible--and I suspect it's probably simply the result of a newbie like myself somehow misusing userspace tools. But it was enough to turn me off of it for the time being.


I'm interested to see how that standardization works across linux distributions. I mean, different distros use different versions of various libraries (openssl for example). So if your app links with libssl.so.X but the host only provides libssl.so.Y, then your app won't work.

Of course, you could bundle libssl with your app. But then the standardization is at the level of kernel/libc ABI. In which case the container is basically a full LXC guest.

But then why standardize an image format if you can create a small script which builds the image with lxc-create and installs whatever else is necessary for your app? That script will be much smaller than the full image; even a barebones Ubuntu LXC guest (debootstrap quantal) is ~400MB.
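
To illustrate, the kind of script I have in mind is roughly this (the template name and package list are only examples):

    #!/bin/sh -e
    # Build a minimal Ubuntu guest with the stock LXC template (debootstrap under the hood)...
    lxc-create -n myapp -t ubuntu

    # ...then layer the app's dependencies and code into the guest's rootfs.
    chroot /var/lib/lxc/myapp/rootfs apt-get install -y libssl1.0.0
    cp -r ./myapp /var/lib/lxc/myapp/rootfs/opt/myapp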


Their base layer needs to specify exact versions of libraries.

Deploying from images should be much faster and less fragile (oops, is Github down?) than from scripts.


Absolutely. Think of a Docker container as a universal build artifact. Your build step can be an arbitrary sequence of unix commands (download dependencies, build libssl, run shell scripts etc.). Docker can freeze the result of that build and guarantee that it will run in a repeatable and self-contained way, no matter where you run it.

So you get clean separation of build and run, which is a hugely important part of reliable deployment.


So docker container ABI is basically the kernel? You just create a filesystem image and docker starts that as LXC guest. You could build a statically compiled binary and put that as /bin/init into the container image, right?


Yes! Exactly :)

You wouldn't need to save it as /sbin/init. You would just type:

    $ docker run MYIMAGE /path/to/my/static/binary
Here's the smallest image I've personally used to run a docker container: http://get.docker.io/images/busybox


Are the aufs filesystems easily navigable from the host OS, ala Solaris's zones feature?


The aufs mount is (probably) only mounted on the guest's VFS so I'm not sure how you would access it from the host.


The AUFS mountpoint is also reachable from the host. However, each container uses its own `mnt` namespace, so further mounts (done within the container) will not automatically be visible.


Docker lets you visualize changes on any container's filesystem live, as they happen. It also lets you snapshot changes from any container into a new image, and immediately start running new containers from that image. No manual post-processing of the image, no configuration files to templatize. The whole thing is 2 commands and maybe half a second.
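
The flow is along these lines (the diff subcommand name is an assumption here; commit and run appear in the examples elsewhere in this thread, and the image name is made up):

    # Show what a running container has changed on its filesystem (assumed subcommand name).
    $ docker diff 5b4a1ee8

    # Freeze those changes into a new image, and immediately run containers from it.
    $ docker commit 5b4a1ee8 myuser/snapshotted
    $ docker run myuser/snapshotted /bin/bash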


Wow, I need this! What would the requirements for Docker be? I suppose it wouldn't run on Linux 2.6.x (CentOS 5.x)? Any other requirements?


The main requirement is a modern kernel with the aufs module. We do most of our testing on Ubuntu 12.04 and 12.10. But any modern distro should be fine. There are a few people testing it on CentOS as I write this.
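
A quick way to sanity-check a given box (generic commands, nothing Docker-specific):

    uname -r                          # want a reasonably recent kernel
    grep -w aufs /proc/filesystems    # is aufs already available?
    sudo modprobe aufs                # if not, try loading the module (may need an extra package on some distros)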


Note that LXC DOES NOT PROVIDE SECURITY. It provides resource separation (to a point) and so on.

Breaking out of a filesystem container is as easy as creating a root block device. Breaking out of a network container is as easy as creating a network device.

And in all cases, you can just inject memory, load LKMs, etc. That's without mentioning the number of weekly CVEs for Linux namespaces.
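
To make the first point concrete, the classic filesystem escape is as short as this, assuming the container's root is real root and neither capabilities nor the devices cgroup are locked down (device numbers are illustrative):

    # Inside a naively configured container, as root:
    mknod /tmp/hostdisk b 8 1     # recreate the host's first disk partition by major:minor
    mount /tmp/hostdisk /mnt      # mount it...
    chroot /mnt /bin/sh           # ...and you now have a shell on the host's filesystem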


> LXC DOES NOT PROVIDE SECURITY

This is out of date. As of Linux 3.8, or with out-of-tree patches in older kernels, LXC puts each container in its own user namespace, so that root in the container has no privileges outside. LXC also uses network namespaces, so the user inside the container can only do on the network what the admin allows them to do.

Because root inside a user namespace is unprivileged outside it, it can't scribble on memory or load modules, etc., either.
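
You can see that mapping in action with a recent util-linux (the --map-root-user flag may be missing on older versions, and the kernel has to allow unprivileged user namespaces):

    # Become "root" inside a fresh user namespace as an ordinary, unprivileged user.
    unshare --user --map-root-user sh -c 'id'
    # id reports uid 0 inside the namespace, but writing to host-root-owned files
    # (say, anything under /etc) still fails, because that uid 0 maps back to the
    # unprivileged caller outside the namespace.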

See https://wiki.ubuntu.com/LxcSecurity for a decent summary of the situation in Ubuntu's releases. Several Ubuntu contributors are also among the main drivers of LXC upstream.

It's true that user namespaces and other kernel features LXC relies on are beginning to get much more use than they used to, and probably still have flaws, though I think you exaggerate how many CVEs are actually being found. Ubuntu's LXC support also uses apparmor and seccomp to provide further isolation. Conservative users will probably wait a while more to see what bugs get shaken out.


Thanks for your comment! I've been afflicted by FUD related to security in LXC for over a year. This really helps.


And AppArmor is not LXC. Indeed, if you use AppArmor to further restrict LXC you can get some kind of security (as long as there isn't a new CVE every week, that is).


(Copying my answer to a similar question)

Yes, you probably don't want to run untrusted code with root privileges inside a container if anything valuable is running on the same host.

However if that code is trusted, or if you're running it as an unprivileged user, or if nothing else of importance is sharing the same host, then I would not hesitate to use them.

Containers are awesome because they represent a logical component in your software stack. You can also use them as a unit of hardware resource allocation and multi-tenancy, but you don't have to: you can map a container 1-to-1 to a physical box, for example. But the logical unit remains the same regardless of the underlying hardware and multi-tenancy setup, which is truly awesome.

EDIT: details on multi-tenancy.


You're trying to justify the use of LXC for security, IMO. Your webpage does state "strong guarantees of isolation".

If you're sharing nothing of importance on the host, then you don't really need LXC, unless you don't know how to set up MySQL with more than one database, nginx with more than one virtual host, yada yada.

Here's the trick: you CAN use LXC and SUPPLEMENT it by something providing security such as SELinux.


dotCloud engineer here.

LXC lets you use cgroups, i.e. set up memory/CPU/IO limits per container. If you set up MySQL with more than one database, you can't do that.
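
Concretely, that is just a few lines in the container's LXC config (the path and values are illustrative):

    # Append per-container resource limits, enforced via cgroups, to the LXC config.
    cat >> /var/lib/lxc/web1/config <<'EOF'
    lxc.cgroup.memory.limit_in_bytes = 512M
    lxc.cgroup.cpu.shares = 256
    lxc.cgroup.blkio.weight = 500
    EOF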

Also, we DO use LXC and SUPPLEMENT it by something providing security such as GRSEC (in the current version in production at dotCloud) and AppArmor (with docker) :-)


you can actually use cgroups without lxc, btw.

if you do use AppArmor and grsec (the RBAC part of grsec in particular) it's probably acceptable, but I haven't seen it mentioned on the website - and people figure they'll just use LXC "and be safe".


or use OpenVZ which has been designed with security in mind ;)


LXC is actually the work of the OpenVZ team.

When they grew tired of seeing their patches rejected from the mainline kernel, they decided to try a different approach, and that approach is LXC. In other words, LXC is a reimplementation of OpenVZ concepts by almost the same team.

LXC is actually more secure than OpenVZ, if only because it went through more scrutiny than OpenVZ.


Are you sure? I think LXC was developed by IBM. Team OpenVZ is still sticking with very old kernel releases (2.6.32 at most); they may adopt LXC for some of their features, but they aren't very keen on upstream work.


Will there be any possibility of running dev instances in OS X? Perhaps we'll be able to do brew install docker-compat at some point in the future and get a best-effort emulation layer even though the Linux APIs are missing. I hate messing around with virtualization.


I run a Docker host fine on OS X w/ Vagrant. The client runs on OS X, but Linux containers themselves do not.


FYI, aufs, the union mount filesystem required by Docker (and Cloud Foundry's Warden), is not well supported on RHEL, CentOS, and other Linux distros; it seems to be mainly supported on Ubuntu.


More than this, Ubuntu are trying to remove it from their distribution in favour of overlayfs [0]. As mentioned there, the reason is that aufs is not and will not be part of the mainline kernel. Precise 12.04 was going to be without AUFS, but some issues cropped up, keeping it around.

It now looks like overlayfs will make it into the 3.10 mainline kernel [1], so it may be a better choice in the future. I think the "Under the hood" material on Docker's page describes implementation details that can change, so a switch to overlayfs when that becomes more suitable could be possible. (Confirmation from the dotClouders present would be appreciated.)

[0] https://lists.ubuntu.com/archives/ubuntu-devel/2012-February...

[1] http://lwn.net/Articles/542707/


What's the interaction with systems based on Mesos, which can (and many do) use containers? Is this really designed more for multi-tenancy with lower trust than for same-org clouds?


Docker solves the problem of running things in a repeatable and infrastructure-agnostic way. Mesos solves the problem of telling many nodes what to run and when.

In other words, Docker + Mesos is a killer combo. There is already experimentation underway to use Docker as an execution engine for Mesos.


That sounds like a definite improvement over LXC, especially the isolation properties, but I'm not sure what the added value of Docker is compared to OpenVZ.

Any ideas?


My guess is that the container specification is orthogonal to the sandboxing feature. In fact they're using LXC.


Correct, Docker is currently based on lxc, but that is an implementation detail. In theory it could be ported to any process-level isolation tech with similar features: OpenVZ, Solaris Zones - you could also try using BSD jails although I don't know if they have all the required features.

To answer the original question: Docker extends LXC with a higher-level API which operates at the process level. OpenVZ helps you create "mini-servers". Docker lets you forget about servers and manage processes.


Nice. Can't wait to try it out. I've built a similar tool in Go for PandaStream to isolate our encoding processes.


Does anyone know, or can anyone speculate on exactly how the "heterogenous payloads" are specified?


The payload is anything which can be recorded on the filesystem by running a unix command. For example:

     # Run command which adds your payload
     $ docker run base apt-get install curl
     5b4a1ee8

     # Commit the result to a new image
     $ docker commit 5b4a1ee8 nwg/base-with-curl

     # Run a command from the new image. Your payload is available!
     $ docker run nwg/base-with-curl curl http://www.google.com
Docker doesn't care how the payload was added. apt-get, shell script, pip install, gem bundle... All the same. In the end it's just a change to the filesystem.


This sounds like a reimplementation of virtual machines at the OS layer instead of the hardware layer.


It sounds like they are just building on cgroups etc, which are already part of the Linux kernel.

I would argue that virtual machines at a hypervisor/hardware level were just a hack for OSs not living up to their isolation promises/obligations. Strong OS level isolation implementations (cgroups, namespaces etc) allow people to put isolation back where it belongs, the OS.

The job of the OS is to control the hardware; wrapping the OS in software to emulate hardware is ridiculous, and VMs generally have much more performance overhead than isolation containers.


Containers have existed as long as virtual machines have. FreeBSD implemented "Jails" back in the late 90s / yr2000. Linux also has OpenVZ and Solaris has Zones.

If you couple a container with a CoW file system that supports snapshotting (eg ZFS or BtrFS), then you can have most of the features you'd expect from virtualisation but without as heavy footprint.

Containers are an underrated and often forgotten solution in my opinion.


lxc leverages hvm I think... someone correct me?

edit: it's too early, sorry this has nothing to do with your post... but I hope someone does correct me about hvm.


How is this different from what you can already do with lxc on, for example, Ubuntu Server?


It's not that it's any different, it's that it's standardized. The idea is that a Docker container would be portable between different PaaS hosts (and from your own staging environment to those hosts!) without rebuilding, because they'd all be using the "Docker standard for deployment."

A PaaS host saying they supported Docker would imply that they'd be using, for example, SquashFS for container format, AuFS instead of OverlayFS for union-mounts, LXC instead of OpenVZ/Xen/KVM for isolation, and any other set of things your container might subtly rely upon.

The culmination of this, I imagine, would be a PaaS host allowing you to specify the "stuff" you want to run just by the URL of the container-image.


Doesn't a standard involve, you know, standards? AFAIK a product name is not a standard.

What if the namespace changes? What if AuFS changes? What if LXC changes? Independently or all together? ABI changes? Version changes? Feature changes? Are all the licenses compatible? Will it ever support platforms other than just certain versions of Linux? Or languages other than Go?

I don't see a standard. I see marketing for a product and a mailing list to collect potential customers. But maybe I'm missing something.


Hi Peter, this website was only meant to be seen once Docker is actually open-source, which will be the case very soon.

I do think there is a need for a standard way to package and share software at the filesystem and process level - we don't pretend to define that standard, but hopefully we can contribute to it by open-sourcing a real-world implementation.


I guess I read too far into it when I saw the word "standard" everywhere and got excited - sorry about that. Do you plan on adding to your implementation the ability to differentiate between compatible versions/platforms, so one could use this on several cloud instances that aren't built the same?


Yes, that is definitely something we would like to add. And we will gladly accept pull requests :)


I'm just presuming, but:

1. every one of those attributes would be fixed against a given version of the (coming) Docker spec, and a given host would specify what version(s) of the spec they were compatible with.

2. Go is, I think, just the language the glue code is written in; not the language your own things-deployed-using-Docker must be written in.

3. It might support other Linux distros (Fedora, probably), but it won't support other OSes as hosts--because the whole point is to run things that need a POSIX-alike as their "outer runtime" (i.e. not Windows programs, etc.) The way to run these containers on another host will be to run Linux in a VM on that host, and run the containers in the VM--just like the way to play a Super Nintendo game "container" on your computer is to run them in a Super Nintendo VM. [Actually, come to think of it, game ROMs are a great analogy for precompiled SquashFS containers. I would adopt it if I were them :)]


The trick here is that their xmame (Docker) may not be the same build on all hosts, so it may not play the ROMs all in the same way or support all ROMs. A standard works to improve interoperability between different builds/hosts/etc as well as provide an expected set of operations and their results. If all they provide is just one version of one product and call that standardized, that's like releasing a new version of Internet Explorer and calling it a web standard.


Well, this is a good first step in the "free market" standardization process, though: get a public implementation out of what you would imagine standard-conformance to look like. Then, let the other guys (e.g. Heroku) get out their competing implementations. Then, find the similarities, resolve the differences, and write it down. Now you've got a standard.


In practice that does not work. Things get broken, people end up having to support 20 edge cases to use this "universal", "standardized" thing. Depends on the implementation, though.


"HTML 1.0" was the particular standard I had in mind. I guess I'm too used to coding multiplatform Javascript, but "end[ing] up having to support 20 edge cases to use this 'universal', 'standardized' thing" sounds like success in my books--in that you now have a (painfully) interoperating ecosystem, where before you had none. And it all gradually gets smoothed out as the spec evolves over the years, until you can't really tell the difference from a BDUF spec.


The best standards come from ratifying practice, not diktat.


Correct. Docker is the direct result of dotCloud's experience running hundreds of thousands of containers in production over the last 2 years. We tried very hard to put it in a form factor which makes it useful beyond the traditional PaaS.

We think Docker's API is a fundamental building block for running any process on the server.


What about virt-sandbox from Fedora? How does this project overlap with that? Can this handle certain situations better?


So why not use VM images? Why not a VMware vagrant box, for example?


This might work if you only have to run three or four VMs on a box, and run several applications in each. Full PC virtual machines are much too heavyweight, though, for isolating thousands of individual processes per box, especially when most of them might just sit there doing nothing most of the time.

Though! If you want to, you can think of this standard as specifying an "ABI format" for high-level, lightweight VMs that happen to run on a "Linux machine" instead of, say, an "IA32 machine."


One major benefit over VM images is that Linux containers have a very fast spin-up time. This can be especially useful for PaaS providers and CI servers.


If your images use kvm+qcow2 you can just spin up new vms as a delta image using qcow2's support for a backing store.

I want a new instance of WEBSERVER.qcow2?

    qemu-img create -f qcow2 -b WEBSERVER.qcow2 WEBSERVER-$SERIALNUMBER.qcow2
If you're doing this as a PaaS or for CI, you do this as part of your new image creation and then pass in the new qcow2 to your vm (maybe via libvirt). If you aren't doing this or something very similar, you're spinning your wheels and wasting time/resources.


Other benefits: docker images are basically tarballs, which means they are much smaller.

And, importantly, Docker maintains a filesystem-level diff between versions of an image, and only needs to transmit each diff once. So you get tremendous bandwidth savings when transmitting multiple images created from the same base.


Not to be a stick in the mud, but you're using a copyrighted image as your logo without any attribution or acknowledgement of the original owner of the copyright (the Lego Group). You should probably fix that.


You're right! We put it up there when it was an internal project; it's probably time to take care of that. Taking it down until we find a correct way to do it.

I'm a copyright noob: would simply acknowledging the copyright owner be enough and fall under fair use, or should we not use it unless we get written permission?


I am not a copyright expert, however I am familiar with the Lego Group (the company that produces Lego toys) ... the set is pretty old (1986), which means they're not worried about making money off it. If you just stick a disclaimer after it that it's copyright 1986 the Lego Group, that's probably fine, especially since this looks to be an open source project, so you're not going to be charging people for something with their picture on it.

This page has what is very likely the original image: http://www.peeron.com/scans/7823-1/ Peeron.com has special permission directly from Lego to display the images, so if you wanted to be extra careful you could email dan@peeron.com and ask for permission to deep link to their picture (they'd probably say yes, the admin is a linux geek too). But honestly, a simple copyright disclaimer is probably fine. Lego won't reach out and swat you even if they do decide they don't like it, they'll just ask you to take it down.


I would avoid using anyone else's images without their explicit permission. It's just safer that way.


Why should I use this when systemd offers the same (minus the AUFS root fs)?


We get that question a lot... Until people start playing with it. Then they never ask it again :)


I'm not entirely sure that I understand what this does. Is it some sort of hybrid between provisioning automation and deployment automation?


It's a white-labelling of dotCloud's implementation of Heroku's "slug" concept (https://devcenter.heroku.com/articles/slug-compiler): basically, a SquashFS image with a known SHA, storing a precompiled runtime+libraries+code artifact that will never change, able to be union-mounted atop a "base image" (a chroot filesystem, possibly also a known-SHA SquashFS image), then spun up as an ephemeral LXC container. I actually use the idea in my own ad-hoc deployment process; they're very convenient for ensuring repeatability.
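
Roughly, the ad-hoc version of that pipeline looks like this (the paths, names and LXC flags are mine, not dotCloud's or Heroku's):

    # Freeze the built app (runtime + libraries + code) into an immutable, content-addressed image.
    mksquashfs /build/output app.squashfs
    SHA=$(sha1sum app.squashfs | cut -d' ' -f1)

    # Union-mount it read-only on top of a known base rootfs, with a throwaway write layer.
    mkdir -p /run/slug /run/rw /run/rootfs
    mount -o loop -t squashfs app.squashfs /run/slug
    mount -t aufs -o br=/run/rw=rw:/run/slug=ro:/images/base=ro none /run/rootfs

    # Boot an ephemeral container on the result.
    lxc-execute -n app-$SHA -s lxc.rootfs=/run/rootfs -- /app/start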

(As a side-note, this is an example of an interesting bit of game theory: in a niche, the Majority Player will tend to keep their tech proprietary to stay ahead, while the Second String will tend to release everything OSS in order to remove the Majority Player's advantages. This one is dotCloud taking a stab at Heroku, but you can also think of, for example, Atlassian--who runs Github-competitor Bitbucket--poking at Github by releasing a generic Git GUI client, whereas Github released a Github client.)


This is mostly accurate :)

I will add that our implementation predates Heroku's. Using a generic container layer early on (first OpenVZ-based prototypes in 2009) is what allowed us to launch multi-language support a year before any other paas. It's also how we operate both application servers and databases with the same underlying codebase, and the same ops team.


A very bad description would be: "containers are essentially what you get when you cross virtualisation with chroot".

The reason why I give that description to begin with is because containers solve a lot of problems that often lead people towards the virtualisation route. And to the guest "OS", all the applications think they're running on a machine separate from the host. However, containers differ from virtualisation in that there's just one OS (one shared kernel). This means you can only run one unique OS (though if you know what you're doing, you can run multiple different distros of Linux in different containers - but you couldn't run FreeBSD or Windows inside a Linux container). Containers can also have their own resources and network interfaces (both virtual devices and dedicated hardware passed through).

Because you're not virtualising hardware with containers and because you're only running one OS, containers do have performance advantages over virtualisation while still being just as secure. So I personally think they're a massively underrated and underutilised solution.

If you're interested in investigating a little more into containers, Linux also has OpenVZ, and FreeBSD and Solaris have Jails and Zones (respectively). The Wikipedia articles on each of them also offer some good details (despite the stigma attached to Wikipedia entries).

I've not used Docker specifically, but I have used other containers in Linux and Solaris, so I'm happy to answer any other questions on those.


There've been quite a few articles on containers on LWN over the last few months; worth digging around there if you're interested. A couple of the more recent ones:

https://lwn.net/Articles/524952/ - Glauber Costa's talk on the state of containers at LinuxCon Europe 2012

https://lwn.net/Articles/536033/ - systemd containers; these follow on from a theme of containerizing whole distros rather than single apps


Containerization has been around a long time. Probably the start would be chroot, which was developed further by BSD's jails, Solaris zones, and Linux containers.

Basically, these are all ways of running programs with a partitioned subset of access to the OS - in some cases, it may even appear to be an entire installation within that subset.

This (and many of the other mentioned solutions) adds to that by being able to capture and deploy an environment for running a specific app.


GitHub: rolandtritsch


curl get.docker.io | sudo sh

Has anybody tried this?


Looks cool. When is it gonna be open sourced?


"Docker will be open source soon"

Soon could be in tomorrow, a week, a month, a year, or longer ...


ETA 2 weeks maximum.


So what we have here is a presently-closed-source, linux-only implementation of closure copying, which is a tiny part of Nix, which is the platform-agnostic package manager behind the elegant NixOS (nixos.org). Or is there something I'm not getting? If it's more platform-dependent, less well-tested, and (ostensibly temporarily) less open, then why is this on the top of my page?


As a fan of Nix, and someone who expected a lot from Conary/rPath back in the day, I can provide at least a partial answer: these distros are great as long as you use their packaging tools for everything. Which is simply not practical. There's a reason Nix and rPath never took off, and it's the same reason developers use virtualenv and rvm instead of system packages. They don't like being forced to use a single packaging tool for everything.

Docker doesn't have an opinion about how you package things. It only cares about the resulting changes on the filesystem. So you are free to use the best tool for each job.

By the way this means you can use Docker and Nix together. I would love to see that :)



