I've read[1], and it is excellent to have options, but I'd like guidance on what is actually usable and useful, especially via Docker.
Background: I want to be able to execute arbitrary code submitted from the web. At the moment I'm sandboxing it inside a ZeroVM[2] container, inside a Docker (LXC) container.
I'd like to lock down the Docker container more than the default (eg, disable outgoing network connections), but I don't really know where to start.
(And yes, I do find it strange that it was easier to get an entirely new, experimental and undocumented container running inside LXC than it was to work out how to configure LXC. But layers are good, right?)
ZeroVM is a perfectly correct answer; that is exactly what it is designed for. The security model is excellent, with far better guarantees than anything you will get from containers.
If you want another answer that works, well what kind of code is it? That does make some difference. I am assuming it is some scripting language (as I presume you are not compiling it for zerovm). And I assume it is not one with a trustable sandboxing model (I only really trust Lua, and that is with caveats).
You can lock down networking by not having any (assuming you use e.g. a pipe to communicate), or with iptables, or by using seccomp mode 2 to filter system calls. The third is the most general form of extra filtering, but you do need to know exactly what your container needs to do; the more minimal it is, the better.
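For the first option (no networking at all, talk over a pipe), here is a minimal sketch of what the invocation could look like. It assumes a later Docker release than the one discussed in this thread: the `--network none`, `--cap-drop`, `--security-opt no-new-privileges` and `--memory` flags come from current Docker, and the image name and command are hypothetical placeholders. The function only builds the argument list; it does not run anything.

```python
import shlex


def locked_down_docker_argv(image, command, memory="256m"):
    """Build (but do not run) a `docker run` argv with networking disabled.

    Assumption: a modern Docker CLI. The image name and command passed in
    are placeholders chosen by the caller.
    """
    return [
        "docker", "run", "--rm",
        "-i",                                   # keep stdin open so a pipe can carry the job
        "--network", "none",                    # no interfaces except loopback
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # setuid binaries cannot regain privileges
        "--memory", memory,                     # cap memory use of the untrusted code
        image,
    ] + list(command)


if __name__ == "__main__":
    # Hypothetical image and entry point, for illustration only.
    argv = locked_down_docker_argv("sandbox-image", ["python", "/sandbox/run.py"])
    print(" ".join(shlex.quote(a) for a in argv))
```

You would then feed the submitted code to the container on stdin and read the result from stdout, so the container never needs a network interface.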
I'm interested in the seccomp option, but it looks pretty intimidating to set up. I suspect I could do much of what I need with AppArmor, but I have no idea how to apply that to a single Docker container.
Edit: I also tried the MBox sandbox, but it doesn't work on Ubuntu.
You can always use Tomoyo instead of AppArmor (it is also available in Ubuntu). It lets you configure specific rules for each domain, where a domain can be "this app" just as well as "this app started by that script", so hopefully you can differentiate between them.
seccomp is pretty intimidating, and the lxc config makes it even harder (numbers not names for the syscalls!). Conceptually it is fairly simple, and it is quite fun to play with, but you really need tests with very good code coverage (including error handling) to know which syscalls you need, and it will vary if you change any software potentially. There are audit tools though, you could give it a go.
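One rough way to "give it a go" is to trace the workload with `strace -f` and collect the syscall names it uses as a first draft of a whitelist. The sketch below assumes the default strace line format (no timestamp options), and the sample trace is made up for illustration. It only sees syscalls your test runs actually exercised, which is exactly the coverage problem described above.

```python
import re

# Matches "name(" at the start of a line, optionally preceded by the
# "[pid 1234] " prefix of `strace -f` or the bare "1234 " prefix of `-o file`.
_SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscalls_from_strace(lines):
    """Return the sorted set of syscall names seen in strace output lines."""
    names = set()
    for line in lines:
        match = _SYSCALL_RE.match(line.strip())
        if match:  # skips signals, exit markers and "<... resumed>" lines
            names.add(match.group(1))
    return sorted(names)


# Made-up sample in the default strace format, for illustration only.
SAMPLE = """\
execve("/usr/bin/python", ["python", "job.py"], 0x7ffd... /* 20 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[pid  4242] read(3, "..."..., 4096) = 832
[pid  4242] <... read resumed>) = 0
--- SIGCHLD {si_signo=SIGCHLD} ---
+++ exited with 0 +++
"""

if __name__ == "__main__":
    print(syscalls_from_strace(SAMPLE.splitlines()))
    # prints ['execve', 'openat', 'read']
```

Anything the tests never triggered (error paths especially) will be missing from the list, so treat the result as a starting point rather than a finished policy.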
AppArmor has a rather simple "deny network" rule, which might be a good starting point... but I haven't spent much time with it and am not sure how to apply it to one container either. Maybe apply it to the Python in the container, not the container itself? Might be easier.
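To make the "deny network" idea concrete, here is a minimal sketch that renders such a profile as text. The profile name is hypothetical, and the allow-all `file,` rule is an assumption made only to keep the example short; a real profile would want much tighter file rules. As far as I know, later Docker releases let you attach a loaded profile to a single container with `--security-opt apparmor=NAME`, but that is not something available in the version discussed here.

```python
PROFILE_TEMPLATE = """\
#include <tunables/global>

profile {name} flags=(attach_disconnected,mediate_deleted) {{
  #include <abstractions/base>

  # Assumption for brevity: allow all file access. Tighten this in practice.
  file,

  # The rule of interest: refuse every socket family and type.
  deny network,
}}
"""


def render_no_network_profile(name="sandbox-no-network"):
    """Return AppArmor profile text that denies all network access.

    The default profile name is a hypothetical placeholder.
    """
    return PROFILE_TEMPLATE.format(name=name)


if __name__ == "__main__":
    # Write the output to a file and load it with `apparmor_parser -r <file>`.
    print(render_no_network_profile())
```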
This is one place I think Solaris (and Illumos) is still ahead. Not only do they have a full privilege system, but there is an easy-to-use CLI tool, ppriv(1), to control privileges on a per-process basis. You can start a process but drop its network privileges, or its file-write privileges (with some files possibly whitelisted), or its ability to spawn other processes, etc. There's also a "privilege debug" mode, so if the process crashes as a result, you can figure out what prohibited stuff it was trying to do. That allows an approach of just dropping all privileges to start, and then whitelisting the few things it needs.
FreeBSD's 'capsicum' and Linux's 'seccomp' look like they can conceptually do the same thing, but afaict there isn't yet a good command-line interface to them that lets you drop privileges of unmodified binaries.
For a simple case I could write a C wrapper that just drops privileges, but it'd be nice to have a more versatile CLI tool. Doing that in the general case, e.g. letting me specify options like "no network, no writing files except A and B, no reading files except files in this directory, no spawning processes", requires more or less porting something like ppriv(1) and its privilege-specification syntax to FreeBSD, or writing a workalike.
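For the simple case, a wrapper along these lines is possible with plain POSIX resource limits. To be clear, this is not a privilege system and nowhere near ppriv(1): it is a rough approximation of "no writing files, no spawning processes" only, it does nothing about networking or reads, and the function names are made up for this sketch.

```python
import resource
import subprocess
import sys


def _apply_limits():
    """Runs in the child after fork() and before exec()."""
    # Any write that would grow a regular file now fails (pipes are unaffected).
    resource.setrlimit(resource.RLIMIT_FSIZE, (0, 0))
    # No new processes or threads for this user; generally not enforced for root.
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))


def run_restricted(argv, timeout=10):
    """Run an unmodified program under the limits above and capture its output."""
    return subprocess.run(
        argv,
        preexec_fn=_apply_limits,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )


if __name__ == "__main__":
    # The child just reports the file-size limit it inherited.
    child_code = "import resource; print(resource.getrlimit(resource.RLIMIT_FSIZE)[0])"
    result = run_restricted([sys.executable, "-c", child_code])
    print(result.stdout.decode().strip())
```

Because the limits are inherited across exec, the target binary does not need to be modified, which is the property the ppriv-style tool would need as well.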
you really need tests with very good code coverage (including error handling) to know which syscalls you need, and it will vary if you change any software potentially
This is where I run into trouble - given that I want to sandbox arbitrary code in theory I should be able to define what I want to allow, and then set it up. But the practice seems.. esoteric.
you could give it a go.
I seem to end up doing that a lot, for every single thing I try in this area.
Maybe apply it to the python in the container not the container itself?
That would actually be Python-in-ZeroVM. But yeah, that's an interesting idea.
The other thing is that ZeroVM currently has no networking available via Python. So there is some protection there, too.
ZeroVM is very interesting, but my intuition would be to put the Docker container inside the ZeroVM, instead of the ZeroVM inside the Docker container. Is my intuition weird? Perhaps I misunderstand ZeroVM.
My train of thought would be that you'd put the strongest isolation on the outside.
Docker is evolving towards a generic "container engine" with swappable execution backends. In that architecture, lxc becomes one possible backend. ZeroVM, OpenVZ, kvm/qemu or plain chroot could be used as backends under the same management API.
In the case of ZeroVM it's hardly going to be a transparent change. You'll have to recompile all your code for one thing, and for another ZeroVM doesn't (currently) supply anything like the APIs a "normal" VM does.
My understanding is that recent (13.10+) Ubuntu ships with an AppArmor profile that blocks known ways of escalating out of the container. So, LXC under Ubuntu is in better shape than LXC on other distros, where you can use the uevent_helper trick[1] to easily escape the container.
It sounds like your use case could also use the new unprivileged containers[2] feature, which is going to be a dramatic increase in security (since even if you somehow get access to run a command on the host container, it would not be as root).
Layers of containers won't give you much. The main issue is the shared kernel: any bug there, or any resource shared through it, exposes all containers to the same risk, no matter how many levels of nesting you have.
Layers of VMs, LXC containers, ZeroVMs, etc. will probably make attacks harder, but management and speed will suffer, and the security/usability trade-off will probably not be worth it (given how fast VMs start anyway, if you want more isolation you could just use a VM).
ZeroVM is nice, but it requires porting and, of course, is not a silver bullet either.
Can anybody give me pointers on setting up this scenario?
I would like to run my IRC bouncer inside a container. I want to make sure that this bouncer only makes outbound connections using an OpenVPN instance running outside the container. I think I need to wire up a tun/tap device to the container. The container also needs to somehow accept connections from my IRC client (directly, not through the VPN) while still concealing the connecting IP. Basically, I don't want my ZNC container, or the IRC servers that it connects to, to have access to sensitive IP addresses.
Docker 0.8 added a lot of new features and fixed a lot of bugs, but it is still a work in progress. E.g., you can't rename containers and have to recreate (remove and create) them.
Hi, Docker 0.8.1 was released last week, it fixes several old bugs as well as all known regressions spotted in 0.8. Would you mind trying it out, and letting us know in an issue if we missed your bug?
[1] https://www.stgraber.org/2014/01/01/lxc-1-0-security-feature...
[2] http://zerovm.org/