Of Containers, Dockers, Rockets, and Daemons

wmf · on Dec 6, 2014

It seems like it ought to be easier to document Docker than to rewrite it from scratch, but maybe I'm wrong.

Likewise if Docker is already one big binary it shouldn't be impossible to create a daemonless mode (Doxorcism?).

mpasternacki · on Dec 6, 2014

Docker is not static. If much of it is implementation-defined, and the implementation is evolving, then documenting a particular version of Docker is only as valuable as my willingness to stay locked to that particular version. And the APIs are not stable: just look at https://github.com/jenkinsci/docker-plugin/issues/115 to see how a Docker upgrade can break a project that uses its APIs (and if the plugin used the CLI rather than the HTTP API, it would still break: the cause was a change in the formatting of information that was not available anywhere in a parseable form, and as far as I know is still available only as "pretty text" to be parsed by a regexp).

wmf · on Dec 6, 2014

Yes, documentation would need to be an ongoing process combined with a policy that code changes need to also update the relevant documentation. This may make development slower but revolutions aren't exactly efficient either (or I've just been playing too much Unity).

mpasternacki · on Dec 6, 2014

What good is updated documentation if a change breaks my code? The good way would be to first define and stabilize interfaces, and work from there - but this would require stopping for a while to think, and refraining from adding new features in the meantime, which doesn't seem likely with Docker.

And while I don't support "revolution" (or even think that Rocket is necessarily better than Docker, or that Better One Should Win), I really appreciate a break from the monoculture. Having one large player in any area is harmful. Even if Rocket stays a niche project, its existence on the market will influence Docker's strategy.

shykes · on Dec 6, 2014

Is there a particular feature you wish Docker had not added? 1.4 RC had 2 months worth of stabilization and basically no new features.

mpasternacki · on Dec 6, 2014

It's less about particular features (none of these is individually harmful, but I personally find the monolithic architecture a bit of an issue), and more about taking time to actually define stable and complete APIs and boundaries. Stabilization (from technical reliability POV) is all good and fine, but - for instance - is there a way to easily find out containers' names and links, and exact forwarded ports, from the API without regexping `docker ps` output and "tcp/1234"-like strings that should be meant for human eyes only? Is it still Docker's official position that CLI is only official interface, and using HTTP API is frowned upon? Is all information about a container available in `docker inspect` or API equivalent, or do I still need to parse `docker ps` to find out about names and links? Is the registry/index mess ever going to go away, did it even make sense (other than vendor lock-in) at some point? Since Docker and registry API is HTTP, why can't I use http auth (even implemented by an nginx proxy) instead of certs (which are fairly new) or centralized index for authentication? It should be as simple as using `https://user:pass@IP/…` URLs, what's the problem with it? While we're at it, why on Earth is HTTP Docker endpoint specified as `tcp://IP:PORT` rather than `https://IP:PORT`, which would make it possible and easy to proxy, use http auth in an intuitive way, and generally reflect how it works rather than obscure it? Should I go on?

shykes · on Dec 6, 2014

> (none of these is individually harmful, but I personally find the monolithic architecture a bit of an issue)

The architecture is not monolithic. I want to break down Docker into the most discrete and composable parts possible. But I want to do that without hurting the user experience. If you can point out a way to do that, I will implement it. And if you are available to help, it will happen faster.

> is there a way to easily find out containers' names and links, and exact forwarded ports, from the API without regexping `docker ps` output and "tcp/1234"-like strings that should be meant for human eyes only?

You are not expected to parse 'docker ps' output programmatically. That is not and has never been the recommended way to interact with Docker programmatically.

Yes, the current API expects clients to parse strings like tcp/1234. A string of the form [PROTO/]PORT seemed like a pretty reasonable thing to parse. But if you would prefer a json object like {"proto":"tcp", "port":1234}, I don't have a fundamental problem with that. Feel free to open a github issue to suggest it. In general we frown upon randomly breaking things but if it improves the situation for clients we will consider it. It could also make for a nice first patch if you're interested in that :)
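To make the two representations under discussion concrete, here is a minimal sketch of parsing a `[PROTO/]PORT` string into the JSON-style object suggested above. The names `PortSpec`, `parse_port_spec` and `to_json_object` are hypothetical, and defaulting to `tcp` when the protocol is omitted is an assumption, not something the thread confirms.

```python
from typing import NamedTuple


class PortSpec(NamedTuple):
    proto: str
    port: int


def parse_port_spec(spec: str, default_proto: str = "tcp") -> PortSpec:
    """Parse a '[PROTO/]PORT' string such as 'tcp/1234' or '1234'.

    Assumption: a missing protocol means `default_proto`.
    """
    proto, sep, port = spec.rpartition("/")
    if not sep:
        # No slash at all: the whole string is the port.
        proto = default_proto
    if not proto or not port.isdigit():
        raise ValueError(f"not a [PROTO/]PORT string: {spec!r}")
    return PortSpec(proto=proto.lower(), port=int(port))


def to_json_object(spec: str) -> dict:
    """Return the structured form, e.g. {"proto": "tcp", "port": 1234}."""
    parsed = parse_port_spec(spec)
    return {"proto": parsed.proto, "port": parsed.port}
```

Every client has to carry some version of this parser for the string form; with the object form the port is already an integer and the protocol is already a separate field.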

> Is it still Docker's official position that CLI is only official interface, and using HTTP API is frowned upon?

That was never true. I'm not sure how you got that idea. The HTTP API has existed forever, the official client uses it for 100% of its functionality, and I definitely recommend using that as the primary mode of interaction with Docker. I acknowledge that the API itself is not perfect, and will welcome any suggestions for improving it (we are discussing quite a few improvements already on #docker-dev).

> Since Docker and registry API is HTTP, why can't I use http auth (even implemented by an nginx proxy) instead of certs (which are fairly new) or centralized index for authentication?

You can use all these things with the registry API. The centralized index is completely optional for authentication.

I'm curious where you got all these incorrect notions? If you read it somewhere in the docs, then it's a documentation bug and I would appreciate it if you could point it out so we can fix it.

> It should be as simple as using `https://user:pass@IP/…` URLs, what's the problem with it?

I don't understand that part. I believe Docker registry auth today uses basic auth over TLS by default, which has security issues (ie storing your password in clear in your home directory). We are working to add a more secure token-based auth. But you should always be able to use basic auth if you prefer that.
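For readers unsure what a `https://user:pass@IP/…` URL means on the wire, the following sketch shows how the credentials in such a URL map onto a standard HTTP Basic `Authorization` header. It is generic HTTP, not Docker or registry code; the function name `basic_auth_from_url` is hypothetical, and whether a given Docker version accepts credentials in this form is not something this example establishes.

```python
import base64
from urllib.parse import unquote, urlsplit


def basic_auth_from_url(url: str):
    """Split 'scheme://user:pass@host/path' into (clean_url, headers).

    The returned URL has the credentials removed; the headers dict carries
    them as an HTTP Basic Authorization header instead.
    """
    parts = urlsplit(url)
    if parts.username is None:
        raise ValueError("URL carries no credentials")
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean_url = f"{parts.scheme}://{host}{parts.path}"
    return clean_url, {"Authorization": f"Basic {token}"}
```

Note that the header is only base64-encoded, not encrypted, which is why basic auth is normally used over TLS and why storing the password in clear remains a concern.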

> While we're at it, why on Earth is HTTP Docker endpoint specified as `tcp://IP:PORT` rather than `https://IP:PORT`, which would make it possible and easy to proxy, use http auth in an intuitive way, and generally reflect how it works rather than obscure it?

I don't understand your question here either. Are you saying you would like the format of the command-line option docker -H to start with https:// instead of tcp://? If so, I think that's a good idea, but it also seems like a ridiculously small and cosmetic problem. I also don't understand how it affects http proxying or auth in any way. Regardless of the command-line flag, the API actually uses https. So auth and proxying should work completely as expected (with the exception of `attach`, which drops to the raw TCP socket to allow bi-directional streams). But we have a websockets endpoint for that and I would really like to deprecate the "drop to tcp" mode soon.

Alupis · on Dec 6, 2014

> If you can point out a way to do that, I will implement it. And if you are available to help, it will happen faster.

> It could also make for a nice first patch if you're interested in that

> I'm curious where you got all these incorrect notions?

> it also seems like a ridiculously small and cosmetic problem

A lot of your comments come across snarky and backhandedly hostile.

Regarding the tcp:// vs https:// -- I think the parent comment is pointing out that Docker seems to have just made up a new address prefix which is used nowhere else and certainly is not standard. Even if it's a "ridiculously small and cosmetic problem", it should be fixed, and probably never should have been (perhaps hinting at the comment about slowing down to think instead of focusing constantly on new features and rapid fire). If it's the only small cosmetic problem, probably wouldn't be an issue... but add up many, and well...

Regarding the constant comments to commit -- I completely understand wanting others to put their ideas and critiques into PR's... but you have to see where it can come off a little condescending. Your users, of which not all are engineers, are offering critiques which will make your platform better for all, and your responses imply you will not consider any of them as valid unless they are in code, and even then, it's only a maybe.

shykes · on Dec 6, 2014

The flag is tcp:// for historical reasons. In earlier versions Docker did not use an http api, but a simple custom rpc protocol. Therefore it would have not made sense to use http:// . Then, when we introduced the http api, we decided to keep the flag for reverse compatibility (after all, as far as the user is concerned the functionality is the same: point to this network address, and you can control the daemon). We never really thought about it after that. I believe you're the first to complain about it. So, again - if you open an issue we will address it.

On suggesting commits: my intention is really not to be condescending, but to seek opportunity to recruit new contributors :) I even specified "if not, that's ok" to emphasize that. I have found that frustration about something in Docker has been the best source of new, valuable contributions. I'm just trying to catalyze that. And yes, I absolutely agree that non-code contributions are just as valuable as code contributions. Even taking the time to file a bug report, or suggest an improvement, is a contribution of your time and hugely appreciated. If you got anything else from my message, it was not intended.

Alupis · on Dec 6, 2014

> I believe you're the first to complain about it

I did not complain about it, the Parent post did, I simply clarified a point you seemed to not understand. But now that makes it twice it's been brought up.

> So, again - if you open an issue we will address it.

I do believe it's been brought to your attention here sufficiently for it to be addressed without my involvement.

Alupis · on Dec 6, 2014

> The architecture is not monolithic. I want to break down Docker into the most discrete and composable parts possible

This is something I've witnessed a number of Docker employees do over and over on HN -- state Docker is or is not something currently, based on a future promise.

This statement is a contradiction. You state Docker is not monolithic -- but then the next sentence says you want to break Docker down into composable parts. Which is it?

shykes · on Dec 6, 2014

> You state Docker is not monolithic -- but then the next sentence says you want to break Docker down into composable parts. Which is it?

It's both.

Docker is already quite modular. For example storage and execution backends are swappable using a driver interface. Image storage is a separate binary. Clustering and machine management are a separate binary. The primitives for interacting with kernel namespaces and cgroups are in a separate project called libcontainer. We are developing authentication and security features in a separate repo called libtrust.

And at the same time, I believe we can do better, and made it a high-level goal of the project to improve modularity. I am simply stating that high-level goal.

vacri · on Dec 6, 2014

> You are not expected to parse 'docker ps' output programmatically.

On the other hand, it's also not very friendly for terminals, either. Huge amounts of pointless whitespace means 'docker ps' is a confusing mess on a standard 80-char terminal.

The CLI is a non-uniform experience. Whether you're starting or stopping a docker container, you get the same output - the name of the container you supplied. No 'starting', no 'stopping', no 'ok', just the literal string you gave it. So 'docker ps' is meant for human eyes only, but 'docker start/stop' doesn't consider it worthwhile to provide a human-friendly string?

Just recently I've been playing with docker.py, and the inconsistencies continue there. On the cli, the port and volume bindings are assigned when you 'docker create' a container, then you just start/stop it. In docker.py, the port and volume locations are exposed in 'create', but not actually bound there - they're bound in 'start'... which applies the settings to the container's config permanently, so that other methods of starting (cli) also pick them up. It's all very weird.

mpasternacki · on Dec 6, 2014

> You are not expected to parse 'docker ps' output programmatically.

I see that my particular pet peeve (there's no name or link information in `docker inspect`) has been fixed. Nice! Still, I can't imagine the thought process that leads to express links in JSON as `["/foo:/bar/foo","/baz:/bar/baz"]` rather than `{"foo":"foo","baz":"baz"}` (or even `{"/bar/foo":"/foo","/bar/baz":"/baz"}` if you're attached to the weird hierarchical names). I similarly cannot imagine reasoning that leads to providing port number in JSON as a string, with prefix.
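As an illustration of the plumbing this format pushes onto clients, here is a minimal sketch that converts the list form into the dictionary form suggested above. The function name `links_to_dict` is hypothetical, and the reading of each entry as `/TARGET:/CONTAINER/ALIAS` (mapping alias to target name) is an interpretation of the examples in this comment, not a documented contract.

```python
def links_to_dict(links):
    """Turn ["/foo:/bar/foo", "/baz:/bar/baz"] into {"foo": "foo", "baz": "baz"}.

    Assumption: each entry is "/TARGET:/CONTAINER/ALIAS"; the result maps
    the alias to the name of the linked (target) container.
    """
    result = {}
    for entry in links:
        target, sep, alias_path = entry.partition(":")
        if not sep:
            raise ValueError(f"not a link entry: {entry!r}")
        target_name = target.lstrip("/")
        # The alias is the last path component of the hierarchical name.
        alias = alias_path.rsplit("/", 1)[-1]
        result[alias] = target_name
    return result
```

With the dictionary form in the API response, none of this string surgery would be needed on the client side.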

> > Is it still Docker's official position that CLI is only official interface, and using HTTP API is frowned upon?
>
> That was never true. I'm not sure how you got that idea.

Cannot find it now: it's not easy to google out (and if it was in the docs, I'd need to go through web.archive.org), and I didn't think about archiving it myself for later use. I think this was either in (possibly earlier version of) docs, or in some GitHub issue. The sentiment seems to stand, though: Docker's own code base doesn't even include any usable API client, and many seemingly basic CLI operations are very convoluted in API (how do I `docker run` with API? Is it still split across at least two separate calls?). Some of these problems are echoed in https://github.com/docker/docker/issues/7358
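The two-call shape asked about above can be sketched as follows. This assumes the Remote API's `POST /containers/create` and `POST /containers/{id}/start` endpoints; the helper names and the `(method, path, body)` tuple layout are hypothetical, nothing is sent over the network, and a full `docker run` equivalent would also involve steps such as pulling the image, attaching and waiting, which this sketch leaves out.

```python
import json


def create_request(image, cmd):
    """Describe the call that creates (but does not start) a container."""
    body = json.dumps({"Image": image, "Cmd": list(cmd)})
    return ("POST", "/containers/create", body)


def start_request(container_id):
    """Describe the call that starts a previously created container."""
    return ("POST", f"/containers/{container_id}/start", None)


def run_plan(image, cmd, container_id):
    """List, in order, the requests that roughly correspond to `docker run`.

    `container_id` stands in for the "Id" field that the create call
    returns; a real client would read it from that response.
    """
    return [create_request(image, cmd), start_request(container_id)]
```

The point of the sketch is only that the client has to sequence the calls itself and thread the returned ID from the first into the second.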

Some (possibly most of all - I certainly hope so!) are solved, but the API docs were good at listing the endpoints and (some of) the parameters [I recall JSON format was incompletely described, or not kept up to date]. There were virtually no pragmatic docs on how to use the API to achieve the equivalent of, say, a `docker run` invocation. And while I obviously can open an issue or fix it in a PR, reflecting problems back at the community with "it's open source, if it bothers you, go fix it yourself" doesn't seem a proper approach. In particular, if the problem is about keeping documentation in sync with code (format of JSON described in API docs did at some point diverge from actual behaviour; if you want me to go through the docs right now and verify each piece, I'd be happy to do that, but let's discuss the consulting contract first), it's not reasonable to expect your users to pinpoint the issue by going through Git changelogs.

> You can use all these things [HTTP Auth] with the registry API.

Can I `docker pull` an image that is behind plain http auth (e.g. at a registry proxied to https://foo:bar@example.com/)? How do I specify that? Last time I tried to figure this out (before 1.2.0, after 1.0.0, so it may have changed), this was either not possible, or completely undocumented.

And while we're at it, what exactly would be the problem with letting me pull an image available at https://foo.bar@example.com/path/to/image.tar? Why is there even a separate registry API rather than plain HTTP? This part still eludes me.

Re tcp://… vs http://…: it's obviously functionally equivalent and cosmetic, unless you want to write some plumbing and you have to add logic to translate from the tcp:// format (which doesn't seem to have any actual reason to be written this way) to an http:// URL and back. Similar thing with the `tcp://` prefix for ports forwarded from linked containers: I either have to use the long form that includes the original port number, or parse out the irrelevant `tcp://` piece (yes, I know there could be udp:// [but why the double slash? is there any URI path that could follow?]; the protocol is usually obvious from the context, and the need to parse out the irrelevant text is actual overhead).

So, for Docker's convenience (or because of a bit of thoughtlessness when something was first implemented), there's a small but significant overhead incurred on a lot of current and future pieces of plumbing and containers, because for some reason (was there any?) you decided `tcp://` looks better than `http://` for an HTTP URL, and `tcp://1.2.3.4:567` looks prettier than just `1.2.3.4:567`.
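The translation logic described here is small, but every piece of plumbing has to carry some version of it. A minimal sketch follows, using only the standard library; the function names are hypothetical, and choosing `http` versus `https` via a `tls` flag is an assumption about how a caller would decide.

```python
from urllib.parse import urlsplit


def docker_host_to_url(host: str, tls: bool = False) -> str:
    """Translate 'tcp://IP:PORT' into 'http://IP:PORT' (or https)."""
    parts = urlsplit(host)
    if parts.scheme != "tcp" or not parts.netloc:
        raise ValueError(f"not a tcp:// endpoint: {host!r}")
    scheme = "https" if tls else "http"
    return f"{scheme}://{parts.netloc}"


def url_to_docker_host(url: str) -> str:
    """Translate 'http(s)://IP:PORT' back into 'tcp://IP:PORT'."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return f"tcp://{parts.netloc}"


def split_host_port(value: str):
    """Strip the scheme from 'tcp://1.2.3.4:567' and return (host, port)."""
    parts = urlsplit(value)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"no host:port in {value!r}")
    return parts.hostname, parts.port
```

For example, `split_host_port("tcp://1.2.3.4:567")` yields the host and the integer port that a bare `1.2.3.4:567` would have given with a single split.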

mpasternacki · on Dec 6, 2014

BTW, it's interesting that you chose to reply to "refrain from adding new features" part (which may have been unnecessary snark on my side), rather than to "stopping for a while to think" part (which is the actual point)…

shykes · on Dec 6, 2014

It seemed like the same question to me.

On "stopping for a while to think", I suppose it's subjective. From my point of view we spend insane amounts of time thinking about the right design for all these things. At the same time we receive approximately 50 pull requests a week, and a lot of them are for small incremental improvements which are very reasonable and could be merged very easily with little effort. So, there is a balance between "feature all the things!" and "sorry we're not merging anything for 3 months while we rethink everything from scratch". Every large scale open-source project deals with that tension. We definitely have room for improvement. But we care very much about good, minimal design, and not breaking APIs.

andrewguenther · on Dec 6, 2014

Documentation is not the primary problem with Docker that Rocket is trying to solve. The issue is Docker's process model, which is fundamental to the way it is built. Docker seems to have taken the opinion that the best way to solve the problems with their process model is not to fix it, but add more features to the kernel to force it to work. Rocket went with the fix it approach and simply started over.

Also, would you mind clarifying what you mean by "daemonless mode"?

wmf · on Dec 6, 2014

By daemonless I mean making "docker run" be able to directly fork a container without needing dockerd. Like how "rkt run" works.

mpasternacki · on Dec 6, 2014

What's interesting is that Docker used to run (almost) this way some time ago. I guess the daemonless mode won't fly with managing communication (iptables NAT, IP assignment, linked containers, etc).

shykes · on Dec 6, 2014

Yeah we used to call it "standalone mode", it was pretty neat but really confused users (it would auto-detect whether to go daemon mode or not, so behavior became contextual and hard to troubleshoot).

Yes, managing communication or other centralized resources is harder that way which is a challenge for "embedded Docker". Rocket does not have this problem because it relies on systemd to manage all this centrally under the hood. So you get "daemonless" as long as you sweep everything under the giant systemd rug :)

mpasternacki · on Dec 6, 2014

It doesn't need to be systemd, and that's the beauty of Rocket's design: it is composed of individual, well defined pieces with precise boundaries and areas of responsibility. As I wrote, my main practical concern now is to get something container-like working on FreeBSD. Docker is useless here unless it's wholly ported; Rocket is useful even if I don't use its code and go with (even partial) implementation of the spec.

From my practical POV, my options now are: port Docker (NOPE), reimplement Docker (NOPE GOD WHY), port Rocket (Maybe?), reuse spec and pieces of Rocket's code in my own opinionated NIH plumbing (Hell yeah, somebody did the thinking part for me! The spec is usable!), or write own opinionated NIH plumbing from scratch (why would I if there's a decent spec to lean on?).

Alupis · on Dec 6, 2014

This is truly why Rocket is so interesting. The App Container Spec is what is missing from Docker. They had their chance to implement a common universal spec which others could implement independently, but didn't do so. Now CoreOS is picking up the ball and running. I don't see any option for Docker other than to work with CoreOS to form this universal specification, or to adhere to the specification after it's been finalized.

wmf · on Dec 6, 2014

Often people want to run without dockerd because they want to use a different orchestration system (systemd, OpenStack, etc.) and usually that system sets up the networking and such.

shykes · on Dec 6, 2014

Yes, what I call "embeddable Docker" is a desirable thing. It will help Docker integrate with other centralized daemons that want to own the process tree, like systemd. I've made it very clear in the past that I want this too, but would welcome help to implement it. I'm actually putting together a "hacking sprint" so that the various interested parties can participate in making it happen. I actually offered the CoreOS guys to join the effort last week - but they seem determined to do it themselves instead. That's OK but they should know they are welcome back if they change their minds.

So, anybody want to help hack on embedded Docker?

See also this good, unbiased explanation of the relationship between Docker, CoreOS and systemd: http://www.ibuildthecloud.com/blog/2014/12/03/is-docker-fund...

groks · on Dec 6, 2014

The next systemd hackfest is on Jan 30th:

  https://plus.google.com/events/c56kbn26s6g01n6m4tj2nmdgnfc

If someone could fix docker so that projects like this are not required:

  https://github.com/ibuildthecloud/systemd-docker

...that would be great!

simonebrunozzi · on Dec 6, 2014

LOL. Doxorcism is pure genius. :)

pwernersbach · on Dec 6, 2014

I hate to be that guy, but Slackware has been around for longer than FreeBSD; it beats it by a few months. One of Slackware's goals is to be Unix-like, and that decision is definitely evident in the distribution.

mpasternacki · on Dec 6, 2014

Thanks for pointing that out! Post updated. Slackware was the first distro I ever used seriously, after getting hooked on Linux with a Red Hat 5.0 or 5.1 CD that was attached to a (printed) IT monthly newspaper. I have to say, FreeBSD feels kind of similar to Slackware.

vmiroshnikov · on Dec 6, 2014

Check out CBSD (http://www.bsdstore.ru/en/about.html). Seems like a viable management tool for jails.