
Not surprised at all.

And here we have a prime example of why the Docker model of building and distributing containers is horrible when it comes to security and maintenance.

Bundling dependencies for production environments has always been and always will be a terrible idea.




This sounds like an oversimplification, though:

> Bundling dependencies for production environments has always been and always will be a terrible idea.

We're considering Docker currently -- not for the distribution model at all, since we'd only ever use our own internally built & maintained images -- but as a clean way to break apart dependencies, and make it possible to run a diverse multiple-server-type environment (production) in miniature (development, demo, UAT).

I quite like the idea of something that may occupy multiple VMs or dedicated servers in production being able to run as a lightweight app in a dev environment, with exactly the same dependencies in place -- that's quite useful.

If this kind of use case is also a terrible idea, I'm interested to hear more -- we're just now tinkering with the idea, and haven't yet moved from theory to practice.

My own concerns revolve around how easy it will be to keep updated on RHEL patches, for example -- apparently we should be able to keep both host and app dependencies updated without much trouble, but it adds more complexity to the maintenance cycle (it seems).


> My own concerns revolve around how easy it will be to keep updated on RHEL patches, for example -- apparently we should be able to keep both host and app dependencies updated without much trouble, but it adds more complexity to the maintenance cycle (it seems).

That about sums up the "problem" with Docker – it's deceptively easy to roll out everything as its own containerized app. Updating? Not so much.

It turns Docker from a magical silver bullet into a slightly fancier way to handle reproducible deployments. Using it that way is fine, but it's not how Docker is marketed by many.


Actually it's pretty easy; I just did it yesterday for my PostgreSQL container.

Debian/Ubuntu example:

  sudo docker exec -it my_pgsql_container_name /bin/sh -c "apt-get update; apt-get -qqy upgrade; apt-get clean"


And what happens when you launch a new container from the same image? You need to run apt-get/yum again, or rebuild the image.
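A minimal sketch of the rebuild route, with hypothetical image names (the base image and tag are just placeholders):

  # bake the updates into a new image rather than patching running containers
  printf 'FROM postgres:9.4\nRUN apt-get update && apt-get -qqy upgrade && apt-get clean\n' > Dockerfile
  docker build -t my_pgsql_image:patched .

New containers started from my_pgsql_image:patched then carry the updates without any per-container apt-get runs.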


That's why you keep everything with state in a separate volume container. Attach the volume to the built image and that's it.
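A minimal sketch of that pattern, with hypothetical names:

  # data-only container holds the state; the app container mounts its volumes
  docker create -v /var/lib/postgresql/data --name pg_data busybox
  docker run -d --volumes-from pg_data --name my_pgsql postgres:9.4

Replacing my_pgsql with a container built from a patched image leaves pg_data, and the state, untouched.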


You can, if you want, mount your root as readonly so you're not tempted to modify it. Then it behaves like a Live CD.
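Something like this, with hypothetical names (a real service usually still needs a few writable paths, passed as anonymous volumes):

  # everything outside the listed volumes is read-only, much like a Live CD
  docker run -d --read-only -v /tmp --volumes-from app_data myapp:latest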


Mount data, logs, configuration, and any extensions in the data container?

For pg, there might be some migration needed when jumping from one major version to the next, which requires both versions installed, on Debian at least.


>Mount data, logs, configuration, and any extensions in the data container?

Many programs have their state represented as files that are stable across versions. If you have a cluster of containers from the same image with different states, it's more efficient to move volume containers across the network. They're easier to back up and upgrade, too (see the sketch below).

pg is going to give you those problems whether you are using Docker or not.
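On the backup point above, the usual sketch (names are hypothetical) is to tar the data container's volumes out through a throwaway container:

  # mounts pg_data's volumes plus the current directory, then archives the data
  docker run --rm --volumes-from pg_data -v "$(pwd)":/backup busybox \
    tar czf /backup/pgdata.tar.gz /var/lib/postgresql/data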


And that defeats the whole selling point of Docker, which is no forward config: containers do not change once shipped.


Worse, doing this breaks your guarantee that all environments deployed from this image will be consistent. You'll have to deploy some config management software (Puppet/Chef/Salt/Ansible) to stay on top of these changes.


Check out Project Atomic http://www.projectatomic.io/, or its downstream project RHEL Atomic Host. The whole update process for the host is much simpler. Read more about it here: http://rhelblog.redhat.com/2015/04/01/red-hat-enterprise-lin...

Note: I am not affiliated with Red Hat, but we are considering Docker, too, and we are evaluating how Atomic would fit into our infrastructure.


Basically, you're thinking of building a custom PaaS.

I'd just use an existing one. PaaSes require an enormous amount of work to make them featuresome and robust. That's all work you're spending that doesn't produce user-facing value.

I've worked on Cloud Foundry and so obviously I think it's the bee's knees. You might prefer OpenShift.

If you're happy in the public cloud, you can host on Heroku, Pivotal Web Services (my employer's Cloud Foundry instance) or on Bluemix (IBM's Cloud Foundry instance).


First I'd like to point out that you cannot have a miniature version of production, and you cannot reduce maintenance complexity; it violates the fundamental laws of nature. No matter how small, you still have the same number of moving parts, so it's effectively the same when it comes to actually operating and maintaining it.

But lucky for you, Docker provides some ways to run commands on an existing image, like the RHEL patching/updating tools. It should be possible to update an image's files using RHEL's patches, as long as the whole RHEL install is there in the image.

As far as breaking apart these sets of files into disparate dependencies: again, it's totally possible, but it does not simplify nor reduce your maintenance complexity.

Now, some really stupid people would recommend you compile applications from source and deploy them on top of RHEL, and basically build all your deps from scratch. You don't want to do that because a large company has already done that for you and put it into a nice little package called an "rpm". You take these RPMs and you find a simple way to unpack them on the filesystem, make a Docker image out of them, label/version them, and keep them in your Docker image hub. Now you have your RHEL patches as individual Docker images and can deploy them willy-nilly.
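A rough sketch of that RPM-to-image step, with purely hypothetical package, path, and registry names:

  # unpack the RPM into a staging root without installing it on the host
  mkdir -p /tmp/patch-root && cd /tmp/patch-root
  rpm2cpio /path/to/some-patched-package.rpm | cpio -idm
  # turn the unpacked tree into a labeled, versioned image in your internal hub
  tar -C /tmp/patch-root -c . | docker import - internal-hub/some-patched-package:1.0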

(This is, of course, exactly the same as maintenance on systems without Docker, and your dev & production environments would be the same with or without Docker, but Docker does make a handy wrapper for deploying and running individual instances with different dependencies)


Do you believe that locally built "homegrown" deploys on average are going to be better or worse than these images?

Because I know what I'd bet on.


I'd bet on homegrown - the quality of the official Docker images is pretty low when it comes to application ones. Images for OSes are fine, but application images are often not updated when a new version of the application is available until you send a pull request on GitHub. I can do better than that myself.

Also, official images are not production ready; they are apparently intended for development purposes. Take the Django image as an example: the server it runs on startup is not Gunicorn, or uWSGI, or Apache. It is Django's development server. I can do better than that myself.

I don't think that is a problem with Docker - the application. If Docker - the company - does not have the resources to properly maintain so many official images then it shouldn't try to.


Given how hard it is to find hires - even ones coming from good development jobs elsewhere - who understand even the very basics of security, I think you're being overly influenced by your own skills.

You may very well do better than that yourself. I don't doubt that large proportions of HN users would do better.

But how many will?

Consider that the quality of the official Docker images is an illustration of the quality of images from people who are more invested in this than average.

Look at some of the unofficial images, and you will find incredible dreck very quickly.

Now imagine the set of users of images that have not even tried to build their own images yet, and imagine they were asked to put together their own replacements for the official images to use...


Homegrown every time, sadly.

EDIT:

Reason being, you can more easily deal with silly things like goofy hosts, goofy networks, possible lack of internet connections, bad host OS support, etc.

The normal downsides of doing it yourself of course apply.


As I noted elsewhere, I don't doubt that many of us could, and I'd expect HN users to do better on average than developers overall, given the number of security conscious people here.

But that's not what I'm questioning; the question is whether homegrown images on average are going to do better. Look at the non-official images, and see how much nonsense is in there.

If you know you can do better, by all means do. For many of us that is the best option. And I absolutely wish there was more focus on more secure practices for the official images too. But I still think the official images are likely to be better than what most developers would cook up.

Doesn't mean it's good. Just better than the (terrifying) alternative.


I see it as the opposite. If the maintainer of the container put some effort in, everyone could have a secure version of their software with minimal effort.

The trick is to get people to care about their security. In theory, this is what open source is about. Why not assemble a taskforce to go and secure these containers?


The problem is that "secure version" is a constantly moving target; the taskforce would need to go around once a month (or whenever there's an urgent vulnerability discovered) and update apps that needed it.

If Docker apps were somehow integrated with maintained Linux repos, this could be possible by default -- e.g., all Docker images built on Debian stable dependencies would have their internal dependencies auto-upgraded with each Debian stable sub-release, and possibly be flagged as "needs human intervention" on major releases.

Have there been efforts to do anything like this? I'm new to the Docker world....

There needs to be, though, otherwise a "secure app" is always a temporary creation.


>The trick is to get people to care about their security.

It's 2015. If security isn't a priority for a project, then that project is just incompetent. That may sound harsh, but are we really talking about security as optional for internet-facing services? This is what happens when devs build their own systems without the experience of being a sysadmin. There's a lot of kitchen-sink, duct-tape, "does it work? Yes, then we're done" mentality at play here. Not enough people are worrying about maintainability and upgradability.

Heck, most of these things ship with everything running as root. It's like we've regressed to the 90s with Docker and Docker-like technologies.


> Bundling dependencies for production environments has always been and always will be a terrible idea.

If you are not bundling dependencies how do you rollback a deploy that migrated to a new version of a dependency? If you rollback your code, you also have to do something to rollback the dependency.

For Python, I currently rebuild a virtualenv from scratch on each deploy, but it just feels like a poor solution. Docker containers seem like an interesting way to package these dependencies in a way that is portable, where a deploy is just pushing a new version of the Docker container. Is there a Better Way(tm) that doesn't involve me needing to deal with building OS packages for all of my virtualenv dependencies?

(I'll note that several dependencies have C extensions, and are thus not pure Python -- e.g. `itsdangerous` depends on `pycrypto`, which has extensions.)


If you're basing deploys on image builds, you're not "rolling back" anything, but are building a new host (or host image) based on the correct dependencies.

That process relies on your platform's own dependency-resolution system, and I hope you're using something sane such as Debian/Ubuntu, or are building from source via Gentoo. RPM distros can work but tend to be far flakier.

Start with a base install, have a package for your own source which specifies deps, including if necessary _maximum_ version numbers for deps, and build the target image. Once that's built, you can generally deploy that directly rather than re-build for each deployed host.

Packaging and image preparation _aren't_ tasks which can be abstracted away entirely. It's this point which the containers craze founders on the reefs of reality. Yes, packaging software properly is a pain. But not packaging it properly is an even bigger pain.


I think that the "real" problem here is that Python's package management and apt/yum don't really interface well. I've built .debs for Python packages before, and it was a huge pain in the ass, even with the scripts and automations that I was able to find for it.

It's 'simple' for me to build a virtualenv in a directory with `pip install -r requirements.txt` in my source repo, but everything I've read about making those virtualenvs portable (even moving them between directories on the same server you built them on) is that it is a path fraught with peril.


The 'real' problem is that you shouldn't have to install dependencies in OS-level locations for an application-level product.

In other words, the app should be able to bundle dependencies without having to use a crazy opaque container system, and those dependencies should be easily auditable.

This is the case for Java, where dependencies are 1) bundled with the application, 2) declared explicitly, 3) signed, 4) centrally managed with Maven repository software.


In Python, you get similar things, with the exception of "bundled with the application", and IIRC "signed" only happens when uploading to the Python Package Index.


I think you can with the latest version of pip (7):

https://lincolnloop.com/blog/fast-immutable-python-deploymen...


He's talking about exactly what I am currently doing:

> It's now feasible to build a new virtualenv on every deploy. The virtualenv can be considered immutable. That is, once it is created, it will never be modified. No more concerns about legacy cruft causing issues with the build.

> This also opens the door to saving previous builds for quick rollbacks in the event of a bad deploy. Rolling back could be as simple as moving a symlink and reloading the Python services.

This is exactly what I do now: a new virtualenv from scratch on each deploy in the same directory with all other build artifacts (so that each deploy is in a self-contained, timestamped directory that is swapped out with a 'current' symlink). I just bite the bullet on the additional time it takes to deploy.
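For reference, a minimal sketch of that layout (paths are hypothetical):

  # each deploy gets its own timestamped directory with a fresh virtualenv
  RELEASE=/srv/app/releases/$(date +%Y%m%d%H%M%S)
  mkdir -p "$RELEASE"
  virtualenv "$RELEASE/venv"
  "$RELEASE/venv/bin/pip" install -r requirements.txt
  # atomically swap the 'current' symlink, then reload the services that use it
  ln -sfn "$RELEASE" /srv/app/current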

The part of this blog post that affects me is that upgrading to pip 7 would speed up my deploy times.

This part seems interesting:

> Another possibility is building your wheels in a central location prior to deployment. As long as your build server (or container) matches the OS and architecture of the application servers, you can build the wheels once and distribute them as a tarball (see Armin Ronacher's platter project) or using your own PyPI server. In this scenario, you are guaranteed the packages are an exact match across all your servers. You can also avoid installing build tools and development headers on all your servers because the wheels are pre-compiled.

I've looked at platter a bit, but I haven't really digested what will be needed to migrate to that point, and he doesn't really expand on it.
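As far as I understand it, the mechanics are roughly this (directory names are hypothetical):

  # on a build host matching the app servers' OS and architecture
  pip wheel --wheel-dir=./wheelhouse -r requirements.txt
  tar czf wheelhouse.tar.gz wheelhouse
  # on each app server, after unpacking the tarball
  pip install --no-index --find-links=./wheelhouse -r requirements.txt

The app servers never compile anything; they just install the pre-built wheels.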



