Using Alpine can make Python Docker builds 50× slower (pythonspeed.com)
61 points by bomewish on Dec 28, 2023 | 43 comments



There is perhaps no greater example of a cargo-culted non-optimization than the choice of Alpine as a base image. This has cumulatively wasted millions of hours of developer time. And why? To reduce the base image size.

Image layers are cached and shared! If you have one image based on Ubuntu in your stack, you may as well base them all on Ubuntu, because you only need to download (and store!) the common base image once. And if you're such a purist that you've avoided anything but Alpine base images, what have you actually gained? Your deploy time is slightly faster? Your first cold boot is slightly faster? Outside of a few serverless use-cases, this is basically meaningless and almost never a real bottleneck (and even for serverless, there is probably some level of image caching across boots).
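
To make it concrete, you can watch the sharing happen locally (image names here are placeholders):

  $ docker pull ubuntu:22.04      # base layers download once
  $ docker build -t service-a .   # FROM ubuntu:22.04, layers reused from cache
  $ docker build -t service-b .   # same base, nothing re-downloaded
  $ docker system df              # shared layers are stored once on disk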

Meanwhile what did you give up? You opted out of the most well-tested and maintained versions of binaries, and forced all software in your image to link with musl instead of glibc. Congratulations on creating an ongoing headache for yourself. You've increased maintenance costs and probably even decreased security by preferring a less-audited codebase.

I suppose one positive side-effect of the Alpine cargo-cult is that some of its more technically-inclined adherents at least contributed back to projects to improve their cross-compilation toolchains. But otherwise, it's just been a huge waste of time for pretty much everybody.


> If you have one image based on Ubuntu in your stack, you may as well base them all on Ubuntu, because you only need to download (and store!) the common base image once

This is only true if your infrastructure is static. If your infrastructure is highly elastic, image size has an impact on your time to scale up. A great example is using spot instances; in my line of work we need to create thousands of spot instances to run many millions of time-sensitive batch jobs at certain times and dates, then scale those instances down rapidly to reduce the bill.

Of course, there are better choices than Alpine to optimize image size. Distroless (https://github.com/GoogleContainerTools/distroless) is a good example.
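
A minimal multi-stage sketch of the distroless pattern for Python; the distroless tag and its python3 entrypoint behavior are assumptions to verify against the distroless docs:

  # build stage has pip and the full toolchain
  FROM python:3.12-slim AS build
  WORKDIR /app
  COPY requirements.txt .
  RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
  COPY main.py .

  # final stage ships only the interpreter and your code
  FROM gcr.io/distroless/python3-debian12
  COPY --from=build /app /app
  ENV PYTHONPATH=/app/deps
  WORKDIR /app
  # assumes the image's entrypoint is python3
  CMD ["main.py"]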


That's true. But the size difference between alpine and debian:*-slim (for example) is on the order of < 20mb. Datacenter hardware can download this in milliseconds. There may be some rare scenarios where this is a meaningful delay, but usually provisioning time is bottlenecked somewhere else. Heck, it probably takes longer to download the VM image (and if you're optimizing for provisioning speed, you shouldn't be downloading Docker images on boot anyway - you should be baking VM images ahead of time.)


Baking the VM images trades a lot of flexibility for some latency, and it doesn't apply to container-as-a-service infrastructure where you don't control the VM. My point is you can get most of the way there with slim/distroless without making the tradeoffs of Alpine, and that's probably enough for most use cases.


> But the size difference between alpine and debian:*-slim (for example) is on the order of < 20mb.

  compressed size (amd64):   17.59MB vs.  45.35MB  ( > 2.5x )
  uncompressed size (amd64): 52.00MB vs. 130.00MB  ( > 2.5x )

  $ docker images python:3.12.1*
  REPOSITORY   TAG                    IMAGE ID       CREATED       SIZE
  python       3.12.1-alpine3.19      0108959914c2   2 weeks ago   51.7MB
  python       3.12.1-slim-bookworm   7c5c05471c3a   2 weeks ago   130MB


Those are Python images.

The base Debian slim image [0] is 28mb, compared to 3mb from Alpine [1]. So that's a difference of 25mb, or "on the order of 20mb." A machine in the datacenter will probably spend more time opening the connection to the registry than it will downloading that extra 25mb.

[0] https://hub.docker.com/_/debian/tags

[1] https://hub.docker.com/_/alpine/tags


And check the "VULNERABILITIES" counts:

   bookworm-slim : 0H 0M 19L VULNERABILITIES
   alpine 3.19.0 : No Vulnerabilities Found

   python:3.12.1-slim-bookworm : 0H 1M 22L  VULNERABILITIES 
   python:3.12.1-alpine3.19    : 0H 1M  0L  VULNERABILITIES


There is no difference in download time for 20mb vs 50mb in a datacenter, but the amount of time lost to alpine being "quirky" is incalculable.


The compressed size is what matters when discussing image download time; CPUs made after around 2010 can decompress files ludicrously quickly.


Distroless is Debian packages with worse security and without package management.


> without package management.

distroless is for the TERMINAL layer; you should already be done with the package manager before entering distroless.


> Image layers are cached and shared! If you have one image based on Ubuntu in your stack, you may as well base them all on Ubuntu, because you only need to download (and store!) the common base image once.

AIUI that only helps if everything uses exactly the same base image. Every time the base image gets updated, your images will diverge as they're rebuilt.

> You opted out of the most well-tested and maintained versions of binaries,

Strictly true, though Alpine is pretty widely used and tested these days.

> and forced all software in your image to link with musl instead of glibc.

Well... Yeah, that's how that works.

> Congratulations on creating an ongoing headache for yourself. You've increased maintenance costs

I keep hearing this, but in years of shipping Alpine based images into prod the problems never materialized.

> and probably even decreased security by preferring a less-audited codebase.

Or, since we're handwaving, you've increased security because musl is better hardened and an attacker's binaries are less likely to work on it.


> I keep hearing this, but in years of shipping Alpine based images into prod the problems never materialized.

DNS was the big one; for many years musl's implementation of DNS was both divergent from glibc, and in some cases, broken due to lack of TCP support. This only got fixed earlier this year: https://gitlab.alpinelinux.org/alpine/tsc/-/issues/43

TCP support can be a really big deal for reliability at very large scales. When I led the container infrastructure team at $BIGTECHCOMPANY we eventually published a recommendation against using Alpine because musl's resolver was demonstrably less reliable than glibc's; we had the production data to show it.

The other case was when developers copied binaries compiled for glibc into Alpine containers, which is user error to be fair, but it was easier to tell everyone to use glibc instead of teaching thousands of Python devs about how the compiler and linker work ;_;
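
For anyone who hasn't hit it, the failure mode looks roughly like this (./app is a placeholder glibc-linked binary):

  # on a glibc host, ldd shows the loader the binary expects,
  # e.g. /lib64/ld-linux-x86-64.so.2 -- which Alpine doesn't ship
  $ ldd ./app
  # running it on Alpine then typically fails with a misleading
  # "not found" error, because that loader path doesn't exist
  $ docker run --rm -v "$PWD:/w" -w /w alpine:3.19 ./app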


> I keep hearing this, but in years of shipping Alpine based images into prod the problems never materialized.

Postgres has very subtly different sort ordering when we use the Alpine-based images.

Because the dev trying the upgrade locally happened to pick Alpine, what should have been a very simple couple-person-day upgrade got backlogged, drawn out, and stumbled over, with multiple people wading through a long series of changelogs. We eventually figured out what was going on, but it was a miserable, long, soggy, sad trudge to get ourselves back on track. All because someone used a not-so-standard library.

Musl also causes the more regular pain of Node add-ons needing to be specially compiled, but at least we typically see those coming. Time lost, but predictable time lost.

One other gripe: musl is not designed for server workloads at all; it's designed for space efficiency and small systems. Sometimes memory is a dominant cost or a crucial factor for online services, but for most workloads you are probably misoptimizing by applying what was intended as a niche standard library to your system.


I think the Postgres weirdness is due to collations, which tbf are a nightmare to deal with in DBs on their own.
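
You can reproduce the difference without Postgres, assuming the en_US.UTF-8 locale is installed on the glibc host (musl ignores LC_COLLATE and sorts bytewise):

  $ printf 'a\nB\n' | LC_ALL=en_US.UTF-8 sort   # glibc collation: "a" before "B"
  $ printf 'a\nB\n' | LC_ALL=C sort             # bytewise, like musl: "B" before "a"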

But in any case, it would be tremendously irritating to troubleshoot.


> AIUI that only helps if everything uses exactly the same base image. Every time the base image gets updated, your images will diverge as they're rebuilt.

Yes this is true. In our stack we pin all base images to a common hash, which makes builds fast and also increases reliability of CI pipelines since we don't get unexpected changes to builds. The tradeoff is you need to be deliberate about updating the pinned image with new security patches, but this feature is available out-of-the-box with update bots like Renovate.
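
For reference, pinning looks like this in a Dockerfile (the digest below is a placeholder, not a real one):

  # immutable: the tag can move, the digest cannot
  FROM debian:bookworm-slim@sha256:<pinned-digest>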

> the problems never materialized

What about every hour you spent recompiling binaries to make them work with musl? That's what I meant by maintenance cost - every time you add a new binary into your minimalist image regime, you need to either pay the cost of making it work with musl, or abandon the effort, effectively nullifying the benefits of your other images using Alpine (since once you have one image based on a larger base image, you may as well base them all on it). This is a kind of Stockholm Syndrome where the cargo-culted optimization becomes forever embedded in your software requirements even when the options may have changed (for example, once debian-slim was introduced, there was little reason to continue using Alpine).


> What about every hour you spent recompiling binaries to make them work with musl?

We were a mostly non-Python shop, and what Python images we did use, IIRC, either 1. didn't use libraries with native code so not a problem, 2. just `apk add`ed the Alpine-provided libraries, or 3. just used the Debian-based image for that particular case. I will grant that if you're working in a Python-heavy environment the calculus is likely different.

> That's what I meant by maintenance cost - every time you add a new binary into your minimalist image regime, you need to either pay the cost of making it work with musl, or abandon the effort, effectively nullifying the benefits of your other images using Alpine (since once you have one image based on a larger base image, you may as well base them all on it).

IME it just wasn't that much cost, and it dropped off even more once musl wheels started becoming normal. And we had enough churn in images that basing everything on, e.g., a single static version of Debian was 1. never going to happen, and 2. wouldn't have reduced our image sizes since the base image wasn't constant anyway.

> This is a kind of Stockholm Syndrome where the cargo-culted optimization becomes forever embedded in your software requirements even when the options may have changed

Or, it's a perfectly reasonable rule of thumb and if actual evidence shows that something else is better you can pick it up then. I mean, the same thing goes in reverse, right? You've cargo culted "Alpine slow" and have embedded that in your stack even though the reasons for it no longer exist.

> (for example, once debian-slim was introduced, there was little reason to continue using Alpine).

Alpine is still smaller. Debian shipping slim images reduced the gap, but it's still there, and installing packages will exacerbate it (since Alpine packages also seem to be smaller than the Debian equivalents).


You've forgotten one additional nuance: security. With a smaller image it's easier to keep track of outdated/vulnerable components. A slim image also makes things quite a bit more annoying for attackers, if you don't give them all the tools they need straight away.


There's usually enough gadgets in slim images to infiltrate/exfiltrate, especially if you include package managers in your final image. Using a multi-stage build with a final image from scratch is a good practice where possible, especially now that you can attach debugging sidecar containers to your application as needed.
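
A sketch of that pattern for a statically linked binary (stage names and paths are illustrative):

  # build stage carries the toolchain and package manager
  FROM golang:1.21 AS build
  WORKDIR /src
  COPY . .
  RUN CGO_ENABLED=0 go build -o /out/app .

  # final stage is just the binary: no shell, no package manager
  FROM scratch
  COPY --from=build /out/app /app
  ENTRYPOINT ["/app"]

Note that scratch images have no CA certificates or tzdata, so anything doing TLS usually needs those copied in as well.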


Registries are painfully easy to host, too.

One person can manage that on behalf of everyone expected to maintain this alpine ritual. Scaling can be solved in a way that keeps familiarity.


100% agreed. For every bad example posted where the replies are full of, "yeah, but that was patched X months/years ago," I guarantee there are more edge cases waiting to be found.

Alpine is not worth the engineering headache, full stop.

"But it's a _correct_ implementation, glibc is buggy!"

Yeah, and it's been the accepted standard since forever. Everything expects it, and is used to it.


You're getting downvoted for a pragmatic observation. The documented "correct" approach is not necessarily the most likely to work reliably.

https://en.m.wikipedia.org/wiki/Protocol_ossification


Having run into some of the musl-related issues mentioned in the article, I now opt for Debian-based images even for non-python use cases. Having to spend time troubleshooting Alpine specific issues just doesn’t seem worth it.


In my experience most issues are not really musl issues but incorrect usage of libc, where glibc used some "magic" to make that incorrect usage somehow somewhat work, at the cost of higher complexity that more than once led to security issues due to a larger attack surface.

So in general I would recommend testing with musl and fixing whatever issues you run into in a non-musl-specific way.

Though I guess at work many people don't have the time/resources for stuff like that.


> So in general I would recommend testing with musl and fixing whatever issues you run into in a non-musl-specific way.

But again, do all this just to save anywhere between 50-200mb of space compared to Debian slim images. I'd rather just use the latter and get on with my day.


It's not about saving 50-200mb; it's about using libc correctly to avoid unexpected situations with glibc down the line, e.g. when a change needs to be made for security reasons, or when running on a more hardened system might lead to slowdowns similar to the ones with musl.


I use scratch or distroless with a static binary. No nonsense needed! I love go.


Related:

Alpine makes Python Docker builds slower, and images larger - https://news.ycombinator.com/item?id=22182226 - Jan 2020 (149 comments)


I thought python now provided wheels for alpine? Wouldn’t that make this article outdated?

Also, real question, how often are you rebuilding docker images (not containers) that a few seconds matter?


As mentioned in an addendum at the bottom of the page, wheels can now target Alpine. You can try it yourself using the example package from OP:

  docker run --rm python:3.11-alpine pip install pandas
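
Wheels targeting musl use the musllinux platform tag from PEP 656. A quick way to see what pip actually resolves (the filename pattern in the comment is what you'd expect, not verified output):

  docker run --rm python:3.11-alpine \
    pip download pandas --only-binary=:all: -d /tmp/wheels
  # expect filenames like pandas-*-musllinux_1_1_x86_64.whl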


I mean, the easy solution is don’t use Python if the container image size is important to you.

Build it in Go and you don’t need to drag in a million weird dependencies. The whole idea you need a 300MB container image to run a simple Python program is a perfect demonstration of everything that’s wrong with software development.


It's not that simple. Likely the root of this is that alpine uses musl which by default ships an allocator that is extremely slow in multithreaded code. I've seen Go programs be much much slower under alpine for this reason as well.

Given how often this comes up, I think it's high time that alpine replaced the default allocator with something more modern.
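
If you're stuck on Alpine, one hedged workaround is preloading a different allocator. Whether the package name and library path below exist depends on your Alpine release, so treat this purely as a sketch:

  FROM python:3.12-alpine
  # package name and .so path are assumptions -- verify for your release
  RUN apk add --no-cache jemalloc
  ENV LD_PRELOAD=/usr/lib/libjemalloc.so.2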


Tbf, if Python devs would just use the stdlib, as you can for many, many tasks, this wouldn't be nearly as much of a problem.

I’ve seen people install numpy so they could compute Euclidean distance. math.dist() is _right there_, people.

Same with requests, if you're doing trivial things. urllib.request.urlopen() exists.
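
Both have been in the stdlib for ages (math.dist since 3.8). A quick sketch:

  import math
  import urllib.request

  # Euclidean distance without numpy (Python 3.8+)
  print(math.dist((1.0, 2.0), (4.0, 6.0)))  # 5.0

  # a trivial GET without requests
  with urllib.request.urlopen("https://example.com") as resp:
      body = resp.read()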


> I mean, the easy solution is don’t use Python

:)

People will be much better off and happier if they switch to Go for 99% of tasks. I’ll understand if it’s python for machine learning.


Go is such an ugly language. It has its place, but it's not the be-all and end-all.


How? It’s SO easy to read and you don’t need semicolons. How can it possibly be ugly?


  if err != nil


as opposed to...


Yes, and? Do you not check return values? Python allows you to just yolo-call a function, lmao. It's awful. You can NEVER guarantee behavior with Python. Never. Go error checking is guaranteed to be either an error type or, in the absolute worst case, a nil pointer panic.


The question was "how can Go be ugly," which is subjective. I find Go's error handling verbose and ugly.

Re: guaranteed behavior, correct me if I'm wrong, but for anything in the stdlib you'll either get the expected return value or an exception. For example, checking a list with str.isdigit() isn't undefined if the list contains a None; it raises AttributeError.

But yes, you can just yolo things and hope for the best, which is why it's a terrific scripting language. If you add typing and properly catch expected errors, it's quite capable. Whether or not try/except blocks are uglier than if err != nil is subjective.


Seems a little disingenuous to say that Alpine is slow because it can't use wheels, and only in a little note at the bottom mention that that's not true anymore.


The blog post is three years old. The update is more recent.


If you're going to update a blog post to mention that it's inaccurate, it would seem preferable to put that at the top (so people don't skim it and walk away misinformed) or edit the document in place (which in this case would result in a very short page, since the meat of the post is the part that's wrong now).



