Just say no to :latest (platformers.dev)
229 points by kiyanwang on March 6, 2022 | 129 comments




I somewhat agree. But it's not that simple in practice. For example, I pin every single version and the container fails a security scan because it's not using latest versions of everything. Or I pin to arbitrary versions that happened to be latest when pip/npm/terraform first created the precious word-of-god lock file, leaving future engineers afraid to upgrade those packages because "Chesterton's Fence".

Pinning versions creates work in the future. Using latest versions can result in ad-hoc breakages and security risk. So instead of saying "v1.22=good, latest=bad" and talking generally about the fact that lock files exist in most decent systems, I'd like an article that contains example strategies for paying back the tech debt of a thousand arbitrarily pinned versions across your code projects.

I like the idea of deleting the lock file on a regular cadence, accepting whatever new shit comes in, perhaps bumping a major version number, and then testing the crap out of everything and leaving this new artifact in integration/staging for a few extra days. If things go wrong between builds, then you have the lock files in git (and in your artifacts) to let you know what changed. If it turns out that you now have a real reason to pin a specific package (ie: it broke something), then you have to find some way to note that outside of the usual lock files, since they arbitrarily lock everything.
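To make that concrete, for an npm project the cadence can be as small as this (a rough sketch, assuming npm and an existing test suite; other ecosystems have equivalents):

  # regenerate the lock file from the ranges in package.json
  rm package-lock.json
  npm install
  # run the full suite against whatever came in
  npm test
  # review what actually changed before committing the new lock file
  git diff package-lock.json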


You need to have a CD system which takes equality-pinned deps and bumps them when new versions come through, and which automatically runs the new versions through whatever your test/integration/production environments look like. And you should have sufficient integration testing to be confident that an automated process can decide whether the new version should get deployed to prod (being realistic, you probably want to confine these deployments to some sort of reasonable business hours).

Then the tech debt gets addressed automatically without any work. The work is all up front, writing the tests that give you confidence about deployment to production, but that is work that needs to happen anyway and pays off. You shouldn't need to have human testing in the loop.


Do people have automated tests to validate their SMSs are sent correctly (and received by a real phone)? Or that credit card payments are made correctly? These are things I use third party libraries for, and what I worry might break when I upgrade dependency versions.


Most payments vendors will have test CC numbers for integration testing. One of Paypal's test numbers is 5555555555554444, for example. https://developer.paypal.com/api/nvp-soap/payflow/integratio...


I guess the question becomes how far out you want to test. If your credit card processor is a third party, they could technically break something regardless of which version of their library you use. In that case, do you really care that a transaction goes through, or do you just care that you used your third party's interface correctly (be that REST or binary API calls)?

If your requirements aren't that sharp, like maybe you can take a day or two to process the credit card payments, you could get away with monitoring your application and configuring alarms that fire if no transactions go through. That way you'd catch errors originating from your third party as well.


Depends, how big of a deal is it when they break?


I haven't used it myself but I'm pretty sure Whitesource Renovate is a free tool that tries to do just that.

https://www.whitesourcesoftware.com/free-developer-tools/ren...


Weird, I thought it was a tool for encouraging engineers to learn how to use Github PR search filters.


I am a fan of using renovate, and with docker images in particular since I can define my remote as

  FROM mcr.microsoft.com/dotnet/sdk:6.0@sha256:15c22c170650b8db2f6250547a2dc5341978b0647c6b21ef67768e628de614f3 AS build
and have renovate automatically merge digest updates (the sha256 hash), while having manual (or automatic) PRs for the tag target.

So when upstream updates their tag I get a PR (which is automerged) that looks like the diff below. This lets me know when upstream has changed while still targeting a broader version range, 6.0 in this case:

  - FROM mcr.microsoft.com/dotnet/sdk:6.0@sha256:15c22c170650b8db2f6250547a2dc5341978b0647c6b21ef67768e628de614f3 AS build
  + FROM mcr.microsoft.com/dotnet/sdk:6.0@sha256:70b890cd12f73f8ad80061d242081b61da666bda7ec2d729113855a8b9410e1e AS build
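For reference, the renovate rules driving this are roughly the following (a sketch of the relevant bits, not my exact config):

  {
    "extends": ["config:base"],
    "packageRules": [
      {
        "matchDatasources": ["docker"],
        "matchUpdateTypes": ["digest"],
        "automerge": true
      }
    ]
  }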


Agreed. There are logical, version-flexible dependencies. And then there are rigid, version-specific dependencies.

Robust systems need to model both so they can support reproducibility and automation needs.


Needlessly upgrading all of your packages leaves you open to supply chain attacks. If one of those NPM packages got quietly compromised (and my lockfile likely has 4-to-5 digits worth of dependencies in it), you've invested time in what amounts to an unneeded change just to expose yourself.


Any bit of software can have a very nasty vulnerability (looking at you, log4j). Leaving your software pinned to old versions exposes you to existing undiscovered CVEs.

This is a pick your poison circumstance.


In general, for undiscovered CVEs new versions will be as exposed as old versions - if the bug was not discovered, it was not fixed either; and if the latest version fixes an exploitable security flaw then there will generally be a CVE issued along with that release. Using the same log4j example, all the (many!) projects which were still using the old 1.x log4j branch were not exposed, but those who were using the latest released version were exposed to the undiscovered CVE from the moment it was released.

IMHO a reasonable compromise is to watch for discovered CVEs and update your dependencies when appropriate, but keep them pinned if no known CVEs apply.
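The watching part is cheap to automate these days; a rough sketch of what that could look like in CI, assuming a scanner like Trivy and the ecosystems' own audit commands:

  # scan the pinned image for published CVEs
  trivy image nginx:1.21.6
  # check pinned language dependencies against the advisory databases
  npm audit
  pip-audit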


Ok, so let's say it's day one: how do you choose all of your initial pinned versions? How do you know they're safe? What if you luck-out in the supply chain attack lottery and pin malware?


Older packages are more likely to be vetted as non-problematic than newer packages. As time goes on, the vulnerabilities are disclosed. If you're upgrading to a package that's 1 day old, the community hasn't had the time to go through that process and disseminate the vulnerability. Your point is valid, but I think it's an increased risk since time is against you when you accept fresher, newer, unproven versions of your dependencies.


I think you're right that cutting edge carries the risk of introducing new vulnerabilities that haven't been observed yet. And the closer you are to release, the less time the security community has had to look into it.

Conversely, there's also a factor that says that after some time, the security community will stop looking at a specific version. Even if that version still has vulnerabilities.

Those two factors lead me to believe there's some sort of sweet spot. You probably don't want the cutting edge newest version, but you still want a version new enough that it's in widespread use.


The only real defenses against supply chain attacks are audits and self hosted dependencies/packages.


> self hosted dependencies/packages.

Why? If your lockfile specifies and enforces the hash of the package, you don't need to self-host it - the lockfile will take care of detecting supply chain compromises that swap one package for another.
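For illustration, this is roughly what that enforcement looks like in an npm lock file entry (integrity hash truncated here):

  "node_modules/left-pad": {
    "version": "1.3.0",
    "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz",
    "integrity": "sha512-..."
  }

If the tarball behind that URL is ever swapped out, the install fails the integrity check.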

100% agree on audits being necessary.


Self hosting protects against attacks on the availability of your supply chain.


You're still downloading random stuff from the internet. A vulnerability in your package manager could let an attacker bypass the hash check, or they could discover a hash collision.

If you really want to be sure, having the exact bits on a system you control seems like the right move.


Pinning can be far less painful with tooling to review and manage all the changes. One example would be the Maven versions plugin: you can use it to query "what's been updated", update to next release, etc. The list of goals is pretty explanatory, and a good starting point: https://www.mojohaus.org/versions-maven-plugin/
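Day to day it mostly comes down to a couple of goals, e.g.:

  # report which dependencies and plugins have newer versions available
  mvn versions:display-dependency-updates
  mvn versions:display-plugin-updates
  # rewrite the POM to the newest releases (review the diff before committing)
  mvn versions:use-latest-releases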

When evaluating new toolchains, I often ask myself "what is the versions plugin equivalent here". There's often commercial offerings that help this process out.

Dependency management is one of these topics that rarely gets attention until much later in the ecosystem's lifetime. So chances are you'll need to find or create tooling to fill gaps.

I mean, you can do the whole "damn the torpedos and see what breaks" approach, but, I'd prefer a way to review changes. And I'd prefer doing those updates _separate_ from other changes.


Pinning versions creates future work but so does using the latest version.

Vulnerability discovered in old version vs. vulnerability maliciously inserted in new version; software engineers afraid of upgrading because of breaks vs these breaks actually occurring because of an unexpected upgrade.

I personally prefer pinned versions because whenever I try to install some old github example or other code, I usually end up getting stuck with errors unless everything (compiler, dependencies, etc.) is pinned. Gradle projects almost always work no matter how old (Gradle pins the dependencies, JVM, and even the version of Gradle), whereas npm and CMake projects (which are built with new compilers and usually new dependencies) often fail.


Dependabot is great for this. We have everything pinned, but dependabot takes care of a bunch of the manual steps to upgrade a dependency: it updates package.json and package-lock.json, pulls release notes for all the changes since your pinned version, makes a PR, and triggers your CI build.
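The setup is just a small config file in the repo; roughly something like this (a sketch, not our exact config):

  # .github/dependabot.yml
  version: 2
  updates:
    - package-ecosystem: "npm"
      directory: "/"
      schedule:
        interval: "weekly"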

We’ve found many broken builds thanks to this, and could then create an issue in our tracker to tackle that tech debt later.

If the build doesn’t break, we review the new version’s release notes, and sometimes even look at the code changes of the open source package. Overall it’s helped us keep more up to date, has made upgrading versions of Node less painful, and even helped us adopt some improvements sooner than we otherwise would have.


> For example, I pin every single version and the container fails a security scan because it's not using latest versions of everything.

This security scan would be absurd. The amount of churn from updating everything the day it is released would be insane, especially w.r.t. breaking changes.


This is always a balance. The moment you pin to a specific version you need to have a process in place to ensure you regularly upgrade, to avoid introducing vulnerabilities into your production system. Throughout my career I have seen many cases where certain software still runs on ancient versions as the team originally maintaining it is no longer around (e.g. reorganisations or lay-offs). It is always hard to convince senior management to invest any resources in upgrading ("if it works, don't fix it").

If I have a project with good unit test coverage, I prefer to use :latest, as this results in a gradual update over time. If something breaks due to a version discrepancy, it is a lot easier to convince management to fix this, as the breakage would only be noticed as part of a feature request, and often would only require a small amount of work.


I've also seen the same. As soon as people start locking versions, that code is no longer updated and nobody will change it, because it's extra work to do so.

I personally think running latest is the best thing to do. And if something fails, you downgrade it temporarily until the latest works again. It's pretty much the opposite of what is recommended, and it's just the best solution in my opinion.


My company locks versions, but dependabot is configured on all of our repos. It automatically creates PRs to bump versions, and if CI passes for minor/patch bumps they get automatically merged. This takes a lot of the hassle out of the problem. For major bumps, a manual approval is required, but they happen infrequently enough that it's not a lot of work.


A similar tool to dependabot written by Salesforce: https://github.com/salesforce/dockerfile-image-update


In that case, it's better to just pin the major version of the container:

    FROM python:3
will work as long as Python3 doesn't make any backwards-incompatible changes (note: Python3 occasionally deprecates then removes a feature that was part of the official API[0], so Python3 is not technically semver-correct).

If, however, one wants automatic updates WITH reproducible builds, then a CI/CD pipeline that automatically updates the FROM line in a Dockerfile on every upstream release is the only solution.

[0] https://docs.python.org/3.10/whatsnew/3.10.html#removed
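A minimal sketch of that pipeline step, using nothing but the docker CLI and sed (tools like Renovate or Dependabot do the same thing more robustly):

  # resolve whatever python:3 currently points at to an immutable digest
  docker pull python:3
  DIGEST=$(docker inspect --format '{{index .RepoDigests 0}}' python:3)
  # rewrite the FROM line to that digest and commit/PR the change
  sed -i "s|^FROM python:3.*|FROM ${DIGEST}|" Dockerfile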


Python never claimed to be semver-compliant. Also, if you’re using 3rd-party packages that require compiling stuff (eg. numpy), pre-built binary packages (aka wheels) might not be available on PyPI in the first few days (or weeks, or sometimes even months) after Python 3.x.0 is released, so you don’t want the version to be randomly changed under you. Moreover, if I pulled python:3 a year ago, I get 3.9.

FROM python:3.10 should generally be fine though.


This is very bad advice. Lots of nontrivial things aren't automatically compatible with the latest Python minor version. PyTorch for instance doesn't support Python 3.10 and it's been five months since the stable release (ten months since the first beta). For anything nontrivial, you almost always want to specify a minor Python version.


Why is this bad advice?

The PyTorch scenario is an exception, not the rule, and the users of PyTorch should know not to use unsupported versions.

> For anything nontrivial, you almost always want to specify a minor Python version.

Depends on how you define "trivial".


It’s non-optimal advice because PyTorch is not the only library that runs into problems like this, because Python does not follow semver. You should pin to an X.Y release, and let .Z releases update automatically (which is where your security updates fall)


this works great until a library decides it only likes python 3.8 or 3.9 and won't work with 3.10.


It's hard to convince management that adopting devops means they actually have to follow devops principles.

Too many times I've seen companies throwing this buzzword around without understanding how much work is actually required to have real devops.


:s/devops/agile/g


Doesn’t the same logic apply to using :latest? If you are always pulling the latest, unvetted code, shouldn’t you also have a vulnerability management process in place?

You need vulnerability & risk management for any dependency, full stop.


Ideally you only use Docker official images, or their equivalent, to avoid using unvetted code.

It is always a trade-off; however, it is far more likely that a hacker will use a ten-year-old, well-exploited CVE than a recent one.


> Ideally you only use Docker official images, or their equivalent, to avoid using unvetted code.

Docker images don’t ship every dependency in the average development project. They’re not a security guarantee either.

> It is always a trade-off; however, it is far more likely that a hacker will use a ten-year-old, well-exploited CVE than a recent one

In general that’s the case but in practice that’s still the wrong mindset. You pin to a minor version and still get patch updates. Or run CVE checks to prove your dependencies have no reported vulnerabilities. And that will also mitigate the risk of accidentally updating into new vulnerabilities (though you’d need to pin against the patch version too if you want to be certain there).


I have seen many cases where something is pinned to a minor version, and then gets forgotten about for 2 or 3 years. In my current company there is a production codebase that hasn't been touched for over two years. The original developers have long since moved on and no one really owns the code. That is the most common scenario leading to security breaches.

This is about succession planning, with the realisation there might not be any succession if business priorities change.


I’ve seen many cases where version pinning has caused issues at the worst possible time (like missed delivery dates because of a breaking change in dependencies being pushed by maintainers at an inconvenient time for us).

Ultimately there’s no perfect answer here but given a choice I’d rather opt for a known risk than an unknown one.


In that case, better the devil you know? But either way if you don’t have a vulnerability management plan you are at risk.


I wouldn’t describe Docker official images as more or less secure than anything else. Supply chain attacks can easily slip a vulnerability in.


It's better to only change one thing at a time, not to get your dependencies updated as a surprise when you're trying to update something else.

Besides, if the dependencies aren't pinned in version control, how can you answer a query like "On this date last year, what version were we running?"
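(With the lock file committed, that query is a one-liner; a sketch assuming npm and a main branch:)

  # what was in the lock file as of this date last year?
  git show "$(git rev-list -1 --before='2021-03-06' main)":package-lock.json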


I am using AWS ECR as a Docker repo, with releases using the image hash (as well as a server). If you want to roll back to a certain point in time, you just identify the appropriate Docker image tag and restore the image you used at that time in your development environment.

To give an example, we got stuck on Python Pandas 0.24.2 for two years when the Pandas 1.0 upgrade happened. At the time we pinned the version as one job was failing. With COVID, priorities naturally changed and this was forgotten. When normality returned we suddenly had to upgrade 10 jobs to Pandas 1.2.x, a very painful migration that took one engineer 4 weeks of testing. In hindsight it would have been far better to upgrade that one job at the time, as it would have only taken a few hours.

Again it is a trade off. Having unexpected dependency changes can be cumbersome when you are releasing a small change. However, getting senior management to agree to weeks of upgrade testing can be challenging as well. My natural preference is not to pin versions, unless I absolutely have to.


Right. Let's assume you're going to need to upgrade sooner or later - the longer you leave it, the harder it gets.

Better to upgrade and discover dependency issues during dev/staging than to try to fix many months' worth of them when there's an active CVE on your live pinned version.


Covert maintenance is the name of the game. If the site goes down because of a security flaw you'll be blamed anyway. Might as well get fired for doing the right thing.

I'm only half joking... The right thing to do with dummy management is to get everything in writing to make it clear that if anything happens it'll be their fault.


I've used Maven version ranges to great effect... but only with projects that actually observe semver.

(Jackson, take a bow for doing so with your patch releases. I wouldn't use it for your 'minor' releases though.)


Resolved Maven versions are not present in VCS if you use ranges. Wouldn't this cause issues like not knowing what library version is being used in production, and builds not being reproducible?


You can solve that issue not at the source/VCS level but with artifact management, properly logging, storing and documenting the things that were built from that source. In some places that's tightly integrated with VCS in a single pipeline, in others the source and built artifacts are managed quite separately, especially if you're packaging many different builds for different customers.


That's true, but you can export resolved dependency versions at build time if necessary. I've rarely used version ranges because, well, semver rarely occurs, but for Jackson, allowing it for patch releases was a decision to ensure security updates were automatically picked up.


Not to mention that Maven doesn't have any mechanism for locking in transitive versions, because typically everyone uses hardcoded patch versions and never updates them.


How do you mean? You can override versions of transitive dependencies.


In most projects JVM devs don't pin the versions of transitive dependencies, even though in many other languages it's recognized as a basic functionality of the build tool.


Yeah, that's a fair point about JVM culture. I suggest that there's a few cultural reasons this occurs, based on my experience:

1) JVM libs generally tend to follow semver to the Y level of X.Y.Z reasonably well. (Spring can FOADIAF in this regard though)

2) Maven and Gradle etc. have very good dependency resolution algorithms

3) When the dependency algorithm struggles, you can provide guidance (e.g., Maven's dependencyManagement element, or the ability to exclude a dependency's dependency so that your pinned version of that dependency is what's used)

On that last point, you can use the Enforcer plugin to be very very draconian about transitive dependencies. I worked on a project in the past that would fail the build if you introduced a dependency that had dependencies, and didn't precisely specify the version of each new dependency at the project level.

I've done a fair bit of work also with Python, Go, and JS, and doing so made me realise that the dependency management situation in JVM land is the best I've ever encountered (so far). Which surprised me, I had always thought it somewhat over-engineered, but now I think it's sufficiently engineered and those other languages need to steal some ideas. (Especially Python, I love Poetry, especially compared to pipenv, but I would kill for the ability to override a dependency's dependency like I can in Maven and Gradle).


I'd say that there are two things that you should have in place regardless of which approach you take:

  - having tests (or even manually test the system with scenarios, if for whatever reason you cannot automate) in place to catch things breaking, before shipping any changes
  - having security scanning in place, be it for containers, your dependencies, or anything else; ideally for everything
Then, things should get a bit more clear:

  - you should be able to spot the publicly known vulnerabilities and adequately evaluate their impact to decide what must be done
  - when you update versions of your dependencies, you should then also be able to see whether anything breaks before shipping
Admittedly, all of that will only be viable if you have the buy-in from the people involved, since otherwise you'll get "blamed" for things breaking as a result of your initiative for avoiding shipping vulnerable software, or you'll find it difficult to justify to people why you're spending so much time updating dependencies and doing refactoring.

Not only that, but some systems out there are not really easily testable, e.g. those that have a large amount of functionality within the DB, so your tests may well end up mocking either too much of the system or the wrong parts of the system (I'm yet to see anyone mocking the low level queries that go to the DB and back, e.g. setting and validating the individual parameters within query abstractions, as opposed to just the data that's returned) - in many cases it simply won't be viable to test everything.

Another thing is that in practice semver tends to lie to us and even minor updates or patches can sometimes have breaking changes within them, something that I wrote about in my blog article "Never update anything": https://blog.kronis.dev/articles/never-update-anything (albeit there I also touch upon the fact that software should largely be more stable and have fewer breaking updates in the first place, or even fewer new features in "stable" branches)

I guess my argument is that there's a lot of complexity to be tackled here and that people should invest more time and effort into handling updates, refactoring and even testing than they do now: I've seen teams where people all agree that tests are important, yet nobody wants to write any because if they tried, they'd have to mock large parts of the system OR try to do integration tests and handle the fact that nobody has invested the time and effort into bringing up reproducible environments and services, e.g. a new automatically migrated DB instance for the tests, OR the fact that they'll need to do tests in a shared DB instance and clean up afterwards. Environments like that are just a downwards spiral that's bound to produce brittle software.

So what's my practical advice?

Use pinned versions, preferably starting with the latest and most boring one that you can get away with (e.g. LTS). Have a process in place for figuring out when you need to update, set aside time for doing just that (even if manually), assign someone to do this and someone else to make sure it's been done. Have testing and serious validations be a part of this process, to make sure that there's no breakage that'd slip past - full regression testing, all your unit tests, integration tests, feature tests etc. How often should you do it? Depends on the importance of the system - I've seen it done quarterly, I've seen it done monthly, and some with plenty of resources out there could probably do it weekly or more often. Additionally, be able to do this ASAP when critical vulnerabilities become apparent, e.g. log4shell.

Of course, it's the ultimate example of useful but exceedingly boring work that people don't really want to do, so I've no doubts about running into unmaintained software in the future.


Worth noting that Hadolint[1] raises warnings for the issues mentioned in the article. Some examples of warnings:

- https://github.com/hadolint/hadolint/wiki/DL3007: Using latest is prone to errors if the image will ever update. Pin the version explicitly to a release tag.

- https://github.com/hadolint/hadolint/wiki/DL3013: Pin versions in pip.

- https://github.com/hadolint/hadolint/wiki/DL3018: Pin versions in apk add.

[1] https://github.com/hadolint/hadolint
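Running it is a one-liner, e.g. via the published image:

  docker run --rm -i hadolint/hadolint < Dockerfile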


Hadolint is great! If you want to customize your lint logic beyond the checks in it, I recently wrote a Semgrep rule to require all our Dockerfiles to pin images with a sha256 hash that could be a good starting point: https://github.com/returntocorp/semgrep-rules/pull/1861/file...


The age old question is "How quickly do you update?"

Too slow - Exposed to known vulnerabilities.

Too fast - Susceptible to active supply chain attacks.

There's tradeoffs, and there's no solution here. It's a hard problem.


Not just from a vulnerability perspective but also a dev and testing perspective. Constant updates make testing far more involved for every release, since you can't just focus on the parts the dev work would have affected; you also need to thoroughly test everything that was calling into the updated dependency. But if you don't update enough, once you really do need to update, either for security or feature reasons, you may have a nightmarish process ahead of you of dealing with breaking changes and transitive dependency incompatibilities.


Exactly this. The OP reads like a nice perfect world scenario but...

1) Just because there is a vulnerability in a package or library doesn't mean that you are susceptible to that vulnerability.

2) If you don't update immediately, when will you? Most companies can't afford people to review every posted vulnerability and work out whether updating is better than not updating.

The truth is there are easier ways to reduce risk than pretending that a pinned version somehow solves a lot of problems: Reduce your codebase/dependencies and apply defence-in-depth.

Reproducibility is something for some companies but, again, I doubt there would be many times that I would care about what exactly was deployed 6 months ago. If something is broken now, it gets fixed now.


I really like Microsoft’s mirror of CRAN because it’s versioned by date - you can install everything as it was on a particular date, so that kind of reproducibility is easy. I wonder if this could be added to PyPI.


Terrible advice.

It doesn't solve the "immutability" problem and may give you a false sense that it does, which is a much bigger problem.

Here's the "good" example from the article:

    # GOOD:
    image: "nginx:1.21.6"
Here is the header of the source of the Dockerfile for this image:

    FROM debian:bullseye-slim

    LABEL maintainer="NGINX Docker Maintainers <docker-maint@nginx.com>"

    ENV NGINX_VERSION   1.21.6
    ENV NJS_VERSION     0.7.2
    ENV PKG_RELEASE     1~bullseye
    ...
    ...
    ...
Notice the first line? `debian:bullseye-slim` is not "immutable". Should the `nginx:1.21.6` image auto-update when a new `debian:bullseye-slim` is available? Or should it keep what it was at build time? I am not exactly sure how nginx does it, but it would not be incorrect to do either (both address different issues and are completely valid depending on what you want to do).

If you are serious about using docker, you need a private registry and you need to make a tagging policy that is best for YOUR use-case.
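The mirroring itself is mundane once the registry exists; a sketch, with a made-up internal registry host and a date-stamped tag policy:

  # freeze today's upstream image under your own immutable tag
  docker pull nginx:1.21.6
  docker tag nginx:1.21.6 registry.internal.example/base/nginx:1.21.6-20220306
  docker push registry.internal.example/base/nginx:1.21.6-20220306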


An easier solution than managing your own OCI registry is likely to just pin the digest and have dependency update automation (e.g. renovate) update the digest while still targeting a tag.

For example

  FROM mcr.microsoft.com/dotnet/sdk:6.0@sha256:70b890cd12f73f8ad80061d242081b61da666bda7ec2d729113855a8b9410e1e AS build
The tag is there for humans to read, while the digest locks it to a specific image version.


Use latest during development, but push to production using image in your private Docker registry which has proper names and tags.

git push -> Docker build (bonus points for building only if needed [for example only if Dockerfile, Jenkinsfile or requirements.txt has changed], otherwise use latest from Artifactory) -> run all automated tests -> if they pass, push the Docker image to Artifactory with a reasonable name and tag.

When doing a release, push the image from Artifactory to production.

This way you don't have to have a process to update all the things in the Dockerfile on a regular basis, but you still only push to production the actual binaries that are really tested and proven to be ok. And you can re-use the actual binary when doing a hotfix or something similar.


So what actually always happens with this approach is that companies stay on ancient docker images, because there is never time for updates as long as they work.

It's the same with locking python pip packages to specific versions. Nobody ever looks at it again and you run code that is 5 years old in production. People only look at it when it breaks.


I just said to use latest exactly to avoid this issue.


You said to use it for development. But I don't use Artifactory so maybe you meant to use the latest for production also, if it passes the build.


Having robust process requires some tools. Artifactory (or any other private Docker registry) is one such tool.


At least they would have never upgraded to log4j 2…


Forget for a second the security concerns -

If you use :latest and the build breaks, you need to spend unexpected time fixing the build, for whatever reason :latest broke it, in order to continue with the business objective that you were actually tasked to do. Essentially, you have InfoSec muscling in and demanding, by the most coercive way possible, that Thou Shalt Not Do Your Job Without Bowing Before Me. Who cares what the business wanted? Who cares about predictable schedules? Who cares about delivering for a customer by a point in time which was promised to the customer?

This isn't the way to promote a healthy culture in the company. Nobody should get immediate and total veto rights. In the real world, healthy organizations balance between different stakeholders, and once a path is set, everybody shuts up and executes. The right place to make a stand for your interests is in the stakeholder meeting, not the deployment pipeline.


This! It gets even better - there WILL be a time at your company when you'll need to revert for whatever reason. That reason will most likely be something VeryImportant(tm) and will probably have to be done RightNow(tm), on a weekend if you're unlucky.

At that time, you REALLY don't want to find out that you forgot to pin your dependencies, that your older build now doesn't run due to breaking changes, and that you have no frigging idea what versions of that random NPM crap library you were running 5 weeks ago.

Just... don't do that to yourself. Ever. Pin dependencies, update them regularly, but update them as YOUR decision under YOUR CONTROL. Not under some techbro's control based on his GitHub pushes. If you're smart, also host your own repository (Maven, npm, Docker, whatnot) with your dependencies so your business can't be disrupted by some techbro drama and missing packages.

(This is one of those lessons old neckbeards learned on their own skin.)


> you have no frigging idea what versions of that random NPM crap library you were running 5 weeks ago.

The most telling part of this screed is that you don’t appear to know you can tell exactly what version you were using at any version-controlled point in time by looking at the lock file for NPM.


My default workflow is not to pin the version. The build will only break occasionally and is often trivial to fix. In the rare case it does require a significant amount of work, and management is breathing down your neck, you temporarily pin the version to get the release out, and create a backlog ticket to resolve this as soon as possible.

This workflow ensures any work for minor version upgrades is part of the development workflow. PMs can't deprioritise this work against more exciting shiny product features. If you include it as separate development tickets, it just tends to get deprioritised until it becomes a big chunk of work.


> This workflow ensures any work for minor version upgrades is part of the development workflow. PMs can't deprioritise this work against more exciting shiny product features. If you include it as separate development tickets, it just tends to get deprioritised until it becomes a big chunk of work.

Instead of your maintenance work being visible, you're hiding it. You're describing an adversarial culture, one where you need to do shadow work because you don't trust management to prioritize necessary maintenance, perhaps one where management doesn't trust engineering to place maintenance in its proper context. I don't think this is an ideal, I think this is sad.


I am not hiding it, I make it an integral part of the development process, as it should be. Treating technical debt as a separate work item is asking for problems.


> you have InfoSec muscling in and demanding, by the most coercive way possible, that Thou Shalt Not Do Your Job Without Bowing Before Me.

InfoSec is not optional.


Of course it's not. But it's not King, either.


If you're fedramp certified, they kinda are. Fedramp is the key that unlocks a lot of really high-paying customers, and if you lose your certification those customers (with their firehose of revenue) go poof.

How to lose certification: Don't address a known vulnerability (CVE) within a specific number of days, based on severity. Doesn't matter if it's log4j or some random executable in your images that's never used.

When you're up to hundreds of services, thousands of packages, and millions in revenue from Fedramp customers, InfoSec gets pretty important.


FedRAMP also requires you to explicitly give veto power to InfoSec at every stage of design, development, implementation, operation, and maintenance, and to employ a Change Control Board (CM-03).

There's only ~250 companies / products that are Authorized at any level of FedRAMP, and many of them are explicitly "Federal" versions of their products in order to isolate the organizational controls away from affecting their commercial offerings.


I could see it argued that in $CurrentYear, any information-oriented company that doesn't put InfoSec as its #1 priority is just asking to be pwned. It's not an if, but a when and to what extent.


No, the #1 priority is always the core business - the job that gets money in, satisfies users and keeps the company running. Everything else comes after - without the core business, infosec is pointless and can't sustain itself.

InfoSec is critically important, but it's important just like IT people, janitors and server maintainers - business breaks without them, but they aren't actually earning money and prioritizing them over core business is the tail wagging the dog.

(And yes, I've seen way too many entitled "InfoSec" experts explicitly undermining their own company because they forgot that. Read The Phoenix Project or similar for concrete examples.)


> they aren't actually earning money and prioritizing them over core business is the tail wagging the dog.

Uhm, InfoSec helps prevent your company from hemorrhaging money and trust in the form of fines and lawsuits. That makes them a touch more important than you make them out to be.

The bigger a company's customers are, the more important InfoSec becomes to your "core business", because the certifications and security required by those customers have large infosec requirements.


That's also your HR and accounting teams' jobs, but they don't tend to assert that they need to be treated as Big Damn Heroes for it.


People are mentioning that pinning versions leads to overhead in updates. These people have probably not used Renovate: https://docs.renovatebot.com/

Renovate is smart enough to understand that a task such as “Update Node to v16” means updating .nvmrc files, updating FROM in Dockerfiles and engines in package.json in a single PR it creates for you.


Renovate is great, far better than relying on somebody remembering to upgrade dependencies in 12 different places.

But if you want full coverage, there will be a need to write a few regex managers. And testing can only be done in production on the mainline branch. The best way I found to do that was forking the repo to configure and try out renovate separately.


Renovate is amazing, and a tool most developers and DevOps engineers should learn. It's FOSS, and does a great job at dependency management across a very wide range of software like NPM, Terraform, Docker, Ansible, Maven, Golang, Rust and many others.

https://docs.renovatebot.com/modules/manager/


Call me stuffy but I argue for writing your own Dockerfiles, all from one or a small number of common base images (e.g. scratch+debian, or what have you), served from your self-hosted registry.

If you want to use some provided official image for vendored software, just port it over.

Then you tie the build+push into your pipeline however suits your org, be it fully manual or end-to-end CI with regular rebuilds.

This solves several issues mentioned elsewhere in the thread. As a bonus you won't be affected should docker.io or quay have issues or make breaking changes.


The security questionnaires we have to complete for our larger customers now require that we pin the exact hash of a container or mirror any image locally. This is in addition to pinning versions of NPM packages and using our package lock for installation.

We now have a basic rule that :latest doesn't make it past proof of concept stage.

This has made automation of our package & container maintenance crucial, since we are a lean shop. It's pretty much a weekly occurrence that a low-medium vulnerability pops up somewhere once your application is large enough.


Pinning image tags is only useful if the upstream doesn't keep retagging images. I've had a case in the past where a vendor has retagged an image and broken something.

Maybe we should even be pinning to a specific layer/checksum like so:

  FROM ubuntu@sha256:8ae9bafbb64f63a50caab98fd3a5e37b3eb837a3e0780b78e5218e63193961f9


The article covered that five paragraphs in:

> In reality, you probably need a mechanism to enforce tag immutability: e.g., your own registry mirror, or referring to images by SHA).


In practice, using latest or just the major version has caused far less wasted effort for our team than scrambling to update versions because of some newly discovered CVE.

It's a problem in frontend builds because JS libs are quite poorly behaved, so we fix more versions there. But for backend stuff, be it alpine/debian base images, JVM libs, toolchains and so on, we've not had an issue with specifying only major versions.

Dogmatic rules are wrong 100% of the time.


Speaking as potentially one of the most prolific editors in tech history, your wife was right for the wrong reason.

You needed your article to state an example where a :latest dependency would kill a puppy.

Then your title is valid.


If you don't use latest, your ecosystem fragments and risks diamond dependencies. If you do use latest, some of your ecosystem risks randomly breaking.

I think it's best to use latest because the combinatorics of the ecosystem as a whole is much smaller. With everything on latest, you still have to play whack-a-mole on individual services to get them to catch up, but you are doing this just when the community at large is most invested in understanding the process of upgrading.

The alternative is that ancient binary A depends on ancient binary B's legacy behaviour, and B has no need to upgrade but A has a very serious need, and now you have a serious problem. Maybe upgrading B breaks C, for instance. It's almost uncoordinatable at that point, once versions drift significantly. You get true technical debt.


I think this is an X/Y problem. The problem isn't pinning vs. latest. The problem is a lack of automation that makes builds reproducible. In at least one FANG company I've worked for, if you aren't building from latest no one will listen to your issue. Too much software changes too quickly to be pinning to specific versions. However, the build system keeps track of the build audit details and can roll back any build to any state. Teams are required to add the necessary layers of unit, integration, stress, crush, and chaos testing to validate each build. It's not cheap, but when you need to do a monthly fire drill of 'emergency update this dep because of Z vulnerability' it's worth it.


Depends on the use case. At home I :latest everything. The work of constantly adjusting tags to keep things up to date is more hassle than the risk of something breaking.

Prod stuff... yeah, best to pin it for supervised upgrades.


Dockerfiles are generally bad for reproducibility. This is mostly because Linux distros generally don’t care about it. Nix and Guix both support Docker export.


Agree. I think it's better to explicitly update major versions and know to keep an eye on the deployment in case it's something that tests can't cover.

I had a Cloud Build fail to deploy due to a bug in one of the builders' :latest. Ironically the bug was due to a change in error logging, so it failed silently.

So it's not just docker or dependencies.

In this case I could've used a GCP notification that the builder failed.


> In reality, you probably need a mechanism to enforce tag immutability: e.g., your own registry mirror.

Ah, that sounds so trivial. And yet it is very much not. Aside from the hot mess that is the docker registry daemon, some tags are meant to be rolling. You absolutely do not want them cached.


I agree. Or provide something in parallel such as:

    :lts      -- newest long-term stable release
    :release  -- newest regular stable release
    :x.y.z    -- supported, but you need to find out the exact version each time!


There are containers that I want updated to the latest version ASAP, since it likely contains all known security fixes. I consider the risk of missing an update for a then-public security problem greater than the risk of getting a new, still unknown security risk. While I am trying to, I will not have the time to be really on top of all updates at all times. And I am more comfortable with a container breaking than with a container missing an important update.

There are aspects where reproducibility is key, but that's only a subset of all containers on our network. And on my home network it's none at all.


Yes, and no.

This does mean that you have to actively manage your versions for any security fixes. Which you should be doing anyway. However when upgrading or patching you have to check in a bunch of places to make sure the update was effective.

As someone mentioned elsewhere, it's better to pin to major versions rather than specific ones. You'll need to make sure you have decent integration tests and monitoring though. I mean, you have that already, right?


With a bit of work you can run with pinned versions that act like :latest - in the equivalent gradle scripts we have explicit dependencies but we have another script that updates them to whatever is actually the latest at the time.

So it’s like master-SNAPSHOT when we need it to be, but still with reproducibility and regular updates.


Going even further with this - it has always struck me that the reference should always just be a hash of the docker image, such that it is essentially content-addressed and can be verified on the receiving end. Is there any reason that this isn't a good idea?


> It breaks one of the core requirements of continuous delivery: reproducible, idempotent builds

As long as you are able to look up what latest evaluated to during the build you retain most of the benefits while getting automatic updates.


Eh. It depends. There are exceptions to every rule.

latest is useful for builds where you want to know when something will break. If you only stick to pinned versions, it's common for nobody to ever update the pinned versions for years. Then you're way behind the latest versions and suddenly upgrading becomes a huge pain. You may be stuck in a situation where you're forced to upgrade because of a security hole in your pinned version, but to upgrade to a patched version breaks everything.

".....But that's terrible!", you say. "My build won't be repeatable! I won't be able to perform a roll-back build! My build will constantly break!"

All of that is true, but there are workarounds. For example, if you download, version, and store all artifacts used for each build, you can reuse them later. This isn't hard if you take the time to write some scripts. You can pull a container and export it to a tarball and store it. System packages and language dependencies can be downloaded similarly, and repos mirrored. You can do this once a week and version all artifacts with a datestamp, and make your own app builds pinned to a particular datestamp.
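For the container side of that, the archiving is just (the artifacts/ path and datestamp are made up):

  # freeze whatever :latest resolved to this week
  docker pull nginx:latest
  docker save -o artifacts/nginx-20220306.tar nginx:latest
  # later: rebuild or roll back from the archived artifact, not the registry
  docker load -i artifacts/nginx-20220306.tar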

If you always build from those versioned archived datestamps, you can always rebuild or revert to an old working latest build.

As far as builds breaking, they certainly will! You need good testing to catch bugs and regressions. But would you rather learn to adapt quickly to broken builds, or have a sleeping tiger waiting to bite you the one time you finally have to upgrade quickly?

Using a stable branch/tag is the safest hedge against frequently breaking builds while still getting security patches. But stable branches still introduce problems. You will eventually need to revert a stable change, and eventually the stable branch will be End Of Life. So even if you use stable, you should still use the practice of installing from downloaded versioned artifacts.

Since EOL will come eventually, you also have to commit to upgrading to a new stable branch and its breaking changes. You must therefore plan to sunset your own code. Find out when your dependencies will EOL, and plan to completely rebuild your apps using latest before then. It has to be a real commitment and not just a "nice to have", because you will end up being forced to do it by a security vuln or repos that stop carrying an old branch.

Ultimately you need to decide how much risk to take and how much planning is needed to avoid sticky situations. Be aware of the consequences of your design and have contingency plans.


Same with pip/Python. Reducing dependencies to a bare minimum helps the long-term survival of software.


It also helps with reducing dependency distribution issues - I've written single-file Python programs using just the stdlib, sometimes for a really old Python 3 version. It's really nice to just `scp` the script to a bunch of remote devices and have everything work out-of-the-box.

Of course, PyInstaller can solve dependency distribution issues - but I've sometimes run into packages that don't play well with PyInstaller (uwsgi, for example, because it's basically a C program distributed using pip, so PyInstaller can't figure out how to bundle it).


But it may hinder short-term survival (because you're spending time reinventing the wheel instead of building a competitive product).

Lockfiles (poetry.lock, Pipfile.lock etc.) solve the survivability problem for the most part.


> But it may hinder short-term survival (because you're spending time reinventing the wheel instead of building a competitive product).

That's very rare. It's a common trope though, and it produces amazing examples of startup companies self-destructing when they end up bogged down in dependency maintenance wars, spending a lot of time just fighting all the issues in the 100s of alpha libraries they pulled in to be "faster". The argument for doing that was always the one you used. Adding a dependency isn't just "free code", it's also "free code that will probably break at some point and will need to be updated and deconflicted."

In my experience, a good set of older, tested and well-maintained core dependencies is usually much better for the company, even if that means they maintain a piece of functionality themselves that also exists as a library.


Read my comment in the context of the parent comment I'm replying to and you will see how your comment has nothing to do with mine.


In a perfect world one would use CRIU [0] to suspend the old container, then download the new one and start it. If the service comes up with the new container, remove the suspended one; otherwise, fall back to the suspended container.

[0]: https://criu.org/Docker


Keeping docker containers up to date can quickly become time-consuming compared to good old distribution packages. I use both and I'm still trying to find a good combination of tools to reproduce the "package" plus unattended-upgrades UX, where things are safely kept up to date automatically.


    # GOOD:
    image: "nginx:1.21.6"
nothing "good" about it, literally no different from ":latest".

only full hash reference.


Sure, version tags could also move, but by convention they do not. Unlike the latest tag, which by convention does move a lot.


By convention npm packages are not deleted or hijacked.


You do realize that NPM package versions already are immutable? Dunno since when though.

Deletion is possible within limits, but the same is true of Docker. All the hashes in the world won't bring back a deleted image.


They were made immutable after the issue was already widespread. Which is also why I'd heavily encourage using hashes to pin container versions, even though people might not see the immediate need to do so.


version numbers are never immutable. they are arbitrary labels that are created by some centralized authority and can be changed by that same authority.

artifacts being removed is much less of a problem than artifacts being spoofed with malicious content.


From the original article

> This brings up an interesting side point, in that Docker Hub and most other registries allow mutable tags by default. So nginx:1.21.6 might not be the same image today as it was yesterday. In reality, you probably need a mechanism to enforce tag immutability: e.g., your own registry mirror, or referring to images by SHA)


yeah, well, i stopped reading when i saw that "GOOD" example. it's not. and that isn't a "side point", that's the most critical point for preserving security and reproducibility of builds.


It's literally the line that follows that example. Why do you think it's useful to comment when you haven't even read the topic of discussion? You're like a person not reading the book at a voluntary book club. If you don't want to read it, just don't show up.


i suggest you read my comment again, because so far what you've said applies to you more than to me.


Well then you'd vendor your docker images anyway and not pull them from the internet, right?


Sure, and many do that, but you don't have to if you refer to images by the digest of their contents.


Stuff like this makes me miss pinned RPM packages.


:latest sounds like npm without lockfiles.


It seems kind of weird that we don't do "stable repositories" anymore.



