They built tests that explicitly assumed that the library's interface wasn't going to change:
>When we upgraded, all our tests passed (because our test fixtures emulated the old behavior)
Doing that, upgrading your dependencies and expecting everything to work just because those tests passed? That's naivete.
If they'd built decent integration tests that used the actual library (instead of "assume nothing changes" fixtures) and made more of an effort to simulate realistic scenarios, their tests probably would have flagged up most of the issues they had.
Alas, this seems to be one of the side effects of following the "test pyramid" "best practice".
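To make the fixture problem concrete, here's a rough sketch. Everything in it is made up for illustration (`real_encode`, `fixture_encode`, `build_message` and the str-to-bytes change are all hypothetical, not what their library actually did): the dependency's new release changes a return type, the fixture still emulates the old release, and only the check that calls the real function notices.

```python
# Hypothetical dependency, new release: encode() now returns bytes, not str.
def real_encode(value: str) -> bytes:
    return value.encode("utf-8")


# Hypothetical test fixture written against the OLD release: still returns str.
def fixture_encode(value: str) -> str:
    return value


def build_message(encode, value: str) -> str:
    """Application code under test, written when encode() returned str."""
    return "id=" + encode(value)


def check(encode) -> bool:
    """Return True if build_message() still works with the given encode()."""
    try:
        return build_message(encode, "abc") == "id=abc"
    except TypeError:
        # str + bytes raises TypeError, which is how the breakage shows up here.
        return False


print("fixture-based check passes:", check(fixture_encode))
print("real-library check passes:", check(real_encode))
```

The fixture-based check keeps passing after the upgrade because it never touches the upgraded code. The check against the real function fails, which is the whole point of having it.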
I wasn't talking about that paragraph, but the following paragraph where they had tests, but they didn't test with large enough packets.
Tests can never cover every scenario. They are very useful, and they catch a lot of unexpected regressions. But they're just a part of the puzzle, not a replacement for good development practices.
Updating a dependency without bothering to read the release notes because you have tests -- maybe naive is the wrong word, maybe hubris fits better.
Tests can't cover every scenario, no, but had they made a bit more of an effort to test realistically, it's absolutely possible that they could have covered every scenario that mattered here.
Over-reliance on unrealistic unit tests (which is likely what led to them not testing large packets) is a pattern I've seen cause issues like this many, many times before.
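As a sketch of what I mean by testing realistic sizes: the `send_packet` function and the 1500-byte `MTU` limit below are assumptions I invented for the example, not details from their write-up. The helper just runs the same check across a list of payload sizes and reports which ones break.

```python
MTU = 1500  # assumed limit, for illustration only


def send_packet(payload: bytes) -> int:
    """Hypothetical sender that cannot handle payloads larger than MTU."""
    if len(payload) > MTU:
        raise ValueError("payload exceeds MTU")
    return len(payload)


def failing_sizes(send, sizes):
    """Return the payload sizes for which send() raises or reports a wrong length."""
    bad = []
    for size in sizes:
        try:
            if send(b"\x00" * size) != size:
                bad.append(size)
        except ValueError:
            bad.append(size)
    return bad


SMALL_ONLY = [1, 16, 64]                # typical "toy" unit-test inputs
REALISTIC = [1, 64, 1500, 1501, 65535]  # includes boundary and large payloads

print("small-only sweep failures:", failing_sizes(send_packet, SMALL_ONLY))
print("realistic sweep failures:", failing_sizes(send_packet, REALISTIC))
```

With only the small sizes, nothing fails and the suite looks green. Adding the boundary and large sizes is a few extra list entries, and it surfaces the problem straight away.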
I upgrade pretty regularly without reading release notes - relying on realistic tests to catch everything. What they do catch is usually not in the slightest bit obvious from release notes (often a regression in the dependency). Call it hubris if you like, but it works for me.