Tripping over the potholes in too many libraries (rachelbythebay.com)
112 points by bigiain on Aug 10, 2020 | 56 comments



From personal experience: it doesn't take a FAANG type of scale to reveal those "potholes". For most open-source projects (and especially things like pip, npm and other infrastructure and build tools), a simple, almost classical enterprise outbound proxy with authentication and MitM-style HTTPS re-encryption is more than enough to kill almost every assumption they have in their code. In my case, that proxy tended to deliver incomplete files at random, which makes you discover that there is no checksumming happening. Then you build a local package proxy to help with that, and then you find out that a particular library doesn't re-evaluate proxy environment variables on redirect, so a redirect from an external site to an internal one does not work. I can't remember ever filing as many issues against open-source projects as I did back then, and it was a draining game of whack-a-mole. At the same time you are reading tutorials called "How I did something in three hours" and you consider whether after-hours drinking might have been a better career choice.

Sorry. Enough ranting. That problem is everywhere, not just in libraries. I'm just not sure the problem lies with the users of those libraries. It's not that it's too easy to use those libraries; it's too easy to publish them under a generic-sounding name while solving only a particular use case.


An unsatisfying solution is to surrender and try to be as vanilla as possible in everything, with a few carefully thought-out exceptions as needed by your business.


Could you explain a bit more? I think most pkg repos are just static directories. If you do HTTPS re-encryption and the certs do not match, EVERYTHING will break.


Yes, it will. Normally, the IT department would roll out their MitM CA certificates globally into the Windows trusted certificate store or something like it. However, if you are trying to do CI/CD on a Linux server, you have to manually provide those CA certificates to the tools you are using. Many tools are developed in a kind of "HTTPS everywhere" bubble, so they assume HTTPS just works and don't provide any kind of remedy for broken, outdated or plain internal certificates. Then you start working around it and it feels like it never ends....


I find that software supports the http_proxy and https_proxy environment variables well enough nowadays (tip: the variable names are sometimes case sensitive and must be lowercase).

The real struggle for me is if the proxy requires authentication. It's very often not possible to configure a username/password and either way I don't want to put my employee password in every goddamn configuration file.

The CA certificate must be set up on Linux hosts; that's the bare minimum if the company wants to do SSL interception (add to /etc/pki/trust/anchors and call update-ca-trust). Then things mostly work out of the box.
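
Something like this is usually enough to get Python tooling through an authenticating MitM proxy (a rough sketch, not any particular tool's code; the proxy URL, credentials and CA bundle path are made-up placeholders):

    # Sketch: HTTP client behind an authenticating MitM proxy with an internal CA.
    # Proxy URL, credentials and CA bundle path are hypothetical placeholders.
    import requests

    proxies = {
        # The lowercase http_proxy/https_proxy environment variables work too
        # for tools that honour them.
        "http": "http://user:password@proxy.corp.example:3128",
        "https": "http://user:password@proxy.corp.example:3128",
    }

    # verify= points at a bundle containing the proxy's MitM CA certificate;
    # pip accepts the same bundle via --cert or the PIP_CERT environment variable.
    resp = requests.get(
        "https://pypi.org/simple/",
        proxies=proxies,
        verify="/etc/pki/tls/certs/corp-ca-bundle.pem",
    )
    print(resp.status_code)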

I've had one job where I spent a lot of time debugging and setting up PKI in a bank; I have a postmortem here of how various libraries retrieve CA certificates from the system (including obscure bugs around Python ssl). https://thehftguy.com/2020/03/19/jp-morgan-postmortem-why-yo...

Of course the real solution is to have an internal mirror of linux/python/java packages.


I have been thinking a lot lately about a possible solution for a small portion of this problem: Microdependencies. I'll explain in more detail in the context of JS, but it applies to other languages as well.

Currently package repositories like NPM host 2 types of dependencies: big community packages (frameworks, database drivers, validation libraries, query builders,...) and smaller function-scoped utility packages (left-pad, is-even, math functions,...)

My proposal is to create a new package manager for the latter (or adapt existing ones) which handles micro-packages, typically scoped to 1 function, 1 file or a small set of files. The big difference with regular NPM is that these micropackages do not get downloaded into node_modules, but are downloaded inside the src directory and committed to source control. This means that when you add a micropackage to your project, its source code is pulled into the regular src folder and is subject to the same code review as the other code your team writes. When the micropackage is updated the changes are visible during code review as well (instead of just being an opaque version bump in package.json)

This process makes small dependencies more visible and has review built in. It also solves the problem of having to rewrite the same logic in every project in a slightly different way.
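
To make it concrete, the add/update step could be little more than "fetch the file, drop it into src, pin the hash". A hypothetical sketch (the registry URL, manifest format and directory layout are all invented for illustration):

    # Hypothetical sketch of a micropackage "vendoring" step: the package source
    # lands in src/ and is committed, so every update shows up in code review as
    # a normal diff. The registry URL and manifest format are invented.
    import hashlib
    import json
    import pathlib
    import urllib.request

    REGISTRY = "https://micropkg.example.com"      # hypothetical registry
    VENDOR_DIR = pathlib.Path("src/vendor")
    MANIFEST = pathlib.Path("micropackages.json")  # hypothetical lock file

    def vendor(name: str, version: str) -> None:
        url = f"{REGISTRY}/{name}/{version}/{name}.js"
        code = urllib.request.urlopen(url).read()
        VENDOR_DIR.mkdir(parents=True, exist_ok=True)
        (VENDOR_DIR / f"{name}.js").write_bytes(code)

        # Pin the exact content hash so a later update is an explicit, reviewable change.
        manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
        manifest[name] = {"version": version,
                          "sha256": hashlib.sha256(code).hexdigest()}
        MANIFEST.write_text(json.dumps(manifest, indent=2))

    vendor("left-pad", "1.3.0")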

Edit: I'm not currently working on this due to time constraints but if anyone wants to talk about this, hit me up.


I think the answer is to not import the dependency at all.

If you absolutely can't write a single function, then copy/pasting it from a set of "known good" functions would be better than importing a dependency.

Of course, just writing the function would be better. It would probably take less time than discovering the package, working out the interface for it, discovering that it doesn't actually work for your use case, discovering another, better package, etc.


I agree, but I would take it a step further.

If you think it’s easier to find and import a library to test if a number is even than to use the modulus operator in an if clause, should you really be responsible for writing any code at all?

I feel like JavaScript in particular has developed this ecosystem of inexperienced programmers that “don’t know what they don’t know”. I don’t want to discourage anyone, but I feel like other languages (particularly more mature ones) don’t have that problem to the same degree.

For instance, I feel like the cluster module in Node makes it far easier for people to do things they shouldn't than any of the equivalents in C# or Java or even PHP (so it's not just compiled vs scripting languages).


I think the catch is that "microlibraries" can still contain reasonably large amounts of functionality that's non-trivial to replicate.

I'd be happy enough to import single-function dependencies for "urlencode" or "is_valid_email" rather than manually grow out all the edge cases (again).


Many useful libraries cannot be easily re-created. Consider datetime handling and X.509 parsing.


I agree, and happily import moment.js for that reason. But that's not what the GP was talking about. This is not a "microdependency". It's something that should have been included in the standard library, except that JS doesn't have one of those.


CCAN (C Code Archive Network) http://ccodearchive.net/ follows a similar philosophy. CCAN contains various small, self-contained snippets of C code that are meant to be downloaded into a local directory ("vendored") and used directly or with small modifications.

The most straightforward way to use CCAN is to

1) `git clone` it in a subdir;

2) create a `config.h`;

3) include whatever `lib.h` you want in your project.

Updating these micro dependencies means doing `git pull`.

All libraries come with tests and a somewhat standardized API design.

What is missing (compared to your aims) is a declarative way to lock a dependency to a specific version.


Reminds me of [Bob Stout's Snippets](https://github.com/vonj/snippets.org). I used his collection often back when I were a lad.


> possible solution for a small portion of this problem

In CS research, this is known as "software components", and people have been worrying about it since at least the 1970s, with very slow progress. Here's Stroustrup writing about it in the 90s, for example: https://www.semanticscholar.org/paper/Language-technical-asp...

"Reuse happens only when a variety of social conditions are favorable. However, social conditions, development processes, and design methods alone cannot guarantee success. In the end, working code must be produced"

The dream used to be that software could be like electronics: standard elements like resistors would have a small number of well-characterised parameters, and you can just pick them from a catalog. Large components would come from vendors with large datasheets that would also describe their performance, and would be guaranteed a certain period of availability such as ten years.

The end result is not quite like that. A lot of components have made it into standard libraries (what python calls "batteries included"). Everything else is available on repositories. However, because everything has to be free (and sometimes Free), that also selects for the absolute minimum maintenance effort and quality. Freedom has many huge advantages, as does not having to get purchase order or "BOM" approval every time you add a dependency to your project, but it does leave people in the "if it breaks you get to keep both pieces" situation.


I really like this idea.

When reviewing changes of the imported code, one will need the original change sets with comments. Therefore the system will need a format for representing these diffs. I don't think git is the right answer.

After review, the tool could upload a "passed review" signature to the central repository. Folks can see which version has been reviewed and approved by which organizations. This would let small organizations benefit from the review work done by large organizations. For example, a startup founder can feel more confident using a library version that has passed review by several of the FAANG companies.


Horrible idea. If a package gets corrupted by an adversary, not only would you have to clean the npm package, you would also have to hope that everyone who committed the code into a repository gets the news and fixes it. At least with NPM there is the chance that a fixed version gets automatically pulled on the next build.

In addition to that, most package managers rely on metadata, like description files, the bloat introduced by these micro-packages is unfathomable.

Then there is the obvious answer to all this, use a proper language, learn it and don't pull in stuff like "left-pad" from external sources. It's a one liner. If you can't come up with that yourself get the fuck out of this industry. Seriously.


I dunno, there's something to it. I almost /never/ commit code to my repo without reading it, so not only would I read the first version that enters my repo, but I'd read every subsequent file change before committing.

I can definitely imagine a package manager that, in some way, differentiates between the two (in repo or not), whether manually specified or as OP suggests some distinction based on how 'big' the package is.

Right now, it feels too dichotomous. Either I use a package that itself relies on a ton of packages, and I won't read all the code changes, or I copy and paste bits of code into my repo and now have to manually update things of any consequence.


What about git submodules?


Oh lord no. They’re a pain to manage. The world needs fewer git submodules.


How do you go about preparing a PR for an upstream if all the code is magically in your own version control system?

And then how do you separate site-specific patches from bug fixes and new features for the upstream?


This comes up all the time and I never understand this attitude.

Yes: dependencies are bad. Not having dependencies and writing everything yourself: also bad.

Honestly, you have to rely on heuristics in your deps. How active is the project? How simple is the thing it's doing (simple enough to probably not have major bugs, but not so simple it's faster to code it yourself) etc.

You get so much velocity from depending on external libs, it's straightforwardly bad not to take advantage of it. Are the junior devs at your company really going to write a better config handling library than the random one from github? Probably not. Maybe they'll both be bad code but the open source one has a better shot of someone else noticing it.


But once they've done it your junior devs will know how to write a config handling library. And they'll have hit a few bugs so they'll know about file locking, and concurrency problems, and overwriting, etc. It will be a great learning project.

If you teach them to just import a dependency, then that's all they know. They'll never be able to write a concurrent library, they'll just know how to import one.

Having lots of velocity is bad if there are too many potholes in the road. Slow is smooth, smooth is quick.


> But once they've done it your junior devs will know how to write a config handling library

This is true, but there are always so many things that need building. It's kind of like needing an end table, and either buying one from Target or building it yourself in your garage. End tables aren't that hard to build! The end table at Target is a piece of crap! You'll learn a lot trying to build your own end table! These things are all true, but the question that you have to ask before all that is: do I want to spend my limited time building an end table?

> If you teach them to just import a dependency, then that's all they know. They'll never be able to write a concurrent library, they'll just know how to import one.

This is probably true of some developers, but I don't think I've run into a developer that doesn't love building things themselves. (Possibly selection bias). The tendency coders have is to code. They're going to write something from scratch (and learn a lot) whether you tell them to or not.

> Having lots of velocity is bad if there are too many potholes in the road. Slow is smooth, smooth is quick.

I agree with this principle. The way I think of it though is that it's not a road you're on: the asphalt stretches out in all directions, and you don't know which direction you'll be going next. It's very often the case that you spend a bunch of time smoothing over a single pothole to perfection, and crash into another one right next to it. It is probably correct to shittily half-fill all the potholes around you just to avoid catastrophe and come back later to the ones you run over again and again.


> It's kind of like needing an end table, and either buying one from Target or building it yourself in your garage.

Except that, in this analogy, you're a carpenter. Building furniture is your profession, and hopefully something you enjoy doing. Learning how to build end tables quickly is a useful skill to have in your profession. Practising that skill by building an end table for yourself is not a waste of time.

> It is probably correct to shittily half-fill all the potholes around you just to avoid catastrophe and come back later to the ones you run over again and again.

Indeed, but that's another skill :) Knowing when to accumulate tech debt because you're not sure if the feature will be needed is important. And again, it's a skill one acquires with practice.


> Except that, in this analogy, you're a carpenter.

Alright, and if you're a junior carpenter, are you going to be tasked with chopping down the perfect trees and whittling those down into the perfect legs and tabletops from which you will assemble your perfect end tables? Probably not; you're almost certainly gonna start off by, at most, going to your local hardware store, buying whatever shitty wood they've got, and putting that together into something that might look god-awful but at least functions approximately like an end table. Only after you've gotten good at building end tables with off-the-shelf wood are you likely going to have the requisite understanding of the limitations of said off-the-shelf wood to have any interest (let alone understanding) in going with custom wood.

Similar deal with software. If you're a junior programmer, yeah, you might be encouraged to develop your skills by developing libraries, if you're lucky, but in most shops you're probably gonna be tasked with, you know, actually delivering software, and that software will probably need to use existing libraries for expediency's sake. It's exactly why a hefty chunk of web development nowadays involves full-blown frameworks like Rails or Django (or even lighter-weight alternatives like Sinatra and Flask, respectively) instead of people hand-crafting HTTP request parsers / response generators and hooking 'em up to raw sockets.

"Store-bought" libraries, like store-bought wood, ain't perfect by any stretch. They still provide a reasonable starting point for getting a project done reasonably quickly instead of getting stuck in the weeds (literally) of growing your own trees.


I think we're probably extending the analogy too far, but I'll play along.

We don't craft programs out of machine code any more. We use high-level languages. So that's where we're not growing our own trees any more. I think the rest of your analogy agrees with me: even junior carpenters need to know how to build an end table by actually building them rather than buying them from Target.

It's a judgement call, I'll agree. But only allowing your junior devs to write glue code for imported dependencies means that's all they know how to do. They'll get the idea that that's all coding is - find the right dependency and write some glue code, and you're done.

I've seen this attitude so much recently, especially in startup product development, and it's actually dangerous. No-one audits their dependencies, they just read the description, import it and try to work out how to make it do the thing. If it works, who cares if it's huge, or has 34539473 other dependencies, or has been taken over by a malicious maintainer? Velocity, right? Get it into production and on to the next feature.

This is going to end badly.


> We don't craft programs out of machine code any more. We use high-level languages.

A.k.a. we chop down existing trees instead of growing our own.

> This is going to end badly.

While I don't disagree, the alternative is more often than not for nothing to ever get shipped because the development team is busy reinventing the universe, especially when a "don't use libraries" mentality gets taken to its logical conclusions ("we need perfect custom hardware to run our perfect custom operating system written to run our perfect custom programming language/runtime to run our perfect application" - just because Google has those sorts of resources doesn't mean a lone developer trying to get a side project done has those sorts of resources).

There is, however, a reasonable middle ground: start with those dependencies, and then as you encounter their limitations start working to replace them. The vast majority of development teams don't have the resources to make "do everything in-house" the default, but they might have the resources to selectively fork or replace a critical library once they've built an initial implementation of their project on top of said library. To your point about junior devs and learning opportunities, this is a perfect way to learn: "alright, so you built this app with this library, great, now write a replacement for that library that does A, B, and C instead of X, Y, and Z".


I agree with that. I think I just draw the line further down the scale than you. Possibly, that's because I'm an old fart and more used to having to write everything from scratch, and so therefore more used to it (and less liable to consider that wasted effort).


And by re-implementing a library, you're just importing dependencies into your code, with the difference that now you're responsible for bug fixes and security patches, without the opportunity to inherit performance improvements contributed by others.


You also have a much easier route to fixing issues, since there will be far less politics involved. Probably you will also have a deeper understanding of the code since you wrote it.


This is true, but I've found it far more common that a predecessor at the company I'm at wrote the code and isn't around to support it anymore. For simple components it's no biggie, but for non-trivial components it can take about the same time to debug as an external module, the benefit of the external module generally being better documentation, although that's not a given.


If you find the fix yourself you don't need to deal with politics if you don't want to. Submitting pull requests is optional after all.


If you don't submit a pull request and get your fix merged, essentially you have given up on the benefits of an open source library maintained by other people. You will have to maintain your fix. At that point, you might as well have written the thing yourself, especially for trivial stuff.


There are so many problems working in tandem here:

- Leaky abstractions which are treated/documented as if they're perfect. Read a file, do a thing, write a file. Except when running in parallel, things are never that simple.

- Lots of important projects ("base of Arch Linux" level important) still don't have public repos, an official site (or even page) with references, bug tracker, or any reasonable way of contacting the maintainers. How do these projects get updated? I guess they simply don't.

- Even the best feedback mechanisms are still awful. Figuring out whether something is effectively abandoned before spending 30+ minutes writing up a bug report could take longer than writing the bug report itself, encouraging writing the absolute minimum (which ends up being less than minimum half the time). Every project is also its own unique snowflake. They want different types and amounts of information, some won't touch an issue unless it's reproducible in the very latest commit (requiring hours of setup and lots of domain knowledge to compile the damn thing), and they all have a completely different process.

- Most tools are still skewed towards one maintainer per project. Take any random GitHub abandonware as an example. You can fork it with one click, but so have 20 other people. You can put up a web site for your fork, but search engines will rank the original web site (or just the GitHub repo) highest for years. You can rename it, but now you've just burned bridges with all the other forks and the original repo, in ways which would be difficult to repair. Now try to migrate all the issues, mailing list, forum, IRC channel or other communications. It's basically hopeless unless the original maintainer is completely on board.

- Most open source software solves only the author's immediate problem. But not a single user of the software has exactly the same problem to solve as the original author. So pretty much by definition no library will solve the problem you are dealing with.

Maybe one way out of this mess for any particular language might be if the standard library was meant to grow to take over popular code, possibly with a tiered approach. You could nominate a library, then it would go through reviews, extensive testing, the whole shebang, before being accepted into a low level "this might someday be part of the standard library in some form" tier. This ideally would make it possible for example to replace core libraries in steps, steadily pushing the old one out of core and pulling a better alternative in.


I guess the conundrum is: for 80% of people, maybe an 80% library is good enough; and 80% of library vendors are happy to write libraries that are 80% good enough - and 80% of customers have learnt to put up with stuff that occasionally doesn't work?

To me, the best solution is a language ecosystem that distinguishes itself by having properly written standard libraries to start with - whether built in, dynamically linked or optionally included in your project one way or another. For example, key=value config file parsing and saving should absolutely not need a third-party library. That should be as much a selling point of your language ecosystem as having a package manager that can pull directly from github.

One of the few upsides to "enterprise Java" is that there's a lot of "enterprise level" libraries around for it; even a lot of the open source ones are maintained by organisations like the Apache foundation rather than some random person on the internet. At one place I used to work, we had a whitelist of approved libraries and vendors, and even if you could in theory just import a library from the web, you had to make a case to your manager and get someone to sign it off if you wanted to use something not on the list. (As a side-effect, they'd also check the licence conditions, as we were writing closed-source commercial stuff. As a young intern, I got my first lesson in what (L)GPL meant when I requested to include something in one of our projects.)


I think a lot of comments here are missing the fact that one could reimplement the config file handling library and atomic file update library in less time than it took to track down the package responsible for the bug and root cause it.

I’ve found such issues are the common case for libraries written in languages that encourage tons of tiny dependencies.


This is so bizarre, as if fixing a config write to be atomic would somehow fix the root issue: this program was not designed with a config store that can tolerate multiple instances of itself. Because of course it doesn't. Because having multiple instances running in parallel was a novel requirement you introduced that the original author obviously didn't cater for.

This isn't a learning experience about flock at all.


> My guess was that people get into these situations where it seems like a library is going to be a solid "100% solution", and yet it lets you down and maybe reaches the 80% mark.

That's probably true. I think a difficulty in avoiding this is that often the person reaching for this library, if they tried to write it themselves, would only hit the 60% mark.

And I don't think there's anything wrong with that. There are people out there writing code at many, many different skill levels. Libraries -- even those that are less than perfect -- let people get things done that they might not be able to do on their own, or would do a worse job of without the library.


I'm always amazed by this attitude. That somehow the authors of random libraries on npm are amazing coders whose code quality can't be touched by mere mortals like us.

Firstly, if you've always imported a dependency to get around a problem, then yes it's going to be hard to solve that problem yourself. But it's also a learning experience. Keep doing it and eventually it won't be hard.

Secondly, the library probably isn't a perfect match for your use case, and probably contains a lot of flexibility to match a wide range of use cases. Maybe as much as 80% of the code isn't actually any use to your project. It'll be more complex code, because it's dealing with a wider set of use cases. The thing you'd write to solve your particular use case would be smaller and cleaner almost by definition. You'd end up with less code, that you understand well. This is always a better position to be in.


> That somehow the authors of random libraries on npm are amazing coders whose code quality can't be touched by mere mortals like us.

That's not what I said, nor what I believe. But I think it absolutely is true that most developers would struggle to reproduce the functionality of a fairly meaty dependency they've chosen (that is, not most of the dreck on the npm registry).

If a dependency exists for what you need to do, and after some vetting believe it to be of good quality, it's highly unlikely that it would be a good use of your employer's time to reimplement it yourself. I agree that it'd probably be a good learning experience to do it, but that's not the only consideration.

As for the rest, there's nothing that you're saying that's particularly wrong, but it's absolutely wrong if you apply it 100% of the time, because that would mean everyone would have their own bespoke web framework, networking library, UI toolkit, etc. I don't expect that you're actually making that argument, so it's a bit uncharitable of you to suggest that I'm making the exact opposite one, that no one should ever think about implementing something themselves.

Dependencies have a cost. Not using a dependency also has a cost. Figuring out which cost is lower is part of the job, and that decision needs to be made several times on every project.


Reusing code - your own or someone else's - often turns out to be a mistake. Reimplementing functionality also often turns out to be a mistake. Either way people get fooled by 80% solutions. There's no simple answer to this. I can give a simple algorithm - look at how much time you'll spend on each vs. where it's most important for you to spend your time - but you still have to apply that algorithm to your own data set to know which choice is right. And sometimes you'll get a wrong answer, and it'll be annoying (as appears happened in the OP's example). That's why they pay us the big bucks.


I think the generally accepted fix here (despite Rachel’s aversion) is to submit a PR to the file writing library that fixes the corruption issue (likely using atomic rename), then get the tool to bump the version of their dep or vendor in the fixed version.

I’ll admit, though, that the balkanization of code adds overhead from the abstraction. I just don’t think it’s a bad thing, because it’s all very new and things are still shaking out.

Imagine if the fix lands in the config file writer library and all the downstreams regularly upgrade their deps; the fix is now a lot more widespread. This is better than every single end dev knowing about atomic renames, I think.
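
For what it's worth, the fix itself is small. A minimal sketch of the write-to-temp-then-rename technique (in Python for brevity; the file name and payload are arbitrary, and this is not the library's actual code):

    # Minimal sketch of an atomic config write: write to a temp file in the same
    # directory, flush to disk, then rename over the target. os.replace() is
    # atomic on POSIX, so readers see either the old or the new file, never a
    # half-written one.
    import json
    import os
    import tempfile

    def write_config_atomically(path: str, config: dict) -> None:
        dirname = os.path.dirname(path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".config-")
        try:
            with os.fdopen(fd, "w") as tmp:
                json.dump(config, tmp)
                tmp.flush()
                os.fsync(tmp.fileno())
            os.replace(tmp_path, path)  # atomic rename over the old file
        except BaseException:
            os.unlink(tmp_path)
            raise

    write_config_atomically("app.conf", {"last_update_check": 1597017600})

Note this prevents torn or corrupt files, though without additional locking two concurrent writers can still silently overwrite each other's changes.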


It is likely that the fix would break somebody else's code which unwittingly depends on the bug. Then the burden of educating the users would fall on the maintainers, who, most likely, aren't having any of that. (The author even linked an article about this.)


How could you depend on a race condition bug?


It's not so hard to imagine a scenario where a system happens to depend on data written concurrently into a config file by two processes. If the file is properly written and renamed, then data from one process gets lost; and/or if there is locking, one process will suddenly stall.


>"In short, I think it's become entirely too easy for people using certain programming languages to use libraries from the wide world of clowns that is the Internet. Their ecosystems make it very very easy to become reliant on this stuff. Trouble is, those libraries are frequently shit. If something about it is broken, you might not be able to code around it, and may have to actually deal with them to get it fixed.

Repeat 100 times, and now you have a real problem brewing."

A future job interview question that might be asked at my future company:

"Let's say you're using a library for specific functionality, a library that you haven't written. Now let's say that there's a bug in that library that you can't work around.

How would you debug that library?"

See, there's a differentiator there.

There are programmers who can debug libraries, and then there are programmers that can merely use libraries provided to them.

Given a choice, you want a programmer who can debug libraries working for your company...


As people here point out, using 3rd party dependencies is inevitable, unless you're a tech colossus and can afford to write everything from scratch.

For the most part, these 3rd party dependencies work fine. In my experience it is fairly rare to encounter a show-stopping bug.

I have a simple mitigation for the "potholes" though - take the time to skim through the code of the library ahead of time and try to figure out what it is doing. That way if you do hit a problem, you can fork and fix. Doing this can also give you a sense of how well-written a library is to begin with.


I've run into an application, used for monitoring, that had that exact type of bug, albeit not with a dot file.

A customer of my old business had built a little monitoring system for their compute nodes mounting a parallel file system. Their integrated test had every compute node open a particular fixed path and file name (of course the same on every node across the system) for read and write.

This "monitoring" script meant that they could have up to 2000 or so simultaneous IOs going to the same file, with no read/write locking. The tool read/wrote some number of bytes to get a performance read.

The end result was 1) lots of contention of course at the metadata layer, 2) often times spurious and incorrect reports of the parallel file system being offline (it wasn't).

We tried helping them on this, but they insisted they were doing this correctly (they weren't).

This is less about libraries with potholes per se, and more about critical applications (to a degree similar to libraries providing critical functions that need to be correct in semantics, implementation, and in error generation) that are broken due to a misdesign/mis-implementation somewhere.

With regard to her commentary on CPAN, one of the more annoying things I've dealt with in many libraries is their choices of error return. Some will throw exceptions. Some will return real errors you can process in-line. I am not a fan of throwing exceptions, and when I build larger Perl apps, I tend to insert some boilerplate exception handlers specifically due to the burns I've encountered in the past when modules do dumb things.


Rachel seems to trip over the concept of Division of Labor[1] to a point. There is arguably more net gain from not having everyone reinvent OpenSSL just to communicate.

Or maybe she's more against being indiscriminate about re-use.

But there just aren't many companies big enough to engineer their own complete solutions.

[1] https://en.m.wikipedia.org/wiki/Division_of_labour


> I will never know why the team chose to handle my report by swallowing the error instead of dealing with upstream, but that's what happened.

The team avoided dealing with upstream in exactly the same way that you avoided dealing with upstream. What's so difficult to understand?


Honestly, you have this problem because you put up with it, or are "forced" to use certain software. Why is it that every time I venture into Python, interesting language aside, there are always multitudes of dependency problems, even ones tied to the OS? People put up with it. Choosing quality over fast-food is a choice. Languages such as Go have largely solved much of this while still being pretty portable, and devs are encouraged to minimize deps.

Of course fast-food is tempting and may taste good, but after a while of abstention, this fades away. But it's a choice, and one might miss some opportunities.


Python is really easy to use. I don't get what the trouble is.

    git clone https://example.com/myproject
    cd myproject
    /usr/bin/python3 -m venv env
    source ./env/bin/activate
    pip install -r requirements.txt
    python3 src/myapp.py


A bit beside the broader point of the article, but...

> They responded. What did they do? They made it catch the situation where the dotfile read failed, and made it not blow up the whole program. Instead, they just carried on as if it wasn't there.

The tone here (and in subsequent paragraphs) seems to suggest that this is somehow the "wrong" answer.

Yet, it's exactly what I would've done, since not doing this is clearly wrong. The file's corrupted with no hope of recovery, is not apparently meant to be user-serviceable (so user configuration is highly unlikely to be at risk), and is evidently not essential for the proper operation of the program (nor does it, in general, hold anything especially important - and no, some counter for when to nag about updates ain't important in this situation). There is absolutely no reason why this file's corruption should prevent normal operation; therefore, gracefully catching this error and expanding the "file doesn't exist" case to "file doesn't exist or is otherwise corrupt/unusable" and proceeding normally is absolutely the right call.

In an ideal world, they'd fix the actual corruption, too, but preventing that corruption from being an issue is the first and most critical step. I'd hardly call making software less fragile (said fragility being exactly why so many "80%" libraries are indeed 80%) a "missed opportunity"; indeed, not doing this seems like a glaring missed opportunity to make the software more robust against issues far beyond those caused by the limitations of some library.

That is:

> I will never know why the team chose to handle my report by swallowing the error instead of dealing with upstream

Because not swallowing the error would be a patently broken design, regardless of whether or not there was some library involved.
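
In code terms, the change presumably amounts to widening the error handling around the state file. A rough sketch, not their actual code (the file name and defaults are made up):

    # Sketch: treat a corrupt state file the same as a missing one and fall back
    # to defaults instead of crashing. File name and defaults are hypothetical.
    import json

    DEFAULT_STATE = {"last_update_check": 0}

    def load_state(path: str = ".toolstate") -> dict:
        try:
            with open(path) as f:
                return json.load(f)
        except (FileNotFoundError, ValueError):
            # Missing, truncated or otherwise unparsable: carry on with defaults.
            return dict(DEFAULT_STATE)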

----

EDIT:

> Yeah, that's right, because I worked for G or FB or whatever, somehow any time I have a problem with something, it's because I'm trying to do something at too big of a scale? Are you shitting me? COME ON.

I mean, yes? The situation described is very obviously one where the author is trying to use a tool clearly not designed for massively-parallel execution for a task involving massively-parallel execution. Why said author is somehow surprised that said tool breaks when put under that sort of stress is beyond me.

It's akin to buying a 1 ton jack from the auto parts store and then complaining to the minimum-wage employees thereof because when you tried to use it to change the treads on a 50-ton bulldozer the jack predictably flattened like a pancake.

There's always a point where off-the-shelf parts won't cut it and you gotta roll your own solution in-house. Evidently the author has a tendency to hit that point sooner rather than later. That's pretty rare, though, and condemning the very notion of an external library because of said libraries failing under very exceptional circumstances seems absurd.

That is: no shit most libraries are written for the 80%, because the 80% ain't got the resources or motivation to deal with the consequences of NIH syndrome like the 20% might. They gotta ship something, and can't afford to let perfect be the enemy of good.

The more reasonable approach (for those currently in the 80% but hoping to eventually be in the 20%) is to start with those off-the-shelf libraries, and then be prepared to fork 'em and use them as starting points for specialized approaches (or else swap them out with specialized replacements). In this regard the author's correct in that the package ecosystems of most languages are poorly suited to this, since they offer little to no mechanism for customizing dependencies without rather painful workarounds.


> I show up with a problem ("hey, this thing keeps getting corrupted because X and Y") and suddenly it's because I'm "from" G or FB or something and I "want unreasonable things" from their stuff. So, my request is invalid, thank you drive though.

No, the answer is they're trying to deliver impact for the customer and Rachel is instead asking them to invest time in a solution that doesn't bring them any closer to that. I'd imagine most of them would be fine with fixing the root cause, but for some fucking reason everyone from Google or FB feels the need to reinvent every wheel in order to peacock and show that their IQ is the highest.

With that said, at my "normal people" non-bourgeoisie company the 3P libraries are all converted to the internal build system. If there was a fuckup of this magnitude, someone would just create a branch and bump the version number with the fix. Problem solved.


This is a pretty bad take. Rachel has said implicitly that there was no version with a fix available and explicitly that working with the maintainers to get a fix is often unreasonably difficult. This matches my own experience fairly well. So your ‘normal people’ solution won’t work.

You may also want to consider that there are good reasons for engineers and engineering management to be default suspicious of 3rdparty dependencies at large companies such as those you’ve listed. These reasons have nothing to do with peacocking or demonstrating high IQ (replicating things available elsewhere is not a way to demonstrate your intelligence, it turns out). They have much more to do with the high bar for security consciousness, unique need to deal with scale or low performance tolerance, or extreme organizational risk associated with being a company with a target on your back for every major hacker, researcher, regulator, journalist, or self-proclaimed watchdog — not to mention operating under several consent decrees already.


+1

The issue gets even harder when the code is proprietary. Had a situation with one of VMware's products that was being provided to us by a SaaS provider. It was a fair amount of effort to a) convince the provider to file a bug with their vendor and b) then provide enough data to the VMware engineers so they could understand the value in prioritising a fix.

In our case, we identified the issue during a proof of concept so our vendor pushed VMware pretty hard because there was a sizeable contract at stake for them. Most engineers aren't as thorough in my experience.



