One concern I have is that, every time he talks about the maintenance roles, I'm automatically thinking it's unglamorous, and marking a career plateau (perhaps decline).
I also instantly visualized a suited executive reluctantly deciding it's a necessary role they have to staff, and the exec will feed the trolls in the mine, but the focus (and rewards) will consciously be on the stars who are making new things happen.
Even though Graydon just explained that kind of thinking is a problem, I'm still thinking it.
If I'm still thinking that (with background that includes very serious software engineering, as well as FOSS), then my guess is that a lot of other people will be thinking that, as well.
BTW, I'm not saying that I'd personally devalue maintenance roles. If I was managing something important that needed to be maintained, and I lucked upon a stellar maintainer, I'd do everything I could to retain them and keep them happy, including making a case for why their comp should track with some of the new-product-star people.
I'd also try to make sure that, if their position declines/disappears (e.g., they no longer have someone advocating successfully for the role) that they'll be marketable elsewhere. (I don't want them ever walking into an interview and hearing, "I see from your resume that you're more of a maintenance programmer, but we really need people who can hit the ground running, banging out new huge kernel modules. Maybe you could assist them, by writing unit tests, and fetching their coffee, so they can focus on the challenging new stuff?")
One sign of hope is that the best-known worst offender at rewarding only new things at least takes some aspects of reliability seriously.
> "I see from your resume that you're more of a maintenance programmer, but we really need people who can hit the ground running, banging out new huge kernel modules. Maybe you could assist them, by writing unit tests, and fetching their coffee, so they can focus on the challenging new stuff?"
Is "new stuff" easier work than maintaining old stuff? I always thought it was the other way around. I always give greenfield stuff to the juniors. They fuck it up, I fix it. It's basically like using ChatGPT.
I agree. Maintenance work is hard. Writing new code from scratch is easy.
At work we put the smartest, most experienced people on maintenance work, because it is the lifeblood and cash flow of the company. It directly pays the salaries of people working on new software that might never make a profit!
> I'm automatically thinking it's unglamorous, and marking a career plateau (perhaps decline).
This may be a bug in your thinking, although the lack of organizational glamor checks out. The new shiny often gets the most attention, often because it's trying to cross the gap from non-existence and out of mind to mind share and adoption, so it gets the marketing budget (literal, but also emotional and attentional). However, the start of a project is usually far simpler, less constrained, and easier. Most of the truly hard trade-offs and the impacts of decisions don't come due until later. Not only that, but the deep learning comes from observing these outcomes and starting to understand how the decisions made in the beginning come together.
IMO the number one thing that helps a maintainer's resume is to say that they brought in some amount of revenue, and that all their bug fixes (especially putting out the fires started by other engineers doing features) saved the company a dollar amount or guaranteed a dollar amount of ARR.
The trick is that in many cases the value delivered is invisible and unmeasurable. How do you quantify “time saved by not having bugs”? But that is what great maintenance does. Or the same for “time saved by a really well-designed API that makes it easy to do the right thing and hard or impossible to do the wrong thing”? Again: not measurable! “Just put a number on it” is the kind of facile response I consistently get from too many folks in management when trying to have these kinds of discussions, and the annoying-but-inescapable reality is that it is not always possible to provide a monetary number on the value of this sort of work. Despite that value often very likely netting out in the millions or more every year!
> ... it is not always possible to provide a monetary number on the value of this sort of work. Despite that value often very likely netting out in the millions or more every year!
Hm. You first state that it is not possible to provide a monetary number, then you state it is very likely netting out in the millions, which is providing a monetary estimate.
If the value of one thing is somewhere in the range of $1-$100, then its value is hard to quantify.
But if you have one million of those things, you can still say "it's very likely I have value in the millions of dollars or more here".
The same logic applies here. All that has to be true for "we have millions here" to be plausible is that (1) the value of each individual, unquantified contribution is positive and probably >$1 (2) there are probably millions of such contributions. You don't have to be able to quantify with any precision any individual contribution.
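As a rough sketch of that bounding argument (the function name is made up for illustration, and the $1-$100 range and the one-million count are just the figures from the example above, not measurements of anything):

```python
def total_value_bounds(count, low_each, high_each):
    """Bound the total value of `count` contributions.

    Each contribution is only known to be worth somewhere between
    `low_each` and `high_each`; no individual value is ever pinned down.
    """
    return count * low_each, count * high_each


# One million contributions, each worth somewhere between $1 and $100.
low, high = total_value_bounds(1_000_000, 1, 100)
print(f"total is between ${low:,} and ${high:,}")
```

The per-item uncertainty spans two orders of magnitude, yet the lower bound on the total is already a million dollars.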
Then put on the imprecise number? If you are looking for precision in estimates for the impact of projects you worked on, the vast majority of hiring managers reading resumes won’t care. They’re already going to be mentally sorting impact into broad buckets.
This is a totally reasonable response! So let me elaborate a little on how these things can be true at the same time.
1. Imagine a scenario where there are two versions of an API: one is bug-prone, the other is “correct by construction”—you literally cannot call it the wrong way.
2. Assume that for some percentage of the “invalid” calls to the bug-prone API, the result is something that ends up going wrong in production and taking 3 developers an hour to resolve. (This kind of level-of-effort is not at all unusual in my experience dealing with on-call at both a mid-sized startup and at the scale of LinkedIn!) Let’s call it 10% to pick a reasonably small number: only 1 out of 10 bad invocations of this API puts us here.[1]
3. Assume the API is fundamental to some key library (a JS framework you use, for example), so the calls are proportional to the size of the code base. Again, pick a fairly low number: 1 mistaken call every 1,000 lines of code. If we are looking at LinkedIn’s front-end, that puts us on the order of 100 of these that actively cause this problem (over a million lines of code with a 0.1% “hit” rate and a 10% “blows up” rate).
4. Further take an average developer compensation of $150,000/year. (This is low for big tech, but again, it gives us a useful baseline.) This is ~$75/hour.
Put those together, and you’re talking about 100 incidents × 3 developers × 1 hour/incident × ~$75/hour/developer = $22,500. That’s one repeated bug over the lifetime of the program in question.[2] That excludes the other potential business costs there: what happens if that also impacts revenue in some way, say because it prevents sales, means lost ad revenue, or results in an SLA violation?
Add that up across the whole surface area of a codebase (dozens and dozens of bugs, across however many users and lines of code) and you’re talking real money. A million dollars is just 45 or so of those kinds of bugs with similar “blast radius” and occurrence rate. This is the kind of rough mental math that leads me to talk about “netting out in the millions” benefit-wise. Thus far you could imagine “putting a number on it”.
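For anyone who wants to poke at the assumptions, here is a minimal sketch of that back-of-the-envelope math. The function names are hypothetical, and the 2,000 working hours per year is my assumption for turning $150,000/year into ~$75/hour. The 0.1% “hit” rate, 10% “blows up” rate, 3 developers and 1 hour per incident are the illustrative figures from the steps above, not measured data.

```python
import math

WORK_HOURS_PER_YEAR = 2_000  # assumption: roughly 50 weeks x 40 hours


def hourly_rate(annual_comp):
    """Convert annual compensation to an approximate hourly cost."""
    return annual_comp / WORK_HOURS_PER_YEAR


def expected_incidents(lines_of_code, hit_rate, blow_up_rate):
    """Bad calls scale with code size; only some of them blow up in production."""
    return lines_of_code * hit_rate * blow_up_rate


def cost_per_bug(incidents, devs_per_incident, hours_per_incident, rate):
    """Developer time spent on one recurring bug across all of its incidents."""
    return incidents * devs_per_incident * hours_per_incident * rate


def bugs_to_reach(target_dollars, per_bug_cost):
    """How many bugs of this size it takes to add up to the target amount."""
    return math.ceil(target_dollars / per_bug_cost)


rate = hourly_rate(150_000)
incidents = expected_incidents(1_000_000, 0.001, 0.10)
cost = cost_per_bug(incidents, 3, 1, rate)

print(f"~${rate:.0f}/hour, ~{incidents:.0f} incidents, ~${cost:,.0f} per bug")
print(f"bugs needed to reach $1M: {bugs_to_reach(1_000_000, cost)}")
```

Every input is a knob: change the hit rate or the number of developers pulled into an incident and the total moves proportionally, which is why the result is only good to an order of magnitude.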
Where it goes wrong is: with the good version of that API, the bug never happens. There is nothing to measure, because our reasoning has to deal entirely in counterfactuals: “What would it have cost us if we had a bug in this particular part of the framework?” But you can do that ad infinitum.
More or less every part of a library can be more or less buggy, more or less easy to maintain, more or less amenable to scaling up to meet the needs of an application which uses it, more or less capable of adding new capabilities without requiring you to rewrite it, etc. The part that is impossible to measure is the benefit of all the “right” decisions along the way: the bugs you never saw, indeed never even had to think about because the API just made them impossible in the first place.
Nor can you measure “this API is easy to use and never breaks my flow” vs. “I spend at least a minute looking up the details every time I have to use it… and whoops, now I’m on Reddit because I switched to my browser from my code editor”. Nor can you measure the impact of “This API makes me angry” vs. “This API makes me actively happy” on velocity. The closest you get are proxy measures like NSAT surveys which tell you how developers feel overall and interviews where you can ask them what their papercuts are; but neither can be translated into dollar values in a meaningful way. And “putting on the imprecise number” (as a sibling comment down the thread suggests) is impossible for these kinds of things: there is no number.
[1]: Lest you think I am gaming this, I have real APIs we really deal with in mind which are so error prone that we deal with bugs like this from that specific API at least once a month.
[2]: Off the top of my head, I can think of half a dozen APIs we use very actively in production which have these kinds of problems. I have eliminated a fair number of them in my tenure, but demonstrating the impact is… well, see above.
I am still not convinced that your examples show that it would be unreasonable to estimate their monetary impact.
Since a company is an organic whole, every functional part of it would net out in the millions if considered in isolation. However, such a perspective is usually without practical significance, as it is not linked to concrete business-relevant scenarios. If there is no risk[1] that something that functions properly could fail, then the costs associated with such a failure never occurring are irrelevant.
I would also concede that many things cannot be estimated accurately or might be very hard to estimate. But in my experience, the really difficult decisions are those that relate to new big and complex things, such as what technology to use for an innovative product. Evaluating whether it is worth improving a specific detail of an existing application is most of the time far less difficult.
Let me give you an example from my current work: It is a business application to process customer enquiries that result in an offer for a tailored product. In a specific scenario we know that we can process 5 enquiries per hour. The goal is to process 6, an increase of 20%. There are about 6,000 enquiries per month, meaning a saving of 200 staff-hours per month. The hourly costs for software development are about 4 times the costs for the staff using the application. That means that for every 50 hours it would take me to reach that goal, the break-even point would move by one month. I estimated that I could reach the goal by putting between 100 and 150 hours into it. This precision was enough to get the green light from the management.[2] And management does not really care how I reach that goal in detail (by improving the performance of the database, by reworking the user interface, by using better templates as a basis for the tailored products, ...). And even if my estimates were off by a factor of 2 or 3, it would still be worthwhile to attempt the improvement.
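The arithmetic behind that estimate can be sketched in a few lines. The function names are hypothetical and the inputs are just the figures quoted above (5 to 6 enquiries per hour, 6,000 enquiries per month, developer hours costing about 4 times staff hours); this is a rough model, not how the estimate was actually produced.

```python
def staff_hours_saved_per_month(enquiries_per_month, old_per_hour, new_per_hour):
    """Staff hours freed each month by raising throughput per hour."""
    return enquiries_per_month / old_per_hour - enquiries_per_month / new_per_hour


def break_even_months(dev_hours, saved_staff_hours, dev_to_staff_cost_ratio):
    """Months until the development cost is repaid by the staff time saved."""
    dev_cost_in_staff_hours = dev_hours * dev_to_staff_cost_ratio
    return dev_cost_in_staff_hours / saved_staff_hours


saved = staff_hours_saved_per_month(6_000, 5, 6)
print(f"staff-hours saved per month: {saved:.0f}")

# The 100-150 hour estimate, plus the upper estimate overrun by a factor of 3.
for dev_hours in (100, 150, 450):
    months = break_even_months(dev_hours, saved, 4)
    print(f"{dev_hours} dev hours -> break-even after {months:.1f} months")
```

Because the monthly enquiry volume is a parameter, the same two functions show how quickly the case weakens when the volume drops.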
Regarding your case about the quality of an API, I cannot see a fundamental difference to the case I just described. Set some time and/or quality goals for the improvement of the API and attach a reasonable price point to everything. Then see whether it makes economic sense at all, whether something else promises a better return on investment, or if this is the best thing to do now.
Finally, I would like to emphasise that the correct thing to do is never purely a question of technology. Notice, for example, how the assessment of the case I described above changes with the number of enquiries per month. Were there only 1,000 enquiries per month, the break-even point would be 6 times further away, which means that there would probably be a lot of other areas more worthwhile for the company to invest its money in (and not all in IT).
[1] More precisely, the risk is seen as marginal or irrelevantly small, or if it occurs there is no way to manage it anyway (a meteorite hits the factory), or circumstances are so fundamentally changed that the entire business model is called into question.
[2] Actually it was the other way round: The management came up with the idea to improve the process by 20% and already had some suggestions for how to do it. Then I looked at it and gave my rough estimates and my own suggestions for what could be done.
> I'm automatically thinking it's unglamorous, and marking a career plateau (perhaps decline).
I think that to do serious maintenance work one would have to spend a lot of years in the industry/company/project etc. If people are motivated only by glamorous work and ever-increasing career growth, they don't seem like great maintainer material to me.
I apply this to myself in a narrow way: I support a long-running in-house enterprise product which I have been more or less maintaining for many years. I just do not feel motivated by a glamorous role or by climbing the greasy corporate pole.
> One concern I have is that, every time he talks about the maintenance roles, I'm automatically thinking it's unglamorous, and marking a career plateau (perhaps decline).
For the rare and lucky few, it could be a stepping stone to product development. However, I've never seen someone jump from product development to a maintenance role. I think most developers would view it as a demotion and be insulted by it.
> including making a case for why their comp should track with some of the new-product-star people.
Never. The development team knows far more about the product and the business aspect of it than the maintenance team. If there is an issue that maintenance cannot handle, they reach out to the development team for a reason.
Companies aren't huge on paying open source developers in the first place. From their perspective, the primary benefit from open source software is free labor.
I can't imagine they'll be enthusiastic about paying open source maintainers.