Software projects need heroes? lessons learned from 1100 projects

codemac · on Sept 22, 2021

> Accordingly, this paper explores the effect of having heroes in project, from a code quality perspective by analyzing 1000+ open source GitHub projects.

Oh gross. Open source projects?

And the paper's sanity checks are:

    - one pull request.
    - more than 20 commits.
    - at least 50 weeks of activity.
    - more than 10 issues.
    - at least 8 contributors.
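For a sense of how low that bar is, the five checks can be written as a single filter. This is only a sketch: the `RepoStats` class and its field names are made up for illustration (the quoted criteria do not come with a schema), and "one pull request" is read here as "at least one".

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    # Hypothetical field names, chosen for this example only.
    pull_requests: int
    commits: int
    active_weeks: int
    issues: int
    contributors: int


def passes_sanity_checks(repo: RepoStats) -> bool:
    """Apply the five thresholds as quoted above."""
    return (
        repo.pull_requests >= 1      # "one pull request", read as at least one
        and repo.commits > 20        # more than 20 commits
        and repo.active_weeks >= 50  # at least 50 weeks of activity
        and repo.issues > 10         # more than 10 issues
        and repo.contributors >= 8   # at least 8 contributors
    )


# A repository sitting exactly on every threshold still qualifies.
print(passes_sanity_checks(RepoStats(1, 21, 50, 11, 8)))
```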

Secondly, the majority of the projects they looked at have the majority of their code as shell:

    Language Projects
    Shell 416
    JavaScript 396
    HTML 344
    CSS 314
    Python 291
    Makefile 229
    Ruby 216
    C 167
    Java 150
    PHP 146
    C++ 126
    Batchfile 81
    Perl 67
    Objective-C 67
    Dockerfile 54
    CMake 48
    M4 43
    CoffeeScript 38
    Roff 35
    Roff 35
    C-Sharp 34
    Emacs Lisp 30
    Gherkin 26
    Perl 6 21

And yes, they list Roff twice. This paper is really saying "In open source rando libraries on github, usually there is one person who does the majority of the work".

The hero problem most people talk about in software engineering is heroes at paid jobs, working on codebases with 2000+ commits, in teams of 10s to 10000s. I'm not sure this paper provides any reasonable insights into how we work together.

ipaddr · on Sept 22, 2021

The type of project most often found is Shell. The majority of code is something else.

The heroes can be the one or two people on the project who make it happen. It doesn't need to be 10,000 developers; I would expect different silos to emerge with a group that large.

codemac · on Sept 22, 2021

Yes, thank you for correcting my error - I should have said the mode rather than the median.

Right, but the "hero" as discussed by most folks is only a problem if the team expects that hiring more software engineers will speed up execution, even sub-linearly. With a "hero" this is extremely difficult to manage.

I see this as an interesting problem in software development, and there are many more interesting discussions about this.

However - this paper presents a pile of open source repositories (not even necessarily projects/products) that accepted commits/patches from others. Open source must operate differently as all contributions are voluntary. If someone wants to be a hero, great, move the project forward. Hence all of these BDFLs: the projects can't even imagine themselves without their initial hero. Projects that don't look like this tend to have paid developers (rust, go, fuchsia, chromium, firefox) as opposed to the projects with heroes.

This paper starts with "I found a data set, lemme try to fit it to my question" rather than "is this data set relevant for my question?"

perl4ever · on Sept 22, 2021

>The type of project most often found is Shell. The majority of code is something else.

The majority of projects is also something else, by my estimation.
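A quick back-of-the-envelope check, assuming the total is roughly the 1100 projects from the title (the exact count is not given in the thread). The per-language counts are copied from the table quoted above; only the top five are needed here. Note that the full table sums to well over 1100, so a project is evidently counted under every language it contains.

```python
# Assumption: about 1100 projects in total, taken from the title.
TOTAL_PROJECTS = 1100

# Counts copied from the table quoted earlier in the thread (top five rows).
language_projects = {
    "Shell": 416,
    "JavaScript": 396,
    "HTML": 344,
    "CSS": 314,
    "Python": 291,
}

top_language = max(language_projects, key=language_projects.get)
share = language_projects[top_language] / TOTAL_PROJECTS
is_majority = share > 0.5

print(top_language, f"{share:.1%}", is_majority)  # Shell 37.8% False
```

So under that assumption Shell is the most common entry, but it shows up in well under half of the projects.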

OneTimePetes · on Sept 22, 2021

The heroes you mentioned then go home and develop a "clean" implementation in open source, just to have that moment where their own management suggests switching to an open source library. Nothing more satisfying than seeing someone smug own himself without realizing it.

literallyaduck · on Sept 22, 2021

In school they assign group projects to teach students that given a group one sucker will be stuck with all the work. That sucker is the one that cares the most to get a good grade.

At work it can be the developer who is pissed off the most about the code quality causing development pain or just naive enough to believe they might be rewarded if they care about the code.

In open source it is the pet project owner that often wants to see the project succeed.

It all comes down to who cares the most.

If you care you lay down industry best practices as the fundamentals, you tend the garden of knowledge and encourage pride in making your code the best. You mentor junior devs and put gates around your code to protect it. Fill your seats with developers who are willing to learn, challenge one another to greatness and when you find a hero, don't let them commit again until they have mentored two devs to take on the mantle.

oreally · on Sept 22, 2021

> If you care you lay down industry best practices as the fundamentals

Gonna stop you right there. Ever consider industry best practices as not for everything and anything? 'The best' is very subjective.

sokoloff · on Sept 22, 2021

I’m skeptical anytime someone’s first argument about why to do X is “it’s best practice”. When their second argument is also “it’s best practice” that skepticism deepens significantly.

dudeinjapan · on Sept 22, 2021

Yes and no. "Best practices" (if they are indeed best practices, e.g. SOLID) one violates at their own peril. I liken it to an apprentice learning to paint. Once one has mastered the rules of the game, then one can become Picasso and know which rules to break.

christkv · on Sept 22, 2021

Completely agree. "Best practices in our industry" is usually just used to dodge responsibility, or as a cudgel to impose your view on others.

refactor_master · on Sept 22, 2021

> they assign group projects to teach students that

That’s a pretty curious take. But in some way, it makes a lot of sense.

kibbleble · on Sept 22, 2021

I can't remember where I heard about this, but isn't there some kind of "surgeon model" where you have one star player do the majority of work, and everyone else is support staff? It's a pattern that emerges, and not a bad thing to encourage, as long as it's acknowledged. Everyone has their strengths and weaknesses. But in college, usually everyone has to be graded equally (akin to everyone being paid equally). Administrative and support work is still important, because without it, a surgery or a project won't succeed. Kinda like the 10x dev thing, but for group dynamics in general.

pm215 · on Sept 22, 2021

That's the Fred Brooks "chief programmer" model that the post mentions in the first paragraph. Brooks describes it in one chapter of _The Mythical Man-Month_, which is one of those rare cases of a book in the tech field that's still interesting and readable decades after it was written. If you haven't read it yet and you like that kind of book that's about the underlying principles of the craft of writing software rather than specific technical detail, I recommend it.

sergius · on Sept 22, 2021

This blog post is a short read on the book: https://medium.com/@dvxwang/book-lessons-the-mythical-man-mo...

burntoutfire · on Sept 22, 2021

That's an exaggerated take. I think schools assign group projects to students so that they can learn all kinds of group dynamics, mooching off of harder working people being merely one of them.

a9h74j · on Sept 23, 2021

When I first saw that a local tech school was emphasizing group projects, my thought was: "Great. Now they will have ten additional years to tire of corporate-style work, even before they start."

bryanrasmussen · on Sept 22, 2021

well it's a cynical, misanthropic take on how the world works, but I would be interested in seeing anyone finding a manual on teaching suggesting this as the lesson to be learned from group work.

vasco · on Sept 22, 2021

It accurately represents at least 60% of the group projects I was involved in at school, though usually the rule wasn't 1, but 2 people that cared with most others not caring. What it taught me is that the most important thing to be successful in group work is to be able to select the group members.

OneTimePetes · on Sept 22, 2021

Choose those with a panicked look in their eyes, who want to limit scope when asked and ask ahead of time how "last year's projects went".

Ask who is willing to do documentation, do not choose anyone volunteering for that.

bryanrasmussen · on Sept 22, 2021

yeah, I think that is closer to what it teaches. I think there will be at most two or three top people in the project, and then some lesser contributors who are skating - maybe because they are not that interested in that project and they are the leads in some other project that is running concurrently.

watwut · on Sept 22, 2021

It is an accurate description of group projects in school.

Real teamwork on the job does not work that way, unless management sux. When management checks out and stops managing/leading, it can descend there too.

timpattinson · on Sept 22, 2021

If you're gonna have the cynical take on group projects, take the one that makes sense.

They require a fraction of the resources to assess.

imtringued · on Sept 22, 2021

The problem is that work is lumpy. Sharing work is difficult. To split work between two people, they both need to be equally skilled and informed about the task; otherwise the faster person will sit idle and take work from others, leading to the appearance of a superstar when the truth is that everyone could have done their part, even if slightly less efficiently.

l0b0 · on Sept 22, 2021

I only read the article, not the paper, but aren't there a few issues/other plausible explanations? Off the top of my head:

- The vast majority of GitHub projects probably have only a single contributor. Hopefully this study excluded such projects. Edit: A quick survey of the paper doesn't clarify this - there's a plot showing number of developers which bottoms out somewhere below 20.

- Among the projects which have more than one contributor, the vast majority again probably has only one contributor who can be reasonably said to understand the whole project. They wrote it from scratch, and nobody else contributed until it became useful to a more general audience.

Rather than concluding that "heroes" should be supported to communicate more/better, wouldn't it be reasonable to conclude that each project is so different from every other project (there are such a huge number of tools and languages to choose from, after all) that any minor contributors are highly unlikely to be familiar with the project? Could we not from that draw the much more pedestrian conclusion that people unfamiliar with a project produce buggy patches?

"Hero developers are far less likely to introduce bugs into the codebase than their non-hero counterparts. Thus having heroes in projects significantly affects the code quality." This doesn't follow. You are saying that in hero projects heroes introduce fewer bugs than non-heroes. You can't conclude anything about the difference between hero and non-hero projects from that.

That said, I would hope that the communication tools become better in the future for all the contributors, so that newcomers and experts alike can more easily understand each other and the project, leading to better communication and contributions.

kesor · on Sept 22, 2021

As others pointed out, the term Hero is not used in a conventional way in this article. Usually when the Hero Developer is portrayed as a problem, it means there is one person who takes it upon himself alone to solve all problems, without consulting anyone else in the project or delegating responsibilities to them. Then that person is praised for it by management, growing the developer's ego and belittling everyone else in the process.

On the other hand, the term Hero in this article and research refers to a developer who is most involved with other people in the project and has the most communication with everyone else, thus narrowing communication gaps and reducing misunderstandings. Reminds me a lot of a hack against Conway's Law. This person also happens to do most of the work, but that might as well be accidental, since this person (in these github projects) is the nominated responsible person for that project.

And the bias for "publicly available open source projects on github" is just horrible. It is like inspecting the color of bananas and then proving that avocados are in fact yellow.

louthy · on Sept 22, 2021

Purely anecdotal, but on one of my open-source projects [1] I have around 60 contributors, most have done small things like fix defects, or fixed up inconsistencies. Some have contributed to a larger sub-project (which I drove).

In my case I don't see myself as a 'hero', I just see myself as the person motivated to build it. Probably all of the bugs were caused by me, mostly because I wrote nearly all of it.

In terms of interactions (comments on Issues), again it's mostly me because I know it and built it.

That's not to say others don't help or contribute, this project seems to 'fit the numbers', but I don't see how this is in any way insightful, it's just obvious for open-source projects. To a certain extent I see the people who contribute as the real heroes, as they're not getting much out of it personally: it's me who gets all the kudos for the project, not them.

Some insight into how this translates into projects within real businesses would be much more useful. Anecdotally again, I do see the pattern of the hero in my organisation (and have in previous companies I've worked at).

Several members of my team are real force multipliers and could easily be called heroes, and to a certain extent they receive that hero worship within the organisation (which can be good and bad). I'd say we probably wouldn't have survived as a business without them being exceptional or going the extra mile.

[1] https://github.com/louthy/language-ext

H8crilA · on Sept 22, 2021

Based on my very limited experience this could be true.

But what I'm sure of is that you don't want to be the hero, unless you can participate in the upside on equity-like terms (unbounded), capturing a significant fraction of the upside for yourself. Doesn't necessarily have to be monetary, starting a large and successful open source project is a great achievement.

Otherwise just close your laptop/go home and Let It Fail.

WesolyKubeczek · on Sept 22, 2021

What usually happens in those bigger projects is that for every distinct piece of functionality there is a person, let's say Igor, that knows it inside out and to whom everybody comes when they have a question. Sometimes one Igor is in charge of several subsystems, or there is one Igor per subsystem, but I'm yet to see a subsystem that has more than one Igor with a real, significant, working knowledge of it.

How exactly Igors come in possession of such knowledge is irrelevant to this empirical fact. Some may build up security by obscurity and make themselves indispensable; some might have accrued the knowledge by either being the one who designed it, or having had to debug its guts because a change had to be made. An Igor can be a bottleneck or a godsend. The defining trait is that whenever there's a weird issue in a subsystem or more than a cosmetic change needs to be made, everyone goes, "let's ask Igor" or "let Igor do it". Again, doesn't matter if Igor is good or evil.

Even when Igor is open towards the idea of knowledge sharing, even eager about it sometimes, everyone else is lukewarm at best, and all non-Igors just hope they don't replace Igor at his job. It's just damn convenient to have Igor around. When Igor departs the company, knowledge transfer sessions usually consist of Igor working hard to explain everything, and everyone else quietly hoping that somebody else will listen and take notes. No one will remember where the notes are, though, and where the recordings of the sessions are stored, or whether they indeed are made and stored at all.

When Igor gets hit by a bus, even the most documented system he was tending to will likely be replaced by something else, because reading is too much work already, and comprehension is even more so. There also may be an emerging larva of Igor who wants to put some cred onto his résumé, so he will obviously want to redesign everything in the coolest technology du jour.

mtVessel · on Sept 22, 2021

A very apt description. IDD (Igor-driven development) is probably the most common software engineering methodology in use, at least in the enterprise.

WesolyKubeczek · on Sept 22, 2021

Igor-Driven Development. I think the late Sir Terry Pratchett would be proud of the concept, indeed. :-)

a9h74j · on Sept 23, 2021

Threat model: IHBBB -- Igor hit by big bus.

lowdownfork · on Sept 22, 2021

So far, I've only read the abstract, but I see two problems with this approach.

One is that they are looking specifically at open source projects, which, as other comments have pointed out, can technically meet their definition of a "hero project" without anyone engaging in the types of heroics normally associated with that term.

The other problem is that they talk about comparing hero developers to non-hero developers, when they should be comparing hero projects to non-hero projects.

If hero projects (as the paper defines them) are so common, then the pool of non-hero developers is going to include a lot of non-hero developers working on hero projects. Of course those people make more mistakes than the heroes, they're less familiar with the project. But those mistakes were made following the hero project model and should count against it, not for it.

infogulch · on Sept 22, 2021

Interesting:

>> In our data, commits with lower defects come from the small number of hero developers who have learned how to talk to more people.

This makes me think of a RustConf 2021 talk that was just published, 'Compile-Time Social Coordination' by Zac Burns [0], where he discusses coordination problems that can come up when you have developers who work on nearby code but don't communicate directly, and how you can encode what are normally 'socially enforced' norms and patterns into the type system to enforce correctness at compile time.

[0]: https://www.youtube.com/watch?v=4_Jg-rLDy-Y

chrislusf · on Sept 22, 2021

As a main contributor to an open source project (https://github.com/chrislusf/seaweedfs), I can confirm that this finding is so true.

However, it seems this research did not look into Apache projects, which basically maintain a different culture to encourage more contributors, so much as to encourage the main contributors to refrain from jumping to solve an issue until another person steps in first.

Fragoel2 · on Sept 22, 2021

There are some related research works (that do not seem to be cited in the article) that focus on estimating a project's Truck Factor, which is "the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated." A good starting point for those interested is [1], which is also the source of the quote in the last sentence.
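To make the definition concrete, here is a rough sketch of one way such an estimate could be computed. It is a deliberately simplified greedy heuristic, not the method of [1]: it assumes a hypothetical mapping from file paths to the set of developers who know each file, and it treats the project as "incapacitated" once more than half of the files have nobody left. Both the input format and the 50% threshold are assumptions made for this example.

```python
from collections import Counter


def truck_factor(file_authors: dict[str, set[str]], threshold: float = 0.5) -> int:
    """Greedy estimate: keep removing the developer who covers the most files
    until more than `threshold` of all files have no remaining author."""
    # Copy the sets so the caller's data is not modified.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    removed = 0

    def orphaned_share() -> float:
        return sum(1 for authors in remaining.values() if not authors) / total

    while total and orphaned_share() <= threshold:
        coverage = Counter(dev for authors in remaining.values() for dev in authors)
        if not coverage:
            break  # nobody left to remove
        top_dev, _ = coverage.most_common(1)[0]
        for authors in remaining.values():
            authors.discard(top_dev)
        removed += 1
    return removed


# Hypothetical project: one developer is the only author of three of four files.
example = {
    "core.py": {"igor"},
    "db.py": {"igor"},
    "api.py": {"igor"},
    "docs.md": {"ana"},
}
print(truck_factor(example))  # 1
```

A result of 1 is the "hero project" shape most of this thread is describing: lose one person and most of the code has no one who knows it.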

Interestingly, these works suggest that having a small number of developers that are critical for the project is risky and should be avoided, opposite to the conclusions of the paper discussed in the article.

Moreover, both Heroes and TF seem to me as consequences of the same problem: the inability of current technologies/methodologies to enable a seamless sharing/spreading of the knowledge acquired by veteran developers, that have been working for a long time on the project, with less experienced ones. I believe the shift to agile methodologies has made this problem worse if anything.

[1] http://aserg.labsoft.dcc.ufmg.br/truckfactor/avelino87423.pd...

crocal · on Sept 22, 2021

The point seems missed to me. Like any serious collaborative undertaking (movies, buildings, complex systems development, scientific collaborations, etc), successful software needs a unique person to carry the vision of what is the expected result and do what needs to be done to achieve this result, including organizing the work. Call him or her a director, an architect, a chief scientist, a hero… He/she will always be needed.

It’s interesting to note that these people can also become the doom of the project. This is why there is no single easy recipe for picking the right person.

jffhn · on Sept 22, 2021

Reminds me of the Surgical Team from The Mythical Man-Month, and of an interview of I think one of the authors of the agile manifesto, talking about IBM internal experiments concluding that it was the most effective organisation.

I've worked in such teams; the surgeon was typically one of the early developers, who wrote (and knew) most of the code, but as the system grew, sub-surgeons started to emerge for new or reworked parts.

ochronus · on Sept 22, 2021

My interpretation of the hero is slightly different, nonetheless, shameless plug: "Hero engineers can be deadly to team culture, it's time to retire those capes." https://leadership.garden/articles/kill-your-heroes/

dragonwriter · on Sept 22, 2021

And the antithesis: ’team culture’ can be deadly to hero engineers, time to retire the kool-aid.

And the synthesis: the right team culture allows a diversity of engineers, including ‘hero engineers’, to flourish and effectively contribute together; time to adapt your culture to your team.

ochronus · on Sept 22, 2021

That's a very good point, depending on _your_ definition of a hero :)

xnzzz · on Sept 22, 2021

So the site is "A community for engineering leaders" and you are pushing the voodoo that pays your salary.

Same for another submission against 10x engineers that is currently on the front page.

10 years ago hackers just had to fight the machine. Now they have to refute propaganda on a daily basis as well.

ochronus · on Sept 22, 2021

We all have our own experiences, I guess. All I write about is mine and of people I talk to.

I wonder what you think pays my salary.

I wonder what propaganda you mean.

zivkovicp · on Sept 22, 2021

"... the heroes will wind up being the confident, the well-connected, and those with preparatory privilege more often than the most intelligent or capable."

So why is this not fair? Intelligence, like confidence or socioeconomic status, is also not fairly distributed. Only about 5% of the general population will have an IQ above 125, so not even genius level but already heavily skewed. We can probably assume that this number is slightly higher for a population that makes a career in IT... but intelligence is still not a given, why should it be considered more "fair"?
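The 5% figure is easy to sanity-check, assuming the usual convention that IQ scores are normally distributed with mean 100 and standard deviation 15 (an assumption of this sketch; some scales use a different standard deviation).

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(mean=100, sd=15), the common scoring convention.
iq = NormalDist(mu=100, sigma=15)

# Share of the population scoring above 125.
share_above_125 = 1 - iq.cdf(125)
print(f"{share_above_125:.1%}")  # 4.8%
```

So "about 5%" holds up under that assumption.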

In any case, the 2nd point is a non-issue; incompetent people are rarely considered "heroes", so it's a bit of a self-correcting system in the end.

mlinksva · on Sept 22, 2021

Vivek Haldar also has a new video summary of the same paper https://www.youtube.com/watch?v=T3fJlHey8kw

wiradikusuma · on Sept 22, 2021

As others have pointed out, this is "the norm", just like group projects in school: there must be at least someone who cares enough and will go the extra length. In software projects, they're the ones who make more effort to standardize stuff, refactor, clean up code, etc. If everyone just does "the basic minimum", projects tend to fail or the quality suffers.

It's not ideal. The challenge is increasing the number of "heroes" (removing the bus factor), and preventing the heroes from burning out or falling back to indifference like the rest.

Note: I'm not saying the rest are lazy. In repetitive/predictable work, that's fine, but when there are lots of unknowns (like most software projects), you need to go the extra mile sometimes.

abalone · on Sept 22, 2021

The study only looked at open source which could bias it. It is possible that open source thrives on rare individuals whose heroism is simply having the time to donate significantly to free projects.

rightbyte · on Sept 22, 2021

Is "heroism" a euphemism for write access to the repo?

eecc · on Sept 22, 2021

Authors or committers? Because if the “hero” is the project lead and reviewer there’s a big distortion right there