> Accordingly, this paper explores the effect of having heroes in project, from a code quality perspective by analyzing 1000+ open source GitHub projects.
Oh gross. Open source projects?
And the paper's sanity checks are:
- one pull request.
- more than 20 commits.
- at least 50 weeks of activity.
- more than 10 issues.
- at least 8 contributors.
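Taken literally, those checks amount to a filter like the sketch below. The `RepoStats` type and its field names are invented here for illustration, and "one pull request" is read as "at least one"; the paper may count these things differently.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    """Hypothetical per-repository counts; field names are illustrative."""
    pull_requests: int
    commits: int
    active_weeks: int
    issues: int
    contributors: int


def passes_sanity_checks(repo: RepoStats) -> bool:
    """Apply the thresholds listed above (assumed reading of each bullet)."""
    return (
        repo.pull_requests >= 1      # one pull request (assumed: at least one)
        and repo.commits > 20        # more than 20 commits
        and repo.active_weeks >= 50  # at least 50 weeks of activity
        and repo.issues > 10         # more than 10 issues
        and repo.contributors >= 8   # at least 8 contributors
    )


small_lib = RepoStats(pull_requests=1, commits=21, active_weeks=50,
                      issues=11, contributors=8)
print(passes_sanity_checks(small_lib))
```

A repository with 21 commits, 11 issues, and 8 contributors over 50 weeks clears every bar.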
Secondly, the majority of the projects they looked at have the majority of their code as shell:
And yes, they list Roff twice. This paper is really saying "In open source rando libraries on github, usually there is one person who does the majority of the work".
The hero problem most people talk about in software engineering concerns heroes at paid jobs, working on codebases with 2000+ commits, in teams of 10s to 10000s. I'm not sure this paper provides any reasonable insight into how we work together.
The type of project most often found is Shell. The majority of code is something else.
The heroes can be the one or two people on the project who make it happen. It doesn't need to be 10,000 developers; I would expect different silos to emerge with a group that large.
Yes, thank you for correcting my error - I should have said the mode rather than the median.
Right, but the "hero" as discussed by most folks is only a problem if the team expects that hiring more software engineers will speed up execution, even sub-linearly. With a "hero" this is extremely difficult to manage.
I see this as an interesting problem in software development, and there are many more interesting discussions about this.
However, this paper presents a pile of open source repositories (not even necessarily projects/products) that accepted commits/patches from others. Open source must operate differently, as all contributions are voluntary. If someone wants to be a hero, great, move the project forward. Hence all of these BDFLs: the projects can't even imagine themselves without their initial hero. Projects that don't look like this tend to have paid developers (rust, go, fuchsia, chromium, firefox), as opposed to the projects with heroes.
This paper starts with "I found a data set, lemme try to fit it to my question" rather than "is this data set relevant for my question?"
The heroes you mentioned then go home and develop a "clean" implementation in open source, just to have that moment where their own management suggests switching to an open source library. Nothing is more satisfying than seeing someone smug own himself without realizing it.
In school they assign group projects to teach students that given a group one sucker will be stuck with all the work. That sucker is the one that cares the most to get a good grade.
At work it can be the developer who is pissed off the most about the code quality causing development pain or just naive enough to believe they might be rewarded if they care about the code.
In open source it is the pet project owner that often wants to see the project succeed.
It all comes down to who cares the most.
If you care, you lay down industry best practices as the fundamentals, you tend the garden of knowledge, and you encourage pride in making your code the best. You mentor junior devs and put gates around your code to protect it. Fill your seats with developers who are willing to learn, challenge one another to greatness, and when you find a hero, don't let them commit again until they have mentored two devs to take on the mantle.
I’m skeptical anytime someone’s first argument about why to do X is “it’s best practice”. When their second argument is also “it’s best practice” that skepticism deepens significantly.
Yes and no. "Best practices" (if they are indeed best practices, e.g. SOLID) one violates at their own peril. I liken it to an apprentice learning to paint. Once one has mastered the rules of the game, then one can become Picasso and know which rules to break.
I can't remember where I heard about this, but isn't there some kind of "surgeon model" where you have one star player do the majority of work, and everyone else is support staff? It's a pattern that emerges, and not a bad thing to encourage, as long as it's acknowledged. Everyone has their strengths and weaknesses. But in college, usually everyone has to be graded equally (akin to everyone being paid equally). Administrative and support work is still important, because without it, a surgery or a project won't succeed. Kinda like the 10x dev thing, but for group dynamics in general.
That's the Fred Brooks "chief programmer" model that the post mentions in the first paragraph. Brooks describes it in one chapter of _The Mythical Man-Month_, which is one of those rare cases of a book in the tech field that's still interesting and readable decades after it was written. If you haven't read it yet and you like the kind of book that's about the underlying principles of the craft of writing software rather than specific technical detail, I recommend it.
That's an exaggerated take. I think schools assign group projects to students so that they can learn all kinds of group dynamics, mooching off harder-working people being merely one of them.
When I first saw that a local tech school was emphasizing group projects, my thought was: "Great. Now they will have ten additional years to tire of corporate-style work, even before they start."
Well, it's a cynical, misanthropic take on how the world works, but I would be interested to see anyone find a teaching manual that suggests this as the lesson to be learned from group work.
It accurately represents at least 60% of the group projects I was involved in at school, though usually the rule wasn't 1 but 2 people who cared, with most others not caring. What it taught me is that the most important thing for success in group work is being able to select the group members.
Yeah, I think that is closer to what it teaches. I think there will be at most two or three top people in the project, and then some lesser contributors who are skating, maybe because they are not that interested in that project and they are the leads in some other project that is running concurrently.
It is an accurate description of group projects in school.
Real teamwork on the job does not work that way, unless management sux. When management checks out and stops managing/leading, it can descend there too.
The problem is that work is lumpy. Sharing work is difficult. To split work between two people, both need to be equally skilled and informed about the task; otherwise the faster person will sit idle and take work from others, leading to the appearance of a superstar when the truth is that everyone could have done their part, even if slightly less efficiently.
I only read the article, not the paper, but aren't there a few issues/other plausible explanations? Off the top of my head:
- The vast majority of GitHub projects probably have only a single contributor. Hopefully this study excluded such projects. Edit: A quick survey of the paper doesn't clarify this - there's a plot showing number of developers which bottoms out somewhere below 20.
- Among the projects which have more than one contributor, the vast majority again probably has only one contributor who can be reasonably said to understand the whole project. They wrote it from scratch, and nobody else contributed until it became useful to a more general audience.
Rather than concluding that "heroes" should be supported to communicate more/better, wouldn't it be reasonable to conclude that each project is so different from every other project (there are such a huge number of tools and languages to choose from, after all) that any minor contributors are highly unlikely to be familiar with the project? Could we not from that draw the much more pedestrian conclusion that people unfamiliar with a project produce buggy patches?
"Hero developers are far less likely to introduce bugs into the codebase than their non-hero counterparts. Thus having heroes in projects significantly affects the code quality." This doesn't follow. You are saying that in hero projects heroes introduce fewer bugs than non-heroes. You can't conclude anything about the difference between hero and non-hero projects from that.
That said, I would hope that the communication tools become better in the future for all the contributors, so that newcomers and experts alike can more easily understand each other and the project, leading to better communication and contributions.
As others pointed out, the term Hero is not used in a conventional way in this article. Usually when the Hero Developer is portrayed as a problem, it means there is one person who takes it solely upon himself to solve all problems without consulting anyone else in the project or delegating responsibilities to them. Then that person is praised for it by management, growing the developer's ego and belittling everyone else in the process.
On the other hand, the term Hero in this article and research refers to a developer who is most involved with other people in the project and has the most communication with everyone else, thus shortening communication gaps and reducing misunderstandings. It reminds me a lot of a hack against Conway's Law. This person also happens to do most of the work, but that may well be accidental, since this person (in these GitHub projects) is the nominated responsible person for that project.
And the bias for "publicly available open source projects on github" is just horrible. It is like inspecting the color of bananas and then proving that avocados are in fact yellow.
Purely anecdotal, but on one of my open-source projects [1] I have around 60 contributors. Most have done small things like fixing defects or fixing up inconsistencies. Some have contributed to a larger sub-project (which I drove).
In my case I don't see myself as a 'hero', I just see myself as the person motivated to build it. Probably all of the bugs were caused by me, mostly because I wrote nearly all of it.
In terms of interactions (comments on Issues), again it's mostly me because I know it and built it.
That's not to say others don't help or contribute. This project seems to 'fit the numbers', but I don't see how this is in any way insightful; it's just obvious for open-source projects. To a certain extent I see the people who contribute as the real heroes, as they're not getting much out of it personally: it's me who gets all the kudos for the project, not them.
Some insight into how this translates into projects within real businesses would be much more useful. Anecdotally again, I do see the pattern of the hero in my organisation (and have in previous companies I've worked at).
Several members of my team are real force multipliers and could easily be called heroes, and to a certain extent they receive that hero worship within the organisation (which can be good and bad). I'd say we probably wouldn't have survived as a business without them being exceptional or going the extra mile.
Based on my very limited experience this could be true.
But what I'm sure of is that you don't want to be the hero unless you can participate in the upside on equity-like (unbounded) terms, capturing a significant fraction of it for yourself. It doesn't necessarily have to be monetary; starting a large and successful open source project is a great achievement.
Otherwise just close your laptop/go home and Let It Fail.
What usually happens in those bigger projects is that for every distinct piece of functionality there is a person, let's say Igor, who knows it inside out and to whom everybody comes when they have a question. Sometimes one Igor is in charge of several subsystems, or there is one Igor per subsystem, but I've yet to see a subsystem that has more than one Igor with a real, significant, working knowledge of it.
How exactly Igors come into possession of such knowledge is irrelevant to this empirical fact. Some may build up security by obscurity and make themselves indispensable; some might have accrued the knowledge by either being the one who designed it, or having had to debug its guts because a change had to be made. An Igor can be a bottleneck or a godsend. The defining trait is that whenever there's a weird issue in a subsystem or more than a cosmetic change needs to be made, everyone goes, "let's ask Igor" or "let Igor do it". Again, it doesn't matter if Igor is good or evil.
Even when Igor is open towards the idea of knowledge sharing, even eager about it sometimes, everyone else is lukewarm at best, and all non-Igors just hope they don't replace Igor at his job. It's just damn convenient to have Igor around. When Igor departs the company, knowledge transfer sessions usually consist of Igor working hard to explain everything, and everyone else quietly hoping that somebody else will listen and take notes. No one will remember where the notes are, though, or where the recordings of the sessions are stored, or whether they were indeed made and stored at all.
When Igor gets hit by a bus, even the most documented system he was tending to will likely be replaced by something else, because reading is too much work already, and comprehension is even more so. There also may be an emerging larva of Igor who wants to put some cred onto his résumé, so he will obviously want to redesign everything in the coolest technology du jour.
So far, I've only read the abstract, but I see two problems with this approach.
One is that they are looking specifically at open source projects, which, as other comments have pointed out, can technically meet their definition of a "hero project" without anyone engaging in the types of heroics normally associated with that term.
The other problem is that they talk about comparing hero developers to non-hero developers, when they should be comparing hero projects to non-hero projects.
If hero projects (as the paper defines them) are so common, then the pool of non-hero developers is going to include a lot of non-hero developers working on hero projects. Of course those people make more mistakes than the heroes, they're less familiar with the project. But those mistakes were made following the hero project model and should count against it, not for it.
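To make that concrete, here is a toy calculation. Every number in it is invented; it only shows that heroes can beat non-heroes inside a hero project while the hero project as a whole still does worse than a non-hero project.

```python
# All numbers below are invented purely to illustrate the logic.
# Each entry: (commits, buggy_commits) for a group of developers.
hero_project = {
    "heroes": (800, 40),       # 5% of hero commits introduce a bug
    "non_heroes": (200, 40),   # 20% of non-hero commits introduce a bug
}
non_hero_project = {
    "everyone": (1000, 60),    # 6% across an evenly spread team
}


def defect_rate(groups):
    """Project-level rate: buggy commits divided by all commits."""
    commits = sum(c for c, _ in groups.values())
    buggy = sum(b for _, b in groups.values())
    return buggy / commits


hero_rate = defect_rate(hero_project)          # (40 + 40) / 1000
non_hero_rate = defect_rate(non_hero_project)  # 60 / 1000
print(f"hero project: {hero_rate:.1%}, non-hero project: {non_hero_rate:.1%}")
```

With these made-up figures the hero project ends up at 8% buggy commits against 6% for the evenly spread one, even though its heroes sit at 5%. Whether real projects look like this is exactly what a project-to-project comparison would have to show.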
>> In our data, commits with lower defects come from the small number of hero developers who have learned how to talk to more people.
This makes me think of a RustConf 2021 talk that was just published, 'Compile-Time Social Coordination' by Zac Burns [0], where he discusses coordination problems that can come up when you have developers who work on nearby code but don't communicate directly, and how you can encode what are normally 'socially enforced' norms and patterns into the type system to enforce correctness at compile time.
However, it seems this research did not look into Apache projects, which maintain a different culture to encourage more contributors, going so far as to encourage the main contributors to refrain from jumping in to solve an issue until another person steps in first.
There are some related research works (which do not seem to be cited in the article) that focus on estimating a project's Truck Factor, which is "the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated." A good starting point for those interested is [1], which is also the source of the quote in the last sentence.
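For anyone who wants to play with the idea, here is a rough greedy sketch of a truck factor estimate. It is not claimed to be the method used in [1]: the file-to-developer knowledge map, the 50% cutoff for "incapacitated", and the greedy removal order are all assumptions made for illustration, and a greedy pass is not guaranteed to find the true minimum.

```python
def truck_factor(file_authors, threshold=0.5):
    """Greedy estimate of how many developers must leave before more than
    `threshold` of the files have nobody left who knows them.

    `file_authors` maps each file to the set of developers assumed to
    understand it. The map and the cutoff are assumptions of this sketch.
    """
    remaining = {path: set(devs) for path, devs in file_authors.items()}
    total = len(remaining)
    removed = 0
    while total and sum(1 for d in remaining.values() if not d) / total <= threshold:
        # Count how many files each remaining developer still covers.
        coverage = {}
        for devs in remaining.values():
            for dev in devs:
                coverage[dev] = coverage.get(dev, 0) + 1
        if not coverage:
            break
        # Greedy step: remove whoever covers the most files.
        top = max(coverage, key=coverage.get)
        for devs in remaining.values():
            devs.discard(top)
        removed += 1
    return removed


# Hypothetical project: one developer is the only person who knows three files.
example = {
    "core.py": {"hero"},
    "parser.py": {"hero"},
    "storage.py": {"hero"},
    "cli.py": {"hero", "dev_a"},
    "docs.md": {"dev_a", "dev_b"},
}
print(truck_factor(example))
```

In this made-up project, losing the single developer who owns `core.py`, `parser.py`, and `storage.py` orphans three of five files, so the estimate is 1.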
Interestingly, these works suggest that having a small number of developers that are critical for the project is risky and should be avoided, opposite to the conclusions of the paper discussed in the article.
Moreover, both Heroes and TF seem to me to be consequences of the same problem: the inability of current technologies/methodologies to enable seamless sharing of the knowledge acquired by veteran developers, who have been working on the project for a long time, with less experienced ones. I believe the shift to agile methodologies has made this problem worse, if anything.
The point seems to have been missed. Like any serious collaborative undertaking (movies, buildings, complex systems development, scientific collaborations, etc.), successful software needs a unique person to carry the vision of the expected result and do what needs to be done to achieve it, including organizing the work. Call him or her a director, an architect, a chief scientist, a hero… He/she will always be needed.
It’s interesting to note that these people can also become the doom of the project. This is why there is no single easy recipe for picking the right person.
Reminds me of the Surgical Team from The Mythical Man-Month, and of an interview with, I think, one of the authors of the agile manifesto, talking about IBM internal experiments concluding that it was the most effective organisation.
I've worked in such teams. The surgeon was typically one of the early developers, who wrote (and knew) most of the code, but as the system grew, sub-surgeons started to emerge for new or reworked parts.
My interpretation of the hero is slightly different, nonetheless, shameless plug:
"Hero engineers can be deadly to team culture, it's time to retire those capes."
https://leadership.garden/articles/kill-your-heroes/
And the antithesis: ’team culture’ can be deadly to hero engineers, time to retire the kool-aid.
And the synthesis: the right team culture allows a diversity of engineers, including ‘hero engineers’, to flourish and effectively contribute together; time to adapt your culture to your team.
"... the heroes will wind up being the confident, the well-connected, and those with preparatory privilege more often than the most intelligent or capable."
So why is this not fair? Intelligence, like confidence or socioeconomic status, is also not fairly distributed. Only about 5% of the general population will have an IQ above 125, so not even genius level but already heavily skewed. We can probably assume that this number is slightly higher for a population that makes a career in IT... but intelligence is still not a given, so why should it be considered more "fair"?
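For what it's worth, the 5% figure checks out if you assume the usual IQ scaling (normally distributed, mean 100, standard deviation 15), which is an assumption about the scale rather than something stated above:

```python
from statistics import NormalDist

# Assumes the conventional IQ scaling: normal, mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Share of the population above 125 is the upper tail of the distribution.
share_above_125 = 1 - iq.cdf(125)
print(f"{share_above_125:.1%}")
```

That comes out a little under 5%.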
In any case, the 2nd point is a non-issue; incompetent people are rarely considered "heroes", so it's a bit of a self-correcting system in the end.
As others have pointed out, this is "the norm". Just like group projects in school, there must be at least someone who cares enough and will go to extra lengths. In software projects, they're the ones who make more effort to standardize stuff, refactor, clean up code, etc. If everyone just does "the basic minimum", projects tend to fail or the quality suffers.
It's not ideal. The challenge is increasing the number of "heroes" (removing the bus factor), and preventing the heroes from burning out or falling back to indifference like the rest.
Note: I'm not saying the rest are lazy. In repetitive/predictable work, that's fine, but when there are lots of unknowns (like most software projects), you need to go the extra mile sometimes.
The study only looked at open source, which could bias it. It is possible that open source thrives on rare individuals whose heroism is simply having the time to donate significantly to free projects.