Methodologies for measuring open source project health (nadiaeghbal.com)
82 points by fagnerbrack on Aug 26, 2018 | 13 comments



These seem to measure project popularity more than anything else, and only for big projects where you can choose among several similar alternatives.

The success of an OSS project lies in the hands of the user. There are projects maintained by a single developer over a decade, with zero contributors and only rare commits (or, god forbid, no repository at all!), in a niche too technical for a casual contributor, yet doing exactly what you need.

This is still 100% success to me, and represents what OSS is all about.

Big OSS projects backed by companies are a different breed of commercial endeavor.


Can you name some of these really small open source projects? Like the ones that come in super handy.


For a long time, this has been a great resource:

http://onethingwell.org/

It's not exclusive to OSS though.


Most commit-based metrics are arguably flawed for the same reason that counting Wikipedia edits doesn't tell the whole story. From Aaron Swartz's commentary: http://www.aaronsw.com/weblog/whowriteswikipedia

> Wales seems to think that the vast majority of users are just doing the first two (vandalizing or contributing small fixes) while the core group of Wikipedians writes the actual bulk of the article. But that’s not at all what I found. Almost every time I saw a substantive edit, I found the user who had contributed it was not an active user of the site. They generally had made less than 50 edits (typically around 10), usually on related pages. Most never even bothered to create an account.

It's easy to belittle drive-by single commits, but arguably they are a much more useful signal than the other proposed metrics.


Interesting observation by Aaron Swartz. That could happen because the barrier to entry for adding to a text blurb is much lower than that for contributing to a code base — maybe because we practice the former for a decade in school.

For code, ease of access (editability) is non-obvious and therefore an important axis along which to evaluate open source projects.

So, funnily enough, I agree with you because the data you quoted for Wikipedia is much less likely to apply to code :-)

Btw, I recently came across a talk by Evan Czaplicki (creator of Elm lang) on “What is success?” for an open source project like Elm, and he raises some interesting points about how measuring projects by Github activity is strongly biased by a model of how the larger Javascript community works: https://youtu.be/uGlzRt-FYto


There are a lot of metrics which are not covered in this article. For example, I think that having a lot of one-time contributors with small PRs is actually a sign of good health; it generally means that the project's code is stable and easy to read and modify for newcomers.

Also, I don't think that the number of commits or commit frequency has much to do with project health; a low commit count could simply mean that the project is very stable, which is usually a good thing... Sometimes you don't need more features.

I think that while a project is still evolving, it's better to have only one or two main contributors; otherwise the project's vision and direction can be lost. I think Redis is a perfect example of a healthy OSS project. The contribution stats on GitHub look ideal to me: https://github.com/antirez/redis/graphs/contributors


This would have been super useful while writing my comparison of Deep Learning Frameworks[0].

Trying to create quantifiable measures for OSS health is tough, but I used commit rates and merged PRs as a proxy for developer activity and # of StackOverflow Qs and Github Repos using the framework as signs of a growing ecosystem/community [1]. It's far from perfect, but it ranks maintenance-level projects like Theano low, and growing projects with momentum like PyTorch high.
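To make the idea of combining those proxies concrete, here is a minimal sketch of turning raw counts into a comparable "momentum" score. Everything here is hypothetical: the function name, the choice of signals, and the log-scaling are my own illustration, not the method used in the linked comparison.

```python
import math

def momentum_score(commits_per_month, merged_prs_per_month,
                   stackoverflow_questions, dependent_repos):
    """Combine activity/ecosystem counts into one rough score.

    log1p dampens each signal so a single huge count (e.g. an old
    framework's pile of StackOverflow questions) can't dominate.
    Returns the mean of the log-scaled signals; only useful for
    ranking projects against each other, not as an absolute number.
    """
    signals = [commits_per_month, merged_prs_per_month,
               stackoverflow_questions, dependent_repos]
    return sum(math.log1p(s) for s in signals) / len(signals)
```

Under this kind of scheme, a maintenance-mode project (few commits and PRs per month) ranks below a project with growing activity on every signal, matching the Theano-vs-PyTorch intuition in the comment above.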

"Health" is a broader term, however. Would you consider Redis as healthy, despite the "(not) open core" controversy that's begun to foment? Not all controversies are fatal to projects (see: nodejs/iojs) but some are...

Also funding is a big deal. It's hard to quantify objectively unless it's done exclusively through a public crowd-funding option...

[0] https://source.coveo.com/2018/08/14/deep-learning-showdown/

[1] https://source.coveo.com/2018/08/14/deep-learning-showdown/#...


Personally I would look at the following combination of three metrics:

- Last commit date

- Gross number of pull requests

- Number of regular contributors who have contributed in the last six months or so

Then on top of that I would look at the structure and organization of the project and the quality of the documentation. Well presented docs, even if short, and a well maintained release methodology go a long way for me.
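The three metrics above can be sketched as a small pure function. The data shapes here (author/timestamp pairs, a list of PRs) are assumptions of mine for illustration; in practice you would pull them from a forge's API.

```python
from datetime import datetime, timedelta

def health_snapshot(commits, prs, now):
    """Compute the three metrics from the comment above.

    commits: list of (author, datetime) pairs
    prs:     list of pull requests (any objects; only counted)
    now:     datetime to measure "recent" against
    """
    last_commit = max(ts for _, ts in commits)
    six_months_ago = now - timedelta(days=182)  # ~six months
    regulars = {author for author, ts in commits if ts >= six_months_ago}
    return {
        "last_commit": last_commit,
        "pull_requests": len(prs),
        "recent_contributors": len(regulars),
    }
```

Documentation quality and release discipline, as the comment notes, resist this kind of counting and still need a human look.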


There's also the funding question.

Small projects can thrive in open source but bigger ones won't survive without money or company backing even if the project looks healthy.

Example: https://github.com/WebTales/rubedo/issues/1477


One ideal for open source projects is a "plug-in" architecture: plug-ins can be contributed and edited in isolation.

While this architecture has low coupling and high cohesion, it's not ideal for every problem.


One important metric I like to check, when a project was started by a company, is how many of the top contributors are employees of that company (more outside contributors means a healthier project).
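That check reduces to a simple ratio. A minimal sketch, assuming you already have the top-contributor list and a set of known employee usernames (both inputs are hypothetical; affiliation data usually has to be gathered by hand):

```python
def outside_contributor_share(top_contributors, employees):
    """Fraction of top contributors NOT employed by the sponsoring
    company. Higher values suggest a broader community, per the
    heuristic in the comment above.
    """
    outside = [c for c in top_contributors if c not in employees]
    return len(outside) / len(top_contributors)
```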


One of my most important metrics when judging the health of a project is maintainer responsiveness:

* How long does it take for issues to get a response from a maintainer? This does not necessarily mean time to fix, but time to acknowledgment.

* Basically the same for pull requests: how long until PRs are reviewed or commented on by a maintainer, and how valuable the feedback is. The response can take many forms: "sorry, we won't do this", "please address these problems", etc.
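Time-to-acknowledgment is measurable if you have issue and comment timestamps. A minimal sketch, assuming a hypothetical data shape (each issue as a dict with an `opened` datetime and `comments` as author/timestamp pairs):

```python
from datetime import datetime
from statistics import median

def median_first_response_hours(issues, maintainers):
    """Median hours until a maintainer's first comment on an issue.

    Issues no maintainer has touched are skipped rather than counted
    as zero, so the metric reflects acknowledgment speed, not volume.
    Returns None if no issue has a maintainer response.
    """
    deltas = []
    for issue in issues:
        replies = [ts for author, ts in issue["comments"]
                   if author in maintainers]
        if replies:
            hours = (min(replies) - issue["opened"]).total_seconds() / 3600
            deltas.append(hours)
    return median(deltas) if deltas else None
```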


I haven't seen any tools that measure this. I have seen one or two tools that measure how long an issue stays open, which is not a good metric, since some GitHub issues are bugs while others are proposals that take more time to discuss.



