
I see a couple of other issues, which share a root cause.

1. A lot of research is based on mining open source projects which bear little resemblance to how software engineering is done behind closed doors.

2. Very few prospective trials (much less randomized controlled trials), which are considered the gold standard of evidence for a good reason.

These retrospective studies are very useful, but to really have faith in a conclusion that X is better than Y you need to assign several engineering teams to condition X and several more to condition Y and then compare.

Nobody has money for that and it would be hard to find even one team (let alone several) that is going to let a local professor flip a coin and decide whether they are going to do Scrum or Kanban. You'd also need quite a few teams because there would be so many uncontrolled aspects.

Researchers in this area are in the very difficult situation where the interesting things are difficult to measure and the things that are easy to measure aren't that interesting. Somehow they still do good work and that's very impressive but I don't envy their task.




I think the whole "you need too many subjects and too much money" objection to doing science is very weak. There are other fields of research where they find inventive ways of experimenting on a shoestring budget (I think some social sciences belong in this category.)

You can even do proper causal analysis when all you have is an observational study, given that you can get an expert to produce a decent guess at the causal network. (And this guess can to some extent be verified by plain correlations.)[1]
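
To make [1] concrete, here is a minimal sketch of backdoor adjustment in Python. The causal graph, the variable names, and the numbers are all invented for illustration; a real study would get the graph from domain experts and the data from actual projects.

    # Sketch of backdoor adjustment (Pearl): estimate the effect of a practice
    # from purely observational data, given an assumed causal graph:
    #   experience -> practice, experience -> outcome, practice -> outcome.
    # "experience" is the confounder we adjust for. All data here is made up.
    import pandas as pd

    df = pd.DataFrame({
        "experienced_team": [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0],
        "uses_practice_x":  [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1],
        "low_defect_rate":  [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0],
    })

    def p_outcome_given_do(df, practice):
        # P(low defects | do(practice)) = sum_z P(low defects | practice, z) * P(z)
        total = 0.0
        for z in (0, 1):
            p_z = (df.experienced_team == z).mean()
            stratum = df[(df.experienced_team == z) & (df.uses_practice_x == practice)]
            if len(stratum):
                total += stratum.low_defect_rate.mean() * p_z
        return total

    effect = p_outcome_given_do(df, 1) - p_outcome_given_do(df, 0)
    print(f"estimated causal effect of practice X: {effect:+.2f}")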

Sure, you might not get the extremely high external validity of e.g. particle physics, but in many cases something is better than nothing.[2]

Heck, screw external validity. I'd be ecstatic to see more experiments just within an organisation, focusing on internal validity. Every engineering organisation can afford to experiment with different tools and practices. There are huge productivity gains to be unlocked (that's my hypothesis, anyway), and the cost is small.

If we get together enough small studies lacking external validity, but which point in the same direction, we might even be able to infer some external validity between them.
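
To sketch what "getting together enough small studies" could look like mechanically, here's a fixed-effect, inverse-variance-weighted pool of per-study effect estimates. The effect sizes and standard errors below are made up purely for illustration; the point is just that the machinery is cheap.

    # Fixed-effect meta-analysis: pool small studies by inverse-variance weighting.
    # (effect, standard_error) pairs are invented for illustration.
    import math

    studies = [(0.30, 0.20), (0.15, 0.25), (0.25, 0.15)]

    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * eff for (eff, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))

    print(f"pooled effect {pooled:.2f} +/- {1.96 * pooled_se:.2f} (95% CI)")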

[1]: For more on this approach, see Judea Pearl.

[2]: To use an example close to my heart: pregnancy is a condition where you rarely see randomised trials, and any research is sorely underfunded. Yet when people actually take the time to sit down and answer questions with data, we get something meaningful out of it.


I wouldn't be surprised if a few organizations had run major experiments over the years, with results never seeing the light of day.

It would also be great if there were more awareness of the kind of data the research community needs, so that if someone saw a natural experiment shaping up in their company, they would know who to call.

Regarding budgets, these researchers are a lot smarter than me and I bet they are doing their best with what they have. What might be good though is having several labs join forces and collaborate on one very good trial of a core issue.

Then you have a hope of settling a question and moving the field forward while creating some infrastructure that will help answer other big questions.

Maybe that's happening; I haven't seen a lot of it, though.


It's worth pointing out that this blog post was published three years before Accelerate, by Forsgren, Humble, and Kim. That book demonstrates what you can do just with surveys. It's not about software engineering per se - it's got a devops focus - but the overlap is significant.

As far as I'm concerned that book significantly changed the game, not only because of the findings, but because it showed that hard results are possible in this arena. The downside is that you need enough data to make the maths work, and that precludes applying the techniques to internal studies in most organisations. There's just too much noise.


Behind closed doors, software engineering is actually done only once the bad first implementation runs into its limits. Software engineering is similar to JIT compilation: it only gets applied to a company's hot loop.

That means the things that must work, where every transactional loss has a cost attached. The first Facebook implementation could be lousy technologically, but the moment it generated ad revenue and the site "slowed down" to the point of losing people, corporate software engineering started to exist.

Any architecture created before these constraints applied is just legacy, and if it's burdened with useless abstractions, often a total loss. So the war stories, the strange optimizations needed to keep something valuable enough that somebody is "on call 24/7": that's the actual material for software engineering research.

Academia does not really work under these constraints, so its solutions are "beautiful" but only relevant when a "pre-incident" architecture proves it can survive into the post-incident world.


You can do trials, fairly cheaply, if you do them on CS students. But then you're (almost always) working on smaller code bases, and always working with very inexperienced devs (by professional standards). So the applicability of the studies to the real world of working software engineers is... questionable.

In particular, if you have a study showing that X tool helps people in Y situation, and your subjects were CS students, then you have a study showing that X tool helps CS students in Y situation. Does it help people with 5 years professional experience? 20 years? You can't tell. It might, but you don't know.


The use of students in place of professionals would have gone in as #3 on my list if I had thought of it; thank you for adding it.

Generalizing from studies where all participants were students is one of the things that psychology and the social sciences have had to stop doing (except when they are studying students specifically, of course).

We are currently growing into it and may find we get the same mixed results.

I work with a lot of undergrad CS students and the difference between an undergrad senior and a second year junior developer is enormous.

That's not a problem in general, but it is if you want to test software engineering practices with students beyond just working the kinks out of your protocol.


It occurs to me that you might be able to actually run real tests, with real engineers.

Say you're in Iowa or Kansas or somewhere. You can probably hire software engineers with 5 to 10 years of experience for $100K/year, maybe even less. Hire 10 of them for two years. That's $2 million.

Assign them to random teams. Give them non-trivial projects that run, say, three months each. That gives you eight experiments you can run. Change the methodology, or the language, or whatever you're trying to study.

You say that you can't get funding for a $2 million experiment? Tell Microsoft, and Oracle, and IBM, and the federal government, that the things you learn will enable them to more efficiently create software. See if they'll fund at least part of it.



