Can you show how I've misinterpreted the plain language of the conclusion section?
I'm under the (perhaps mistaken) assumption that in academic papers people tend to mean what they say and choose their language carefully, particularly in the conclusion section.
If the following are not in fact broad, strong claims about the nature of static and dynamic languages in general, then won't you please explain to me how I should interpret them?
Here are the quotes from the conclusion of the paper (emphasis mine):
"The translation of these four software projects from Python to Haskell proved to be an effective way of measuring the effects of applying static type checking to unit tested software."
"Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking."
Honestly, at this point I can no longer tell whether you're misinterpreting or misrepresenting the conclusions. I'll make an honest attempt to argue, nevertheless.
"Static type checking" and "unit testing" are two concepts, each with numerous concrete implementations. The former is implemented in several languages, including C++, Java, and Haskell. The latter is implemented in several frameworks and tools, such as TestNG and PyUnit.
The article concludes that unit testing, as a technique for discovering and/or preventing defects, cannot wholly replace static type checking.
Apart from mentioning the concrete implementations of abstract techniques that the author used, the article does not conclude anything about the benefits of using specific languages, frameworks or tools.
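To make the abstract claim concrete, here is a minimal, hypothetical sketch (not taken from the paper; the function and test names are my own invention) of the kind of defect the article is talking about: a type error on a branch that the unit test never exercises, so the suite passes anyway.

```python
def describe(count):
    """Hypothetical helper: return a short description of an item count."""
    if count == 0:
        return "no items"
    # Type error on this branch: concatenating str and int raises
    # TypeError at run time, but only when this line actually executes.
    return "items: " + count


def test_describe_empty():
    # The only unit test covers the count == 0 branch, so it passes
    # and the defect on the other branch goes unnoticed.
    assert describe(0) == "no items"


if __name__ == "__main__":
    test_describe_empty()
    print("unit test passed")
    try:
        describe(3)
    except TypeError as exc:
        print("missed by the test:", exc)
```

A static type checker that knows `count` is an integer would typically reject the concatenation before the program runs, regardless of test coverage. This says nothing about which language or checker does so best, which is the point: the claim is about the technique.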
What you have claimed so far is that:
1. the "hidden assumption is that all static and dynamic typing are created equal, i.e., since Haskell is statically typed and Haskell appears to have caught Python bugs that unit tests did not, therefore Java will catch bugs in a Ruby codebase, C++ will catch bugs in a JavaScript codebase, etc."
If anyone jumped to this conclusion, it was you. The only thing I can conclude from the article is that static type checks such as those implemented in Haskell catch bugs that were not caught by unit testing logic such as that used in the Python projects within the study. To conclude anything more I would need data that is not present in the article, such as exactly what types of errors were caught or missed, etc.
2. the conclusion of the study "makes an unwarranted equivalence of all languages that have 'static type checking'"
It doesn't. The conclusion about static type checking vs. unit testing might not be backed by enough solid data, but it makes no claims about languages beyond specifying which languages were used in the study.
3. the claim that "this benefit applies to all languages with static typing" is "precisely what" the conclusion "says".
No occurrence of any phrase even remotely resembling the quote can be found in the article. Saying "this is precisely what it says" means "you'll find that phrase or one very similar to it in the text". Maybe you were trying to claim that "this is precisely what it means", but it's definitely not what it "says".
All in all, the sweeping generalization about the concrete languages was introduced by you. My guess is that this is because you were, like me, frustrated by the vagueness of the article. I would have loved seeing more concrete data. Saying "X types of errors were found" is not as good as saying "the following types of errors were found" and that's just the start.
"All in all, the sweeping generalization about the concrete languages was introduced by you."
I think the plain, direct language of the paper's conclusion is clear enough without me having to embellish it, and without its defenders extrapolating all of the qualifications and subtexts that they think I missed. You really don't have much to work with, because the paper's clumsy conclusion is small, blunt and unqualified in its scope. It takes a handful of small Python programs translated to an idiosyncratic language like Haskell and concludes:
"in practice [dynamic typing with unit testing] is an inadequate replacement for static type checking."
This is unequivocal language. There are no qualifications about language, context, or any other variables that might dilute the strength of the conclusion.
On the other hand, Peter Cooper gives a great example elsewhere on this thread of a much better paper with much broader scope, more stats, and much more modest, qualified conclusions. This is the kind of language that is useful and gives me confidence that the authors didn't start out with an axe to grind and merely followed what metrics they had to the warranted conclusion, no more, no less:
"Even though the experiment seems to suggest that static typing has no positive impact on development time, it must not be forgotten that the experiment has some special conditions: the experiment was a one-developer experiment. Possibly, static typing has a positive impact in larger projects where interfaces need to be shared between developers. Furthermore, it must not be forgotten that previous experiments showed a positive impact of static type systems on development time."