>which makes an unwarranted equivalence of all languages that have 'static type checking':
No it doesn't. Read what you quoted; it says nothing even remotely resembling "this benefit applies to all languages with static typing". It is testing static typing, not a specific language. It uses the best static typing system to do so. You are entirely inventing the notion that this must then apply to Java.
> If the study was meant to comprehend such a broad category as 'static type systems,' and from the explicit language of the study, it clearly was, then absolutely Java must necessarily be included
No it mustn't. Comparing the best of dynamic vs the best of static is a useful test. Just as nobody is complaining they didn't use a worse language than python, it makes no sense to complain they didn't use a worse language than haskell. You don't draw conclusions about the potential of X by examining the worst example of X possible.
This. Not all statically typed languages are created equal. Java's type system is old and is not state of the art. I wish people would stop using it as a straw man when anybody brings up static typing.
Java was state of the art 20 years ago, but it's definitely not the case any more.
Had the study qualified itself to merely "Haskell vs. Python" with deference given to the statistical significance of the sample size, you'd have a point. It wasn't me that brought all static typing, which of course includes Java, into the question at hand -- it was the study itself.
Yes, it was you. Why do you think the comparison should be "really bad static type system" vs "really good dynamic type system"? In what way does that make the test more useful? Allow me to say this again, as I do not know how to be any clearer:
You do not test the potential of something by using the worst possible example of it. The only point of your desire is to reinforce the strawman that java = static typing. A test of "do airbags help prevent deaths" would be a very poor test if it used anything other than the best possible airbag technology.
Since this hasn't already been mentioned, and at the risk of really flaming things up: Java has a very high propensity for generating runtime type errors. This is easily done by skirting the type checker with casts, which is commonplace. The upshot is that I'm actually on the fence about even considering Java a statically typed language for this reason, which is part of why I disagree with the parent using it as an example of a statically typed language equivalent to the one from this post in a counterexample (the same goes for C, C++, and the rest of that family).
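A minimal sketch of the cast escape hatch described above (class and variable names are mine, purely for illustration). The raw `List` draws only a compiler warning, and the resulting type error surfaces at runtime, far from the line that caused it:

```java
import java.util.ArrayList;
import java.util.List;

public class CastEscape {
    public static void main(String[] args) {
        // A raw List accepts any object; javac emits only an "unchecked" warning.
        List raw = new ArrayList();
        raw.add("hello");
        raw.add(42); // an Integer slips past the type checker

        // The unchecked assignment compiles; the error is deferred to runtime.
        List<String> strings = raw;
        try {
            // The compiler inserts a checkcast here, so this line throws,
            // far away from the add() that introduced the bad element.
            String s = strings.get(1);
            System.out.println(s);
        } catch (ClassCastException e) {
            System.out.println("runtime type error caught");
        }
    }
}
```

The point being: the failure mode (a ClassCastException at the point of use, not the point of insertion) looks a lot like a dynamic-typing error, despite the static checker.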
As soon as you start using reflection in Java, you're doing non-statically-typed programming. Since a lot of popular Java frameworks use reflection implicitly - such as Spring, Hibernate, etc - that includes a lot of Java code that's out there.
And even if you carefully put a layer of explicit type checking between the reflection-based code and the statically typed stuff, you're still throwing out Java's generics type checking, since none of that exists at runtime; your ArrayList&lt;String&gt; can mysteriously contain non-String elements when you finally access it.
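A minimal sketch of that erasure hazard. The `injectUntyped` helper here is a hypothetical stand-in for what a reflection-based framework effectively does when it writes through an untyped view of your collection:

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Stand-in for framework code writing through an untyped view of the list.
    @SuppressWarnings("unchecked")
    static void injectUntyped(List<?> target, Object value) {
        ((List<Object>) target).add(value); // legal: generics are erased at runtime
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("alice");
        injectUntyped(names, 7); // no error here: the Integer goes in silently

        System.out.println(names.size()); // 2 -- the list is already polluted
        try {
            for (String n : names) { // fails only when the Integer is read back
                System.out.println(n);
            }
        } catch (ClassCastException e) {
            System.out.println("heap pollution detected");
        }
    }
}
```

Note that the `ArrayList<String>` happily holds an Integer until the implicit cast on read blows up, which is exactly the "mysteriously contains non-String types" scenario.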
I don't think davesims is saying that should be the comparison. This particular complaint is about the conclusions, not the methodology. (I recognize he also criticized the methodology.) Conclusions should be useful. People shouldn't have to squint at the wording of your conclusion to determine what that means for them. So, you should bend over backwards in your conclusion, and err on the side of being clear.
With that in mind, I agree with davesims that the conclusion in the blog post is too strong. It is: "The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing." I say it is too strong because the author has not bent over backwards to make clear that this conclusion only applies to the "best" type systems, like Haskell's.
For the record, I like the study, and once I run the author's conclusions through my bend-over-backwards-filter, I find them interesting. I upvoted this article. I also upvoted davesims' post because it is academic-reviewer level feedback.
> You do not test the potential of something by using the worst possible example of it.
So? Folks don't use the "potential", they use the real. They're asking questions like "should I use Java or Python".
> A test of "do airbags help prevent deaths" would be a very poor test if it used anything other than the best possible airbag technology.
That's not how things actually work. You decide between what's available. The performance of the best possible airbags is irrelevant. The real question is the cost and benefits of airbags that are likely to be deployed.
And the answer to "should I use Java or Python" is: no! Use Haskell ;). If you're entirely tied to Java (and, in that case, Python would probably not be ideal), you can still use Scala.
The question the study was asking was not "what language should I use for my lowest-common-denominator workforce" but rather "can a static type system catch more errors than unit tests and can statically typed code be as expressive as dynamically typed code".
In other words, it was asking for existential quantification: "does there exist some type system such that..." rather than "forall type systems..." or even "forall average systems...".
>So? Folks don't use the "potential", they use the real.
Haskell is real.
>They're asking questions like "should I use Java or Python".
That's wonderful, but it has nothing to do with the subject at hand, which was the question "can static typing reduce the number of bugs?". If you want an answer to a different question, don't complain about the answer given for this question, go find someone answering the question you want answered.
>That's not how things actually work. You decide between what's available. The performance of the best possible airbags is irrelevant. The real question is the cost and benefits of airbags that are likely to be deployed.
Why can't anyone follow a simple line of reasoning without resorting to fallacies? He tested the best airbags available. Not theoretical airbags that don't exist. He tested a car with the best airbags available to one without. The airbags were a benefit. You and the other guy making up fallacies insist that this isn't a fair comparison, because you want to drive a car where the airbags deploy 5 seconds after impact. Your crappy car isn't relevant to the question of "can airbags save lives".
>Why can't anyone follow a simple line of reasoning without resorting to fallacies?
Indeed. The conclusion C was out of scope given the premises A and B. C is wrong, but that doesn't mean useful, more modest conclusions cannot be drawn from A and B.
What I don't understand about every one of your responses is that you seem to think false equivalence applies in only one direction.
You seem to think it's fine for OP to infer broad conceptual conclusions from a small subset of the domain, but counter-examples to those broad claims cannot be applied, because, rather bizarrely, you continue to insist that the counter-examples are too specific and don't apply when the scope is general. That doesn't even make sense.
It's quite simple. OP claims "unit testing is not enough," "you need Static Typing," and uses broad language like "static type systems." I continually insist that such conclusions are out of the scope of the data given: the fact that type-related bugs were found in a handful of relatively small Python programs translated to an idiosyncratic environment like Haskell cannot possibly support something so broad as what the OP is claiming.
Using Java/C++/Clojure/C#/etc. vs JavaScript/Lisp/Smalltalk/Ruby to give a counter-example is clearly within the scope of the argument. If OP had claimed something like "Python shows risk of static type errors, exposed by Haskell port" and claimed something like "more care and unit-testing is needed to guard against certain types of type-related bugs" I wouldn't have a problem. But that's not what OP claimed.
> >And, it will never replace Java, C, Python, or even PHP
> It already has.
Oh really? Significantly fewer systems are being developed in those languages? How about some evidence?
What? You mean that a couple of applications have been written in Haskell instead of in those languages? That's not "replacing" them.
Which reminds me - if I find an application that was written in Haskell that is being replaced by an implementation written in some other language, would you claim that said other language is "replacing" Haskell? If not, don't make the mirror-argument.
Then you should be proposing he use whatever language you feel is better than python at being the best dynamic type system. The best unit tests is entirely irrelevant.
> Then you should be proposing he use whatever language you feel is better than python at being the best dynamic type system.
Nope.
> The best unit tests is entirely irrelevant
I can find errors in programs with a spell checker. Suppose that those programs have unit tests. Do you really think that spell checker is better than unit tests?
Are you trolling or incapable of reading? Nobody, at any point in time suggested that static typing was an alternative to unit testing. You haven't posted a single constructive thing in this entire thread, and you waited till it was over to do your trolling so you could avoid downvotes. Grow up, or go back to reddit.
>it says nothing even remotely resembling "this benefit applies to all languages with static typing".
That is precisely what it says, and that is reiterated later:
"...the conclusion can be reached that...in practice [unit testing] is an inadequate replacement for static type checking."
I'm not sure what you're reading, but there's no qualifications in the language used here regarding the idea of 'static type checking,' nothing so modest about the scope of the conclusion as claiming it was merely a "useful test" as you put it. It was a sweeping generalization about two very broad and extremely complex categories of languages. Had the conclusions used more moderate language and qualified itself adequately, I wouldn't have a problem. But all that has been shown here, is that in some contexts more care needs to be taken writing unit tests in a dynamic environment to catch some errors that are automatically caught in static environments. That is all that the data warrants.
Can you show how I've misinterpreted the plain language of the conclusion section?
I'm under the (perhaps mistaken) assumption that in academic papers people tend to mean what they say and choose their language carefully, particularly in the conclusion section.
If the following are not in fact broad, strong claims about the nature of static and dynamic languages in general, then won't you please explain to me how I should interpret them?
Here are the quotes from the conclusion of the paper (emphasis mine):
"The translation of these four software projects from Python to Haskell proved to be an effective way of measuring the effects of applying static type checking to unit tested software."
"Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking."
Honestly, at this point I can no longer tell whether you're misinterpreting or misrepresenting the conclusions. I'll make an honest attempt to argue, nevertheless.
"Static type checking" and "unit testing" are two concepts. There are numerous concrete implementations of these two concepts. The former is implemented in several languages, including C++ and Java and Haskell. The latter is implemented in several frameworks/tools, such as TestNG and PyUnit.
The article concludes that unit testing, as a technique for discovering and/or preventing defects, cannot wholly replace static type checking.
Apart from mentioning the concrete implementations of abstract techniques that the author used, the article does not conclude anything about the benefits of using specific languages, frameworks or tools.
What you have claimed so far is that:
1. there is a hidden assumption "that all static and dynamic typing are created equal, i.e., since Haskell is statically typed and Haskell appears to have caught Python bugs that unit tests did not, therefore Java will catch bugs in a Ruby codebase, C++ will catch bugs in a JavaScript codebase, etc."
If anyone jumped to this conclusion, it was you. The only thing I can conclude from the article is that static typing checks such as those implemented in Haskell catch bugs that were not caught by unit testing logic such as that used in Python projects within the study. To conclude anything more I would need data not present in the article, such as exactly what types of errors were caught or missed, etc.
2. the conclusion of the study "makes an unwarranted equivalence of all languages that have 'static type checking'"
It doesn't. The conclusion about the static type checking vs. unit testing might not be backed by enough solid data, but the conclusion makes no claims about languages, beyond specifying which languages were used in the study.
3. the claim that "this benefit applies to all languages with static typing" is "precisely what" the conclusion "says".
No occurrence of any phrase even remotely resembling the quote can be found in the article. Saying "this is precisely what it says" means "you'll find that phrase or one very similar to it in the text". Maybe you were trying to claim that "this is precisely what it means", but it's definitely not what it "says".
All in all, the sweeping generalization about the concrete languages was introduced by you. My guess is that this is because you were, like me, frustrated by the vagueness of the article. I would have loved seeing more concrete data. Saying "X types of errors were found" is not as good as saying "the following types of errors were found" and that's just the start.
"All in all, the sweeping generalization about the concrete languages was introduced by you."
I think the plain, direct language of the paper's conclusion is clear enough without me having to embellish it, and without its defenders extrapolating all of the qualifications and subtexts that they think I missed. You really don't have much to work with, because the paper's clumsy conclusion is small, blunt and unqualified in its scope. It takes a handful of small Python programs translated to an idiosyncratic language like Haskell and concludes:
"in practice [dynamic typing with unit testing] is an inadequate replacement for static type checking."
This is unequivocal language. There's no qualifications about language, context, or any kind of variables that might possibly dilute the strength of the conclusion.
On the other hand, Peter Cooper gives a great example elsewhere on this thread of a much better paper with much broader scope, more stats, and much more modest, qualified conclusions. This is the kind of language that is useful and gives me confidence that the authors didn't start out with an axe to grind and merely followed what metrics they had to the warranted conclusion, no more, no less:
"Even though the experiment seems to suggest that static typing has no positive impact on development time, it must not be forgotten that the experiment has some special conditions: the experiment was a one-developer experiment. Possibly, static typing has a positive impact in larger projects where interfaces need to be shared between developers. Furthermore, it must not be forgotten that previous experiments showed a positive impact of static type systems on development time."
When you present conclusions in an academic paper, the onus is on the author to bend over backwards to prevent the reader from interpreting a stronger conclusion than intended. I think davesims' interpretation is fair given the language, and were I reviewing the paper, I would have asked the author to temper his conclusions in a similar manner.
From the downvotes I can only conclude that many of you wish the study didn't claim what it claims and are merely shooting the messenger. If anyone can point out rhetoric within the study that qualifies it in such a way as to make comparisons of other statically typed languages with other dynamically typed languages out-of-bounds or expressing a false equivalence within the scope of the conclusions of the study itself, I'll retract.
But so far all of the arguments I'm seeing against using, for instance, Java, are coming from a perspective not advocated by the study. You all have a point -- it's just not the point made by the paper.
To simplify: there is a difference between "static typing is better than dynamic typing" and "all static typing is always better than all dynamic typing". It's basically the difference between ∃ and ∀.
Saying that "static typing is better than dynamic typing" is like the former: there exists some static typing system that is better than dynamic typing. Saying that "all static type systems are better than any dynamic system" is like the second. All the paper ever says is the first: "Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking." Note how it never claims to apply for all possible static type systems; rather, it just says that tests are an inadequate replacement for type systems in general (i.e. there exists some type system that catches more errors than tests). This is exactly like my first example.
In summary: a being better than b does not mean that all a is always better than all b. Just because static typing is better than dynamic typing does not imply that Java is always better than Python; it merely implies that some statically typed language is better than Python.
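To put the two readings side by side in notation (my shorthand, not the paper's: better(s, d) means "s catches at least the defects that d plus its unit tests catches"):

```latex
% What the paper's conclusion amounts to:
\exists\, s \in \mathrm{Static}\;\, \forall\, d \in \mathrm{Dynamic}:\ \mathrm{better}(s, d)

% The reading being attributed to it, and rightly rejected:
\forall\, s \in \mathrm{Static}\;\, \forall\, d \in \mathrm{Dynamic}:\ \mathrm{better}(s, d)
```

The study, at best, exhibits a witness (Haskell) for the first statement; nothing in it quantifies over every static type system.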
I agree with your characterization in your first paragraph, but I agree with davesims that the conclusions are too strong. If one has to do the level of analysis of the conclusions that you present in your second paragraph, then they are poorly worded. I find davesims' interpretation a reasonable one, which leads me to agree that the conclusions need to be tempered and clarified.
You would do well to consider the very real possibility that it is in fact you who is misguided, and not the rest of the world. You come off sounding childish when you refuse to even consider the possibility that you are simply misinterpreting the purpose and conclusion of the study. The only reason most people can think of to explain your behaviour is that you have an axe to grind and just want to shoot down anything that paints static typing as a positive thing.
When it says "static type checking" it does not mean "all static type checking" but rather "good static type checking". And this is what the study showed (ignoring issues of methodology and sample size for the sake of argument): a (good) static type system would have caught more errors than unit testing, therefore static typing is good.
Generalizing any comment to all static type systems is silly: there are language like C that have a static system but provide basically no additional safety at all. You can easily provide examples of really bad statically typed or dynamically typed languages, but these examples say nothing of static or dynamic typing in general: they're just bad. Questions about static vs dynamic typing can only be answered by the best (or at least good) examples of each.
Showing that a good statically typed system is more robust than a good dynamically typed system is a useful proxy for comparing static typing to dynamic typing. This is similar to a study on seat belts ignoring poor seat belts that strangle the passengers in the event of a crash.
In short: just because static typing is better does not mean all static type systems are better, because you can always come up with a sufficiently bad example of static typing.
>I'm not sure what you're reading, but there's no qualifications in the language used here
That is precisely my point. You are saying "this comparison of coke vs pepsi is no good because they used cold coke, and when I drink warm coke it isn't very good". Yeah, no shit. Stop drinking warm coke. Your decision to drink warm soda has no bearing on the test of cold soda vs cold soda.
Fine, then don't claim something like "All cokes in all contexts at all temperatures are better than all pepsis in all contexts at all temperatures."
This is equivalent to what the study does with static vs. dynamic. Your argument, if you actually had a point, would be something along the lines of, "wait I'm talking about this boutique hand-crafted cola (Haskell) I get at Whole Foods, not that old Coke (Java), that's 20 years out of date!"
You're trying to retroactively reduce the scope of a study you didn't write. The conclusions clearly use generic language that brings all statically typed languages into a comparison with all dynamic languages. The false equivalence is not mine! It's the study's. If you want it differently, go write your own study that reduces the scope of the conclusions.
I'm gonna have to disagree with you about the conclusion you're drawing. Yes, they are using the generic phrasing of "static typing" vs "dynamic typing", but this is because the study was intended to test the concept of static vs dynamic typing, not particular instances of it. However, seeing as we only have specific instances from which to test, it used the best one currently in widespread use. I don't see this as a problem, nor do I think the wording of their conclusion necessarily implies anything about all instances of static typing currently in use. Sure, it left that open as a possible interpretation for people looking for justification of a preconceived notion, but you can't really blame that on the authors.
> but this is because the study was intended to test the concept of static vs dynamic typing, not particular instances of it
Help me out here -- since the study confines itself to a handful of small Python programs translated to an idiosyncratic language like Haskell, how can the scope of the study possibly in any way qualify as a study on something so broad as "the concept of static vs. dynamic typing"?
Are you not confusing a better, more appropriate argument you'd make for the argument actually made in the paper?
EDIT:
> Sure, it left that open as a possible interpretation for people looking for justification of a preconceived notion, but you can't really blame that on the authors.
Is that really an argument you want to make, that I can't blame an author for using broad and imprecise language that invites unwarranted conclusions in an academic paper?
>Help me out here -- since the study confines itself to a handful of small Python programs translated to an idiosyncratic language like Haskell, how can the scope of the study possibly in any way qualify as a study on something so broad as "the concept of static vs. dynamic typing"?
You raise a good objection here. Is it possible to draw conclusions about the class of type systems labelled "static typing" vs dynamic typing by using a small sample of programs? I think this is where the impedance mismatch is occurring. The author seems to take static typing to mean "what can currently be accomplished through static typing", and thus he was justified in using the strongest static type system in use to do the study. Taking it this way, the study seems meaningful.
Taking the other meaning, the class of type systems labelled static typing, then you end up with a very large set of languages each with (perhaps) varying amounts of power. Doing a study with just one static language does seem inadequate. Although, depending on the class of errors caught, it may still be valid. As far as I've seen, Haskell doesn't catch new classes of errors that are impossible in other systems, it just makes it a lot easier to do so. So essentially Haskell has the same power as other common type systems. If this holds, then the study would still be valid. (Admittedly I know very little about Haskell so I could be completely wrong).
TLDR: I see what you're saying, and I do agree that there needs to be more said before his conclusion can be supported by the study.
I don't think so -- anyone is free to choose to use the best-available type system. You can't just choose to write the best possible unit tests.
He could only compare one of the best possible environments for writing dynamically typed code and unit tests to one of the best possible environments for writing statically typed code.
>Fine, then don't claim something like "All cokes in all contexts at all temperatures are better than all pepsis in all contexts at all temperatures."
He didn't. He said "coke tasted better than pepsi". I've explained this to you several times already. You are the only one saying anything about "all the time in every context". You. Not the author, not his paper. You.