Whether enforced compile-time strong type checking is a benefit seems to depend on the programmer. It apparently helps some people, it certainly does not help me.
For what it's worth, I've been building "production" (is "production" something you make money on? I find this word increasingly vague) systems with both Common Lisp and Clojure for quite some time now. I prefer Clojure. But both languages are lisps. My thoughts so far: you can build spaghetti code in any language. You can either use a language that lets you run your spaghetti quickly, or one that whacks you on the head repeatedly thus making your spaghetti stiffer and straighter. But you end up with spaghetti anyway.
I agree that it is difficult to write good lisp code. In my code, I spend a lot of time thinking about contracts and data structures. If I am not careful, I end up with problems later on. But using a language like Java doesn't solve that: you just get the illusion of "better code", because your spaghetti design is now codified into neat factories and patterns.
The advantage of using a language from the lisp family is that reworking your spaghetti into something better is much easier, if only because there is so much less code.
Have you ever worked on a 1M line code base? Or with dozens (or hundreds!) of developers?
Once the team gets big, or the code gets so big you can't hold it all in your head, any machine-enforced QC can have a major impact.
(I'm working on a very large Haskell code base now -- the parts that tend to cause trouble are the dynamically typed bits -- they can just go wrong in so many unanticipated ways, and the complexity means it's almost unavoidable that devs miss things).
I'm increasingly convinced that to wrangle massive code bases, you need both heavy abstraction abilities (higher order functions, DSLs, algebraic data types, polymorphism), and very strong typing, to keep the complexity under control
A 1M line codebase can mean many things. It can mean code that belongs to many systems mashed together as if it were a single thing. It can mean that code that should belong to different systems is tightly coupled into one monolithic entity that should have been several little entities. It can mean lots of boilerplate code too. It can indicate a lack of timely refactoring on an aging codebase so interconnected nobody has the courage to separate it into more manageable pieces. None of those can be solved by clever language choices alone.
You also mention things going wrong in unanticipated ways - this may signal that the problematic code where errors bubble up is not problematic at all - it is called by code written by people who don't really understand what the functions do and who probably didn't write adequate tests for that - because the tests should catch the unanticipated parts. The problematic code is the one calling the parts where errors bubble up. The canary is not responsible for the gases in the mine.
While you may be right that, in order to deal with multi-million-line codebases you need static typing, I'd much rather split that codebase into smaller units that could be more easily managed.
Wearing a straitjacket is often a consequence of an underlying condition that can, sometimes, be corrected.
I would expect that dons is talking about a 1M line codebase which already consists of manageable pieces. It's only that the pieces have to work together, talk to each other, and know about each other (but not too much).
Sometimes software solves problems that provoke incidental complexity through sheer size, and my experience (albeit not above 500k) tells me that indeed, every bit helps, including compiler-enforced type checks. I would never bet my life on tests. As you write: "because the tests should catch the unanticipated parts" -- that's the point: tests never catch unanticipated parts, by their very nature. Sometimes, by sheer luck, yes.
A compiler can only go so far. It'll happily compile:
#include <stdio.h>
int do_something(FILE *f); /* declared up front so this really does compile cleanly */
int main(int argc, char *argv[])
{
FILE *fp = fopen("outfile", "w");
fclose(fp);
do_something(fp); /* fp has already been closed */
return 0;
}
int do_something(FILE *f)
{
fprintf(f, "This should work if you know what you're doing");
return 0;
}
And then you are back to dynamic typing again, except that, when the pointer you pass refers to something fprintf doesn't like, you'll get a segfault instead of a doesNotUnderstand.
The compiler will happily compile that because C's standard library thinks it's a fantastic idea to just throw type-safety out the window. Even C libraries can protect themselves from this via strong typing instead of overloading what FILE* effectively means.
I think you mean contracts or defensive coding. If someone calls my code wrong they will have to think to test that particular case themselves, which is hard. Unless I've written contracts; then when it blows up they'll know what they did wrong.
The question we have to answer to properly understand what went wrong is where did the argument originate. If it's being generated inside function A that then calls function B with it, a test of A should fail when it calls B with the wrong argument. In any case, I would imagine the test coverage in the A-B system is lower than it should.
The biggest programs I've worked on are way smaller than 1MLOC, but I've dug around a few large code bases written in dynamic languages (Emacs most notably: 1M lines of Lisp, 1/4M lines of C), and my unproven theory of large systems is that it might be a good idea to make mistakes have smaller impact, rather than try to avoid them (they are inevitable), by focusing on better abstractions, and thus keeping problems more local -- focusing on the protocols rather than on the implementation. Static typing might help here, but designing a protocol/interface/api/dsl is hard and I need to experiment a lot with it, so I'm willing to sacrifice some safety for better overall design and flexibility (which will hopefully make debugging easier). IOW I agree you need DSLs, polymorphism and all that good jazz. Where my disagreement comes in is that it's a trade-off, and not necessarily an absolute need to have a lot of static typing. In my personal work I'm willing to make the trade-off and work with clay rather than marble. Other people work and think differently and I accept that; absolute truths are rare in computer science, and don't exist in software engineering, it's all trade-offs here :)
ps My hobby vaporware CL project will probably be called mudball, I'm trying to learn how to design systems with hackability and extreme flexibility in mind :)
>Whether enforced compile-time strong type checking is a benefit seems to depend on the programmer.
And, due to lisp's nature, you can have static type checking as a library, eg. there's some work on it for clojure https://github.com/frenchy64/typed-clojure, and AFAIK it's based on work already done by Typed Racket/Scheme.
Also I believe that because clojure is a lisp, has very few special forms, is immutable, has nice ns/var semantics and overall focuses on simplicity -- you could build quality code analysis tools with relative ease, something that will search code for common error patterns, maybe even do verification outside the type system with a custom DSL. It's something that I would like to explore in a few months after I finish my current project.
> But using a language like Java doesn't solve that
The article isn't talking about languages like Java; it's talking about languages with good static type systems.
I don't know if it's possible in Clojure without giving up on Java interop entirely, but good static type systems in other languages can make null pointer exceptions impossible, among other things. If you tell me null pointers aren't a problem for you personally, I'll find that difficult to believe.
If by null pointers you mean that some data is nil where it should not be, then certainly — they are a problem. In fact this is probably the biggest class of problems I encounter on a daily basis.
I still think you don't need a type system to address this. You could even argue that "nullness" is not a type, it's a data value (an out-of-band data value, shall we say). The way I deal with it is tests for corner cases (I always try to have tests that demonstrate what a piece of code does with null parameters), preconditions and assertions.
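To make that concrete, here is a minimal sketch of the precondition/assertion style in C (hypothetical function, used here only for illustration -- the commenter's actual code would use Clojure's :pre checks or equivalent):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the "not null" contract is checked with a
   runtime assertion rather than encoded in a type system. Calling
   it with NULL fails loudly at the call site, telling the caller
   exactly what they did wrong. */
size_t count_words(const char *s) {
    assert(s != NULL && "count_words: s must not be NULL");
    size_t count = 0;
    int in_word = 0;
    for (; *s; s++) {
        if (*s == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}
```

The corner-case tests the commenter mentions would then exercise the empty string, whitespace-only input, and (in a debug build) the NULL case, documenting the contract without any type-level machinery.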
I agree that good static type systems can be of help, that is what I meant when I wrote that they seem to help some programmers. I don't dispute that. I dispute the other claim that I understood from the article: that languages that enforce a rigid structure are strictly necessary for "production software".
Instead of writing strongly type-checked code, I prefer to break functions into small pieces. Since Lisp has a good trace facility, you can easily see how functions compose with each other at run time. For example, see my post: http://dryman.github.com/blog/2012/03/31/persistent-red-blac...
If you presume that it's OK to rely on compiler errors rather than attempt to exercise all code paths, then yes, Common Lisp, Scheme, Clojure, JavaScript, PHP, Python, Ruby, Perl, and bash will be more perilous than C, C++, C#, Java, ML and Haskell. (Which of the former group are Lisps?) I disagree with the presumption.
> more perilous than C, C++, C#, Java, ML and Haskell
Right. Unless you use features to escape the type system (such as C's casting, Java's reflection, or C#'s dynamic) which every non-trivial program I've ever worked on did.
Much less dangerous, in my experience, than the programmer that thinks he can keep a huge system all in his head. I've moved back to static languages after over a decade with dynamic languages and I couldn't be happier to let the compiler take care of a lot of low-level consistency issues so I can focus on the code.
Summary: use SBCL or CMUCL, declare your types, hire good programmers, and don't commit to a deadline without getting a commitment to a specific set of requirements.
Statically checked types are little more than very broad pre/post conditions on your code. Both CL and Clojure offer such conditions, and both can check them at compile time if necessary.
Both this assertion and the one that types are "a set of compiler-enforced contracts between pieces of code" are very wrong ways of thinking about types, and the reason why most statically typed languages are so poorly designed and get such a bad reputation.
Static typing in a programming language is a mechanism for making certain types of non-terminating computations impossible to express in that programming language. "Non-terminating" here is understood to include errors as well (the idea of bottom ⊥ that you'll find in some earlier texts on programming language semantics, such as Allen's Anatomy of Lisp). Non-terminating computations are equivalent to paradoxes in formal logic (see the Curry-Howard correspondence), and the first type system was developed to deal with Curry's paradox (http://en.wikipedia.org/wiki/Curry%27s_paradox) in the first programming language (lambda calculus). Some type systems prevent more types of non-terminating computations, others fewer. But saying that run-time type assertions are equivalent to a typed programming language, or that types are like contracts or other things, just leads to confusion.
No. They're a set of compiler-enforced contracts between pieces of code. Maybe more importantly, they enable a kind of constant, aggressive refactoring that's just way too dangerous in dynamic code.
Maybe at large scale. I have spent my career working with 200k - 500k loc systems in Java, C#, and Ruby, and in my experience, the thing that makes refactoring possible or a nightmare is well-written test suites, not compilers. Compilers definitely catch a certain class of problem, but tests catch many more.
It holds even more so at small scale because I can pretty confidently move a method up an inheritance chain or rename a class and count on the compiler catching everything. In a bigger codebase I have to be more careful about breaking somebody's excessively dynamic code that might be pointing to those things in a roundabout way.
As an experiment last year I started teaching myself machine learning by implementing the common algorithms in Clojure and Scala simultaneously. I expected Clojure to win out and worked a lot harder in the beginning but after a while I was forced to concede that my Scala code was cleaner, faster, and less buggy, partly because I felt comfortable changing things because I knew the compiler would help me.
Pre/post conditions at compile-time in Clojure? Nope. How could you know the value returned by a function at compile time? Or the value of the inputs?
You could only know the types of the inputs if the compiler were smart enough, and then define constraints on those types -- but as far as I know that's not possible in Clojure.
This is so childish. If this article was about Python or some other language, this type of comment wouldn't be tolerated. Why is it when the topic is lisp?
Because it's a funny little stereotype that has a grain of truth to it (come on, you've never heard of "old Lisp greybeards"?) and at least in my mind isn't insulting. At least, if someone implied I was a bearded old programmer, I'd be happy to be included with the likes of McCarthy, Ritchie, Kernighan, and Thompson.
Now, if we were talking about Python and someone said, "the main peril of Python programmers is that they'll get apple juice and crayon all over everything", that would be insulting!