A Practical Optional Type System for Clojure

pufuwozu · on Oct 25, 2012

Overall, the work described in this dissertation leads to the conclusion that it appears to be both practical and useful to design and implement an optional static type system for the Clojure programming language.

I'm a huge type safety fan (see http://roy.brianmckenna.org/) so this is pretty amazing. I'm definitely going to play around with Ambrose's work.

Had a quick look, it's really exciting that the algo.monads was almost completely type-checkable with this system.

Leaves me with a couple of questions:

Multimethods are untyped. Anyone able to comment on how often multimethods are used in idiomatic Clojure code?

Anyone know if this work could eventually allow protocols to become full type-classes (allowing dispatch based on return type of protocol methods)? Am I misunderstanding how protocols are compiled?

Hopefully the videos for Typed Clojure at Clojure Conj 2012 will be posted soon after the talk is given :)

My only fear is that optional typing would be less useful than optional untyping. When you have libraries that are untyped, they're a pain to use from a typed language. The other way around is not true.

ambrosebs · on Oct 25, 2012

Multimethods are an awesome feature, which I'm a big fan of, but they were lower on my priority list so I haven't got around to typing them. It's not clear how comprehensively they can be typed either, I comment about it in "Future work".

I don't know a lot about type-classes, but I'm doubtful that Typed Clojure could help make protocols like them. I would expect a whole different dispatch system would need to be designed. But that's mostly a hunch. Very interesting problem.

If you want to play around with this kind of stuff immediately, check out Typed Racket. The Typed Clojure implementation is in alpha and might not be very friendly yet :) That said, please try it out!

noelwelsh · on Oct 25, 2012

If you like this work you'll also like Typed Racket, on which it is based. Sam TH is the main author and his publications are here: http://www.ccs.neu.edu/home/samth/

chadzawistowski · on Oct 25, 2012

The first thing I thought of while opening this article was Qi / Shen, an optionally typed Lisp which takes inspiration from the ML family of languages. http://www.shenlanguage.org/

Shen's dedication to providing functional concepts such as partial application really interests me, but I find Clojure's practical nature even more enticing. Perhaps Typed Racket and this type system for Clojure will provide the best of both worlds!

ambrosebs · on Oct 26, 2012

Shen is pretty amazing and is built with a type system in mind from the beginning. If you don't want the type system, just turn it off, in the middle of a REPL session if you want!

firlefans · on Oct 25, 2012

I am not sure about the motivation for this project but this article helps explain why some folks are interested in type systems as detachable (pluggable) components, separate from languages:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.175...

pufuwozu · on Oct 25, 2012

I've read a few of Gilad's papers - I just don't see how they can be practical in a world of third-party libraries (e.g. author A hates types, I like type-safety but I have to settle for none when I use that library).

jamii · on Oct 25, 2012

This paper demonstrates an effective way of interfacing between typed and untyped code such that type errors can only happen at the boundaries - not within typed code.

homepages.inf.ed.ac.uk/wadler/papers/blame/blame.pdf

firlefans · on Oct 25, 2012

I guess, but this seems like something that could become a best practice for library authors, and it'd be fairly trivial to retrofit or fork existing libs with annotations.

Tuna-Fish · on Oct 25, 2012

> Multimethods are untyped. Anyone able to comment on how often multimethods are used in idiomatic Clojure code?

Not at all.

Multimethods were Hickeys solution to polymorphism before the Haskell enthusiasts managed to preach the gospel of type classes to him. Today, they are more or less deprecated as a solution to a problem.

fogus · on Oct 25, 2012

> Not at all.

This is categorically false. Multimethods solve a certain problem "open arbitrary function dispatch". Nothing in Clojure aside from multimethods solves this problem. There are many projects that use multimethods to their great advantage, including, but not limited to:

* Clojure

* ClojureScript

* ClojureScript One

samaaron · on Oct 26, 2012

Overtone also uses multimethods in a number of places. They're crazy powerful and useful.

Rayne · on Oct 26, 2012

They are not even remotely deprecated.

th0ma5 · on Oct 25, 2012

This is really great, and immaculately put together. Recently though I've come across people who think that you need strong types when programming, and this is really weird to me. I think it has its place for sure, and is a great way to ensure you're being correct, but I also really like not having to mess with types on hobby projects and such for sure :D

Anyway, great paper.

ambrosebs · on Oct 25, 2012

Thanks, I intentionally chose examples where types are particularly useful, like monads and conduits (and apparently null elimination).

Evbn · on Oct 25, 2012

I love strong types on hobby projects, so I don't have to constantly go back and fix crash bugs after I get it working once.

jshen · on Oct 25, 2012

You must be doing it differently than me, and most other people I've worked with.

ambrosebs · on Oct 25, 2012

I'll just note that any corrections are more than welcome, this is my first paper and I've got a lot to learn! Please send them to abonnairesergeant at gmail dot com.

dschiptsov · on Oct 25, 2012

One of the distinguishing features of a classic Lisp is that values, not variables have a type-tag and the type-safety checking is performed at runtime. Everything is a reference (a pointer), everything is a first-class object (could be passed anywhere), everything is an expression (uniform syntax).

Ironically, Clojure is, a Java in the land of Lisp. They broke the abstraction layers, mess up logic with implementation, introduce not just different kind of parenthesis, but redundant and irrelevant data-structures and ruined the magic of "everything is a list". Look what a classic 3-lines 'keep function became.)

lukev · on Oct 25, 2012

> but redundant and irrelevant data-structures

Now you're just trolling. Clojure's immutable persistent data structures are one of the primary strengths of the language, and what I tend to miss most when I'm working in another language.

dschiptsov · on Oct 25, 2012

It is not about collections, it is about breaking the conceptual unity of the core language by using [] for a parameter.. array.)

Evbn · on Oct 25, 2012

All of closure's syntax is sugar on top of lists that still work.

dschiptsov · on Oct 25, 2012

Oh, excuse me, not an array, I'm old fashioned - heterogeneous vector!)

firlefans · on Oct 25, 2012

You say array, I say heterogenous vector, let's call the whole thing off!

KirinDave · on Oct 25, 2012

By this definition, only PLT Scheme has really kept the flame alive then, right?

A lot of Common Lisp implementations have gone forward with replacing a lot of underlying machinery with type-inference-driven compilation and parts to actually get performance out of modern hardware. I don't see a huge difference here, except that there is a big interop layer more firmly baked into the stdlib.

dschiptsov · on Oct 25, 2012

Lisp is not just some implementation, it is a well-balanced set of ideas and trade-offs, which together produced a natural programming language with several dialects.

Clojure, on the other hand, is just an implementation of a bunch of features.

All I'm trying to say is that Lisp is incomparable more harmonized and much more elegant creation that Clojure, which compared to Scheme or Arc, is, excuse me, an ugliness.)

KirinDave · on Oct 25, 2012

Engineering a platform or product is seldom a pretty business. Ask the Yesod folks. It's only recently that anything like Conduits with Conduit's performance is finally finding "good" and well-researched alternatives.

Not that your complaint is incorrect, it's just sort of difficult to actually use this as the basis for an argument against Clojure because everyone has already made their peace with this. It is not a "pretty" lisp; it's a natural response to the question, "How can I keep working with the JVM and leverage JVM libraries without devolving into the 1990's Taligent style verbosity that is modern Java?"

pufuwozu · on Oct 25, 2012

type-safety checking is performed at runtime

That doesn't really make sense to a type theorist. The definition of "type-safety" is only applicable statically.

What checking is done at runtime?

rprospero · on Oct 25, 2012

It's a different, weaker level of type safety that's become so prevalent that you probably take it for granted. The runtime checking is just to make sure that operations are performed on logical types.

My favorite analogy for this is the restaurant one. Haskell insists on having two waiters: one to deliver the food and a second to deliver the check. Clojure allows one waiter to deliver both, but throws a fit and leaves the restaurant if the check arrives before the bill. Assembler eats the first thing that arrives and signs the second, even if the bill comes before the food.

Clojure is "type-safe" in that it will exit instead of performing an illogical action (e.g. dividing a string pointer by a network socket). It's just as type-safe as Haskell code littered with fromJust.

pufuwozu · on Oct 25, 2012

That checking is by definition, not type safety.

What you're talking about now is totality. Both Haskell and Clojure allow non-total functions (functions which may not return a result) - the checking that they do are to stop undefined results. Agda is an example of a language that only allows total functions.

rprospero · on Oct 25, 2012

Adding the fromJust Haskell argument seems to have confused the argument I was trying to make. I had two points. The first one was better expressed by Evbn and I don't really have anything to add to that.

The second was that Clojure arguably has a type system. There is exactly one type, Stuff. Every function takes Stuff and returns Stuff. Of course, functions themselves are a kind of Stuff. Now, it lacks totality, but otherwise every compiled Clojure program has passed the type checker. It's just that type checking is a no-op, since there's no way for anything to have the wrong type.

Evbn · on Oct 25, 2012

Totality is different. In C you can (using a pointed cast cast) add a float to an intended and get gobbledygook (by interpreting the bits in memory incorrectly) and keep going. In Closure you can't, it will throw an exception, but you won't get warned about that until runtime.

wooby · on Oct 25, 2012

Your example is nonsense. Clojure's keep is based on lazy sequences instead of concrete lists, and is also optimized for performance.

That said, you raise a point that occurred to me recently, and it relates to Clojure's literal representation of data beyond lists and your point about the "ruined magic".

In most Lisps, the only evaluation semantic is that of the "operator form": a list of operator car and arguments cdr, both of which are evaluated themselves.

Clojure, however, adds an evaluation semantic for its data literals. Consider this code:

{(+ 1 2) (+ 3 4)}

It results in the map {3 7}, but where is the "operator form" that caused the top-level evaluation?

In Lisps with reader macros, an expression like above boils into code resembling (hash-map (+ 1 2) (+ 3 4)), a classic "operator form", which is then evaluated.

In Clojure, however, which does not provide reader macros, there is no intermediate list representation of the expression accessible to the programmer. Effectively, Clojure adds evaluation semantics to Lisp beyond "operator forms": data literals evaluate their contents, and qualify in many ways as "special forms".

I have run into at least one implication of this design having to do with the structural representation of Clojure code. With a "classic" Lisp, an IDE or editor is empowered to present to the user any representation of the code, and can convert it to Lisp expressions arbitrarily the same way reader macros do.

With Clojure, this type of "skinning" is harder, because IDE-like navigation of data implies ordering. The ordering exists in the character-wise representation of a map like {3 4} but not in its evaluated counterpart.

lucian1900 · on Oct 26, 2012

All of Clojure's literals desugar to calls. They're reader macros you can't add to.

wooby · on Oct 26, 2012

My point is that they desugar directly to data structures (instantiated in Java, in LispReader) instead of operation forms that evaluate to data structures.

See vector creation: https://github.com/clojure/clojure/blob/52633274225370dfb058...

lucian1900 · on Oct 27, 2012

I don't see how that distinction is useful.

gfodor · on Oct 25, 2012

nice troll, but in clojure everything is a seq.

Rayne · on Oct 26, 2012

That isn't actually true. Vectors, maps, etc, are not seqs. You can get a seq view of them, but the collections themselves are not seqs.

gfodor · on Oct 26, 2012

I'm confused -- list's aren't seq's either by your definition, are they? My understanding is a seq is just something that meets first/rest semantics?

dustingetz · on Oct 25, 2012

> They broke the abstraction layers, mess up logic with implementation

your opinion is one i haven't seen before, do you care to elaborate?

dschiptsov · on Oct 25, 2012

http://news.ycombinator.com/item?id=4517540

lukev · on Oct 25, 2012

The implementation of Clojure itself is not idiomatic nor is it intended it to be so. Most of the complexity in that example, for instance, is to deal with chunked lazy sequences. In user-land code, chunking of lazy sequences is pretty much a pure performance optimization that you don't have to worry about.

Granted, it's not a ideologue "turtles all the way down" LISP. But it isn't intended to be. If you want that, I advise you to find an old Lisp Machine and use that.

jimbokun · on Oct 25, 2012

Your three line version is pretty much equivalent to this part of the Clojure version:

    (let [x (f (first s))]
        (if (nil? x)
            (keep f (rest s))
            (cons x (keep f (rest s)))))

The rest of it appears to be performance optimizations, that probably make sense for a key function in the core library likely to be commonly invoked.

dschiptsov · on Oct 25, 2012

There are some subtle differences, which makes the whole sense.

Classic version has a self-evident clarity and familiar shape of a recursive function, which utilizes TCO.

Perfection is achieved when there is nothing more to cut off, not when there is nothing more to pile up.)

spacemanaki · on Oct 25, 2012

Is the classic one actually going to be able to leverage tail-call elimination? Obviously it could be refactored to make a tail-recursive call, but correct me if I'm wrong, it wouldn't be automatic.

After refactoring it, the tail-recursive version would be a bit harder to read than the naive implementation. In a sense the Clojure code is in the same boat, it's just that the optimizations are more complicated, because it's not working with singly linked lists in the "address register" and "decrement register".

rapala · on Oct 25, 2012

So the classic version works in a lazy fashion, for example with infinite structures? Keep in mind that Clojure is a strict language by default.

Evbn · on Oct 25, 2012

Nothing more to cut off, including runtime and memory usage.

zcam · on Oct 25, 2012

Apple & oranges. The only thing these 2 have in common is their name, the clojure version does a lot more.

A definition of a similar keep fn in clojure would be almost indistinguishable from your lisp example (it would just be more readable).

dschiptsov · on Oct 25, 2012

Who told you that more is better? Why we should have nested lets and all that boilerplate for a classic filter?

simonb · on Oct 25, 2012

Because you then don't need the complexity (& boilerplate) of SERIES or LOOP the moment you want to do a series of transformations on a slightly bigger list.

kenko · on Oct 25, 2012

You could write the clojure version in the exact same way (well, you wouldn't say null?, but, basically the same) as the lisp one. But it wouldn't be as efficient given the various ways sequences might be implemented in Clojure. That makes sense as part of the standard library, IMO---users of keep don't need to care about whether their sequence is chunked or whatever. And it makes sense that there should be a variety of implementers of sequences: there's more to life than singly-linked lists.

(Even Guy Steele, who should know, recommended a couple years ago that language designers omit cons from their langauges!)