Try Idris

jcurbo · on April 13, 2014

For those curious what Idris is, from http://www.idris-lang.org/ : "Idris is a general purpose pure functional programming language with dependent types. Dependent types allow types to be predicated on values, meaning that some aspects of a program’s behaviour can be specified precisely in the type."

Here's a couple of talks about Idris:

Edwin Brady at London Haskell: https://www.youtube.com/watch?v=vkIlW797JN8

David Christiansen at Haskell DC: http://bmsherman.github.io/idris-drc-3-26-14/

And there's a tutorial and other docs available here: http://www.idris-lang.org/documentation/

14113 · on April 13, 2014

Just to note, Edwin was the original author of Idris, so I'd definitely recommend watching his talk.

Edit:

Given that my comment seems fairly high at the moment, I'll post a link to his PhD thesis, which seems to be the genesis of Idris for anyone interested: http://eb.host.cs.st-andrews.ac.uk/writings/thesis.pdf

ryanartecona · on April 13, 2014

I'll add: if all this seems too opaque to jump into at once (and especially if you're unfamiliar with Haskell or other FP languages), don't give up on FP!

A "functional" language is not always a "pure" language, and a "pure functional" language certainly does not always have dependent types. These concepts build on each other.

A great intro to FP is "Programming in Standard ML". Concise, precise, and very approachable. SML is functional, but not pure nor dependently-typed (though purity is encouraged). http://www.cs.cmu.edu/~rwh/smlbook/book.pdf

Haskell is like SML with enforced purity, a different module/typeclass system, and default laziness. LYAH is the best intro. http://learnyouahaskell.com/

Idris is aimed primarily at those already familiar with Haskell, so don't be discouraged if the online REPL feels too foreign!

pufuwozu · on April 13, 2014

Hey everyone, I'm the creator!

Sorry for the problems - I was hotlinking to the awesome http://rawgit.com/ website during development and forgot to remove it before posting it on Twitter.

The website should be fixed in just a few seconds. Thanks for trying it out!

skybrian · on April 13, 2014

I'm glad to see dependent types move a bit closer to practical programming. I read (well, skimmed) the tutorial and learned a little of what it's all about. Being able to pass around assertions as first-class objects seems pretty interesting.

However, I'm still confused about what's happening at compile time versus runtime. Apparently some functions can be used in proofs (and are therefore evaluated at compile time), while some proofs must be evaluated at runtime?

The notation isn't very helpful here. Since types are no longer strictly compile time and values are no longer strictly runtime, it seems like we need some other way to easily distinguish compile time and runtime.

ihm · on April 13, 2014

See this page (https://github.com/idris-lang/Idris-dev/wiki/Erasure-by-usag...) for info on erasure. Unused proof values or indices in data types used for static checking are erased at compile time via usage analysis (if you don't use it in the RHS of a function definition, it gets erased).

If you want to ensure that the indices of your data type or arguments to your function are erased, you can use the new dot notation. E.g., if you give a function called "half" the following type

  half : (n : Nat) -> .(pf : Even n) -> Nat

then the compiler will complain if the proof argument is actually used.

tel · on April 14, 2014

I may be misinterpreting your statement, but passing around assertion objects is a common feature of many DT languages (Coq, Agda, Idris, Epigram, ATS, etc)

peterldowns · on April 13, 2014

Here's what I saw: http://i.imgur.com/4yNB6t4.png

    > This page abuses rawgit.com. You should complain to its
    > owner.

erikano · on April 13, 2014

They can use cdn.rawgit.com instead, which is ok to use in production according to http://rawgit.com/#cdn

Udo · on April 13, 2014

> You can use the REPL above to play

I don't see anything like that on my screen (OS X, Firefox latest). It's basically just three lines of code without any interaction.

fhars · on April 13, 2014

The beginning of the HTML (including the head with the script links) is [sometimes] missing from the page sent by the server.

RussianCow · on April 13, 2014

Same thing here, in both Firefox and Chrome.

Edit: Weird, after a couple refreshes it worked, but then I refreshed again and the console disappeared again.

chrisdevereux · on April 13, 2014

Same for me (OS X, Safari latest)

sabalaba · on April 13, 2014

I discovered Idris when I was playing around with Agda last summer. It's cool to see a systems programming language with dependent types.

Dewie · on April 13, 2014

> It's cool to see a systems programming language with dependent types.

What do you mean by 'systems programming language'?

Rusky · on April 13, 2014

The tutorial says "The goal of the IDRIS project is to build a dependently typed language suitable for verifiable systems programming. To this end, IDRIS is a compiled language which aims to generate efficient executable code. It also has a lightweight foreign function interface which allows easy interaction with external C libraries."

I generally think of systems programming as drivers, kernels, embedded stuff, and performance-sensitive applications like game engines. Not sure how far in this direction IDRIS is going though. It feels a lot like the same niche as OCaml, with dependent types.

edwinb · on April 13, 2014

I tend not to use the phrase "systems programming" any more. "General purpose" is probably better (or my current favourite "Pac-man complete"). I should probably edit the tutorial to that effect. We've done simple network protocols and packet formats (I had a student working on DNS for example) but not driver or kernel level things.

That said, some people are experimenting with lower level things (thinking about making a minimal run time system with no garbage collection for example). It's not a primary goal, but it'll be interesting to see what comes of it.

bjz_ · on April 13, 2014

> I should probably edit the tutorial to that effect.

Yes please do. Idris is a lovely language for bringing dependent types closer to the realm of production code, but giving people false hopes is a little cruel :)

tel · on April 13, 2014

Take a look at ATS then! My understanding from playing around with it is it is trying to be C with dependent types.

thinkpad20 · on April 14, 2014

ATS looks fascinating in concept, but what code examples I've seen, the syntax is terrible and it's ridiculously verbose. I'm sure you could write absolutely bulletproof code in it, but the productivity hit would render it almost unusable for large-scale projects. I might be off here, of course... but one of the lovely things about Haskell is that despite its reputation as a difficult language, once certain key concepts are tackled, it's relatively easy to program rapidly and create real world products (with a considerably higher assurance of correctness than most languages provide). ATS seems complex to a level that nearly guarantees its usage won't extend much beyond academia.

doublec · on April 14, 2014

If you use the non-dependently typed parts of ATS it is no more verbose than most ML variants. If you use it without the GC then verbosity increased because you need to add calls to free memory. If you add proofs or dependent types then declaring those adds more. I've used it in moderately sized production programs and it's similar to using OCaml or other ML in terms of productivity. Once you're familiar with the language of course.

If I paraphrase what you said about Haskell it's true for ATS too:

    "but one of the lovely things about ATS is that despite its reputation as a difficult language, once certain key concepts are tackled, it's relatively easy to program rapidly and create real world products (with a considerably higher assurance of correctness than most languages provide)."

The key, like any language, is learning the concepts. I think people assume ATS is more difficult than it is is because much of the documentation and examples for it use a verbose style and heavy proofs. A post I wrote a while back attempts to gently introduce ATS dependent types and you can see that the non-dependently typed examples are very similar to any other ML language:

http://bluishcoder.co.nz/2010/09/01/dependent-types-in-ats.h...

bjz_ · on April 14, 2014

> If you use it without the GC then verbosity increased because you need to add calls to free memory

The issue is that is the default mode I would like to work in. Rust is designed to make that the natural mode of working, rather than the exception. Fast, deterministic operations should be easier and more succinct than the managed alternatives – in the context of a systems language that is.

> people assume ATS is more difficult than it is is because much of the documentation and examples for it use a verbose style and heavy proofs.

Then they need better docs. And a better website! And better marketing!

doublec · on April 14, 2014

It's interesting looking at the different design decisions of languages. One persons pro is another's con.

ATS2 has no GC as the default mode of working too. The only reason it is more verbose is because you need the call to free. ATS strives to be explicit in what it does so it's easier to read the code and know what is happening.

Languages with destructors or implicit function calls seem to lean towards the C++ downside of "What does this code really do" when looking at it. I dislike hidden code where there is no easily identifiable call to tell you something is happening.

It's not all roses and unicorns with ATS either of course. What I really want is the next generation of dependently typed system languages which has learnt from the current ones.

bjz_ · on April 15, 2014

> you need the call to free

So how does ATS enforce memory safety? With Rust you have a bunch of library types that you can reuse that abstract away from the memory management for you. The types make sure you don't screw up. Unfortunately you can't prove things about the `unsafe` blocks, and have to rely on careful auditing, but that only needs to be done once, and then you have a nice, reusable abstraction. Bookkeeping sucks.

> What I really want is the next generation of dependently typed system languages which has learnt from the current ones.

Agreed. At the moment Rust suits me the best, but I definitely don't think it is the end of the road.

doublec · on April 15, 2014

> So how does ATS enforce memory safety?

Linear types. Much like unique pointers in Rust, the type system tells you when you fail to consume a linear type. A 'free' call would do this. Same with closing files, etc - you use linear types to track that the resource is consumed.

bjz_ · on April 15, 2014

Ah neat. Does it have regions?

tel · on April 14, 2014

And to be clear, ridiculously verbose, dense, and unreadable proofs is the UX failing of most DT languages today. My understanding is that ATS is not significantly better or worse in that department.

bjz_ · on April 14, 2014

Indeed. I love the idea of ATS, but it seems like it would be a challenge to wrangle with. Whilst Rust isn't a flexible and powerful as ATS, it seems to be much easier to handle at the moment.

doublec · on April 14, 2014

I think any language you're not familiar can seem like that - especially if it uses unfamiliar sigils or syntax. I feel similar about Rust for example as I'm not as experienced with it. But with ATS I can put programs together pretty quickly - all because I've learnt it.

bjz_ · on April 14, 2014

Thanks! Yeah, it's hard when you haven't used a language all that much to judge. I wish the sold themselves better though. The website is atrocious for folks wanting to get started with some code. :(

bjz_ · on April 13, 2014

'Dependently typed systems programming' was something that really excited me when I first learned of Idris, but it seems that they have backed down from that claim. Indeed, the home page now reads, "Idris is a general purpose pure functional programming language with dependent types".

orf · on April 13, 2014

Functional languages interest me, but are not very approachable. I mean I looked at the examples[1] page for Idris and came away completely confused, why he hell does this sequence of characters "data Nat = Z | S Nat" define a natural number? Or "data Maybe a = Nothing | Just a" somehow defines an option type.

It would be good to have an introduction for people not too familiar with functional languages.

1. http://www.idris-lang.org/example/

dragonwriter · on April 13, 2014

> Why he hell does this sequence of characters "data Nat = Z | S Nat" define a natural number?

Because a natural number is either zero (the one value of type Z, which has a nullary type constructor and therefore only one value) or (represented by the vertical bar "|") the successor to another natural number (represented by type S, which has a one-arg constructor whose argument is of type Nat.) Its a recursive type definition, with a base case and a recursive case.

> It would be good to have an introduction for people not too familiar with functional languages.

It would be; right now, you are probably be best served by one of the ones for Haskell -- Idris doesn't seem to have a lot of documentation not aimed at people not already familiar with functional programming, and really familiarity with Haskell seems to be the most applicable to Idris.

The Idris tutorial [1] assumes familiarity with Haskell, but does provide brief descriptions of the relevant syntax which may be enough.

[1] http://eb.host.cs.st-andrews.ac.uk/writings/idris-tutorial.p...

captainmuon · on April 13, 2014

While it's mathematically conventional and maybe event elegant to represent integers that way, isn't it ridiculously inefficient to actually implement it that way? I guess there is going to be something in the compiler that recognizes and special-cases this pattern, but still...

Also, I really wonder why there is no language with "nice"/"conventional"/C/ALGOL/... syntax for dependent types. Right now my dream language would be something like Spec# (which is C# with contracts), but it would try to prove its contracts (within the bounds of possibility of course, wrt. the halting problem and so on).

dragonwriter · on April 13, 2014

> While it's mathematically conventional and maybe event elegant to represent integers that way, isn't it ridiculously inefficient to actually implement it that way?

It would be, but the compiler recognizes the pattern and does ordinary integer arithmetic where it needs to (the pattern is relevant, however, to the use of individual natural numbers in defining other types, which is pretty important to Idris.)

> Also, I really wonder why there is no language with "nice"/"conventional"/C/ALGOL/... syntax for dependent types.

I don't think C/ALGOL-style syntax is all that nice when it comes to even Haskell's level of typing, much less dependent types.

That said, if dependent typing takes off, no doubt we'll see languages that fuse it with C-style syntax soon after, just as we saw with OOP and basically every other thing since.

edwinb · on April 13, 2014

> It would be, but the compiler recognizes the pattern and does ordinary integer arithmetic where it needs to (the pattern is relevant, however, to the use of individual natural numbers in defining other types, which is pretty important to Idris.)

The compiler does recognise it, because it'd be silly not to, but more importantly, Nats tend to be type level things which explain the structure of something else (e.g. a successor corresponds to a cons, or to another level in a tree), and as such don't appear in runtime code anyway. If what you want is machine arithmetic, better to use a type that's designed for that.

Idris has Haskell-like syntax because I like Haskell style syntax (this is probably the most important reason!) and Haskell programmers are the initial target audience.

fzltrp · on April 13, 2014

I recently stumbled on something called "view", a concept implemented in the Hope language. For those not familiar with it, it let you define bijections between 2 different datatypes, so that one may pattern match a piece of data from one datatype with the other datatype constructors, and one of the examples in the paper was precisely that correspondence between peano numbers and usual (unsigned) integers (well, really, it's not quite bijective, given the arch limits on integer representation, but for any purposes we can think of yet, it works). So, the question is: is idris using something like that? Or is there another mechanism? Is it a builtin feature of the language? Do you use the proof system to persuade the language kernel it may translate them back and forth (the "software foundation" book for Coq let the reader build a few proofs about that, though Coq use arbitrary precision numbers iirc))? I'm still quite new on dependent typing, so forgive me if this question is silly or weird somehow.

nmrm · on April 14, 2014

>That said, if dependent typing takes off, no doubt we'll see languages that fuse it with C-style syntax soon after, just as we saw with OOP and basically every other thing since.

Indeed, type systems for (actual) Javascript are often formulated in terms of dependent types.

tel · on April 14, 2014

Do you have a reference on that? I'd love to read it

nmrm · on April 14, 2014

http://goto.ucsd.edu/~ravi/research/oopsla12-djs.pdf

tel · on April 14, 2014

Thanks!

tel · on April 13, 2014

At its simplest, `data <Name> = <Impl>` establishes a new type named <Name>. The <Impl> side describes the implementation of the type <Name> in "algebraic data type" syntax.

Algebraic Data Types are types formed from "products and sums" or, in more normal usage, unions and variants. For instance, Booleans are a basic sum ADT:

    data Boolean = True | False

and 3D points a standard union ADT:

    data Point = Point Double Double Double

The <Impl> not only specifies the structure of the type but also its "constructors". These are automatically generated functions which can be used to build a value of a type from its constituent parts. In particular, for Boolean the constructors are True and False (each one a nullary function) and for Point the constructor is (confusingly, but conventionally) named `Point` and it takes 3 Doubles to form a Point.

    True  : Boolean
    False : Boolean
    Point : Double -> Double -> Double -> Point

Generally, we can combine these two notions of types, sums and products, as we like. Typically an <Impl> looks like a "sum of products" like

    data HTTPHeader = ContentType MimeType
                    | Accept MimeType
                    | Host String
                    ...

which generates the constructors

    ContentType : MimeType -> HTTPHeader
    Accept      : MimeType -> HTTPHeader
    Host        : String   -> HTTPHeader
    ...

So, we define the natural numbers, [0,1..], by starting at zero and counting upward.

    data Nat = Zero | Succ Nat

which introduces a new, interesting idea that ADTs may be recursive. This means that we have two constructors

    Zero : Nat
    Succ : Nat -> Nat

and, with a little play, we can see that all values of Nat must look something like the following

    Zero
    Succ Zero
    Succ (Succ Zero)
    Succ (Succ (Succ Zero))
    Succ (Succ (Succ (Succ Zero)))
    ...

and so we say that Nat represents the natural numbers because any value of that type can be converted to a number by counting the number of times Succ is used.

In particular, we can write a recursive function

    num : Nat -> Int
    num Zero     = 0
    num (Succ n) = 1 + num n

where the left side uses one final, crucially important, component of ADTs—the ability to "pattern match" on their structure. This bit is a complete mind blower the first time you see it, but it effectively follows as mere parallelism to the idea that we construct types using Constructors as functions—we "destruct" types, convert them to some other piece, by pattern matching on the Constructors as patterns.

(This duality is actually really key and goes by the name of "Gentzen's Inversion Principle", not that that really matters.)

The final question, about why Maybe is a definition of the option type, should look a little more clear by this point. That said, it betrays a simplification I've used thus far in that <Name> might not be a simple name, but instead a parameterized one where these parameters now occur in the ADT as well. To see if that makes a little sense, try replacing the lowercase parameter names with other concrete types like `Nat` and `Boolean`.

gjm11 · on April 14, 2014

This is an excellent answer but there's one important mistake in it that needs fixing. It occurs twice.

1. "or, in more normal usage, unions and variants". A union is not a product type, it's an unsafe sort of sum type. (The safe sort is sometimes called a "tagged union".) And while "variant" means exactly what you want it to, it's a Microsoft-ism that not everyone will be familiar with.

So: in C-speak, product and sum types are structs and (tagged) unions. In Microsoft-speak, product and sum types are structures and variants.

2. "and 3D points a standard union ADT": again, it's not a union but a struct.

A few other remarks that may help people unfamiliar with this sort of syntax make sense of what you wrote:

3. That notation "Double -> Double -> Double -> Point" can be read as "a function taking a Double, and another Double, and another Double, and returning a Point". The weird notation is because actually it parses as Double -> (Double -> (Double -> Point)), and means it's a function taking a Double and returning a function taking a Double and returning a function taking a Double and returning a Point! The word to look up if that doesn't make sense but you'd like it to is "currying".

4. Relatedly, there is no distinction between a value of type T and a function of no arguments returning type T. That's why tel wrote "Zero : Nat" without any arrow anywhere on the right-hand side. You can (and maybe should) think of this as a function from (nothing) to Nat.

5. (This is implicit in what tel wrote but may be a stumbling block for the newcomer.) When you see something like "data Point = Point Double Double Double" it may not be obvious that the four things on the right-hand side are not at all on an equal footing. The first one (Point) is the name of a constructor. The others (Double Double Double) are the types of its arguments.

tel · on April 14, 2014

Ah, thanks! I tend to feel a bit hazy around "struct"/"union"/"tagged union"/"variant" as I feel the definitions are always language dependent.

On the other hand, product and sum (coproduct) are very well-defined terms which exist far outside any particular language. There's still haziness around choice of category (and, in particular, call-by-value versus call-by-name) but they're an order of magnitude more well-defined. I'd love it if the ADT concept became more widespread and could be easily used as a (low-level) description of more sophisticated typing like struct/variant.

To my ear "union" and "struct" are heavily overloaded since there was never a strict definition of what they should be. Product and Sum (Coproduct) are very well-defined (although, in practice, choice of category leads to some actual haziness especially around CBV/CBN).

In any case, I really appreciate the added detail about how better to use those terms and I'd love if the ideas of sums and products caught on and became a more stable point of communication.

platz · on April 13, 2014

While not incorrect, I think the "functional language" designation is overly broad. Worst case, I'd hate for people to get turned off of functional languages because of difficulties understanding a relatively-advanced dependently-typed language such as Idris.

IMHO, A language more suitable for introducing functional programming may be Clojure, F#, Scala, or just 'plain' Haskell.

jonsterling · on April 14, 2014

Clojure has little relevance to functional programming: functions are the smoke, but types are the fire. F# might be a good choice for an introduction though (if you're willing to sacrifice the good parts of ML for the bad parts of .Net).

Scala and Haskell both have (in decreasing order) massively more complicated type systems than Idris, so I would certainly not recommend those! Moreover, good Scala and Haskell code tends to be way more difficult to understand initially than good Idris code. YMMV.

An introductory language needs to be simple enough that a learner can hold all of it in her/his head. SML and Idris fit that bill, IMO, in a way that Haskell and Scala cannot.

thinkpad20 · on April 14, 2014

I don't know that much about Idris, but from my understanding, its type system is more advanced than Haskell's, because dependent types by definition contain more information than the (independent?) types used in Haskell (can't comment on scala). Haskell as a language with all of its various extensions and bells and whistles is surely far more complicated, having been around a great deal longer and with a massive compiler and numerous features, but from my knowledge it's incorrect to say Idris is simpler than Haskell, at least in regards to its type system.

chriswarbo · on April 14, 2014

Idris's types system is both more advanced and simpler than Haskell's. For example, "vanilla" Haskell's type system implements (amongst other things) functions, records, parametric polymorphism and typeclasses. Idris implements (amongst other things) dependent functions and dependent records.

Idris doesn't need to support (non-dependent) functions or parametric polymorphism, since they're just special-cases of dependent functions. Likewise, typeclasses are just special-cases of dependent records.

When we consider Haskell extensions, many of them are designed to introduce dependently-typed idioms (eg. DataKinds). This makes Idris automatically simpler, since it supports those idioms directly and doesn't have to offer all of the non-dependent parts of Haskell.

On the other hand, typical Haskell programs may use types in a simpler way than typical Idris programs, but that's a separate point. We can focus on simple programs while learning.

jonsterling · on April 14, 2014

To be fair, Idris does not in general support parametric polymorphism, though you could extend it to do so.

Idris has a quantifier which is called Pi in type theory, and this is not the same as Forall (the quantifier present in Haskell and ML). Forall gives you guarantees about uniform behavior over the domain of quantification (this is parametricity), whereas Pi does not.

Idris does however recover parametricity anyway for types, since it lacks an eliminator for the universe (i.e. it doesn't have type-case). But it doesn't have parametric polymorphism for values.

platz · on April 14, 2014

We won't get anywhere without a definition, but I think most would disagree about the Clojure coment. They even assert it on the home page http://clojure.org . It's probably better to think of FP as a continuum rather than black and white.

While Idris may be able to eliminate some of the typing constructs in Haskell, I don't think one can say it would be easier to program in than Haskell. The chain of having to assert proofs to the compiler is an extra burden (i.e. refl).

Also, saying that a Pure FP language (Idris) is easier than a non-pure language (Scala), for someone without extensive FP backround, is an interesting comment. Part of the intimidation of Haskell is it's purity, and this extends to Idris.

jonsterling · on April 14, 2014

"Practical" Haskell programming very quickly devolves into carefully constructed stacks of monad transformers. "Practical" Idris programming lives in a modality of extensible effects. These are objectively easier to get started with.

(Moreover, I did also include SML in my list of languages better for beginners than Haskell and Scala incidentally. If the purity is the problem, then SML is a great place to start. Begin with proving by induction that an SML program is pure or impure—this will motivate having it internalized in the type theory.)

Moreover, you don't even need to use the "proving" parts of Idris. You can use the fragment of Idris which is a fixed Haskell, and move on up to the more interesting bits when they become useful to you. For instance, you can write partial programs (and coprograms) in Idris without doing any proofs; then at the end, you can have someone help you shore it up with proofs that it is sound.

gnuvince · on April 13, 2014

It's the traditional, inductive definition of natural numbers. You give a base case (Z = Zero) and an inductive case (S Nat = Successor of Nat). So Z is a natural number, S Z is also a natural number and corresponds to 1, S (S Z) corresponds to 2, etc.

fzltrp · on April 13, 2014

If you know what enums are, think of algebraic data types as enums on steroid. You define an enum type as a set of unique values. ADT let you do the same, but let you also parameterize the enum values, and even use the enum type itself as a parameter type (a recursive definition). You can also make several ADTs mutually recursive. As other said, that's very handy to represent usual data structures like lists, trees.

captainmuon · on April 13, 2014

Right, I'm a big fan of functional programming, but I never got warm with functional languages. One big reason is the syntax. I'd love to be able to specify (in pseudo C(++):

    typedef positive_int: int {
        ensure {this > 0}
    }
    typedef ambulance: vehicle {
        ensure {this.has_siren && this.has_bed && this.color == "red"}
    }

and so on.

(aside: The other main problem I have with functional languages is the following, I'm not sure what to call it but it bites me every time: You finally find out how to do what you want in an elegant functional way, you code it up and you're happy. Then, you want to make a small modification, but it turns out that it's impossible with the current code, and you have to rewrite everything with much less abstraction. I can't remember a great example from the top of my head, but a simple one is, you'd like to add a progress bar to a computation. It's trivial in imperative languages, but much harder in functional style.)

kazagistar · on April 13, 2014

Unless you can teach the compiler how to verify that arbitrary expressions will be true at compile time, you have created a glorified syntax for assert. The reason for the the often strange syntax and structure in functional languages, especially dependently typed ones, is to allow you to restrict how you can construct programs so that it can give you those stronger assurances at compile time. You don't want your program to crash or throw a runtime error when you try to convert an int to a natural, you want it to not be possible to do unsafely in the first place.

captainmuon · on April 14, 2014

> Unless you can teach the compiler how to verify that arbitrary expressions will be true at compile time, you have created a glorified syntax for assert.

Well, right now some languages are able to verify that a variable is non-null at compile time (C#). It should be possible to add range requirements to numeric types (e.g. i > 5), and (non-)equality requirements for most types. Intersections and unions of these requirements is trivial, so the compiler could work out possibly allowed and disallowed value (ranges). If we only allow these simple constraints, it is essentially constraint-solving at compile time. The compiler can "understand" these simple types, and it can see that a positive int (or a non-null Vehicle) is required, and propagate that backwards and see whether it is violated.

(One could go a bit further and add almost arbitrary expressions (of C++11 constexpr strength). If the compiler can't convert the expression to a union or intersection of things it understands, it can still check if a given constant violates the condition. E.g. it is known that x=5, and the precondition of a function is x2 != 25, then this is a violation. I don't know how useful this would be, though. One could also add "lemmas" to aid the compiler, e.g. "I guarantee that x > 10 in this branch".)

tel · on April 13, 2014

The way you've asked this it will lead people to think you're asking for very powerful static checking. If you instead just want runtime assurances then the typical (Haskell) method is a "smart constructor".

For instance:

    -- exporting only the type Positive, the extractor `getInt`, and the 
    -- smart constructor `pint` ensures constraint safety.
    module Positive (Positive (getInt), pint) where

      newtype Positive = Positive { getInt :: Int }

      -- creation of Positive types ensures that constraint
      pint :: Int -> Maybe Positive
      pint i | i > 0     = Just i
             | otherwise = Nothing

Now if you want that `ensure` line to be checked every time the Positive type gets modified... well, hopefully you're not leaving it up to the compiler to do that naïvely as you'll waste a lot of checks.

A similar method can be done with the ambulance type, though I don't think modeling the world like that in functional types is a good design.

captainmuon · on April 14, 2014

> The way you've asked this it will lead people to think you're asking for very powerful static checking

That's exactly what I'm asking for :-). It should be possible to create restricted types that are checked at runtime. For example, the fabs function would return a float which is guaranteed to be >= 0. The sqrt function would require the argument to be positive.

    float x;
    ...         // arbitrary code
    sqrt(x);    // warning, possible invalid argument

    float y;
    y = fabs(...);  // y is now float<positive>
    sqrt(x);    // no warning

    float z;
    z = -1.0;
    sqrt(z);    // error: invalid argument, argument z of type float
                // fails constraint z > 0. z is constant -1.0

It is an error if the compiler can prove it to be wrong. It would probably be a warning if the compiler can't prove it to be safe - but you could optionally make it an error (to prove critical code). There are some cases where it won't work - a la solve_halting_problem() - but you could add little proofs to help it. For example, you could declare that sin(x) is `float<positive>` if `x > 0 || x < pi`.

This may sound crazy, but I think the parts are already out there. C#'s non-nullable types are a very weak version of this (the compiler can infer if something could possibly be null, and complains if you assign it to a non-nullable type). Many compilers also track uninitialized variables, and only let you assign to them, not from them. They are also effectively a restricted version of the original type (every time you declare a float argument, you actually mean float<initialized>).

Another ingredient is control-flow analysis. For example there is the excellent PySonar [1] that performs a whole-program analysis of Python programs and is able to infer types (including union types where it is not sure). In Python, values and types are somewhat blurred IMHO. Instead of variables, Python actually has `names`, and each name has a `type` and a `value` inside. Inferring what `type` a `name` could have is a bit like inferring what value a variable could have in other languages. So I think some of the lessons learned there could be valuable for building a something like this, too.

[1] http://yinwang0.wordpress.com/2010/09/12/pysonar/

tel · on April 14, 2014

The non-nullable types are trivially implemented in Haskell. From what I can tell the flow control analysis you're talking about is just type inference (another thing well tackled in ML-like languages).

Indeed, everything you've written can be fairly trivially implemented in Haskell, though inference and annotation would feel different. The real challenge comes in when you've got an arbitrary "test" expression which must be satisfied by the statics of your language.

For instance, your original example had ensure blocks

    ensure { x.left > 10 && x.right < 30 }

the problem is that it's very difficult for the compiler to take constraints like these and propagate them through increasingly complex transformations without just representing the entire language at the type level.

Which is exactly the path taken by "DT" languages like Agda, Coq, Idris, ATS. The upshot is that you get arbitrarily powerful static queries. The downside is that your type system is exactly as complex as your language itself and you have to shuffle around proofs of various invariants being held because it's an NP-complete search problem for the compiler to find them for itself.

I highly recommend trying out Coq if this kind of stuff interests you. I recommend Coq over Agda/Idris because you're going to be learning new syntax anyway and Coq has better literature out there:

    http://www.cis.upenn.edu/~bcpierce/sf/
    http://adam.chlipala.net/cpdt/

captainmuon · on April 14, 2014

Thanks, I'll definitely have a look at Coq (well, once I have free time, and I wanted to really understand Haskell first).

The thing that bothers me about implementing these constraints the Haskell or Idris way (or in any other DT language I've encountered) is that it feels a lot like template metaprogramming in C++. Define integers with nested types as Zero and the sequence of successors (data Nat = Z | S Nat) seems a lot like defining integers with the Peano axioms using nested C++ templates - possible, even neat, but a bit crazy. Maybe that feeling goes away when you're more familiar with the languages.

Btw. I've just found out that Spec# (an extension of C#) almost has what I described... you can define preconditions and invariants that are checked at least at runtime, but a static checker tries to proove as much as possible at compile time. Its a pity it is no longer developed, this could have been the static type checked language for the rest of us:

    http://rise4fun.com/specsharp/

tel · on April 14, 2014

If you want to see a dependently typed (technically, though the term often doesn't include this class of languages in practice) language which enforces smaller sets of properties more eloquently take a look at Liquid Haskell

http://goto.ucsd.edu/~rjhala/liquid/haskell/blog/about/

The core idea here is called the Refinement Type which is basically what you're looking for. In order to power this type system Liquid Haskell depends on a full SMT solver, though, which is a rather large hammer to wield.

As it regards to Peano arithmetic in DT languages the Peano naturals data type is awkward for doing math... but FANTASTIC for defining the properties of recursion and induction. Indeed, what you find is that the Peano nats show up everywhere as being a stable foundation for complex proofs and smart recursive functions.

It's a very similar feature to Haskell's lists. Competent Haskell programers do not use lists when they want arrays or vectors—the latter types are naturally far more convenient when in their application space. But lists are not abolished! They're fantastically useful as control structures indicating sequential recursion.

Typically you think of data types as forming their meaning through construction. You use numbers by constructing them according to arithmetic properties. Another way to do it, though, especially with recursive ADTs and inductive families, is to think of a type's meaning as being derived from how you break it down.

The canonical way to break down a list is to find some way of combining the head of the list with the broken-down version of the tail of the list. In Haskell that is called foldr

    --       combiner         zero list   result
    foldr :: (a -> b -> b) -> b -> [a] -> b

and you can see its canonicity by thinking of it as simply replacing the constructors for a list by the arguments to foldr.

    let ls = Cons 1 (Cons 2 (Cons 3 Nil)))
    
    foldr comb zero ls =
             comb 1 (comb 2 (comb 3 zero)))

chriswarbo · on April 14, 2014

I think all of the complications with your proposal can be eliminated if you change "float y;" to "float<positive> y;". If you tell the compiler what type you actually want to use, it will check all of the constraint propagation for you. This is exactly what Haskell, ML, Idris and friends will do. In fact, in Haskell and ML you would be better off leaving out the declarations altogether; the compiler would spot that you've used "fabs", so it will infer "float<positive>", and unify this with the "float<positive>" argument of "sqrt". In Idris, you could declare that the fabs function converts "float" to "float<positive>" and it would automatically insert calls to fabs when you try passing floats to sqrt.

No need for runtime checks/errors, no need to solve the halting problem, etc.

fzltrp · on April 13, 2014

> you'd like to add a progress bar to a computation. It's trivial in imperative languages, but much harder in functional style.

That depends a lot on how you code your program. You'll havr as much difficulties in imperative style if you didn't think up front about that feature (or any similar ones). What you want is modularity, and it means the ability to separate concerns at the language core. Luckily, un FP, functions being first class, you can easily extract functions out of an algorithm, and the pass them as parameters. So in your example, you'd parameterize on the function which does that computation step, and plug in that very function wrapped with one doing the progress bar update. With a well written codebase, this sort of things are just one refactoring away. This exercise being easier or harder is mostly a question of being used to the language imho.

discreteevent · on April 13, 2014

Prof. Van Roy gives a good example of this in the Edx course "Paradigms of Computer Programming" He uses the simple example of trying to introduce code to a pure functional program that will record the number of times a particular function is called. To do this you need to modify the signature of every function on the call stack. No doubt there's some other way to do this and it probably involves a monad but he was dealing with a simple pure functional language in the example. His point was that in this sense the pure system is not modular.

sold · on April 14, 2014

That's exactly the point of monads - see http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/ba.... Such a system is still pure, btw.

tel · on April 14, 2014

I'd say any functional language so simple as to have that problem is probably some small variant on lambda calculus... and that's essentially the assembly of FP. The same thing is going to happen in your SSA-d C++ code.

platz · on April 14, 2014

Pure functional languages have this advantage: all flow of data is made explicit. And this disadvantage: sometimes it is painfully explicit.

Of course, you could just pull a Clojure and throw everything into maps.

riffraff · on April 13, 2014

> Failed to load resource: the server responded with a status of 503 (Service Temporarily Unavailable) http://rawgit.com/chrisdone/jquery-console/master/jquery.con...

this causes the console not to load. It would be better to host the js yourself, probably.

tel · on April 13, 2014

Sadly the console seems to be missing and the compiler doesn't emit errors.

The second is more sad since so much of the dynamics of programming Idris comes from the trying to get its statics to compile. In particular, you will spend an inordinate amount of time staring down collections of type holes trying to solve the puzzle they present.

None of that gets reflected in the compiled Javascript.

fprotthetarball · on April 13, 2014

"This website abuses rawgit.com. You should complain to its owner."

double051 · on April 13, 2014

Yeah, saw this alert after switching between 'Console' and 'Compile'.

tempodox · on April 14, 2014

This is one REPL, and compiler, that does not work and doesn't react to user interaction. That doesn't exactly inspire confidence.