Beginner's guide to OCaml beginner's guides

jallmann · on July 10, 2014

Real World OCaml is indeed awesome -- while it is free, if you do find yourself making good use of it, please also at least buy an ebook version. Totally worth its bytes in gold. (No affiliation, just an OCaml dilettante who's benefitted a lot from RWO).

For good code to learn from, I'd also recommend Batteries, another stdlib replacement [1]. It's not as expansive as JSC Core, which arguably makes the code easier to browse, IMO. Ditto for lwt in place of Async. Really, a lot of the FOSS tools and libraries in the OCaml ecosystem are extremely high-quality -- and still improving at a rapid pace, thanks to OPAM.

If any of the RWO authors are reading this, I'd love an expanded version (or another book) with more advanced features of the language -- GADTs, a deeper treatment of the functor/module system, covariant/contravariant types, Camlp4 and/or the extension points mechanism, etc.

[1] I recall reading somewhere the extlib/batteries author said he would have written the library in a more functorial style if starting over today, but I still find it an excellent example of clean, idiomatic OCaml.

avsm · on July 10, 2014

> If any of the RWO authors are reading this, I'd love an expanded version (or another book) with more advanced features of the language -- GADTs, a deeper treatment of the functor/module system, covariant/contravariant types, Camlp4 and/or the extension points mechanism, etc.

I'm actually meeting our editor (Andy Oram) in a couple of weeks at OSCON to discuss just this. We deliberately stayed away from newer features (e.g. GADTs) that we didn't feel had enough of an outing in production codebases, but with the release of OCaml 4.02 due in a few months this is a good time to do a refresh and fix that. What aspects of the module system would you like to see more of?

jallmann · on July 10, 2014

> I'm actually meeting our editor (Andy Oram) in a couple of weeks at OSCON to discuss just this.

Great to hear that!

> What aspects of the module system would you like to see more of?

The book is actually comprehensive in this respect, but the module system is just a lot to digest. I'm not actually an expert, so I really don't know what could/should be done, only that sometimes it takes me a while to figure out "best practices", or what's required by the compiler, or the semantic implications of some of the more unfamiliar syntax. For example, recently I saw:

    module Client : module type of Client.Make(IO)

which differs from the "module type of" usage discussed in the book. The OCaml module system is extremely rich, with a learning curve to match. I don't really feel that I know how to exploit the system to its fullest just yet. Maybe a section on common patterns or best practices would be helpful.
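For readers puzzling over the same form, here is a minimal sketch of what `module type of` applied to a functor application can look like. All names (`IO_SIG`, `Make`, `Io`) are hypothetical stand-ins, not the actual library's definitions; the point is only that the signature is recovered from `Make (Io)` rather than written out by hand.

```ocaml
(* Hypothetical names throughout: IO_SIG, Make and Io are illustrative only. *)
module type IO_SIG = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(* A functor that builds a tiny "client" on top of some IO implementation. *)
module Make (IO : IO_SIG) = struct
  let get (url : string) : string IO.t = IO.return ("GET " ^ url)
end

(* A trivial identity "IO" so the sketch runs without any external library. *)
module Io = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
end

(* The signature of Client is whatever the compiler infers for Make (Io). *)
module Client : module type of Make (Io) = Make (Io)

let () = print_endline (Client.get "/index.html")
```

In an interface file the same idea would appear as a bare `module Client : module type of Make (Io)` declaration, which saves repeating a long signature.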

On a side note, thanks for all your effort, not just on the book but also in keeping the OCaml community vibrant. I wouldn't be nearly as interested without all the work that's going on now.

avsm · on July 11, 2014

Thanks, that's all very useful feedback. There have been several significant improvements to the module system in 4.02 (including generative functors and module aliases [1] in signatures), so I agree that there's quite a bit more to write about. We have a very short section on common patterns that could perhaps be expanded into a chapter.

[1] https://blogs.janestreet.com/better-namespaces-through-modul...
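As a rough illustration of the two 4.02 features mentioned above, here is a small sketch assuming OCaml 4.02 or later. The names `L`, `Fresh_id`, `User_id` and `Order_id` are made up for this example.

```ocaml
(* Module alias: L is another name for List, not a copy of it. *)
module L = List

(* Generative functor: the () parameter means every application
   produces a fresh abstract type t. *)
module Fresh_id () : sig
  type t
  val fresh : unit -> t
  val to_int : t -> int
end = struct
  type t = int
  let counter = ref 0
  let fresh () = incr counter; !counter
  let to_int id = id
end

module User_id = Fresh_id ()
module Order_id = Fresh_id ()

let () =
  let u = User_id.fresh () in
  let o = Order_id.fresh () in
  (* Passing u where an Order_id.t is expected would be a type error. *)
  Printf.printf "%d %d %d\n"
    (User_id.to_int u) (Order_id.to_int o) (L.length [u])
```

The design point is that `User_id.t` and `Order_id.t` cannot be mixed up, even though both are integers underneath.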

a0 · on July 10, 2014

RWO does an excellent job in introducing OCaml's main features in a simple and pragmatic way. It really does show the strength of the language and demonstrates the building blocks one can use to write high quality software.

One aspect that I think could be explored deeper is how to architect complex applications with the language. How can the module system be used to build abstractions and protocols for non-trivial systems? For example, how could one build a database with it, or even an operating system? I'm in the process of learning OCaml, and what is really difficult for me is understanding the most idiomatic and efficient ways of modelling application domains. Although there are many open-source projects one can learn from (like JaneStreet's libraries, Nymote sub-projects, etc), the concepts and practices they use are sometimes hard to understand just from reading the source.

amirmc · on July 11, 2014

I was just about to suggest looking at projects like Mirage/Irmin et al but since you mention Nymote, I assume you've already found your way to them.

If it helps, we have a series of blog posts about some of the recent library releases on the Mirage blog [1], some of which go into the design of those libraries. Perhaps they'd be useful?

[1] e.g. http://openmirage.org/blog/introducing-ocaml-tls

a0 · on July 17, 2014

Thank you! I'm impressed with the series of blog posts the Mirage team posted lately – really informative and inspiring. I think system design articles like these are really valuable for actually learning how to use the language in the wild.

Another example of very good software design reading I've found is the OCamlgraph paper[1], which demonstrates the strength of ML's module system in a non-trivial situation.

[1]: http://www.lri.fr/~filliatr/ftp/publis/ocamlgraph.ps

pjmlp · on July 11, 2014

I would like to see some more Windows love than on the first edition.

Bought the ebook, and ended up just reading it as a means to see what changed since the Caml Light days and differences to F#.

amirmc · on July 11, 2014

There's a new blog post up where GADTs have been used in one of the OCaml-TLS libraries (i.e. ASN1)

https://news.ycombinator.com/item?id=8020125

hackerboos · on July 11, 2014

I've just bought the dead tree version on the recommendation of the OP.

Cheapest place to buy in the UK is the Book Depository.

edwintorok · on July 10, 2014

IIRC one of the chapters in RWO did briefly touch on type variance.

mjt0229 · on July 10, 2014

As I recall, GADTs are also touched on, but I agree that I still want more of all of these things.

dbpatterson · on July 10, 2014

It is really bizarre to contrast Real World OCaml to Learn You a Haskell. The former is a serious book, the latter is a very beginner level tutorial. Now there may be real criticisms comparing Real World OCaml to Real World Haskell (which, if from the names alone, would seem like a much more appropriate comparison), but saying "this beginner tutorial doesn't explain how the runtime works" is a little silly.

antics · on July 10, 2014

(Author here)

I think it's totally fair. The whole reason I wrote this post is to help people decide which resources to not use. I love LYAH but I don't care who it was written for. I wanted people to know, specifically, what they were missing with books like that, because it was not obvious to me when I started that I should not start with LYAH (or at least, that I should have kept in mind what it was missing).

hyperpape · on July 10, 2014

I think the comparison is absolutely essential. I always want a guide to the resources about a topic so that I can find out where to look to learn the way I want to learn.

But the judgment is probably too strong. Right now, my immediate reason to learn OCaml is because Types and Programming Languages uses it for its type checkers. For that purpose, I might just want quicker tutorials that don't cover the runtime. Ditto if my primary goal is to learn more about using types for good API design. On the other hand, if I learn and continue with the language, or want to put it into production, the runtime is essential. There are lots of use cases out there.

antics · on July 10, 2014

Yeah, I agree, I was probably too hard on LYAH. But, even if my goals didn't match your goals, I am hoping that the information I am giving you is sort of useful either way. The emphasis is really on the reasoning, rather than the conclusion.

hyperpape · on July 11, 2014

It is useful either way. I love the idea of this post, and I wish it existed and was findable for every topic I'm interested in.

bstpierre · on July 11, 2014

Thanks for the article. I'm in the same boat you were -- picked up ocaml a couple of times but it hasn't "stuck".

Is dead-tree RWO worthwhile or is it going to become obsolete quickly? I like (prefer) reading on paper, but I've been having a hard time over the past few years justifying the purchase of books that are destined for the recycle bin within a year.

antics · on July 12, 2014

The authors are pretty conscientious about basing the book on tech that's mostly here to stay. For example, camlp4 is not included because it's on its way out.

So, there's no way to say for sure, but I think there's a reasonable chance it sticks around.

tmountain · on July 10, 2014

Can anyone speak to the pros/cons of learning OCaml vs Haskell? They seem comparable in some respects, but it feels to me as if Haskell has more momentum and presumably a more active community. Am I correct in this assumption?

antics · on July 10, 2014

(Author here)

I love Haskell. Love it to death.

I see RWO as a justification of OCaml's language choices from the perspective of system deployment. It gives you a good start on the knowledge you'd need to push an OCaml system into serious production.

I do not see how to get the same knowledge for Haskell. Particularly the runtime is a black box to me. This might be acceptable for your use cases, but for me, this is the primary risk of Haskell.

Re: community, it's probably true that Haskell is more prominent than OCaml right now, but I would be careful to avoid the mistake of assuming that this means people actually know how to deploy it for serious production systems at scale, because it is not obvious that this is true. Besides, while the OCaml community is not as noisy, they are certainly still around.

happimess · on July 11, 2014

The Architecture of Open Source Applications, Vol 2 [1] has a great chapter on ghc, which includes a good look at the ghc runtime (section 5.5).

[1] http://aosabook.org/en/ghc.html

toolslive · on July 10, 2014

In the popularity of programming languages list, http://www.tiobe.com/index.php/content/paperinfo/tpci/index....

F# is number 13; ML is number 26; Haskell is number 38; OCaml is ranked below 50.

So the ML family of languages is way more popular than Haskell, but is fragmented.

Below are my impressions/experiences.

In general, the languages themselves are excellent, but the standard libraries vary in quality, as do the tool chains. ML is eager by default, Haskell lazy. Otherwise, it's the same mindset (although some people will think Haskell went overboard on monads, some people think OCaml's libs aren't monadic enough)

For example, F# is excellent in all of the above, but is _challenged_ in the multi-platform aspect.

OCaml's standard library is limited, but this is compensated with Opam. Considering OCaml on windows is simply looking for trouble. The OCaml compiler is good, but other tools are limited or a bit primitive (heap profilers, debuggers, performance profilers). The community is rather reserved, but very helpful.

Haskell standard library is also limited (for example you ain't gonna jump far with the string implementation that's offered by Prelude, and Num is also warted). The toolchain is better than OCaml's. Also Cabal has libs for everything, but ymmv. The community is really gentle and eager to help.

ML, sorry no experience, most other flavours seem dead or quiet. MLton is a really interesting compiler.

toolslive · on July 10, 2014

I forgot: learning a statically typed functional language will make you a better developer. There is a catch though: some people will just refuse to go back to the normal Java, C++ or python insanity.

codygman · on July 11, 2014

Haskell does make Java and python a bit depressing to go back to for me, yes.

rtpg · on July 11, 2014

>In general, the languages themselves are excellent, but the standard libraries vary in quality, as do the tool chains. ML is eager by default, Haskell lazy. Otherwise, it's the same mindset (although some people will think Haskell went overboard on monads, some people think OCaml's libs aren't monadic enough)

I think it's important to point out that Haskell is actually functionally pure (in that there are no side effects in the language), whereas ML does have references baked into the language, so you can revert back to writing Pascal-looking inner loops in Caml, for example.
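To make the "Pascal-looking inner loop" concrete, here is a minimal sketch of mutable references and a `for` loop in OCaml. The function name `sum_to` is just for illustration.

```ocaml
(* Imperative style: a mutable reference updated inside a for loop. *)
let sum_to n =
  let total = ref 0 in
  for i = 1 to n do
    total := !total + i
  done;
  !total

let () = Printf.printf "%d\n" (sum_to 10)
```

Nothing in the type `int -> int` reveals the mutation, which is the contrast with Haskell being drawn here.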

codygman · on July 11, 2014

Haskell does have effects though ;)

Some might take "Haskell doesn't have side effects" to mean that you can't talk to servers or read the filesystem.

petecox · on July 11, 2014

http://xkcd.com/1312/

dons · on July 11, 2014

Tiobe is not a good way to measure popularity. There is not really more ML code than Haskell, for example.

Try e.g. github http://redmonk.com/sogrady/2014/06/13/language-rankings-6-14...

e12e · on July 11, 2014

I find it a bit curious to contrast Haskell and ML -- maybe it's because I did a course on programming languages twice, first with Standard ML as (one of) the "functional" languages, and again when Haskell had taken over that role (the course first (extensively) used Sethi's "Programming Languages").

I've always seen Haskell as a kind of ML offshoot, but perhaps the similarity is more in the syntax than in the substance?

IvarTJ · on July 10, 2014

You can run Ocaml on the Raspberry Pi for one.

e12e · on July 11, 2014

> You can run Ocaml on the Raspberry Pi for one.

Huh, I don't have a pi, but judging (blindly) by:

http://www.haskell.org/haskellwiki/Raspberry_Pi

That should be "You can run the Ocaml toplevel on the Pi"?

LeonidasXIV · on July 11, 2014

Also, Raspbian Wheezy includes a rather old GHC, which is troublesome if you want to install packages from Hackage. Backporting a newer GHC doesn't seem to be possible: it crashes some hours into the build process (which also takes ages).

Mithaldu · on July 10, 2014

Having recently (speaking in years) worked on improving the situation for Perl, I have to say I find it fascinating how other languages come to realize, and deal with, the fact that most tutorials and beginner guides may be well-meaning, but are actively damaging to the newbie developer.

patmcguire · on July 10, 2014

I would also like a beginner's guide to beginner's guides to monads. I've started and given up on a half dozen of those.

cgag · on July 10, 2014

    - Don't read the monad tutorials.
    - No really, don't read the monad tutorials.
    - Learn about Haskell types.
    - Learn what a typeclass is.
    - Read the Typeclassopedia.
    - Read the monad definitions.
    - Use monads in real code.
    - Don't write monad-analogy tutorials.

http://dev.stephendiehl.com/hask/#monads

mercurial · on July 10, 2014

On the other hand, the classic http://blog.sigfpe.com/2006/08/you-could-have-invented-monad... takes you (gently) through code, and you find out at the end that you have written a monad.

cageface · on July 11, 2014

I think the best way to explain monads, at least to programmers with any real experience, is just to show a lot of examples of how they can be used to solve real problems. I went through dozens of explanations before the light finally went on and I saw what a simple concept a monad really is. All the crazy analogies and jargon-heavy discussions only serve to obscure what is really not a complicated idea at all.

GregBuchholz · on July 11, 2014

I think the best way to learn about monads is to write a monad tutorial. If nothing else it is a rite-of-passage.

rwosync · on July 11, 2014

Please don't do this. We have enough problems communicating this concept as it is.

e12e · on July 11, 2014

Please do write one. Please don't publish one. Writing is a great way to think.

Without yet really understanding monads, I've come to realize that their only real problem is the name -- being named after some rather obscure (for most!) category theory concept -- that itself seems just about as simple and trivial as monads are. I find they are a little like the y-combinator in that regard.

I'm not sure if what we need is category theory in elementary school, or a different framework for discussing monads as it relates to programming -- but I'm almost convinced the confusion has to do with the language used to describe them.

Programming is a very practical art that sits on top of the much more abstract art of mathematics. It's useful to know orders of magnitude, graph theory, statistics, logic... but the art by which these concepts are animated and put to use is rather prosaic. And the discipline of computing has forged a set of concepts that are "appropriately abstract" (much, I think due to Knuth) -- but monads seem to lie just outside the grasp of many programmers. And I don't think it is because they are a particularly intrinsically hard concept.

klibertp · on July 10, 2014

Avoid learning monads in the context of Haskell at all costs (unless you're a mathematician or into CS theory - my advice is for a competent programmer who wants to get working understanding of monads as fast as possible), don't read about monadic laws and any other theoretic stuff.

There are excellent explanations of monads for JavaScript, F#, Clojure and Erlang, to name a few. Pick one of them, for the language you're most familiar with. Avoid any syntactic sugar in the beginning. Avoid reading about Maybe/Option monad - it's too boring. Start with hand-crafted List monad, then go to State and/or Promise/Deferred monads, and then ignore State, Either, Writer and many more monads, which are either boring or only relevant in pure languages. Then start sugaring the syntax, `for` in Scala and "computation expressions" in F# are good for this, also LiveScript back-calls are passable. If you're into Lisps, then of course you have macros and it's a good exercise to write your own macros (of course, every Lisp out there has many implementations already available to look at, if you're stuck).

At this point you know enough to get back to Haskell and/or to look at monad transformers and Cont monad. You can also read about monadic laws and actually have an idea what they are about.

It worked for me. I think the key to my method was to avoid reading about boring and/or not directly applicable things while constantly playing in a REPL with the interesting ones. I find it unbearable to learn theory of things I can't imagine being used. Once I use the "thing" a couple of times it changes and I can absorb theory pretty quickly.
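For anyone wanting to try the "hand-crafted List monad" starting point in a REPL, a minimal sketch in OCaml (chosen here only because it is the language of the thread) could look like the following. The names `return`, `bind` and `pairs` are illustrative, and no syntactic sugar is used beyond an infix operator.

```ocaml
(* return wraps a single value; bind applies f to every element
   and flattens the resulting list of lists. *)
let return x = [x]
let bind xs f = List.concat (List.map f xs)
let ( >>= ) = bind

(* All combinations of a number and a letter, written with bind only. *)
let pairs =
  [1; 2] >>= fun x ->
  ["a"; "b"] >>= fun y ->
  return (x, y)

let () =
  List.iter (fun (x, y) -> Printf.printf "(%d, %s)\n" x y) pairs
```

Playing with this in the toplevel, for example by returning `[]` from one of the functions, shows how an empty list cuts off everything after it.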

ohazi · on July 10, 2014

This doesn't really go into monads by name, but I found it to be an incredibly useful introduction to the type of structural thinking you need to understand for monads to make sense. The "regular" monad guides started to make sense after reading and understanding this. I believe it was shared on HN or the Rust subreddit about a month ago.

http://fsharpforfunandprofit.com/posts/recipe-part2/

klibertp · on July 10, 2014

> This doesn't really go into monads by name

That's because on this site the m-word is banned. Which is very, very good move in my opinion, but anyway: the "Railway oriented programming" is an article about monads and begins with a definition of Maybe/Option, it's just not explicitly named as such. It's also a very good tutorial, maybe even one of the best I read. I strongly recommend it, too.
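A rough OCaml sketch of the same "two-track" idea, for comparison: each step either continues on the success track or switches to the error track, and `bind` does the plumbing. The validation steps (`non_empty`, `parse_int`, `positive`) are invented for this example and are not taken from the article; `int_of_string_opt` assumes OCaml 4.05 or later.

```ocaml
(* bind runs the next step only if the previous one succeeded. *)
let bind r f =
  match r with
  | Ok v -> f v
  | Error e -> Error e

(* Hypothetical validation steps, each returning Ok or Error. *)
let non_empty s = if s = "" then Error "empty input" else Ok s

let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not a number: " ^ s)

let positive n = if n > 0 then Ok n else Error "must be positive"

(* The first failing step decides the error; later steps are skipped. *)
let validate s = bind (bind (non_empty s) parse_int) positive

let () =
  match validate "42" with
  | Ok n -> Printf.printf "ok: %d\n" n
  | Error msg -> Printf.printf "error: %s\n" msg
```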

e12e · on July 11, 2014

Figures I see this right after I made my other comment...

edwintorok · on July 10, 2014

I actually first encountered monads in Lwt, and they were quite understandable in that context, although it took some digging.

Here is a list I found useful to learn more about monads (they aren't necessarily all the ones that I had to read to understand them, there was another page that I can't find right now):

http://pauillac.inria.fr/~xleroy/mpri/2-4/monads.2up.pdf

http://blog.enfranchisedmind.com/2007/08/a-monad-tutorial-fo...

http://ambassadortothecomputers.blogspot.ro/2009/02/equeue-c...

http://blog.0branch.com/posts/2012-03-26-01-implementing-fun...

http://blog.0branch.com/posts/2012-03-26-02-from-functor.htm...

ufo · on July 10, 2014

I think that the original monad papers explain things better than the myriad of short tutorials around the interwebs. In particular, I like this one from Simon Peyton Jones because it gives some historical perspective:

http://research.microsoft.com/en-us/um/people/simonpj/papers...

The TLDR: Haskell uses lazy evaluation, meaning expressions are evaluated in an unpredictable order. If expression evaluation had side effects, programs would become unmanageable, so in Haskell, all computations must be pure. Monads are a neat trick where you use a specially crafted abstract (opaque) interface that lets you model side-effecting computations, specifying what order things should run in.

Of course, initially the biggest reason for using monads so prominently is the IO, but the monad interface is much more general and can also be used for other things (List, Maybe, etc)

yomritoyj · on July 11, 2014

I found the treatment in 'Learn You a Haskell' to be very good. After it, if you have some taste for mathematics, there is Moggi's paper http://www.disi.unige.it/person/MoggiE/ftp/ic91.pdf

rtpg · on July 11, 2014

SPJ gave a talk on the history of Haskell which goes into monads and stuff at one point, and it's the best explanation I ever heard

https://www.youtube.com/watch?v=7NPBrWDzO2A

nbouscal · on July 10, 2014

http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/ba...

xamlhacker · on July 10, 2014

OCaml is a very nice language, and the compiler is pretty good but the only issue is that the implementation doesn't allow you to use more than one CPU core because of some issue with the GC design. F#, which is heavily OCaml inspired, has good parallel support.

amirmc · on July 11, 2014

This is one of the things OCaml Labs is working on [1]. We'll be talking about it at OCaml 2014 later this year [2]

[1] http://www.cl.cam.ac.uk/~sd601/multicore.md

[2] http://ocaml.org/meetings/ocaml/2014/program.html

e12e · on July 11, 2014

If one isn't running on Windows, is there still a strong argument to be made for supporting OS threads over simply multiprocess (with some form of message passing)?

In what kind of situations would one really benefit from threads over processes? I'm genuinely curious, it's been a long time since I've run into blocking on 100% cpu in any kind of real world workload (doesn't mean they don't exist!).

I understand that we will need multicore support, I'm just not sure why we need multithread support?

sigzero · on July 10, 2014

But limited x-platform support...they all have trade offs.

groovy2shoes · on July 11, 2014

> In addition, I would advise against reading other books, as they tend to be incorrect and/or in French. [emphasis mine]

Hey now, "Le Langage Caml" might be in French, and it might be pretty dated at this point, but it's a solid introduction to functional programming and Caml (note the lack of "O" -- like I said, it's pretty dated).

One of the things I like about it is that it has 5 chapters dedicated to walking through some pretty complicated examples (following 11 chapters giving a tutorial introduction to Caml). With many programming books the examples are so simple that it's hard to see how they all fit together, but Weis and Leroy do an outstanding job of showing how interesting programs would be written in Caml: an interpreter for a little logic language, a file compression/decompression utility, an emulator and assembler for a RISC processor, a "mini-Pascal" compiler targeting said processor, and a regular expression search utility (à la grep). It doesn't just show the code for these programs, but it walks you through building it, from design(!) all the way to the finished product, explaining every little bit in detail. I wish more programming books did this kind of thing!

Like "Real World OCaml", it covers a bit of the runtime -- it's the Caml Light runtime rather than the OCaml runtime, but some of the discussion applies equally to OCaml as well.

Lastly, it has a style that I feel is lacking in many programming books: the authors are careful to use rather precise phrasing. I think this is good because it helps to eliminate a lot of ambiguous or misleading text.

So sure, advise against it for people who don't understand French. For people who do understand French (even a little -- it's mostly technical terms), "Le Langage Caml" is an excellent book.

omaranto · on July 11, 2014

That line also caught my eye. It sounds insane: isn't it much more reasonable to recommend books based on their quality rather than the language they are written in? I agree with you that "Le Langage Caml" is a pretty good book, but I disagree with what you say at the end about advising people who don't read French against reading it. I would instead just advise everyone interested in Ocaml to read it and people who can't read French will automatically ignore my advice without me needing to say anything about it!

e12e · on July 11, 2014

Thanks for this recommendation -- I could use a book to brush up on my French on (and caml :). I'd love to hear some other "best-of" programming book (originally) in French?

My native language is Norwegian, and I'm afraid I've not read a whole lot in Norwegian that I feel I could recommend (mostly because a lot of the most obvious interesting candidates like Simula and MVC are written in English anyway). Besides, I'm afraid the set of people that don't already speak/understand Norwegian, and would like to learn it in order to be literate in programming/technical jargon/style is rather small...?

I also speak Japanese, but I can't really claim to be literate -- I'd be happy to receive some best-of recommendations there too -- the only way to become technically literate is to read, after all :-)

omaranto · on July 11, 2014

I really like Christian Queinnec's Lisp in Small Pieces. I read the English translation but the original is in French, Principes d'implantation de Scheme et Lisp (that's actually the title of a revised edition, the first edition was called Les Langages Lisp).

andrewflnr · on July 11, 2014

The point of an article like this is to warn people about things like that in advance, not make them find out themselves. For that reason, at least, it's silly to tell your readers to read a book you know very well most of them are incapable of reading; that's a waste of their time.

groovy2shoes · on July 11, 2014

Why would someone who doesn't understand French waste their time trying to read a book called "Le Langage Caml"?

rwmj · on July 10, 2014

Happy to have patches to the various real-world virt-* tools written in OCaml here:

https://github.com/libguestfs/libguestfs

avsm · on July 10, 2014

Just started a thread on ocaml.org about putting a list of recommended projects for newcomers to look through the code for; https://github.com/ocaml/ocaml.org/issues/497

rudi-c · on July 11, 2014

Are there any similar resources for other functional languages that go deep into justifying language features, explaining their implementations and the underlying runtime? I come across way too many tutorials or books that only show how to use the syntax, with occasionally a neat piece of syntactic sugar or two.

cwyers · on July 10, 2014

I've been thinking about teaching myself a functional programming language as a side project. Is there any reason to pick OCaml over F#?

yawaramin · on July 11, 2014

To be free from the .Net runtime. If you're deploying on an environment where .Net would be unnecessary overhead, you can use OCaml, compile to native code, and distribute with a minimal runtime.

cwyers · on July 11, 2014

Well, my personal machine runs Windows, and at work we're a heavy C#/SQL Server shop, so .Net is fine with me.

Jam0864 · on July 11, 2014

Better cross-platform support. In my experience .NET works fine in Mono anyway, so by extension F# is likely fine on Linux/OSX, but your mileage may vary.

pepijndevos · on July 11, 2014

Who is using OCaml in production outside of Jane Street?

I'm at the moment trying to run F# on Linux...

gnuvince · on July 11, 2014

Facebook, Microsoft, Bloomberg, Citrix, etc.

http://ocaml.org/learn/companies.html

http://ocaml.org/learn/success.html

wink · on July 11, 2014

A small portion of Clojure users: http://leiningen.org/grench.html