Hacker News new | past | comments | ask | show | jobs | submit login
Functional Programming in OCaml (cornell.edu)
176 points by noch on July 27, 2021 | hide | past | favorite | 108 comments



Learning OCaml will teach you the languages and concepts of the next 5 to 10 years of mainstream programming languages. E.g., here's a TC39 proposal to add declarations to conditionals: https://github.com/tc39/proposal-Declarations-in-Conditional...

It says, in the 'Future Work' section:

> This could be extended to allowing multiple names to be initialized (such as through destructuring) with a "normal" conditional to be written after the assignment with a ; (similar to a for).

    if (let x = 1, y = 2; x || y) {
        /* ... */
    }

    if (let {x, y} = data; x && y) {
        /* ... */
    }
Meanwhile, OCaml from day one:

    if let x = true in let y = true in x || y then
      ...

    if let { x; y } = data in x && y then
      ...


Some other examples: immutable records and tuples in JS, records in Java and C#; pattern matching in Java and C#, stage 1 in JS; algebraic data types with sealed classes coming in Java, and used a lot in Rust.


Reminds me of how lambdas got added to mainstream languages like Python, Java, and C++, when various other languages like Lisp, Smalltalk, SML, etc. had had them decades before.


Yep, LINQ is nothing new for anyone that has used Smalltalk collections for example.


> Learning OCaml will teach you the languages and concepts of the next 5 to 10 years of mainstream programming languages

I think this assertion was true 20-25 years ago. Less so now.


On the currently in OCaml and ahead of the time front:

* GADTs

* Row polymorphism with inference

* First-class modules

* Let binding operators, e.g., let* for monadic binding

On the coming soon to OCaml (hopefully) and way ahead front:

* Algebraic effects

* Modular implicits


Can you explain or direct me to resources on what these are/do?


* GADTs: generalized algebraic data types. As the name suggests, GADTs broaden the set of type-level constraints that one can encode using ADTs. A canonical example involves using GADTs to strongly type an AST and evaluator function s.t. attempting to construct or operate on an invalid AST, e.g., one that represents addition between a boolean and an integer, is a compile-time error [1]. Another great tutorial on GADTs is [2].

* Row polymorphism: statically type-checked duck types, for lack of a better description. In dynamic languages like python or javascript, it's common to pass dict or dict-like objects to functions and have them extract the key/value pairs that they need. Row polymorphism lets you have your cake and eat it too by enforcing enough structure to ensure that functions only receive records containing all relevant keys — without requiring that functions take a specific concrete type.

* First-class modules: another tool for type-level programming and modularity. I won't say much since it's hard to understand why you want and at times need them until you've organically come across a use-case for them. Once you do, you'll wish that other languages had them. Here's a resource, though: [4].

* Let binding operators: these let you define your semantics for binding values to variables. A common use-case is defining let* as the monadic bind operator for some type: let (let*) x f = bind x f, after which writing monadic functions is way more pleasant syntax-wise. In contrast to the other items on this list, this is pure syntactic sugar but incredibly impactful [5].

* Algebraic effects: the conceptual root of concepts such as generators (yield), async/await, and checked assumptions. Informally, they're (among other things) a way to encapsulate computation and jump around code like a goto statement, but in a more structured and type/memory-safe way [6].

* Modular implicits: these allow (among other things) ad-hoc polymorphism, meaning that the same operator, e.g. (+), can operate on both integers and floats. That's the oft-cited use-case, and it sounds pedestrian, seeing as most mainstream languages with weaker type systems have it. There's more to it than that [7]. Haskell and Scala have type classes and implicits as mechanisms for ad-hoc polymorphism, but modular implicits (if implemented) seem to strike a nice balance between readability, flexibility, and transparency.

[1]: https://blog.mads-hartmann.com/ocaml/2015/01/05/gadt-ocaml.h...

[2]: https://sketch.sh/s/yH0MJiujNSiofDWOU85loX/

[3]: https://stackoverflow.com/questions/48092739/what-are-row-ty...

[4]: https://dev.realworldocaml.org/first-class-modules.html

[5]: https://jobjo.github.io/2019/04/24/ocaml-has-some-new-shiny-...

[6]: https://overreacted.io/algebraic-effects-for-the-rest-of-us/

[7]: https://arxiv.org/pdf/1512.01895.pdf


Agree. Some did it better, but languages have caught up. Even the C you learned in school has evolved


Evolution comes at great cost though. Languages have to find ways to add new functionality while preserving backward compatibility. And the cost is great in human terms as well. Python added a walrus operator after so much fighting that Guido van Rossum decided to resign as BDFL. Meanwhile, OCaml doesn't even need an explicit new feature for that, because it's expression-oriented from day one.

Not to forget that when languages pile on features over the course of years, they effectively split their users into 'old-style' and 'new-style' codebases, where people need to worry about what features are usable with what versions of the language. It's a huge mess.


There is a very relevant detail many people forget when talking about programming languages, they are software products, no different than trying to sell a new version of a word processor or vector design application.

For some people Word 1.0 for MS-DOS is all they need, whereas others won't do without the latest set of Office 365 features, same applies to programming languages, as per application architecture, software design tooling and deployment scenarios.


5-10 years ahead? So they introduced multiprocessor support in like 2006? (taking c++ 11 as a reference point). That would have been pretty awesome.


That sounds about right, given that you can do multiprocessing pretty easily in OCaml by forking. They even provide a convenience function directly in the standard library, Unix.establish_server, that forks a new process to handle requests.


Forks on windows works in ocaml?


Nope, but you can use Unix.create_process. Which is admittedly not as convenient, but that's Windows for you.


fork() if a POSIX thing, and Windows is not the only non-POSIX OS around.

Also fork() doesn't work consistenly across POSIX implementations regarding threads and signals.


But the OCaml documentation for 'Unix.fork' says that it's not implemented on Windows. It doesn't mention any other OS.


I don't care about OCaml documentation, my point was about "Which is admittedly not as convenient, but that's Windows for you.", and UNIX versus other platforms in general.


Past threads related to this book:

Functional Programming in OCaml - https://news.ycombinator.com/item?id=22408664 - Feb 2020 (58 comments)

Functional Programming in OCaml - https://news.ycombinator.com/item?id=22400233 - Feb 2020 (1 comment)

Functional Programming in OCaml - https://news.ycombinator.com/item?id=19292067 - March 2019 (144 comments)


Also this https://news.ycombinator.com/item?id=27966776 22 Hours ago (2 comments)


There is a video playlist by Michael Clarkson which has been released recently too.

He is a pretty good communicator!

https://www.youtube.com/playlist?list=PLre5AT9JnKShBOPeuiD9b...


I’ve been working on an OCaml by example page. It’s not really ready for prime time but you can at least learn the basics with it and hell, even contribute if you want :)

https://o1-labs.github.io/ocamlbyexample/


I love that you started with opam, and not hello-world. More language guides should spend more time introducing tooling.


Yea exactly this !

A great help here is a "start-project-template" like with svelte (for example), they have RIGHT on the home-page a big-ass block that shows you how to "get and start" the svelte-new-project-template.

Coming to OCAML the "tooling" can be confusing and ambiguous. For example you can use ocamlc, dune,ocamlbuilder,esy opam etc.

And of course the "few options" for a std lib Core, Batteries etc.

I like the "syntax" from the Batteries more to be honest.


Looking good! Is there a reason for why you use Core at some points?


It seems like a lot of people use Core instead of the stdlib, so much that I felt like it was best practice to do that. I explain that in the first sections.


It would be nice if it had examples without Core, too, using only OCaml's standard library. I would like to stick to the standard library first, and if it is not enough, then maybe OCaml Batteries or Core.

Perhaps give a reference to "libraries" on the first page (or all?) where you use Core so people know how to use Core (opam, dune).


I've written a condensed intro without Jane Street libraries: https://dev.to/yawaramin/practical-ocaml-314j


Thank you! Personally I know a lot about OCaml and I have a few projects written in it, but it will still be useful as a refreshener (been away for a while), plus I am pretty sure it will be helpful to many!


Yeah that makes sense, I think it might be a good idea to teach with the stdlib.


I didn't find any explanation other than it's well-accepted. That's true, however if it doesn't bring any new capabilities I think using just the standard library would make the tutorial more helpful and universal.


Yeah I’ve been thinking about that a lot and I might just get rid of Core, at least where it’s not needed


I love it! Great job!


This book was my introduction to OCaml, and I followed it concurrently with the Nand2Tetris course. Project 6 of Nand2Tetris [1] has you write an assembler for their hack computer, and OCaml seemed like a perfect fit. Chapter 10 of this book introduces you to parsing and lexing, and it was really fun following along while writing the hack assembler.

Here's an implementation for those interested: https://github.com/senhorsolar/hack-assembler

[1] https://www.nand2tetris.org/project06


(Inspired by this post,) I gave the official ocaml tutorial a whirl. I quite enjoyed the syntax, and the easy install (of opam).

However, I thought I should mention that I got _exactly_ as far as [0] (found while trying to google a solution) before giving up. The REPL is great, but if I can't compile easily, it doesn't inspire the confidence needed to go any further.

[0] https://news.ycombinator.com/item?id=7418543


[0] is a post from seven years ago where the poster installed ocaml using the Debian package manager rather than through opam as recommended and failed to use a core library because they didn't understand the admittedly strange way Debian packages Ocaml. You probably shouldn't use Debian packages for Ocaml (nor for anything else admittedly - Debian has terrible habits regarding patching and splitting).

The community actually standardized its way of compiling around Dune which is very good.


Yes, it was uncanny that I was led to a seven year old post, trying to follow the currently available "a first hour with ocaml" tutorial [1]

[1] https://ocaml.org/learn/tutorials/a_first_hour_with_ocaml.ht...


There is a relatively new page to install OCaml that works really well (on Linux at least) https://ocaml.org/learn/tutorials/up_and_running.html. As the sibling mentionned, compiling is mostly done through dune https://github.com/ocaml/dune


It appears I unwittingly skipped a step, misled by the title "a first hour with ocaml" [1]. Thanks for the tip on dune.

[1] https://ocaml.org/learn/tutorials/a_first_hour_with_ocaml.ht...


Which 'official OCaml tutorial'? The only one I can find is the 'Up and Running' one that someone else posted, and that uses the modern toolchain with opam and dune which would prevent you from running into the error described in the 2014 comment.


I guess I skipped "up and running" and opted for "a first hour with ocaml" [1]

In my defense, [1] makes no reference to dune, which I am excited to try out.

[1] https://ocaml.org/learn/tutorials/a_first_hour_with_ocaml.ht...


'A first hour with OCaml' says, at the beginning:

> You may follow along with this tutorial with just a basic OCaml installation, as described in Up and Running.

So, you can't really skip 'Up and Running'. And, the Graphics module usage is in the section 'A module from OPAM', which explicitly talks about installing the package and then loading it in the REPL ('top level').


If you want to learn OCaml, you should probably check https://dev.realworldocaml.org/ (free to read online)

It is co authored by Yaron Minsky, I think Yaron plays a big role in making OCaml relevant

I think OCaml would have been a far less popular language, had Yaron chosen another language for jane street


IMHO, the OP book is a better choice for learning as it explains FP and OCaml's implementation of it from the ground up, and doesn't rely on Jane Street libraries.


As a veteran of this class, I do believe it helped make me a better programmer. There's a noticeable divide in terms of programming confidence/flexibility/efficiency among the Cornell CS students who do well in this class and enjoy it, and those who fear it or runaway from it. That said, a lot of the benefit came from this course gathering the brightest CS students as TAs. Without having access to those TAs and the challenges they posed in section or office hours, I'm not sure it has nearly as much value.


+1,

once you grok functors and algebraic type systems, a lot of other system design patterns seem like abstractions on a few key concepts.


The intro page says this course is not about OCaml, but then the remaining pages go into detail about the various features of OCaml. The problem with this approach is that when you read about a feature, you don't know if it is a OCaml-specific thing or whether it is a general functional programming concept.

I would like to see a course or textbook that explains functional programming in the abstract. Instead of going into the details of any one functional language, gives examples of how various functional languages implement the various core ideas of functional programming.

So, explain functional programming concepts such as pure functions, referential transparency, functors, monoids, monads, effects, lazy evaluation and so on, not in the context of any specific language, but giving examples from multiple functional languages.

In other words, focus on functional programming in general, instead of one specific implementation of it. I have been looking for such a textbook or course, but it doesn't exist.


I don't think that would be helpful. If a book is too abstract it won't teach anything useful. Teaching something concrete is a good way to allow students to synthesize abstract ideas over time. And OCaml is a good vehicle for it because it has so much in common with other mainstream languages, especially nowadays with the trend towards functional/imperative/OOP mixed paradigms.


> If a book is too abstract it won't teach anything useful.

Lots of developers are curious about functional programming. Even if you don't program in a functional programming language, many principles of functional programming, such as pure functions, can be applied in any language. Almost all languages in use today, including JavaScript, C#, Java, Python and Rust have varying degrees of FP features. Understanding what FP is all about is helpful in learning those features.

I'd argue that the market for learning what FP is all about, and how you can apply FP features in traditional languages, is bigger than the market for learning specific FP languages such as OCaml, Scala, F# and so on.


But in order to teach something actually useful you need to provide code samples in a specific language. Speaking in generalities like 'Functional programming emphasizes pure functions' doesn't really tell people how to practice it.


Sure, and as I said before, provide examples in multiple languages, because the demographic I am talking about is interested in FP, not one particular FP language.


Overloading the learner with examples from multiple languages is also a bad teaching method. It just adds to the confusion and complexity. They will be overwhelmed with the different techniques needed to bolt functional programming on to various degrees in C#, JavaScript, Rust, etc.

When teaching, it's much better to teach as little as possible at a time, so that learners can absorb concepts easily through their working memory and then build on their knowledge over time.


Programs = Algorithms + Data Structures

Hence why I really favour CS books that rather pick a pseudo-code language, than trying to sell language X as part of exercises.


Not a book, but you may like it, i did:

https://www.lihaoyi.com/post/WhatsFunctionalProgrammingAllAb...

(edit) a post that uses JS, PHP, Python etc. to explain many FP concepts: http://chriswarbo.net/blog/2015-01-18-learning_functional_pr...


If you can find a copy, the first edition of ‘Introduction to Functional Programming’ by Richard Bird covers the concepts of functional programming in a language-agnostic way.


> functors, monoids, monads, lazy evaluation

These three things are mostly Haskell-specific though, at least in relation to functional programming.



Rust 'does' lazy evaluation only in the sense that you can implement it if you go out of your way. It's not the default like in Haskell. It's not a functional programming language–having functors, monoids, and monads (in some restricted sense) doesn't make a language FP.


Lazy evaluation is not a requirement of functional programming. There's nothing really in typed lambda calculus requiring any laziness, that's just one possible way of expression evaluation.

The only real requirement of functional programming is the support of first class functions, the possibility to pass and return functions like any other values.

If you want to discuss your own interpretation of what makes a language more or less functional, which could be interesting, please provide a definition first.


Note that I didn't say that lazy evaluation is needed for functional programming. I said that Rust doesn't have lazy evaluation, and that it's not a functional programming language.

IMHO, a functional programming language requires:

- First-class functions and function literal (lambda) syntax support

- Expression-oriented syntax

- Emphasis on immutable data structures

- Tail call optimization

Other features are 'take-it-or-leave-it'.


So given that ML languages don't require tail call optimizations, and allow for mutable data structures....


AFAIK all ML languages support tail call optimization, and put the emphasis on immutable data structures.


Nope, Scheme is one of the few languages that require tail call optimisation as per language standard.

There is no emphasis when mutability is one mut and ref cell away.


OCaml doesn't have a language standard afaik, and its implementation(s) support(s) TCO, so imho that's more important.

'Emphasis' just means the 'paved path', i.e. default data structures like records and variants are immutable, default list type is immutable. Of course if you want mutable stuff it's there, but it's not the default.


Even so, OCaml isn't all ML languages.


Afaik all ML languages have TCO, let me know if I'm wrong.


> The only real requirement of functional programming is the support of first class functions, the possibility to pass and return functions like any other values.

By that definition C language supports functional programming. I think a lot of people would disagree with your definition.


It does. I enjoyed the book Functional C by Hartel & Muller a lot:

https://www.semanticscholar.org/paper/Functional-C-Hartel-Mu...

It uses SML, had fun learning this stuff on the train with termux and vim.


Actually, first class functions also means the ability to form functions (ie, closures). Javascript for instance has that feature. I think that python also supports it. C OTOH doesn't, but it has other advantages.


Basic implementation for car, cdr, map with function pointers and clang blocks, the rest of Lispy stuff is left as an exercise. :)

https://godbolt.org/z/Y5qWs4c64


That's cool!

C++ had support for partial applications with templates (std::bind iirc), and it was possible to implement closures as objects with the () operator implemented. Now it has support for closure with a dedicated syntax.

It would be possible to do something similar in C - a closure is just a function pointer and a bunch of parameters bundled together, but I don't think there's enough flexibility in the language to make it look like regular function calls.


I don't think it's particularly out of the way. Iterators are a normal part of rust, and are lazy.


Rust doesn't have useful functors or monads since it doesn't have higher-kinded types.


Monads or something like them are vital to be able to recover imperative programming in a functional context. If functional programming means programming with functions (in the mathematical sense) then it means expression-oriented languages/styles, and how you express conventional (procedural) programming idioms in those is a key part of having a functional programming language that you can actually use.


What's wrong with OCaml's approach of just breaking out in imperative-style code whenever it suits you?


Imo, the advantage of "Haskell's approach" is that it's always obvious when you can and can't use imperative code, and the type system enforces it. I think this is important in many contexts, like STM[1] or Facebook's React[0]. But there are plenty of disadvantages:

* it's a bit more verbose.

* you can accidentally implement a weaker interface than the one you'd want, e.g. by only implementing Functor and not Traversable for your data structure.

* the flip side of the benefits is that adding I/O to a previously pure function changes its interface.

I don't know how it'd compare to OCaml's algebraic effects proposal, though.

[0]: https://reactjs.org/docs/hooks-effect.html

[1]: http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-...


You lose the freedom to fearlessly refactor that's the main advantage of functional programming. If it's not clear which parts of the code can be reordered then you either just can't keep the code clean anywhere near as much, or you need much higher test coverage etc. to keep the same level of confidence when you do.


There's nothing Haskell specific about those concepts. In fact, the first three where developed by mathematicians long before Haskell even existed.


When mathematicians developed monads, they were doing abstract algebra. They were not using them as a kludge to (for example) impose sequence on functional programming.

Having to use monads for sequencing is something that happens in Haskell and not in Rust.


Sure, I/O in Rust is a little easier because it's not wrapped up in an abstract type, but in exchange, Rust needs Try, async/.await and const-fn to solve just a few of the issues that monads solve in more generality. I don't think either choice is a kludge, it's just a bunch of trade-offs.


Just to say, Haskell doesn't need monads to sequence computations, in fact Haskell as a programming language (1989) is older than Moggi's seminal paper on monads applied to programming in 1993.


True, but imo Haskell I/O was less than pretty before monadic IO came along: https://stackoverflow.com/a/17004448


> This course is about making you a better programmer.

It is a very bold statement. There is nothing in course about how to fight complexity. Knowledge how to traverse an ADT tree helps a little unfortunately. But it is a great classic CS course though.


> There is nothing in course about how to fight complexity.

You should read it more carefully. E.g. https://www.cs.cornell.edu/courses/cs3110/2020sp/textbook/mo...

> One key solution to managing complexity of large software is modular programming: the code is composed of many different code modules that are developed separately. This allows different developers to take on discrete pieces of the system and design and implement them without having to understand all the rest. But to build large programs out of modules effectively, we need to be able to write modules that we can convince ourselves are correct in isolation from the rest of the program. Rather than have to think about every other part of the program when developing a code module, we need to be able to use local reasoning: that is, reasoning about just the module and the contract it needs to satisfy with respect to the rest of the program. If everyone has done their job, separately developed code modules can be plugged together to form a working program without every developer needing to understand everything done by every other developer in the team. This is the key idea of modular programming.


To be fair, it doesn’t say that it will make a perfect programmer, but surely knowing about functional programming and ideas won’t hurt.


I think the "better" is a secondary effect.

Imagine you are a UFC fighter and going to the gym and lift weights to be a "better fighter" even though, you didn't throw one-kick or punch one bag etc :P


I agree. It's certainly beneficial to know different paradigms but it's not the end of the story. There are OCaml codebases out there which are pretty messy.


This looks great. I'll add it to my list of 100 MOOCs I'd love to do but realistically will never get past the first session.


I remember when coursera, stanford et al came.. I had subscription boulimia .. I only finished 50% of the one I attended.


Too real.


True story


Sorry this made you upset



By the same submitter. The mods probably asked them to resubmit it, they do that with interesting submissions that don't gain traction (poor timing usually).


How is the new OCaml multicore support going? Any benchmarks yet which compare it to e.g. GoLang/Java/...?


Multicore is coming along, you can read the latest news here: https://discuss.ocaml.org/t/multicore-ocaml-june-2021/8134

In terms of performance, there is this paper https://kcsrk.info/papers/system_effects_feb_18.pdf where on a single core async OCaml and effect OCaml are close to Go's net/http, and there is also this project https://github.com/ocaml-multicore/retro-httpaf-bench but I haven't see any results from it.


The last multicore prerequisite will land with 4.13 in September. My understanding is that the next release after that due in Q1 of 2022 will have multicore support.


It is coming along, however in the age of security and distributed computing, what most applications really want are multiple processes with OS IPC anyway.


I was under the impression that multicore would help for the dev tooling world, so things like Pyre, Hack, etc.


Minor improvements, they can already make use of something like Lwp today.


Any experiences teaching kids/students programming with FP? Starting with FP that is.


The first programming course at Oxford is FP. It makes some sense: (1) sometimes diving into something helps (especially with support from tutors) (2) it gives those with existing programming knowledge something new to work with and (3) functional programming concepts are generally helpful, so it’s still useful in other spheres.


Two that I can think of:

- Bootstrap teaches a toned-down version of Racket (i.e. Scheme): https://bootstrapworld.org/materials/spring2021/en-us/course... . It's taught in some schools as well as a comp sci curriculum.

- https://code.world/ teaches using a toned-down version of Haskell. To my knowledge it's not used in schools.


The Coursera course Programming Languages (Part A and B) by Dan Grossman teaches FP (and more) using SML and Racket. I enjoyed this course a lot. I would not recommend it for kids, but it should be great for students with some limited programming experience.

[1]: https://www.coursera.org/learn/programming-languages

[2]: https://www.coursera.org/learn/programming-languages-part-b


SICP focuses on Lisp/Scheme and it's one of the best programming books for beginners I can think of.


Apparently CS 3110 (Spring 2021) was only offered to students enrolled at Cornell. If true, why is this website of interest to HN readers? Just asking.


>> Apparently CS 3110 (Spring 2021) was only offered to students enrolled at Cornell. If true, why is this website of interest to HN readers? Just asking.

Note the following:

> This work is based on over 20 years worth of course notes and intellectual contributions by the authors named above; teasing out who contributed what is, by now, not an easy task. The primary compiler and author of this work in its form as a unified textbook is Michael R. Clarkson.

You do not need to be enrolled in CS 3110 to be able to utilize this content (said "unified textbook").


So from what I understand from this the objective is to teach a “vastly different perspective” and thereby improve one's programming productivity by idealistically a factor four, by being forced into a new way of thinking?

They could have certainly done something more radical than OCaml then such as, say, Idris2 which is truly a challenge for those not used to it.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: