Plans for OCaml 4.08 (janestreet.com)
246 points by girvo on July 4, 2018 | 89 comments


It's really great seeing these contributions from Jane Street. If you're interested in learning more about recent progress in the OCaml ecosystem, I recommend watching this talk: https://www.infoq.com/presentations/ocaml-browser-iot


OCaml is a great pick for folks wanting to broaden their understanding of programming languages. It has its own set of sensibilities (FP with some imperative features thrown in, rather than the other way around). It'll force you to get really good with ADTs, which is a good thing.


Is there something you'd recommend for a person who wants to learn OCaml? Maybe some book that has great exercises, or just a toy-project example for which OCaml would be a much more "natural" pick than other languages?


I recently got into OCaml (and ReasonML) and wrote a blog post with my experiences. In it I recommend several resources for both languages and explain some stuff that wasn't obvious to me as a beginner:

http://donut2d.tumblr.com/post/171205516399/my-experiences-b...

But if nothing else, be sure to check out Real World OCaml[1]. It's pretty great and helped me out a lot getting started.

[1]: https://dev.realworldocaml.org


If you want to wait a little, the "Real World OCaml" book is being rewritten. It's in beta-ish right now, but it looks like it would be usable.

Though the core language is the same, some of the language metaprogramming features have been totally switched out for other (better) ones. And it uses the modern version of the Core library (note: that is NOT the standard library).


Check out dev.realworldocaml.org


Yes, this would be the book I would recommend as well. I'll give a bunch of general advice as well:

OCaml is pretty old (introduced a year after Java IIUC), and it contains quite a few crusty libraries and programming patterns that come from age. Stick with the latest version you can find through this "modernization" process headed up by JS.

I'm on 4.06, which just made immutable strings a default, and broke a ton of libraries. If you run into a broken lib, this recent version change is almost certainly the reason. I strongly recommend 4.06 as a minimum version (if you're not working with legacy code). If there's a library you want to use that is not compatible yet, bug the maintainer to update it. I've run into problems trying to downgrade/upgrade my compiler version + all the tools. I can't recommend doing that, it kinda ruins the experimentation experience.

What else can I say? Opam, the package manager, is great. There's a great build system, nice IDE support with a set of tools called merlin, and the compiler is super fast. The compiler's field-level error messages are generally helpful (typo-fix suggestions), but can be fiendishly opaque if you happen to mess up a match boundary (which can happen easily when mixing in if/then/else statements).
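To illustrate the match-boundary problem, here is a minimal sketch. The names `describe`, `sign`, and `log` are made up for the example. A nested `match` absorbs every arm that follows it unless it is delimited, and an `if` branch containing a sequence needs `begin ... end` for the same kind of reason.

```ocaml
(* Hypothetical example: a nested match must be delimited, otherwise the
   arms that follow it are parsed as belonging to the inner match. *)
let describe (x : int option) (y : int option) : string =
  match x with
  | Some a ->
    (* The parentheses mark where the inner match ends. *)
    (match y with
     | Some b -> string_of_int (a + b)
     | None -> string_of_int a)
  | None -> "nothing"

(* Hypothetical log used only to give the branch a second statement. *)
let log : string list ref = ref []

(* if/then/else binds tighter than ';', so a multi-statement branch
   needs begin ... end (or parentheses). *)
let sign (n : int) : string =
  if n < 0 then begin
    log := "negative input" :: !log;
    "negative"
  end else "non-negative"
```

Without the parentheses in `describe`, the final `| None -> "nothing"` arm would attach to the inner match, and the resulting type or exhaustiveness message would point away from the real cause.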

Most stuff is google-able, but there's a lot of "line noise" in OCaml function signatures - especially for variant types and labeled arguments. That's harder to search for, so read that section thoroughly in the book.
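As a rough illustration of that "line noise", the sketch below defines a made-up function `pad` (not from any library) with an optional argument, a labeled argument, and a polymorphic variant, and shows the signature the compiler infers for it.

```ocaml
(* Hypothetical function, only to show how labels and polymorphic variants
   appear in an inferred signature. *)
let pad ?(fill = ' ') ~width (align : [ `Left | `Right ]) (s : string) : string =
  let missing = max 0 (width - String.length s) in
  let padding = String.make missing fill in
  match align with
  | `Left -> s ^ padding
  | `Right -> padding ^ s

(* The inferred type is:
   val pad : ?fill:char -> width:int -> [ `Left | `Right ] -> string -> string
   ?fill: is optional, width: is labeled, and the backticked tags are
   polymorphic variants. *)

let right = pad ~width:5 `Right "ab"
let dashed = pad ~fill:'-' ~width:4 `Left "x"
```

Symbols like `?fill:` and `` `Left `` are hard to put into a search engine, which is why reading the relevant chapter once pays off.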


OCaml was really born as Caml Light, in the early 90s..., it's soon to be 30 years old.


You could argue it started in 1973 as ML

https://en.wikipedia.org/wiki/ML_(programming_language)


Yeah but in this case it was really just a name change, the team that worked on Caml Light renaming it OCaml as they added more features, and stopping working on Caml Light shortly after.


Well yes, that's the re-write I'm talking about.


Facebook's Reason ML might be a more approachable introduction to OCaml concepts. Reason adds a cleaner, more JavaScript-like syntax on top of the OCaml compiler backend.

The Reason team's primary focus appears to be compiling Reason/OCaml to JavaScript. It's a nice language, but I'm not sure why they want to pass through an OCaml "middle end" before emitting JavaScript output. Why not just write a direct Reason->JavaScript compiler similar to TypeScript?

https://reasonml.github.io/


Reason is just a dialect of OCaml and actually uses the BuckleScript[1] compiler, which, before ReasonML, just took OCaml code and transpiled it to JavaScript. It's known for being super fast and generating fast and small JavaScript. But you can compile ReasonML to native binaries as well, not just JavaScript, using the BuckleScript compiler or others like jbuilder[2].

The advantage of having OCaml as the foundation is its excellent type system. It's very mature.

I'm relatively new to OCaml, just getting into it this year, but I've found that it forces me to write correct code first. I've been really impressed by it having come from the world of PHP/JavaScript/Swift and others.

Personally, I greatly prefer OCaml's syntax to Reason's. I find it much simpler, and less busy. But I know a lot of people are more comfortable with Reason's syntax. I definitely wouldn't call it "cleaner", though.

[1]: https://bucklescript.github.io [2]: https://jbuilder.readthedocs.io


There is no "OCaml middle end". Reason is parsed into the same AST as OCaml, and then passed through the rest of the OCaml compiler.


When did Jane Street take over development of OCaml? I knew they were heavy users, but I thought it was INRIA that were the main developers. This post implies that JS are running the show now...


That's not the intended implication, and Jane Street has in no way taken over the show. Our PRs are discussed vigorously, just like everyone else's, and they're definitely not always accepted!

Indeed, one of the great things about the OCaml compiler development process is that the core team is highly skeptical, and does a good job of rejecting marginal changes.


Ah ok, I totally misinterpreted. Thanks for clarifying (and thanks for all your great work with OCaml!)


>we in Jane Street’s Tools & Compilers group have been planning what we want to work on for inclusion in OCaml 4.08

You'll find similar phrasing among major contributors to Postgres.


This looks like it works like Haskell: the original group still has control, but a group of consultancies has demonstrated the ability to reliably develop and get support for their extension proposals.

JS is probably confident in their ability to get their developments merged.


"take over" is a big statement, but you could see signs of major influence as early as 2012-2013. It was already apparent at ICFP 2012.


The language was going nowhere under INRIA for ~10 years, we should be thankful for the uptick in commercial interest (Jane Street, Facebook, ...) and their contributions to the language ecosystem.


I wasn't complaining, just curious. Jane Street have done a lot for OCaml and deserve to have a big say in its future.


When I tried to really get into ocaml a few months ago I loved it! Until I actually tried to put together a project with it. Every tutorial and doc seemed to be slightly out of date or not compatible with some package version. Real World Ocaml, while excellent conceptually, is between versions and most examples were broken.

I reeeaaalllly wanted to love Ocaml, but it just seemed like a collection of half baked tools and limited library support. So I learned Julia instead.

I would love to see a little unification and life come into ocaml to get its dev tooling up to par with other modern languages.

I gave up on it when I realized I had been spending more time debugging example code and conflicting versions of 'stdlib' type packages than I had been actually writing any OCaml code.


Language ergonomics is a really underrated feature. I remember having to link system libraries to follow an OCaml tutorial, which was extremely painful. Or running into issues installing Snap for Haskell where the package manager would basically go ¯\_(ツ)_/¯. Cargo, on the other hand, has been a breeze to set up and work with. Makes all the difference when your users aren't willing to dive into the source code to fix the issue or wait for a StackOverflow/GitHub Issue response.


I am sure this is why the majority of software in use today is written in programming languages with great ergonomics such as C/C++ or Java. Besides you can’t use cargo once you are not developing a green field project and multi-language ergonomics can totally suck.


Opam combined with Oasis (https://ocaml.org/learn/tutorials/setting_up_with_oasis.html) does a pretty good job IMO.

My personal issue with OCaml eco-system is that for quality tooling Emacs feels like the only option.


I am using VSCode with vscode-reasonml extension and I thought it is pretty good. Am I missing something by not going with Emacs?


In all fairness I tried it a few months ago. Auto-complete was hit and miss and I sometimes got false negative errors. Glad to hear it works now apparently! :D


The best way to set up OCaml (native) is to follow https://github.com/janestreet/install-ocaml. If you're following Real World OCaml, the stdlib they use is Jane Street's Core/Base, which is quite comprehensive compared to the one shipped with the language. Mixing and matching the book with the language's own stdlib can cause a lot of heartburn.

There is also a lot of work going on in making the OCaml native development experience easier. install-ocaml (link above) actually takes you through building and testing a project using Dune (formerly JBuilder) which is fast becoming the standard way to build OCaml projects.

If you are however interested in just the language, you can get wonderful ergonomics by using BuckleScript/ReasonML. You'll be running your code using Node, or in the browser, and you'll get access to the large and wild npm ecosystem. You'll lose out on the native OCaml ecosystem, but there are attempts to meld both together.

If none of this is your cup of tea, then you have a great alternative in F# - it is Microsoft's attempt at OCaml, and has a few great features that are not present in the original language. It comes with great Visual Studio IDE integration and sits in the .net ecosystem like Clojure or Scala does in the JVM ecosystem.

Or you can use Elm or PureScript which are statically typed functional programming languages that run on the browser, or Haskell, or even Idris.

These languages have great differences between them, but all of them are ultimately statically typed functional programming languages. Julia is not. Typed FP gives you ADTs, immutability by default, functions as first-class constructs, and lets you model complex software as just pure transformations of data. These things are worth learning for their own sake, and once you understand the motivations behind them well enough, the occasional jank will be but a minor annoyance.


I'll definitely be giving it a try again at some point in the future, or one of the alternatives. I learned a ton even just dipping my toe in the waters.


We don't know yet if the developers of Julia value backwards compatibility -- they haven't reached 1.0 yet. So, I find that a rather strange choice given the reason you put forward against ocaml.


That's funny because that's exactly how I felt about learning OCaml, except that I reverted to Python instead (and retrospectively wondered what the heck I was doing toying with such a mess.) You got to find a language you feel comfortable with and that gets stuff done without fussing around with basic stuff like core libs and tutorials. It's a shame developers of the core language don't get it and sabotage the adoption just for lack of craftsmanship. I am sure OCaml would have been fantastic for my use case (context free parsing) but I just gave up and went back to something that just plain worked.


It has improved a lot over the last years. And python falls flat in other areas once you’ve hit a critical mass of code.


It will be interesting to see how the $200 million+ Tezos warchest affects the popularity and development of OCaml. That seems to be a fairly large amount in relation to the size of the current OCaml ecosystem, and the whole purpose of the fund is to grow the ecosystem of Tezos which by necessity includes the OCaml ecosystem.


I had a look at Tezos -- which I've never seen mentioned before.

If you mean this "Tezos: the self-amending cryptographic ledger" ( https://tezos.com/ ), this looks like something done by 4 amateurs in their spare time.

I don't know if it has "$200 million+" warchest, but if it does I pity the fools that gave it that.


I think you have the wrong impression. This is one of the more well thought out projects in the space. Don't let the cover (marketing schlock) mislead you.


If you haven't even heard of it before, then your five second analysis of the home page isn't really worth sharing.

I don't have a lot of faith in Tezos, and it has spent most of the time since its (well-known, but apparently not to you) over-funded ICO in legal fights over the money, instead of engineering. However, it's only fair to observe that a lot more thought and work did go into Tezos than most of the other Blockchain crap we see every day. Glancing at the marketing splash is hardly enough to assess that.


>If you haven't even heard of it before, then your five second analysis of the home page isn't really worth sharing.

I'm not so sure. Besides there are lots of way more well known cryptocurrencies that are amateur hour themselves. It comes with the territory.

>well-known, but apparently not to you

Yeah, it's world famous among the people who follow these things...


FB's interest in the ecosystem with Reason might also give a nice boost in development resources for libraries and the compiler.


I hadn't heard of Tezos. Following that up I find https://tezos.com/, "a new platform for smart contracts and decentralized applications", leading to http://doc.tzalpha.net/index.html which is indeed full of Ocaml stuff.

Where does $200 million+ come into it? What is the history there?


They raised that much money in their "ICO"

https://news.ycombinator.com/item?id=15503768


Thanks. Wish I hadn't asked.


For a smart contract platform, it's probably ok that ocaml isn't multithreading friendly. But what a disappointment if this is still the case: https://ib-krajewski.blogspot.com/2015/11/ocaml-and-multithr...

At this point, if a language doesn't have first class support for multithreading and static typing, I wouldn't start a new project in it. Nullability support is pretty high up there too, but that is a far more tractable problem.


>But what a disappointment if this is still the case

Well, it's not yet ready, but it's incoming

https://discuss.ocaml.org/t/ocaml-multicore-report-on-a-june...


> Nullability support

Nullability? As in null references? The "billion-dollar mistake" (Tony Hoare, referring to Algol), also frequently called the "worst mistake in computer science"?


No as in something like swift optionals:

https://www.tutorialspoint.com/swift/swift_optionals.htm

It's also called nullability support, you can see it in use in this article:

http://journal.stuffwithstuff.com/2011/10/29/a-proposal-for-...


I'm not sure what you mean, optionals are just a sum type (says so in your own article) and Ocaml support for them is first class, also option is a built in Ocaml type http://batteries.forge.ocamlcore.org/doc.preview:batteries-b...


Julia and C# call it Nullable, other Languages Option or Maybe.

Option type on Wikipedia; can't paste the link somehow.


Oh, option types! I don't know Julia or C#, so never heard this called nullability before.

In any case, as someone else pointed out, OCaml has support for option types built in, through sum types.

I agree that sum types are indispensable in a modern programming language.
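For anyone who knows this feature as "nullability", a minimal sketch of the OCaml equivalent follows. The functions `find_port` and `port_or_default` are invented for illustration; only the `option` type itself comes from the standard library.

```ocaml
(* option is defined in the standard library as:
   type 'a option = None | Some of 'a *)

(* Hypothetical lookup: returns None instead of a null reference. *)
let find_port (host : string) : int option =
  match host with
  | "localhost" -> Some 8080
  | _ -> None

(* The int cannot be used without unwrapping it first, and the compiler
   warns if either case is left unhandled. *)
let port_or_default (host : string) : int =
  match find_port host with
  | Some p -> p
  | None -> 80
```

Because `int option` and `int` are different types, forgetting the `None` case shows up at compile time instead of as a runtime null dereference.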


Modern OCaml multithreading is still a couple years away, the current (very promising) approach is still at the design/experimental stage.

Out of curiosity, what is it about smart contracts that makes multithreading so important? Why isn't multiprocessing enough? [edit: sorry, I mis-read your comment. You said that multithreading probably isn't a big deal for this problem.]


> Modern OCaml multithreading is still a couple years away

It's been a couple of years away for the last 10 years.


I know, and I hate sharing "it's coming soon" stories for that reason. But the recent multicore progress-update does seem well conceived, and so personally I'm giving them the benefit of the doubt. :)


For the record, the current story is that they are merging some multicore things into 4.08, and it will only be available as an "experimental" feature. So even after it is in the compiler, it will be a couple more revisions before it is fully accepted.


Right, I linked to these summary notes in a different comment:

https://discuss.ocaml.org/t/ocaml-multicore-report-on-a-june...


It doesn’t necessarily have to be that far out: https://twitter.com/kc_srk/status/994535108127404038?s=20


That would be very nice. I was basing my comments on these meeting notes [1], which seem a bit more conservative. (Multicore should be available experimentally in early 2019, but not ready for production use. I guess the next steps depend on how well those experiments pan out!)

https://discuss.ocaml.org/t/ocaml-multicore-report-on-a-june...


Not for smart contracts, just as a language in general.


Nullability? Why?


This is a nice counterexample to the wisdom that production compilers all use hand-written parsers. Of course, the current parser does suck, but...


IIRC OCaml compilation was super fast and the errors were useful and pointed to the correct line.

What is/was the problem with the current parser?


As the blog post says: "Error: Syntax Error".

Helpful error messages have not been my experience.


Syntax error messages are suboptimal, but they are usually reported at the correct location, so it's relatively easy to spot the problem.

On the other hand type error messages are simply awesome.


Generally, auto-generated parsers have huge problems with error recovery (they bail out quickly) and reporting. They usually excel in performance, however. I'm not sure where OCaml stands.


They do not want to change the parser but the parser generator.


One issue is mentioned in the blog post:

> The days of Error: Syntax error. may be numbered


How is this a counter-example?

They move from an auto-generated parser that sucks to another one (whose quality is yet to be seen).

Even if it proves to be great, it would still be an outlier.


How's the performance of OCaml, and what's its concurrency story?

I feel like on paper, at least, OCaml ticks all the boxes on my programming language wishlist: non-nullability, ADTs, type inference, compiles to binary. A lot like rust but a little higher level. But I've never really had a go at it. I'm sort of reluctant to sink time in it if there's not a future, but it seems to be kind of chugging along, not really gaining or losing ground, AFAICT.


The concurrency story is similar to what you get in Python. On one hand, you have event loop implementations with various conveniences for scheduling callbacks (Async, LWT); on the other, you have OS-level threads, but the runtime system has a global lock which makes only one thread execute at a given time. So no parallelism without forking[1] to separate processes. Moreover, the interactions between an event loop and OS-level threads which sometimes arise are non-trivial and in my experience very hard to get right, even with the helpers in Async.

Single-threaded performance is generally good when natively compiled. In a benchmark I did a few years back, which consisted of traversing the filesystem with `opendir`, `readdir` and friends, OCaml was minimally slower than C++ and Nim (when compiled natively) and around Python and Racket when byte-compiled. It probably would be even faster if I used its imperative features. Looking at the code today I also see some unnecessary copying of lists, which could be mitigated. All in all, I think OCaml has a really good performance considering its high-level semantics. You have to be aware of relative costs of operations, but if you do, "as fast as C" is certainly attainable. OCaml code for the benchmark: https://klibert.pl/posts/walkfiles-ocaml.html and the results: https://klibert.pl/statics/images/walkfiled_perf_test1.png

[1] Actually, I just checked and it looks that `fork` is not implemented, so copy-on-write memory sharing is impossible, even on POSIX systems. Docs: http://caml.inria.fr/pub/docs/manual-ocaml/libunix.html http://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html


Re-read the page that you linked. Fork isn't implemented on Windows, where fork doesn't exist anyways. Forking works just fine in Ocaml (on Unixes).

Unix module docs (with fork!): http://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html


As a language, it's great. As a real world environment/eco-system to get stuff done? I've been waiting for it to get there for years, and it still hasn't. I check in every 12 months or so, am reminded how nice the language is, and quickly abandon ship when I realise how annoying it is to actually develop software.


> what's its concurrency story

I think concurrency is fine (threads and async are options) but parallelism still seems to always be just around the corner and never actually here. I understand part of the problem is making their GC able to handle parallel allocations without hurting basic allocation performance more than they're willing to do.


Could someone write the equivalent of this in Haskell (assuming the GADTs extension)?

    type 'a ty = Int : int ty | Bool : bool ty | String : string ty;;


  data Ty a where
    Int  :: Ty Int
    Bool :: Ty Bool
    ...
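For context on why one would write the OCaml type this way, here is a small sketch of how such a GADT is typically consumed. The function `to_string` is a made-up example, not something from the thread.

```ocaml
type 'a ty =
  | Int : int ty
  | Bool : bool ty
  | String : string ty

(* Hypothetical printer: matching on the witness refines the type variable
   in each branch, so one function accepts an int, a bool, or a string. *)
let to_string : type a. a ty -> a -> string =
 fun ty v ->
  match ty with
  | Int -> string_of_int v
  | Bool -> string_of_bool v
  | String -> v
```

Calling `to_string Int 42` or `to_string String "hi"` type-checks, while `to_string Int "hi"` is rejected at compile time. The locally abstract type annotation (`type a.`) is what lets each branch see a different concrete type for `v`.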


Hopefully multicore OCaml will land in the mainline eventually. In my opinion, the lack of it is OCaml's biggest disadvantage compared to Rust or Haskell.


Great, I'm using OCaml for production code and it's the most pleasant programming experience I've had so far.


They should focus on developer tooling for wider adoption. More OCaml developers would be a win-win for Jane Street.


I'm wondering whether Ceylon could be considered a modern day Ocaml substitute? It has a very strong type system, focus on immutability, a strong module system and Object Oriented capability. What's missing?


Ocaml is a perfectly good modern day language. Our entire codebase at work is built on it and we’re pretty satisfied.


I don't think Jane Street would be using OCaml to this extent if it needed a "modern day substitute". And the language is evolving, which is always a healthy sign.


Okay, wrong wording. However, Ocaml has several 'warts'. https://news.ycombinator.com/item?id=9583659

I suggested Ceylon as the 'modern day substitute' in the sense that it has had the opportunity to learn from mistakes made by the older FP languages. Ceylon has a very consistent type system and is therefore able to deliver more clear error messages.


Since that user didn't specify them, what would you view as these warts?


It has 3 (4?) different, not very compatible stdlibs, for example. Also 3 tools for syntactic extension, two of which suck incredibly, while the last one is pretty new. It has 2 (3 if you count Reason) very different syntaxes defined, even though I've never seen the "light" kind in the wild. The whole String mutability story is interesting in its own right. I'm sure there's a lot more.

But note: every single language has warts like this, especially if it's been developed for a couple of years already. I think OCaml has probably fewer warts than JavaScript, or PHP, or even Java; but more than Go, or Elixir, or Scheme. It's still a very nice language and toolchain, though, perfectly usable in many cases - the learning curve may be steep at some points, but on average it's not that hard to learn, and you gain a lot of benefits if you do.


OCaml isn't without its warts (like any language), but I don't quite think you've nailed them.

For one, the different stdlibs are in fact highly compatible. Basic types (option, result, string, int, array, float) are all the same, so code using different stdlibs works together seamlessly most of the time.

Lwt and Async are a different story, and there is a real incompatibility problem there.

The syntax extension story is pretty clear and simple: PPX rules the roost, and the tools for building PPXs are quickly getting better and more unified. Reason is an interesting variant in the ecosystem, but its existence doesn't amount to a wart in my eyes. It's an alternative syntax that you can use interoperably with the rest of the OCaml ecosystem (and Dune makes that awfully easy.)


Some clarifications:

It has one standard library, and several alternative standard libraries of varying popularity. I would argue that Core is probably the most popular. For what it's worth, Haskell also has several replacements for its standard library, Prelude.

For the syntax extensions, the community has largely migrated to PPX. Alternatives are being phased out.

String is no longer mutable in recent versions of OCaml. The string type is now immutable, and a new type, bytes, has been introduced which is mutable.
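A minimal sketch of the split, using only standard library functions (`Bytes.of_string`, `Bytes.set`, `Bytes.to_string`); the values `greeting` and `shout` are just illustrative names.

```ocaml
(* Strings are immutable; Bytes is the mutable counterpart. *)
let greeting : string = "hello"

(* Copy into a mutable buffer, modify it in place, and convert back. *)
let shout : string =
  let buf = Bytes.of_string greeting in
  Bytes.set buf 0 'H';
  Bytes.to_string buf

(* greeting is unchanged because Bytes.of_string made a copy. *)
```

Libraries that used to mutate strings in place are the ones that broke when immutable strings became the default, and porting them usually means switching the mutable parts to `bytes` like this.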


I don't think Ceylon is really in the same category as OCaml. OCaml is closer to F# and Haskell. Ceylon is probably closer to Kotlin.


I agree with the rest of your comment, but I think "Ceylon is probably closer to Kotlin" is unfair to Ceylon.

Kotlin basically has the Java type system with a few extensions, but no substantial changes.

Ceylon has its own type system that draws heavily from the FP/ADT style of thinking. It's not OCaml (nor does it want to be), but it's also not Kotlin.


Where does it fall relative to Scala?


Is there anything OCaml can do that no other language can do?


The most distinctive feature is the type system, which allows you to both model domain problems effectively and guarantee consistency. This catches a huge number of bugs at compile time.
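As a rough sketch of what that looks like in practice, consider a made-up payment domain (the type `payment` and function `status_text` are hypothetical). Each state carries exactly the data that makes sense for it, so inconsistent combinations cannot be constructed.

```ocaml
(* Hypothetical domain: a payment is either pending, settled with a
   timestamp, or failed with a reason. A settled payment without a
   timestamp cannot be constructed. *)
type payment =
  | Pending
  | Settled of { at : float }
  | Failed of { reason : string }

(* The compiler checks that every state is handled; adding a new
   constructor later triggers a warning at each incomplete match. *)
let status_text (p : payment) : string =
  match p with
  | Pending -> "pending"
  | Settled { at } -> Printf.sprintf "settled at %.0f" at
  | Failed { reason } -> "failed: " ^ reason
```

Compare this with a record holding a status string plus nullable `at` and `reason` fields, where nothing stops a "pending" payment from carrying a failure reason.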



