This is why we don't microbenchmark languages. Usually, if you dig a little deeper, you find that the difference between the languages mostly comes down to one of them doing one particular thing badly.
This, by the way, is MLton's worst-case scenario. MLton is designed to perform whole-program optimisation (unlike almost any other compiler I've seen); the gap is supposed to grow as the code gets larger (read: more realistic).
Sadly, due to its design tradeoffs (it's a whole-program batch compiler, with no interactive toplevel), it's totally unsuitable for the big ML projects (interactive theorem provers).
Can you elaborate on why being a compiler makes MLton unsuitable for theorem provers? Is it because they are usually written on top of the interpreter's REPL, and need its reflection capabilities?
It's because they use metaprogramming so extensively. As an example, HOL4 (I can only speak for that) accepts a script written by the user (as though in a REPL), compiles and runs it, and produces an SML theory file which represents their defined theory [0]. When the user starts their next theory, they are supposed to build on this computer-generated code.
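Roughly, a user-written script looks like this (a sketch with made-up names, not a real HOL4 theory):

(* fooScript.sml -- HOL4 compiles and runs this, then emits the
   computer-generated fooTheory.sml/.sig that later theories load. *)
open HolKernel boolLib bossLib;

val _ = new_theory "foo";

(* define a function; HOL4 records the definition in the theory *)
val double_def = Define `double n = n + n`;

val _ = export_theory ();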
You would have to write your own REPL, plus your own machinery for loading and writing theories, etc., and by that time you've lost the performance advantage (HOL, even after 20 years of progress and tweaks, still has only mostly-OK performance).
So it would at least require some rearchitecting, which is my point; it's not a drop-in replacement.
That makes sense. It's pretty cool to think that ML was designed as a metalanguage for theorem provers, and the innovations required for that turned out to be the next step (or next several steps) in the evolution of general-purpose languages.
MLton is unsuitable for LCF-style theorem provers, simply because it has no interactive toplevel.
ML was originally developed as a MetaLanguage for such theorem provers. The idea is to have an abstract type thm of theorems, whose operations (like modus ponens) only allow you to construct provable theorems. To prove a theorem interactively, you would start the ML toplevel and try to construct a proof of your proposition by programming in ML. The ML type system guarantees that you can only prove valid things. The toplevel is essential if you want to prove things interactively.
Modern theorem provers like Coq work differently. If Coq were written in SML rather than OCaml, then MLton should be good for Coq as well. I think people have experimented with non-interactive versions of HOL4, compiled using MLton (because it's faster than other SML compilers).
LCF-style provers do not use the Curry-Howard correspondence (ML's type system is not strong enough); having something of type thm is not enough to know that it's a theorem. The type thm is just a triple of a tag (to store any oracles etc.), a list of assumptions, and a conclusion. The main way you protect against fake theorems being constructed is (I think) to make the constructor private, and then have combinator functions that you've very carefully verified. You can certainly use mk_thm to make a theorem of whatever you want (it adds an oracle tag, but a single-line change in Thm.sml would drop that).
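To make that concrete, here's a toy sketch of the discipline (not HOL4's actual Thm.sml; the names and the formula type are made up):

(* thm is the triple described above: oracle tags, hypotheses, conclusion.
   Opaque ascription (:>) hides the constructor outside the structure. *)
datatype form = Atom of string | Imp of form * form

signature KERNEL = sig
  type thm
  val assume : form -> thm                (* {A} |- A *)
  val mp : thm -> thm -> thm              (* from A ==> B and A, infer B *)
  val mk_thm : form list -> form -> thm   (* escape hatch, oracle-tagged *)
  val dest : thm -> string list * form list * form
end

structure Kernel :> KERNEL = struct
  type thm = string list * form list * form
  fun assume a = ([], [a], a)
  fun mp (t1, h1, Imp (a, b)) (t2, h2, a') =
        if a = a' then (t1 @ t2, h1 @ h2, b)
        else raise Fail "mp: antecedent mismatch"
    | mp _ _ = raise Fail "mp: not an implication"
  fun mk_thm hyps c = (["oracle"], hyps, c)   (* the single line in question *)
  fun dest th = th
end

Outside Kernel, nothing can pattern-match on thm, so every thm was either built by the trusted rules or carries the oracle tag.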
MLton would require you to write your own metalanguage (but this is not necessarily onerous; most of Isabelle is written in thy files or Isar).
I had a think about it, and I believe the actual compelling advantage of Poly/ML is probably its saveable heaps; they really make everything a lot faster.
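For reference, the relevant API is roughly this (Poly/ML-specific; the file name is made up):

(* Build session: run all the expensive theory-building code once,
   then dump the heap to disk. *)
val () = PolyML.SaveState.saveState "hol.state";

(* Any later session: reload the saved heap instead of rebuilding. *)
val () = PolyML.SaveState.loadState "hol.state";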
The whole language is a bit janky though; there's a tonne of stuff that's totally unstandardised (production of executables, imports, etc.), which means that every compiler has fairly dramatic extensions (and, hilariously, you can't tell the compilers apart by just attempting to use their extensions, because of the type system).
So HOL4 will try to determine whether you're running on mosml or polyml by calculating factorial 100: with mosml you overflow, while polyml seamlessly switches to a bignum.
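The trick is something like this (a sketch, not HOL4's actual configuration code):

fun fact n = if n <= 1 then 1 else n * fact (n - 1)

(* Moscow ML's default int is fixed-precision, so fact 100 raises
   Overflow; Poly/ML's default int is arbitrary-precision, so this
   just returns the 158-digit result. *)
val onPolyML = (fact 100; true) handle Overflow => false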
tl;dr Coq implements a metalanguage on top of ML (one with an undecidable type system). LCF-style provers use ML itself as their metalanguage. Using MLton would force you to write your own metalanguage, because it has no interactivity.
> The main way that you prevent against fake theorems being constructed is to (I think) make the constructor private, and then have combinator functions that you've very carefully verified.
That's what I mean. The LCF approach is to give you a signature with an abstract type thm and operations to construct its inhabitants. The operations are such that you can only construct valid theorems. Example (taken from http://www.cl.cam.ac.uk/~jrh13/slides/sri-20feb05/deduction....):
(* OCaml, from the linked slides; formula and term are the prover's
   first-order syntax types, defined elsewhere. *)
module type Proofsystem = sig
  type thm
  val axiom_addimp : formula -> formula -> thm
  val axiom_distribimp : formula -> formula -> formula -> thm
  val axiom_doubleneg : formula -> thm
  val axiom_allimp : string -> formula -> formula -> thm
  val axiom_impall : string -> formula -> thm
  val axiom_existseq : string -> term -> thm
  val axiom_eqrefl : term -> thm
  val axiom_funcong : string -> term list -> term list -> thm
  val axiom_predcong : string -> term list -> term list -> thm
  val axiom_iffimp1 : formula -> formula -> thm
  val axiom_iffimp2 : formula -> formula -> thm
  val axiom_impiff : formula -> formula -> thm
  val axiom_true : thm
  val axiom_not : formula -> thm
  val axiom_or : formula -> formula -> thm
  val axiom_and : formula -> formula -> thm
  val axiom_exists : string -> formula -> thm
  val modusponens : thm -> thm -> thm
  val gen : string -> thm -> thm
  val concl : thm -> formula
end;;
With this type signature, the ML type system guarantees that only valid theorems can be constructed. Note that this is not Curry-Howard. Of course, you can punch holes in this by allowing unproved assumptions via mk_thm or the like. But the original idea was to implement all other tactics using only such a correct-by-construction structure.
Another issue, comparing the MLton FFI and the OCaml FFI, is that the FFI in MLton is somewhat slower. When I converted an OCaml program to SML, I discovered that a small portion written in C ran much faster when I rewrote it in SML. In OCaml, writing portions in C is almost always a performance win, just because the OCaml compiler doesn't optimize very well. In MLton there's a bit of overhead just for the FFI (possibly because the code generator allocates registers very differently from how it's done in C). That overhead goes away if you use the C code generator instead, but usually the native code generator performs better anyway.
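For context, an MLton FFI call looks like this (a minimal sketch; _import is MLton-specific and needs FFI enabled at compile time):

(* Bind C's sqrt from libm; every call crosses the FFI boundary,
   which is where the per-call overhead lives. *)
val csqrt = _import "sqrt" : real -> real;

(* In a tight loop like this, the pure-SML Math.sqrt would avoid
   that overhead entirely. *)
fun sumRoots n =
    let fun go (i, acc) =
            if i > n then acc
            else go (i + 1, acc + csqrt (Real.fromInt i))
    in go (1, 0.0) end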
If I remember correctly, it has been proposed to rewrite the MLton floating point conversion in pure SML for precisely this reason.
Input and output of floating point involves a decent amount of bigint math to do correctly. It's not surprising to me that it is slow -- doing it quickly is still an active area of research.
Writing a bit about that is still on my todo list.
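In the meantime, a rough sketch of where the bigints come from, assuming IEEE doubles: every finite double is m * 2^e for integers m and e, and for negative e the exact decimal digits are m * 5^(-e) with the point shifted -e places.

(* Returns (digits, places) with x = digits / 10^places, exactly.
   The IntInf.pow calls are the bigint math in question. *)
fun exactDecimal (x : real) : IntInf.int * int =
    let
      val {man, exp} = Real.toManExp x   (* x = man * 2^exp, 0.5 <= |man| < 1 *)
      val m = IntInf.fromLarge
                (Real.toLargeInt IEEEReal.TO_NEAREST
                   (man * 9007199254740992.0))   (* man * 2^53, exact *)
      val e = exp - 53
    in
      if e >= 0
      then (IntInf.* (m, IntInf.pow (2, e)), 0)
      else (IntInf.* (m, IntInf.pow (5, ~e)), ~e)
    end

Running it on 0.1 spells out the familiar exact expansion 0.1000000000000000055511...5625 (up to a trailing zero, since this sketch doesn't normalise), which gives a feel for why doing this both quickly and accurately takes cleverness.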
I remember reading, years ago, a paper which provided an algorithm for converting an IEEE 754 value into a string using only a fixed number of machine-word-length scalars, but I certainly can't find it now.
The title is a play on "Printing floating-point numbers quickly and accurately" (1996), which in turn is a play on "How to print floating-point numbers accurately" (1990).