This is why we don't microbenchmark languages. Usually, if you dig a little deeper, you find that the difference between the languages mostly comes down to one of them doing one particular thing badly.
This, by the way, is MLton's worst-case scenario. MLton is designed to perform whole-program optimisation (unlike almost any other compiler I've seen); the gap is supposed to grow as the code gets larger (read: more realistic).
Sadly, due to its design tradeoffs (it's a whole-program batch compiler, with no interactive toplevel), it's totally unsuitable for the big ML projects (interactive theorem provers).
Can you elaborate on why being a compiler makes MLton unsuitable for theorem provers? Is it because they are usually written on top of the interpreter's REPL, and need its reflection capabilities?
It's because they use metaprogramming so extensively. As an example, HOL4 (I can only speak for that) accepts a script written by the user (as though in a REPL), compiles and runs it, and produces an SML theory file which represents their defined theory [0]. When the user starts their next theory, they are supposed to build on this computer-generated code.
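Roughly, a user-written script looks like this (a sketch with made-up names, not a real HOL4 theory):

(* fooScript.sml -- HOL4 compiles and runs this, then emits the
   computer-generated fooTheory.sml/.sig that later theories load. *)
open HolKernel boolLib bossLib;

val _ = new_theory "foo";

(* define a function; HOL4 records the definition in the theory *)
val double_def = Define `double n = n + n`;

val _ = export_theory ();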
You would have to write your own REPL, plus your own machinery for loading and writing theories, etc., and by that time you've lost the performance advantage (HOL, even after 20 years of progress and tweaks, still has only mostly-OK performance).
So it would at least require some rearchitecting, which is my point; it's not a drop-in replacement.
That makes sense. It's pretty cool to think that ML was designed as a metalanguage for theorem provers, and the innovations required for that turned out to be the next step (or next several steps) in the evolution of general-purpose languages.
MLton is unsuitable for LCF-style theorem provers, simply because it has no interactive toplevel.
ML was originally developed as a MetaLanguage for such theorem provers. The idea is to have an abstract type thm of theorems, whose operations (like modus ponens) only allow you to construct provable theorems. To prove a theorem interactively, you would start the ML toplevel and try to construct a proof of your proposition by programming in ML. The ML type system guarantees that you can only prove valid things. The toplevel is essential if you want to prove things interactively.
Modern theorem provers like Coq work differently. If Coq were written in SML rather than OCaml, then MLton should be good for Coq as well. I think people have experimented with non-interactive versions of HOL4, compiled using MLton (because it's faster than other SML compilers).
LCF-style provers do not use the Curry-Howard correspondence (ML's type system is not strong enough); having something of type thm is not enough to know that it's a theorem. The type thm is just a triple of a tag (to store any oracles etc.), a list of assumptions, and a conclusion. The main way you protect against fake theorems being constructed is (I think) to make the constructor private, and then have combinator functions that you've very carefully verified. You can certainly use mk_thm to make a theorem of whatever you want (it adds an oracle tag, but a single-line change in Thm.sml would drop that).
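To make that concrete, here's a toy sketch of the discipline (not HOL4's actual Thm.sml; the names and the formula type are made up):

(* thm is the triple described above: oracle tags, hypotheses, conclusion.
   Opaque ascription (:>) hides the constructor outside the structure. *)
datatype form = Atom of string | Imp of form * form

signature KERNEL = sig
  type thm
  val assume : form -> thm                (* {A} |- A *)
  val mp : thm -> thm -> thm              (* from A ==> B and A, infer B *)
  val mk_thm : form list -> form -> thm   (* escape hatch, oracle-tagged *)
  val dest : thm -> string list * form list * form
end

structure Kernel :> KERNEL = struct
  type thm = string list * form list * form
  fun assume a = ([], [a], a)
  fun mp (t1, h1, Imp (a, b)) (t2, h2, a') =
        if a = a' then (t1 @ t2, h1 @ h2, b)
        else raise Fail "mp: antecedent mismatch"
    | mp _ _ = raise Fail "mp: not an implication"
  fun mk_thm hyps c = (["oracle"], hyps, c)   (* the single line in question *)
  fun dest th = th
end

Outside Kernel, nothing can pattern-match on thm, so every thm was either built by the trusted rules or carries the oracle tag.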
MLton would require you to write your own metalanguage (but this is not necessarily onerous; most of Isabelle is written in thy files or Isar).
I had a think about it, and I believe the actual compelling advantage of Poly/ML is probably its saveable heaps; they really make everything a lot faster.
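For reference, the relevant API is roughly this (Poly/ML-specific; the file name is made up):

(* Build session: run all the expensive theory-building code once,
   then dump the heap to disk. *)
val () = PolyML.SaveState.saveState "hol.state";

(* Any later session: reload the saved heap instead of rebuilding. *)
val () = PolyML.SaveState.loadState "hol.state";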
The whole language is a bit janky though; there's a tonne of stuff that's totally unstandardised (production of executables, imports, etc.), which means that every compiler has fairly dramatic extensions (and, hilariously, you can't tell the compilers apart by just attempting to use their extensions, because of the type system).
So HOL4 will try to determine whether you're running on mosml or polyml by calculating factorial 100: with mosml you overflow, while polyml seamlessly switches to a bignum.
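The trick is something like this (a sketch, not HOL4's actual configuration code):

fun fact n = if n <= 1 then 1 else n * fact (n - 1)

(* Moscow ML's default int is fixed-precision, so fact 100 raises
   Overflow; Poly/ML's default int is arbitrary-precision, so this
   just returns the 158-digit result. *)
val onPolyML = (fact 100; true) handle Overflow => false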
tl;dr Coq implements a metalanguage on top of ML (one with an undecidable type system). LCF-style provers use ML itself as their metalanguage. Using MLton would force you to write your own metalanguage, because it has no interactivity.
> The main way that you prevent against fake theorems being constructed is to (I think) make the constructor private, and then have combinator functions that you've very carefully verified.
That's what I mean. The LCF approach is to give you a signature with an abstract type thm and operations to construct its inhabitants. The operations are such that you can only construct valid theorems. Example (taken from http://www.cl.cam.ac.uk/~jrh13/slides/sri-20feb05/deduction....):
(* OCaml, from the linked slides; formula and term are the prover's
   first-order syntax types, defined elsewhere. *)
module type Proofsystem = sig
  type thm
  val axiom_addimp : formula -> formula -> thm
  val axiom_distribimp : formula -> formula -> formula -> thm
  val axiom_doubleneg : formula -> thm
  val axiom_allimp : string -> formula -> formula -> thm
  val axiom_impall : string -> formula -> thm
  val axiom_existseq : string -> term -> thm
  val axiom_eqrefl : term -> thm
  val axiom_funcong : string -> term list -> term list -> thm
  val axiom_predcong : string -> term list -> term list -> thm
  val axiom_iffimp1 : formula -> formula -> thm
  val axiom_iffimp2 : formula -> formula -> thm
  val axiom_impiff : formula -> formula -> thm
  val axiom_true : thm
  val axiom_not : formula -> thm
  val axiom_or : formula -> formula -> thm
  val axiom_and : formula -> formula -> thm
  val axiom_exists : string -> formula -> thm
  val modusponens : thm -> thm -> thm
  val gen : string -> thm -> thm
  val concl : thm -> formula
end;;
With this type signature, the ML type system guarantees that only valid theorems can be constructed. Note that this is not Curry-Howard. Of course, you can punch holes in this by allowing unproved assumptions via mk_thm or the like. But the original idea was to implement all other tactics using only such a correct-by-construction structure.
Another issue, comparing the MLton FFI and the OCaml FFI, is that the FFI in MLton is somewhat slower. When I converted an OCaml program to SML, I discovered that a small portion written in C ran much faster when I rewrote it in SML. In OCaml, writing portions in C is almost always a performance win, just because the OCaml compiler doesn't optimize very well. In MLton there's a bit of overhead just for the FFI (possibly because the code generator allocates registers very differently from how it's done in C). That overhead goes away if you use the C code generator instead, but usually the native code generator performs better anyway.
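For context, an MLton FFI call looks like this (a minimal sketch; _import is MLton-specific and needs FFI enabled at compile time):

(* Bind C's sqrt from libm; every call crosses the FFI boundary,
   which is where the per-call overhead lives. *)
val csqrt = _import "sqrt" : real -> real;

(* In a tight loop like this, the pure-SML Math.sqrt would avoid
   that overhead entirely. *)
fun sumRoots n =
    let fun go (i, acc) =
            if i > n then acc
            else go (i + 1, acc + csqrt (Real.fromInt i))
    in go (1, 0.0) end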
If I remember correctly, it has been proposed to rewrite the MLton floating point conversion in pure SML for precisely this reason.
Input and output of floating point involves a decent amount of bigint math to do correctly. It's not surprising to me that it is slow -- doing it quickly is still an active area of research.
Writing a bit about that is still on my todo list.
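In the meantime, a rough sketch of where the bigints come from, assuming IEEE doubles: every finite double is m * 2^e for integers m and e, and for negative e the exact decimal digits are m * 5^(-e) with the point shifted -e places.

(* Returns (digits, places) with x = digits / 10^places, exactly.
   The IntInf.pow calls are the bigint math in question. *)
fun exactDecimal (x : real) : IntInf.int * int =
    let
      val {man, exp} = Real.toManExp x   (* x = man * 2^exp, 0.5 <= |man| < 1 *)
      val m = IntInf.fromLarge
                (Real.toLargeInt IEEEReal.TO_NEAREST
                   (man * 9007199254740992.0))   (* man * 2^53, exact *)
      val e = exp - 53
    in
      if e >= 0
      then (IntInf.* (m, IntInf.pow (2, e)), 0)
      else (IntInf.* (m, IntInf.pow (5, ~e)), ~e)
    end

Running it on 0.1 spells out the familiar exact expansion 0.1000000000000000055511...5625 (up to a trailing zero, since this sketch doesn't normalise), which gives a feel for why doing this both quickly and accurately takes cleverness.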
I remember reading, years ago, a paper which provided an algorithm for converting an IEEE 754 value into a string using only a fixed number of machine-word-length scalars, but I certainly can't find it now.
The title is a play on "Printing floating-point numbers quickly and accurately" (1996), which in turn is a play on "How to print floating-point numbers accurately" (1990).