None of the example languages (C,C++,java) have serious type systems (in the sen...

millstone · on March 22, 2021

> These should be compilable to a single asm routine, as long as there is an homogeneous way to copy them

That's a massive asterisk. In practice, copying values to the heap is complicated. Integer values may be memcpyed. Pointer values may require read and/or write barriers, or perhaps reference counting. Compound types may require some combination. Maybe you have copy-constructors and you need arbitrary callouts! These implementation details can't be hand-waved away.

> The difference between this case and the polymorphic case is that here the underlying structure (eg "compare" or "insert") isn't done in a homogeneous way (single routine) for every type: the polymorphism is ad-hoc and not parametric. The vtable solution is a form of boxing and that impacts runtime

Kindly, I think you have some confused terminology. Parametric vs ad-hoc polymorphism is a difference in the type system, and occurs long before the optimizer kicks in. Parametric polymorphism is a single function implementation parametrized by types. Ad-hoc polymorphism is multiple implementations, and the compiler chooses one based on types (e.g. C++ overloading or template specialization).

If the optimizer chooses to emit a specialization, that does NOT make it ad-hoc polymorphism. That's just a detail of the optimizer. And it is a distinct optimization from inlining, for good reasons, like not wanting to be tied to the inlining cost model.

> for efficient compilation of languages with generics you need to have an efficient compilation of parametrically polymorphic functions (by type erasure)

C++ and Rust both have efficient generics without using type erasure. Boxed generics are not necessary at all, as long as you accept static linking.

choeger · on March 21, 2021

You are certainly right in that observation. But there is slightly more to it. The "generic" concept from Java and C++ is more akin to the functors of the ML family. In that view, the question is not how to efficiently compile functions that are polymorphic (SML/NJ does that quite well, I think). The real question is, what are your compilation units.

If you optin to the OCaml view, then Modules (and Functors) have to be compiled separately. Everything else follows. Java looks at this in a similar fashion, btw.

If you optin to the C++ view, than only monomorphic code has ever to be compiled. In that scenario a template has no meaning on its own, except as an abstraction over code to be generated when needed.

In principle, SML/NJ shows how both views can be reconciled. One can easily envision separate compilation up to the point of specialization which is then done in a separate phase just before linking. I think no one does it that way, though, except maybe Swift.

lapinot · on March 21, 2021

I'm not sure i get what you mean by "hav[ing] to be compiled separately", but as far as i understand, the flambda intermediate representation of the ocaml backend has no (notable) distinction between a functor and a function (or a module and a record). Everything is untyped so it works. That's also how the runtime values are (which need to exist since there are first-class modules now). I believe that thanks to this ocaml does (could?) inline modules and functors. I'm not so sure about cross-file or cross-package inlining tho.

lapinot · on March 21, 2021

See flambda, the recent evolution of the ocaml inliner.

https://caml.inria.fr/pub/docs/manual-ocaml/flambda.html

https://blog.janestreet.com/flambda/