Type inference really doesn't require ML (and is probably better done without it in order to be robust).
I think the Kolmogorov complexity of serious production codebases is very high -- in order to convey what the codebase does, you have to transmit roughly as many bits as the codebase itself. In most companies, you already have a way to get a "compact, human-readable description" of what the code does -- it's called asking your coworker, or reading your internal documentation. Ramp up is still hard.
This holds only if you have sufficient context: feeding what is effectively an untyped function into e.g. mypy will not produce anything meaningful. Even languages with rich type systems like Haskell require some types - even if only implicit ones - to infer everything else.
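To make the point concrete, here's a minimal Python sketch (the function names are made up for illustration): with zero annotations there is no local type information for a checker like mypy to propagate, whereas a single annotation gives it an anchor to infer the rest from.

```python
from typing import get_type_hints

# An effectively untyped function: a checker like mypy has nothing
# local to work with, so the parameters default to Any.
def process(data, key):
    return data[key]

# No annotations means no type information to propagate:
print(get_type_hints(process))  # {}

# With annotations as an anchor, a checker can infer the rest
# (mypy would infer the return type of the subscript as int):
def process_typed(data: dict[str, int], key: str):
    return data[key]

print(get_type_hints(process_typed))  # {'data': dict[str, int], 'key': str}
```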
In contrast, copilot has access to significantly more information when emitting type-annotated code, including context about what the project is like, how other functions are defined, plus all the prior knowledge accumulated during training.
> I think the Kolmogorov complexity of serious production codebases is very high
Do serious codebases exceed gigabytes in raw storage?
> Even languages with rich type systems like Haskell do require some types - even implicitly - to infer everything else.
If the Haskell compiler is spitting out "ambiguous instance of +" or whatever that error message is, it's a sign that the author doesn't understand what they're asking for. Taking a language whose value proposition is "I refuse to compile your code if there is the slightest indication that it's not perfect" and slapping a fuzzy ML model on top of it to suppress a class of errors is not a good idea.
Type systems exist as a way of ensuring that the programmer's mental model matches the code. Offloading type annotations to something else removes that safety from you.
> no priors
Sure? I don't see how that changes anything. My point is just that we already have very high quality summarizers of code, but software development is still hard. Making a model that attempts to approximate the summarizers of code that we already have isn't going to help much, no matter how good it gets.
> and slapping a fuzzy ML model on top of it to suppress a class of errors is not a good idea.
I am not saying that that is a good idea; I am pointing out that there are limits on what one can infer without broader context.
> Offloading type annotations to something else removes that safety from you.
Counterargument: the annotations can be used as a second sanity check. If the inferred types match your mental model - and the type-inference model is good - then you know your mental model is correct and you didn't miss something.
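That sanity check can even be mechanized. A minimal sketch of the idea (the `total` function and the "inferred" annotations are hypothetical stand-ins for whatever a tool would emit): compare the annotations on the code against the signature you believe it has.

```python
from typing import get_type_hints

# Suppose an inference tool (hypothetical) emitted these annotations:
def total(prices: list[float], tax: float) -> float:
    return sum(prices) * (1 + tax)

# Your mental model of the signature, written down independently:
expected = {"prices": list[float], "tax": float, "return": float}

# If these agree, the inferred types confirm your mental model;
# a mismatch is a prompt to look closer at the code (or the tool).
assert get_type_hints(total) == expected
```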