Covariance and contravariance are just monotonicity and anti-monotonicity, applied to types ordered by subtyping.
That is: if we have a function on types, say, the function `f` defined by:
f(x) = Int -> x
Then we say `f` is covariant because it is monotone: it preserves the subtype ordering on its argument. That is:
if x <: y, then f(x) <: f(y)
Similarly, if we consider `f` defined by:
f(x) = x -> Int
then this `f` is contravariant because it is anti-monotone: it reverses the subtype ordering on its argument. That is:
if x <: y, then f(y) <: f(x)
People tend to find covariance more intuitive than contravariance; unless the issue is pointed out, they tend to assume everything is covariant. They see a type, say:
dog -> dog
and they assume "oh, every dog is an animal, so I can put 'animal' in place of 'dog' and it'll be more general (i.e. a supertype)". This is false, as the article points out.
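To make that concrete, here is a minimal Scala sketch (Scala declares Function1 as contravariant in its argument and covariant in its result, so the compiler enforces exactly this rule; Animal and Dog are stand-in types):

  trait Animal
  class Dog extends Animal

  // Animal => Dog demands less of its input and promises more of its
  // output, so it can stand in wherever a Dog => Dog is expected:
  val general: Animal => Dog = (_: Animal) => new Dog
  val asDogToDog: Dog => Dog = general       // compiles

  // The naive "replace Dog with Animal everywhere" reading fails:
  // val bad: Dog => Dog = (a: Animal) => a  // rejected: the result may not be a Dog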
> Covariance and contravariance are just monotonicity and anti-monotonicity
I am confused. It seems that monotonicity is a property of the function itself, that is, how it is implemented and what (type) it returns depending on what (type) it gets. In other words, we vary input value (real) type and observe what happens with the output value (real) type. Only one function is involved - monotonicity is its inherent property.
But covariance/contravariance is about '<:' relation between function signatures (the contract, not how it is implemented). Once this relation has been somehow defined, we can check, for example, if one function can be safely applied instead of another function. In other words, we vary signatures by changing input/output declared types and then decide about the relation among them.
Why do you say that monotonicity is similar or equivalent to covariance? They seem to be different properties.
Monotonicity is a property of functions that is relative to some ordering relation (e.g., f(x) = x + 42, as a function on natural numbers, is monotone with respect to the natural ordering). In this case, that ordering relation is <:, and covariant functions are precisely the functions that are monotone with respect to <:, just as contravariant functions are precisely the functions that are antitone with respect to <:.
> People tend to find covariance more intuitive than contravariance; unless the issue is pointed out, they tend to assume everything is covariant.
I can easily understand where covariance is useful and meaningful while contravariance is somehow counterintuitive for me.
Therefore I doubt whether it is really needed (except for being a nice formal counterpart of covariance). In other words, would it be more natural to simply use covariance as the default rule, with contravariance either not available at all or used only as a kind of exception (where you need to know what you are doing and understand the consequences)?
The most common cases are that anything that "receives" or "absorbs" is contravariant, while anything that "emits" or "produces" is covariant. If you're working with streams, Source is covariant and Sink is contravariant. A read-only collection is covariant, a write-only collection (sounds stupid I know, but it can be a good way to model e.g. an audit log) is contravariant. A function A => B is covariant in B but contravariant in A.
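A minimal Scala sketch of that convention (Source and Sink here are hypothetical traits, not library types):

  trait Source[+A] { def next(): A }          // produces A: covariant
  trait Sink[-A]   { def put(a: A): Unit }    // consumes A: contravariant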
Contravariance is probably the least common, sure, but covariance should probably be opt-in too. A lot of languages treat things as covariant-by-default even for mutable types that are actually invariant, and so suffer from some variant of this unsoundness bug:
val dogs: List[Dog] = List[Dog]()
val animals: List[Animal] = dogs     // allowed if List is (unsoundly) covariant
val cat: Cat = Cat()
animals.append(cat)                  // type-checks: a Cat is an Animal
val dog: Dog = dogs(0)               // dog is now a Cat
If you ever want function types you will always need both co- and contravariance as pointed out in OP's articles.
Using covariance as a default is also not sound. Many things are "invariant", meaning neither co- nor contravariant. Invariance is already a sensible default. (It is, however, possible to automatically derive either property in some cases.)
Both are just as common. The most trivial cases are getters (covariant) and setters (contravariant).
A getter which returns a dog is obviously a subtype of a getter which returns an arbitrary animal, since the returned dog can do everything that an animal can. Setters are different, however: a setter which consumes an animal can also consume dogs, hence it is a subtype of a setter which consumes dogs. The reverse isn't true, since a setter which consumes only dogs can't take arbitrary animals, which means that you can't pass it off as an animal setter.
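For instance, a Scala sketch under the same reading (Getter and Setter are made-up traits):

  trait Animal
  class Dog extends Animal

  trait Getter[+A] { def get: A }
  trait Setter[-A] { def set(a: A): Unit }

  val dogGetter: Getter[Dog] = new Getter[Dog] { def get = new Dog }
  val asAnimalGetter: Getter[Animal] = dogGetter    // OK: getters are covariant

  val animalSetter: Setter[Animal] = new Setter[Animal] { def set(a: Animal) = () }
  val asDogSetter: Setter[Dog] = animalSetter       // OK: setters are contravariant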
So we could (by default) assume that all getters (returns) are covariant, and all setters (arguments) are contravariant. If we forget about invariance for a moment, this means there is essentially only one default pattern, and hence no need to explicitly distinguish and annotate covariance and contravariance - everybody simply knows the default rule. (I am simply reasoning after reading the article - I do not know if it can be done or makes sense.)
It can mostly be done. The OCaml programming language does exactly this - it infers co/contravariance. However, when describing a generic interface which has as part of it a function on types (usually called a "type constructor"), you need to say whether that type constructor is supposed to be covariant, contravariant, or invariant.
For example, if I were describing a sequence interface, I would write
module type SEQ =
sig
  (* the "+" means that seq must be covariant *)
  type +'a seq
  val empty : 'a seq
  val singleton : 'a -> 'a seq
  val append : 'a seq -> 'a seq -> 'a seq
  val lookup : int -> 'a seq -> 'a option
  exception IndexOutOfBounds
end;;
It is not essential, and neither is covariance. For instance, C and C++ function types are invariant (but C++ virtual member functions are properly variant, and so is std::function). If a language does not support variance, it can always be emulated with wrapper functions. E.g., given types A :> B :> C:
fn foo: (B -> B) -> ()
fn bar: A -> C
foo(b -> bar(b))
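The same trick in Scala, as a sketch (names are arbitrary; Scala's function types do have built-in variance, but the hand-written wrapper works even in a language without it):

  class A
  class B extends A
  class C extends B

  def foo(f: B => B): Unit = ()
  val bar: A => C = (_: A) => new C

  // The wrapper has type B => B: its argument b is accepted by bar since
  // B <: A, and the C result is usable as a B since C <: B.
  val _ = foo((b: B) => bar(b))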
You are right, I probably did not express my idea clearly. What I meant is that there is only one meaningful pattern: returns are covariant and arguments are contravariant. If that is so (and the article seems to confirm it), then why do we need special annotations? The system could itself assume that any argument (by default) is contravariant and any return (by default) is covariant. Does that make sense?
> The system could itself assume that any argument (by default) is contravariant and any return (by default) is covariant.
I guess languages could infer variance for generic parameters on methods/functions, and perhaps should - it would align with how type systems generally work. But that's a small part of the use case - variance mainly matters when you have generic values. If we have a Frobnicator[A, B] it would be very difficult for the language to infer whether variance for frobnicator values should run like for A => B, like for B => A, or some other way.
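To illustrate: if all the compiler sees is the name, any of these (hypothetical) Scala declarations is plausible, and each dictates a different variance:

  trait Frobnicator1[-A, +B] { def frob(a: A): B }      // behaves like A => B
  trait Frobnicator2[+A, -B] { def defrob(b: B): A }    // behaves like B => A
  trait Frobnicator3[A, B]   { var last: (A, B) }       // invariant in both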
I would have Googled before making the strong claim that someone is making stuff up. "<:" is widely used notation for "is a subtype of". I'm familiar with its usage in Julia.
One of the ways I like to look at it is that I'm a foreman and one of my construction workers can't show up, and I need a substitute, I don't want someone that is even less useful than my worker. A really good substitute is someone that can do everything that my original worker could do, and maybe then some even if I don't take advantage of it. Meanwhile, I don't want his results to be worse than my original worker's, but I sure don't mind if it's better.
In other words, if my original worker only knows how to turn a Dog into a Dog, and that was good enough, the most useful substitute is someone who can take any Animal and turn it into any special kind of dog.
Or maybe I care about 2x4's, and my normal guy only knows how to turn Cherry into 2x4's and isn't trained on anything else, even though that suited my needs. The best sub is someone who can take lots of kinds of wood and turn it into different kinds of posts and planks; I'm just only taking advantage of his 2x4 skills.
Unfortunately a lot of programmers design subclasses by saying "hey look, I'm good at starting with a schnauzer" or "hey look, I'm good at starting with Brazilian cherry". These guys aren't helpful when they show up in your parameter list.
As a mathematician with no comp-sci type knowledge, my only understanding of inheritance is the "is a" rule. Using this, I realized that a subtype of the set of functions from Dog to Dog must be a set of functions such that each function could be treated as a function from Dog to Dog under an appropriate restriction. This would be the only way for such a set to satisfy what felt like the "is a" inheritance rule.
In other words, a set of functions from A to B where Dog is contained in A and B is contained in Dog would be a subtype of the set of functions from Dog to Dog. So Animals -> Greyhound works.
Can you figure out what the functor is? As another mathematician, if they're saying "contravariant" I expect to see something like a Galois functor. This kind of looks like a pullback or an evaluation functor, but I can't even see the categories it might be between. Is it the Hom functor?
See my other comment about co/contravariance as monotonicity/antimonotonicity. Since monotonicity is just functoriality when your categories are posetal, covariance is a special case of functoriality.
More generally, you can see
f(x) = Int -> x
g(x) = x -> Int
as, respectively, co- and contravariant endofunctors on the category whose objects are types and whose arrows are definable functions in your programming language. And they're both just special cases of the internal hom functor, which is contravariant in its first argument and covariant in its second.
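In Scala terms, the two functorial actions are just post- and pre-composition (a sketch; mapReader and contramapPredicate are made-up names):

  // f(X) = Int => X is covariant: lift g : X => Y by post-composition.
  def mapReader[X, Y](fx: Int => X)(g: X => Y): Int => Y = fx andThen g

  // g(X) = X => Int is contravariant: lift g : Y => X by pre-composition.
  def contramapPredicate[X, Y](fx: X => Int)(g: Y => X): Y => Int = g andThen fx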
Wow, that's a new word, "posetal". Does that mean "can be partially ordered"?
What is "Int"? Integers? Integral? Int... ernal? ... types?
What's with the notation, anyway? Why use "<:" instead of "<" like you would for any other transitive anti-symmetric relation?
"Anti-monotone" is weird language. I would say "monotone" in general and "increasing" or "decreasing" in particular.
What does any of this have to do with programming languages? I mean, types are purely abstract, right? Do they have to be part of a programming language? I thought types were merely sets ordered by inclusion.
If programming languages are not something that can be abstracted away, what's a programming language in this context? A set (class?) of types and functions on those types?
What is a definable function? Are some functions not definable? Is it something like "there exists a Turing machine such that..."?
I don't know why computer people's category theory always looks so foreign to me. Sometimes I feel like we're in completely different worlds with vaguely similar-looking language. Category theory for me was mostly about homological algebra, but I don't think computer people care very much about the snake lemma or chasing elements across kernel and cokernel diagrams.
> What is "Int"? Integers? Integral? Int... ernal? ... types?
Int is the/a type of integers (possibly modulo 2^32), but just being used as an example of some arbitrary fixed type. "f(x) = Int -> x" is an example of a functor on the category of types in the same way that "f(x) = 4 + x" is an example of a function on the integers.
> What's with the notation, anyway? Why use "<:" instead of "<" like you would for any other transitive anti-symmetric relation?
I suspect computer people didn't initially realise they were the same... sort? of thing. Failing to properly distinguish between values and types causes a lot of confusion which one goes through a lot of effort to train oneself out of; finding out that some things are polymorphic over the two (polykinded) becomes really confusing then.
> I mean, types are purely abstract, right? Do they have to be part of a programming language?
One of the foundational definitions that distinguishes a type from something that isn't a type is that types are associated to terms in a language. I'm not sure how useful that is when we're talking about types in an abstract setting though.
> I thought types were merely sets ordered by inclusion.
Types can be larger than sets, and more importantly their equivalence doesn't behave much like set equivalence in the setting where we normally use them. I suppose you could identify a type with the set of terms of that type and that might work (though I'd worry about circularity), but it doesn't feel like a natural way to work with types.
> If programming languages are not something that can be abstracted away, what's a programming language in this context? A set (class?) of types and functions on those types?
I think it's a term reduction system, so a set of terms (and terms are a concept that have a bit of structure, though I'm unable to find a sensible standalone definition) and a finite set of reduction rules.
> I don't know why computer people's category theory always looks so foreign to me. Sometimes I feel like we're in completely different worlds with vaguely similar-looking language.
Well, another way to think about it is that we are using the same language to talk about very different things. For example, in homological algebra you are always working in (at least) a pointed category. In computer science a lot of the things we are interested in are already reflected in the partial order induced by a category (i.e., subtyping). If your category has a zero object the latter is trivial and so we're pretty much never looking at pointed categories.
> In computer science a lot of the things we are interested in are already reflected in the partial order induced by a category (i.e., subtyping).
If by "the partial order induced by a category", you mean 'let A <= B iff Hom(A,B) is nonempty', then that's not subtyping. That's "does there exist a function from A to B". Just because I can write a function from Int to String doesn't mean Int is a subtype of String.
I tend to think of subtyping as an additional structure: a PL comes equipped with a particular notion of subtype. Say C is the category of types and functions. Then subtyping is a partial order on types equipped with an inclusion functor into C. That is, a map (coerce : (A <: B) -> Hom(A,B)) that chooses for any pair A,B of types such that A <: B a coercion function from A to B. Moreover, coercing from A to A must be the identity, and if A <: B <: C, then coercing from A to C is the same as coercing from A to B and then from B to C.
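A Scala sketch of that extra structure (Sub is a hypothetical witness type, nothing standard):

  // A witness that A <: B: a chosen coercion from A to B.
  trait Sub[A, B] { def coerce(a: A): B }

  // Coercing A to A must be the identity...
  def refl[A]: Sub[A, A] = new Sub[A, A] { def coerce(a: A) = a }

  // ...and coercions must compose transitively.
  def trans[A, B, C](ab: Sub[A, B], bc: Sub[B, C]): Sub[A, C] =
    new Sub[A, C] { def coerce(a: A) = bc.coerce(ab.coerce(a)) }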
A posetal category has at most one arrow in each homset – i.e. it’s a poset dressed up as a category.
A definable function is one that can be defined in the programming language in question. There's a lot more to definability than Turing machines, when you consider higher types: that is, types whose domain is a function type. John Longley has written on this extensively.
I know what you mean about not knowing where to start! I think “all of this” is roughly what theoretical computer scientists call “Theory B”, or at least a large chunk of it, and where to start depends on which specific things you find most interesting.
Benjamin Pierce’s _Types and Programming Languages_ is a relatively accessible introduction with an emphasis on types, but of course it’s just an overview of a pretty large research area.
> Wow, that's a new word, "posetal". Does that mean "can be partially ordered"?
Posetal means that Hom(A,B) has at most one element (the category is "thin"), and that any two isomorphic objects are equal (the category is "skeletal"; corresponds to antisymmetry). So the category can be seen as a poset, where (A <= B) iff Hom(A,B) is nonempty.
> What is "Int"? Integers? Integral? Int... ernal? ... types?
Integers. I was just picking an example type, as lmm says. I could have said: "Fix a type A, and consider f(B) = A -> B".
> What's with the notation, anyway? Why use "<:" instead of "<" like you would for any other transitive anti-symmetric relation?
Tradition. <= is also perfectly fine. Why do set theorists write \subseteq instead of \leq? \subseteq is also a reflexive, transitive, antisymmetric relation.
> "Anti-monotone" is weird language. I would say "monotone" in general and "increasing" or "decreasing" in particular.
Yeah, in analysis "monotonically increasing/decreasing" is more common. I find it more useful to explicitly draw out the fact that one is the reversal of the other; also "monotone" is more concise than "monotonically increasing". shrug
> What does any of this have to do with programming languages? I mean, types are purely abstract, right? Do they have to be part of a programming language? I thought types were merely sets ordered by inclusion.
Well, the OP was about types in programming languages. Types in PLs are not always best thought of as sets; for example, the equation
a = (a -> Bool) -> Bool
has solutions in some type theories! Interpreted in the naive way, this would be a set equal to the powerset of its powerset, which is impossible. This conundrum leads to the study of Scott domains, for example.
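For a taste of how such an equation can be inhabited in practice, here is a wrapped version in Scala (the names are arbitrary); the recursion goes through a constructor rather than through set membership:

  // Up to the wrapper, A satisfies a = (a -> Bool) -> Bool.
  case class A(run: (A => Boolean) => Boolean)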
> If programming languages are not something that can be abstracted away, what's a programming language in this context? A set (class?) of types and functions on those types?
I'm being deliberately vague on that front. How to define "programming language" is not something widely agreed upon. But similar patterns, such as co- and contra-variance, recur.
> What is a definable function? Are some functions not definable? Is it something like "there exists a Turing machine such that..."?
I just meant "any function you can write in the programming language you're considering". As robinhouston points out, not all functions will be definable, and "Turing-computability" is a bit vague when talking about functions that take other functions as arguments (since TMs operate on bitstrings, so how you encode functions becomes relevant). Constructive mathematics studies which things are still definable/provable if you insist on computability; "constructive" and "computable" usually mean the same thing.
> I don't know why computer people's category theory always looks so foreign to me. Sometimes I feel like we're in completely different worlds with vaguely similar-looking language. Category theory for me was mostly about homological algebra, but I don't think computer people care very much about the snake lemma or chasing elements across kernel and cokernel diagrams.
Yes. Category theory is a very general framework, and different bits of it become relevant depending on what you're studying. Although there are sometimes deep connections between unexpected regions, such as between computation and topology; I don't really understand that one myself.
I use "just" to mean "exactly"; I say "X is just Y" to mean "X is exactly Y; when you have Y, you have X and vice-versa". I guess it might be kind of off-putting, because "just" implies the connection is boring or obvious, which is not what I mean to say.
So I used to think covariance and contravariance were related to covectors and vectors from physics, but our terminology is in fact almost the opposite of the convention that is now accepted in math. (See comment on wiki[0].)
In this case it's the Category Theorists confusing things. Co- and contravariance of vectors with respect to their scale was introduced in the 1850s by Sylvester, Category Theory was formulated roughly 90 years later.
Why are covariance and contravariance important, and how do their type theory definitions differ from mathematical or statistical definitions of covariance and contravariance?
Is it just a distinction of which direction the type hierarchy flows, and the consequences that must have with regard to functions in order for logical consistency to be maintained?
You are getting the right idea there. It all has to do with what the typing rules look like in programming languages with subtyping.
In a type system with subtyping, having a type A be a subtype of a type B (A <: B) means that we can pass a value of type A to any place where a value of type B would be expected.
For basic types, the subtyping rules are pretty simple: just follow the class definition hierarchy. A Dog is an Animal, an Animal is an Object, and so on.
For parameterized types, like collections (List[X]) or functions (Function[A,B]), it is a bit more complicated. The only way to go is to define the subtyping rule for T[X] in terms of the subtyping rule for X, but which rule do we use?
- T[A] <: T[B] iff A <: B ? (covariance)
- T[A] <: T[B] iff A :> B ? (contravariance)
- T[A] <: T[B] iff A = B ? (invariance)
Well, it depends on the type! For example, function types (A->B) are contravariant with respect to the input type (the As) and covariant with respect to the output types (the Bs). For collections it depends on what operations are allowed. Read-only collections are covariant, write-only collections are contravariant and read-write collections are invariant.
This last part is where knowing about variance becomes important for programmers. When it comes to user-defined parameterized types, the variance rules are different so to find out if a given type is covariant, contravariant or invariant the only option is to read the type annotations and documentation. And you also need to know what variance is to understand those annotations when you come across them.
The reason anyone cares is because we want static and yet flexible typing. The better a type system understands covariance and contravariance, the better it can be at statically checking complex relationships.
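A sketch of why read-write access forces invariance (Cell is a made-up Scala example):

  class Cell[A](private var value: A) {
    def get: A = value                      // A in output (covariant) position
    def put(a: A): Unit = { value = a }     // A in input (contravariant) position
  }
  // Neither Cell[+A] nor Cell[-A] would compile: A occurs in both positions,
  // and permitting either annotation would allow the List[Dog]/Cat bug
  // shown earlier in the thread.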
This post presents a nice example, but only for the particular type system envisaged by the author.
At least, I believe that the point of conceiving of 'covariance' and 'contravariance' is that we may have or not have either, in input or return types.
The submission presents one incarnation, a common one I believe, but nevertheless I think if the goal's to understand variance, the concept must be distinguished from implementation.
> At least, I believe that the point of conceiving of 'covariance' and 'contravariance' is that we may have or not have either, in input or return types.
Covariance in the return type and contravariance in the argument type is the most general subtyping policy you can have for function types in a sound type system—it's not arbitrary. If instead you had covariance in the argument type, as some languages do, your type system is broken (as demonstrated in examples 1 and 2 in the article). If you had contravariance in the return type, this also breaks your type system (example 3). For some language designers, having a broken type system is acceptable. But it should not be presented as a legitimate option in an article about programming language theory.
It's also sound to have invariance in argument and return types. But this is an arbitrary restriction with no clear benefits, except possibly eliminating the need to explain co/contravariance to users. I do try to avoid being prescriptive in the article, however; that's why the question is phrased as "Which of the following types could be subtypes of Dog -> Dog?" rather than "Which of the following types are subtypes of Dog -> Dog?".
So we can have contravariance or invariance in the argument type, and covariance or invariance in the return type. But contravariance in the argument type and covariance in the return type is a strictly more general rule. And bivariance in either is certainly incorrect.
> The terms "covariant" and "contravariant" were introduced by James Joseph Sylvester in 1853 in the context of algebraic invariant theory, where, for instance, a system of simultaneous equations is contravariant in the variables.
Great article - I'm impressed that it covers every discovery I've painstakingly worked through over the years, all succinctly expressed on one page: Java arrays, immutable lists, and even the fact that Eiffel got it wrong (which I remember puzzling over with a colleague back in the 90s).
In mathematics we call each of them "invariance", if we do not want to sound too pedantic (or unless the distinction cannot be deduced easily from context)
I feel that the examples in math are easier to understand than in programming. For example:
- The integral of a function is invariant to additive changes of variable : \int f(a+x)dx = \int f(x)dx
- The mean of a distribution is contravariant to additive changes of variable : \int f(a+x)xdx = -a + \int f(x)xdx
- The mean of a distribution is covariant to shifts of the domain (same formula, because f(x-a) is a shift of size "a")
- The variance of a distribution is invariant to additive changes of variable
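As a check of the second identity, assume f is a probability density, so \int f(x)dx = 1. Substituting u = a + x (hence x = u - a, dx = du):
\int f(a+x)x dx = \int f(u)(u-a)du = \int f(u)u du - a \int f(u)du = -a + \int f(u)u du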
In math, we call it "variance" - as in invariance, contravariance, and covariance. They each refer to the character of a given functor between categories. A functor is covariant if a -> b maps to Fa -> Fb, contravariant if a -> b maps to Fb -> Fa, and invariant (otherwise known as exponential) if (a -> b) and (b -> a) map to Fa -> Fb.
Every example of a variant functor follows these laws. It helps to note that every higher-kinded type which admits a type parameter (e.g. List, Array, Option, etc.) can be seen as an instance of a Functor (in fact, in Haskell, they are instances of the Functor typeclass).
Variance is a statement about the functionality of containers given a pre-existing relationship between types they contain. In Math, keeping with the example set, the functor \pi_n : hTop* -> Grp is canonically covariant, while P: Set -> Set: s -> P(s), the powerset functor, is canonically contravariant.
In keeping with the Scala tradition, Option and List are functors, as you can check against the functor laws. For more info, see Scalaz or Cats.
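A minimal sketch of the corresponding typeclasses (these signatures follow the ones in Cats; Scalaz's are similar):

  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]              // a -> b lifts to Fa -> Fb
  }
  trait Contravariant[F[_]] {
    def contramap[A, B](fa: F[A])(f: B => A): F[B]        // b -> a lifts to Fa -> Fb
  }
  trait Invariant[F[_]] {
    def imap[A, B](fa: F[A])(f: A => B)(g: B => A): F[B]  // needs both directions
  }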
This is typical technical jargon conflation in furtherance of interview pedantry. Please spare me.
Looking at the words superficially, their definitions are easily discerned:
Covariance: changing together, with similarity.
Contravariance: changing in opposition to one another.
But, as part of strategic nerd signaling in interviews to parse what a candidate has been reading lately, you'll encounter middle managers that will cull contending close-call applicants based on trivia like this. Similar technical jargon that isn't what you think it might be, due to some seminal blog post, includes "composition" or "isomorphic" or, perhaps most obviously, the simple-but-loaded term "functional."
Try defining the word functional incorrectly during a technical interview and see what happens.
> This is typical technical jargon conflation in furtherance of interview pedantry.
That's an interesting point of view. I thought it was just an article explaining what covariance and contravariance are in the context of a type system. The author also went so far as to give some illustrative examples.
The author could have just copied the definition from the dictionary. It seems like that would have been sufficient for you but it's unlikely that it would have been helpful for the majority of people that work in type systems and aren't aware of the more general concepts they work with.
The author never mentioned anything about interview techniques.
I also get a bit of a chuckle when I see these terms used in language documentation, along with "monomorphism" and other "math-y" terms. To me it just looks a bit like the authors are trying a bit too hard to appear very intelligent: sort of like a person who uses obscure vocabulary or complex grammar when communicating.
Alternatively, they know the words that a mathematician will understand, and realise that those words are the fastest and simplest way to get a mathematician to understand what is meant. Or perhaps the author is just not very good at aiming their words towards their target audience (if that target audience is not mathematicians).
What fraction of users of programming languages are mathematicians, or even CS graduates (of any degree level) who (should) care about the mathematical properties underlying the principles of the language? The answer: close to zero. The reason I chuckle is because I do understand the terms and I do recognize how absurdly placed they often are.
Isn't the point of technical interviews to determine one's knowledge of useless trivia? That is to say, that is what is typically measured, and it's seemingly the rare workplace that even attempts to measure whether their hiring processes correlate with work performance. Hiring for developer positions is a cargo cult practice -- which is not necessarily a blanket condemnation, but as a non-traditional candidate it's certainly done me no favors. Generally, this is a hard problem with no simple solution: evaluating programmers requires almost as much domain knowledge as programming does.
I'm pretty sure that you're not entirely wrong about what you say re: nerd signaling. However, while I share your frustration about the current state of the programming interview, I personally don't see much point in trying to buck the trend. If you really can't see your way to learning the trivia, then probably it's best to focus on your other strengths and show them what you do know. If they really only value the appearance of wisdom then you probably don't want to be working there anyway.