The different forms of variance are only needed because you have two independent axes of polymorphism going on:
- Subtype polymorphism: A function that is defined to operate on values of type `Base` can also operate on values of type `Derived`, if `Derived` is a subtype of `Base`. Within the function, only the operations exposed by type `Base` may be used (unless reflection or run-time type inspection are used).
- Parametric polymorphism: A generic function that is parameterized over a type `T` (e.g. `void Foo<T>()`) can operate on values of type `A`, `B`, etc., if those types satisfy the `where` clause restrictions (if any) on `T`. Within the function, only operations that are valid for the entire bounds of the type parameter can be used. (So if there are no bounds on `T`, then only operations that are guaranteed to be valid for all objects may be used.)
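The two axes can be sketched side by side in C#. This is a minimal illustration, assuming hypothetical classes `Base` and `Derived` with a `Name` property (not taken from the original):

```csharp
using System;

class Base { public virtual string Name => "Base"; }
class Derived : Base { public override string Name => "Derived"; }

static class Demo
{
    // Subtype polymorphism: declared against Base, works on any subtype.
    static void Describe(Base b) => Console.WriteLine(b.Name);

    // Parametric polymorphism: one definition, usable at many types.
    // The where clause bounds T, so the Name member is available inside.
    static void DescribeGeneric<T>(T item) where T : Base
        => Console.WriteLine(item.Name);

    static void Main()
    {
        Describe(new Derived());              // subtype axis
        DescribeGeneric(new Derived());       // parametric axis (T inferred as Derived)
        DescribeGeneric<Base>(new Derived()); // both axes at once
    }
}
```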
Each axis makes sense on its own. But you run into problems when you try to act polymorphically in both axes at once.
For example, given classes `Base`, `Derived`, and `SomethingElse`:
- A method `void Foo(Base b)` can accept a value of type `Derived` as a parameter, by subtype polymorphism.
- The method `void Add(T item)` on the type `List<T>` can accept values of type `Base` or `SomethingElse`, etc., provided the type has been instantiated as `List<Base>`, `List<SomethingElse>`, etc.
- The method `void Add(T item)` on the type `List<T>`, when the type has been instantiated as `List<Base>`, is treated as if it were declared as `void Add(Base item)`, and thus can accept a value of type `Base`.
- Additionally, the method `void Add(T item)` on the type `List<T>`, when the type has been instantiated as `List<Base>`, can accept a value of type `Derived`, due to the combination of both parametric polymorphism (the type parameter being instantiated as `Base`) and subtype polymorphism (the is-a relationship between `Derived` and `Base`).
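Each of the bullet points above compiles as written; a minimal sketch, assuming empty placeholder classes:

```csharp
using System.Collections.Generic;

class Base { }
class Derived : Base { }
class SomethingElse { }

static class Demo
{
    static void Foo(Base b) { }

    static void Main()
    {
        Foo(new Derived());              // subtype polymorphism

        var bases = new List<Base>();
        var others = new List<SomethingElse>();
        bases.Add(new Base());           // Add(T) instantiated as Add(Base)
        others.Add(new SomethingElse()); // Add(T) instantiated as Add(SomethingElse)

        bases.Add(new Derived());        // both axes: Derived is-a Base
    }
}
```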
However, given a function `void Foo(List<Base> list)` and a value of type `List<Derived>`, the method `void Add(T item)` on the type `List<Derived>` is treated as if it were declared as `void Add(Derived item)`, which is incompatible with the method `void Add(T item)` on the type `List<Base>` (which is treated as if it were declared as `void Add(Base item)`).
This is because the value which is required for the formal parameter `list` of `void Foo(List<Base> list)` must be an object that has an `Add()` method that is able to accept values of type `Base`, but the value which is being supplied (a value of type `List<Derived>`) does not accept values of type `Base` in its `Add()` method!
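The mismatch can be made concrete. In this sketch, the problematic call is left commented out because the compiler rejects it; the body of `Foo` shows why it must:

```csharp
using System.Collections.Generic;

class Base { }
class Derived : Base { }

static class Demo
{
    static void Foo(List<Base> list)
    {
        // Foo is entitled to do this with any list it is given:
        list.Add(new Base());
    }

    static void Main()
    {
        var derivedList = new List<Derived>();
        // Foo(derivedList); // compile-time error: if this were allowed,
        //                   // Foo could Add a plain Base into a List<Derived>.
    }
}
```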
Now, if you are only dealing with one axis of polymorphism at a time, then this isn't a problem:
- If all you have is subtype polymorphism, then we're just like the type system of C# 1.0: there are no generics, and thus no way for problems of covariance and contravariance to come up. (Actually, the problems can come up; they just appear at run time, not compile time. For example, see `ArrayTypeMismatchException`.)
- Conversely, if all you have is parametric polymorphism (a.k.a. generics), then it is never allowed to pass a value of type `Derived` to a function that expects a value of type `Base`, since functions cannot act polymorphically with respect to a subtype relationship.
(While this restriction seems hopelessly restrictive to someone coming from an OO background, it's actually quite common in some functional programming languages. It does change how you go about structuring your code, however.)
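The run-time failure mentioned above for C# 1.0-style code can be reproduced directly, because C# arrays are covariant:

```csharp
using System;

class Base { }
class Derived : Base { }

static class Demo
{
    static void Main()
    {
        Derived[] derivedArray = new Derived[1];
        Base[] baseArray = derivedArray;   // allowed: arrays are covariant

        try
        {
            baseArray[0] = new Base();     // statically fine, dynamically illegal
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("caught ArrayTypeMismatchException");
        }
    }
}
```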
Now, the problems of covariance and contravariance can occur just as easily in a dynamically typed language as in a statically typed one. The only difference is that dynamic language code is necessarily more resilient to unexpected types.
For example, in a Python program, you might have a class that represents a list of strings, with an `add(item)` method that adds an item to the list. However, the type system does not enforce the invariant of the class -- any caller can pass a value of any type they like to the `add()` method. Therefore, the code must either defend against illegal values somehow (e.g. by doing a check in the `add()` method to ensure that a non-string doesn't get added), or cope gracefully with finding a non-string in its internal storage.
These kinds of problems are exactly what covariance and contravariance are about: cases where code can be "surprised" by the type of values. In a statically type-checked language, the type checker is responsible for proving that code can never be surprised by the type of a value: if you have a local variable that was declared with type `string`, it will never contain a value of type `double`, no matter what. If, due to the way you're using different forms of polymorphism together, the type checker can no longer prove that, it will reject your program with a type error.
Basically, this is the trade-off of static versus dynamic type checking: a static type checker can prove that certain kinds of undesirable behavior are not present in your program (in this case, "surprising" types of values), but only at the cost of rejecting some interesting programs. That is, a type checker is by necessity conservative.
In order to make type checkers more flexible (and thus allow more interesting programs to be checked by them), modern languages like C# have introduced new language constructs (like the `out` and `in` keywords to identify covariant and contravariant type parameters, respectively), which allow more fine-grained control over the type checker's proof. Generally, this is considered a good thing: it allows us to have the advantage of a type checker proving useful things about our program, while still allowing the maximum number of interesting programs through.
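A short sketch of how the declaration-site variance annotations behave, using `IEnumerable<out T>` and `Action<in T>` from the standard library:

```csharp
using System;
using System.Collections.Generic;

class Base { }
class Derived : Base { }

static class Demo
{
    static void Main()
    {
        // IEnumerable<out T> is covariant: a sequence that produces
        // Derived values can safely stand in for one producing Base values.
        IEnumerable<Derived> derivedSeq = new List<Derived>();
        IEnumerable<Base> baseSeq = derivedSeq;          // OK since C# 4.0

        // Action<in T> is contravariant: a consumer of Base values
        // can safely stand in for a consumer of Derived values.
        Action<Base> baseConsumer = b => Console.WriteLine(b);
        Action<Derived> derivedConsumer = baseConsumer;  // OK

        // List<T> remains invariant, because it both produces and consumes T:
        // List<Base> fromDerived = new List<Derived>(); // compile-time error
    }
}
```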