# Proposal: Make Set covariant

I’ve just made a PR that attempts to make Set covariant: https://github.com/scala/scala/pull/6728, if this is something you care strongly about, feel free to weight in with your opinion.

A `Set` in Scala, is something that can

• be iterated: `iterator: Iterator[A]` (covariant);
• test membership: `contains(a: A): Boolean` (contravariant).

So a `Set` should be invariant.

Also, a set implies an equivalence relation on its type parameter `eq(a: A, b: A): Boolean` to test distinctness. A better design would be using a `cats.Eq[A]` typeclass (or `Hash[A]` for `HashSet`s; `Order[A]` for `TreeSet`s) to specify its equivalence relation.

1 Like

To be fair, the same things are true of `Seq`.

1 Like

Membership testing is different for `Set` vs `Seq`.

The signature of `Seq`'s `contains` method is

``````def contains[B >: A](b: B): Boolean
``````

This is actually

``````seq.exists(a => a == b)
``````

where the `==` relation is implicitly defined because that `B >: A` and because of the existence of the universal equality relation endowed on supertype `B`.

This is in stark contrast with `Set`, where the distinctness of the elements in the set is defined in the set itself. Note that in `SortedSet[A]`, the distinctness is defined within the `Ordering[A]` of that object instead of universal equality.

``````scala> SortedSet("a", "b", "c")(Ordering.by(_.length)).size
res3: Int = 1
``````

Consider the following:
`Set[A]` implicitly requires `Eq[A]`
`HashSet[A] <: Set[A]` implicitly requires `Hash[A] <: Eq[A]`
`SortedSet[A] <: Set[A]` implicitly requires `Order[A] <: Eq[A]`

None of these sets should be covariant.

3 Likes

I recall an example from very old (and partially obsolete) video of Paul Phillips. For now, invariant `Set`s do not allow to do meaningless checks like

``````def f(intSet: Set[Int]) = intSet contains "your mom" // doesn't compile :-)
``````

But `Seq`'s `contains` with `B >: A` allows this:

``````def g(intSeq: Seq[Int]) = intSeq contains "your mom" // compiles :-(
``````

By the way, @smarter, as far as I understand, your pull request also makes such meaningless expressions compile.

2 Likes

I believe it does. And that being the case, the type parameter B serves no purpose; the signature could just as well be `def contains(value: Any): Boolean`.

1 Like

Sure, but that’s not a reason for `Seq` and `Set` to have different variance. That’s a reason to make `Seq` invariant as well. Or to implement `contains` as an extension method, but that’s also far from ideal.

Passing anything into `contains` that is not of type `A` is almost certainly a bug that I want the compiler to capture before runtime.

`Seq[A]#contains` is fundamentally different from `Set[A]#contains`.

`Seq[A]#contains` could be understood as

``````def contains[B >: A](b: B)(implicit B: Eq[B]) = this.exists(a => B.eqv(a, b))
``````

where the equivalence relation `Eq[B]` is attached to the supertype `B`, not the internal element type `A`.

However, since `Set[A]` inherently contains the capability of deciding whether two elements are equivalent (universal equivalence, or by `Ordering[A]` as in `SortedSet[A]`), the `contains` method is just

``````def contains(a: A) // using internal Eq[A] to check distinctness
``````

where the `Eq[A]` object could be considered as within the set (compare `ordering: Ordering[A]` in `SortedSet[A]`).

1 Like

But this implies non-covariancy, doesn’t it?

Maybe, we are having wrong abstractions? Maybe, it’s better to have sets for enumerating (iterating, if you like) and sets for existence checks to be distinguished types with appropriate variancy each?

This issue is much deeper and harder.

For instance, how do you compare two `Set`s with different equality definition?

Imagine, you compare `Set`s like sets of objects (considering, you have fail-prone universal equality, maybe). If so, it would imply that `s1.size != s2.size` would imply `s1 != s2`.

Consider now two `Set`s `s1 == s2`. It seems natural that for every normally-typed `a` `s1 + a == s2 + a`. But it is not depending on whether internally stored equality (or ordering) of these two sets (at runtime!) differ or not — at least, because sizes after addition can differ depending on the equality or ordering.

All that goes to a thought that again we have wrong (not precise enough) abstraction. In particular, that `Set` must have to have somewhat `Eq` at its type (and `SortedSet` must have `Ordering` at its type too), i.e. something like `Set[A, E <: Eq[A]]` and `SortedSet[A, E <: Ordering[A]]` or so.

2 Likes

I completely agree. I’m arguing for invariance.

`Set[A]` should be dependent on some `Eq[-A]` instance. Whether to encode this as a type or as a runtime value is dependent on how precise we want the types to be.

We should explore making `contains` an extension method for both sets and sequences. Covariance is nice, but detecting non-sensical `contains` expressions is also nice.

3 Likes

I object making `contains` an extension method for `Set`. It is the essential operation of sets.

Making `Seq#contains` an extension method is good, as it is not part of the core definition of a sequence.

If we do `Map[K, +V]` (invariance on `K`) then I think it naturally entails that `Set[K]` is invariant on `K`. Unless we want to do `Map[+K, +V]`, not `<: (K => V)`?

How would that work? I remember trying to define a `contains` extension method for a covariant set-like data type. I tried all kinds of tricks (such as various implicit evidence schemes) to prevent it from upcasting the type argument on application, but did not find any way to do it – which is actually a testament to the completeness of the compiler’s type inference.

EDIT: as pointed out by @Jasper-M it turns out that this works and is probably a good solution to the problem.

Maybe I’m misunderstanding you, but what doesn’t work about:

``````implicit class HasSyntax[A](private val self: Seq[A]) extends AnyVal {
def has(a: A): Boolean = self.contains(a )
}
``````
``````scala> List(1,2,3).contains("a")
res9: Boolean = false

scala> List(1,2,3).has("a")
^
error: type mismatch;
found   : String("a")
required: Int
``````
2 Likes

I can image that @LPTK does not like that `self` must have a suitable contains method that can be called by the extension ops:

``````trait Container[+A] {
private[Container] def permissiveContains[B >: A](b: B): Boolean
}

object Container {

implicit class ContainerOps[A](val container: Container[A]) {
def contains(a: A): Boolean = container.permissiveContains(a)
}

}

``````

Hmm… strange. I must have misremembered my old experiments, then. Please disregard my objection 1 Like

I checked. The no-covariance trick works in Scala 2 but not in Dotty, so probably @LPTK did the experiments in Dotty. Dotty’s type inference is more global, so that it can handle the case. This is good in principle, but it does deprive us of the trick in question.

1 Like

There is a way that seems to work in both. I don’t like it, but still.

``````implicit class HasSyntax[A](val s: Seq[A]) extends AnyVal {
def has[B](a: B)(implicit ev: B <:< A): Boolean = s contains a
}
``````

So, we’ve almost repeated that `Seq` is covariant by using `<:<` here.

Then those work well:

``````def u[A](s: Seq[A], a: A) = s has a // compiles :-)
def v(s: Seq[Int]) = s has 5 // compiles :-)
``````

And even artificially introduced method-like covariancy works:

``````trait A
trait B extends A

def v(s: Seq[A], b: B) = s has b // compiles :-)
``````

And this does not compile both in Scala 2 and Dotty:

``````def w(s: Seq[Int]) = s has "your mom" // does not compile :-)
``````

But all this stuff that can be done only through extension methods (which seems to be not intended for this naturally) looks really weird. Isn’t there any way to set type inference bounds during declaration of methods/arguments? It looks like those cheats we are doing through extension methods should be done when declaring a class (but maybe I have the wrong feeling).

1 Like

I believe Kotlin has an escape hatch that allows the programmer to say “No really, allow this use of the type paramter in a variance incorrect location.” I’m not really a fan of this approach (basically break the type system to allow something that is incorrect, but “useful”). But if we are able to make it happen with extension methods anyway then maybe it’s a reasonable thing to do.

(PS: I was unable to find documentation of the Kotlin escape hatch, but I specifically remember hearing about it in a talk.)