Type-safe contains

martijnhoekstra · September 23, 2018, 7:53am

Ha, you gravely underestimate my ability to finger-point. I point at the signature of contains, which is not sensible and can’t be implemented unless you have universal equals.

With great power (like having universal equals) comes great responsibility (like not writing the signature of contains as it exists now)

edmundnoble · September 23, 2018, 2:12pm

Fair.

LukaJCB · September 24, 2018, 7:27am

I can confirm this. There’s multiple data types in Cats that are covariant and have a contains method, but do not suffer from this issue, because they use Eq.

I agree that the current situation isn’t ideal and could benefit from a temporary fix until Dotty arrives

smarter · September 24, 2018, 7:48pm

I don’t think a type inference trick is a good solution here. For starter, you’d have to make sure it works with Dotty and its improved inference capabilities (which might improve even more in the future). I’ve been thinking about this problem for some time and came up with a different idea. Fundamentally, the issue is that the type signature of a method serves two purposes:

Limit which arguments can be passed to the method
Limit how the passed arguments can be used inside the body of the method

We’d like to write:

def contains(elem: A): Boolean = ...

This is exactly what we want for 1. (I should only be able to check if a List of A contains elements of type A), but this is wrong for 2. (Because of covariance, I cannot assume that I’m actually being passed an A).
So instead we write:

def contains[B >: A](elem: B): Boolean = ...

Which satisfies 2. but not 1, and now we’re stuck.

I think the only way to get out of this conundrum is to add a way to decouple 1. from 2. For example, maybe we could write:

def contains(elem: +A): Boolean = ...

Where +A here would mean: when calling contains, the user must pass a value of type A, but in the body of contains, the type of elem is just an abstract type lower-bounded by A. This allows us to satisfy both 1. and 2.

eed3si9n · September 24, 2018, 8:01pm

smarter:

def contains(elem: +A): Boolean = ...
Where +A here would mean: when calling contains , the user must pass a value of type A , but in the body of contains , the type of elem is just an abstract type lower-bounded by A . This allows us to satisfy both 1. and 2.

I think this is a promising direction.

To be specific about this, I think we should allow A and all subtypes of A so:

List(Option(1)).contains(None) // compiles
List(Option(1)).contains(1) // doesn't compile
List(Some(1)).contains(None) // doesn't compile

soronpo · September 24, 2018, 8:06pm

I think variance notation is confusing enough in Scala to allow such a thing as : +A.

eed3si9n · September 24, 2018, 8:18pm

While I agree with you, the moment someone puts +A in their class they’ve already gone into the confusing rabbit hole. If one were to write def contains(elem: A): Boolean, the error message they would get is:

Error:(4, 11) covariant type A occurs in contravariant position in type A of value a

And as we’re seeing here, [A1 >: A] is just a cooler way of writing Any.

soronpo · September 24, 2018, 9:00pm

Yes, but when people begin with variance, some of their trial-error usages can involve putting +/- in different places (I need to find that @adriaanm quote), and I don’t think we should allow the parser to accept more places for +/-Type to exist.

eed3si9n · September 24, 2018, 10:48pm

Here’s Pierce’s definition of type system:

In this light, there are certain things that are incompatible with Liskov Substitution Principle, Scala’s current design, or the fewer-possible-output theory.

To me, the fact that 1 == 1 always returning true, or 1 + 1 always returning 2 is not a problem. The fact that List("abc").contains("abc") always returning true is some sort of logic level concern. However, the fact that 1 == Option(1) compiles is unsound. Sort of like asking someone “how did you kill Fred?”

The implementation of contains is underpinned by this Liskovian dystopia that is universal equality:

  def contains[A1 >: A](elem: A1): Boolean = exists (_ == elem)

This make unsound List(1, 2, 3).contains("wat") possible.

To some degree we need to keep living in the design, but for the safety improvement we would need to break apart from the hegemony of Any.

A known workaround for this that’s been tried by a few including Dotty is introducing Eq typeclass (Multiversal Equality). Typeclasses will select the most specific instance of Eq[A], === operator will behave differently based on lhs and rhs’s type. That would not be kosher according to Barbara either.

jvican · September 25, 2018, 2:30pm

This is an interesting analysis, I had not thought about it this way I like the solution too. The only drawback I can see is that it would not interact very well with the prefix types SIP but, given how important this problem is, I would be totally fine with your solution.

Y’know, consider SIPping this

soronpo · September 25, 2018, 3:47pm

As the owner of the prefix types SIP, I don’t mind killing it for this. I would still prefer some other way to annotate this, from the reasons I stated above.

szeiger · September 26, 2018, 11:59am

You can get a hack version of that already with def contains(elem: A @unsafeVariance) but I don’t see how it would solve the problem. You’re still relying on the intricacies of type inference. After all, the reason why def contains(elem: A) is not allowed is that you cannot restrict the type of elem in that way (given a covariant A).

Assuming xs: Seq[Int] and s: String the current definition of contains gives you two choices of solving xs.contains(s):

(xs: Seq[Int]).contains[Any](s: String)
(xs: Seq[Any]).contains[Any](s: Any)

The second one still works if you remove the type parameter from contains. You would need a type inference rule like “never widen the LHS type” to prevent his. Is such a rule reasonable?

smarter · September 26, 2018, 1:43pm

Type inference in Scala will fill in missing type arguments, but it will never make up a type ascription to make something typecheck.

lihaoyi · September 26, 2018, 9:15pm

Such a rule would be a gross violation of the Liskov Substitution Principle, which basically all of subtyping is built upon. If a supertype A is allowed in the LHS position, then the subtype B <: A must also be allowed in the LHS position.

If not, you end up with an ad hoc set of rules full of holes and inconsistencies, unless you have some other sound theory you can use as a foundation to replace LSP.

Scala already lets you violate LSP with implicit magic, which when used heavily can accurately be described as ad-hoc rules full of holes and inconsistencies

szeiger · September 27, 2018, 10:43am

No, you have to distinguish between rules of the type system and type inference. If a certain type is allowed for a term that doesn’t mean you have to infer it, especially in a language with subtyping where you often have many valid choices.