Why no type inference for union types?

bilal-fazlani · January 13, 2021, 5:17pm

What stops scala from having type inference for union types?

Union types are excellent feature addition to scala. But I feel very limited as it does not provide type inference like other types.

For example

val x = if some_condition then true else 0

Is inferred as Any.

Where as

sealed trait Employee extends Person
case object A extends Employee
case object B extends Employee

sealed trait Person
case object C extends Person

val x = if some_condition then A else C

Is inferred correctly as Person

I see some other languages like typescript providing type inference for union types

ysthakur · January 13, 2021, 7:43pm

It’s likely to make sure you don’t get union types that are too precise. Having List(1, 1.2, "foo", List(1, 2)) inferred as List[Int | Double | String | List[Int]] instead of List[Any] would be annoying.

Relevant StackOverflow question.

smarter · January 13, 2021, 8:00pm

More than just being annoying, it can also break implicit search, for example: https://github.com/lampepfl/dotty/blob/master/tests/pos/i7829.scala#L24

nafg · January 13, 2021, 9:08pm

Wouldn’t that work if Inv was covariant? Is (A | B) <: Base? (I assume the comment should say A|B not X|Y.)

In any case I don’t think that argues it “breaks” implicit search, rather it affects it. The reverse is true after all: There could be implicits that would be found if the union were preserved and won’t be found if it’s not preserved.

Unless you mean relative to Scala 2, but aren’t we breaking type inference and implicit resolution a lot already?

Alternatively, couldn’t implicit search be smarter? That is, first try to type the getInv expression with T being the union type, then if implicit resolution fails, try to retype it with T as the LUB as a fallback. Does Scala 3 fix T once it assigns it based on the first parameter list (similar to foldLeft type inference issue)? If the second parameter list can still affect T then this wouldn’t need a lot of changes to type checking, it would be more about expanding implicit resolution, although I guess it’s more complicated because not every implicit search for a union type is valid resolving via its LUB. It’s valid here since it isn’t set in stone that we want specifically the union type; if implicit search only finds a result for the LUB then that should typecheck. Maybe it’s also valid when the union and the LUB are equivalent, such as a sealed trait hierarchy? I guess that probably has holes.

Maybe a better idea is the type could be resolved more lazily? That is, the expression if (true) new A else new B should produce neither A|B nor Base, but rather a special type that basically says “this could go either way, potentially A|B and potentially Base,” and then after processing the first parameter list of getInv T is “fixed” to this ‘quantum’ type. Next implicit search would be performed with this T, giving a union type resolution higher priority. The result would settle T to either a union type or the LUB, depending what it found. If nothing is found the error message should reflect that it tried both.

There would have to be a rule that says that if once typing succeeds a type is still in this “quantum” state, the type should be settled. At this point perhaps LUB wins, or perhaps it depends on the complexity of the union type.

smarter · January 13, 2021, 9:23pm

Yes.

“A lot” is relative, if we were to infer union type everywhere, we would probably break an order of magnitude more code than we already broke (consider that most typeclasses like Functor, Monad, … are invariant, and there’s no easy way to make them variant).

In many situation it would indeed help to be able to “try x, if it fails then try y”, but this is backtracking which leads to exponential blowup unless it’s extremely carefully managed and limited.

This is already how type inference works, we create type variables which get constrained and only instantiated when they need to (see my talk on the subject: https://www.youtube.com/watch?v=lMvOykNQ4zs). We still instantiate some type variables before doing an implicit search because if we don’t do it we run into the opposite problem: ambiguous implicit errors because now multiple implicits end up being valid, with none being more specific than the others.

nafg · January 14, 2021, 12:39am

Couldn’t you say that unions are more specific than their LUB?

smarter · January 14, 2021, 12:59am

It is since it’s a subtype, but when you’re doing an implicit search for MyTypeclass[X], if MyTypeclass is invariant in its type parameter then MyTypeclass[A | B] is not more or less specific than MyTypeClass[SuperType].

nafg · January 14, 2021, 8:47am

I hear, because the typeclass is not a subtype (due to invariance) there’s nothing making it more specific.

What I’m saying is that this needs to be special-cased. Implicit search needs to be able find results both for the union type and for the lub, with the union type having higher priority, and whichever succeeds influences what T will be.

megri · January 14, 2021, 9:51am

Aside from the discussion about implicit resolution, I would absolutely love if this union was inferred rather than Any. Not only would it better–in fact, exactly—communicate the type of the list, it would make debugging an unexpected type much simpler since it is easy to spot the undesired type at a glance.

RubenVerg · January 14, 2021, 10:07am

What about some kind of “marker type” eg

def foo(x: List[String]): Union = {
  if (x.size < 5) x
  else x map _.toInt
}

is inferred to List[String | Int] while no explicit return type goes LUB?

RubenVerg · January 14, 2021, 10:09am

wait, isn’t if a then true else 0 inferred to AnyVal?

dcsobral · January 14, 2021, 4:59pm

There’s a whole lot of people that hate when Any is inferred, as it is often an indication of error, not design. Inferring an union instead of Any would be the same.

What perhaps might be nice would be a way of telling the compiler to infer an union for you. Something like:

def: testX(x: X): [|] = if someCondition(x) then true else 0
def testXx(lst: List[X]): List[|] = lst.map(testX)

Ultimately, however, I think the best place for this kind of thing are IDEs. Let them offer a refactoring to declare an explicit union type, or turn a more generic type into a more specific union type.

morgen-peschke · January 14, 2021, 5:21pm

While it’s true they’re both errors, inferring a union instead of Any is going to generate a far more informative compilation error.

dcsobral · January 14, 2021, 5:33pm

Agreed.

RubenVerg · January 15, 2021, 7:02am

I thought that’s what I said?