Making union types even more useful

Only if A and B are disjoint. Otherwise more than one branch can match, and only the first one will be executed.

I get that :slight_smile: , what I’m trying to understand is in what situations this would actually cause undesirable behavior.

The cases seem to be:

  1. Disjoint (A | B)
  2. Same type (A | A)
  3. Super & subtype (A | B where A <: B)

As far as I can tell, none of these cases should cause undesirable behavior:

  1. Unambiguously handled by the match
  2. Regardless of which case matches, the same method will be called on the same instance
  3. LSB should mean that the super and subtype should be interchangeable.
    If not, then in a program where a.foo and (b: B).foo do different things, match order mattering is probably the least surprising thing you’ll encounter :slight_smile:
5 Likes

Yes, I think you are right, as long as we restrict our focus to member selection. If there is an overlap then dynamic dispatch will provide the right method.

6 Likes

Clarification question on intent, in the case where the type signatures are not identical: does this capture the desired return type in this case?

class A() {
  def foo: Int
}
class B() {
  def foo: String
}

val input: List[A|B] = List(A(), B())
val output: List[Int|String] = input.map(_.foo)

I guess you mean input.map(_.foo)
Yes, List[Int|String] would be a type I’d like :+1:

1 Like

Yep :person_facepalming:

I think union types need to be more common in order to be more useful, which I assume would mean disabling eager widening—having to force them to stick around with type annotations at every point works against them.

What’s the rationale for widening something like if someBool then B() else C() to A early if B|C <: A holds for trait A; class B extends A; class C extends A? After all B() or C() on their own isn’t widened to A.

I think I remember asking this at some point before and being answered that it interfered with implicits (seems like a lot of things do :roll_eyes:) but couldn’t the rules for implicits be adjusted then, rather than make everything else conform?

2 Likes

do you mean if someBool then B() else C()?

Yes, typo. I edited my post. Although strictly speaking an unwidened type like A|B would still be valuable.

See Pre-SIP: Exact type annotation

There are at least three reasonable+legal ways to type if p then 42 else "hello" as in:

scala> var p = true
var p: Boolean = true
                                                                                
scala> val x = if p then 42 else "hello"
val x: Matchable = 42
                                                                                
scala> val x: 42 | "hello" = if p then 42 else "hello"
val x: 42 | "hello" = 42
                                                                                
scala> val x: Int | String = if p then 42 else "hello"
val x: Int | String = 42

And I think its is most intuitive to the user that it gets the inferred type Int | String and not Matchable as currently (or even Any). (Students ask me about Matchable and Any and the explanation often leaves them unsatisfied as we need to get ahead of basic things to give a meaningful picture or else simplify roughly, BUT a union of Int | String seems much easier to understand for a beginner.)

And if the compiler could generate the match then this would work:

scala> extension (i: Int) def size: Int = i
def size(i: Int): Int

scala> x.size
-- [E008] Not Found Error: -----------------------------------------
1 |x.size
  |^^^^^^
  |value size is not a member of Int | String.
  |An extension method was tried, but could not be fully constructed:
  |
  |    size(x)    failed with
  |
  |        Found:    (x : Int | String)
  |        Required: Int
1 error found

Then x.size would expand to:

scala> x match { case i: Int => i.size case str: String => str.size }
val res3: Int = 42

All this seems reasonable to me, given that the union commutativity / match order discrepancy can be lived with or (partially) “solved”.

7 Likes

Including extension methods of individual components of the unions is an even bigger can of worms than what was discussed so far. Now when you see

x.size

with x: Int | String, you have to test whether:

  1. size exists on Int and on String (separately, not via a common superclass), or
  2. size exists via extension on Int | String, or
  3. size exists on Int, and is available for String via an extension, or
  4. size exists on String, and is available for Int via an extension, or
  5. size is available on Int and on String via separate extensions.

That’s a lot of possible cases! You can simplify it down to

  1. size exists on a common superclass of Int and String, or
  2. size is available on Int | String via an extension, or
  3. Decompose:
    a. size exists on Int or is available via an extension, and
    b. size exists on String or is available via an extension

That’s still a lot of fallback cases to test for, and therefore a lot of time spent by the compiler typechecking these things.

In addition, you argue that the implicit union types of if..then..else should be preserved, which means more union types in the program, which means more situations where this deep lookup is necessary.

I don’t think this is realistic.

4 Likes

If I’m allowed to dream a bit more I actually think the 42 | "hello" looks really interesting—after all it is the most exact definition. It could probably be a bit jarring to see a type signature like this, but it’s the most truthful one and also reasonable if we consider how user-defined singleton types behave:

trait A
object B extends A

def foo = B

If foo is inferred to return B.type rather than A, why should def bar = 42 be inferred as Int rather than 42? Intuitively I expect foo: B.type and bar: Int, but this is probably because I’ve grown used to it. And for all intents and purposes they behave exactly like their “wider” type, only the type is kept around for the compiler.

I’m not sure when it comes to generating static forwarders; I’d be happy enough if the compiler was smart enough to eliminate branches as they were tested and error out if I forgot to test for something. The power of keeping the unions around would still be very possible and allow a secure match to be written like

x match
  case n: 42 => 
  case s: "hello" =>

… or using the wider case n: Int => and : String if so desired.

2 Likes

Wouldn’t case 3 be sufficient?

For typing, you could just look for (_: Int).size and (_: String).size separately, exactly as it would be done now for one of them. It should not matter, if that method is defined on the type, a superclass or an extension of Int | Whatever.
Then you take the union of their results and – in case one is subtype of the other – simplify. So (AnyRef | String) becomes AnyRef and (Int | Int) becomes Int. That does not sound overly complex to me.

I’m not to deep in compiler details, but if I understand it correctly, you could then just check, if (_: Int).size and (_: String).size resolved to the same method (they do have unique Symbols, right?). If they’re the same, just generate a call, if not, generate a match.

(Weird thing: you could also always generate a match. If it’s the same method, that shouldn’t change any behaviour :smiley:, but maybe you’d want to re-optimize that in a later phase)

1 Like

That strategy only multiplies the number of lookups for the common case where the method is defined for the common type (directly or via extensions). So I expect it would make things worse, despite reducing the number of lookups for the uncommon case.

1 Like

It seems to me like the compiler automatically using methods for the correct type, will add cost of matching that is hidden to the user. Although natural I’d probably have it behind a import feature flag

Another thing I feel should be possible to convert disjoint union cases to Tuples via match types? It could be used to filter lists/database using types. I’d like to inline pattern match on union types to get a list of supported types. One way I thought could be match types to convert to tuples first and to do a toList to extract singleton values.

type FromUnion[U] = U match
  case Nothing    => Nothing
  case Int | y    => Int *: FromUnion[y]
  case String | y => String *: FromUnion[y]


summon[FromUnion[Int | String] =:= Int *: String *: Nothing]
/**
Cannot prove that <error Recursion limit exceeded.
Maybe there is an illegal cyclic reference?
If that's not the case, you could also try to increase the stacksize using the -Xss JVM option.
A recurring operation is (inner to outer):

  reduce type  String match ...
  reduce type  String match ...
  reduce type  String match ...
  reduce type  String match ...
  reduce type  String match ...
    */

Workaround is do something like the following instead


inline def Name[E <: Matchable]: List[String] =
  List(
    inline erasedValue[E => Nothing] match
      case _: (Int => Nothing) => List("Int")
      case _                   => List.empty
    ,
    inline erasedValue[E => Nothing] match
      case _: (Double => Nothing) => List("Double")
      case _                      => List.empty
    ,
    inline erasedValue[E => Nothing] match
      case _: (String => Nothing) => List("String")
      case _                      => List.empty
  ).flatten

Name[Int | String | Double] // List(Int, String, Double)

Who cares?

You don’t need to do this manually, you use a computer to do it!

Computers can do billions of things per second. Every few years they get one order of magnitude faster…

Computers were invented specifically to automate very large if-then-else decision trees!

I don’t believe that this would slow down the compiler significantly, anyway. (Especially as such checks can be likely trivially parallelized as I don’t see any data dependencies).

This sounds like a major over-dramatization.

Not only that this is "the obvious right thing"™ to do from the UX and user expectations side of things, this is even something that is very common elsewhere.

Have you ever heard of TypeScript? (Sorry, but SCNR :sweat_smile:)

We’re talking here only about a few extra checks in case a fallback is needed.

But in TS any and every type needs exactly this kind of checks! With much much bigger unions, and a lot of very complicated extra stuff around that, including complex type-level computations. (And BTW, people are loving how this works!)

Even the most stupid and primitive language since the invention of computers, namely Go, does exactly this kind of checks. Again, for every type! And it even still compiles very fast. (Because computers are fast, and some extra if-else-then lookups don’t make any difference. In Big O such an extension of the algo would be likely just an addition to a constant, so not even there at all, as I see it; not sure though about the last point as I don’t know the internals of the compiler; still I see no issue, as this is just a fallback).

I fully support the sentiment of the others here: Instead of avoiding unions (and intersections) we should embrace them. Otherwise they’re almost useless (because they get “erased” anytime soon you pass an expression forward).

Everything should have the most precise type. So we need indeed unions (and intersections) everywhere, actually. Scala is supposed to be a smart language from the future. So it should at least catch up to some more primitive languages, like Go… :wink:

People are asking for proper structural types in Scala since at least a decade! Making unions (and intersections) a first class feature would be a step in this direction as both topics are closely related. (The only difference being that one is the “inner view”, the other the “outer view” of the “type structure”).

Instead of discussing how much “more work” the stupid machine would have to do if implemented we should just implement it and than measure the actual real impact on performance.

And just to prove that there is absolutely no technical issue with the whole previously shown concept:

https://www.typescriptlang.org/play?#code/MYewdgzgLgBADjAvDKAnArgUwFCkrADyXhgH4YAWAJhgC4YAiAC0wBtWQHtdxoZgWwANYB9CvTDoAtgCNMqGAB8Y0VAEswAc2IEe+foNFUJ0uQuXM2HBjr18BmYSIDM9akpVoN25LqA

[This forum breaks the link when not quoted…]

(Please hover over the x in the TS code to experience the whole “magic”. :grinning:)

This leads to the question: Do we really like Scala to lack behind something like TS for no proper reason?

In practice nobody cares much about how fast a compiler is when it delivers intelligent and powerful features in return. See C++ or Rust… (Especially as computers will be at least ten times faster in four to five years.)

Once again Rust is making here the right trade-offs. And once again, the mass market loves it!

Scala should finally overcome the many years of gaslighting of some random idiots on the internet who constantly complained about alleged “slow compilation” or “complexity”. This were mostly some Java die hards who just wanted to save their world of the past.

Rust and other competitive languages show this was all just FUD, FUD and even more FUD. You can be widely successful with a compiler that is 10x - 50x times slower than Scala! But now all this accumulated FUD holds back our language—and is even inducing “fear of progress”.

In my opinion the this way induced paranoia reached some unhealthy levels by now. But we should just ignore the FUD spreaders (finally!) and just look what works in other powerful and successful languages.

Who wants a “better Java” can have Kotlin by now. The real contenders of Scala are imho Rust and C++ (and maybe to some extend Idris, F*, or LEAN)! That’s the right ballpark. Just forget about the “better Java” folks. It would be even relieving if those folks leave as fast as possible. They can have Go, Kotlin, or whatever they prefer, as long as they’re out of sight.

I would say, let’s get back to the initial premise of Scala: A really smart compiler that pushes the boundaries of what’s actually possible. Scala should again lead the pack; instead of fearing to implement features that are successfully used in production elsewhere since many years.

1 Like

Thanks @MateuszKowalewski for reviving this thread. We in the SIP committee have been focusing for a while on a backlog of SIP proposals, but now there are less pending proposals in the queue, so now might be a good time to propose something around making union types more powerful.

It is interesting to hear that you prioritize smarter inference over compilation time, and I think many agree (although I think your post can be taken as an unnecessary “over-dramatization” when using wordings like “stupid” to convey your views in this nice forum, which values Factual | Pertinent :heart_eyes: and I don’t think we gain much by speaking unfriendly about other languages).

Compiler speed is a legitimate concern for some and we welcome many different kinds of Scala users, also those coming from Java, Haskell, … , so ideally we can find a balance in language design of union types that can support many use cases - Scala is after all known for being both pragmatic and powerful. Having said that, I agree with you that the ergonomics of union types could be improved, exemplified by several proposals in this thread.

A good next step could be if someone steps up willing to start crafting a SIP proposal with ways forward to make ergonomics and type inference around union types more up to speed with e.g. type script, but such a SIP should also include an account of concerns on compiler and runtime overhead etc. I welcome anyone to start constructing such a SIP and then participants in this thread can help with reviewing and suggestions, and we can continue the discussion there.

9 Likes

Today we discussed this at the SIP meeting. I started with a summary of this thread to kick of the debate. Here are the slides that I used:

The last slide summarizes the discussion as I remember it, and the conclusion is that type inference is getting improvements but membership selection on unions based on structure is not inline with nominal typing and there are also questions from a soundness point of view.

Also, there are workarounds. With the nice extension method feature of Scala 3 you can create your own abstraction that provides a short-cut for the match, esp. in the use case where types are defined in code you don’t control and you still want to relate unrelated types:

scala> class A(val x: Int)
// defined class A

scala> class B(val x: Int)
// defined class B

scala> val ab: A | B = A(42)
val ab: A | B = A@7e2bc2f4

scala> ab.x
-- [E008] Not Found Error: ---------------------------------
1 |ab.x
 |^^^^
 |value x is not a member of A | B
1 error found

scala> extension (ab: A | B) def x: Int = ab match
    |   case a: A => a.x
    |   case b: B => b.x
    | 
def x(ab: A | B): Int

scala> ab.x
val res0: Int = 42

So given the problems and the existing workarounds I think we have to accept status quo on absence of structural selection on unions.

The good news on other issues with union types, are that more things work now in 3.2.1 , e.g.

And more is coming when nightly is out :tada: (If I understood @smarter right (?)).

7 Likes