Math on numeric union types

I was writing some Scala 3 code where I wanted something like:

val one: 1 | 2 = 1
val two: 2 | 3 = one + 1

but this does not compile.

on the other hand:

val one: 1 = 1
val two: 2 = one + 1

does compile. So it seems the compiler understands that if you add 1 to a value of type 1, you get a value of type 2, but it can’t distribute that reasoning across a union of integer singleton types. It seems tractable to me that where you have a union of singleton types of the same primitive type, you could distribute operations with singleton types across those unions.
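For comparison, the singleton arithmetic itself is already expressible at the type level through `scala.compiletime.ops.int` (a sketch of what compiles in recent Scala 3 versions; the union case is the missing piece):

```scala
import scala.compiletime.ops.int.*

val one: 1 = 1
val two: 1 + 1 = one + 1 // the type-level 1 + 1 reduces to 2
val alsoTwo: 2 = two     // so it is assignable to 2

// The union case from above still does not compile:
// val oneOrTwo: 1 | 2 = 1
// val twoOrThree: 2 | 3 = oneOrTwo + 1
```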

Are there any issues I am missing that would prevent adding support in the compiler for these kinds of operations on primitive values? It could be rather useful, I think.

2 Likes

I’d love some more general “map”-like operation on unions, call it Map, so that (1 | 2) Map (_ + 1) (and I guess (1 | 2) + 1 as well) would return 2 | 3

It occurs to me that if you have an A1 | A2 | A3 ... where each Ai <: A, and you call a method m defined on A that each Ai may override with a more specific return type Ri, then you should get R1 | R2 | R3 ... as a result. This wouldn’t require any reflection, since it is just propagating static information about the refined override return types.
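The refined-override rule described above already exists for ordinary classes; the proposal is only about how a selection on a union is typed. A sketch (the names A, A1, A2, m are hypothetical):

```scala
trait A { def m: Any }
class A1 extends A { override def m: Int = 1 }      // refined return type Int
class A2 extends A { override def m: String = "x" } // refined return type String

val x: A1 | A2 = if scala.util.Random.nextBoolean() then A1() else A2()
// Today, selecting x.m goes through an upper bound (the "join") of the
// union, so the refined types are lost and x.m is typed as Any.
// Under the proposed rule, x.m would instead be typed Int | String.
val y: Any = x.m
```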

Then we can consider Int as having 2^32 subtypes, each with a different signature for def +[N <: Int](n: N): scala.compiletime.ops.int.+[this.type, N], but that could be done virtually at typechecking time.

Something like that…

Maybe: combinatorial explosion of possibilities resulting in slow compilation times? I don’t know for sure but it looks likely this would be misused precisely because it would be so powerful.

2 Likes

I agree with this. The compiler allows you to do:

val one: 1 = 1
val two: 2 = one+1
val twoOrThree: 2 | 3 = two

But directly passing from one+1 to 2 | 3 is not allowed.
It doesn’t seem consistent or logical.

The compiler doesn’t seem to narrow one+1 to 2 in general, only when assigning it to a variable of type 2.

Maybe the compiler should consider each part of the union when assigning:

val one: 1 = 1
val twoOrThree: 2 | 3 = one+1 // Is one+1 assignable to 2 or to 3?

Odersky said:
I don’t know for sure but it looks likely this would be misused precisely because it would be so powerful.

I don’t understand what would be “too” powerful here.

1 Like

You could pretty easily do arbitrary bitset arithmetic at that point, which means that your type tests could need to store multiple not-very-sparse 2^32-bit type occupancy sets. At least, that’s what you have to watch out for.

The time taken wouldn’t be exponential (unless you have exponentially many types) since as far as I can tell the occupancy would always be concrete and thus could just be directly calculated. But it might require unreasonable amounts of computation and memory, depending on what exactly was allowed.

What exactly requires these computations?

I mean, I’m not talking about making 2^32 subtypes of Int. Just keep the actual singleton-type computation, but make the compiler check each “side” of a union:

val one: 1 = 1
val twoOrThree: 2 | 3 = one+1 // Attempt to type `one+1` as `2` and, failing that, as `3`.
1 Like

It’s potentially 2^(2^32) subtypes of Int.

Anyway, it depends on what is allowed. Is val two: 1 + 1 = 2 okay? What about type nuggets = 4 | 6 | 10 | 20? What about val complicated: (2 | 4 | 6 | 8 | 10 | 12 | 16) & (1 | 4 | 9 | 16) = (one + one)*(one + one)? What about type sum = a + b, with parameterized a and b?

I personally would allow any of it: if you’re going to do type computations on subsets of ints, just go the whole way, put a timeout and a size check in the compiler, and complain if people do anything too intractable.

But depending on what you allow, it could require almost arbitrary computation.

1 Like

Here, I’m only talking about union types. In this example:

val one: 1 = 1
val twoOrThree: 2 | 3 = one+1

the compiler should check whether one+1 can conform to 2 or to 3. It currently doesn’t do this for unions.

What about val complicated: (2 | 4 | 6 | 8 | 10 | 12 | 16) & (1 | 4 | 9 | 16) = (one + one)*(one + one)

I don’t get your point. How can a singleton type be, for example, both 1 and 2? This shouldn’t be some sort of bitset; it’s still just union and intersection types.

This example actually already works. It’s when you take it one step further (adding something to twoOrThree) that it stops working.

1 Like

Thanks for the info.

However, this does not change my proposal:

val oneOrTwo: 1 | 2 = 1
val twoOrThree: 2 | 3 = oneOrTwo+1

Shouldn’t the compiler check the correctness of 2 | 3 for 1+1, and then, if that is incompatible, for 2+1?

Also, maybe @oscar should describe his use case ? Maybe there’s a better alternative.

Pretty sure that would be possible, but the compiler would have to check, for both 1 and 2, that the resulting type is a subtype of 2 | 3. This should not be a problem to compute as long as only one operand is a union type.
However, if both operands are union types, we get the combinatorial explosion @Ichoran mentions:
What should (0 | 1) + (0 | 2) (as types) be? 0 | 1 | 2 | 3, right?
But then, (0 | 1) + (0 | 2) + (0 | 4) is already a union of 8 types, and if we add a + (0 | 8), we get 16 types. So the number of types doubles with each summand (and there is no obviously more compact way to represent these types), and we would get exponential compilation times.
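The doubling is easy to check by modelling each singleton union as the set of values it contains and the type-level + as the pointwise sum (a plain-Scala sketch of the counting argument, not a compiler mechanism):

```scala
// Model a union of int singletons as the Set of its members;
// adding two unions sums every pair of alternatives.
def plus(a: Set[Int], b: Set[Int]): Set[Int] =
  for x <- a; y <- b yield x + y

val u1 = plus(Set(0, 1), Set(0, 2)) // Set(0, 1, 2, 3): 4 alternatives
val u2 = plus(u1, Set(0, 4))        // 8 alternatives
val u3 = plus(u2, Set(0, 8))        // 16 alternatives
// Each extra (0 | 2^k) summand doubles the number of alternatives.
```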

6 Likes

Yes, you could definitely have exponential blowup in compile times, but I think only in cases where you are typechecking a union type ascription. In practice, I think these ascriptions would not get enormous.

Also, there have traditionally been many ways to write code that typechecks very slowly (I think you can construct an exponential implicit search without too much trouble), but that usually isn’t, in and of itself, a reason not to have a feature. The more important question is: are there important cases that do not take too long to compile? I think the answer here is yes.

1 Like

Hmm, a standard method of avoiding such explosions in rewriting systems is to normalize terms to canonical forms before unification.
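For int-singleton unions, one plausible canonical form is a list of contiguous ranges, so that a union like 0 | 1 | ... | 7 is stored as a single interval instead of eight alternatives. A sketch of such a normalization (an assumed representation for illustration, not what the compiler does):

```scala
// Merge a set of int singletons into maximal contiguous ranges.
def normalize(values: Set[Int]): List[Range] =
  values.toList.sorted.foldLeft(List.empty[Range]) {
    // Extend the current range when the next value is adjacent to it...
    case (r :: rest, v) if v == r.end + 1 => (r.start to v) :: rest
    // ...otherwise start a new single-element range.
    case (acc, v) => (v to v) :: acc
  }.reverse

normalize(Set(0, 1, 2, 3, 5, 6)) // List(0 to 3, 5 to 6)
```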