Proposal for Opaque Type Aliases

Would it be possible to reuse the keyword new instead of introducing opaque?

object Person {
  new type Id = String
}

My hunch is that the vast majority of type aliases out there serve mainly as shorthands and to re-export types defined in other scopes, and absolutely need to be transparent. I think defining an opaque type is always a deliberate decision, which has to be accompanied by the appropriate API definitions to enable actually interacting with the type.
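To make the difference concrete, here is a minimal Scala 3 sketch of the two kinds of alias side by side. The `Name` alias, the `Id.apply` factory and the `value` extension are made up for illustration; they are not part of the proposal.

```scala
object Person:
  // Transparent alias: a shorthand, interchangeable with String everywhere.
  type Name = String

  // Opaque alias: known to be a String only inside `Person`.
  opaque type Id = String

  object Id:
    // Hypothetical factory: the only way to build an Id from outside.
    def apply(raw: String): Id = raw

  // Hypothetical accessor: the only way to get the String back out.
  extension (id: Id) def value: String = id

import Person.*

val name: Name = "Ada"       // fine: Name and String are the same type
val id: Id = Id("p-42")      // has to go through the API
// val bad: Id = "p-42"      // does not compile outside Person
val shown: String = id.value
```

Without `apply` and `value` the opaque `Id` would be unusable from the outside, which is why defining one tends to be a deliberate choice.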


I agree with @LPTK’s hunch. Without actual evidence to the contrary, we should err on the side that preserves the existing semantics of the code. Swapping the default would require strong evidence that opaque type aliases would far outnumber transparent ones. That could be provided by a motivated person with an analysis of a large corpus of existing Scala code, comparing uses of type aliases with uses of AnyVal classes that are not also implicit classes (the latter are going to become extension methods).


Would it be possible to reuse the keyword new instead of introducing opaque?

Technically this would be feasible. It would be quite late to make that change, however, as opaque types have been accepted by the SIP committee. We can give it a quick discussion at the next meeting.


FWIW, when doing things like rendering graphics and generating SVG and such, “opaque” is a handy variable name. So personally I would prefer new.

You can use opaque as an identifier name in Dotty; it doesn’t clash with the use of opaque as a keyword.
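A small sketch of that, assuming Scala 3, where `opaque` is a soft modifier. The `Svg` object and the `fill` helper are invented for the example.

```scala
object Svg:
  // Here `opaque` is a modifier because it is followed by `type`.
  opaque type Alpha = Double
  object Alpha:
    def apply(d: Double): Alpha = d

// Here `opaque` is an ordinary identifier.
val opaque = true

// Hypothetical helper: `opaque` as a parameter name works too.
def fill(color: String, opaque: Boolean): String =
  if opaque then color else s"$color;opacity:0.5"
```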

Personally, I would function just fine with new type but I think it’s actually a bad name that we all happen to understand because of exposure to Haskell. I suspect that folks outside our bubble would appreciate opaque type much more.


Only in my head :)


Maybe the rule ought to be “don’t inspect types at runtime unless you know what you’re doing.”

Due to erasure it is already limited (but at least the compiler can warn you about it, which is not possible for opaque types).
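For reference, this is the kind of erasure limitation meant here, as a minimal sketch. The `describe` function is made up; the point is that the compiler flags the type test as unchecked, yet the code still runs and takes the first branch.

```scala
def describe(x: Any): String = x match
  // The compiler reports an unchecked warning here: the element type
  // is erased, so this test only checks that x is a List.
  case _: List[String] => "strings"
  case _               => "other"

// A List[Int] is still reported as "strings" at runtime.
val surprising = describe(List(1, 2, 3))
```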


But still, pattern matching on Any is actually used, and I can think of user-facing apis in big projects (Akka, Spark) that would be broken by opaque types:

In Spark’s DataFrame API (every position marked with ^ is of type Any and gets pattern matched):

dataframe.select(
  ((lit(1) + 1) === 2) && lit(true)
//      ^    ^      ^     ^   ^
)

Akka of course has the Actor receive method

def receive: PartialFunction[Any, Unit]
...

override def receive = {
  case i: Int => ...
  case s: String => ...
}

It seems like this could be resolved by adding a warning for matching on Any, just like the erasure warning we have now for matching on generics.

But the problem isn’t that the Spark and Akka creators accidentally matched on Any; they chose that design deliberately.

The consequence of that is that their users need to be careful every time they use that design - a warning can act as a reminder to do so.

Pattern matching on Any already implies that you must be prepared to handle any type at all, so I don’t really understand in what sense there are new edge cases to consider. Can you expand on that?

The new edge cases are that even if you are looking at a totally non-generic value, you cannot know for sure that the value is actually of the type it appears to be. Let’s consider this very obvious application of opaque types to provide a performant encoding of IP addresses:

object Ip {
  opaque type Ipv4 = Int
  object Ipv4 {
    def fromBits(bits: Int): Ipv4 = bits    
    // plus more user-friendly factory methods like `fromString` etc...
  }
}

And then if you are a user of Akka, maybe you want to send IP address messages to an actor:

actorRef ! Ipv4.fromString("127.0.0.1")

But now the actor must not also accept regular Int values, because it will mistakenly think that these Ipv4 addresses are Ints:

def receive = {
  case i: Int => 
    incrementCounter(i)
  case ip: Ipv4 => 
    ping(ip)
}

So, the goal of encapsulating underlying data types is not 100% achieved (though it is pretty good) because in some cases like this, the user must actually know about and carefully consider what is the underlying type.
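The effect can be reproduced without Akka. Below is a minimal Scala 3 sketch in which a plain `PartialFunction[Any, String]` stands in for the actor's `receive`; the returned strings are invented for the example. Since `Ipv4` is represented as an `Int` at runtime, the `Int` case picks it up.

```scala
object Ip:
  opaque type Ipv4 = Int
  object Ipv4:
    def fromBits(bits: Int): Ipv4 = bits

import Ip.Ipv4

// Stand-in for an actor's receive method; no Akka involved.
val receive: PartialFunction[Any, String] = {
  case i: Int => s"counter += $i"
}

// 0x7f000001 is 127.0.0.1; at runtime the message is just a boxed Int.
val handled = receive(Ipv4.fromBits(0x7f000001))
```

The address ends up treated as a counter increment, so the author of the handler has to know that `Ipv4` is an `Int` underneath.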

Or, in the case of Spark, there would be no way for Spark to tell that your Ipv4 value is not an Int, so it will treat it as one:

users.where(col("year_of_birth") === Ipv4.fromString("127.0.0.1"))

^ the above will not fail at compile time nor at runtime, and may actually yield some rows.

I just think it might be quite surprising that users now can’t count on runtime values having anything at all to do with their actual semantic type. I know this was already somewhat the case for generics, but this extends that limitation to all types.


The exhaustiveness checker should be able to warn about the second case not being reachable. (And maybe it should be even more strict and always require an @unchecked when matching on an opaque type).

Maybe, but that still won’t stop

def receive = {
  case i: Int => 
    incrementCounter(i)
}

from catching Ipv4s.

I think the original mistake here is to upcast an Ipv4 to an Any. Such an upcast is almost always going to result in some surprising scenario down the line. It might be possible to warn upon such an upcast.


It might be possible to warn upon such an upcast.

That’s cool. On upcasting any type T to Any, or just on upcasting opaque types to Any?
If the former, then this seems like a big change to the language. How is Spark going to rewrite its DataFrame API in a way that doesn’t incur warnings?

If the latter, then I guess this would also have to catch types which reference opaque types, like List[Ipv4], Either[Ipv4, String], Ipv4 => Double, etc. Also I guess it wouldn’t totally work if the upcast isn’t local, like…


def send[T](msg: T, actorRef: ActorRef): Unit = actorRef ! msg
                                                         // ^ upcasted
//somewhere else...

send(ipv4, self)
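A self-contained version of the same shape, as a rough Scala 3 sketch. The `handler: Any => String` parameter is a hypothetical stand-in for `actorRef`, and `counter` is invented for the example. The widening to Any happens inside `send`, where only `T` is visible, so a warning tied to upcasting an opaque type would have nothing to attach to at that point.

```scala
object Ip:
  opaque type Ipv4 = Int
  object Ipv4:
    def fromBits(bits: Int): Ipv4 = bits

import Ip.Ipv4

// Hypothetical stand-in for `actorRef ! msg`: the handler only ever sees Any.
def send[T](msg: T, handler: Any => String): String =
  handler(msg) // T widens to Any here, with no Ipv4 in sight

val counter: Any => String = {
  case i: Int => s"incremented by $i"
  case other  => s"ignored $other"
}

// The call site mentions Ipv4 but never mentions Any.
val result = send(Ipv4.fromBits(1), counter)
```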

That’s fine though, right? You can’t break any invariant of Ipv4 because you no longer know from this point onward whether you have an Ipv4 anyway due to the upcast.


Only opaque type aliases to Any.

I don’t think so. For those, the situation is not made worse by opaque type aliases than before.