Proposal for Opaque Type Aliases

Hi Scala Community!

This thread is the SIP Committee’s request for comments on a proposal to introduce Opaque Type Aliases in the language: a form of type alias whose underlying type is known only in its companion object, while the relationship is opaque from the outside. You can find all the details here.

Summary

In Scala, we can define a type alias as follows:

object Enclosing {
  type ID = Long
}

With that definition, Enclosing.ID is treated everywhere as a simple alias to Long.

While such a definition provides a convenient textual abstraction, the abstraction is not checked by the compiler, so we can erroneously use a normal number instead of an ID, for example.
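To make the problem concrete, here is a minimal sketch of a plain alias letting an arbitrary Long through. The describe helper and the AliasDemo object are hypothetical names used only for illustration.

```scala
object Enclosing {
  type ID = Long
}

object AliasDemo {
  import Enclosing.ID

  // Hypothetical helper, only here to provide a call site that expects an ID.
  def describe(id: ID): String = s"record-$id"

  // A plain Long is accepted where an ID is expected,
  // because the compiler treats ID and Long as the same type.
  val plain: Long = 42L
  val description: String = describe(plain)
}
```

Nothing in the signature of describe stops a caller from passing a number that was never meant to be an ID.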

Opaque type aliases are similar, but the relationship between ID and Long is hidden from the program, except in the companion object of ID, so that we can write:

object Enclosing {
  opaque type ID = Long

  object ID {
    private var nextID: Long = 0L

    def apply(): ID = {
      nextID += 1L
      nextID // compiles because within object ID we know that ID and Long are equivalent
    }
  }
}

import Enclosing._
val someID: ID = ID()
val anotherID: ID = 5L // does not compile
val someLong: Long = someID // does not compile
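For readers who want to try this, here is a sketch of the same idea in Scala 3 syntax as it later shipped, which is an assumption relative to this proposal: there, the alias is transparent in the whole scope where it is defined, not only in the companion. The value accessor and the IdDemo object are hypothetical additions showing how outside code could get the Long back explicitly.

```scala
object Enclosing {
  opaque type ID = Long

  object ID {
    private var nextID: Long = 0L

    def apply(): ID = {
      nextID += 1L
      nextID // accepted: ID and Long are known to be the same type here
    }
  }

  // Hypothetical accessor: an explicit, opt-in way to unwrap an ID.
  extension (id: ID) def value: Long = id
}

object IdDemo {
  import Enclosing.*

  val first: ID = ID()
  val second: ID = ID()

  // Unwrapping has to be explicit outside of Enclosing.
  val rawFirst: Long = first.value
  val rawSecond: Long = second.value

  // val bad: Long = second // does not compile: ID is not known to be a Long here
}
```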

There is already a mechanism in Scala for similar abstractions: value classes declared with extends AnyVal. The main issue with value classes is that they unpredictably box and unbox their underlying type.

Because opaque type aliases are, in the end, bona fide type aliases, they box and unbox exactly when the underlying type itself would (which notably means never for reference types).
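As an illustration of the boxing behaviour of value classes, the sketch below shows two documented situations in which a value class instance is actually allocated: use as a generic type argument and storage in an array. Meters, passThrough and BoxingDemo are hypothetical names.

```scala
// Hypothetical value class used only for illustration.
final class Meters(val value: Double) extends AnyVal

object BoxingDemo {
  // Passing a Meters through a generic method requires an actual Meters instance.
  def passThrough[A](a: A): A = a
  val viaGeneric: Meters = passThrough(new Meters(1.5))

  // An Array[Meters] holds wrapper objects rather than primitive doubles.
  val wrapped: Array[Meters] = Array(new Meters(1.0), new Meters(2.0))

  // For comparison, an array of the underlying type is a primitive array.
  val primitive: Array[Double] = Array(1.0, 2.0)
}
```

An opaque alias of Double would follow the behaviour of the second array, since it is the underlying type at run-time.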

Another advantage of guaranteed equivalence at run-time is that it provides a good foundation for abstractions in interoperability scenarios. For example, in Scala.js, one might want to define a facade type for the set of JavaScript integers, even those that are bigger than Ints. We cannot do that with a value class, because it would systematically box when given to JavaScript, therefore defeating the whole purpose. However, we can do it with an opaque type alias:

object JSTypes {
  opaque type LargeInt = Double
}

An opaque type alias is almost always accompanied by a companion object, which can use extension methods to enhance the opaque type alias with relevant methods:

object JSTypes {
  opaque type LargeInt = Double

  object LargeInt {
    def fromInt(x: Int): LargeInt = x.toDouble

    def (x: LargeInt) toDouble: Double = x

    def (x: LargeInt) + (y: LargeInt): LargeInt = x.toDouble + y.toDouble
    ...
  }
}
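The extension method syntax in the snippet above is the one proposed at the time. As a hedged sketch, the same facade can be written with the extension syntax that later shipped in Scala 3 (an assumption relative to this proposal). LargeIntDemo is a hypothetical usage object.

```scala
object JSTypes {
  opaque type LargeInt = Double

  object LargeInt {
    def fromInt(x: Int): LargeInt = x.toDouble
  }

  // Inside JSTypes, LargeInt is known to be Double, so the bodies
  // below use the ordinary Double operations.
  extension (x: LargeInt) {
    def toDouble: Double = x
    def +(y: LargeInt): LargeInt = x.toDouble + y.toDouble
  }
}

object LargeIntDemo {
  import JSTypes.*

  // Outside JSTypes, only the extension methods are available on LargeInt.
  val sum: LargeInt = LargeInt.fromInt(2) + LargeInt.fromInt(3)
  val asDouble: Double = sum.toDouble
}
```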

Limitations

Due to their nature, opaque type aliases cannot redefine methods of Any such as toString(), equals() and hashCode(). Moreover, since they are (on purpose) indistinguishable from their underlying type at run-time, it is not possible to perform type tests on opaque type aliases, notably in pattern matching.
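A small sketch of what these limitations look like in practice, written in Scala 3 syntax as it later shipped (an assumption relative to this proposal). Ids, UserId and LimitationsDemo are hypothetical names.

```scala
object Ids {
  opaque type UserId = Long

  object UserId {
    def apply(v: Long): UserId = v
  }
}

object LimitationsDemo {
  import Ids.UserId

  val id: UserId = UserId(5L)

  // toString cannot be customised: it is the one of the underlying Long.
  val shown: String = id.toString

  // A type test cannot tell a UserId from a Long: at run-time it is a Long.
  val looksLikeLong: Boolean = (id: Any) match {
    case _: Long => true
    case _       => false
  }
}
```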

See the full proposal for all the gory details.

Implications

Together with extension methods, opaque type aliases replace virtually all use cases for value classes. We therefore expect that value classes will be progressively phased out from the language and deprecated. That is however out of the scope of this specific SIP.

Opening this Proposal for discussion by the community to get a wider perspective and use cases.


What are the thoughts for how this would interact with lightweight structs with copy semantics?

The reason I ask is that there is a natural way to make value classes work for that, and the feature is supposedly coming to the JVM sometime, but it’s less clear that opaque type aliases will work for that. So although I like opaque types, it’s not totally clear to me that they will allow us to remove value classes; and if not, having two ways to do the same thing is rather iffy, and it might be worth reconsidering how value classes could be retrofitted instead. For instance, maybe inline classes are a better way to conceptualize things than opaque types, if you need to cover the case where multiple parameters are cool.


When Project Valhalla finally lands in some future version of Java, then value types could be extended to allow more than one field. How do you plan to support future Java’s value classes if not by extending AnyVals?

To clarify, is this SIP only for adding Opaque Types or also for removing Value Classes?


Many common data types we use in FP are trivial wrappers that are mostly used to select different typeclass instances (monad transformers have this property for example) and all the wrapping and unwrapping ends up being a big performance concern. So this is going to be a really important improvement and I’m very pleased it’s happening. :+1:


An opaque type alias is almost always accompanied by a companion object, which can use extension methods to enhance the opaque type alias with relevant methods:

I personally find this a pretty roundabout way of doing the same thing that value classes do: creating a bunch of instance methods that compile down to static functions operating on the unboxed type. Would it be possible to have a more value-class-like syntax?

object JSTypes {
  opaque type LargeInt = Double {
    def toDouble: Double = this

    def +(y: LargeInt): LargeInt = this.toDouble + y.toDouble
    ...
  }
  object LargeInt {
    def fromInt(x: Int): LargeInt = x.toDouble
  }
}

Otherwise it seems odd and boilerplatey to have a bunch of extension methods that aren’t really extension methods: they’re just the primary set of methods of a particular type. The fact that the type is an opaque alias and the methods compile down to static functions isn’t really relevant.


This SIP is only for adding Opaque Type Aliases.

The key word IMO is “supposedly”. We have no guarantee that it’s going to happen, let alone when. In my opinion, the promise of value classes on the JVM has already held us back too long, and I do not think we should base current decisions on the hypothetical future of the JVM, which we have no control over.

Moreover, that would only be for the JVM; JS is not going to have value classes anytime soon.

And besides all that, there remains one fact that I believe many often forget: even if the JVM does receive value classes, it is very likely that such value classes will in fact box almost as much as AnyVals do, because they would have to retain instanceof/toString()/checked-cast capabilities, i.e., run-time disambiguation from their underlying type. Even more so if they are multi-parameter value classes. Unless of course the JVM provides actual generic specialization for value classes, but that’s an additional battle.

I’ll also say that @smarter in the SIP committee actively shares your concerns about JVM value classes.


The thing is that they actually really are extension methods. The syntax you suggest is not directly possible, because it’s already virtually correct existing syntax (as a type refinement), but assuming it would be a viable syntax, it would have to desugar to extension methods in the companion object anyway. Such a desugaring would be dangerous however, I believe, as we would probably have a difficult time dealing with scopes (think what happens if there is an import in the companion object, that shadows identifiers that are otherwise used in the type part).


Long related discussion in https://github.com/lampepfl/dotty/pull/4028, including a cameo by Brian Goetz.

Why not support translucent types while we’re at it? They are types on which only an upper bound is known from the outside.
I think it would be a pretty trivial generalization.

I’d like to be able to write:

opaque type Sign <: Boolean
object Sign { def of(v: Boolean): Sign = v }
val Positive = Sign of true
val Negative = Sign of false

For the simple reason that I don’t want to have to reinvent all the Boolean methods, bloating my code with boilerplate. An example usage:

def foo(lhs: Sign, rhs: Sign) = if (lhs || rhs) ... else ...
foo(Positive, Negative)
foo(true, false) // error
val x, y: Sign = ...
foo(Sign of (x && y), Negative)

I don’t mind having to call the Sign constructor again after doing some Boolean operations on Sign values, as above. In fact, I find that preferable to having to use a destructor for it (as in if (lhs.toBoolean || rhs.toBoolean) ... or if ((lhs || rhs).toBoolean) ...), even if the destructor is made an implicit conversion (I try to avoid these when possible).


The implicit conversion is still a very easy solution to your use case, that doesn’t require any more specification nor implementation work.

Besides, it works for both directions: you may want to expose the Boolean to Sign conversion but not the other direction. In that case your translucent types are not helping. I don’t see why the type-alias->underlying direction should be more “first-class” than the reverse direction.

Me neither, actually. I think opaque type Sign >: Boolean should be allowed too.

I’m sure translucent types are an easy thing to implement, and the meaning is quite straightforward. So why not do it?

But then I’d expect opaque type Sign >: Boolean <: Boolean to be equivalent to opaque type Sign = Boolean, when in fact it’d be equivalent to type Sign = Boolean.


Good point. Maybe a more appropriate syntax would use <: and >: only to bound the type, and = to define it.

opaque type Sign <: Boolean = Boolean

That would be similar to the syntax of recursive match types, where we have the same “bound + equality” pattern.

Then in fact, we can drop the opaque altogether:

type Sign <: Boolean = Boolean
type ID <: Any = Long

EDIT: actually not, lest we have an inconsistency with bounded match types, which are actually transparent.
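For what it is worth, the “bound + equality” form with the opaque keyword is accepted by Scala 3 as it later shipped (an assumption relative to this thread). Here is a minimal sketch of the Sign example under that syntax. Signs, SignDemo and anyPositive are hypothetical names.

```scala
object Signs {
  // Outside Signs, only the upper bound Boolean is known.
  opaque type Sign <: Boolean = Boolean

  object Sign {
    def of(v: Boolean): Sign = v
  }
}

object SignDemo {
  import Signs.Sign

  val Positive: Sign = Sign.of(true)
  val Negative: Sign = Sign.of(false)

  // Boolean operations are available because Sign <: Boolean is public.
  def anyPositive(lhs: Sign, rhs: Sign): Boolean = lhs || rhs

  // The result of a Boolean operation must be wrapped again to be a Sign.
  val both: Sign = Sign.of(Positive && Negative)

  // anyPositive(true, false) // does not compile: Boolean is not a Sign
}
```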

The tests proposed to check that the design goals have been met include

  • Performance tests should verify that loops producing sequences of composite values do not create a significant GC load.

This is not true with AnyVal, and is the main reason you would want an inline class. Even if you don’t get specialization, at least it would work in cases where the types are known, and in cases where both things are references anyway. That covers a lot of ground; only things like (7153, “wiggle”) are left out and even those would have half the allocations (box 7153, but the JVM could elide the allocation of the tuple more aggressively than it does now with escape analysis).

I agree about the skepticism of the feature arriving, but I think you’re giving a different characterization of the feature (as approximately pointless, because it frequently fails to achieve its primary goal) than the design goals state. Maybe the skepticism is justified, but if so maybe you can link to something showing that they’ve given up on, say, being able to return a couple of values from a function without boxing them.

I am also troubled by the overlap between opaque type aliases and value classes. Not so much for value classes as they are now (here I think opaque types are clearly the better tradeoff overall) but for the as-yet-unknown Valhalla supported variant. But I think in the end the use cases are quite different and the overlap at arity one is more or less accidental.

  • Opaque type aliases guarantee that the type and its alias have the same representation. This is important for performance but also for inter-language interop. For instance, immutable arrays (which will play a big role in our future runtimes) are possible only with opaque types since they rely on the fact that they erase to normal arrays.

  • Value classes avoid boxing in many situations, which is good for performance. But they will always have a different representation than their underlying type(s). Whether that representation is a heap object or a type tag stored aside is immaterial. So, a value class is essentially a tuple of the element types plus a tag. This collapses to just the underlying element type if there is only one element and the tag is statically known. So in that special case a value class representation is the same as an opaque type representation. Everywhere else they are different.


I would like to propose a restriction on when we can “see through” the opaque type alias that I think would simplify the implementation in Scala 2 and 3 significantly.

Right now, we talk about the scope in which we see the underlying type – it’s visible in the companion object. That’s not actually how it’s implemented, and it’s tricky to paper over the “lie” of the encoding. Here’s a test case from my 2.13 prototype that shows the encoding and the problem (see TODO marker):

object LogarithmTest {
  // The previous opaque type definition
  type Logarithm = Logarithm.Logarithm

  object Logarithm {
    self: { type Logarithm = Double } => // reveal underlying type to `self`
    type Logarithm // opaque from outside

    // These are the ways to lift to the logarithm type
    def apply(d: Double): Logarithm = math.log(d)

    def safe(d: Double): Option[Logarithm] =
      if (d > 0.0) Some(math.log(d)) else None

    // This is the first way to unlift the logarithm type
    def exponent(l: Logarithm): Double = l

    // Extension methods define opaque types' public APIs
    implicit class LogarithmOps(val `this`: Logarithm) extends AnyVal {
      // This is the second way to unlift the logarithm type
      def toDouble: Double = math.exp(`this`)

      // TODO: can we make it work when writing `Logarithm` instead of `apply`?
      def +(that: Logarithm): Logarithm = apply(math.exp(`this`) + math.exp(that))
      def *(that: Logarithm): Logarithm = apply(`this` + that)
    }
  }
}

To see through the opaque alias, the last two methods must use this.apply(...) rather than the more idiomatic Logarithm(...), as the latter sees the abstract Logarithm type member (because it’s not a selection on this/self and thus does not see the refinement in the self type).

Could we move the spec a little closer to the encoding and punt on the complexity of making the typing of Logarithm.apply sensitive to the context? (Lukas did suggest one sneaky addition to the encoding that would make this work: add private val Logarithm: this.type = this as a member to the companion :sunglasses: )
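For comparison only, and not as a statement about the 2.13 encoding discussed here: a sketch of a logarithm type in Scala 3 syntax as it later shipped (an assumption relative to this thread). In that version the alias is transparent throughout the enclosing object, so writing Logarithm(...) inside the extension methods typechecks. MyMath and LogDemo are hypothetical names.

```scala
object MyMath {
  opaque type Logarithm = Double

  object Logarithm {
    // These are the ways to lift to the Logarithm type
    def apply(d: Double): Logarithm = math.log(d)

    def safe(d: Double): Option[Logarithm] =
      if (d > 0.0) Some(math.log(d)) else None
  }

  // Extension methods define the public API of the opaque type
  extension (x: Logarithm) {
    def toDouble: Double = math.exp(x)

    // Calling the companion's apply by name works here.
    def +(y: Logarithm): Logarithm = Logarithm(math.exp(x) + math.exp(y))

    // Adding logarithms multiplies the underlying values;
    // inside MyMath this is the ordinary addition on Double.
    def *(y: Logarithm): Logarithm = x + y
  }
}

object LogDemo {
  import MyMath.*

  val product: Double = (Logarithm(2.0) * Logarithm(3.0)).toDouble
  val sum: Double = (Logarithm(2.0) + Logarithm(3.0)).toDouble
}
```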


That’s reasonable. How would you suggest we word it? Can we talk about that limitation without referring to the encoding?

I don’t know yet. I’m willing to work on the spec if we have consensus on this change conceptually.