Pre-SIP: Unboxed wrapper types

Thanks @S11001001 for the insightful comments.

For the record, I’d like to publicly thank you for your blog post. You did a really good job at describing the problems with value classes, and your analysis motivated me to kick start the proposal with Erik. I’ll add this acknowledgement to our current proposal.

I didn’t want to add lengthy examples, to keep the code snippets easy to read, but I’ll add wrapFoo and mdl. They illustrate what we meant by:

Note that the rationale for this encoding is to allow users to convert between the opaque type and the underlying type in constant time. Users have to be able to tag complex type structures without having to reallocate, iterate over, or inspect them.
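For illustration only (tagAll below is a hypothetical stand-in, not the wrapFoo/mdl examples from the blog post), the intended shape of such a conversion is roughly:

opaque type Logarithm = Double

object Logarithm {
  // Inside the companion, Logarithm and Double are the same type, so a whole
  // List[Double] can be re-tagged as List[Logarithm] in constant time,
  // without allocating, iterating over, or inspecting it.
  def tagAll(xs: List[Double]): List[Logarithm] = xs
}

Whether such a definition typechecks with only the synthesized conversions in scope is the same question raised about wrapSomeMap below.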

If the type equivalence between the dealiased type and the opaque type definition is described by the user, the implicit conversions will not be synthesized, as explained in the proposal. @non has also mentioned the possibility of user-defined =:= et al. instances in his last commit here.

I will explore this idea, will talk to the Scala team about it, and see if this can fit in the implementation without being part of the proposal. The proposal is already ambitious as it is :wink:.

Indeed, it may be a problem to typecheck wrapSomeMap only with those private synthetic implicit conversions in scope. I’ll have a closer look and try to find a way to make it typecheck.

Interesting observation, but I think this will not be a problem in the Scalac implementation. When triggering the implicit search of Ordering[Logarithm] inside the opaque type companion, the typer doesn’t know yet that Logarithm =:= Double, so it will look for an instance of that implicit, it will fail, and it will try to apply the implicit def conversion from Logarithm => Double. The result of this last search will be Ordering[Double], taken from scala.Predef. In Dotty, this could be a problem if the first step of the implicit search sees Ordering[Double] =:= Ordering[Logarithm].

Can’t this happen in other cases as well, specifically when relying on + being provided by an implicit?

In my opinion, if someone writes a public definition without a type, they’re looking for trouble. I strongly discourage it. The case you point out cannot be addressed in a principled way, or at least I don’t see how we could.

What I would propose is that we have enabled-by-default warnings for users who define public methods in opaque type companions without an explicit return type. I believe this warning could be given a bigger scope, too, and warn about these cases all over your program.
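For example (a hypothetical sketch using the Logarithm type from the proposal), this is the kind of leak such a warning would catch:

opaque type Logarithm = Double

object Logarithm {
  // Inside the companion Logarithm is just Double, so the inferred result
  // type here is Double, silently exposing the representation to callers:
  def half(l: Logarithm) = l / 2.0

  // With an explicit return type, the representation stays hidden:
  def halved(l: Logarithm): Logarithm = l / 2.0
}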

The idea is that those users that want to specify upper and lower bounds are forced to define a type member in a trait:

trait T {
  type OT <: Any
}

and then implement it:

object T extends T {
  opaque type OT = String
}

just as you would with type aliases.

Opaque types need to be defined inside an entity after all, so the overhead of adding this type member in a trait is minimal. Would this cover all the scenarios you’d like to use upper and lower bounds on?

Interesting, this is the first time I’ve heard about Flow.

We can certainly consider doing so, but I’m not sold on its utility. One of the things I like the most about opaque types is that they have the same syntax (semantics-wise) as type aliases, and don’t require explicit type ascriptions. If we add this, we’re creating a new mental model of opaque types, and users need to learn it. The fewer rules, the better.

As I explained in the meeting, this has several problems:

  • APIs of different opaque types get mixed, hampering readability of the code.
  • Users cannot define a method tag for two different opaque types that have the same underlying type. The same happens with implicits. (See the sketch after this list.)
  • Use sites of these opaque types do not know where these methods are defined. It’s way clearer to see Logarithm.tag than tag somewhere in your program.
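
To illustrate the second point, a hypothetical pair of opaque types over the same underlying type:

object measures {
  opaque type Meters = Double
  opaque type Seconds = Double

  // These two cannot coexist in the same prefix: they differ only in their
  // result type, which Scala does not allow for overloads in one scope.
  // def tag(d: Double): Meters = d
  // def tag(d: Double): Seconds = d
}

With companions, each type would simply get its own tag in its own companion object.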

I don’t like the idea of defining multiple opaque types in the same prefix. I’m personally in favor of opaque type companions, and I think companions are a natural way of thinking about Scala code. Its addition does not add overhead to the language; instead, it creates a more consistent language that converges towards common and widespread language features.

As @adriaanm mentions in the meeting, a non-negligible fraction of Scala developers, specifically beginners, already think that an object with the same name as a type alias is a companion.

I haven’t given this too much thought, but it will inherit it. If you want to override it, you also can. This is consistent with the behaviour of type aliases.

Yes, and this needs to be made more clear in the proposal. @xeno-by and @dragos pointed it out in an email before the meeting. The golden rule of opaque types is: the runtime will box/unbox whenever the underlying type needs to. Hence, they do not add extra boxing.

Even in the cases where boxing does happen, note that primitive boxing is cheaper than what AnyVal does, and therefore faster.

For example, let’s take the Logarithm example from the proposal and inspect its bytecode. In the value class example, the compiler triggers the instantiation (via new) of every logarithm in the following expression: val xs = List(Logarithm(12345.0), Logarithm(67890.0)).map(_ + x). This is not the same bytecode as for opaque types, which use scala.Predef.double2Double to cast scala.Double to java.lang.Double, and whose implementation is just a cast: (d: scala.Double).asInstanceOf[java.lang.Double]. This cast is cheaper for the runtime than the new instantiation because:

  • It is intrinsified and is a fundamental mechanism of the JVM.
  • It doesn’t have to go through the initializers of the value class and the extended classes (traits).
  • When you instantiate a new object, you waste a lot of memory for object headers, fields, metadata, etc. I haven’t checked yet, but my guess is that java.lang.Object is optimized to avoid all this waste, therefore being easier on memory consumption.

Opaque types have more non-obvious advantages over value classes, if we follow the reasoning of the golden rule for opaque types. If we compile val xs = List(Logarithm(12345.0), Logarithm(67890.0)).map(_ + x) with an Array instead of a List, we have zero boxing/unboxing because arrays are specialized.

Opaque types do not solve the problem of boxing/unboxing (this is a problem of the runtime), but they are a mechanism that adds wrapper types without any extra overhead that would not have been incurred had the underlying type been used directly.

Thanks, I’ll add this! I mentioned this in the meeting, but I forgot to make it explicit.

Instead of adding a keyword opaque, an alternative could be to combine type bounds with type definition:

// An opaque type:
type A <: Any = Double
// A translucent type:
type A <: Double = Double

This perfectly mirrors the way one can define terms of a certain type but annotate the definition with a wider type, as in val a: AnyVal = 1.0.

Yet another possibility, inspired by constructors with restricted visibility this time:

type A <: Double private >: Double
// in principle (but not practice?), A >: Double <: Double is equivalent to A = Double

The advantage now would be that one could control the visibility in a fine-grained way using traditional Scala mechanisms: private only visible in the companion, protected visible by subclasses, private[foo.bar] for a package, etc.

1 Like

I just opened a new conversation to add a parametric top type, which is relevant for the discussion here. I have already discussed this with @jvican and hashed out with him some of the details of how that would relate to value classes.

The gist is, I was quite in favor of the proposal yesterday, but now think we have found something better.

[bikeshedding] Another alternative syntax that doesn’t require a new keyword:

package object opaquetypes {
  type Logarithm = private Double
}

I feel like new type Logarithm doesn’t fully correspond to Haskell’s newtype, since newtype in Haskell is not opaque.

Using the private keyword would also be somewhat consistent with private inheritance (as in C++) if Scala ever gets it:

class Foo extends private Utils
2 Likes

How about this?

private[packagename] trait InternalUtils extends Utils

class Foo extends InternalUtils

Please, let’s not discuss syntax until we’re done with the semantics. I’d like to keep this discussion focused and technical, for now.

3 Likes

One of the benefits (from a programming perspective) of using AnyVal for wrapper types is that the code is very compact. For example, suppose one is wrapping a user ID:

case class UID(value: String) extends AnyVal

That single line succinctly gives you an apply(String) method to wrap values, and a value method to unwrap them. The equivalent code for opaque types is substantially more verbose.

opaque type UID = String

object UID {
  def apply(s: String): UID = s

  implicit class Ops(val self: UID) extends AnyVal {
    def value: String = self
  }
}

If creating several wrapper types (e.g. for half a dozen types in a user record), opaque types add significant source code burden.

Is this use case sufficiently important to warrant its own syntax/syntax extension (for example, opaque case type Foo = Bar)? (not trying to start a discussion about possible syntax; just want to discuss the possibility of having some syntax)

1 Like

(EDIT: I had misunderstood the visibility rules that value classes are allowed to use, so these examples will work with value classes, modulo some potential boxing in some situations. See ghik’s reply.)

One interesting point here is that the design of value classes is such that you are required to have unconditional public wrappers and unwrappers. (This is because the internal implementation’s $extension methods require third parties to be able to wrap/unwrap the types.)

By contrast, the proposal here gives the user control of when (or if) wrapping and unwrapping is possible. Consider cases where we only want to allow certain values to be wrapped:

opaque type PositiveLong = Long

object PositiveLong {
  def apply(n: Long): Option[PositiveLong] =
    if (n > 0L) Some(n) else None

  implicit class Ops(val self: PositiveLong) extends AnyVal {
    def asLong: Long = self
  }
}
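
Hypothetical usage from outside the companion:

PositiveLong(42L).map(_.asLong)   // Some(42L)
PositiveLong(-1L)                 // None
// val p: PositiveLong = 5L       // does not compile: a Long is not a PositiveLong here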

Relatedly, we might choose to use Int to encode an enumeration or flags, but want to ensure users can only use a small selection of actual values (to prevent users from wrapping arbitrary values):

opaque type Mode = Int

object Mode {
  val NoAccess: Mode = 0
  val Read: Mode = 1
  val Write: Mode = 2
  val ReadWrite: Mode = 3

  implicit class Ops(val self: Mode) extends AnyVal {
    def isReadable: Boolean = (self & 1) == 1
    def isWritable: Boolean = (self & 2) == 2
    def |(that: Mode): Mode = self | that
    def &(that: Mode): Mode = self & that
  }
}
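
And hypothetical usage from outside the companion, where arbitrary Ints are rejected:

val rw: Mode = Mode.Read | Mode.Write
rw.isReadable && rw.isWritable    // true
// val bad: Mode = 7              // does not compile: an Int is not a Mode outside the companion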

Finally, we might not want users to be able to unconditionally extract back to the underlying values. In this case, we can restrict access to code in the db package:

package object db {
  opaque type UserId = Long

  object UserId {
    def apply(n: Long): UserId = n
    private[db] def unwrap(u: UserId): Long = u
  }

  def lookupUser(db: DB, u: UserId): Option[User] = ...
}
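
Hypothetically, from outside the db package:

import db._

val id: UserId = UserId(5L)   // wrapping is public
// UserId.unwrap(id)          // does not compile here: unwrap is private[db]
// val n: Long = id           // does not compile: UserId is opaque outside its companion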

These are all interesting use cases that are not possible (except by convention) with value classes. I agree that the enrichment is a little bit cumbersome (which is why the first version of our proposal included it) but on balance I think the added flexibility and power of opaque types is worth a bit of verbosity for enrichment. In the future, if we improve the story with value classes and extension methods, opaque types will be able to reap the benefits.

1 Like

What do you mean? I thought value classes can have private constructor and wrapped member, e.g.

class Mode private(private val raw: Int) extends AnyVal
object Mode {
  val NoAccess = new Mode(0)
  val Read = new Mode(1)
  val Write = new Mode(2)
  val ReadWrite = new Mode(3)

  def apply(raw: Int): Option[Mode] =
    if(raw >= 0 && raw <= 3) Some(new Mode(raw)) else None
}
1 Like

Note that in your example, there is no way to make an unboxed Mode. To return Option[Mode] you must box, no?

I could have also done this:

def apply(raw: Int): Mode = {
  require(raw >= 0 && raw <= 3)
  new Mode(raw)
}

which incurs no boxing, but is less typesafe.

I am totally in favour of the opaque type proposal and I fully understand its superiority to value classes in terms of performance. I simply didn’t understand @non’s argument about “unconditional public wrappers” in value classes.

I am very much behind opaque types, and I don’t actually think they add too much verbosity in most situations. However, for basic wrappers, they do.

Let me motivate my concern with the following example:

case class BrittleUser(id: Long, firstName: String, lastName: String, email: String)

case class User(id: User.Id, firstName: User.FirstName, lastName: User.LastName, email: User.Email)

object User {
  case class Id(value: Long) extends AnyVal
  case class FirstName(value: String) extends AnyVal
  case class LastName(value: String) extends AnyVal
  case class Email(value: String) extends AnyVal
}

By using a few short wrapper types, you get type safety, preventing you from getting the order of the fields wrong. However, to accomplish the same with opaque types is… a little bit ridiculous.
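
For comparison, here is a sketch of just one of those four wrappers written with opaque types (following the encoding used earlier in this thread); the other three would look the same:

object User {
  opaque type Id = Long

  object Id {
    def apply(value: Long): Id = value

    implicit class Ops(val self: Id) extends AnyVal {
      def value: Long = self
    }
  }

  // ...and the same again for FirstName, LastName and Email.
}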

I like the idea of opaque types, and I think they add flexibility at extremely low cost for types which are more than a simple wrapper (such as your Mode type). However, in some situations, they lose to AnyVals in source maintainability even though they win in performance.

1 Like

My thoughts on that SIP:

  • I really like the parallel of "class => actual reified/allocatable thing, type => compile-time thing with no runtime representation"

  • AnyVal boxing unpredictably is a huge downside. Good performance is good, bad performance is meh, but unpredictable performance is the absolute worst.

  • “Badly” behaved isInstanceOf/asInstanceOf is much less of a problem than people think.

    • Those two methods are badly behaved in Scala.js, at least compared to Scala-JVM: (1: Int).isInstanceOf[Double] == true, anyThingAtAll.asInstanceOf[SomeJsTrait] never fails, etc. Of the things people get confused with about Scala.js, this doesn’t turn up that often.

    • Scala-JVM isInstanceOf/asInstanceOf are already a bit sloppy due to generic type-erasure, e.g. List[Int]().isInstanceOf[List[String]] == true. Opaque wrapper types would simply be extending the type-erasure to non-generic contexts

    • Being able to say "myStringConstant".asInstanceOf[OpaqueStringConstantType] is widely used in Scala.js by “normal” users e.g. for wrapping third-party library constant/enum/js-dictionary-like things in something more type-safe. It’s actually very, very convenient, and empirically the odd behavior of isInstanceOf/asInstanceOf when you make mistakes in such cases just doesn’t seem to cause much confusion for people.

  • In a similar vein, I would be happy for Array[MyOpaqueTypeWrappingDouble]().isInstanceOf[Array[Double]] == true. I want the predictable performance more than I want the predictable isInstanceOf behavior: if I wanted the other way round, I would use AnyVals or just normal boxes instead!

  • I think @NthPortal’s concern about boilerplate is valid. While it’s “ok” to provide the low-level opaque-type and then tell people to build helpers on top using implicit extensions, it would be “nice” to have a concise syntax for common cases: one of which is opaque type + manual conversions to-and-from the underlying/wrapped type.

    For example, given

    opaque type Id = Long
    
    object Id {
      def apply(value: Long): Id = value
      implicit class Ops(val self: Id) extends AnyVal {
        def value: Long = self
      }
    }
    

    Would it be possible to extract all the boilerplate def apply/implicit class into a helper trait?

    opaque type Id = Long
    object Id extends OpaqueWrapperTypeCompanion[Long, Id] // comes with `def apply` and `implicit class`
    
  • If we allowed asInstanceOf to be sloppy and let you cast things between the underlying and opaque types, we could dispense with the special “underlying type can be seen as opaque type in companion object, and vice versa” rule: if you want to convert between them, use an asInstanceOf and be extra careful, just like when using asInstanceOf in any other place.

    • This asInstanceOf behavior may already be unavoidable anyway (from an implementation point of view) if you want to avoid boxing in all cases, e.g. when assigned to Anys, or when put in Array[MyOpaqueType]s.

    • And asInstanceOf already has the correct connotation in users’ heads: “be extra careful here, we’re doing a conversion that’s not normally allowed”

    • This also obviates defining a companion object for the common case of “plain” newtypes, without any computation/validation: just use (x: Double).asInstanceOf[Wrapper] and (x: Wrapper).asInstanceOf[Double] to convert between them. If someone wants to add custom computation/validation, they can still write a companion and do that. In particular, @NthPortal’s examples could then just use casting and not define a bunch of boilerplate companion objects.

  • It might be worth exploring a custom syntax to smooth out the very-common case of “opaque type with methods defined on it”, rather than relying on implicit classes in the companion to satisfy this need.

    People could continue to use implicit class extension methods when extending the types externally, but I think having some operations “built in” is common enough to warrant special syntax. I don’t have any idea what that might look like.

5 Likes

Yeah, this is my misunderstanding of the value classes spec.

Value classes do have to expose a public wrapper and unwrapper to the JVM, but you’re correct that they aren’t exposed to Scala. I’ll edit my post to correct it.

For the curious, here’s some Java code that constructs an invalid Mode and is able to access its internals (but this wouldn’t work in Scala):

package example;

import example.Mode;

class Test {

    public static void test() {
        Mode m = new Mode(999);
        System.out.println(m.foo$bar$Mode$$raw());
    }
}

I think casting will largely “work” in the same ways it does now (i.e. you can cast between erased and unerased types on the JVM and things won’t throw a CCE):

def forget(xs: List[Int]): List[Object] =
  xs.asInstanceOf[List[Object]]
def remember(xs: List[Object]): List[Int] =
  xs.asInstanceOf[List[Int]]

remember(forget(1 :: 2 :: 3 :: Nil))
// res2: List[Int] = List(1, 2, 3)

I don’t think it’s worth giving up on the “transparency in the companion, opacity elsewhere” rule, since it makes it easy for authors to signal which operations they’d like the compiler to consider type safe. Library consumers will still have most of the same escape hatches they have elsewhere (and the same warnings about using them).

1 Like

About the syntax boilerplate, maybe it could be possible to allow opaque as a modifier for a class to mean “an opaque type + the standard boilerplate”, similarly to how implicit class means “a class + an implicit def”.

For example:

opaque [implicit] class T ([private] val a: A) {
	{defs...}
}

Could be syntactic sugar for:

opaque type T = A
object T {
	[implicit] def apply(a: A): T = a
	
	implicit class T$Ops (val $self: T) extends AnyVal {
		inline(?) [private] def a: A = $self
		{defs...}
	}
}

Requirements of value classes would apply equally to “opaque” classes:

  • A single val parameter
  • No fields definitions, only methods
  • Cannot be nested inside another class

Ideally, this inside the opaque class body would be rewritten to $self and have type T.

Adding implicit to the opaque class definition would add implicit on the apply method, meaning that the value can be implicitly wrapped, otherwise explicit wrapping is required.

Adding private to the constructor parameter would prevent the underlying value from being accessed directly and require explicit accessors to be defined.

Maybe we could even imagine opaque class T private ( ... ) that would mean that the apply method synthesized on the companion object is also private, requiring custom wrapper to be defined on the companion object (eg. to perform validation).

While this syntax is arguably more complex than the single opaque type definition, I believe it allows many common use cases of opaque types to be expressed with a lot less boilerplate. It would also be syntactically very similar to current value classes, meaning that converting code would be as easy as replacing the extends AnyVal clause with opaque to get the unboxed semantics.

This syntax might also scale to future JVM-level value classes by allowing more than a single parameter in the class constructor, but who knows. This may not be a goal for this feature.

A last idea: if some people are not willing to introduce opaque as a keyword, maybe the inline keyword from Dotty could be used instead (an inline class is a class that disappears at runtime); it obviously works a lot better in the inline class form than in the inline type one.

4 Likes

Is there really no love for macro annotations here? (ping @xeno-by)
It would be straightforward to make the syntax proposed by @galedric work with an @opaque macro to annotate class definitions (which would then expand into opaque types).

@opaque class T(val a: A) { ... }  // expands into opaque type with helpers

Why make the spec of the language more and more complicated, when the problem is simply boilerplate, a problem that is basically solved by macro annotations? IMHO the language should continue in its original goal of being scalable, providing simple but powerful tools to be used as primitives for building advanced features –– as opposed to being an ad-hoc assembly of specific features from someone’s wish-list at instant t.

7 Likes

I was about to write a similar reply, but I’ll instead just second this suggestion. I’d prefer that opaque types form a foundation that things like newtypes or other projects can build on to reduce boilerplate or provide specific semantics that folks want.

It seems like this shouldn’t be too hard. Maybe someone with a bit more experience can weigh in on this?

One reason including this kind of syntax in this SIP might be difficult is that we’ve already received feedback that the first version (which included this kind of syntax) overlapped too much with value classes, and that the feature didn’t seem sufficiently novel. By focusing on the types themselves (and introducing the opaque type companion) we can keep things more orthogonal, and leave this part of the design space to either future value class improvements (another SIP anyone?) or to macros or some other higher-level library.

I think that’s a really good solution!

Can opaque types be parameterized?

A possible use could be for complex numbers of any numeric type

opaque type Complex[A] = (A, A) // possibly Complex[A: Numeric]?

object Complex {
  def apply[A: Numeric](real: A, imaginary: A): Complex[A] = (real, imaginary)
  def apply[A: Numeric](pair: (A, A)): Complex[A] = pair

  implicit class Ops[A](val self: Complex[A]) extends AnyVal {
    def real: A = self._1
    def imaginary: A = self._2
    def +(that: Complex[A])(implicit num: Numeric[A]): Complex[A] =
      (num.plus(self._1, that._1), num.plus(self._2, that._2))
    // More ops here...
  }
}
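
A possible use (hypothetical, assuming the definitions above typecheck as written):

val c1 = Complex(1.0, 2.0)
val c2 = Complex((3.0, 4.0))
val sum = c1 + c2                 // (4.0, 6.0) under the hood, still just a tuple
(sum.real, sum.imaginary)         // (4.0, 6.0)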

Alternatively, one could imagine a type for unsigned primitives

opaque type Unsigned[A] = A

object Unsigned {
  def apply(value: Int): Unsigned[Int] = value
  def apply(value: Long): Unsigned[Long] = value
  // ...

  implicit class Ops[A](val self: Unsigned[A]) extends AnyVal {
    // math ops here
  }
}