Proposal for Opaque Type Aliases

Valhalla value classes are nominal: Point(x: Int, y: Int) and Rectangle(l: Int, w: Int) cannot be equal. Tuples are not nominal, so an (Int, Int) instance representing a Rectangle can’t be distinguished from one representing a Point.

So tuples can’t represent value classes, but it might be very useful to be able to represent tuples as value classes.
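To make the structural side of that concrete in today’s Scala (transparent aliases, just for illustration):

type Point = (Int, Int)
type Rectangle = (Int, Int)
implicitly[Point =:= Rectangle] // compiles: the two aliases are indistinguishable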

Case classes extend AnyRef (a.k.a. j.l.Object), while Valhalla value classes do not (logically, though in L-world they do in some sense), so several restrictions that apply to Valhalla value classes would be violated by case classes.

Something new may be needed, like a data class; or case classes would need to be modified in an incompatible way; or extends AnyVal could be adapted, but that may lead to changes and restrictions there as well.

It’s been about 6 months since I read through the Valhalla mailing lists, so I may be a bit out of date, but the big shift to L-world happened before that (it was not completed, but the big shift in thought and the initial 80/20 implementation was done).

If you name the wrapped val member “self” the pollution isn’t that harmful. It makes thing.self equivalent to thing, which is interesting but not terribly surprising.
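A sketch with a current AnyVal wrapper (Wrapped is just an illustrative name):

final class Wrapped(val self: Long) extends AnyVal
val w = new Wrapped(42L)
w.self // 42L; at runtime w is just the Long, so w.self is effectively w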

I’m out of the loop, but can the Valhalla JVM return an unboxed value class instance directly on the stack? There’s nothing inherent in the JVM that prevents a return from leaving more than zero or one slots on the top of the stack. It seems odd to me that this wasn’t leveraged before now, with a generic return_n op that leaves n words containing the return values.

Yes, it will allow a few key things:

Passing composite values around on the stack, so method params and returns don’t have to box our friendly little Point class anymore.

Composite types can be inlined inside arrays, so we can avoid boxing there too. And nested Valhalla types can be fully inlined in the enclosing object, and in arrays too, e.g. the equivalent of

data class Complex(r: Float, c: Float)
data class ComplexLineSegment(start: Complex, end: Complex)

ComplexLineSegment would be representable as four consecutive Float values on the stack, or in an array.

I’m not sure the array format is even forced to be ‘row-major’ (yet?). It’s plausible that the JVM could columnarize: e.g. every 8 Complex items could plausibly be written in memory as r,r,r,r,r,r,r,r,c,c,c,c,c,c,c,c instead of r,c,r,c,r,c,r,c,r,c,r,c,r,c,r,c.
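For illustration, that columnar (“structure of arrays”) layout can be hand-rolled today with parallel arrays; ComplexColumns is my name for the sketch, not a JVM feature:

final class ComplexColumns(size: Int) {
  private val rs = new Array[Float](size) // all r fields, contiguous
  private val cs = new Array[Float](size) // all c fields, contiguous
  def r(i: Int): Float = rs(i)
  def c(i: Int): Float = cs(i)
  def set(i: Int, r: Float, c: Float): Unit = { rs(i) = r; cs(i) = c }
}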

A quick summary for today would be the top of this page: https://wiki.openjdk.java.net/display/valhalla/L-World+Value+Types

    Value Types are small, immutable, identity-less types
    User model: "codes like a class, works like an int"
    Use cases: Numerics, algebraic data types, tuples, cursors, ...
    Removing identity commitment enables optimizations such as
        flattening of value types in containers such as fields or arrays
            reducing cost of indirection and locality of reference with attendant cache miss penalties
            reducing memory footprint and load on garbage collectors
    Combining Immutability and no identity commitment allows value types to be stored in registers or stack or passed by value

The “L-world” change from the first prototype is summarized there in more detail. An even briefer summary:

“Q-world”, the old prototype, introduced a bunch of new bytecode instructions to distinguish value types from object/reference types, and a new class descriptor format for them. This led to lots of issues.

“L-world” makes everything an object again, at least as far as bytecode for method signatures is concerned. The object descriptors are re-purposed to additionally cover the new value classes, by adding extra info to the object class descriptor. It is then up to the JVM to optimize the call sites and optimize away boxing. Now only two new bytecode instructions are added: one to create a default instance (every value type has to have a default; last I checked this is just all zeroes in memory, so that empty arrays behave as if filled with default values), and one to copy a value while changing one field, giving copy-on-write semantics.
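In case-class terms, a loose Scala analogue of those two instructions:

case class Complex(r: Float, c: Float)
val zero = Complex(0.0f, 0.0f)  // roughly what the “default instance” instruction yields
val moved = zero.copy(r = 1.0f) // roughly what the “copy one field” instruction enables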

EDIT:
Another thing to note: because these are meant to be substitutable when the values are the same, equals() on a value type that contains a reference implies reference equality on the inner reference (delegating to the inner reference’s own equals() method could break the contract otherwise). There has been some debate about this, however.
This is one place where Scala case classes, as they are today, would be a mismatch.
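A sketch of that mismatch (Box is illustrative):

case class Box(s: String)
Box(new String("a")) == Box(new String("a")) // true today: case class equals delegates to String.equals
// under the Valhalla semantics described above, a value type holding those two
// distinct String references would compare them by identity and be unequal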


Valhalla value classes are closer to structs than to opaque types; they are solving different problems. Furthermore, it’s important to understand that value classes don’t exist on other platforms (Scala.js and Scala Native, unless Native wants to reimplement the concept).

Opaque types are incredibly important imho; they have a very solid foundation that is easy to understand, and they are the only way to completely guarantee that something won’t box. They are also very important for interoperability on Scala.js (and I assume Scala Native as well).

Opaque types would even help current Scala, as there are so many workarounds when it comes to JVM interop that could be greatly simplified with opaque types.
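For concreteness, a minimal sketch in the proposed syntax (UserId is just an illustrative name):

opaque type UserId = Long
object UserId {
  def apply(l: Long): UserId = l   // only the companion sees UserId = Long
  def value(id: UserId): Long = id
}
// UserId erases to a plain Long, so JVM and Scala.js interop see no wrapper and no box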


Is it possible to avoid the new keyword opaque, in favor of reusing existing self-type syntax?

Can we allow type ID = Any { this: Long => } instead of opaque type ID = Long?

The self-type syntax has an additional advantage: it allows parts of the supertypes or members to be transparent.

type ID = Any { this: Long => }

implicitly[ID <:< AnyVal] // should not compile

type ID2 = AnyVal {
  // The actual self-type is `AnyVal { def +(rhs: Long): Long } with Long`,
  // which can be proven to be the same type as `Long` when calculating its `glb`
  this: Long => 

  def +(rhs: Long): Long
}

implicitly[ID2 <:< AnyVal] // should compile 

def minus1(id2: ID2): Long = {
  id2 - 1L // should not compile.
}

def plus1(id2: ID2): Long = {
  id2 + 1L // should compile and should not be a reflective call, because the backend knows `ID2 =:= Long`
}

We can also allow casting a trait or a class to its self-type in its companion object for consistency.


Would the underlying type be accessible via the opaque type’s TypeTag[T]?

I’m trying to think how Spark’s Encoder[T], and Flink’s TypeInfo[T] serialization typeclasses can be derived for something that is an opaque type, or something that contains opaquely typed values.
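For concreteness, one shape that derivation could take, assuming the companion sees through the alias and reusing Spark’s existing Encoder for Long (UserId is illustrative):

import org.apache.spark.sql.{Encoder, Encoders}

opaque type UserId = Long
object UserId {
  // inside the companion UserId = Long, so an Encoder[Long] is an Encoder[UserId]
  implicit val encoder: Encoder[UserId] = Encoders.scalaLong
}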

I think that, strictly speaking, there should be no TypeTag available for an opaque type. Only a WeakTypeTag. However there are some inconsistencies in this area. See this PR for some examples.

But I think you should be able to manually make a TypeTag or a ClassTag available if you want.

opaque type A = Int
object A {
  implicit val tTag: TypeTag[A] = {
    // shadow the implicit being defined, so that `typeTag[Int]` below doesn’t
    // resolve to this very definition (ideally this line won’t be necessary)
    val tTag = null
    typeTag[Int]
  }
}

TypeTags are available for all abstract type members, so not having them would be an inconsistency… But maybe there’s no point discussing TypeTags, since Dotty will kill off both scala-reflect and TypeTags without a replacement anyway?

Wait, what? Where are they saying that? They’re killing macros, but I hadn’t heard anything about that affecting the TypeTag family…

I don’t know how to link directly to messages in Gitter, but @smarter recently said in scala/contributors:

There’s no runtime reflection at all in Dotty
and no desire to implement it
But it’s not been formally deprecated, so it might come back from the dead
Runtime reflection in Scala 2 is an endless source of bugs: scala/bug#10766

According to this StackOverflow answer, TypeTags are generated by the compiler (including by Dotty):
https://stackoverflow.com/questions/50138533/how-does-one-completely-avoid-runtime-reflection-in-scala/50161155#50161155

The fact that they’re available for abstract type members is the inconsistency IMHO.

TypeTag will be superseded by quoted.Type in the new principled metaprogramming framework we are working on. We still need to work out the details of a migration strategy. I guess we’ll either have to keep TypeTag around, or we can make it an alias of quoted.Type.

Why not, though? If an abstract type member has a stable path (e.g. it’s a member of a global object), then generating its TypeTag is repeatable; it’s not weak, because it’s stable. That TypeTag always contains the same info when summoned from different parts of the program and can be meaningfully compared by subtype check: it will return true against another tag summoned elsewhere, or against compatible type bounds. Weak tags for type parameters, by contrast, become invalid as soon as they go out of scope.

In practical terms, not generating TypeTags for opaque types means I can’t bind them in my DI framework, even though they’re stable in =:=; not generating them for type members might mean a couple of features just going away, so it’s bad-bad news for me.

Hmm, perhaps you’re right.
But then I think the default TypeTag for an opaque type shouldn’t expose its underlying type, but should instead handle it like other abstract types, no? So if you want to expose the underlying type via the TypeTag, you still have to define a custom TypeTag.

Well yeah, for my use case I don’t care about the content of the tag as long as subtype checks are available for all stable paths. But someone else might want to look inside the opaque type, e.g. to port standalone newtype deriving from Haskell. In that case, having the content of the opaque type available in the WeakTypeTag would help the implementor; but for that we’d need to add yet another constructor to scala.reflect.api.Type, and maybe that’s going too far?

This opaque type proposal is just reinventing features and syntaxes that already exist in Scala 2 (with a new keyword).

Opaque types are just self-types for type aliases.

Since we already have self-types for traits, which express type constraints that can only be seen inside the traits themselves, we can extend the self-type syntax to type aliases, to describe type constraints that can only be seen inside the type aliases themselves.

Since we want the type constraints to also be visible from companion objects, we can additionally allow access modifiers on the self-types.

Instead of opaque type ID <: Any = Long, we should reuse the current syntax:

type ID = Any {
  private[ID] this: Long =>
}
object ID {
  // This compiles because the `private[ID]` modifier grants the conversion inside the ID companion object.
  def toLong(id: ID): Long = id
}
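Outside the companion the constraint would be invisible, so under these proposed semantics the reverse direction should fail:

def toLongElsewhere(id: ID): Long = id // should not compile: the Long constraint is private to ID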

The self-type solution is just a combination of current syntaxes, which, I think, is more elegant than introducing new keywords.


The current Scala 2 language has some arbitrary inconsistent decisions.

  1. Constructors are allowed in classes but not in traits.
  2. Self-types are allowed in classes and traits but not type aliases.
  3. Access modifiers are allowed on class and trait members and on class primary constructors, but not on self-types.

The first inconsistent decision will be fixed in SIP-25. I hope we can also fix the other inconsistent decisions instead of introducing more.

I suppose you can still use standard Java reflection, right?
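For opaque types specifically, assuming they erase to their underlying type as proposed, Java reflection would only ever see the underlying type:

opaque type UserId = Long
def first(ids: Array[UserId]): UserId = ids(0)
// at runtime, java.lang.reflect reports first as taking a long[] and returning a long;
// the opaque alias exists only at compile time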