Pre-SIP: Parametric Top

odersky · September 22, 2017, 12:53pm

Several members in the SIP committee yesterday were worried that opaque types in https://docs.scala-lang.org/sips/opaque-types.html duplicated a significant amount of the functionality of value classes, and discussions I had today, notably with @julienrf echoed that sentiment. An earlier iteration of that proposal indeed made opaque types look like value classes (I think it was called “Unboxed value classes”). @jvican, do you still have a link to that document?

When I saw the earlier proposal, I was not convinced because the restrictions imposed on these unboxed value classes seemed ad-hoc. By contrast, I rather liked the opaque types proposal, so was on the side of people preferring it. But I have now realized that the restrictions proposed for unboxed value classes are in fact natural consequences if these new unboxed value classes extend not AnyVal but a new, completely parametric Top type. This very much echoes the original post by @S11001001 on The High Cost of AnyVal subclasses.

The idea of a parametric top type was brought up in Dotty issue 2047 - it’s a rather long comment thread; best search for AnyObj to find where the discussion starts. A true top type is useful not just for type class coherence, the topic of #2047, or for unboxed value classes. It’s much more fundamental than that because it gives you theorems for free by ensuring parametricity.

In the Dotty issue, I proposed to keep Any as the top type of Scala’s type lattice, but strip it of all its methods. Instead there would be a new type AnyObj under Any which would get the ==, !=, equals, toString, hashCode, asInstanceOf and isInstanceOf methods. If we started from scratch that would still be my preferred scheme, but it’s probably too hard to migrate to it. So in what follows I propose to leave Any as it is and introduce a new, fully parametric type Top.

The top of Scala’s type hierarchy would look as follows:

      Top
       |
      Any      (= AnyVal)
       |
      Object   (= AnyRef)

There’s no need for AnyVal as a separate type anymore, but we can keep it around as an alias of Any.

Classes extending either Any/AnyVal or Top, but not extending Object, are called value classes. The usual value class restrictions apply to them; in particular they must be final. Traits can as before extend Any instead of Object, they are then called universal traits. Traits cannot extend directly the Top type.

The Top type does not have any methods. Because it has no isInstanceOf or == method, it is impossible to pattern match on a scrutinee of Top type. I believe that’s all we need to say about it!

It seems if we come back to the earlier proposal of unboxed value classes, all of the restrictions that were imposed are in fact consequences of the way Top is defined. So the unboxed value class proposal now looks very natural. Value classes have to extend either Any or Top; if they extend Top we have a guarantee that they can always be represented as their underlying type, no value-class specific boxing is needed.

Coming back to Top. Currently the rule for an unbounded abstract type or type parameter such as [T] is that it expands to [T <: Any]. I would love to change this to [T <: Top], but realize that this would also cause considerable migration headaches. So for the moment I propose to leave this as is, but consider cleaning up this aspect at some later time.

sjrd · September 22, 2017, 1:24pm

This is interesting. My main interrogation after a first reading is: what do we do about collections, and in general the ecosystem of libraries?

The opaque type alias proposal has one advantage over the Parametric Top: you can immediately use opaque type aliases in existing APIs, in particular collections, without any changes to those. With a Parametric Top, unless you change the collections to have their T <: Top, you won’t be able to declare a List[Foo] where Foo is an unboxed value class. But then if you do that, what happens to List.contains?

This is a major can of worms. In a clean design I would be all for it, but with that pragmatic mind of mine, I fear that a parametric Top will badly interfere with the ecosystem.

jvican · September 22, 2017, 1:43pm

Yes, this is a link to the document. It’s heavily influenced by the proposal of value classes – it follows the same structure and it’s less technical than our last version of the proposal.

odersky · September 22, 2017, 1:47pm

Ah yes, that is a major can of worms! It would go away to a large degree if [T] meant [T <: Top]. So maybe that’s a good reason to do that immediately rather than waiting for it. We do incur other incompatibilities then. I believe a rewrite tool can help fixing these, though.

In light of this, we might compromise and allow asInstanceOf as the only (unsafe, and not-recommended) method on Top. We could then expand [T] to [T <: Top] and insert _.asInstanceOf[Any] casts where a value of type T was used as an Any. Inserting the casts could be automated in a rewrite tool.

I believe we should be able to do that for Dotty. Can we do it for Scala 2? I am not certain.

Unfortunately that now puts us in the unenviable position that we can either add something that’s clearly valuable now, or wait for something else that is even cleaner but not compatible with the status quo. And if we go for both, we get unnecessary duplication.

odersky · September 22, 2017, 2:12pm

There’s one area where opaque type aliases are riskier than the Top type proposal. Consider:

val x: Any = Logarithm(10)
x match {
  case x: Double => ... // unsafe conversion!
}

I.e. we can detect the hidden representation using pattern matching and use it for unsound code. The Top type proposal does not have that problem.

Jasper-M · September 22, 2017, 2:41pm

From the moment you introduce asInstanceOf you could do

val x: Top = Logarithm(10)
try {
  val y = x.asInstanceOf[Double]
  ...
} catch {
  case e: ClassCastException => ...
}

Though granted, it’s a lot harder to fall in that loophole by accident. But you can still write the unsound code.

odersky · September 22, 2017, 3:05pm

@Jasper-M Sure. That’s the purpose of asInstanceOf. It’s the universal escape hatch, which comes with no guarantees.

non · September 22, 2017, 3:58pm

I agree this is not great, but this kind of pattern-matching can already defeat attempts at parametric design by allowing matching on post-erasure types:

def weird[A](xs: Array[A]): Array[Object] =
  xs match {
    case ys: Array[Object] => ys
  }

weird(Array("arrays", "are", "invariant"))
// Array[Object] = Array("arrays", "are", "invariant")

def weirder[A](a: A): AnyRef = a match {
  case y: Object => y
}

val x: Int = 3
weirder(x) // AnyRef = 3

From my point of view, if a caller wants a type that has a representation that can be matched at runtime, they should be using case classes or similar, not opaque types.

yangbo · September 22, 2017, 4:13pm

Is it possible to provide an explicit =:= between the underlying type and value type?

=:= will bring the ability of substitute, which is useful when you want to lift a type class.

TomasMikula · September 22, 2017, 4:59pm

You could have an opaque type and provide pattern matching on it via an extractor that converts it to something pattern-matchable.

opaque type Maybe[A] = Option[A]
object Maybe {
  def unapply[A](ma: Maybe[A]): Some[Option[A]] =
    Some(ma)
}

// then
val ma: Maybe[A] = ???
ma match {
  case Maybe(Some(a)) => ???
  case Maybe(None)    => ???
}

Of course, it would be nice if extractors statically known to return Some (such as the one above) would preserve exhaustivity checking (see #10502).

S11001001 · September 22, 2017, 5:13pm

Whether you can soundly substitute is a big sticking point, and it affects all sorts of things other than typeclasses; this was the major problem with “unboxed value classes” that was fixed so nicely by the followup opaque types proposal.

So returning to unboxed value classes would restore those problems as well, which go beyond merely whether the compiled code happens to box them.

You can do this with abstract types, too, in current Scala. It would be nice to be able to ban that, too.

That you can optionally supply =:= or <:< instances in the type companion providing unidirectional or bidirectional substitution as part of the interface is one of the great features of opaque types as currently proposed; it’s a sound way to declaratively expose part or all of the equality. (The whole substitute design space is supported, indeed.) So you can choose exactly how much of Logarithm = Double can be seen in the interface, and the language for doing so is simply defining methods.

I sure wouldn’t mind banning arbitrary type patterns and [ai]sInstanceOf calls, though.

oscar · September 22, 2017, 5:54pm

If parameteric Top is really being proposed instead of opaque types can we see exactly how it can replace?

We would do something like:

class Logarithm(private val exponent: Double) extends Top {
  
  def *(that: Logarithm): Logarithm = new Logarithm(exponent + that.exponent)
}

object Logarithm {
  def fromDouble(d: Double): Option[Logarithm]  =
    if (d > 0.0) Some(new Logarithm(math.log(d))
    else None
}

Is this what we are saying? And somehow extending Top means new Logarithm does not allocate? That seems a little weird to me (new not allocating and also maybe it precludes classes with two parameter constructors extending Top, which is also weird if it is a true top type).

I am a bit concerned about this because Top seems so ambitious it might never happen, or take very long, while working opaque types is something many in the community would really like to use now.

I think if this is used as a reason not to do opaque types it would be very nice to see a clear alternative proposal of how to implement zero-cost opaque types using Top.

scottcarey · September 22, 2017, 5:56pm

Perhaps for Scala 2, the syntax extends Top can be used to create opaque types, analogous to extends AnyVal, but without yet truly truly introducing the full glory of Top as the true top type. The idea would be to have all the features of the opaque types proposal, but with the same syntax that Dotty would have for defining them.

There are probably some serious drawbacks with this that I am blind to.

Blaisorblade · September 22, 2017, 7:37pm

+1 on parametric Top (at least for Dotty).

If A meant A <: Top, that pattern matching must be forbidden to ensure parametricity. A probably safe and overly conservative rule would forbid matching on scrutinees whose type mention any parametric type variable. Here a type match on Array is safe, but drawing good rules for full Scala in general seems nontrivial.

@oscar IIUC, the idea here is “value classes that truly never box” — most details are in doc linked earlier, and indeed new does not box: scala.github.com/_sips/sips/2017-09-20-opaque-types.md at f61ec0dd31c9d650dc867dcb5e74506e7c26f7be · non/scala.github.com · GitHub
@scottcarey extends Top and opaque types don’t seem to have quite the same semantics… or do they?

EDIT: more importantly, this proposal without a parametric Top would give you classes with questionable semantics — you still have all methods from Any, but they have inappropriate semantics, the ones given by the implementation, something a true parametric Top avoids. Though I suspect the earlier opaque types, and the encoding from @S11001001’s blog post, share this downside.

Blaisorblade · September 22, 2017, 7:55pm

In all these proposals, it seems polymorphism over primitives must still box.

How does one implement asInstanceOf on primitives without at least one level of boxing?
I guess in all these proposals, if A is a type variable, a is a primitive (wrapped in a “newtype” or not) and a: A , a must have been boxed anyway, but reference types need no extra boxing. (In miniboxing A <: AnyVal is erased to long, but it seems miniboxing is not going to be merged).

The only difference (maybe) is that if you use some suitable specialization (Scala 2’s, miniboxing, Dotty-linker) and have parametric Top, the optimizer has a easier time removing boxing. Casts to AnyRef would still box, and the results of reference equality would be affected, but those results shouldn’t be guaranteed under parametric Top (not sure of the current actual guarantees).

odersky · September 22, 2017, 8:05pm

Here’s a possible way to help migration to the new scheme:

[T] means [T <: Top]

Add a deprecated implicit conversion in Predef

 @deprecated def topToAny(x: Top): Any = x.asInstanceOf[Any]

special case pattern matching, so a scrutinee bounded by Top is converted to Any, using the same deprecated conversion.

That should cover the majority of problematic cases. There will be a few that remain, but overall it seems doable.

@blaisorblade asInstanceOf is by definition unsafe. It would simply succeed if the underlying type is compatible with the representation type. So, no boxing is needed.

oscar · September 22, 2017, 9:13pm

So assuming then that

class Foo(x: Double) extends Top

does not box, then what is the difference between:

implicit class Foo(x: Double) extends Top
implicit class Foo2(x: Double) extends AnyVal

We now have two ways to do method enrichment? That seems like a shame. I guess you should never do that after Top. Is that right?

Secondly, I find it ugly that now we would have a class that never allocates. Previously in scala class was something that could (in principle) be an allocation. Even with extends AnyVal in a generic context we get allocation. All class declarations correspond to some kind of JVM class entity.

To me, the opaque type proposal is much cleaner here because we already have type, and it never boxes or allocates (as far as I know). With opaque we are just putting additional constraints on a type but otherwise the rules apply. Similarly, we keep the notion that a class in scala corresponds to something that can be a JVM class.

S11001001 · September 23, 2017, 4:24am

I should add that while substitute is sufficiently powerful, it’s not necessarily convenient.

This example from the other thread which runs as written with opaque types:

def mdl(m: Map[Double, Logarithm]): Map[Logarithm, Double] = m

Suppose alternatively a =:= is provided and GADT-style =:= is made to work in patmat, it must be written

// supposing Logarithm.repr: Double =:= Logarithm
def mdl(m: Map[Double, Logarithm]): Map[Logarithm, Double] =
  Logarithm.repr match {
    case Refl() => m
  }

Supposing a =:= is provided but GADT-style is not made to work:

def mdl(m: Map[Double, Logarithm]): Map[Logarithm, Double] = {
  type F[A] = Map[Double, A] =:= Map[A, Double]
  Logarithm.repr.substitute[F](implicitly)(m)
}

Scala users might have some difficulty coming up with the correct F for various scenarios.

milessabin · September 23, 2017, 11:21am

If we started from scratch that would still be my preferred scheme, but it’s probably too hard to migrate to it. So in what follows I propose to leave Any as it is and introduce a new, fully parametric type Top.

I’m not so sure … I think the complications of adding a new Top type
to Scala are at least as great as changing the semantics of Any and
adding AnyObj to match Dotty.

It’d be interesting to make the change and try it with a few
representative projects and see how much breaks.

Cheers,

Miles

Blaisorblade · September 23, 2017, 4:47pm

I’m probably not making myself clear — simply stated, the JVM does not allow checkcast on primitives, so they have to be boxed once. Or do I miss something? That boxing is not due to newtypes but is still relevant.

In particular, saying that “newtypes never box” is as misleading as “primitives never box”. And since really never boxing is the goal, all the newtype proposals are necessary but not sufficient without specialization.