Pre-SIP: Unboxed wrapper types

Blaisorblade · March 25, 2018, 8:16pm

I am still somewhat concerned though, at least as long as macro annotations are not in. I am not convinced the feature needs to be novel, if the novelty is additional boilerplate. Opaque types with a value-class-like syntax are honestly what I (and many others) wanted in place of value classes in the first place.

I’m concerned about having to translate by hand code using value classes, to only later get a more compact syntax.

jvican · March 26, 2018, 12:42pm

Well, the good news is that no code will break. For those that do want the simplicity and predictability of opaque types, I’m fairly optimistic we can get some migration rewrites.

NthPortal · March 27, 2018, 1:56am

I was waiting for someone to bring up Scalafix

TomasMikula · April 4, 2018, 11:02am

@jvican Still expecting your comments about the “method that sees through multiple opaque types” use-case?

odersky · April 6, 2018, 10:01am

Here’s a comment I added to https://github.com/lampepfl/dotty/pull/4028

So, I am trying to summarize here:

Why not keep value classes?

- The boxing model trips people up and can get in the way of high-performance code (in particular because value classes cannot be stored in arrays).
- The boxing model has surprisingly nasty consequences for code generation, including issues that are still not fixed in any Scala compiler (#1905).
- The limitations are hard to remember and a bit ad hoc.
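
To make the boxing point concrete, here is a minimal sketch of a value class that stays unboxed in direct use but gets wrapped as soon as it flows through a generic type parameter. The names `BoxingDemo`, `Meters`, `twice` and `runtimeClassName` are made up for illustration and do not come from the thread.

```scala
object BoxingDemo {
  // A value class: in direct, monomorphic use the compiler can pass
  // the underlying Double around without allocating a wrapper.
  final class Meters(val value: Double) extends AnyVal

  def twice(m: Meters): Meters = new Meters(m.value * 2)

  // Generic use: A erases to Object, so the argument must be a real
  // object and a Meters wrapper instance is allocated at the call site.
  def runtimeClassName[A](a: A): String = a.getClass.getName
}
```

Calling `BoxingDemo.runtimeClassName(new BoxingDemo.Meters(1.0))` reports the wrapper class rather than `double`, which is the kind of hidden allocation the list above refers to.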

What functionality do value classes provide?

1. A low-cost implementation of some functionality in terms of some other type
2. A way to add extension methods
3. (In the future, once we can support multiple parameters) A way to do structs

Opaque types address (1), with some caveats

- They currently cannot be top-level
- They sometimes require a bit more code than value classes for the same functionality
- They require some concept shifts that are non-trivial to implement (e.g. companions of opaque types, with special visibility)
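
As a rough illustration of what use case (1) looks like with opaque types, here is a sketch in the syntax that later shipped in Scala 3, which may differ from the syntax under discussion in this thread. The enclosing object `Logarithms` and its members are illustrative assumptions.

```scala
object Logarithms {
  // Inside this object Logarithm is known to be Double;
  // outside, it is an abstract type with no visible representation.
  opaque type Logarithm = Double

  // The companion provides the only way in.
  object Logarithm {
    def apply(d: Double): Logarithm = math.log(d)
  }

  // Operations are written as extension methods rather than class members,
  // which is where the extra code compared to value classes comes from.
  extension (x: Logarithm) {
    def toDouble: Double = math.exp(x)
    def *(y: Logarithm): Logarithm = x + y
  }
}
```

After `import Logarithms.*`, `Logarithm(2.0) * Logarithm(3.0)` adds two plain doubles at runtime, and `toDouble` converts the result back.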

For (2) we have a separate extension method proposal which is syntactically nicer than implicit value classes.

(3) is currently not possible on the JVM, nonsensical on JS, and addressed with @struct on Native. (3) might be possible with Valhalla in the future.

One idea we should further explore before deciding is to come back to the original proposal of “unboxed wrapper types” (Pre-SIP: Unboxed wrapper types), i.e. define a lightweight way to define classes that, like opaque types, do not box, but that can take multiple parameters if the platform allows it. A possible syntax could be

inline class C(x_1: T_1, ..., x_n: T_n) /* no parents allowed */ { 
   def ...
   def ...
   /* no other members allowed */
}

On today’s JVM, n = 1. inline classes defined that way are always represented as their underlying type. Their parent is Any but they cannot override any of its methods. The potential advantages that I can see relative to opaque types are:

- sometimes less boilerplate
- can do extension methods as well when combined with implicit. So it would then be a separate decision whether we want to replace implicit inline class with extension.
- can express structs / multiparameter value classes.

I don’t know whether inline classes defined that way would map well to Valhalla’s value classes. Can somebody who knows the state of things in Valhalla give some insight here?

To expand on it a bit here: Originally, the SIP committee was against the Unboxed Wrapper Types proposal because the overlap with value classes seemed too great. The new development is that we are now prepared to phase out value classes altogether. This idea was not on the table when the proposal started but is gaining a lot of momentum now. So, if the question is how to replace value classes then it’s actually good if the new feature has a lot of overlap with the one it replaces because it will be more familiar, code will be easier to port, and so on.

fanf · April 6, 2018, 12:35pm

The correct link for @odersky’s comment is https://github.com/lampepfl/dotty/pull/4028#issuecomment-379190038 (because of the added colon at the end, GitHub is perplexed and redirects to a bad URL).

jducoeur · April 6, 2018, 12:40pm

I think your list is missing a significant element of opaque types: the opacity itself. And correspondingly, you’re missing a problem with value classes: unless you’re extra-careful, they tend to be porous in a very ad-hoc way, which in my experience often leads to them being used poorly.

Opaque types are conceptually rather different from value classes, in that the “walls” around them are much stronger – it’s clearer that this is a separate type, which just happens to have another type under the hood. As an in-the-field lead engineer, I really love that: while it doesn’t add technical value, IMO it adds conceptual value, which shouldn’t be discounted.

I’m not performance-focused – frankly, boxing is usually the least of my concerns. But the opaque types proposal addresses my most common complaint with value classes (which, mind, I use fairly heavily): it looks to be a better solution to what I want out of them, which is firm type separation.

By contrast, I don’t think I’ve ever actually found myself wanting multiple members in these things. I can see the value in it, but it’s not my actual pain point.

fanf · April 6, 2018, 1:21pm

I can’t agree more with @jducoeur’s comment. On Rudder, we have literally tens of identifier types which just put a type around a UUID (a string). Sometimes two UUIDs, or a long. We have NodeId(value: String), RuleId(value: String), DirectiveId(value: String), GroupId(value: String), PolicyId(ruleId: RuleId, directiveId: DirectiveId), and so on and so forth, with tens of variants. We are very liberal in defining ad-hoc business objects, each used in just one step of a complex process, and each of them is likely to get its own identifier type.

We don’t use any extension methods (or only very rarely, for example when the ID is two UUIDs, to concatenate them in a specific way); we really just use them to avoid mixing up different kinds of ID.

The value is immense for that. It helped us in countless refactorings, to untangle complex business logic by giving things their own names.

And we use these IDs as map/set keys, or in lists/vectors that are sometimes tens or hundreds of thousands of elements large.
We would love to never have the wrapper class at runtime, because it is only there to give the GC work. We are IO-bound most of the time, but still: GC pressure is real in real-world use.

We tried to switch to value classes, but the GC pressure did not really change. I don’t know why; most likely our usage pattern triggers boxing most of the time.

Opaque types seem to be exactly the right answer for our use case, and I will happily pay the little added verbosity to have consistent behavior regarding boxing (i.e. no additional wrapper at runtime).
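
A minimal sketch of that identifier use case, written with the opaque type syntax that later shipped in Scala 3. The objects `Ids` and `IdsDemo`, the sample UUID fragment, and the hostname are hypothetical; only the names NodeId and RuleId come from the list above.

```scala
object Ids {
  // Two distinct identifier types over the same representation.
  opaque type NodeId = String
  object NodeId {
    def apply(uuid: String): NodeId = uuid
  }

  opaque type RuleId = String
  object RuleId {
    def apply(uuid: String): RuleId = uuid
  }
}

object IdsDemo {
  import Ids.*

  val node: NodeId = NodeId("3f2a")
  val hostnames: Map[NodeId, String] = Map(node -> "web-01")

  // hostnames.get(RuleId("3f2a"))  // does not compile: a RuleId is not a NodeId

  // At runtime the key is the underlying String itself, with no wrapper object.
  val isPlainString: Boolean = (node: Any).isInstanceOf[String]
}
```

The map keys here are ordinary strings on the heap, so large collections of IDs carry no extra allocation beyond the strings themselves, while the compiler still keeps the ID kinds apart.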

Hope it helps illustrate real-world use cases.

odersky · April 6, 2018, 5:44pm

Opacity is controllable:

 inline class C private (x: T)         // completely opaque
 inline class C(x: T)                  // T -> C  supported
 inline class C private (val toT: T)   // C -> T supported
 inline class C (val toT: T)           // T -> C and C -> T supported

One could argue about what the right defaults are, but I believe this is a smaller point. The crux of the matter is what can be expressed.

odersky · April 6, 2018, 5:48pm

@fanf inline classes and opaque types have exactly the same boxing behavior.

LPTK · April 6, 2018, 6:38pm

The inline class proposal seems very promising to me. It’s intuitive, and completely in line (pardon the pun) with current Scala concepts. It also avoids introducing another keyword. inline for extension classes makes perfect sense (and is more intuitive than extends AnyVal – the first time I saw this keyword, I had no idea what it could possibly mean).

By contrast, opaque types introduce a lot of non-trivial machinery, while achieving almost nothing: most of what they do can be encoded in a clearer way with a trait and an upcasted val, as originally shown by @S11001001. Their only pros seem to be:

- avoid boxing for primitive types;
- reduce boilerplate – to a limited extent, as they still require a good amount;
- allow the type to have a companion that extends implicit scope.

All of these are achieved by inline classes, but in a more idiomatic way.
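
For readers who have not seen the trait-plus-upcasted-val encoding mentioned above, here is one way it can be sketched. The names `TraitEncoding`, `CelsiusModule` and `Celsius` are illustrative assumptions and are not taken from @S11001001’s original write-up.

```scala
object TraitEncoding {
  trait CelsiusModule {
    type Celsius                         // abstract: clients cannot see the representation
    def apply(d: Double): Celsius
    def degrees(c: Celsius): Double
    def warmer(a: Celsius, b: Celsius): Celsius
  }

  // Ascribing the val to CelsiusModule "upcasts" it, so the equality
  // Celsius = Double is only known inside the anonymous class.
  val Celsius: CelsiusModule = new CelsiusModule {
    type Celsius = Double
    def apply(d: Double): Celsius = d
    def degrees(c: Celsius): Double = c
    def warmer(a: Celsius, b: Celsius): Celsius = math.max(a, b)
  }
}
```

A value is created with `TraitEncoding.Celsius(21.5)` and has the path-dependent type `TraitEncoding.Celsius.Celsius`. Because the abstract type erases to Object, a primitive such as Double is boxed when it crosses the module boundary, which is the first item in the list above.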

I particularly dislike the way opaque types rely on implicit conversions inside the companion object. I thought it was clear by now that implicit conversions were to be avoided as much as possible, as they are confusing and insidious. But if on top of that the implicit conversion is invisible (compiler-generated), I can’t imagine it being a good idea.

odersky · April 6, 2018, 6:59pm

@LPTK Opaque types can be done without implicit conversions, at least that’s what the Dotty implementation does. But I agree with all your other points.

LPTK · April 6, 2018, 8:19pm

Inline classes can also support “translucent types” fairly intuitively, with an alternative but complementary form where the constructor has no parameters and the class has exactly one supertype T (which does not need to be a class type):

inline class C extends T { 
   def ...
   /* no other members allowed */
}

Of course, inline classes are forbidden from overriding anything, be it something coming from Any like toString or something coming from T.

Ichoran · April 6, 2018, 9:22pm

I really like inline types. With a little bikeshedding on syntax, I think they can re-unify extension methods with value classes/opaque types via either translucent types or anonymous translucent types.

Something like (anonymous version):

inline implicit extends Int {
  def sq = this * this
}
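
For comparison, the closest thing available today is an implicit value class. This is a sketch of that existing encoding, not of the proposed syntax; the names `IntSyntax` and `IntOps` are made up.

```scala
object IntSyntax {
  // An implicit value class: `4.sq` is rewritten to a call on IntOps,
  // and extending AnyVal lets the compiler skip the wrapper allocation
  // for direct calls like this one.
  implicit class IntOps(private val self: Int) extends AnyVal {
    def sq: Int = self * self
  }
}
```

With `import IntSyntax._` in scope, `4.sq` evaluates to 16. The wrapper name `IntOps` carries no meaning for callers, which is what the anonymous form above would remove.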

Also, inline types could be made cross-target by allowing them to be not-inlined if you pass appropriate compiler flags, e.g. -Xallow-inline-tuple-class. So you could have multiple return values with zero overhead on Scala Native and if Valhalla ever gets far enough along to allow it, but still compile the code on JDK8 and JS.

lihaoyi · April 7, 2018, 5:16am

I’m not sure I understand all the edge cases, but the inline classes sketched out here basically seem to me like a nicer syntax over opaque types. Looks good to me (have never been super satisfied with the other syntactic proposals for defining opaque types)

Blaisorblade · April 7, 2018, 10:09am

EDIT: first of all, I really like inline class.

odersky:

Opacity is controllable:
   inline class C private (x: T)         // completely opaque
   inline class C(x: T)                  // T -> C supported
   inline class C private (val toT: T)   // C -> T supported
   inline class C (val toT: T)           // T -> C and C -> T supported
One could argue about what the right defaults are, but I believe this is a smaller point. The crux of the matter is what can be expressed.

OTOH, those defaults are consistent with how classes work elsewhere, and the common cases are mostly consistent with how opacity for value classes works:

class PosInt private (val x: Int) extends AnyVal
object PosInt {
  def apply(i: Int) = { require(i > 0); new PosInt(i) }
}

PosInt(1)
// new PosInt(1) // compile error: the constructor is private
PosInt(-1) // runtime error
java.lang.IllegalArgumentException: requirement failed
  at scala.Predef$.require(Predef.scala:264)
  at PosInt$.apply(<console>:11)
  ... 28 elided

The only difference is that, for consistency, hiding a value class member requires private val instead of having nothing:

class PosInt private (private val x: Int) extends AnyVal

But this point might be best left for review.

For extension methods, we considered that they could take implicit parameters on the class itself:

inline class C1[T](x: T) { // I'd want to write [T: Ord], but I can't, so I must move it to all methods
  def foo1(...)(implicit ordT: Ord[T]) = ...
  def foo2(...)(implicit ordT: Ord[T]) = ...
  ...
  def foo10(...)(implicit ordT: Ord[T]) = ...
}
extension C2[T: Ord](this: T) {
  def foo1(...) = ...
  def foo2(...) = ...
  ...
  def foo10(...) = ...
}

That works because, for extension methods, you can’t store a value of type C2 in a field: either C2 isn’t a type, or it’s a non-value type (like MethodType, ExprType etc.) that can be compiled away by rewriting C2(t).foo1(...) to foo1(t)(...).
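
The idea of stating the type class constraint once for a whole group of extension methods can be sketched with the extension syntax that later shipped in Scala 3, which differs from the `extension C2[T: Ord](this: T)` form above. The standard library’s `Ordering` stands in for `Ord`, and `OrderingSyntax`, `atLeast` and `atMost` are hypothetical names.

```scala
object OrderingSyntax {
  // The context bound is declared once on the extension clause
  // and is available to every method in the group.
  extension [T: Ordering](self: T) {
    def atLeast(lo: T): T = summon[Ordering[T]].max(self, lo)
    def atMost(hi: T): T = summon[Ordering[T]].min(self, hi)
  }
}
```

After `import OrderingSyntax.*`, a call such as `5.atMost(3)` is compiled to a plain method call that takes the receiver and the `Ordering` instance as arguments, so no wrapper value is ever stored anywhere.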

oscar · April 7, 2018, 5:36pm

I am happy as long as we get the ability to make zero-cost newtypes, which this provides.

I do feel like opaque type is closer to what we really are doing and the term “class” being here without it being a JVM class is a bit confusing. I assume getClass returns the getClass of the inner type.

I liked that opaque type didn’t complicate intuition about the mapping to the JVM.

smarter · April 7, 2018, 6:20pm

Not in Dotty anymore, you can write class PosInt(x: Int) extends AnyVal.

Blaisorblade · April 7, 2018, 6:56pm

In a sense that’s true, but if we go for “closer to what we really are doing”, one should stop writing class Foo(val bar: Baz) and write the Java for it instead. And yes, we write class without a JVM class, but value classes have a class and aren’t any simpler. Would a different keyword fix the second concern? But I’d really hate to start syntax bikeshedding now.

What is true (and annoying here) is that in Scala the only existing syntax for declaring data constructors (in the FP sense) uses class, so reusing that is the most consistent choice.

non · April 8, 2018, 11:36pm

Inline classes would be a welcome improvement over existing value classes, and a lot of people seem to be excited about being able to avoid having to define explicit extension methods for opaque types.

One question here: would inline classes allow classOf[_] or ClassTag[_] to be used? Would that behavior be undefined? Would they support pattern-matching? I think I share @oscar’s stance (as long as none of these things are going to introduce unexpected classes or behaviors, the type/class naming distinction isn’t a huge deal).

One danger with inline classes is that their semantics need to be flexible enough to match future Valhalla behavior (assuming you hope to implement them this way). This is a concern that several people have raised.

One major advantage opaque types had is that they were not going to be implemented as value classes on the JVM, so that their compile-time existence doesn’t relate to any possible JVM encoding, and they would not have any runtime representation at all. (It’s not clear to me exactly how the JVM will represent Valhalla value classes at runtime, but it seems likely that they will not be entirely erased [1]).

If we guarantee that inline classes are completely inlined (as opaque types would be) then I don’t think this presents a problem, but it does complicate a possible future story involving multi-slot Valhalla value classes.

[1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/051958.html