Pre-SIP: Unboxed wrapper types

Putting this up to garner feedback for a possible SIP.

Context

While AnyVal is a great tool and is widely used, there are developers who are unsatisfied with it as a method of writing wrapper types (also known as the newtype pattern) because of the associated runtime cost.

@S11001001 has written an article titled “The High Cost of AnyVal subclasses”, in which he goes through some of the issues (boxing and unboxing penalty, O(n) complexity for wrapping and unwrapping containers), and argues convincingly that contrary to popular belief, the issues are not caused by Scala’s targeting of the JVM.

Instead, these runtime costs stem from the need to support:

support for isInstanceOf, “safe” casting, implementing interfaces, overriding AnyRef methods like toString, and the like.
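
To make the cost concrete, here is a minimal sketch using a hypothetical Label value class (the names AnyValCost, Label, and the vals are mine, not from the article). It illustrates two of the cases mentioned above: wrapping a container takes a full traversal, and elements stored in a generic container or an array are real wrapper instances.

```scala
object AnyValCost {
  // A conventional value-class wrapper (hypothetical example type).
  final class Label(val value: String) extends AnyVal

  val raw: List[String] = List("a", "b", "c")

  // Wrapping a container needs an O(n) traversal that builds a new list.
  val labels: List[Label] = raw.map(new Label(_))

  // In a generic context each element is an allocated Label box,
  // which is why isInstanceOf[Label] can succeed at all.
  val headIsBoxed: Boolean = (labels.head: Any).isInstanceOf[Label]

  // Arrays of a value class hold wrapper instances, not the underlying values.
  val arrayHoldsBoxes: Boolean =
    Array(new Label("a")).getClass.getComponentType == classOf[Label]
}
```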

Proposal

Goal:

  • A fully erased newtype/wrapper type mechanism that aims to be completely penalty-free at runtime.
  • Since performance is a primary objective, support for features that stand in the way may need to be dropped.

In terms of changes to the language, I think it could live alongside AnyVal (i.e. we don’t necessarily need to get rid of AnyVal or change its behaviour). It might make sense to have it as a separate thing entirely, like newtype Label(s: String).
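
For comparison, the following sketches roughly what newtype Label(s: String) would need to provide, written by hand with an abstract type member encoding in the spirit of the article. The names LabelModule, LabelImpl, apply, unwrap, and subst are illustrative assumptions, not proposed syntax or API.

```scala
object LabelModule {
  // The interface hides what Label really is behind an abstract type member.
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    // Re-types a whole container in O(1), without traversing it.
    def subst[F[_]](fs: F[String]): F[T]
  }

  // The only implementation: T is String, and every operation is the identity.
  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    def unwrap(lbl: T): String = lbl
    def subst[F[_]](fs: F[String]): F[T] = fs
  }
  type Label = Label.T

  val names: List[String] = List("x", "y")
  // Same list instance, now typed as List[Label].
  val labels: List[Label] = Label.subst[List](names)
  // val bad: Label = "x"   // does not compile: String is not a Label
}
```

At runtime a Label here is just the String it was built from, so there is no wrapper class for isInstanceOf to test against.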

@adriaanm @SethTisue @odersky @dragos Would you mind having a read through this SIP proposal? @lloydmeta, @non and I were thinking of putting it together and would like to get some early feedback on it. Perhaps we can discuss it in our next SIP meeting, too.

Even though we haven’t yet figured out the technical details of the proposal and how such a feature would be added to the language (and interact with other features), @S11001001’s blog post goes into some of these details and gives hints on how it could be implemented. I recommend reading the blog post.

Also, we would like to know what the community thinks about this feature. So please comment or give a thumbs up if you like the proposal.

:+1:

I think this would be a great addition! I know @adriaanm has said he doesn’t like the current tag encoding, but as the article shows it works better than value classes for many of these applications.

My sense is that this SIP should be easier to put together than SIP-15 (Value Classes) because the encoding is much less complex:

  • We don’t need rules around which things can be wrapped (since we’re totally erasing them and not supporting reflection).
  • The rules for re-routing and extraction should be simpler (we always do it).
  • We would not be supporting any kind of subtyping, so we don’t need universal traits.

Other than generalized anxiety over changing the language and spec, does anyone think this is a bad idea, or difficult?

As for the API, I could imagine an annotation which changes the meaning of AnyVal, or a new type to extend (e.g. AnyVal.NewType).

I wouldn’t be opposed to such a SIP. There are a few things that come to my mind, that should be carefully considered in the SIP:

  • The fate of newtypes of primitive values. Always avoiding boxing for them is not possible. When will it box? Also it should probably box to the normal boxed class of the primitive, e.g., newtype Int boxes to j.l.Integer.
  • What happens to newtypes of AnyVals? :kissing_smiling_eyes:
  • asInstanceOf[Foo] would of course “erase” to asInstanceOf[underlying of Foo]. What about isInstanceOf though? Is it even allowed at compile-time? Corollary: what happens to Foo in pattern matches?
  • What happens to classOf[Foo]? What happens to classTag[Foo]?
  • What happens to Array[Foo]?
  • What happens to values of type Foo in the context of interoperability with Java (on the JVM) and JavaScript (with Scala.js)?

If we are working with these types:

class Meter(val toDouble: Double) extends AnyVal

class IntOps(val toInt: Int) extends NewType
class MeterOps(val toMeter: Meter) extends NewType

Then I’d answer your questions as follows:

  • Primitives would have to box in their usual cases. So IntOps boxes to Integer in exactly the cases that Int does (and uses int otherwise).
  • I think that we’d want to erase newtypes before any of the AnyVal rewrites occur. So, we’d potentially have newtype forwarders (essentially static methods taking the underlying type), as well as the whole AnyVal system of forwarders and boxing. So MeterOps would be allowed, and values of that type would be represented identically to Meter at runtime.
  • isInstanceOf[IntOps] could be rewritten to isInstanceOf[Int] or an error. Probably the rewrite makes more sense. Similarly, IntOps could be forbidden in pattern matches, or rewritten to Int.
  • I’m pretty sure we’d want to rewrite classOf[IntOps] and classTag[IntOps] to use Int.
  • Array[IntOps] would be represented as Array[Int].
  • At runtime there are no values of IntOps type, so I don’t think there are special considerations here.
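
Some of these properties can already be observed with a hand-written type member encoding, which gives a rough preview of the intended runtime behaviour. This is only an approximation under today’s compiler, not the proposed NewType feature: IntOpsSketch, IntOpsImpl, apply, and subst are hypothetical names, and IntOps is a type member here rather than a class.

```scala
object IntOpsSketch {
  abstract class IntOpsImpl {
    type T
    def apply(i: Int): T
    def subst[F[_]](fi: F[Int]): F[T]
  }
  val IntOps: IntOpsImpl = new IntOpsImpl {
    type T = Int
    def apply(i: Int): T = i
    def subst[F[_]](fi: F[Int]): F[T] = fi
  }
  type IntOps = IntOps.T

  // Array[IntOps] is the very same primitive int[] at runtime.
  val ints: Array[Int] = Array(1, 2, 3)
  val ops: Array[IntOps] = IntOps.subst[Array](ints)
  val arrayIsIntArray: Boolean = (ops: Any).getClass == classOf[Array[Int]]

  // Viewed as Any, a single value is a plain java.lang.Integer, as for Int.
  val boxesToInteger: Boolean = (IntOps(1): Any).isInstanceOf[java.lang.Integer]
}
```

Unlike the behaviour described in the list, this encoding boxes a lone value more eagerly than Int would, because the abstract T does not erase to int.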

A SIP would require a better formal specification, but I think these are the right properties to want.

Because total erasure makes these things easier to reason about, I think that we can even allow newtypes of newtypes, but I haven’t fully worked through enough examples to be sure.

Why not use a Scalameta macro annotation and generate the same kind of code as in the blog post? This way the thing can live entirely in library and no language modifications are needed.
This is more flexible, as it could lend itself to user configuration/customization (like macro-based case classes would, a.k.a. data types à la carte). For example, it could be made to generate Scalaz-style subst functions.

AFAICT, the only thing that currently cannot be achieved is to erase a newtype of a primitive type P not to Object but to P itself, so that the compiler can avoid boxing the primitive. That is, without having to use a <: P bound, which partially defeats the purpose.
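
The limitation shows up in the JVM method signatures. In this sketch (ErasureToday, FlagsAPI, Flags, and wrap are hypothetical names) the abstract type has no bound, so a method returning it is compiled to return Object, and the Int has to be boxed on the way out.

```scala
object ErasureToday {
  abstract class FlagsAPI {
    type Flags              // unbounded abstract type
    def wrap(i: Int): Flags
  }
  val FlagsImpl: FlagsAPI = new FlagsAPI {
    type Flags = Int
    def wrap(i: Int): Flags = i
  }

  // Look up the erased signature of wrap(int) on the abstract class via reflection.
  val wrapReturnType: Class[_] =
    classOf[FlagsAPI].getMethod("wrap", classOf[Int]).getReturnType

  // Calling through the API therefore yields a boxed java.lang.Integer.
  val wrappedIsBoxed: Boolean =
    (FlagsImpl.wrap(3): Any).isInstanceOf[java.lang.Integer]
}
```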

Therefore, I propose to only add to Scala an @erasureOf[T] checked annotation, that tells the compiler what an abstract type should erase to. For example:

abstract class LabelAPI {
  @erasureOf[Int]
  type Label
}
val LabelImpl: LabelAPI = new LabelAPI {
  // type Label = String  // error: the erasure of String does not correspond to Int
  type Label = Int
}

This way, Label is still completely distinct from Int as far as the type checker is concerned, but the erasure phase will turn it into Int and so it will be the same as Int throughout the rest of the compilation, allowing for unboxed usage and for the right bytecode signatures.

Correct me if I’m wrong, but I think this would be relatively easy to add to the compiler, and thus to get accepted into the language, as compared to having a new stab at an AnyVal-like feature.

@LPTK I don’t think that proposal is a good substitute. My main concern with it is that it looks like “a compiler feature” rather than “a language feature”. It is an annotation that tweaks what the compiler should do when compiling a given piece of code, altering its semantics in non-obvious ways in the process, rather than properly defining semantics in the first place, and letting the compiler do whatever it takes to correctly implement those semantics.

Does it actually alter the semantics? At least on the JVM, I don’t see where the semantics would be different with and without the annotation (perhaps related to JVM integer interning and referential equality?).

My point was that, as the blog post shows, the language features required to do wrapper-free newtypes are already there. The only “missing” part is related to performance (not boxing primitive types). In other words, I think we need a compiler feature rather than a language feature.

Hey guys, yesterday I started to work on an implementation for this proposal, just a prototype to show how it should work. It’s not possible to use macros to implement the whole feature, so I’m implementing it with annotation macros (to avoid touching typer for simplicity) + a compiler plugin. I hope to finish it off soon to get the proof of concept out. Erik is working on the spec, so at some point we’ll put our work together. After that, if the proposal is numbered in the next SIP meeting, I’ll invest some of my free time to port the prototype to the compiler (Scalac, maybe Dotty too).

@LPTK With regard to your comment, note that the main goal of this proposal is to make newtypes available to the whole Scala community. That’s why I don’t want my prototype to become the official way of consuming this feature – I just want it to be a tool to test the idea and make the review process faster. IMO, this is something that merits inclusion in the language; so far it seems technically better than value classes in Scalac.

That said, I don’t think @erasureOf is a good idea because it’s a very specific compiler feature to enable the creation of certain language features. Its main goal is to circumvent the limitations of existing extension mechanisms like macros. I don’t think we should add features to the compiler just to enable the creation of other language features; that brings complexity for nothing. It’s better to add fully-baked language features that work, bring immediate value, and can be widely used to solve problems that we experience in our day-to-day jobs.

@LPTK @erasureOf does alter the semantics, because val x: Any = "hello"; x.asInstanceOf[Label] would succeed without @erasureOf[Int] but fail with @erasureOf[Int].
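
The half of this that works today is easy to check. Here is a sketch without the annotation, since @erasureOf does not exist (the CastDemo object and the wrap method are additions for illustration): the cast to the abstract Label performs no runtime check, while a cast to Int throws.

```scala
import scala.util.Try

object CastDemo {
  abstract class LabelAPI {
    type Label
    def wrap(i: Int): Label
  }
  val LabelImpl: LabelAPI = new LabelAPI {
    type Label = Int
    def wrap(i: Int): Label = i
  }

  val x: Any = "hello"

  // Label erases to Object here, so the cast is unchecked and does not throw.
  val castToLabel: Try[Any] = Try(x.asInstanceOf[LabelImpl.Label])

  // Casting to Int unboxes, which throws ClassCastException for a String.
  // With @erasureOf[Int], the cast above would presumably behave like this one.
  val castToInt: Try[Int] = Try { val i: Int = x.asInstanceOf[Int]; i }
}
```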

If it was a library, it could be included in the Scala Platform, which is designed for this purpose.

Nitpicking here, but @erasureOf is not to circumvent the limitations of macros. It actually has nothing to do with macros.

I agree it is rather ad-hoc. If there is really no other use for the annotation, and all other things being equal, it’s probably better to go with a language feature. It just seems like more work to do the latter. On the other hand, if we found more uses for @erasureOf then my opinion may change, as it would enable the mantra from the Programming in Scala book mentioned in my first link: “Instead of providing all constructs you might ever need in one ‘perfectly complete’ language, Scala puts the tools for building such constructs into your hands”.

No, it’s not designed for this purpose. The compiler is designed for this purpose. The Scala Platform is opt-in. The point of this feature is to make it available by default.

It circumvents the fact that you cannot modify erasure with macros. Otherwise you wouldn’t need erasureOf. :smile:

And it does, to some extent (and way more than other programming languages). But I would say that erasureOf is borderline. There are probably people in the community better placed to discuss this than I am.

Are you sure? I’m not knowledgeable enough in the Scala compiler to know how type-checking affects asInstanceOf casts. I would have thought that at the end of the day, it would be equivalent to x.asInstanceOf[Int], which does not fail.

EDIT: never mind, I read too fast. It would indeed fail in one case and not the other. Shouldn’t asInstanceOf on an abstract type with no bounds at least yield a warning?

Hey folks, using SIP-15 as a template I took a quick pass at drafting something we could potentially build a SIP out of:

I am sure there are typos, mistakes, and oversights, but hopefully this gives us a common basis of comparison. I just threw it up into a gist to make it easy to read, but we can move this into a repo or other shared environment if people think it’s useful to collaborate on it.

One thing to add – I chose to use the extends NewType syntax in the document to be clear that this is different from AnyVal but to preserve some continuity. In practice I don’t care what the name is, or if this is done with an extends X versus an annotation (or even new syntax).

Thanks for putting that up! I think it would be a good idea to put it up somewhere for collaboration :slight_smile:

This already exists as a library at https://github.com/alexknvl/newtypes, including the macro annotation. It implements Stephen’s blog posts as well. Syntax is like:

@opaque type ArrayWrapper[A] = Array[A]
@translucent type Flags = Int

Where opaque types box like generics (never unless primitive) and translucent types are subtypes of the types they are newtypes over, with the advantage that primitives are not boxed unless in a generic context.
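
To illustrate the translucent flavour without relying on the library, here is a hand-written approximation. It is not the library’s actual macro expansion; TranslucentSketch, FlagsImpl, and apply are hypothetical names. The upper bound makes every Flags usable as an Int, while an Int is still not accepted as a Flags.

```scala
object TranslucentSketch {
  abstract class FlagsImpl {
    type T <: Int           // the bound is what makes the type "translucent"
    def apply(i: Int): T
  }
  val Flags: FlagsImpl = new FlagsImpl {
    type T = Int
    def apply(i: Int): T = i
  }
  type Flags = Flags.T

  val read: Flags = Flags(4)
  val write: Flags = Flags(2)

  // A Flags value upcasts to Int with no unwrapping step...
  val asInt: Int = read
  // ...and Int's own methods are available directly.
  val combined: Int = read | write

  // val bad: Flags = 6   // does not compile: Int is not a Flags
}
```

The trade-off is the one raised earlier in the thread: the bound leaks the underlying type, which partially defeats the purpose of a newtype.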

Thanks! That’s really great prior art!

I think this proposal is slightly more ambitious (or misguided, depending on your stance) in that it allows you to create things that look like methods on the new types (whereas if I understand correctly newts just provides a type member plus wrapping, unwrapping, and subst). But we should certainly make sure that the newtype classes here work at least as well as those defined via newts.

Though it should be mentioned that it would be really easy to extend newts and make it generate the appropriate method-providing implicit class from something like this:

@opaque class ArrayWrapper[A](val unwrap: Array[A]) {
  /* method defs */
  def size = unwrap.length
}

…and then extend it some more as the needs arise in the future, because it’s a library.
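
For reference, the method-providing part can already be written by hand today. The sketch below is not output of newts and uses no macro annotation; ArrayWrapperSketch, Impl, apply, and ArrayWrapperOps are hypothetical names. It pairs an opaque type member with an implicit class so that size looks like a method on the new type.

```scala
object ArrayWrapperSketch {
  abstract class Impl {
    type T[A]
    def apply[A](arr: Array[A]): T[A]
    def unwrap[A](w: T[A]): Array[A]
  }
  val ArrayWrapper: Impl = new Impl {
    type T[A] = Array[A]
    def apply[A](arr: Array[A]): T[A] = arr
    def unwrap[A](w: T[A]): Array[A] = w
  }
  type ArrayWrapper[A] = ArrayWrapper.T[A]

  // Extension methods for the new type. This is a plain implicit class, so
  // each call allocates a small wrapper; a generator could emit a value class.
  implicit class ArrayWrapperOps[A](val self: ArrayWrapper[A]) {
    def size: Int = ArrayWrapper.unwrap(self).length
  }

  val wrapped: ArrayWrapper[Int] = ArrayWrapper(Array(1, 2, 3))
  val wrappedSize: Int = wrapped.size
  // wrapped.length   // does not compile: Array's own members stay hidden
}
```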