Pre-SIP: Unboxed wrapper types

This is sort of a Catch-22: it relies on having a reliable, zero-cost way to do implicit enrichment. This is one of the problems that this proposal (as well as value classes) aims to solve directly.

Value classes mostly work here, so maybe that would be good enough. Personally I would prefer to be able to solve both tagging as well as zero-cost enrichment with a single rewrite pass, rather than relying on more complex machinery.
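
For context, here is a minimal sketch of what “zero-cost enrichment” looks like today; DoubleOps and squared are names made up for illustration:

object Enrichment {
  // An implicit value class: today's closest approximation to zero-cost
  // enrichment. The wrapper is usually not allocated at call sites, but
  // boxing can still occur in generic or polymorphic positions.
  implicit class DoubleOps(val d: Double) extends AnyVal {
    def squared: Double = d * d
  }
}

// import Enrichment._
// 3.0.squared  // typically compiles to a static call, with no wrapper allocated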

I do take your point that implementing this as a library means it can evolve more easily and quickly. It’s possible that newtype classes could be provided as a compiler plugin to reap many (if not all) of those benefits.

@non - hey man, this looks great.

Something I missed maybe is how this works with pattern matching. Can I “destructure” with case Logarithm(exp) => ...? What happens with type patterns? Would case d: Double => ... match a Logarithm? I guess I expect the answers to be yes in both cases.

In the current spec, we aren’t allowing case classes to be newtypes. So unless you write a custom unapply the answer to the first would be no.
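
For illustration, a sketch of such a custom unapply, assuming the proposal’s NewType marker trait (which does not exist today) and a Logarithm that stores its exponent directly:

// Sketch only: NewType is the proposed marker trait, not an existing type.
class Logarithm(val exponent: Double) extends NewType {
  def toDouble: Double = math.exp(exponent)
}

object Logarithm {
  def apply(exponent: Double): Logarithm = new Logarithm(exponent)

  // hand-written deconstructor, since case classes are disallowed for newtypes
  def unapply(l: Logarithm): Option[Double] = Some(l.exponent)
}

// Logarithm(0.0) match { case Logarithm(exp) => exp }  // exp == 0.0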

You are right that type patterns match against the erased underlying type: a newtype wrapping Double would be matched by Double at runtime. (I think we will probably forbid case l: Logarithm => ... matches; otherwise those would unwrap to case l: Double => ....)

Regarding syntax:

To what extent does this differ from scala.js’s @inline class?

Would it make sense to use @inline class?

In Dotty, would it make sense to use inline class?

Maybe @sjrd can shed some light on how @inline class differs from this proposal?

Just a couple more thoughts regarding the current spec.

  • Including/generating a deconstructor (unapply) out of the box would be nice, and would also keep things consistent with “newtype”-like APIs in other languages like Haskell, Rust, and F#. It’s such a common use case that not having it would be a bit of a pain. I’m not sure what the API would look like, though (maybe extending NewType will mean that one is generated automatically).

  • Regarding pattern matching, I think it would be nice if the compiler could emit an error or a warning whenever the type being matched is the underlying type. In other words, it would be nice if the following code failed to compile or produced a warning:

    val ls: List[Logarithm] = ???
    val ds = ls.collect { case d: Double => d } /* or map */
    

I feel that it would be better to disallow casting and instanceof checks for newtypes. Thus, for this snippet,

class C(val u: U) extends NewType
type T // arbitrary, perhaps U
val c: C = ???
val t: T = ???

the following would be compiler errors:

  • c.isInstanceOf[T]
  • c.asInstanceOf[T]
  • t.isInstanceOf[C]
  • t.asInstanceOf[C]
  • case c2: C =>
  • c match { case t2: T => ... }

My concern is that C and U are identical at runtime but disparate types at compile time. c.isInstanceOf[U] is statically known to be false (c is a C, not a U), but at runtime it will return true. Additionally, it doesn’t make sense to cast a C to a U (or vice versa), as they are different types; however, the cast will succeed at runtime. For the Logarithm example, Logarithm(0.0).toDouble == (1.0: Double), but Logarithm(0.0).asInstanceOf[Double] == (0.0: Double).
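
To make that concrete, using the Logarithm sketch from earlier in the thread:

val c: Logarithm = Logarithm(0.0)  // exponent = 0.0
c.toDouble                         // 1.0: the intended conversion (math.exp(0.0))
c.asInstanceOf[Double]             // 0.0: merely reveals the erased representation
c.isInstanceOf[Double]             // statically nonsensical, yet true at runtime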

I am concerned that casting and instanceof checks are almost always wrong, and will only cause confusion.

@inline classes (background: this means something in Scala.js) are completely different. @inline, as an annotation, does not alter the semantics of the code at all. You can arbitrarily add or remove @inline annotations on any class in the world (including AnyVal classes) and your code will behave exactly the same (albeit with different performance characteristics). It’s just an optimizer hint (the optimizer doesn’t even have to follow it if it doesn’t feel like it). In fact, the optimizer could very well apply the transformation it applies to @inline classes even when you don’t write @inline (and I have plans to do that in the future).

AnyVals or NewTypes are fundamentally different: they alter the semantics (e.g., what eq means). You cannot even begin to compare @inline versus AnyVal/NewType. It’s not a compiler or optimizer hint. It’s a language feature.

I disagree. If you don’t want them, you’d have no way to disable them.

That would probably work out-of-the-box.

@NthPortal Disallowing asInstanceOf is flat out impossible. It is necessary for erasure to mean something for those types, which is required as soon as they enter generic contexts. In Scala, a type cannot afford not to have an asInstanceOf. It has to exist and be valid.

isInstanceOf is debatable, as well as type-matching in pattern-matching, as I mentioned in my first reply on this topic. Scala/JVM does not have a precedent for disallowing .isInstanceOf[C] for any class/trait C. Scala.js however has a precedent: if T is a trait inheriting from js.Any, x.isInstanceOf[T] is disallowed at compile-time. And consequently case x: T as well.
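
For readers unfamiliar with that precedent, a sketch of the Scala.js behavior (Config is a hypothetical JS trait):

object JsPrecedent {
  import scala.scalajs.js

  trait Config extends js.Any  // a JS trait: it carries no runtime type information

  def check(x: Any): Boolean =
    x.isInstanceOf[Config]     // compile-time error in Scala.js (and so is `case _: Config =>`)
}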

However, we have to take into account that, whatever we do, (Logarithm(1.0): Any).isInstanceOf[Double] will be valid and will answer true. That’s simply unavoidable given the initial spec goal. It would therefore be reasonable to say that (1.0: Any).isInstanceOf[Logarithm] is also valid and answers true.

I didn’t phrase that well, sorry. I didn’t mean that casting shouldn’t work for newtypes; rather, that in a non-generic context, it should be forbidden (or at least warned about) by the compiler. For instance, Logarithm(1.0).asInstanceOf[Double] would not be allowed, but (Logarithm(1.0): Any).asInstanceOf[Double] would be fine. Is there any reason that forbidding it in a limited scope would not work?

It would be inconsistent with any other type. Why should Logarithm(1.0).asInstanceOf[Double] be forbidden in user code, whereas 5.asInstanceOf[String] isn’t?

The reason Logarithm(1.0).asInstanceOf[Double] should be forbidden is because it doesn’t do what a (non-expert) user expects. In the unlikely event that a user would write 5.asInstanceOf[String], they would expect it to throw a ClassCastException, because 5 is not a String. Similarly, a user might very well expect Logarithm(1.0).asInstanceOf[Double] to throw a ClassCastException, because a Logarithm is not a Double. If Logarithm extended AnyVal instead of NewType, it would. However, it does not throw an exception, because Logarithm's type is erased to Double. In that sense, asInstanceOf is behaving inconsistently with how it behaves for every other type, by not throwing an exception.
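
For comparison, the AnyVal behavior mentioned above can be sketched like this (LogV is a hypothetical value-class analogue of Logarithm):

// Runnable today: a value class boxes when upcast, so the cast fails the way
// a non-expert user would expect.
class LogV(val exponent: Double) extends AnyVal

// (new LogV(0.0): Any).asInstanceOf[Double]  // throws ClassCastException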

More broadly, there are two kinds of problems with using _sInstanceOf (shorthand here for both isInstanceOf and asInstanceOf) for newtypes:

  • (_: T)._sInstanceOf[Logarithm] is problematic because Logarithm doesn’t exist as a type (to check or cast to) at runtime, and the operation will not actually be performed with the type Logarithm
  • (_: Logarithm)._sInstanceOf[T] is problematic because Logarithm is not of type T, but if T is the erased type of Logarithm, the operation will not fail (by returning false or throwing an exception) when it ought to
    • technically, it is only Logarithm(1.0)._sInstanceOf[Double] which is problematic by behaving unexpectedly, and not Logarithm(1.0)._sInstanceOf[String], but it would be odd to forbid the former and not the latter

In short, isInstanceOf and asInstanceOf behave unexpectedly when invoked on a newtype or with a newtype as the type argument, and are likely to cause confusingly incorrect runtime behavior.

There is already plenty of precedent for .asInstanceOf not behaving as a non-expert user expects. Examples:

  • "hello".asInstanceOf[T] when T is a type parameter, that we later instantiate to anything but String (or a superclass)
  • List(1).asInstanceOf[List[String]]
  • "hello".asInstanceOf[SomeJSClass] in Scala.js, where SomeJSClass <: js.Any

In all those cases, .asInstanceOf is allowed and succeeds, because the left-hand side’s run-time type conforms to the erasure of the right-hand-side type. .asInstanceOf is basically defined in terms of erasure. Since Logarithm erases to Double, it is completely consistent for Logarithm(1.0).asInstanceOf[Double] to be allowed and succeed.
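
For instance, the second example above is runnable today: the cast itself succeeds, and the failure is deferred to the use site where a String is actually expected.

val xs = List(1).asInstanceOf[List[String]]  // succeeds: only List survives erasure
// xs.head.length                            // ClassCastException happens here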

Currently, this seems to work better in conjunction with proposals for ParametricAny/AnyObj (Allow Typeclasses to Declare Themselves Coherent · Issue #4 · lampepfl/dotty-feature-requests · GitHub), so that disabling isInstanceOf is more generally possible (though it’s not clear exactly how to combine these constructs). Current proposals keep asInstanceOf on ParametricAny as well, but suggest that asInstanceOf is a low-level construct:

One other refinement: When we talk about parametricity, I would reserve all methods so far in Any for AnyObj (or whatever we end up naming it), except for asInstanceOf. asInstanceOf is meant to be low-level, implementation dependent, and prone to fail. As such it is a useful escape hatch if (say) you want to have a parametric method, which nevertheless does internal hash-consing for memoization. That method could cast its argument to AnyObj and then store it in a hash map using == and hashCode.

Without ParametricAny, removing isInstanceOf would indeed be inconsistent. That doesn’t mean we have to add extensions that cause new puzzlers; after considering all tradeoffs, rejecting an extension might be the best option.
Personally, I’d add both ParametricAny and newtypes. Ideally I’d try to somehow rewrite away or deprecate value classes, but I doubt that is really feasible.

Erasure has to be defined, of course, but various proposals discuss a ParametricAny type, which IIRC has no asInstanceOf. I agree that forbidding asInstanceOf only for newtypes would be odd.

@NthPortal Without ParametricAny, asInstanceOf indeed can’t be removed: you’d still have (c: AnyRef).asInstanceOf[T] or variants available.

If enough users expect a ClassCastException in any of those cases, warnings might be worthwhile. I’m not sure whether they should be compiler warnings or linter warnings, and it’s hard to see how to decide. It seems more useful to provide compiler ASTs to linters via TASTY and scala.meta, so that linters can be accurate.

I don’t think that’s even possible, even if it were deemed desirable:

  • The first of my examples is not detectable without global analysis, which Scala doesn’t have because of separate compilation.
  • The second one is necessary to implement some parts of the standard library.
  • The third one is often used even by “regular” developers (as opposed to the authors of the standard library) for interop with JS.

So two of the examples are usages where the cast is deliberately expected not to fail, and one is undetectable without global analysis.

I can understand that point-of-view, but I think you would more often want to have destructuring of newtype wrappers available than not *. And in the case that you don’t want it, I don’t know if having it would cause problems or annoyance. Is there a frequent use-case where it causes headaches?

  * I have no data to back this up other than how I tend to use AnyVal-based newtypes right now and usage patterns in other languages.

Couldn’t you create a newtype like this?

Welcome to Scala 2.12.3 (OpenJDK 64-Bit Server VM, Java 1.8.0_141).
Type in expressions for evaluation. Or try :help.

scala> trait Labelish
defined trait Labelish

scala> type Label = String with Labelish
defined type alias Label

scala> def m(label: Label): Label = label
m: (label: Label)Label

scala> val myLabel = "Hello!".asInstanceOf[Label]
myLabel: Label = Hello!

scala> m(myLabel)
res0: Label = Hello!

scala> m("Hello!")
<console>:13: error: type mismatch;
 found   : String("Hello!")
 required: Label
    (which expands to)  String with Labelish
       m("Hello!")
         ^

scala> val labels = Seq(myLabel, "world".asInstanceOf[Label])
labels: Seq[Label] = List(Hello!, world)

That’s useful to know. But maybe we just define “possible” differently :-). Linters can and do use whole-program analysis (in general), and linters (including Scala ones) do have configurable warnings about constructs that some deem desirable and others don’t. I haven’t used Scala.js, you’re the expert—but the third linter warning might very well be feasible but undesirable.

Honestly, I like unsafe blocks in Rust (and IIRC C#?): they tell the compiler to trust a dodgy code block, they tell readers not to trust the typechecker on that block, and they’re easy to spot. EDIT: they might be useful here to combine safety (for newbies) with expressiveness for experts, assuming newbies don’t blindly use unsafe everywhere.

@sjrd @Blaisorblade You make a lot of valid points about why it should not be a compiler error; however, I think some sort of warning (whether compiler or linter, as @Blaisorblade mentioned) would be valuable.

True, but you’re far less likely to do that by accident.

I’m honestly less concerned about asInstanceOf or isInstanceOf, which don’t get called explicitly that often (especially by people new to Scala), and more concerned about pattern matching (which uses the two operations internally). I view it as very easy for someone to write case log: Logarithm =>, and not realize that it actually matches Doubles.
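
A sketch of the trap, assuming the erasure semantics discussed above:

val values: List[Any] = List(Logarithm(0.0), 2.5: Double)

// Intent: keep only the logarithms. After erasure both elements are Doubles,
// so both match, and the raw 2.5 is silently reinterpreted as an exponent.
val logs = values.collect { case log: Logarithm => log }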

What happens if you use the type C | U (or C ∨ U in shapeless)? Double | Logarithm seems like it could be a reasonable type for a representation of a floating point number, but you wouldn’t be able to distinguish them at runtime.
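
To illustrate the question, a Dotty-flavored sketch (assuming Logarithm erases to Double):

def describe(x: Double | Logarithm): String = x match {
  case _: Logarithm => "logarithm"   // after erasure this is just `case _: Double`
  case _: Double    => "raw double"  // unreachable: the first case already matched
}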

I don’t think that representation works here because we’d like to hide the underlying methods and operations by default. By contrast, the Double with Tag encoding specifically ensures that you can still treat a tagged value as a raw Double (and makes it easy to lose the tag too). For example:

scala> trait Tag
defined trait Tag

scala> type Tagged = Double with Tag
defined type alias Tagged

scala> val x: Tagged = 123.456.asInstanceOf[Tagged]
x: Tagged = 123.456

scala> math.log(x)
res0: Double = 4.815884817283264

scala> x + 1.0
res1: Double = 124.456

In some situations this might be the desired behavior, but in many cases (for example, the Logarithm example) we definitely don’t want to treat that value as a normal Double, and we definitely don’t want unexposed methods (like -) to treat the value as a normal Double.