Cannot be cast to class scala.runtime.BoxedUnit

som-snytt · August 22, 2019, 5:00am

By coincidence, today I was looking at Java bugs related to overloading. One report had a bad cast, like yours, and that value was passed to a method overloaded for Object and String. In Java 7, type inference inferred Object and the Object-taking method was picked; in Java 8, it could infer String and picked the other method, which failed. I think the lesson is that these “under-the-hood” casts, which seem to be just getting in your face, are actually doing useful type system work.

curoli · August 22, 2019, 9:45am

Ah, right, I forgot about those. In that case, I guess Unit must remain a legal argument for type parameters.

Makes me wonder how often T => Unit will return BoxedUnit when the author intended it to return nothing at all, and how that affects performance?

I’m assuming that the following will create a Seq with 100 references to BoxedUnit, right?

(1 to 100).map(println)

hrhino · August 22, 2019, 11:16am

Yep. Note that BoxedUnit is a singleton read from a Java static field, so it doesn’t itself increase memory usage (that’d be the data structure itself).
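
A minimal sketch of the singleton point, assuming a mapping function that returns () in place of println to keep the console quiet. The object and value names are made up for illustration.

```scala
import scala.runtime.BoxedUnit

object UnitSingleton {
  // Stand-in for (1 to 100).map(println), without the console output.
  val results: Seq[Unit] = (1 to 100).map(_ => ())

  // Viewed as Any, every element is a reference to a boxed value.
  val boxed: Seq[Any] = results

  // True if each reference points at the one shared BoxedUnit.UNIT instance.
  val allSameInstance: Boolean =
    boxed.forall(x => x.asInstanceOf[AnyRef] eq BoxedUnit.UNIT)
}
```

The collection still holds 100 references, so its size is what costs memory, not the Unit values themselves.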

undefeat · August 22, 2019, 11:43am

Right. And it’s also true that whenever we assign a Unit to a value the actual type will be a BoxedUnit. More than that, the actual value assigned will be a reference to the same singleton object:

def doSomething(): Unit = { new Object }

val u1 = () // Debugger: {BoxedUnit@793}
val u2 = doSomething() // Debugger: {BoxedUnit@793}
val u3 = (new Object).asInstanceOf[Unit] // Debugger: {BoxedUnit@793}

decompiled Java code:

public void doSomething() { new Object(); }
  
BoxedUnit u1 = BoxedUnit.UNIT;
doSomething(); BoxedUnit u2 = BoxedUnit.UNIT;
new Object(); BoxedUnit u3 = BoxedUnit.UNIT;

Note, we don’t see a cast to BoxedUnit in any of those cases.

A different situation happens when we pass a Unit as a type parameter:

def execute[R](): R = (new Object).asInstanceOf[R]

val o1: Unit = execute[Object]()
val o2: Unit = execute[String]()
val o3: Unit = execute[Unit]() // class java.lang.Object cannot be cast to class scala.runtime.BoxedUnit

decompiled Java code:

public <R> R execute() { return (R)new Object(); }

execute(); BoxedUnit o1 = BoxedUnit.UNIT;
execute(); BoxedUnit o2 = BoxedUnit.UNIT;
BoxedUnit o3 = (BoxedUnit)execute();

o3 is the only case when Scala doesn’t throw away the object but tries to cast it to BoxedUnit.

Jasper-M · August 22, 2019, 11:48am

Function0, Function1 and Function2 are specialized for Unit in their result type.

undefeat · August 23, 2019, 7:26am

Same here:

class MyFunc[R] extends Function0[R] {
  def apply(): R = (new Object).asInstanceOf[R]
}

val myFunc2 = new MyFunc[Unit]
val r2: Unit = myFunc2() // java.lang.ClassCastException: class java.lang.Object cannot be cast to class scala.runtime.BoxedUnit

val myFunc1 = new MyFunc[AnyVal]
val r1: Unit = myFunc1() // SUCCESS!

decompiled Java code:

public class MyFunc<R> extends Object implements Function0<R> {
  public R apply() { return (R)new Object(); }
}

MyFunc myFunc2 = new MyFunc();
BoxedUnit r2 = (BoxedUnit)myFunc2.apply();

MyFunc myFunc1 = new MyFunc();
myFunc1.apply(); BoxedUnit r1 = BoxedUnit.UNIT;

curoli · August 23, 2019, 8:36am

Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.
> 1.asInstanceOf[Double]
res0: Double = 1.0
> def convert[A, B](x: A): B = x.asInstanceOf[B]
convert: [A, B](x: A)B
> convert[Int, Double](1)
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
  at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:113)
  ... 28 elided

There is clearly an inconsistency in how asInstanceOf works. Sometimes it converts the value to a value of another type, and sometimes it preserves the value and merely ascribes a new type to it.

I suppose if the target is a parameter, the only possible thing to do is to preserve the value. For the sake of consistency, it should always do that.

sjrd · August 23, 2019, 8:52am

@undefeat

No, this is not what Unit means in the spec. Unit means you expect/request/require that the method returns (), the singleton value of the Unit type. Since it in fact returns an Object, which is not an instance of Unit, the cast .asInstanceOf[R] is invalid, and from that point on the compiler is free to throw a ClassCastException, either immediately or later, or not at all.

There is no other way to interpret the specification of the language. The compiler is not doing anything wrong.

What’s incorrect is the original definition of execute. It should say that it returns an Object. It shouldn’t lie by pretending it can return any R. This is whether the code is written in Scala or in Java. This is the root of the problem. The behavior of the compiler with Unit/BoxedUnit has nothing to do with it.
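
A sketch of what an honest signature could look like, assuming the caller either wants the side effect only or wants to check the result explicitly. HonestExecute, asString and runForEffect are hypothetical names, not anything from the original code.

```scala
object HonestExecute {
  // States what is really returned instead of promising an arbitrary R.
  def execute(): AnyRef = new Object

  // A caller that wants a narrower type checks for it explicitly.
  def asString(): Option[String] = execute() match {
    case s: String => Some(s)
    case _         => None
  }

  // A caller that only wants the side effect drops the result explicitly.
  def runForEffect(): Unit = {
    execute()
    ()
  }
}
```

With this shape there is no asInstanceOf left, so there is no cast for the compiler to place anywhere.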

sjrd · August 23, 2019, 8:59am

That’s an entirely different problem, for which I completely agree that the compiler is wrong. It has nothing to do with the fact that x is a parameter or not. This behavior happens when the static type of x is known to be a primitive type, and the type in the brackets is also a primitive type. In that case, for some reason, it compiles it as a coercion (like .toDouble) instead of a cast.

There is nothing in the spec supporting this behavior. Writing this code should warn that it doesn’t do the right thing, and eventually become a compile error.
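
A small sketch of the distinction, assuming the intent is a numeric conversion. It spells the conversion out with toDouble and keeps the real cast separate. The helper castViaAny is hypothetical.

```scala
import scala.util.Try

object CoercionVsCast {
  val i: Int = 42

  // Explicit numeric conversion: says what is meant, no asInstanceOf involved.
  val converted: Double = i.toDouble

  // The static type Any hides the primitive, so this is a real cast
  // applied to the boxed java.lang.Integer.
  def castViaAny(x: Any): Double = x.asInstanceOf[Double]

  // Wrapped in Try because the cast is expected to fail for an Int argument.
  val viaAny: Try[Double] = Try(castViaAny(i))
}
```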

Would you like to submit a PR?

undefeat · August 23, 2019, 1:56pm

Then why is the following cast valid?

val u: Unit = (new Object).asInstanceOf[Unit] // u: Unit = ()

jducoeur · August 23, 2019, 2:12pm

Do you have reason to believe that it is? It looks to me like a compiler artifact that it happens to work…

curoli · August 23, 2019, 2:11pm

I’ve never submitted a PR for the compiler before, but it sounds exciting and I might consider it if there is some guidance. Are there some instructions? Thanks!

Jasper-M · August 23, 2019, 2:41pm

This is what @sjrd just described:

Except that the compiler actually always tries to do this when the type between brackets is a primitive type. But Unit is the only primitive type for which the coercion works when the type of x is not a primitive type.

The difference with def execute[R](): R = (new Object).asInstanceOf[R] is that R is not statically known to be a primitive type.
The difference with execute[Unit]() is that as far as Scala-the-language is concerned there is no cast here. In the bytecode a cast gets inserted but that’s to satisfy the JVM.

You get the same thing with other primitive types:

scala> def foo[A]: A = { val i: Int = 42; i.asInstanceOf[A] }
foo: [A]=> A

scala> foo[Double]
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
  at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:113)
  ... 36 elided

scala> val i: Int = 42; i.asInstanceOf[Double]
i: Int = 42
res3: Double = 42.0

sjrd · August 23, 2019, 3:09pm

The readme of the scala/scala repository on GitHub, along with some pages it points to, is actually quite complete. It should get you through the basics. In particular Redirecting… is quite good!

curoli · August 23, 2019, 3:32pm

Thanks, I’ll check it out!

som-snytt · August 26, 2019, 5:43am

I have a slightly different understanding, or, if you like, a different misunderstanding.

Here is the ticket for folks who want asInstanceOf[Unit] to fail.

Here is Lukas’s improvement to boxing I mentioned before. It includes special casing x.asInstanceOf[Unit] so that null is handled correctly.

But I think it’s OK not to throw; maybe there should be a lint for this case, something like “dubious cast.” Because usually asInstanceOf means you have more information than the compiler about the safety of the cast.

This thread began with “why is there too much unboxing of Unit”, but there is too little unboxing:

scala> def fromNull[A]: A = null.asInstanceOf[A]
fromNull: [A]=> A

scala> null.asInstanceOf[Unit] == fromNull[Unit]
                               ^
       warning: comparing values of types Unit and Unit using `==` will always yield true
res0: Boolean = false

scala> println(fromNull[Unit])
null

That is, it needs box(unbox(x)), where unbox could fail, as it ought to do here:

scala> def f[A]: A = new Object().asInstanceOf[A]
f: [A]=> A

scala> () == f[Unit]
          ^
       warning: comparing values of types Unit and Unit using `==` will always yield true
res0: Boolean = false

scala> 42 == f[Int]
java.lang.ClassCastException: class java.lang.Object cannot be cast to class java.lang.Integer (java.lang.Object and java.lang.Integer are in module java.base of loader 'bootstrap')
  at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:100)
  ... 28 elided
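
To make the box(unbox(x)) idea concrete, here is a hand-written sketch. The helpers unboxUnit and boxUnit are hypothetical and only illustrate the shape. They are not the compiler's or the runtime's actual code, and treating null as () is an assumption modeled on how null.asInstanceOf[Unit] behaves.

```scala
import scala.runtime.BoxedUnit

object UnitBoxing {
  // Hypothetical strict unboxing: accepts null and the BoxedUnit singleton,
  // and rejects everything else.
  def unboxUnit(x: Any): Unit = x match {
    case null         => ()
    case _: BoxedUnit => ()
    case other =>
      throw new ClassCastException(
        s"${other.getClass.getName} cannot be cast to ${classOf[BoxedUnit].getName}")
  }

  // Boxing a Unit always yields the shared singleton.
  def boxUnit(u: Unit): AnyRef = BoxedUnit.UNIT

  // The round trip normalizes null to BoxedUnit.UNIT and fails for other objects.
  def roundTrip(x: Any): AnyRef = boxUnit(unboxUnit(x))
}
```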

undefeat · September 9, 2019, 3:11pm

When you do a comparison, the BoxesRunTime.equals(Object, Object)Boolean method is called. It accepts parameters of type Object, which is Any in Scala. In that case, there is no need to downcast the actual values and the behaviour is equivalent to:

def f[A]: A = new Object().asInstanceOf[A]
val a: Any = f[Unit] // a: Any = java.lang.Object@4f45ed22

The behaviour is different for numbers, characters and null. If we compare numbers, then BoxesRunTime will perform an actual cast.

Looking at all these corner cases I can see that the Unit has a special place in the type hierarchy, regarding its conversion principles. Some of them are aimed at satisfying the JVM and others at making the life of the developers easier.
I don’t see any point in casting a value to a BoxedUnit when the expected type is Unit. For me, it’s one of the omitted cases, when the value discarding mechanism should take place.

som-snytt · September 12, 2019, 9:45pm

I see your opinion is unchanged from your original post.

Unit is interesting only because it is so boring. Moreover, boxed unit is also boring. The utility of the behavior you decry is that it exposes your bug. There is some interest in avoiding needless paired calls to box/unbox the Unit value.

The other behavior you haven’t addressed is your example in Nothing. It’s obvious the result of type Nothing must throw. Your example throws ClassCastException because of the bug, but if it did not unbox the value, you wouldn’t see the usual NPE as thrown by null.asInstanceOf[scala.runtime.Nothing$].

The value discard conversion only happens if conversion is required. Since Nothing and Unit already conform to Unit, there can be no discard.
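
A minimal sketch of that rule, assuming a counter to make the evaluation visible. DiscardDemo and its members are made-up names.

```scala
object DiscardDemo {
  var evaluations = 0

  def compute(): Int = { evaluations += 1; 42 }

  // Int does not conform to Unit, so value discarding applies here:
  // the body is treated like { compute(); () }.
  def forEffect(): Unit = compute()

  // The same thing written out by hand.
  def forEffectExpanded(): Unit = { compute(); () }

  // The expression already has type Unit, so no conversion is needed
  // and nothing is discarded.
  def alreadyUnit(): Unit = ()
}
```

In both forEffect variants the expression is still evaluated. Only its result is dropped.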

In a related ticket, someone suggests that Unit should not be special in this regard, but that any Singleton type should incur a discard.

tarsa · September 15, 2019, 7:45pm

I think a lot of confusion stems from the fact that type erasure is not exactly intuitive. Decompiling Scala code to Java code won’t help IMO. First, consider some Scala code:

object Scala {
  class Something
  class Something1 extends Something
  class Something2 extends Something

  def cast[A <: Something](param: Any): A =
    param.asInstanceOf[A]

  def main(args: Array[String]): Unit = {
    val x = cast[Something2](arg)
    println(x)
  }

//  val arg = new Object
//  val arg = new Something1
}

You need to uncomment one of the lines that define arg. If you uncomment val arg = new Object then a ClassCastException will be thrown inside def cast. If you uncomment val arg = new Something1 then a ClassCastException will be thrown outside of def cast, in the line val x = cast[Something2](arg).

Why does the cast fail in different places if there is only one cast in the source code? Because type erasure causes some, but sometimes not all, checks to be moved from the definition site to the use site. What can be checked inside an erased method is checked there; what can’t is moved to the use site. In this case param.asInstanceOf[A] inside def cast is erased to param.asInstanceOf[Something], as Something is the only known upper bound of A during compilation of def cast. A more precise cast to Something2 is thus moved to the line val x = cast[Something2](arg), as that’s the line where type A of the value returned from def cast is statically known.

javap -p -c confirms my explanation:

Compiled from "Scala.scala"
public final class temp.Scala$ {
  public static temp.Scala$ MODULE$;

  private final java.lang.Object arg;

  public static {};
    Code:
       0: new           #2                  // class temp/Scala$
       3: invokespecial #22                 // Method "<init>":()V
       6: return

  public <A extends temp.Scala$Something> A cast(java.lang.Object);
    Code:
       0: aload_1
      // first cast to `Something`
       1: checkcast     #7                  // class temp/Scala$Something
       4: areturn

  public void main(java.lang.String[]);
    Code:
       0: aload_0
       1: aload_0
       2: invokevirtual #33                 // Method arg:()Ljava/lang/Object;
       5: invokevirtual #35                 // Method cast:(Ljava/lang/Object;)Ltemp/Scala$Something;
      // second, more precise cast to `Something2`
       8: checkcast     #12                 // class temp/Scala$Something2
      11: astore_2
      12: getstatic     #40                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
      15: aload_2
      16: invokevirtual #44                 // Method scala/Predef$.println:(Ljava/lang/Object;)V
      19: return

  public java.lang.Object arg();
    Code:
       0: aload_0
       1: getfield      #49                 // Field arg:Ljava/lang/Object;
       4: areturn

  private temp.Scala$();
    Code:
       0: aload_0
       1: invokespecial #50                 // Method java/lang/Object."<init>":()V
       4: aload_0
       5: putstatic     #52                 // Field MODULE$:Ltemp/Scala$;
       8: aload_0
       9: new           #4                  // class java/lang/Object
      12: dup
      13: invokespecial #50                 // Method java/lang/Object."<init>":()V
      16: putfield      #49                 // Field arg:Ljava/lang/Object;
      19: return
}
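
The two cast sites can also be observed at run time. This is a rough sketch that reuses the class names from the example. failingMethod is a hypothetical helper, and it assumes the JVM fills in the stack trace of the exception.

```scala
import scala.util.{Failure, Try}

object ErasureSites {
  class Something
  class Something1 extends Something
  class Something2 extends Something

  def cast[A <: Something](param: Any): A =
    param.asInstanceOf[A]

  // Name of the method on top of the stack when the cast fails,
  // or None if no ClassCastException was thrown.
  def failingMethod(arg: Any): Option[String] =
    Try { val x: Something2 = cast[Something2](arg); x } match {
      case Failure(e: ClassCastException) =>
        e.getStackTrace.headOption.map(_.getMethodName)
      case _ => None
    }
}
```

Passing new Object should report cast as the failing method, while passing new Something1 should report a frame outside of cast, matching the two checkcast instructions in the listing.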

In def f[A]: A = new Object().asInstanceOf[A] there’s no upper bound of A, thus no casts are done inside of def f and instead they are moved to all use sites. If there are many use sites of method f, then in each one the compiler can decide to handle the cast differently.

Changing value discarding semantics would surprisingly change the semantics of some valid (?) programs. The following code prints null:

object Scala {
  def execute[R](param: Any): R =
    param.asInstanceOf[R]

  def main(args: Array[String]): Unit = {
    val x = execute[Unit](null)
    println(x)
  }
}

What is more important then:

anything.asInstanceOf[Unit] == () for any type of anything
or
null.asInstanceOf[A] == null for any type A

?
These requirements seem to be contradictory.